WO2020024415A1

WO2020024415A1 - Voiceprint recognition processing method and apparatus, electronic device and storage medium

Info

Publication number: WO2020024415A1
Application number: PCT/CN2018/107954
Authority: WO
Inventors: 潘燕飞
Original assignee: 平安科技（深圳）有限公司
Priority date: 2018-08-03
Filing date: 2018-09-27
Publication date: 2020-02-06
Also published as: CN109087647A; CN109087647B

Abstract

Provided are a voiceprint recognition processing method and apparatus, an electronic device and a storage medium. The method comprises: outputting prompt information containing a first random code if an instruction for recognizing a voiceprint is obtained, wherein the prompt information refers to information used for prompting the content needing to be contained in speech provided by a recognized person (S210); acquiring speech which is provided by the recognized person and contains the first random code (S220); converting the speech into text by means of speech recognition, and extracting a second random code contained in the text (S230); and checking the second random code by using the first random code, and if the second random code is successfully checked, carrying out voiceprint recognition to obtain a voiceprint recognition result (S240).

Description

Voiceprint recognition processing method, device, electronic equipment and storage medium

This application claims priority to Chinese patent application No. 201810877973.1, filed with the Chinese Patent Office on August 3, 2018 and entitled "Voiceprint Recognition Processing Method, Device, Electronic Equipment and Storage Medium", the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the technical field of voiceprint recognition, and in particular, to a voiceprint recognition processing method, device, electronic device, and storage medium.

Background

Voiceprint recognition is a technology that recognizes the identity of a speaker based on the characteristics of the person's voice, that is, it uses voice to verify the identity of the speaker. The technology is convenient, stable, measurable, accurate, and secure. As a non-contact collection and identification technology, voiceprint recognition has a low acquisition cost and is convenient to acquire and simple to use. It has broad application prospects in banking, social security, public security, smart homes, and mobile payment.

In traditional voiceprint recognition, the usual method is to obtain a voice, and then match the voiceprint in the database based on the acquired voice to verify the identity of the identified person. However, this voiceprint recognition method has security problems.

Summary of the invention

The embodiments of the present application provide a voiceprint recognition processing method, device, electronic device, and storage medium, which can solve the security problem caused by the inability to determine the sound source in voiceprint recognition.

In a first aspect, an embodiment of the present application provides a voiceprint recognition processing method. The method includes: if an instruction to recognize a voiceprint is obtained, outputting prompt information including a first random code, where the prompt information refers to information used for prompting the content that the voice provided by the identified person needs to contain; obtaining the voice that is provided by the identified person and contains the first random code; converting the voice into text through speech recognition, and extracting a second random code contained in the text; and verifying the second random code using the first random code, and if the second random code is successfully verified, performing voiceprint recognition to obtain a voiceprint recognition result.

In a second aspect, an embodiment of the present application further provides a voiceprint recognition processing device. The device includes: an output unit configured to output prompt information including a first random code if an instruction to recognize a voiceprint is obtained, where the prompt information refers to information used for prompting the content that the voice provided by the identified person needs to contain; an acquisition unit configured to acquire the voice that is provided by the identified person and contains the first random code; an extraction unit configured to convert the voice into text through speech recognition and extract a second random code contained in the text; and a verification unit configured to verify the second random code using the first random code and, if the second random code is successfully verified, perform voiceprint recognition to obtain a voiceprint recognition result.

In a third aspect, an embodiment of the present application further provides an electronic device including a memory and a processor. The memory stores a computer program, and the processor implements the voiceprint recognition processing method when the processor executes the computer program.

In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium. The storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, implement the foregoing voiceprint recognition processing method.

Embodiments of the present application provide a voiceprint recognition processing method, device, electronic device, and storage medium. In the embodiments of the present application, the voiceprint recognition device outputs prompt information including a first random code, obtains the voice provided by the identified person, obtains the second random code from the voice, and matches the first random code against the second random code. If the second random code is successfully verified, voiceprint recognition is then performed on the voice to obtain a voiceprint recognition result. This can ensure that the voice provider is a living body, thereby improving the security of voiceprint recognition.

Brief description of the drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present application. A person of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic diagram of an application scenario of a voiceprint recognition processing method according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of a voiceprint recognition processing method according to an embodiment of the present application;

FIG. 3 is a schematic flowchart of a voiceprint recognition processing method according to another embodiment of the present application;

FIG. 4 is a schematic block diagram of a voiceprint recognition processing apparatus according to an embodiment of the present application;

FIG. 5 is a schematic block diagram of an electronic device according to an embodiment of the present application.

Detailed description

In the following, the technical solutions in the embodiments of the present application will be clearly and completely described with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, but not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.

Please refer to FIG. 1. FIG. 1 is a schematic diagram of an application scenario of a voiceprint recognition processing method according to an embodiment of the present application. The application scenario includes:

(1) The user is the identified person, that is, the person who undergoes voiceprint recognition through the voiceprint recognition device.

(2) The voiceprint recognition device is an electronic device and may be a terminal device. It performs voiceprint recognition by acquiring the voice provided by the identified person, and can obtain that voice through a microphone component included in the voiceprint recognition device itself.

The working process of each subject in FIG. 1 is as follows: if the voiceprint recognition device obtains an instruction to recognize a voiceprint, it outputs prompt information, and the identified person utters a voice according to the prompt information. The voiceprint recognition device then obtains a voiceprint recognition result from that voice, and the identity of the identified person is thereby recognized.

It should be noted that FIG. 1 shows only one identified person; in actual operation, there can be multiple identified persons. In addition, the voiceprint recognition device can be a standalone electronic device terminal, or a component or functional unit included in another electronic device, such as a functional unit of a smart terminal device, that completes all or part of the functions of voiceprint recognition. The above application scenario of the voiceprint recognition processing method is only used to illustrate the technical solution of the present application and is not intended to limit it.

FIG. 2 is a schematic flowchart of a voiceprint recognition processing method according to an embodiment of the present application. The voiceprint recognition processing method is applied to the voiceprint recognition device in FIG. 1. The voiceprint recognition device may be a standalone electronic device terminal, or may be a component of another electronic device that completes all or part of the functions of voiceprint recognition. As shown in FIG. 2, the method includes the following steps S210-S240:

S210. If an instruction to recognize a voiceprint is obtained, prompt information including a first random code is output, where the prompt information refers to information used for prompting the content that the voice provided by the identified person needs to contain.

Specifically, voiceprint recognition refers to recognizing a speaker's identity based on the characteristics of the speaker's sound waves, so as to identify or confirm the identity of the speaker who uttered the voice. Voiceprint recognition processing is widely used in fields that require confirming a person's identity, such as smart buildings, smart homes, and financial security.

The random code is a string of characters with a predetermined number of digits, randomly generated by the voiceprint recognition device. The random code can include numbers, characters, letters, or combinations of these forms. For example, the random code can be "6589", "technology", "jym", or "6589jym". By requiring a random code in the voice provided by the identified person, the device can ensure during voiceprint recognition processing that the voice is provided by a living body, prevent a person from being identified by means of a recording, and improve the security of verification. It should be noted that the first random code is itself a random code; the ordinal "first" is used only to distinguish different random codes.
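As a minimal sketch of how such a code might be generated, the following Python uses the standard `secrets` module. The function name `generate_random_code`, the default length, and the alphabets are assumptions made for illustration; the application does not specify a generation algorithm.

```python
import secrets
import string


def generate_random_code(length: int = 4, alphabet: str = string.digits) -> str:
    """Return a random code of `length` characters drawn from `alphabet`.

    Hypothetical helper: the name, default length, and alphabet are
    illustrative assumptions, not taken from the application.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not alphabet:
        raise ValueError("alphabet must not be empty")
    return "".join(secrets.choice(alphabet) for _ in range(length))


# A numeric code shaped like "6589".
numeric_code = generate_random_code(4)
# A mixed code shaped like "6589jym".
mixed_code = generate_random_code(7, string.digits + string.ascii_lowercase)
```

A fresh code per recognition attempt is what makes a previously captured recording unlikely to contain the expected content.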

The prompt information refers to information for prompting the content that should be included in the voice provided by the identified person, and gives a clear prompt for that content. For example, the prompt information may include the random code, and may also include environment information for voiceprint recognition, such as the address and floor of the identified person, or the company address and company name of the identified person. Such information is used to determine the identity of the identified person and can narrow the data matching range in the database. The prompt information may further include a segment of random text, so that the identified person provides voice according to the prompted random text. The random text further ensures that the voice provider is a living body, further ensuring the security of voiceprint recognition.

Specifically, voiceprint recognition can be effectively applied to the security access control of public areas of a building, for example, the access control of a residential area or an office area. In traditional voiceprint recognition, the usual method is to obtain a voice and then match it against the voiceprints in the database to verify the identity of the identified person. However, this method cannot determine whether the acquired voice is provided by a living body or is a recording played back from a recording medium, so voiceprint recognition has a security problem.

In the embodiment of the present application, if the voiceprint recognition device obtains an instruction to recognize a voiceprint, it outputs prompt information including a first random code, and the prompt information is used to prompt the content that the voice provided by the identified person should contain. For example, when a person is at the entrance of a residential area or an office area, voiceprint recognition may be activated when the voiceprint recognition device detects a human body through infrared detection, or when the access control button is pressed. Once voiceprint recognition is activated, the voiceprint recognition device outputs prompt information including the first random code, so that the identified person provides a voice including the first random code. The voiceprint recognition device obtains the voice provided by the identified person, parses the random code contained in the voice, and verifies it against the first random code. This realizes living body detection in voiceprint recognition processing and prevents a person from passing voiceprint recognition with a counterfeit recording, ensuring the security of voiceprint recognition processing. The identity of the identified person is then recognized based on the voiceprint characteristics of the identified person.

Similarly, when a user logs in to an account on a terminal, the user can also be authenticated through voiceprint recognition processing. When the terminal enters the personnel identification interface, the terminal can likewise output prompt information containing a first random code and verify, through voiceprint recognition processing, the identity of the person logging in to the account.

Further, the voiceprint recognition device outputs prompt information including the first random code. The prompt information may be displayed in text form on the display interface of the voiceprint recognition device, may be announced to the user in the form of a voice broadcast, or may be presented as a text display and a voice broadcast at the same time.

S220. Acquire the voice provided by the identified person and including the first random code.

Specifically, after the voiceprint recognition device prompts that the voice required for voiceprint recognition must contain the first random code, the identified person provides a voice including the first random code according to the prompt information output by the voiceprint recognition device. For example, the identified person reads out the first random code "6589", "technology", "jym", or "6589jym", so that the voiceprint recognition device can verify, when performing voiceprint recognition, that the voice provider is a living body. The voiceprint recognition device obtains the voice provided by the identified person through a component such as a microphone.

S230: Convert the voice into text through speech recognition, and extract a second random code included in the text.

Speech recognition technology, also known as Automatic Speech Recognition (ASR), aims to convert the vocabulary content of human speech into computer-readable input.

Specifically, after the voiceprint recognition device obtains the voice provided by the identified person, it uses ASR technology to convert the voice into text and segments the converted text. The second random code is then extracted from the converted text according to the information about the first random code defined in the prompt information output in step S210, which includes the number of digits and content of the first random code, the position of the first random code in the voice, and so on. For example, if the prompt information specifies a two-digit random code at the front of the voice, then the first two characters of the converted text are taken as the second random code.
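The extraction step can be sketched as follows, assuming the ASR output is already available as a plain string. The function `extract_second_code` and its `position` parameter are hypothetical names introduced for this illustration, and only head and tail positions are handled.

```python
def extract_second_code(text: str, code_length: int, position: str = "head") -> str:
    """Take the second random code from ASR text by position and length.

    Assumptions for this sketch: whitespace inserted by the recognizer is
    not significant, and the code sits at the head or tail of the voice.
    """
    compact = "".join(text.split())  # drop spaces the recognizer may insert
    if code_length <= 0 or code_length > len(compact):
        raise ValueError("code_length does not fit the recognized text")
    if position == "head":
        return compact[:code_length]
    if position == "tail":
        return compact[-code_length:]
    raise ValueError(f"unsupported position: {position!r}")


# Two-digit code at the front of the voice: the first two characters are taken.
head_code = extract_second_code("65 please open the door", 2, "head")
# Four-digit code at the end of the voice.
tail_code = extract_second_code("please open the door 6589", 4, "tail")
```

The extracted value is only a candidate; it still has to be compared against the code that was actually prompted.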

S240. Use the first random code to verify the second random code. If the second random code is successfully verified, perform voiceprint recognition to obtain a voiceprint recognition result.

Specifically, the voiceprint recognition device verifies the second random code by comparing the first random code with the second random code, and thereby determines whether the person providing the voice is a living body. If the first random code and the second random code are the same, the person providing the voice is considered a living body; otherwise, the voice may have been provided by a recording, which is not safe.

If the second random code contained in the text obtained by the voiceprint recognition device through speech recognition differs from the first random code output by the voiceprint recognition device, the second random code fails the check, the flow is exited, and the voiceprint recognition device prompts that identification has failed. Further, prompt information including a first random code may be output again. Otherwise, if the second random code contained in the text is the same as the first random code output by the voiceprint recognition device, the second random code passes the check and voiceprint recognition processing is performed on the voice: the voiceprint of the person, or the voiceprint library of the group of persons, corresponding to the voice is queried, and a one-to-one or one-to-many voiceprint comparison is carried out. The voiceprint recognition result is obtained through voiceprint recognition and the user's identity is verified, ensuring the security of voiceprint recognition processing.
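The gating logic of step S240 can be sketched as below. This is an illustration under stated assumptions: `verify_then_recognize` is a hypothetical name, the code is assumed to have been prompted at the head of the voice, and the actual voiceprint matcher is passed in as a callback because the application does not describe its interface.

```python
from typing import Callable, Optional


def verify_then_recognize(
    first_code: str,
    transcript: str,
    recognize: Callable[[], str],
) -> Optional[str]:
    """Run voiceprint recognition only if the spoken code matches the prompt.

    Returns the recognition result, or None when the random code check
    fails (the caller may then report failure and prompt a new code).
    """
    # Assumption: the code was prompted at the head of the voice, so the
    # second random code is the leading len(first_code) characters.
    compact = "".join(transcript.split())
    second_code = compact[: len(first_code)]
    if second_code != first_code:
        return None  # check failed: skip voiceprint matching entirely
    return recognize()


calls = []


def fake_recognizer() -> str:
    """Stand-in for a real voiceprint matcher; records that it was invoked."""
    calls.append("called")
    return "resident-001"


accepted = verify_then_recognize("6589", "6589 open the door", fake_recognizer)
rejected = verify_then_recognize("6589", "1234 open the door", fake_recognizer)
```

In this sketch the comparatively expensive voiceprint comparison is never reached when the code check fails, which mirrors the order of operations described above.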

In the embodiment of the present application, the voiceprint recognition device outputs prompt information including a first random code, obtains the voice provided by the identified person, obtains the second random code from the voice, and matches the first random code against the second random code. If the second random code is successfully verified, voiceprint recognition is then performed on the voice to obtain a voiceprint recognition result. This can ensure that the voice provider is a living body, thereby improving the security of voiceprint recognition.

Further, if there are N people in a scene, such as N people living in a community or N people living in a building, then the database holds N voiceprint models that need to be matched during voiceprint recognition. The larger the value of N, the more voiceprint models must be compared one by one, so a large number of voiceprint matching operations is needed during voiceprint recognition processing, resulting in a low recognition rate. Please refer to FIG. 3. FIG. 3 is a schematic flowchart of a voiceprint recognition processing method according to another embodiment of the present application. The voiceprint recognition processing method includes the following steps S310-S340:

S310. If an instruction to recognize a voiceprint is obtained, prompt information including a first random code and preset content is output, where the prompt information refers to information used for prompting the content that the voice provided by the identified person needs to contain.

The preset content is voice content that the user sets in advance when registering with the voiceprint recognition device. The preset content may be related to the application environment of voiceprint recognition, where the application environment includes the location or name associated with the voiceprint recognition. Users can set different preset content for different application environments. For example, in a residential area, the preset content may be "I live in xxx building xxx room", while in an office area, the preset content may be "my company's address is xxx" or "my company's name is xxx", and so on.

Specifically, if the voiceprint recognition device obtains an instruction to recognize a voiceprint, it outputs prompt information including a first random code and preset content. To protect user privacy, whether the prompt information is displayed in text form on the screen of the voiceprint recognition device or output by voice broadcast, the preset content is referred to indirectly. Indirect reference means that the specific content contained in the preset content is not explicitly prompted, so the prompt does not state the preset content in clear and specific terms. For example, the voiceprint recognition device prompts the user to input a voice in the form of "first random code + my address", "first random code + my company's address", or "first random code + my company's name", without prompting the specific content of "my address", "my company's address", or "my company's name".

The preset content not only restricts the identification of the identified person to a particular use environment, but also lets the voiceprint recognition device narrow the range of data matching during voiceprint recognition. For example, in a residential area, if the preset content is "I live in xx building xx room", the prompt information output by the voiceprint recognition device is "random code + my address". If, when performing voiceprint recognition, the voiceprint recognition device detects that the acquired voice content contains "I live in xx building xx room", then the data range matched in the database can be narrowed to the data containing the "xx building" keyword, and it is not necessary to match the data in the entire database one by one, thereby improving the efficiency of voiceprint recognition processing. If the voiceprint recognition device finds voiceprint data in the database that matches the voiceprint contained in the acquired voice, the identified person passes identity verification; otherwise, the identified person does not pass. By using prompt information that combines the first random code and the preset content as the voice content for voiceprint recognition processing, verification of the random code ensures that the voice is provided by a living body, the user information included in the preset content adds security, and the range of voiceprint data to be matched is reduced, improving the efficiency of voiceprint recognition processing.
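The narrowing of the matching range can be illustrated with a simple keyword filter over enrollment records. The `Enrollment` record, the `narrow_candidates` function, and the sample data are all hypothetical; a real system would store voiceprint models alongside the registered content and would likely use a database index rather than a linear scan.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Enrollment:
    """Hypothetical enrollment record: a user and the content preset at registration."""
    user_id: str
    preset_content: str


def narrow_candidates(enrollments: List[Enrollment], keyword: str) -> List[Enrollment]:
    """Keep only the enrollments whose preset content contains the keyword."""
    return [e for e in enrollments if keyword in e.preset_content]


# Made-up sample records, for illustration only.
enrollments = [
    Enrollment("u1", "I live in building 3 room 101"),
    Enrollment("u2", "I live in building 3 room 202"),
    Enrollment("u3", "I live in building 5 room 101"),
]

# Voiceprint comparison would then run against 2 candidates instead of 3.
candidates = narrow_candidates(enrollments, "building 3")
```

With N enrolled users, the comparison count drops from N to the size of the filtered list, which is the efficiency gain the paragraph above describes.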

In one embodiment, the order of the first random code and the preset content in the voice is defined.

Specifically, the voiceprint recognition device outputs prompt information including the first random code and preset content, and the order of the first random code and the preset content in the voice is defined. For example, the order of voice content that the voiceprint recognition device requires from the identified person may be "first random code + preset content" or "preset content + first random code"; the two orders may alternate cyclically, or one of them may be output at random. Because the prompt information defines the order of the first random code and the preset content, in subsequent steps the first random code and preset content included in the voice are obtained according to their positions in the voice on each occasion, and are then used for voiceprint recognition verification, making the required voice unpredictable. When performing voiceprint recognition, the voiceprint recognition device can check the order of the voice content to further ensure that the voice is a real-time voice provided by a living body, ensuring the security of voiceprint recognition.
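A minimal sketch of choosing the order at random and splitting the recognized text accordingly is shown below. The names `choose_order`, `build_prompt`, and `split_transcript` are hypothetical, and the sketch assumes the recognizer returns the code as a contiguous run of characters at the very start or end of the text.

```python
import secrets
from typing import Tuple

ORDERS = ("code_first", "content_first")


def choose_order() -> str:
    """Pick one of the two orders at random for this recognition attempt."""
    return secrets.choice(ORDERS)


def build_prompt(order: str) -> str:
    """Build the prompt text; the preset content is only referred to indirectly."""
    parts = ["random code", "my address"]
    if order == "content_first":
        parts.reverse()
    return " + ".join(parts)


def split_transcript(text: str, code_length: int, order: str) -> Tuple[str, str]:
    """Split recognized text into (second random code, preset content)."""
    text = text.strip()
    if order == "code_first":
        return text[:code_length], text[code_length:].strip()
    if order == "content_first":
        return text[-code_length:], text[:-code_length].strip()
    raise ValueError(f"unsupported order: {order!r}")


code_a, content_a = split_transcript("65 I live in building 3", 2, "code_first")
code_b, content_b = split_transcript("I live in building 3 65", 2, "content_first")
```

A voice spoken in the wrong order yields a code that does not match the prompted one, so the order itself acts as an additional check.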

Further, the voiceprint recognition device outputs prompt information including the first random code and preset content. The prompt information may be displayed in text form on the display interface of the display screen of the voiceprint recognition device, such as "random code + content preset during registration", or the voiceprint recognition device may prompt the user with the content that the voice should contain in the form of a voice broadcast.

S320. Acquire a voice provided by the identified person and including the first random code and the preset content.

The terminal device obtains a segment of speech provided by the identified person according to a specific use scenario, which includes the first random code and preset content.

Specifically, after the voiceprint recognition device outputs the prompt information for the voice required for voiceprint recognition, the identified person provides a voice including the first random code and the preset content according to the usage scenario. For example, in a residential area, the voice can include "random code and the residential address provided during registration", and in an office area, the voice can include "random code and the office address or office name provided during registration", for voiceprint recognition processing by the voiceprint recognition device. The voiceprint recognition device obtains the segment of voice, including the first random code and preset content, provided by the identified person according to the current usage scenario.

S330. Convert the voice into text through speech recognition, and extract a second random code and the preset content included in the text.

Specifically, after the voiceprint recognition device obtains the voice provided by the identified person, the voice is converted into text by ASR technology and the converted text is segmented. According to the position and content in the voice of the first random code and the preset content specified by the prompt information output in step S310, and the number of digits of the random code, the second random code and the information contained in the preset content are extracted from the converted text. The second random code extracted from the acquired voice is compared with the first random code prompted by the voiceprint recognition device, and the corresponding data in the database is matched according to the user information extracted from the acquired voice, so as to identify the user through voiceprint recognition. For example, if a two-digit first random code precedes the preset content, the first two characters of the converted text are taken as the second random code. If the preset content is "I live in xx building xx room", the voiceprint recognition device matches against the voiceprint models in the database associated with xx building, thereby reducing the range of voiceprint model matching and improving the efficiency of voiceprint recognition.

S340. Use the first random code to verify the second random code. If the second random code is successfully verified, perform voiceprint recognition according to the preset content to obtain a voiceprint recognition result.

Specifically, the voiceprint recognition device obtains the second random code by segmenting the text converted from the voice through speech recognition, according to the order of the first random code and the preset content defined in the prompt information, and compares the second random code with the first random code. If the second random code contained in the text obtained by speech recognition differs from the first random code output by the voiceprint recognition device, the second random code fails the check, the flow is exited, and the voiceprint recognition device prompts that identification has failed. Further, prompt information including a first random code and the preset content may be output again. Otherwise, if the second random code contained in the text is the same as the first random code output by the voiceprint recognition device, the second random code passes the check. Then, according to the user information contained in the preset content, voiceprint recognition matching is performed within the data range corresponding to the user information: the voiceprint of the person, or the voiceprint library of the group of persons, corresponding to the voice is queried, a one-to-one or one-to-many voiceprint comparison is carried out, and the user's identity is verified through voiceprint recognition.

For example, in a residential area, if the preset content is "I live in xx building xx room", the prompt information output by the voiceprint recognition device is "random code + my address". If, when performing voiceprint recognition, the voiceprint recognition device detects that the acquired voice content contains "I live in xx building xx room", then the data range matched in the database can be narrowed to the data containing the "xx building" keyword, or narrowed further to the family members in "xx building xx room", instead of matching the data in the entire database one by one, thereby improving the efficiency of voiceprint recognition processing. Because the person or group of persons corresponding to the voice is determined according to the user information, the amount of data in the database of the voiceprint recognition device that must be processed is reduced. This greatly reduces the number of comparisons in voiceprint recognition processing and improves its efficiency and accuracy.

In one embodiment, in the step of outputting prompt information including the first random code if an instruction to recognize a voiceprint is obtained, the prompt information further includes the position of the first random code in the voice.

Specifically, the prompt information including the position of the first random code in the voice means that the prompt information tells the identified person where in the voice the first random code is to be spoken, so that the position of the first random code in the voice is set in advance. For example, if the identified person speaks the first random code first, the first random code is at the head of the voice; if the identified person speaks the first random code last, it is at the tail of the voice. In a subsequent step, the random code contained in the voice is obtained according to this position and then verified. In this case, the voiceprint recognition device detects only the random code in the voice, and the other content of the voice is not considered. The position of the first random code in the voice is therefore restricted, for example to the head or the tail of the voice, and the random code is obtained by taking the first few or the last few characters of the text converted from the voice, according to the number of digits of the first random code.
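Taking the first or last few characters of the recognized text can be sketched as follows. This is an illustrative sketch only: the function names, the `"head"`/`"tail"` labels and the choice to strip whitespace and punctuation before slicing are assumptions, not details given in the application.

```python
import re


def extract_code(transcript: str, code_length: int, position: str) -> str:
    """Return the leading or trailing `code_length` characters of the transcript.

    Whitespace and punctuation are removed first (an assumption), so that
    "4821 my address" and "my address, 4821." both yield "4821".
    """
    if code_length <= 0:
        raise ValueError("code_length must be positive")
    compact = re.sub(r"[\W_]+", "", transcript)
    if len(compact) < code_length:
        return ""  # transcript too short to contain the code
    if position == "head":
        return compact[:code_length]
    if position == "tail":
        return compact[-code_length:]
    raise ValueError("position must be 'head' or 'tail'")


def verify_code(first_code: str, transcript: str, position: str) -> bool:
    """Check the extracted (second) code against the code that was prompted."""
    return extract_code(transcript, len(first_code), position) == first_code
```

Only the slice at the prompted position is inspected, so the rest of the spoken content has no influence on the check.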

Further, the position of the first random code in the voice included in the prompt information is randomly defined during each voiceprint recognition process.

Specifically, the position of the first random code in the voice being randomly defined during each voiceprint recognition process means that the position is not fixed from one recognition to the next: the identified person may be prompted to speak the first random code at the head, in the middle, or at the tail of the voice. For each voiceprint recognition, the voiceprint recognition device randomly defines the position of the first random code in the voice and conveys that position to the identified person through the prompt information. In the subsequent steps, the second random code contained in the voice is obtained according to the position defined for that recognition and is then verified. For example, the first random code may be at the head of the voice in one recognition and at the head or the tail in the next. Randomly defining the position of the first random code in the voice enables more flexible security verification for voiceprint recognition processing.
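A per-recognition challenge with a randomly chosen position might be generated as in the sketch below. The `Challenge` type, the digits-only code, the default length of four and the prompt wording are hypothetical choices made for illustration; the standard-library `secrets` module is used here simply as one reasonable source of unpredictable values.

```python
import secrets
from dataclasses import dataclass

POSITIONS = ("head", "middle", "tail")


@dataclass(frozen=True)
class Challenge:
    """The first random code and where it must be spoken (hypothetical type)."""
    code: str
    position: str


def new_challenge(code_length: int = 4) -> Challenge:
    """Draw a fresh numeric code and a fresh position for one recognition attempt."""
    code = "".join(secrets.choice("0123456789") for _ in range(code_length))
    return Challenge(code=code, position=secrets.choice(POSITIONS))


def prompt_text(challenge: Challenge) -> str:
    """Render the prompt information shown or spoken to the identified person."""
    where = {
        "head": "at the start of",
        "middle": "in the middle of",
        "tail": "at the end of",
    }[challenge.position]
    return f"Please say the code {challenge.code} {where} your sentence."
```

The device would keep the `Challenge` for the duration of the attempt so that the later extraction step knows which position to read.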

In one embodiment, the prompt information further includes that the voice is a voice within a preset time length.

Specifically, the voiceprint recognition device requires the identified person to provide a voice within a preset time length. The preset time length can be, for example, within 15 seconds, or between 15 seconds and 30 seconds. Setting the time length of the voice constrains the conditions of voiceprint recognition processing more precisely and improves its security. Because the preset time length is set in the background, others cannot easily learn it, and limiting the length of the voice provided by the identified person prevents repeated attempts at voiceprint recognition, which further safeguards the security of voiceprint recognition processing.

In one embodiment, the time length of the voice is randomly limited, and the identified person is prompted to provide the voice within the preset time length, for example within 15 seconds or within 20 seconds. Randomly limiting the preset time length of the voice also enables liveness detection in voiceprint recognition processing, which prevents a recording from being used for voiceprint recognition.
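The time-length limit can be sketched as a randomly drawn limit plus a duration check on the captured audio. The set of allowed limits, the function names and the use of sample count divided by sample rate to measure duration are assumptions for illustration; the application only gives 15, 20 and 30 seconds as example values.

```python
import secrets

# Example limits in seconds, taken from the values mentioned in the text.
ALLOWED_LIMITS_S = (15, 20, 30)


def pick_time_limit() -> int:
    """Randomly choose the time limit to announce for this recognition attempt."""
    return secrets.choice(ALLOWED_LIMITS_S)


def within_limit(num_samples: int, sample_rate_hz: int, limit_s: float) -> bool:
    """Return True if the captured audio is no longer than the announced limit."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return num_samples / sample_rate_hz <= limit_s
```

A voice that exceeds the announced limit would be rejected before any speech recognition or voiceprint comparison is attempted.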

It should be noted that the technical features of the voiceprint recognition processing methods described in the foregoing embodiments may be combined as needed to obtain combined implementations, all of which fall within the protection scope claimed by this application.

Referring to FIG. 4, corresponding to the voiceprint recognition processing method described above, an embodiment of the present application further provides a voiceprint recognition processing device. FIG. 4 is a schematic block diagram of a voiceprint recognition processing device according to an embodiment of the present application. The voiceprint recognition processing device includes units for performing the above-mentioned voiceprint recognition processing method, and the device may be configured in an electronic device such as a desktop computer, a notebook computer, or a smartphone. Specifically, the voiceprint recognition processing device includes an output unit 401, an obtaining unit 402, an extraction unit 403, and a verification unit 404.

The output unit 401 is configured to output prompt information including a first random code if an instruction to recognize a voiceprint is obtained, where the prompt information refers to information used to prompt what should be included in the voice provided by the identified person;

An obtaining unit 402, configured to obtain a voice provided by the identified person and including the first random code;

An extraction unit 403, configured to convert the speech into text through speech recognition, and extract a second random code included in the text; and

The verification unit 404 is configured to verify the second random code by using the first random code. If the second random code is successfully verified, perform voiceprint recognition to obtain a voiceprint recognition result.
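The cooperation of the four units can be sketched as a small Python class. This is a hypothetical wiring, not the applicant's implementation: the class and method names are invented, the random code is assumed to be numeric, and the speech recognizer and voiceprint matcher are injected as plain callables so that the sketch does not depend on any particular library.

```python
import secrets
from typing import Callable, Optional


class VoiceprintDevice:
    """Hypothetical arrangement of units 401-404 around injected recognizers."""

    def __init__(self, transcribe: Callable[[bytes], str],
                 match_voiceprint: Callable[[bytes], str],
                 code_length: int = 4) -> None:
        self._transcribe = transcribe        # speech-to-text, supplied by the caller
        self._match = match_voiceprint       # voiceprint matcher, supplied by the caller
        self._code_length = code_length
        self._first_code: Optional[str] = None

    def output_prompt(self) -> str:
        """Output unit 401: generate the first random code and the prompt."""
        self._first_code = "".join(
            secrets.choice("0123456789") for _ in range(self._code_length)
        )
        return f"Please say: {self._first_code}"

    def extract_code(self, audio: bytes) -> str:
        """Extraction unit 403: transcribe the voice and pull out the second code."""
        text = self._transcribe(audio)
        digits = "".join(ch for ch in text if ch.isdigit())
        return digits[: self._code_length]

    def verify_and_recognize(self, audio: bytes) -> Optional[str]:
        """Verification unit 404: match the voiceprint only if the codes agree.

        `audio` stands for the voice delivered by the obtaining unit 402.
        Returns the matched speaker, or None if the code check fails.
        """
        if self._first_code is None:
            raise RuntimeError("output_prompt() must be called first")
        if self.extract_code(audio) != self._first_code:
            return None
        return self._match(audio)
```

Passing the recognizers in as callables keeps the control flow (prompt, extract, verify, then recognize) visible without committing to a specific speech or voiceprint engine.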

In one embodiment, the prompt information further includes preset content, and an order of the first random code and the preset content is limited in the voice;

The verification unit 404 is configured to verify the second random code by using the first random code and, if the second random code is successfully verified, perform voiceprint recognition according to the preset content to obtain a voiceprint recognition result.

In one embodiment, the prompt information output by the output unit 401 includes a position of the first random code in the voice and the voice is a voice within a preset time length.

It should be noted that those skilled in the art can clearly understand that, for the specific implementation of the voiceprint recognition processing device 400 and each of its units, reference may be made to the corresponding descriptions in the foregoing method embodiments. For convenience and brevity, details are not repeated here.

In addition, the division and connection of the units in the voiceprint recognition processing device are for illustration only. In other embodiments, the voiceprint recognition processing device can be divided into different units as required, and the units in the device can be connected in different orders and manners, to complete all or part of the functions of the voiceprint recognition processing device.

The above-mentioned voiceprint recognition processing device can be implemented in the form of a computer program, which can be run on an electronic device as shown in FIG. 5.

Please refer to FIG. 5, which is a schematic block diagram of an electronic device according to an embodiment of the present application. The electronic device 500 may be a terminal, or a component or part of another device. The terminal may be an electronic device with a communication function, such as a desktop computer.

Referring to FIG. 5, the electronic device 500 includes a processor 502, a memory, a network interface 505, and an audio input interface 506 connected through a system bus 501. The memory may include a non-volatile storage medium 503 and an internal memory 504.

The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions. When the program instructions are executed, the processor 502 can execute one of the voiceprint recognition processing methods described above.

The processor 502 is used to provide computing and control capabilities to support the operation of the entire electronic device 500.

The internal memory 504 provides an environment for running a computer program 5032 in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, the processor 502 can execute one of the voiceprint recognition processing methods described above.

The network interface 505 is configured to perform network communication with other devices. The audio input interface 506 is configured to obtain the voice provided by the identified person and may be a microphone or the like. Those skilled in the art can understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of the present application and does not constitute a limitation on the electronic device 500 to which the solution is applied. A specific electronic device 500 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

The processor 502 is configured to run a computer program 5032 stored in a memory to implement the voiceprint recognition processing method in the embodiment of the present application.

It should be understood that, in the embodiment of the present application, the processor 502 may be a central processing unit (CPU), and the processor 502 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.

A person of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing embodiments can be implemented by a computer program. The computer program may be stored in a storage medium, and the storage medium is a computer-readable storage medium. The computer program is executed by at least one processor in the computer system to implement the steps of the embodiment of the voiceprint recognition processing method described above.

Therefore, an embodiment of the present application further provides a computer-readable storage medium. The storage medium stores a computer program, and when the computer program is executed by a processor, it causes the processor to execute the steps of the voiceprint recognition processing method described in the foregoing embodiments.

The storage medium may be any of various computer-readable storage media that can store a computer program, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.

Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.

The above is only a specific implementation of this application, but the scope of protection of this application is not limited to this. Any person skilled in the art can easily think of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements should be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims

A voiceprint recognition processing method includes:

If an instruction to recognize a voiceprint is obtained, a prompt message including a first random code is output, and the prompt information refers to information for prompting what should be included in the voice provided by the identified person;

Acquiring the speech provided by the identified person and including the first random code;

Converting the speech into text through speech recognition, and extracting a second random code contained in the text; and

Use the first random code to verify the second random code. If the second random code is successfully verified, perform voiceprint recognition to obtain a voiceprint recognition result.
The voiceprint recognition processing method according to claim 1, wherein the prompt information further includes preset content, and an order of the first random code and the preset content is limited in the voice;

The step of verifying the second random code using the first random code and, if the second random code is successfully verified, performing voiceprint recognition to obtain a voiceprint recognition result includes:

Use the first random code to verify the second random code, and if the second random code is successfully verified, perform voiceprint recognition according to the preset content to obtain a voiceprint recognition result.
The voiceprint recognition processing method according to claim 1, wherein the step of outputting prompt information including a first random code if an instruction to recognize a voiceprint is obtained further comprises:

The prompt information includes a position of the first random code in the voice.
The voiceprint recognition processing method according to claim 3, wherein the position of the first random code in the voice included in the prompt information is randomly defined each time voiceprint recognition is performed.
The voiceprint recognition processing method according to claim 1, wherein the prompt information further comprises that the voice is a voice within a preset time length.
A voiceprint recognition processing device includes:

An output unit, configured to output prompt information including a first random code if an instruction to recognize a voiceprint is obtained, where the prompt information refers to information used to prompt what should be included in the voice provided by the identified person;

An obtaining unit, configured to obtain a voice provided by the identified person and including the first random code;

An extraction unit, configured to convert the speech into text through speech recognition, and extract a second random code included in the text; and

The verification unit is configured to verify the second random code by using the first random code. If the second random code is successfully verified, perform voiceprint recognition to obtain a voiceprint recognition result.
The voiceprint recognition processing device according to claim 6, wherein the prompt information further includes preset content, and an order of the first random code and the preset content is limited in the voice;

The verification unit is configured to verify the second random code by using the first random code and, if the second random code is successfully verified, perform voiceprint recognition according to the preset content to obtain a voiceprint recognition result.
The voiceprint recognition processing device according to claim 6, wherein the prompt information output by the output unit includes a position of the first random code in a voice.
The voiceprint recognition processing device according to claim 8, wherein the position of the first random code in the voice included in the prompt information output by the output unit is randomly defined each time voiceprint recognition is performed.
The voiceprint recognition processing device according to claim 6, wherein the prompt information output by the output unit further includes that the voice is a voice within a preset time length.
An electronic device includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the following steps when the processor executes the computer program:

If an instruction to recognize a voiceprint is obtained, a prompt message including a first random code is output, and the prompt information refers to information for prompting what should be included in the voice provided by the identified person;

Acquiring the speech provided by the identified person and including the first random code;

Converting the speech into text through speech recognition, and extracting a second random code contained in the text; and

Use the first random code to verify the second random code. If the second random code is successfully verified, perform voiceprint recognition to obtain a voiceprint recognition result.
The electronic device according to claim 11, wherein the prompt information further includes preset content, and an order of the first random code and the preset content is limited in the voice;

The step of verifying the second random code using the first random code and, if the second random code is successfully verified, performing voiceprint recognition to obtain a voiceprint recognition result includes:

The second random code is verified using the first random code. If the second random code is successfully verified, voiceprint recognition is performed according to the preset content to obtain a voiceprint recognition result.
The electronic device according to claim 11, wherein if the instruction for identifying the voiceprint is obtained, the step of outputting the prompt information including the first random code further comprises:

The prompt information includes a position of the first random code in the voice.
The electronic device according to claim 13, wherein the position of the first random code in the voice included in the prompt information is randomly defined every time the voiceprint is recognized.
The electronic device according to claim 11, wherein the prompt information further comprises that the voice is a voice within a preset time length.
A storage medium, wherein the storage medium stores a computer program, and the computer program, when executed by a processor, can implement the following operations:

If an instruction to recognize a voiceprint is obtained, a prompt message including a first random code is output, and the prompt information refers to information for prompting what should be included in the voice provided by the identified person;

Acquiring the speech provided by the identified person and including the first random code;

Converting the speech into text through speech recognition, and extracting a second random code contained in the text; and

Use the first random code to verify the second random code. If the second random code is successfully verified, perform voiceprint recognition to obtain a voiceprint recognition result.
The storage medium according to claim 16, wherein the prompt information further includes preset content, and an order of the first random code and the preset content is limited in the voice;

The step of verifying the second random code using the first random code and, if the second random code is successfully verified, performing voiceprint recognition to obtain a voiceprint recognition result includes:

Use the first random code to verify the second random code, and if the second random code is successfully verified, perform voiceprint recognition according to the preset content to obtain a voiceprint recognition result.
The storage medium according to claim 16, wherein if the instruction for identifying the voiceprint is obtained, the step of outputting the prompt information including the first random code further comprises:

The prompt information includes a position of the first random code in the voice.
The storage medium according to claim 18, wherein the position of the first random code in the voice included in the prompt information is randomly defined every time the voiceprint is recognized.
The storage medium according to claim 16, wherein the prompt information further includes that the voice is a voice within a preset time length.