CN111899744A - Voice information processing method, device, server and storage medium

Info

Publication number: CN111899744A
Application number: CN202010685191.5A
Authority: CN
Inventors: 周昌宇; 刘金财; 王涛
Original assignee: China United Network Communications Group Co Ltd
Current assignee: China United Network Communications Group Co Ltd
Priority date: 2020-07-16
Filing date: 2020-07-16
Publication date: 2020-11-06

Abstract

The application provides a voice information processing method, apparatus, server, and storage medium. According to the technical scheme, whether a target account is at risk of leakage can be accurately judged from the voiceprint feature information of the incoming call user and a voiceprint library. When the incoming call user is determined not to be the user corresponding to the target account, abnormal operation information is pushed to the user corresponding to the target account. The user is thus reminded promptly, early warning of account leakage risk is achieved, user information security is protected, and user experience is improved.

Description

Voice information processing method, device, server and storage medium

Technical Field

The present application relates to the field of information security technologies, and in particular, to a method, an apparatus, a server, and a storage medium for processing voice information.

Background

With the rapid development of communication technology, when a user needs to handle related communication services, the user can handle the related communication services on line by dialing the intelligent service telephone through the terminal without going to a business hall, so that the user experience is improved.

In the prior art, in the process of transacting business through a terminal, a user needs to input a relevant service password based on prompt to verify the identity of the user, and when the identity verification passes, business inquiry and business transaction can be executed.

However, in practical applications, the service password of the account holder may be leaked. If someone other than the account holder uses the leaked service password to query the account holder's service information, the intelligent service system cannot tell who initiated the service transaction, so there is a risk of user information leakage.

Disclosure of Invention

The embodiment of the application provides a voice information processing method, a voice information processing device, a server and a storage medium, which are used to address the risk that user information may be leaked.

In a first aspect, an embodiment of the present application provides a method for processing voice information, including:

acquiring voice information of an incoming call user, wherein the voice information carries a target account to be operated;

based on the voice information, extracting voiceprint characteristic information of the incoming call user;

comparing the voiceprint characteristic information of the incoming call user with user voiceprint information corresponding to the target account stored in a voiceprint library, and determining whether the incoming call user is a user corresponding to the target account;

and when the incoming call user is determined not to be the user corresponding to the target account, pushing abnormal operation prompt information to the user corresponding to the target account.

In a possible design of the first aspect, when it is determined that the incoming call user is not a user corresponding to the target account, pushing abnormal operation prompt information to the user corresponding to the target account includes:

if the incoming call user is determined not to be the user corresponding to the target account, generating an abnormal operation log, wherein the abnormal operation log is used for identifying that the service password corresponding to the target account has a leakage risk;

determining the times of abnormal operation of the target account based on the abnormal operation log;

and when the number of times of the abnormal operation of the target account is greater than or equal to a first preset threshold value, pushing abnormal operation prompt information to a user corresponding to the target account.

As an example, after the prompt message of the abnormal operation is pushed to the user corresponding to the target account, the method further includes:

clearing an abnormal operation log of the target account, and updating the number of times of abnormal operation of the target account to be 0.

As another example, the method further comprises:

and when the number of times of the abnormal operation of the target account is greater than or equal to a second preset threshold value, forbidding the target account to be operated in a voice mode, wherein the second preset threshold value is greater than the first preset threshold value.

Optionally, after determining the number of times that the target account is abnormally operated based on the abnormal operation log, the method further includes:

and when the number of times of the abnormal operation of the target account is less than a second preset threshold value, executing the operation requested by the voice information.

In another possible design of the first aspect, before the extracting voiceprint feature information of the incoming call user based on the voice information, the method further includes:

acquiring a service password provided by the incoming call user;

and determining that the identity authentication of the incoming call user is passed based on the service password.

In yet another possible design of the first aspect, before comparing the voiceprint feature information of the incoming call user with the user voiceprint information corresponding to the target account stored in the voiceprint library and determining whether the incoming call user is a user corresponding to the target account, the method further includes:

obtaining a set of voice data samples, the set of voice data samples comprising: at least one voice data sample;

performing framing processing and sampling processing on each voice data sample in the voice data sample set by using a preset neural network model to obtain voiceprint characteristic information corresponding to each voice data sample;

and associating the voiceprint characteristic information corresponding to each voice data sample with the corresponding account, and storing each obtained association relation into the voiceprint library.

In yet another possible design of the first aspect, the method further includes:

determining the operation executed by the user corresponding to the target account based on the abnormal operation prompt information;

and determining whether to update the voiceprint library or not based on the modification of the abnormal operation prompt information by the user corresponding to the target account.

In a second aspect, an embodiment of the present application provides a speech information processing apparatus, including an acquisition module, a processing module and a sending module;

the acquisition module is used for acquiring voice information of an incoming call user, and the voice information carries a target account to be operated;

the processing module is configured to: based on the voice information, extracting voiceprint characteristic information of the incoming call user, comparing the voiceprint characteristic information of the incoming call user with user voiceprint information corresponding to the target account stored in a voiceprint library, and determining whether the incoming call user is a user corresponding to the target account;

the sending module is configured to, when the processing module determines that the incoming call user is not the user corresponding to the target account, push an abnormal operation prompt message to the user corresponding to the target account.

In a possible design of the second aspect, the processing module is further configured to generate an abnormal operation log when it is determined that the incoming call user is not a user corresponding to the target account, where the abnormal operation log is used to identify that a service password corresponding to the target account is at a risk of being leaked, and determine, based on the abnormal operation log, the number of times that the target account is abnormally operated;

the sending module is specifically configured to, when the number of times that the target account is abnormally operated is greater than or equal to a first preset threshold, push abnormal operation prompt information to a user corresponding to the target account.

As an example, the processing module is further configured to clear an abnormal operation log of the target account and update the number of times of abnormal operation of the target account to 0 after the sending module pushes abnormal operation prompt information to the user corresponding to the target account.

As another example, the processing module is further configured to prohibit the target account from being operated in a voice manner when the number of times that the target account is abnormally operated is greater than or equal to a second preset threshold, where the second preset threshold is greater than the first preset threshold.

Optionally, the processing module is further configured to, after determining, based on the abnormal operation log, the number of times that the target account is abnormally operated, and when the number of times that the target account is abnormally operated is smaller than a second preset threshold, execute an operation requested by the voice message.

In another possible design of the second aspect, the obtaining module is further configured to obtain a service password provided by the incoming call user before the processing module extracts voiceprint feature information of the incoming call user based on the voice information;

the processing module is further configured to determine that the identity authentication of the incoming call user passes based on the service password.

In yet another possible design of the second aspect, the obtaining module is further configured to obtain a voice data sample set before the processing module compares voiceprint feature information of the incoming call user with user voiceprint information corresponding to the target account stored in a voiceprint library, and determines whether the incoming call user is a user corresponding to the target account, where the voice data sample set includes: at least one voice data sample;

the processing module is further configured to perform framing processing and sampling processing on each voice data sample in the voice data sample set by using a preset neural network model to obtain voiceprint feature information corresponding to each voice data sample, associate the voiceprint feature information corresponding to each voice data sample with the corresponding account, and store each obtained association relationship in the voiceprint library.

In yet another possible design of the second aspect, the processing module is further configured to:

In a third aspect, embodiments of the present application further provide a server, including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the processor executes the program to implement the method according to the first aspect and possible designs.

In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium, in which computer instructions are stored, and when the computer instructions are executed on a computer, the computer is caused to execute the method according to the first aspect and each possible design.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.

Fig. 1 is a schematic view of an application scenario of a speech information processing method according to an embodiment of the present application;

Fig. 2 is a flowchart of a first embodiment of a method for processing voice information provided by the present application;

Fig. 3 is a flowchart of a second embodiment of a speech information processing method provided in the present application;

Fig. 4 is a flowchart of a third embodiment of a speech information processing method provided in the present application;

Fig. 5 is a flowchart of a fourth embodiment of a method for processing voice information provided by the present application;

Fig. 6 is a flowchart of a fifth embodiment of a method for processing voice information provided by the present application;

Fig. 7 is a schematic structural diagram of an embodiment of speech information processing provided in the present application;

Fig. 8 is a schematic structural diagram of a server for implementing a voice information processing method according to an embodiment of the present application.

With the above figures, certain embodiments of the present disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the presently disclosed concepts in any way, but rather to illustrate the presently disclosed concepts to those skilled in the art by reference to specific embodiments.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

First, terms related to embodiments of the present application will be explained:

A voiceprint is a sound spectrum, displayed by an electroacoustic instrument, that carries speech information. Because the vocal organs of each person are different, the voiceprint of each person is different.

Voiceprint feature information is specific to an individual and can distinguish the voices of different people, so two pieces of voiceprint feature information can be compared to judge whether they come from the same person.

Before introducing the embodiments of the present application, the background of the present application is first explained as follows:

With the rapid development of communication technology, a terminal user who needs to handle operator business no longer has to visit a business hall. Dialing the service telephone of the operator's intelligent customer service system from the terminal is enough to handle the business online, which improves the user experience.

In the prior art, when transacting business through a terminal, a user mainly enters the service password of a target account in response to a voice prompt to verify the user's identity, and can perform business inquiries and transactions once identity verification passes. However, in practical applications, the service password of the target account may be leaked. If a malicious party uses an obtained service password to query the service information of the target account, the user's service information may be leaked, and services of the account may be operated without the account holder's authorization.

In order to solve the above problems, the present application provides a voice information processing method, in which voice information of an incoming call user is acquired, the voice information carries a target account to be operated, voiceprint feature information of the incoming call user is extracted based on the voice information, the voiceprint feature information of the incoming call user is compared with user voiceprint information corresponding to the target account stored in a voiceprint library, whether the incoming call user is a user corresponding to the target account is determined, and when the incoming call user is not the user corresponding to the target account, abnormal operation prompt information is pushed to the user corresponding to the target account. According to the technical scheme, when the incoming call user is not the user corresponding to the target account, the abnormal operation prompt information can be timely pushed to the user, so that the user can timely perform corresponding processing, the risk of user information leakage is reduced, and the user information safety is guaranteed to a certain extent.

The technical concept of the present application is as follows. In a scenario where a user queries information or handles a service through an intelligent customer service system, user information security is low because the operator's intelligent customer service system cannot tell who initiated the service. As long as the service password entered by the incoming call user is correct, the system performs the requested operation. If the incoming call user is not the user of the requested account, the service information of the account is queried without the account user's knowledge, so there is a risk of user information leakage.

In the embodiment of the application, for an actual service scenario, the inventor finds that a voiceprint library can be generated in advance by collecting the voiceprint feature information of each account and associating it with the corresponding account. After the service password entered by an incoming call user passes verification, the voiceprint feature information of the incoming call user can be extracted from the user's voice information and compared with the voiceprint feature information of the corresponding account in the voiceprint library. If the two are inconsistent, abnormal operation information can be pushed so that the user corresponding to the account is reminded in time. That is, the security of the user account is improved through double verification by service password and voiceprint feature information.
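The double verification described above can be sketched as follows. This is only an illustrative sketch: the names `Account`, `Outcome`, and `double_check`, and the injected `same_speaker` comparison function, are hypothetical and do not come from the application, and storing the service password in plain text is a simplification made for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASSWORD_REJECTED = "password_rejected"
    OWNER_CONFIRMED = "owner_confirmed"
    SUSPECTED_LEAK = "suspected_leak"


@dataclass
class Account:
    account_id: str
    service_password: str  # simplification: a real system would not keep plaintext
    voiceprint: tuple      # enrolled feature vector (hypothetical representation)


def double_check(account, supplied_password, caller_voiceprint, same_speaker):
    """Check the service password first, then the voiceprint."""
    if supplied_password != account.service_password:
        return Outcome.PASSWORD_REJECTED
    if same_speaker(caller_voiceprint, account.voiceprint):
        return Outcome.OWNER_CONFIRMED
    # Correct password but a different voice: the password may have leaked.
    return Outcome.SUSPECTED_LEAK
```

Only the third outcome, a correct password combined with a mismatched voiceprint, corresponds to the abnormal operation that the application reports to the account user.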

In addition, when the voiceprint information of the incoming call user is inconsistent with the voiceprint feature information of the corresponding account in the voiceprint library, abnormal operation information (which may also be called a suspected leakage log) can be generated, and the number of times that the account is abnormally operated can be determined.

Before the technical solution of the present application is introduced, an application scenario of the embodiment of the present application is first introduced.

Exemplarily, fig. 1 is a schematic view of an application scenario of a speech information processing method provided in an embodiment of the present application. Referring to fig. 1, the application scenario may include a user terminal 11, a server 12, and a user 13.

Optionally, when the user 13 has a service handling requirement, the user 13 may operate the user terminal 11 in various ways to send a service operation request to the server 12, for example, by dialing a service call or by operating a client installed on the user terminal. The embodiment of the present application is described using the voice manner of dialing a service call as an example.

When a user requests to operate a target account by voice, for example by dialing a service call, the server acquires the voice information of the incoming call user, processes it to determine the target account to be operated, and asks the incoming call user to provide the service password of the target account. The server verifies the identity of the incoming call user based on the service password. After verification passes, the server processes the voice information of the incoming call user to obtain the user's voiceprint feature information, compares it with the voiceprint feature information corresponding to the target account in the voiceprint library, and judges from the comparison result whether the service password of the target account is at risk of leakage.

In the embodiment of the application, at least one piece of voiceprint feature information with the account identification is stored in the voiceprint library, and in practical application, the server can update the stored voiceprint feature information of some accounts in the voiceprint library according to the acquired voice information of a plurality of users.

Optionally, when the server determines that the service password of the target account is at risk of leakage, the server may generate an account abnormal operation record and send the warning information to the terminal used by the user of the target account, or when the number of times of the account abnormal operation record reaches a set first preset threshold, the server sends the warning information to the terminal used by the user of the target account.

As an example, after the server sends the warning information, the server may also clear the account abnormal operation record, and when it is subsequently determined that the service password of the target account is at risk of leakage, the account abnormal operation record is regenerated.

As another example, after the server sends the warning information, when the server determines that the number of account abnormal operation records reaches a set second preset threshold, the server disables online voice operation of the target account until the leakage risk of the service password of the target account has been resolved, where in this embodiment the second preset threshold is greater than the first preset threshold.

It can be understood that the embodiment of the present application does not limit a specific implementation scheme for the server to push the warning information, and the implementation scheme can be determined according to actual requirements, which is not described herein again.

The technical solution of the present application will be described in detail below with reference to specific examples. It should be noted that the following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.

Fig. 2 is a flowchart of a first embodiment of a voice information processing method provided in the present application. As shown in fig. 2, the method may include the steps of:

S201, acquiring voice information of the incoming call user.

The voice information carries a target account to be operated.

Illustratively, when a user has a service handling requirement, a service call is dialed through a user terminal, and the server acquires voice information of an incoming call user.

Optionally, in order to determine the account that the incoming call user wants to operate, the voice message carries a target account to be operated, and thus, the server may obtain the target account to be operated by processing the voice message.

S202, based on the voice information, extracting voiceprint feature information of the incoming call user.

In the embodiment of the present application, after acquiring the voice information of the incoming call user, the server may process the voice information, for example, perform framing and sampling processing on the voice information of the incoming call user by using a pre-trained voiceprint extraction model, so as to obtain voiceprint feature information of the incoming call user.

Optionally, the voiceprint extraction model is obtained by training a Convolutional Neural Network (CNN) model using the historical speech information of the user. A CNN is a feedforward neural network with a deep structure that includes convolution computation. The model is trained in advance to obtain the voiceprint extraction model, which then performs framing and sampling processing on the acquired voice information to produce the corresponding voiceprint feature information, improving the accuracy of the extracted user voiceprint feature information.
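The framing step mentioned above can be illustrated in isolation. The following is a minimal sketch that splits a sequence of audio samples into overlapping fixed-length frames. The function name `split_into_frames` and the parameters `frame_length` and `hop_length` are assumptions chosen for illustration, since the application does not specify frame sizes or overlap, and the CNN itself is not shown.

```python
def split_into_frames(samples, frame_length, hop_length):
    """Split audio samples into overlapping fixed-length frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    frames = []
    start = 0
    while start + frame_length <= len(samples):
        frames.append(list(samples[start:start + frame_length]))
        start += hop_length  # a hop shorter than the frame gives overlap
    return frames
```

In a pipeline of this kind, each frame would typically be turned into features and passed to the trained model; how the application's model consumes the frames is not described in the text.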

S203, comparing the voiceprint characteristic information of the incoming call user with the user voiceprint information corresponding to the target account in the voiceprint library, and determining whether the incoming call user is the user corresponding to the target account.

At least one piece of voiceprint characteristic information with account number identification is stored in the voiceprint library.

Illustratively, after acquiring voiceprint feature information of the incoming call user, the server compares the voiceprint feature information with user voiceprint information of a target account in a voiceprint library to obtain a comparison result. And if the comparison result shows that the voiceprint characteristic information of the incoming call user is inconsistent with the user voiceprint information of the target account, determining that the incoming call user is not the user corresponding to the target account.
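The application does not state how two pieces of voiceprint feature information are judged consistent. One common choice, assumed here purely for illustration, is to treat them as numeric vectors and compare their cosine similarity against a cutoff. The function names and the value 0.8 are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no speaker information
    return dot / (norm_a * norm_b)


def is_same_speaker(caller, enrolled, threshold=0.8):
    """Assumed decision rule: consistent when similarity reaches the cutoff."""
    return cosine_similarity(caller, enrolled) >= threshold
```

A stricter cutoff lowers the chance of accepting an impostor but raises the chance of flagging the genuine account user, so the value would need tuning on real data.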

Optionally, the voiceprint feature information in the voiceprint library is extracted and stored in advance. The server records the user voice file corresponding to each account in advance, and then performs framing and sampling processing on each user voice file using a pre-trained voiceprint extraction model to obtain the voiceprint feature information of the user corresponding to each account.
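Building the voiceprint library in advance amounts to storing, for each account, the features extracted from that account's recorded voice. A minimal in-memory sketch is shown below. The class `VoiceprintLibrary` and its methods are hypothetical, and the feature extractor is passed in as a function so that any trained model could be substituted.

```python
class VoiceprintLibrary:
    """In-memory mapping from account identifier to voiceprint features."""

    def __init__(self, extract_features):
        self._extract = extract_features  # stand-in for the trained model
        self._by_account = {}

    def enroll(self, account_id, voice_sample):
        """Extract features from one sample and bind them to the account."""
        self._by_account[account_id] = self._extract(voice_sample)

    def enroll_all(self, samples_by_account):
        """Enroll every (account, sample) pair in a sample set."""
        for account_id, sample in samples_by_account.items():
            self.enroll(account_id, sample)

    def lookup(self, account_id):
        """Return the stored features, or None if the account is not enrolled."""
        return self._by_account.get(account_id)
```

For example, `VoiceprintLibrary(lambda s: (sum(s) / len(s), max(s)))` uses a toy extractor (mean and peak of the samples) in place of a real model.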

S204, when the incoming call user is determined not to be the user corresponding to the target account, pushing abnormal operation prompt information to the user corresponding to the target account.

For example, after the server obtains the comparison result, if it is determined that the voiceprint feature information of the target account is inconsistent with the voiceprint feature information of the incoming call user, it is determined that the incoming call user is not the user corresponding to the target account, and therefore, abnormal operation information is pushed to the user corresponding to the target account.

Optionally, the server may push the abnormal operation prompt information to the user corresponding to the target account in ways that include: the server pushing a message prompt directly to the target account; the server placing a voice call to the target account; and so on. These are only examples and do not limit the manner of pushing the abnormal operation prompt information.

According to the voice information processing method provided by the embodiment of the application, the voiceprint characteristic information of the voice of the incoming call user is obtained, the voiceprint characteristic information of the target account number to be operated by the incoming call user, which is pre-stored in the voiceprint library, is extracted, the voiceprint characteristic information of the incoming call user is compared with the voiceprint information of the user, which corresponds to the target account number, stored in the voiceprint library, whether the incoming call user is the user corresponding to the target account number is determined, and when the incoming call user is determined not to be the user corresponding to the target account number, abnormal operation prompt information is pushed to the user corresponding to the target account number. According to the technical scheme, when the incoming call user is inconsistent with the user of the target account, the abnormal operation information is pushed, so that the user can find the abnormal operation information in time and perform corresponding processing, and the user information safety is ensured to a certain extent.
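Steps S201 to S204 can be tied together in a short sketch. Everything here is illustrative: `handle_call`, the dictionary keys `target_account` and `audio`, and the injected `extract`, `same_speaker`, and `push_prompt` functions are hypothetical names. Treating an account with no enrolled voiceprint as a mismatch is also an assumption, since the application does not describe that case.

```python
def handle_call(voice_info, library, extract, same_speaker, push_prompt):
    """Run S201 to S204 once; return True when the caller matches the account user."""
    # S201: the voice information carries the target account to be operated.
    account_id = voice_info["target_account"]
    # S202: extract the caller's voiceprint feature information.
    caller_print = extract(voice_info["audio"])
    # S203: compare with the voiceprint stored for the target account.
    enrolled = library.get(account_id)
    is_owner = enrolled is not None and same_speaker(caller_print, enrolled)
    # S204: on a mismatch, push the abnormal operation prompt to the account user.
    if not is_owner:
        push_prompt(account_id, "abnormal operation detected")
    return is_owner
```

Passing the extractor, the comparison, and the push channel as parameters keeps the flow independent of any particular model or notification method, which matches the application's statement that the push manner is not limited.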

On the basis of the foregoing embodiment, fig. 3 is a flowchart of a second embodiment of the speech information processing method provided by the present application, and as shown in fig. 3, the foregoing S204 may be implemented by the following steps:

S301, if the incoming call user is determined not to be the user corresponding to the target account, generating an abnormal operation log.

The abnormal operation log is used for identifying that the service password corresponding to the target account is leaked.

For example, when the incoming call user is not the user corresponding to the target account, the server generates an abnormal operation log for the target account based on the comparison result and stores it on the server. The abnormal operation log records that the incoming call user is inconsistent with the user corresponding to the target account. Because the service password provided was correct but was not provided by the user corresponding to the target account, the log can be used to identify that the service password corresponding to the target account is leaked.

Optionally, the abnormal operation log is bound with the target account.

S302, determining the times of the abnormal operation of the target account based on the abnormal operation log.

For example, since the target account and the abnormal operation log are bound to each other, the number of times that the target account is abnormally operated may be determined based on the abnormal operation log, and as the generated abnormal operation log increases, the number of times that the target account is abnormally operated also increases correspondingly.
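Because each abnormal operation log is bound to an account, the count for an account is simply the number of logs bound to it. A minimal sketch follows. The record fields in `AbnormalOperationLog` are assumptions, as the application does not define a log format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AbnormalOperationLog:
    account_id: str   # the target account the log is bound to
    timestamp: float  # when the mismatch was observed (hypothetical field)
    reason: str = "voiceprint mismatch after correct service password"


def abnormal_counts(logs):
    """Count abnormal operation logs per bound account."""
    return Counter(log.account_id for log in logs)
```

A `Counter` returns 0 for accounts with no logs, so callers can compare any account's count against a threshold without a separate existence check.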

S303, when the number of times of the abnormal operation of the target account is greater than or equal to a first preset threshold value, pushing abnormal operation prompt information to a user corresponding to the target account.

Illustratively, if the first preset threshold is 3, when the number of times of abnormal operation bound to the target account is greater than or equal to 3, the server pushes abnormal operation information to the user corresponding to the target account to prompt the user.

Optionally, the first preset threshold may be set in ways that include: a number preset by the user corresponding to the target account; a number derived by the server from the daily usage patterns of the target user; and so on. The embodiment of the present application does not limit the manner of setting the first preset threshold.

According to the voice information processing method provided by the embodiment of the application, the number of times of the abnormal operation of the target account is compared with the first preset threshold, and when the number of times of the abnormal operation of the target account is larger than or equal to the first preset threshold, the abnormal operation prompt information is pushed to the user corresponding to the target account so as to remind the user corresponding to the target account that the target account has a service password leakage risk, so that the user can timely make corresponding processing, and the information safety of the target account is guaranteed.

In one possible design of the embodiment of the present application, the method may further include the following steps:

clearing an abnormal operation log of the target account, and updating the number of times of abnormal operation of the target account to 0.

For example, when the number of abnormal operations bound to the target account is greater than or equal to the first preset threshold (for example, 3), then after pushing the abnormal operation prompt information to the user corresponding to the target account, the server also clears the abnormal operation log bound to the target account and the corresponding count, that is, it sets the number of abnormal operations of the target account to 0.

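In another possible design of the embodiment of the present application, when the number of times the target account has been abnormally operated is greater than or equal to a second preset threshold, the server prohibits operating the target account in a voice manner, where the second preset threshold is greater than the first preset threshold.

For example, if the second preset threshold is 5, then when the number of abnormal operations bound to the target account is greater than or equal to 5, the server prohibits operating the target account in a voice manner and notifies the user corresponding to the target account that voice operation of services has been prohibited, so that the target account is not abnormally operated merely because the user missed the abnormal operation prompt information.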

Optionally, in this possible design of the application, after determining, based on the abnormal operation log, the number of times the target account has been abnormally operated, the method may further include the following step:

when the number of times the target account has been abnormally operated is less than the second preset threshold, executing the operation requested by the voice information.

Illustratively, when the number of abnormal operations of the target account is less than 5, the server still operates the target account as instructed by the incoming call user, so that the requested operation is executed normally.

It is worth noting that the second preset threshold must be greater than the first preset threshold. The values 3 for the first preset threshold and 5 for the second preset threshold are examples only; in actual application, the specific values may be set according to the actual scenario, and details are not described here.

According to the voice information processing method provided by the embodiment of the application, the number of abnormal operations of the target account is compared with the first preset threshold and with the second preset threshold. By pushing the abnormal operation prompt information, the server reminds the user corresponding to the target account that the service password is at risk of leakage, and by prohibiting voice operation of the target account, the server protects the information security of the target account to the greatest extent.
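The relationship between the two thresholds can be illustrated with a small decision function. It is a sketch under assumptions: the `Action` enumeration and the `decide_actions` name are invented for the example, the default values 3 and 5 are the example values from the description, and the optional design that resets the count to 0 after a prompt is deliberately left out.

```python
from enum import Enum
from typing import List


class Action(Enum):
    """Hypothetical labels for what the server does next."""
    PUSH_PROMPT = "push abnormal operation prompt"
    PROHIBIT_VOICE = "prohibit voice operation and notify the user"
    EXECUTE_REQUEST = "execute the requested operation"


def decide_actions(abnormal_count: int, first_threshold: int = 3,
                   second_threshold: int = 5) -> List[Action]:
    """Map the abnormal operation count to server actions."""
    if second_threshold <= first_threshold:
        raise ValueError("second threshold must be greater than the first")
    actions = []
    if abnormal_count >= first_threshold:
        actions.append(Action.PUSH_PROMPT)
    if abnormal_count >= second_threshold:
        actions.append(Action.PROHIBIT_VOICE)
    else:
        # Below the second threshold the requested operation still runs.
        actions.append(Action.EXECUTE_REQUEST)
    return actions
```

Rejecting a second threshold that is not larger than the first makes the stated ordering constraint explicit instead of leaving it to configuration discipline.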

On the basis of the foregoing embodiments, fig. 4 is a flowchart of a third embodiment of the voice information processing method provided in the present application. As shown in fig. 4, before S202, the method further includes the following steps:

S401, obtaining a service password provided by the incoming call user.

For example, before the incoming call user operates the target account, the incoming call user is required to input the service password of the target account, and the server obtains the service password input by the incoming call user.

Optionally, the service password may consist of digits only, of letters only, or of a combination of letters and digits, and so on.

Optionally, the service password of the target account may be generated automatically by the server when the account is first applied for, or may be set by the user corresponding to the target account. The embodiment of the present application does not limit how the service password is generated.

S402, determining, based on the service password, that the identity authentication of the incoming call user passes.

Illustratively, when the service password input by the incoming call user is consistent with the service password of the target account, that is, the identity authentication of the incoming call user is passed, the server extracts the voiceprint feature information of the incoming call user.

Illustratively, the incoming call user is prohibited from operating the target account number if the authentication fails.

According to the voice information processing method provided by the embodiment of the application, whether the incoming call user has the authority to operate the target account is verified using the service password that the incoming call user enters for that account; if the verification fails, the server performs no further operations. In this technical scheme, double verification by voiceprint feature information and service password provides double protection for the information security of the target account and improves the working efficiency of the server.
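The ordering of S401 and S402, in which voiceprint extraction only happens after the service password check succeeds, can be sketched as below. The function names and the `extract_voiceprint` callback are hypothetical, and comparing against a plaintext stored password is an assumption made for brevity; the application does not say how passwords are stored. `hmac.compare_digest` from the Python standard library is used so that the comparison time does not depend on where the strings differ.

```python
import hmac
from typing import Callable, List, Optional


def password_matches(entered: str, stored: str) -> bool:
    """S402: constant-time comparison of entered and stored passwords."""
    return hmac.compare_digest(entered.encode("utf-8"), stored.encode("utf-8"))


def verify_then_extract(entered: str, stored: str,
                        extract_voiceprint: Callable[[], List[float]]
                        ) -> Optional[List[float]]:
    """Extract the caller's voiceprint only if the password check passes."""
    if not password_matches(entered, stored):
        # Authentication failed: the server performs no further operations.
        return None
    return extract_voiceprint()
```

Skipping extraction on a failed password is what lets the password check save server work, in line with the efficiency benefit described above.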

On the basis of the foregoing embodiments, fig. 5 is a flowchart of a fourth embodiment of the voice information processing method provided by the present application. As shown in fig. 5, before S202, the method further includes the following steps:

S501, obtaining a voice data sample set, wherein the voice data sample set comprises at least one voice data sample.

Illustratively, the voice data samples of the users corresponding to the accounts are obtained in advance. A user's voice data samples may be collected when the user first applies for the account, or may be recorded online at any time; this is not limited here and may be determined according to the situation.

Optionally, an obtained voice data sample may be a single sentence or a longer passage spoken by the user corresponding to the target account, which is not limited here, provided that the set includes at least one voice data sample.

S502, framing and sampling each voice data sample in the voice data sample set by using a preset neural network model to obtain voiceprint characteristic information corresponding to each voice data sample.

Illustratively, for a voice data sample of an account, the server inputs each acquired voice data sample into a trained convolutional neural network model (CNN), and performs framing and sampling processing on each voice data sample by using the convolutional neural network model to obtain voiceprint feature information corresponding to the voice data sample.

Similarly, the voice data samples of each other account are obtained and processed one by one by the CNN model using the same voiceprint feature extraction method, yielding the voiceprint feature information corresponding to each account.

S503, associating the voiceprint feature information corresponding to each voice data sample with the corresponding account, and storing the voiceprint feature information and the corresponding account in the voiceprint library.

In this embodiment, the server associates each piece of voiceprint feature information obtained in the above steps with a corresponding account, so that the account corresponding to each piece of voice data in the voice data sample set has the corresponding voiceprint feature information, and stores the corresponding voiceprint feature information in the voiceprint library.

After the processing, the voiceprint library at least comprises a piece of voiceprint characteristic information of the user corresponding to the target account.

According to the voice information processing method provided by the embodiment of the application, the voiceprint feature information of the user corresponding to each account is obtained by processing, with the CNN model, the voice data samples provided by those users, and is stored in a voiceprint library. In this technical scheme, generating the voiceprint library provides the precondition for subsequent voiceprint security verification and reliable support for the information security of the target account.
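The flow of S501 to S503 can be sketched as follows. The application uses a trained convolutional neural network for feature extraction; the sketch below replaces it with a simple stand-in (mean frame energy and mean zero-crossing rate) purely so that the example is self-contained, and this substitution is an assumption, not the described model. The frame length, hop size, function names and account identifiers are likewise illustrative.

```python
from typing import Dict, List, Sequence


def split_into_frames(samples: Sequence[float], frame_len: int = 4,
                      hop: int = 2) -> List[Sequence[float]]:
    """Framing: overlapping fixed-length windows over the waveform."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def extract_voiceprint(samples: Sequence[float]) -> List[float]:
    """Stand-in for the trained CNN (assumption): two summary features."""
    frames = split_into_frames(samples)
    if not frames:
        raise ValueError("voice data sample is shorter than one frame")
    energies = [sum(x * x for x in f) / len(f) for f in frames]
    crossings = [sum(1 for a, b in zip(f, f[1:]) if a * b < 0) / (len(f) - 1)
                 for f in frames]
    return [sum(energies) / len(frames), sum(crossings) / len(frames)]


def build_voiceprint_library(
        sample_set: Dict[str, List[Sequence[float]]]) -> Dict[str, List[List[float]]]:
    """S503: associate each extracted voiceprint with its account."""
    library: Dict[str, List[List[float]]] = {}
    for account, samples in sample_set.items():
        library[account] = [extract_voiceprint(s) for s in samples]
    return library
```

Keying the library by account means a later lookup for the target account returns every voiceprint enrolled for that user, so the library holds at least one entry per enrolled account.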

On the basis of the foregoing embodiments, fig. 6 is a flowchart of a fifth embodiment of the voice information processing method provided by the present application. As shown in fig. 6, after S204, the method further includes the following steps:

S601, determining the operation executed by the user corresponding to the target account based on the abnormal operation prompt information.

For example, after the server pushes the abnormal operation prompt information to the user corresponding to the target account, the server may monitor, within a preset time period, the operations that the user of the target account performs on the target account.

Optionally, the preset time period may be one day, half a month, or one month, and the specific value may be determined according to actual settings.

Optionally, if the server detects that the user modifies the service password of the target account within the preset time period, this indicates that the incoming call user may not be the user of the target account. Therefore, when an incoming call user requests to operate the target account again, the server needs to verify both the service password and the voiceprint feature information of the incoming call user again, and operates the target account based on the request of the incoming call user only if both pass verification.

Optionally, if the server detects that the user does not modify the service password within the preset time period, the voice operation function of the target account may be locked or closed.

S602, determining whether to update the voiceprint library based on the modification made by the user corresponding to the target account in response to the abnormal operation prompt information.

Optionally, if the server detects that the user has not modified the service password within the preset time period and has therefore locked or closed the voice operation function of the target account, the server may later reopen the voice operation function at the user's request and reacquire the voiceprint feature information of the user corresponding to the target account. In this case, the voiceprint library may be updated with the newly acquired voiceprint feature information; otherwise, the voiceprint library is not updated.

According to the voice information processing method provided by the embodiment of the application, whether to update the voiceprint library is determined based on the modification made by the user corresponding to the target account in response to the abnormal operation prompt information. In this technical scheme, deciding whether to update the voiceprint library improves the accuracy of subsequent voiceprint feature comparisons and lays a foundation for ensuring the user's information security.
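The follow-up handling in S601 and S602 can be sketched as a small state update. Everything named here (`AccountState`, `apply_follow_up`, `reopen_voice_operation`, the string outcomes, and the one-day default window) is an assumption chosen for illustration; the description only says the preset time period may be one day, half a month, or one month.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class AccountState:
    """Hypothetical per-account state tracked after a prompt is pushed."""
    prompt_pushed_at: datetime
    password_changed_at: Optional[datetime] = None
    voice_operation_enabled: bool = True


def apply_follow_up(state: AccountState, now: datetime,
                    window: timedelta = timedelta(days=1)) -> str:
    """S601: act on what the account owner did within the preset period."""
    deadline = state.prompt_pushed_at + window
    changed = state.password_changed_at
    if changed is not None and state.prompt_pushed_at <= changed <= deadline:
        # Password was modified: later callers must pass both checks again.
        return "reverify_password_and_voiceprint"
    if now > deadline:
        # No modification within the period: lock the voice operation function.
        state.voice_operation_enabled = False
        return "voice_operation_locked"
    return "waiting"


def reopen_voice_operation(state: AccountState, library: Dict[str, List[float]],
                           account: str, new_voiceprint: List[float]) -> None:
    """S602: reopen on the owner's request and update the voiceprint library."""
    state.voice_operation_enabled = True
    library[account] = new_voiceprint
```

In this sketch the library is only written when voice operation is reopened, which mirrors the rule that the voiceprint library is otherwise left unchanged.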

The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.

Fig. 7 is a schematic structural diagram of an embodiment of a voice information processing apparatus provided in the present application. Referring to fig. 7, the apparatus may include: an obtaining module 701, a processing module 702 and a sending module 703.

An obtaining module 701, configured to obtain voice information of an incoming call user, where the voice information carries a target account to be operated;

a processing module 702, configured to extract voiceprint feature information of the incoming call user based on the voice information, compare the voiceprint feature information of the incoming call user with user voiceprint information corresponding to the target account stored in a voiceprint library, and determine whether the incoming call user is a user corresponding to the target account;

a sending module 703, configured to, when the processing module 702 determines that the incoming call user is not the user corresponding to the target account, push abnormal operation prompt information to the user corresponding to the target account.
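The cooperation of the three modules can be sketched as below. The application does not specify how voiceprints are compared, so cosine similarity with a match threshold of 0.8 is an assumption made for this example, as are the class and function names. The output of the obtaining module is modeled simply as the `account` and `caller_voiceprint` arguments.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity of two feature vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ProcessingModule:
    """Compares the caller's voiceprint with the one stored for the account."""

    def __init__(self, library: Dict[str, List[float]], match_threshold: float = 0.8):
        self.library = library
        self.match_threshold = match_threshold  # assumed value, not from the source

    def is_account_owner(self, account: str, caller_voiceprint: Sequence[float]) -> bool:
        stored = self.library.get(account)
        if stored is None:
            return False
        return cosine_similarity(stored, caller_voiceprint) >= self.match_threshold


class SendingModule:
    """Records pushed prompts; a real system would deliver a message instead."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def push_prompt(self, account: str) -> None:
        self.sent.append((account, "abnormal operation prompt"))


def handle_voice_information(account: str, caller_voiceprint: Sequence[float],
                             processing: ProcessingModule,
                             sending: SendingModule) -> bool:
    """Push a prompt when the caller is not the account's user."""
    is_owner = processing.is_account_owner(account, caller_voiceprint)
    if not is_owner:
        sending.push_prompt(account)
    return is_owner
```

Keeping the comparison inside the processing module and the delivery inside the sending module follows the logical division of modules described for the apparatus.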

In a possible design, the processing module 702 is further configured to generate an abnormal operation log when it is determined that the incoming call user is not the user corresponding to the target account, where the abnormal operation log is used to identify that a risk of leakage of a service password corresponding to the target account exists, and determine, based on the abnormal operation log, the number of times that the target account is abnormally operated;

the sending module 703 is specifically configured to, when the number of times that the target account is abnormally operated is greater than or equal to a first preset threshold, push an abnormal operation prompt message to a user corresponding to the target account.

As an example, the processing module 702 is further configured to, after the sending module 703 pushes the abnormal operation prompting information to the user corresponding to the target account, clear an abnormal operation log of the target account, and update the number of times of the abnormal operation of the target account to 0.

As another example, the processing module 702 is further configured to prohibit operating the target account in a voice manner when the number of times that the target account is abnormally operated is greater than or equal to a second preset threshold, where the second preset threshold is greater than the first preset threshold.

Optionally, the processing module 702 is further configured to, after determining, based on the abnormal operation log, the number of times that the target account is abnormally operated, execute the operation requested by the voice message when the number of times that the target account is abnormally operated is smaller than a second preset threshold.

In another possible design, the obtaining module 701 is further configured to obtain a service password provided by the incoming call user before the processing module 702 extracts voiceprint feature information of the incoming call user based on the voice information;

the processing module 702 is further configured to determine that the identity authentication of the incoming call user passes based on the service password.

In another possible design, the obtaining module 701 is further configured to, before the processing module 702 compares the voiceprint feature information of the incoming call user with the user voiceprint information corresponding to the target account stored in the voiceprint library, and determines whether the incoming call user is a user corresponding to the target account, obtain a voice data sample set, where the voice data sample set includes: at least one voice data sample;

the processing module 702 is further configured to perform framing processing and sampling processing on each voice data sample in the voice data sample set by using a preset neural network model to obtain voiceprint feature information corresponding to each voice data sample, associate the voiceprint feature information corresponding to each voice data sample with a corresponding account, and store each obtained association relationship in the voiceprint library.

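In yet another possible design, the processing module 702 is further configured to: determine, based on the abnormal operation prompt information, the operation executed by the user corresponding to the target account, and determine whether to update the voiceprint library based on the modification made by the user corresponding to the target account in response to the abnormal operation prompt information.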

The apparatus provided in the embodiment of the present application may be used to execute the method in the embodiments shown in fig. 2 to fig. 6, and the implementation principle and the technical effect are similar, which are not described herein again.

It should be noted that the division of the modules of the above apparatus is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can be realized in the form of software called by processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the processing module may be a processing element separately set up, or may be implemented by being integrated in a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and a function of the processing module may be called and executed by a processing element of the apparatus. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.

Fig. 8 is a schematic structural diagram of a server for implementing a voice information processing method according to an embodiment of the present application. As shown in fig. 8, the server may include: the system comprises a processor 81, a memory 82, a communication interface 83 and a system bus 84, wherein the memory 82 and the communication interface 83 are connected with the processor 81 through the system bus 84 and complete mutual communication, the memory 82 is used for storing computer execution instructions, the communication interface 83 is used for communicating with other devices, and the processor 81 implements the scheme of the embodiment shown in fig. 2 to fig. 6 when executing the computer execution instructions.

In fig. 8, the processor 81 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

The memory 82 may comprise random access memory (RAM), read-only memory (ROM), and non-volatile memory, such as at least one disk memory.

The communication interface 83 is used to enable communication between the server and other devices (e.g., clients, read-write libraries, and read-only libraries).

The system bus 84 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

Optionally, an embodiment of the present application further provides a computer-readable storage medium, where computer instructions are stored, and when the computer instructions are executed on a computer, the computer is caused to execute the method according to the embodiment shown in fig. 2 to 6.

Optionally, an embodiment of the present application further provides a chip for executing the instruction, where the chip is configured to execute the method in the embodiment shown in fig. 2 to 6.

Embodiments of the present application further provide a program product, where the program product includes a computer program, where the computer program is stored in a computer-readable storage medium, and the computer program can be read by at least one processor from the computer-readable storage medium, and the at least one processor can implement the method in the embodiments shown in fig. 2 to 6 when executing the computer program.

It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for descriptive convenience and are not intended to limit the scope of the embodiments of the present application. In the embodiment of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.

Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1. A method for processing speech information, comprising:

2. The method according to claim 1, wherein when the incoming call user is not a user corresponding to the target account, pushing abnormal operation prompt information to the user corresponding to the target account includes:

3. The method according to claim 2, wherein after the abnormal operation prompt message is pushed to the user corresponding to the target account, the method further comprises:

4. The method of claim 2, further comprising:

5. The method of claim 4, wherein after the determining a number of times the target account number is abnormally operated based on the abnormal operation log, the method further comprises:

6. The method according to any one of claims 1-5, wherein before said extracting voiceprint feature information of said incoming call user based on said voice information, further comprising:

acquiring a service password provided by the incoming call user;

7. The method according to any one of claims 1 to 5, wherein before comparing the voiceprint feature information of the incoming call user with the user voiceprint information corresponding to the target account stored in a voiceprint library and determining whether the incoming call user is a user corresponding to the target account, the method further comprises:

8. The method according to any one of claims 1-5, further comprising:

9. A speech information processing apparatus characterized by comprising: the device comprises an acquisition module, a processing module and a sending module;

the processing module is used for extracting voiceprint characteristic information of the incoming call user based on the voice information, comparing the voiceprint characteristic information of the incoming call user with user voiceprint information corresponding to the target account stored in a voiceprint library, and determining whether the incoming call user is a user corresponding to the target account;

10. A server comprising a processor, a memory and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of the preceding claims 1-8 when executing the program.

11. A computer-readable storage medium having stored thereon computer instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1-8.