CN107767856B

CN107767856B - Voice processing method and device and server

Info

Publication number: CN107767856B
Application number: CN201711084067.8A
Authority: CN
Inventors: 张翼飞
Original assignee: Bank of China Ltd
Current assignee: Bank of China Ltd
Priority date: 2017-11-07
Filing date: 2017-11-07
Publication date: 2021-11-19
Anticipated expiration: 2037-11-07
Also published as: CN107767856A

Abstract

The invention provides a voice processing method, a voice processing device and a server, which can prompt a user by voice to help the user handle business when a user terminal displays a functional page corresponding to banking business to be handled, and further can realize the application of voice service in the banking business.

Description

Voice processing method and device and server

Technical Field

The invention relates to the field of finance, in particular to a voice processing method, a voice processing device and a server.

Background

Voice technology enables a computer to listen, see, speak and feel, and is a development direction for future human-computer interaction. Voice is regarded as the most promising mode of human-computer interaction and has advantages over other interaction modes.

With the continuous development of voice technology, voice services are being added to more and more applications; for example, navigation maps provide voice navigation, and input methods support voice input. Here, a voice service is a service that uses voice technology.

Although voice services have been incorporated into many applications, they have not yet been used for handling banking business.

Disclosure of Invention

In view of the above, the present invention provides a voice processing method, apparatus and server, so as to solve the problem that voice service is not used yet when banking is handled.

In order to solve the technical problems, the invention adopts the following technical scheme:

a voice processing method is applied to an interaction module and comprises the following steps:

when a user terminal displays a functional page corresponding to banking business to be transacted, receiving a text to be translated sent by the user terminal; the text to be translated is obtained according to the information of the node to be processed in the preset business process corresponding to the business account ID of the banking business;

sending the text to be translated to a voice synthesis module;

receiving the voice message fed back by the voice synthesis module;

and sending the voice message to the user terminal so as to enable the user terminal to play the voice message.

Preferably, when the user terminal displays a functional page corresponding to a banking service to be transacted, before receiving a text to be translated sent by the user terminal, the method further includes:

receiving a voice instruction sent by the user terminal; wherein, the voice instruction carries user voice and ticket information;

when the ticket information is legal ticket information, acquiring a service type corresponding to the user voice;

searching for function entry information and a service ID corresponding to the service type;

and sending the function entry information and the service ID to the user terminal so that the user terminal renders a page according to the function entry information to obtain the function page, and obtains the text to be translated according to a preset service flow corresponding to the service ID.

Preferably, after sending the voice message to the user terminal so that the user terminal plays the voice message, the method further includes:

receiving user input voice sent by the user terminal; the user input voice is the voice which is prompted by the user terminal to be input by the user according to the next node to be processed in the preset business process;

sending the user input voice to the voice recognition module;

receiving a recognition result sent by the voice recognition module;

and sending the identification result to the user terminal so that the user terminal renders the functional page according to the identification result.

Preferably, the determining whether the ticket information is legal ticket information includes:

sending the ticket information to an external service system;

and judging whether a verification passing message sent by the external service system is received.

Preferably, the obtaining of the service type corresponding to the user voice includes:

sending the user voice to a voice recognition module;

receiving a voice recognition result sent by the voice recognition module;

sending the voice recognition result to an intention recognition module;

and receiving the service type corresponding to the voice recognition result fed back by the intention recognition module.

A speech processing device applied to an interaction module comprises:

the first receiving module is used for receiving a text to be translated sent by a user terminal when the user terminal displays a functional page corresponding to banking business to be transacted; the text to be translated is obtained according to the information of the node to be processed in the preset business process corresponding to the business account ID of the banking business;

the first sending module is used for sending the text to be translated to the voice synthesis module;

the second receiving module is used for receiving the voice message fed back by the voice synthesis module;

and the second sending module is used for sending the voice message to the user terminal so as to enable the user terminal to play the voice message.

Preferably, the device further comprises:

the third receiving module is used for receiving a voice instruction sent by the user terminal before the first receiving module receives the text to be translated sent by the user terminal when the user terminal displays a functional page corresponding to the banking business to be transacted; wherein, the voice instruction carries user voice and ticket information;

the acquisition module is used for acquiring the service type corresponding to the voice of the user when the ticket information is legal ticket information;

the searching module is used for searching for the function entry information and the service ID corresponding to the service type;

and the third sending module is used for sending the function entry information and the service ID to the user terminal so that the user terminal renders a page according to the function entry information to obtain the function page, and obtains the text to be translated according to a preset service process corresponding to the service ID.

Preferably, the device further comprises:

the input voice receiving module is used for receiving the user input voice sent by the user terminal after the second sending module sends the voice message to the user terminal so that the user terminal plays the voice message; the user input voice is the voice that the user terminal prompts the user to input according to the next node to be processed in the preset business process;

the voice sending module is used for sending the voice input by the user to the voice recognition module;

the result receiving module is used for receiving the recognition result sent by the voice recognition module;

and the result sending module is used for sending the identification result to the user terminal so that the user terminal renders the function page according to the identification result.

Preferably, the voice processing apparatus further includes a determining module, where the determining module is configured to, when determining whether the ticket information is legal ticket information, specifically:

sending the ticket information to an external service system;

Preferably, the obtaining module is configured to, when obtaining the service type corresponding to the user voice, specifically:

sending the user voice to a voice recognition module;

receiving a voice recognition result sent by the voice recognition module;

sending the voice recognition result to an intention recognition module;

A server comprising a transmit port and a receive port;

the receiving port is used for receiving the text to be translated sent by the user terminal and receiving the voice message fed back by the voice synthesis module when the user terminal displays the functional page corresponding to the banking business to be transacted; the text to be translated is obtained according to the information of the node to be processed in the preset business process corresponding to the business account ID of the banking business;

the sending port is used for sending the text to be translated to the voice synthesis module and sending the voice message to the user terminal so that the user terminal can play the voice message.

Preferably, the server further comprises a processor;

the receiving port is also used for receiving a voice instruction sent by the user terminal before receiving a text to be translated sent by the user terminal when the user terminal displays a functional page corresponding to banking business to be transacted; wherein, the voice instruction carries user voice and ticket information;

the processor is used for acquiring a service type corresponding to the user voice and searching for function entry information and the service ID corresponding to the service type when the ticket information is legal ticket information;

the sending port is further configured to send the function entry information and the service ID to the user terminal, so that the user terminal renders a page according to the function entry information to obtain the function page, and obtains the text to be translated according to a preset service process corresponding to the service ID.

Compared with the prior art, the invention has the following beneficial effects:

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

FIG. 1 is a flow chart of a method of speech processing according to the present invention;

FIG. 2 is a flow chart of another method of speech processing provided by the present invention;

FIG. 3 is a schematic structural diagram of a speech processing apparatus according to the present invention;

FIG. 4 is a schematic structural diagram of another speech processing apparatus provided in the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The embodiment of the invention provides a voice processing method which is applied to an interactive module, wherein the interactive module is communicated with a user terminal, a voice synthesis module, an external service system and a voice recognition module.

The voice processing method comprises the following steps:

S11, receiving a text to be translated sent by the user terminal when the user terminal displays a functional page corresponding to the banking business to be transacted;

the text to be translated is obtained according to the information of the node to be processed in the preset business process corresponding to the business account ID of the banking business.

Specifically, when the user wants to transact the account transfer service, the function page is the account transfer page, and when the user wants to transact the money withdrawal service, the function page is the money withdrawal page.

The text to be translated is a Chinese character string that needs to be converted into voice and output. The text to be translated may be, for example, "please tell me whether your payee's account is with this bank or with another bank".

S12, sending the text to be translated to a speech synthesis module;

The speech synthesis module converts the text to be translated into speech. The voice synthesis module comprises a text analysis module, a prosody generation module and a voice generation module. After receiving the text to be translated, the voice synthesis module performs text analysis, prosody generation and voice generation on it, and finally obtains the voice message corresponding to the text to be translated.

S13, receiving the voice message fed back by the voice synthesis module;

and S14, sending the voice message to the user terminal so that the user terminal plays the voice message.

After the user terminal receives the voice message, it plays the voice message through its loudspeaker. For example, the prompt "please tell me whether your payee's account is with this bank or with another bank" is played through the loudspeaker, prompting the user to carry out the next operation.
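Steps S11 to S14 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the names `ToySynthesizer`, `Terminal` and `InteractionModule` are hypothetical, and the "audio" is a placeholder byte string. The three methods of the synthesizer only mirror the text analysis, prosody generation and voice generation stages named above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class ToySynthesizer:
    """Hypothetical stand-in for the voice synthesis module (not a real TTS engine)."""

    def analyze_text(self, text: str) -> list[str]:
        # Text analysis: split the prompt into word-like tokens.
        return text.split()

    def generate_prosody(self, tokens: list[str]) -> list[tuple[str, float]]:
        # Prosody generation: attach a made-up duration in seconds to each token.
        return [(token, 0.1 * len(token)) for token in tokens]

    def generate_voice(self, prosody: list[tuple[str, float]]) -> bytes:
        # Voice generation: encode the annotated tokens as placeholder "audio" bytes.
        return "|".join(f"{tok}:{dur:.1f}" for tok, dur in prosody).encode("utf-8")

    def synthesize(self, text: str) -> bytes:
        return self.generate_voice(self.generate_prosody(self.analyze_text(text)))


@dataclass
class Terminal:
    """Hypothetical user terminal; it records the voice messages it was asked to play."""

    played: list[bytes] = field(default_factory=list)

    def play(self, voice_message: bytes) -> None:
        self.played.append(voice_message)


class InteractionModule:
    def __init__(self, synthesizer: ToySynthesizer) -> None:
        self.synthesizer = synthesizer

    def handle_text_to_be_translated(self, text: str, terminal: Terminal) -> bytes:
        # S11: the text to be translated has been received from the user terminal.
        # S12 and S13: send it to the synthesis module and receive the voice message.
        voice_message = self.synthesizer.synthesize(text)
        # S14: send the voice message to the user terminal so that it can be played.
        terminal.play(voice_message)
        return voice_message


terminal = Terminal()
module = InteractionModule(ToySynthesizer())
message = module.handle_text_to_be_translated("please say the transfer amount", terminal)
```

Passing the synthesizer in through the constructor keeps the interaction module independent of any particular synthesis engine, in line with the low-coupling goal of the embodiment.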

Optionally, on the basis of this embodiment, after step S14, the method further includes:

1) receiving user input voice sent by the user terminal; the user input voice is the voice which is prompted by the user terminal to be input by the user according to the next node to be processed in the preset business process;

Specifically, when the preset business process reaches a node at which information must be entered in a text box, for example when the transfer amount must be entered on the function page for a transfer and the transfer amount text box is blank, the user needs to enter the transfer amount and does so by voice. The user input voice may be, for example, "200 yuan".

2) Sending the user input voice to the voice recognition module;

The interaction module sends the user input voice to the voice recognition module, which recognizes it to obtain a recognition result, for example the character string corresponding to "200 yuan".

3) Receiving a recognition result sent by the voice recognition module;

4) and sending the identification result to the user terminal so that the user terminal renders the functional page according to the identification result.

Specifically, after receiving the recognition result, the user terminal renders the functional page according to the recognition result, for example by entering 200 in the text box for the transfer amount.

It should be noted that the user enters information in each text box according to the method in this embodiment. After information has been entered in all the text boxes, the user clicks a confirmation button on the user terminal, the user terminal sends all the entered information to the target system, and the target system performs subsequent operations, such as determining whether the data entered by the user is correct.
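The voice-input loop described above (receive the user input voice, have it recognized, render the result into the page) can be sketched as follows. The form layout, the field names and the `toy_recognize` function are assumptions made for illustration; a real deployment would call an actual voice recognition module.

```python
from __future__ import annotations

from typing import Callable, Optional


def toy_recognize(user_input_voice: bytes) -> str:
    # Hypothetical recognizer: the "audio" is simply UTF-8 text in this sketch.
    return user_input_voice.decode("utf-8")


def next_empty_field(form: dict[str, Optional[str]]) -> Optional[str]:
    # The next node to be processed is modelled as the first text box still blank.
    for name, value in form.items():
        if value is None:
            return name
    return None


def handle_user_input_voice(
    form: dict[str, Optional[str]],
    user_input_voice: bytes,
    recognize: Callable[[bytes], str],
) -> Optional[str]:
    field_name = next_empty_field(form)
    if field_name is None:
        return None  # Nothing left to fill in.
    # Steps 2 and 3: forward the voice to the recognition module, receive the result.
    recognition_result = recognize(user_input_voice)
    # Step 4: the terminal renders the function page with the recognition result.
    form[field_name] = recognition_result
    return field_name


# Hypothetical transfer page with one field already filled and one still blank.
transfer_form: dict[str, Optional[str]] = {"payee_account": "example-account", "amount": None}
filled = handle_user_input_voice(transfer_form, b"200", toy_recognize)
ready_to_submit = next_empty_field(transfer_form) is None
```

Once `ready_to_submit` is true, the terminal would wait for the user to press the confirmation button before sending the entered data to the target system.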

In the embodiment, when the user terminal displays the functional page corresponding to the banking business to be handled, the user is prompted by voice to help the user handle the business, and then the voice service can be applied to the banking business.

In addition, the voice processing method provided by the invention does not require any modification of the bank back-end processing system. The interaction module is independent of the bank back-end processing system and interacts only with the user terminal, which keeps the coupling between the systems low and eases subsequent expansion.

Optionally, on the basis of the foregoing embodiment, referring to fig. 2, when the user terminal displays a functional page corresponding to a banking service to be handled, before receiving a text to be translated sent by the user terminal, the method further includes:

S21, receiving a voice instruction sent by the user terminal;

the voice instruction carries user voice and ticket information.

So that those skilled in the art can clearly understand this step, the process by which the user requests a ticket is described next.

The user terminal sends a ticket application request to a target system, wherein the ticket application request comprises a user ID, and the target system can be a background online banking system, a background account transfer system and the like.

The target system adds a preset character string in the ticket application request to obtain a new ticket application request, and sends the new ticket application request to an external service system, wherein the external service system can be a mutual trust support system. And the external service system generates a ticket according to the new ticket application request and sends the ticket to the user terminal through the target system.

When a user wants to execute a certain banking service through a user terminal, a voice instruction is sent to the interaction module. The voice instruction includes user voice and ticket information; the user voice is the voice input by the user and may be, for example, "I want to make an interbank transfer".

The ticket information comprises original ticket information and a ticket requested by the user terminal. The original ticket information refers to the user ID and the preset character string, and the ticket requested by the user terminal is the ticket sent by the external service system.

S22, when the ticket information is legal, acquiring the service type corresponding to the user voice;

optionally, on the basis of this embodiment, determining whether the ticket information is legal ticket information includes:

and sending the ticket information to an external service system, and judging whether a verification passing message sent by the external service system is received.

Specifically, the external service system is used for verifying whether the ticket is legal ticket information, so the interactive module needs to send the ticket information to the external service system, the external service system verifies whether the ticket information is legal ticket information, if the ticket information is legal ticket information, the external service system sends a verification passing message to the interactive module, and if the ticket information is not legal ticket information, the external service system sends a verification failing message to the interactive module.

After the interactive module sends the ticket information to the external service system, whether the ticket information is legal ticket information can be determined by judging whether a verification passing message sent by the external service system is received or not.

It should be noted that, in this embodiment, the ticket mechanism is used to verify the identity of the user terminal, and in addition, mechanisms such as a white list may also be used to verify the identity of the user.
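The ticket mechanism can be sketched as follows. The patent does not specify how the external service system generates or checks a ticket; this sketch assumes, purely for illustration, an HMAC-SHA256 signature over the user ID and the preset character string, with a secret known only to the external service system. All class, function and constant names are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical preset character string added by the target system.
PRESET_STRING = "online-banking"


@dataclass(frozen=True)
class TicketInfo:
    user_id: str        # original ticket information: user ID
    preset_string: str  # original ticket information: preset character string
    ticket: str         # ticket issued by the external service system


class ExternalServiceSystem:
    """Hypothetical external service system that issues and verifies tickets."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret

    def _sign(self, user_id: str, preset_string: str) -> str:
        # Assumption: the ticket is an HMAC-SHA256 over the original ticket information.
        payload = f"{user_id}\n{preset_string}".encode("utf-8")
        return hmac.new(self._secret, payload, hashlib.sha256).hexdigest()

    def issue_ticket(self, user_id: str, preset_string: str) -> str:
        return self._sign(user_id, preset_string)

    def verify(self, info: TicketInfo) -> bool:
        expected = self._sign(info.user_id, info.preset_string)
        return hmac.compare_digest(expected, info.ticket)


def request_ticket(external: ExternalServiceSystem, user_id: str) -> TicketInfo:
    # The target system adds the preset string and forwards the request.
    ticket = external.issue_ticket(user_id, PRESET_STRING)
    return TicketInfo(user_id, PRESET_STRING, ticket)


def ticket_is_legal(external: ExternalServiceSystem, info: TicketInfo) -> bool:
    # The interaction module forwards the ticket information and treats a
    # verification passing message (True here) as proof that it is legal.
    return external.verify(info)


external = ExternalServiceSystem(secret=b"demo-secret")
info = request_ticket(external, "user-001")
forged = TicketInfo("user-002", PRESET_STRING, info.ticket)
```

In this sketch `ticket_is_legal(external, info)` passes, while the forged ticket information, which reuses the ticket with a different user ID, is rejected.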

Optionally, on the basis of this embodiment, acquiring the service type corresponding to the user voice includes:

1) sending the user voice to a voice recognition module;

2) receiving a voice recognition result sent by a voice recognition module;

After the user voice is sent to the voice recognition module, the voice recognition module converts the voice into text to obtain the voice recognition result. For example, if the user says "I want to make an interbank transfer", the voice recognition module recognizes the text "I want to make an interbank transfer".

3) Sending the voice recognition result to an intention recognition module;

4) and receiving the service type corresponding to the voice recognition result fed back by the intention recognition module.

After receiving the voice recognition result, the intention recognition module derives the service type through word segmentation, part-of-speech tagging, voice dependence analysis, deep voice classification and similar techniques. For example, if the intention recognition module receives the text "I want to make an interbank transfer", the service type obtained through analysis is "interbank transfer".
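The four sub-steps for obtaining the service type can be sketched as follows. The intention recognition here is a simple keyword lookup, which is a deliberate simplification and not the segmentation and classification pipeline described above; the keyword table, the service-type labels and the function names are assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical keyword table mapping service types to trigger phrases.
SERVICE_KEYWORDS = {
    "interbank transfer": ("interbank", "another bank"),
    "withdrawal": ("withdraw", "cash"),
}


def toy_speech_to_text(user_voice: bytes) -> str:
    # Hypothetical recognizer: the "audio" is simply UTF-8 text in this sketch.
    return user_voice.decode("utf-8")


def recognize_intent(recognized_text: str) -> Optional[str]:
    # Keyword matching stands in for the intention recognition module.
    text = recognized_text.lower()
    for service_type, keywords in SERVICE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return service_type
    return None


def get_service_type(
    user_voice: bytes,
    speech_to_text: Callable[[bytes], str],
    intent: Callable[[str], Optional[str]],
) -> Optional[str]:
    # Steps 1 and 2: send the user voice to the recognizer, receive the text.
    recognized_text = speech_to_text(user_voice)
    # Steps 3 and 4: send the text to intention recognition, receive the service type.
    return intent(recognized_text)


service_type = get_service_type(
    b"I want to make an interbank transfer", toy_speech_to_text, recognize_intent
)
```

Returning `None` for unrecognized requests lets the caller decide how to respond, for example by asking the user to repeat the instruction.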

S23, searching function entry information and a service ID corresponding to the service type;

Specifically, the database of the interaction module stores the function entry information and service IDs corresponding to different service types, so that once a service type is determined, the corresponding function entry information and service ID can be found. The function entry information may be a uniform resource locator (URL).

And S24, sending the function entrance information and the service ID to the user terminal.

After receiving the function entry information and the service ID, the user terminal can perform page rendering according to the function entry information to obtain a function page, and obtain a text to be translated according to a preset service flow corresponding to the service ID.
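Steps S23 and S24 amount to a table lookup followed by a reply to the user terminal. A minimal sketch follows, using an in-memory dictionary in place of the interaction module's database; the URLs, service IDs and response field names are invented for the example.

```python
from typing import NamedTuple, Optional


class FunctionEntry(NamedTuple):
    function_entry_url: str  # function entry information, here a URL
    service_id: str


# Hypothetical contents of the interaction module's database.
SERVICE_TABLE = {
    "interbank transfer": FunctionEntry(
        "https://bank.example/transfer/interbank", "SVC-TRANSFER-01"
    ),
    "withdrawal": FunctionEntry("https://bank.example/withdrawal", "SVC-WITHDRAW-01"),
}


def find_function_entry(service_type: str) -> Optional[FunctionEntry]:
    # S23: look up the function entry information and service ID for the service type.
    return SERVICE_TABLE.get(service_type)


def build_terminal_response(service_type: str) -> dict:
    # S24: build the message that is sent to the user terminal.
    entry = find_function_entry(service_type)
    if entry is None:
        return {"status": "unknown_service_type"}
    return {
        "status": "ok",
        "function_entry": entry.function_entry_url,
        "service_id": entry.service_id,
    }


response = build_terminal_response("interbank transfer")
```

The terminal would then load the page behind `function_entry` and use `service_id` to select its preset service flow.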

Specifically, the user terminal is provided with a preset service flow, which is an interactive flow defined in XML format. The preset service flow comprises a plurality of service nodes, each of which has a task to be processed. After the user terminal obtains a service ID, it looks up the preset service flow corresponding to that service ID and retrieves from it the information of the node to be processed, which includes the text to be translated.
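How a terminal might read the text to be translated from an XML-defined flow can be sketched as follows. The patent only states that the flow is defined in XML; the element names (`flows`, `flow`, `node`) and attributes (`serviceId`, `id`, `prompt`) used here are assumptions, as is the way completed nodes are tracked.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set

# Hypothetical preset service flow; the schema is invented for this sketch.
FLOW_XML = """
<flows>
  <flow serviceId="SVC-TRANSFER-01">
    <node id="payee_bank" prompt="Is the payee account with this bank or another bank?"/>
    <node id="amount" prompt="Please say the transfer amount."/>
  </flow>
</flows>
"""


def next_text_to_be_translated(
    flow_xml: str, service_id: str, completed: Set[str]
) -> Optional[str]:
    root = ET.fromstring(flow_xml.strip())
    for flow in root.findall("flow"):
        if flow.get("serviceId") != service_id:
            continue
        # The node to be processed is the first node not yet completed.
        for node in flow.findall("node"):
            if node.get("id") not in completed:
                return node.get("prompt")
    return None  # Unknown service ID, or every node is already completed.


first_prompt = next_text_to_be_translated(FLOW_XML, "SVC-TRANSFER-01", set())
second_prompt = next_text_to_be_translated(FLOW_XML, "SVC-TRANSFER-01", {"payee_bank"})
```

Each returned prompt is what the terminal would send to the interaction module as the text to be translated.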

It should be noted that when the user terminal renders the page according to the function entry information to obtain the function page, it uses a Hypertext Transfer Protocol (HTTP) service in the representational state transfer (REST) style.

In this embodiment, the service type corresponding to the user voice can be obtained according to the voice instruction input by the user, and then the function entry information and the service ID corresponding to the service type are searched for and sent to the user terminal, so that the user terminal renders a page and obtains a text to be translated.

Optionally, another embodiment of the present invention provides a speech processing apparatus, which is applied to an interaction module, and referring to fig. 3, the speech processing apparatus includes:

the first receiving module 101 is configured to receive a text to be translated sent by a user terminal when the user terminal displays a functional page corresponding to a banking service to be transacted; the text to be translated is obtained according to information of a node to be processed in a preset business process corresponding to a business account ID of the banking business;

the first sending module 102 is configured to send a text to be translated to the speech synthesis module;

the second receiving module 103 is configured to receive the voice message fed back by the voice synthesis module;

and a second sending module 104, configured to send the voice message to the user terminal, so that the user terminal plays the voice message.

Optionally, on the basis of this embodiment, the method further includes:

It should be noted that, for the working process of each module in this embodiment, please refer to the corresponding description in the above embodiments, which is not described herein again.

Optionally, in addition to the embodiment of the speech processing apparatus, with reference to fig. 4, the speech processing apparatus further includes:

a third receiving module 105, configured to receive, when the user terminal displays a functional page corresponding to a banking service to be transacted, a voice instruction sent by the user terminal before the first receiving module 101 receives a text to be translated sent by the user terminal; wherein, the voice instruction carries user voice and ticket information;

the obtaining module 106 is configured to obtain a service type corresponding to the user voice when the ticket information is legal ticket information;

a searching module 107, configured to search for function entry information and a service ID corresponding to a service type;

the third sending module 108 is configured to send the function entry information and the service ID to the user terminal, so that the user terminal renders a page according to the function entry information to obtain a function page, and obtains a text to be translated according to a preset service flow corresponding to the service ID.

Optionally, the voice processing apparatus further includes a determining module, where the determining module is configured to, when determining whether the ticket information is legal ticket information, specifically:

sending the ticket information to an external service system;

Optionally, further, when the obtaining module 106 is configured to obtain a service type corresponding to the user voice, the obtaining module is specifically configured to:

sending the user voice to a voice recognition module;

receiving a voice recognition result sent by a voice recognition module;

sending the voice recognition result to an intention recognition module;

Optionally, another embodiment of the present invention provides a server, including a sending port and a receiving port;

the receiving port is used for receiving the text to be translated sent by the user terminal and receiving the voice message fed back by the voice synthesis module when the user terminal displays the functional page corresponding to the banking business to be transacted; the text to be translated is obtained according to information of a node to be processed in a preset business process corresponding to a business account ID of the banking business;

and the sending port is used for sending the text to be translated to the voice synthesis module and sending the voice message to the user terminal so that the user terminal can play the voice message.

Optionally, on the basis of the embodiment of the server, the server further includes a processor;

the receiving port is also used for receiving a voice instruction sent by the user terminal before receiving the text to be translated sent by the user terminal when the user terminal displays the functional page corresponding to the banking business to be transacted; wherein, the voice instruction carries user voice and ticket information;

the processor is used for acquiring the service type corresponding to the voice of the user and searching the function entry information and the service ID corresponding to the service type when the ticket information is legal ticket information;

and the sending port is also used for sending the function entry information and the service ID to the user terminal so that the user terminal can render the page according to the function entry information to obtain a function page and obtain the text to be translated according to a preset service flow corresponding to the service ID.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A voice processing method is applied to an interaction module and comprises the following steps:

receiving a voice instruction sent by a user terminal; wherein, the voice instruction carries user voice and ticket information;

searching for function entry information and a service ID corresponding to the service type;

sending the function entry information and the service ID to the user terminal so that the user terminal renders a page according to the function entry information to obtain a function page, and obtains a text to be translated according to a preset service flow corresponding to the service ID;

when the user terminal displays a functional page corresponding to the banking business to be transacted, receiving a text to be translated sent by the user terminal; the text to be translated is obtained according to the information of the node to be processed in the preset business process corresponding to the business account ID of the banking business;

sending the text to be translated to a voice synthesis module;

receiving the voice message fed back by the voice synthesis module;

2. The voice processing method according to claim 1, wherein sending the voice message to the user terminal so that the user terminal plays the voice message further comprises:

sending the user input voice to the voice recognition module;

receiving a recognition result sent by the voice recognition module;

3. The voice processing method according to claim 1, wherein determining whether the ticket information is valid ticket information comprises:

sending the ticket information to an external service system;

4. The voice processing method according to claim 1, wherein obtaining the service type corresponding to the user voice comprises:

sending the user voice to a voice recognition module;

receiving a voice recognition result sent by the voice recognition module;

sending the voice recognition result to an intention recognition module;

5. A speech processing device applied to an interaction module comprises:

the second sending module is used for sending the voice message to the user terminal so as to enable the user terminal to play the voice message;

the searching module is used for searching the function entry information and the service ID corresponding to the service type;

6. The speech processing apparatus according to claim 5, further comprising:

7. The speech processing apparatus according to claim 5, further comprising a determining module, where the determining module is configured to, when determining whether the ticket information is legal ticket information, specifically:

sending the ticket information to an external service system;

8. The speech processing apparatus according to claim 5, wherein the obtaining module, when obtaining the service type corresponding to the user speech, is specifically configured to:

sending the user voice to a voice recognition module;

receiving a voice recognition result sent by the voice recognition module;

sending the voice recognition result to an intention recognition module;

9. A server is characterized by comprising a sending port, a receiving port and a processor;

the sending port is used for sending the text to be translated to the voice synthesis module and sending the voice message to the user terminal so that the user terminal can play the voice message;

the processor is used for acquiring a service type corresponding to the user voice and searching for function entry information and a service ID corresponding to the service type when the ticket information is legal ticket information;