CN108960836A

CN108960836A - Voice payment method, apparatus and system

Info

Publication number: CN108960836A
Application number: CN201711450685.XA
Authority: CN
Inventors: 李想; 吴本谷; 李宝祥; 王晓鹏
Original assignee: Beijing Orion Star Technology Co Ltd
Current assignee: Beijing Orion Star Technology Co Ltd
Priority date: 2017-12-27
Filing date: 2017-12-27
Publication date: 2018-12-07
Anticipated expiration: 2037-12-27
Also published as: CN108960836B

Abstract

The present invention provides a voice payment method, apparatus, and system. The method includes: after a payment instruction is received, sending payment read-along content, which includes some or all of the read-along words and some or all of the digits in the registration read-along content; receiving a payment voice signal and, when the payment voice signal matches the payment read-along content, determining the probabilities that the word speech portion and the digit speech portion of the payment voice signal each belong to each voiceprint account, and thereby determining the to-be-paid voiceprint account that matches the payment voice signal; and performing a payment operation on the to-be-paid voiceprint account according to the payment instruction. Computing these probabilities separately, and using the read-along words and digits from the registration read-along content as the payment read-along content, improves the recognition accuracy for the payment voice signal. Because the payment read-along content includes digits, the method also avoids the possibility of another person using a recording of the user to make a voice payment, which improves the security of voice payment.

Description

Voice payment method, apparatus and system

Technical field

The present invention relates to the technical field of voice devices, and in particular to a voice payment method, apparatus, and system.

Background art

With current voice devices, such as smart speakers, voice payment mainly proceeds as follows: the backend device corresponding to the voice device sends preset read-along content to the voice device, obtains the voice signal captured by the voice device, inputs the voice signal into a preset recognition model, and obtains the corresponding voiceprint account to make the payment. In this method, however, the recognition model is trained on randomly collected user voice signals, which correlate poorly with the read-along content. As a result, the recognition accuracy of the model is low, and it is difficult to prevent another person from using a recording of the user to make a voice payment, which reduces the security of voice payment.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art.

To this end, a first object of the present invention is to provide a voice payment method that addresses the poor security of voice payment in the prior art.

A second object of the present invention is to provide a voice payment method.

A third object of the present invention is to provide a voice payment device.

A fourth object of the present invention is to provide a voice payment device.

A fifth object of the present invention is to provide a voice payment system.

A sixth object of the present invention is to provide an electronic device.

A seventh object of the present invention is to provide an electronic device.

An eighth object of the present invention is to provide a non-transitory computer-readable storage medium.

A ninth object of the present invention is to provide a non-transitory computer-readable storage medium.

A tenth object of the present invention is to provide a computer program product.

An eleventh object of the present invention is to provide a computer program product.

To achieve the above objects, an embodiment of the first aspect of the present invention provides a voice payment method, comprising:

after receiving a payment instruction, sending payment read-along content, wherein the payment read-along content includes some or all of the read-along words in the registration read-along content and some or all of the digits in the registration read-along content;

receiving a payment voice signal and, when the payment voice signal matches the payment read-along content, determining a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account;

determining, according to the first probability and the second probability, the to-be-paid voiceprint account that matches the payment voice signal; and

performing a payment operation on the to-be-paid voiceprint account according to the payment instruction.

Further, the method also includes:

after receiving a registration instruction, sending registration read-along content; and

receiving a registration voice signal and, when the registration voice signal matches the registration read-along content, creating a voiceprint account according to the registration instruction.

Further, the method also includes:

training a word recognition model according to the word speech portion of the registration voice signal and the corresponding voiceprint account, wherein the word recognition model is used to determine the first probability that the word speech portion of the payment voice signal belongs to each voiceprint account; and

training a digit recognition model according to the digit speech portion of the registration voice signal and the corresponding voiceprint account, wherein the digit recognition model is used to determine the second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account.

Further, the digits in the registration read-along content include all single-digit integers.

Further, determining, according to the first probability and the second probability, the to-be-paid voiceprint account that matches the payment voice signal comprises:

for each voiceprint account, taking the weighted sum of the corresponding first probability and the corresponding second probability as the probability that the payment voice signal belongs to that voiceprint account; and

when the probability that the payment voice signal belongs to a voiceprint account meets a preset probability threshold, and exactly one voiceprint account meets the preset probability threshold, determining that voiceprint account to be the to-be-paid voiceprint account that matches the payment voice signal.

Further, before performing the payment operation on the to-be-paid voiceprint account according to the payment instruction, the method also includes:

obtaining the user account that is currently logged in; and

judging whether the to-be-paid voiceprint account has payment permission for the user account.

Performing the payment operation on the to-be-paid voiceprint account according to the payment instruction comprises:

when the to-be-paid voiceprint account has payment permission for the user account, performing the payment operation on the user account according to the payment instruction.

In the voice payment method provided by this embodiment, payment read-along content is sent after a payment instruction is received; the payment read-along content includes some or all of the read-along words and some or all of the digits in the registration read-along content. A payment voice signal is received and, when it matches the payment read-along content, a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion belongs to each voiceprint account are determined. The to-be-paid voiceprint account that matches the payment voice signal is determined according to the first and second probabilities, and a payment operation is performed on that account according to the payment instruction. Computing the probabilities for the word speech portion and the digit speech portion separately, and using the read-along words and digits from the registration read-along content as the payment read-along content, improves the recognition accuracy for the payment voice signal and hence the accuracy of voice payment. Because the payment read-along content includes digits, the method also avoids the possibility of another person using a recording of the user to make a voice payment, which improves the security of voice payment.

To achieve the above objects, an embodiment of the second aspect of the present invention provides a voice payment method, comprising:

reporting a payment instruction;

receiving payment read-along content, wherein the payment read-along content includes some or all of the read-along words in the registration read-along content and some or all of the digits in the registration read-along content; and

obtaining a payment voice signal and reporting the payment voice signal.

Further, the method also includes:

reporting a registration instruction;

receiving registration read-along content; and

obtaining a registration voice signal and reporting the registration voice signal.

In the voice payment method provided by this embodiment, a payment instruction is reported; payment read-along content is received, which includes some or all of the read-along words and some or all of the digits in the registration read-along content; and a payment voice signal is obtained and reported. The backend server can then, when the payment voice signal matches the payment read-along content, determine a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion belongs to each voiceprint account; determine, according to the first and second probabilities, the to-be-paid voiceprint account that matches the payment voice signal; and perform a payment operation on that account according to the payment instruction. Using the read-along words and digits from the registration read-along content as the payment read-along content improves the recognition accuracy for the payment voice signal and hence the accuracy of voice payment. Because the payment read-along content includes digits, the method also avoids the possibility of another person using a recording of the user to make a voice payment, which improves the security of voice payment.

To achieve the above objects, an embodiment of the third aspect of the present invention provides a voice payment device, comprising:

a sending module, configured to send payment read-along content after a payment instruction is received, wherein the payment read-along content includes some or all of the read-along words in the registration read-along content and some or all of the digits in the registration read-along content;

a determining module, configured to receive a payment voice signal and, when the payment voice signal matches the payment read-along content, determine a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account;

the determining module being further configured to determine, according to the first probability and the second probability, the to-be-paid voiceprint account that matches the payment voice signal; and

a payment module, configured to perform a payment operation on the to-be-paid voiceprint account according to the payment instruction.

Further, the device also comprises a creation module;

the sending module is further configured to send registration read-along content after a registration instruction is received;

the creation module is configured to receive a registration voice signal and, when the registration voice signal matches the registration read-along content, create a voiceprint account according to the registration instruction.

Further, the device also comprises a training module;

the training module is configured to train a word recognition model according to the word speech portion of the registration voice signal and the corresponding voiceprint account, wherein the word recognition model is used to determine the first probability that the word speech portion of the payment voice signal belongs to each voiceprint account;

the training module is further configured to train a digit recognition model according to the digit speech portion of the registration voice signal and the corresponding voiceprint account, wherein the digit recognition model is used to determine the second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account.

Further, the determining module is specifically configured to,

Further, the device also comprises an obtaining module and a judging module;

the obtaining module is configured to obtain the user account that is currently logged in;

the judging module is configured to judge whether the to-be-paid voiceprint account has payment permission for the user account;

the payment module is specifically configured to perform the payment operation on the user account according to the payment instruction when the to-be-paid voiceprint account has payment permission for the user account.

The voice payment device provided by this embodiment sends payment read-along content after a payment instruction is received; the payment read-along content includes some or all of the read-along words and some or all of the digits in the registration read-along content. The device receives a payment voice signal and, when it matches the payment read-along content, determines a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion belongs to each voiceprint account. It determines the to-be-paid voiceprint account that matches the payment voice signal according to the first and second probabilities, and performs a payment operation on that account according to the payment instruction. Computing the probabilities for the word speech portion and the digit speech portion separately, and using the read-along words and digits from the registration read-along content as the payment read-along content, improves the recognition accuracy for the payment voice signal and hence the accuracy of voice payment. Because the payment read-along content includes digits, the device also avoids the possibility of another person using a recording of the user to make a voice payment, which improves the security of voice payment.

To achieve the above objects, an embodiment of the fourth aspect of the present invention provides a voice payment device, comprising:

a reporting module, configured to report a payment instruction;

a receiving module, configured to receive payment read-along content, wherein the payment read-along content includes some or all of the read-along words in the registration read-along content and some or all of the digits in the registration read-along content;

the reporting module being further configured to obtain a payment voice signal and report the payment voice signal.

Further, the reporting module is also configured to report a registration instruction;

the receiving module is also configured to receive registration read-along content;

the reporting module is also configured to obtain a registration voice signal and report the registration voice signal.

The voice payment device provided by this embodiment reports a payment instruction; receives payment read-along content, which includes some or all of the read-along words and some or all of the digits in the registration read-along content; and obtains and reports a payment voice signal. The backend server can then, when the payment voice signal matches the payment read-along content, determine a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion belongs to each voiceprint account; determine, according to the first and second probabilities, the to-be-paid voiceprint account that matches the payment voice signal; and perform a payment operation on that account according to the payment instruction. Using the read-along words and digits from the registration read-along content as the payment read-along content improves the recognition accuracy for the payment voice signal and hence the accuracy of voice payment. Because the payment read-along content includes digits, the device also avoids the possibility of another person using a recording of the user to make a voice payment, which improves the security of voice payment.

To achieve the above objects, an embodiment of the fifth aspect of the present invention provides a voice payment system, comprising:

a voice device, and a backend server connected to the voice device;

the backend server is configured to send payment read-along content to the voice device after receiving a payment instruction sent by the voice device, wherein the payment read-along content includes some or all of the read-along words in the registration read-along content and some or all of the digits in the registration read-along content;

the backend server is further configured to receive the payment voice signal sent by the voice device and, when the payment voice signal matches the payment read-along content, determine a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account; determine, according to the first probability and the second probability, the to-be-paid voiceprint account that matches the payment voice signal; and perform a payment operation on the to-be-paid voiceprint account according to the payment instruction.

Further, the backend server is also configured to:

after receiving a registration instruction sent by the voice device, send registration read-along content to the voice device; and

receive the registration voice signal sent by the voice device and, when the registration voice signal matches the registration read-along content, create a voiceprint account according to the registration instruction.

Further, the backend server is also configured to,

Further, the backend server is specifically configured to,

obtain the user account that is currently logged in;

To achieve the above objects, an embodiment of the sixth aspect of the present invention provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the voice payment method described in the embodiment of the first aspect.

To achieve the above objects, an embodiment of the seventh aspect of the present invention provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the voice payment method described in the embodiment of the second aspect.

To achieve the above objects, an embodiment of the eighth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the voice payment method described in the embodiment of the first aspect.

To achieve the above objects, an embodiment of the ninth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the voice payment method described in the embodiment of the second aspect.

To achieve the above objects, an embodiment of the tenth aspect of the present invention provides a computer program product; when the instructions in the computer program product are executed by a processor, the voice payment method described in the embodiment of the first aspect is performed.

To achieve the above objects, an embodiment of the eleventh aspect of the present invention provides a computer program product; when the instructions in the computer program product are executed by a processor, the voice payment method described in the embodiment of the second aspect is performed.

Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a flowchart of a voice payment method provided by an embodiment of the present invention;

Fig. 2 is a flowchart of another voice payment method provided by an embodiment of the present invention;

Fig. 3 is a flowchart of another voice payment method provided by an embodiment of the present invention;

Fig. 4 is a structural diagram of a voice payment device provided by an embodiment of the present invention;

Fig. 5 is a structural diagram of another voice payment device provided by an embodiment of the present invention;

Fig. 6 is a structural diagram of another voice payment device provided by an embodiment of the present invention;

Fig. 7 is a structural diagram of a voice payment system provided by an embodiment of the present invention;

Fig. 8 is a structural diagram of an electronic device provided by an embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.

The voice payment method, apparatus, and system of the embodiments of the present invention are described below with reference to the drawings.

Fig. 1 is a flowchart of a voice payment method provided by an embodiment of the present invention. As shown in Fig. 1, the voice payment method includes the following steps:

S101: After receiving a payment instruction, send payment read-along content. The payment read-along content includes some or all of the read-along words in the registration read-along content and some or all of the digits in the registration read-along content.

The voice payment method provided by the present invention is executed by a voice payment device, which may be the backend server corresponding to a voice device, or the voice device itself. The voice device may be, for example, a smart speaker, smart air conditioner, smart washing machine, smart television, or any other device that can interact with a user by voice and perform operations according to the user's instructions.

In this embodiment, when the voice payment device is the backend server corresponding to the voice device, the payment instruction may be obtained as follows: while interacting with the user, the voice device captures the user's payment voice instruction and sends it directly to the backend server; alternatively, the voice device performs speech recognition on the payment voice instruction and sends the resulting text content to the backend server.

In this embodiment, when the voice payment device is the voice device itself, the payment instruction may be the user's payment voice instruction captured while the voice device interacts with the user, or the text content obtained by performing speech recognition on that instruction. A payment voice instruction is a voice instruction that contains a payment-related word, such as "pay", "payment", or "pay a fee"; the payment-related words can be configured as needed.
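The check for a payment-related word in recognized text can be sketched as follows. This is a minimal illustration that assumes the instruction is already available as text; the keyword list and the function name `is_payment_instruction` are hypothetical, and plain substring matching is used only for brevity.

```python
# Hypothetical, configurable keyword list; the description only states that
# payment-related words can be set as needed.
PAYMENT_KEYWORDS = ("pay", "payment", "pay a fee")


def is_payment_instruction(text, keywords=PAYMENT_KEYWORDS):
    """Return True if the recognized text contains any payment-related word."""
    normalized = text.lower()
    return any(keyword in normalized for keyword in keywords)
```

A production system would likely match on whole words or intents rather than substrings, since substring matching also accepts words such as "repay".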

In this embodiment, the registration read-along content may be pre-stored in the voice payment device. After receiving a payment instruction, the voice payment device may select some or all of the read-along words and some or all of the digits from the registration read-along content, and use the selected read-along words and digits as the payment read-along content. To further reduce the possibility of another person using a recording of the user to make a voice payment, the voice payment device may randomly combine the selected digits and/or randomly combine the selected read-along words to obtain the payment read-along content. The digits in the registration read-along content may include all single-digit integers.
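One way to build the payment read-along content from the registration read-along content is sketched below. The function name `build_payment_prompt`, its parameters, and the dictionary layout are assumptions made for illustration; the description does not prescribe how many words or digits are selected.

```python
import random


def build_payment_prompt(reg_words, reg_digits, n_words, n_digits, rng=None):
    """Select some registered words and digits in a random order.

    random.Random.sample draws without replacement and returns the picks in
    random order, which covers both the selection and the random combination.
    """
    if rng is None:
        rng = random.Random()
    words = rng.sample(list(reg_words), k=min(n_words, len(reg_words)))
    digits = rng.sample(list(reg_digits), k=min(n_digits, len(reg_digits)))
    return {"words": words, "digits": digits}
```

Because the digit sequence changes on every request, a recording captured during an earlier payment is unlikely to match a later prompt.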

In this embodiment, when the voice payment device is the backend server corresponding to the voice device, the voice payment device may send the payment read-along content to the voice device, so that the voice device plays it to the user through a loudspeaker or similar component. Alternatively, when the voice device has a display screen, the payment read-along content may be shown on that screen; or, when the voice device communicates with another smart device that has a display screen, the payment read-along content may be shown on the screen of that smart device.

In this embodiment, when the voice payment device is the voice device itself, the voice payment device may play the payment read-along content to the user through a loudspeaker or similar component. Alternatively, when the voice payment device has a display screen, the payment read-along content may be shown on that screen; or, when the voice payment device communicates with another smart device that has a display screen, the payment read-along content may be shown on the screen of that smart device. After the user reads the payment read-along content aloud, the voice payment device may control a microphone or similar component to capture the user's payment voice signal.

S102: Receive a payment voice signal and, when the payment voice signal matches the payment read-along content, determine a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account.

In this embodiment, after receiving the payment voice signal, the voice payment device may perform speech recognition on it to obtain the corresponding text content, compare the text content with the payment read-along content, and judge whether the two match. If the text content does not match the payment read-along content, no voice payment operation is performed.
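The comparison between the recognized text and the payment read-along content can be sketched as a token-level equality check. The helper names `normalize` and `matches_prompt` are hypothetical, and exact matching after normalization is an assumption; the description does not say how strict the match must be.

```python
import re


def normalize(text):
    """Lower-case the text, drop punctuation, and split it into tokens."""
    return re.sub(r"[^\w]+", " ", text.lower()).split()


def matches_prompt(recognized_text, prompt_text):
    """Return True when both texts contain the same tokens in the same order."""
    return normalize(recognized_text) == normalize(prompt_text)
```

Order matters in this sketch, so reading the right digits in the wrong sequence is rejected.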

In this embodiment, when the text content matches the payment read-along content, the voice payment device may segment the payment voice signal to obtain its word speech portion and digit speech portion. For the word speech portion, the voice payment device may input it into a pre-trained word recognition model and obtain, as the model's output, the first probability that the word speech portion belongs to each voiceprint account; the word recognition model may be trained on the voice signals of each user and the corresponding voiceprint accounts. Alternatively, the voice payment device may compare the word speech portion with the pre-stored voice signals of each user to determine the first probability that the word speech portion belongs to each voiceprint account.
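The segmentation step can be sketched under the assumption that the speech recognizer also returns a time span for each recognized token; the description does not specify how segmentation is done. The tuple layout `(text, start, end)` and the name `split_segments` are hypothetical.

```python
def split_segments(aligned_tokens):
    """Split time-aligned tokens into word spans and digit spans.

    aligned_tokens: iterable of (text, start_seconds, end_seconds) tuples,
    assumed to come from a recognizer that provides token timestamps.
    """
    word_spans, digit_spans = [], []
    for text, start, end in aligned_tokens:
        if text.isdigit():
            digit_spans.append((start, end))
        else:
            word_spans.append((start, end))
    return word_spans, digit_spans
```

The resulting spans can then be used to cut the audio into the word speech portion and the digit speech portion before each is scored.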

For the digit speech portion, the voice payment device may input it into a pre-trained digit recognition model and obtain, as the model's output, the second probability that the digit speech portion belongs to each voiceprint account; the digit recognition model may be trained on the voice signals of each user and the corresponding voiceprint accounts. Alternatively, the voice payment device may compare the digit speech portion with the pre-stored voice signals of each user to determine the second probability that the digit speech portion belongs to each voiceprint account.
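The comparison-based alternative can be illustrated with a simple similarity score. This sketch assumes each speech portion and each enrolled user has already been reduced to a fixed-length feature vector, and it uses cosine similarity rescaled to the range 0 to 1 as a stand-in for a probability; the description names neither the features nor the comparison measure, so both are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def account_scores(segment_vec, enrolled):
    """Score one speech portion against every enrolled voiceprint account.

    enrolled: dict mapping a voiceprint account id to its stored vector.
    Each score is independent, so several accounts may score highly.
    """
    return {acct: (1.0 + cosine(segment_vec, vec)) / 2.0
            for acct, vec in enrolled.items()}
```

Calling `account_scores` once with the word speech vector and once with the digit speech vector yields the two per-account score tables used in the next step.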

S103: Determine, according to the first probability and the second probability, the to-be-paid voiceprint account that matches the payment voice signal.

In this embodiment, the voice payment device may execute step 103 as follows: for each voiceprint account, take the weighted sum of the corresponding first probability and the corresponding second probability as the probability that the payment voice signal belongs to that voiceprint account; when the probability that the payment voice signal belongs to a voiceprint account meets a preset probability threshold, and exactly one voiceprint account meets the preset probability threshold, determine that voiceprint account to be the to-be-paid voiceprint account that matches the payment voice signal.
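Step 103 can be sketched as below. The weights, the threshold value, and the reading of "meets the threshold" as greater than or equal are illustrative assumptions; the description gives no concrete values.

```python
def match_account(first_probs, second_probs,
                  w_word=0.5, w_digit=0.5, threshold=0.8):
    """Return the single account whose weighted score meets the threshold.

    first_probs / second_probs: dicts mapping a voiceprint account id to the
    word-portion and digit-portion probabilities. Returns None when no
    account, or more than one account, meets the threshold.
    """
    combined = {
        acct: w_word * first_probs[acct] + w_digit * second_probs.get(acct, 0.0)
        for acct in first_probs
    }
    candidates = [acct for acct, p in combined.items() if p >= threshold]
    return candidates[0] if len(candidates) == 1 else None
```

Returning `None` for an ambiguous result keeps the payment from being charged to the wrong account when two enrolled voices are similar.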

In this embodiment, when two or more voiceprint accounts meet the preset probability threshold, the voice payment device does not perform the payment operation, so as to avoid matching the wrong to-be-paid voiceprint account. Further, the voice payment device may resend the payment read-along content, then receive and evaluate a new payment voice signal, until the number of times the payment read-along content has been sent exceeds a preset threshold.
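The resend behavior can be sketched as a bounded loop. The callables `send_prompt` and `receive_and_match` are hypothetical placeholders for the device-specific steps, and `max_sends` stands in for the preset threshold on the number of sends.

```python
def pay_with_retries(send_prompt, receive_and_match, max_sends=3):
    """Send the prompt up to max_sends times until one account is matched.

    send_prompt: callable that sends (new) payment read-along content.
    receive_and_match: callable that returns a matched account id or None.
    """
    for _ in range(max_sends):
        send_prompt()
        account = receive_and_match()
        if account is not None:
            return account
    return None  # give up without paying
```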

S104: Perform a payment operation on the to-be-paid voiceprint account according to the payment instruction.

In this embodiment, before step 104, the method may also include: obtaining the user account that is currently logged in, and judging whether the to-be-paid voiceprint account has payment permission for that user account. Correspondingly, step 104 may be: when the to-be-paid voiceprint account has payment permission for the user account, perform the payment operation on the user account according to the payment instruction.

It should be noted that voice payment device can will be paid in the case where voice payment device is speech ciphering equipment Instruction is sent to the corresponding background server of speech ciphering equipment, so that background server is according to payment instruction to vocal print account to be paid Carry out delivery operation.

In this embodiment, one user account may correspond to multiple voiceprint accounts, and the user who owns the user account may set the payment permission of each of those voiceprint accounts. For example, in a home scenario, each family member may have a voiceprint account; the user account may be registered by one family member, who may then configure the payment permissions of the other family members.
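The relationship between a user account and its voiceprint accounts, and the permission check performed before payment, can be sketched as follows. The class name, field names and the default of "no permission unless granted" are assumptions for illustration, not details given in the specification.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UserAccount:
    """A logged-in user account that owns several voiceprint accounts."""
    user_id: str
    # Maps voiceprint account id -> whether it may pay from this user account.
    payment_permissions: Dict[str, bool] = field(default_factory=dict)

    def set_permission(self, voiceprint_id: str, allowed: bool) -> None:
        self.payment_permissions[voiceprint_id] = allowed

    def can_pay(self, voiceprint_id: str) -> bool:
        # Assumption: unknown voiceprint accounts have no payment permission.
        return self.payment_permissions.get(voiceprint_id, False)


def authorize_payment(logged_in: UserAccount, voiceprint_id: str) -> bool:
    """Check that the matched voiceprint account may pay from the logged-in account."""
    return logged_in.can_pay(voiceprint_id)
```

In the home scenario, the registering family member would call `set_permission` once per family member, and the payment step would proceed only when `authorize_payment` returns true.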

The voice payment method provided in this embodiment calculates both the probability that the word speech part belongs to each voiceprint account and the probability that the digit speech part belongs to each voiceprint account, and uses read-along words and digits from the registration read-along content as the payment read-along content. This improves the recognition accuracy of the payment voice signal and thus the accuracy of voice payment. Because the payment read-along content includes digits, the possibility that another person misuses a recording of the user to make a voice payment is reduced, which improves the security of voice payment.

Fig. 2 is a schematic flowchart of another voice payment method provided by an embodiment of the present invention. As shown in Fig. 2, on the basis of the embodiment shown in Fig. 1, the method may further include a registration procedure:

S105, after receiving a registration instruction, sending registration read-along content.

In this embodiment, where the voice payment apparatus is the backend server corresponding to the voice device, the registration instruction may be obtained as follows: while interacting with the user, the voice device listens for and captures the user's registration voice instruction and sends it directly to the backend server; alternatively, the voice device performs speech recognition on the registration voice instruction to obtain text content and sends the text content to the backend server.

In this embodiment, where the voice payment apparatus is the voice device, the registration instruction may be the user's registration voice instruction captured by listening while the voice device interacts with the user, or the text content obtained by performing speech recognition on that voice instruction. The registration voice instruction may be a voice instruction that includes a registration-related word, such as "register", "open an account" or "activate"; the registration-related words may be set according to actual needs.
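Detecting a registration instruction in recognized text can be as simple as a keyword check. The sketch below assumes English keywords and plain substring matching; the function name and keyword list are illustrative only, and a real system could use any intent-detection technique.

```python
from typing import Iterable

# Assumed, configurable list of registration-related words.
REGISTRATION_WORDS = ("register", "open an account", "activate")


def is_registration_instruction(
    text: str, keywords: Iterable[str] = REGISTRATION_WORDS
) -> bool:
    """Return True if the recognized text contains any registration-related word."""
    normalized = text.lower()
    return any(keyword in normalized for keyword in keywords)
```

The same check, with a different keyword list, would serve for payment-related words.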

In this embodiment, the registration read-along content may be pre-stored in the voice payment apparatus.

In this embodiment, where the voice payment apparatus is the backend server corresponding to the voice device, the voice payment apparatus may send the registration read-along content to the voice device, so that the voice device plays it to the user through a loudspeaker or the like; alternatively, where the voice device has a display screen, the registration read-along content may be shown on that screen; alternatively, where the voice device communicates with another smart device that has a display screen, the registration read-along content may be shown on the display screen of that smart device.

In this embodiment, where the voice payment apparatus is the voice device, after receiving the registration instruction the voice payment apparatus may play the registration read-along content to the user through a loudspeaker or the like; alternatively, where the voice payment apparatus has a display screen, the registration read-along content may be shown on that screen; alternatively, where the voice payment apparatus communicates with another smart device that has a display screen, the registration read-along content may be shown on the display screen of that smart device. After the user reads the registration read-along content aloud, the voice payment apparatus may control a microphone or the like to capture the user's registration voice signal.

S106, receiving a registration voice signal, and when the registration voice signal matches the registration read-along content, creating a voiceprint account according to the registration instruction.

In this embodiment, after creating the voiceprint account, the voice payment apparatus may store the voiceprint account together with the corresponding registration voice signal. It should be noted that the voiceprint account created by the voice payment apparatus in this embodiment uniquely identifies a user and corresponds one-to-one with the registration voice signal.

S107, training a word recognition model according to the word speech part of the registration voice signal and the corresponding voiceprint account; the word recognition model is used to determine the first probability that the word speech part of a payment voice signal belongs to each voiceprint account.

In this embodiment, the registration voice signals obtained when a user reads the registration read-along content aloud several times differ from one another, and the word recognition model needs a large number of registration voice signals for training in order to be accurate. Therefore, during registration the voice payment apparatus may send the registration read-along content multiple times to obtain a large number of registration voice signals, and train the word recognition model on the voiceprint account and the word speech parts of those signals, thereby ensuring the recognition accuracy of the word recognition model.

S108, training a digit recognition model according to the digit speech part of the registration voice signal and the corresponding voiceprint account; the digit recognition model is used to determine the second probability that the digit speech part of a payment voice signal belongs to each voiceprint account.

In this embodiment, the voice payment apparatus may likewise train the digit recognition model on the voiceprint account and the digit speech parts of the corresponding large number of registration voice signals, thereby ensuring the recognition accuracy of the digit recognition model.
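The specification does not say what kind of model the word and digit recognition models are. Purely as a stand-in, the sketch below uses a very simple centroid model: each voiceprint account is represented by the mean of its enrollment feature vectors, and per-account probabilities are a softmax over cosine similarities. The class name, the feature vectors and the scoring rule are all assumptions; one instance would be trained on word speech parts and another on digit speech parts.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class CentroidSpeakerModel:
    """Toy per-account model; a stand-in for the unspecified recognition model."""

    def __init__(self) -> None:
        self._samples: Dict[str, List[List[float]]] = defaultdict(list)
        self._centroids: Dict[str, List[float]] = {}

    def add_sample(self, account_id: str, features: Sequence[float]) -> None:
        # Each repeated read-along during registration adds one more sample.
        self._samples[account_id].append(list(features))

    def fit(self) -> None:
        # The "training" here is just averaging the samples of each account.
        for account_id, rows in self._samples.items():
            dim = len(rows[0])
            self._centroids[account_id] = [
                sum(row[i] for row in rows) / len(rows) for i in range(dim)
            ]

    def predict_proba(self, features: Sequence[float]) -> Dict[str, float]:
        # Softmax over cosine similarities gives one probability per account.
        scores = {a: _cosine(features, c) for a, c in self._centroids.items()}
        peak = max(scores.values())
        exps = {a: math.exp(s - peak) for a, s in scores.items()}
        total = sum(exps.values())
        return {a: e / total for a, e in exps.items()}
```

`predict_proba` assumes `fit` has been called with at least one enrolled account; more enrollment samples per account give a more stable centroid, which mirrors the point about sending the read-along content several times.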

The voice payment method provided in this embodiment trains the word recognition model and the digit recognition model on the registration read-along content, which further improves the security of voice payment.

Fig. 3 is a schematic flowchart of another voice payment method provided by an embodiment of the present invention. As shown in Fig. 3, the voice payment method includes the following steps:

301, reporting a payment instruction.

The voice payment method provided by the present invention is executed by a voice payment apparatus, which may specifically be a voice device. The voice device may be, for example, a smart speaker, smart air conditioner, smart washing machine, smart television or other device that can interact with a user by voice and perform corresponding operations according to the user's instructions.

In this embodiment, the payment instruction may be the user's payment voice instruction captured by listening while the voice device interacts with the user, or the text content obtained by performing speech recognition on that voice instruction. The payment voice instruction may be a voice instruction that includes a payment-related word, such as "payment", "pay a fee" or "pay dues"; the payment-related words may be set according to actual needs.

In this embodiment, after obtaining the payment instruction, the voice payment apparatus may send it to the backend server. After receiving the payment instruction, the backend server may select some or all of the read-along words from the registration read-along content, select some or all of the digits from the registration read-along content, and use the selected read-along words and digits as the payment read-along content. Further, to further reduce the possibility that another person misuses a recording of the user to make a voice payment, the backend server may randomly combine the selected digits and/or randomly combine the selected read-along words to obtain the payment read-along content. The digits in the registration read-along content may include all single-digit integers.
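The construction of the payment read-along content can be sketched as follows. The function name, the word list and the choice to draw digits without repetition are assumptions for illustration; `random.Random.sample` both selects a subset and returns it in random order, which covers the optional random combination.

```python
import random
from typing import List, Optional, Sequence


def build_payment_prompt(
    registration_words: Sequence[str],
    registration_digits: Sequence[str],
    n_words: int,
    n_digits: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick a randomly ordered subset of registration words and digits."""
    rng = rng or random.Random()
    # sample() raises ValueError if more items are requested than exist.
    words = rng.sample(list(registration_words), n_words)
    digits = rng.sample(list(registration_digits), n_digits)
    return words + digits
```

Because the subset and its order change on each payment, a recording of an earlier payment is unlikely to match the next prompt. For real use, a cryptographically strong source such as `random.SystemRandom()` could be passed as `rng`.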

302, receiving payment read-along content; the payment read-along content includes some or all of the read-along words in the registration read-along content, and some or all of the digits in the registration read-along content.

In this embodiment, after receiving the payment read-along content, the voice payment apparatus may play it to the user through a loudspeaker or the like; alternatively, where the voice payment apparatus has a display screen, the payment read-along content may be shown on that screen; alternatively, where the voice payment apparatus communicates with another smart device that has a display screen, the payment read-along content may be shown on the display screen of that smart device. After the user reads the payment read-along content aloud, the voice payment apparatus may control a microphone or the like to capture the user's payment voice signal.

303, obtaining a payment voice signal and reporting the payment voice signal.

In this embodiment, after obtaining the payment voice signal, the voice payment apparatus may send it to the backend server for processing. After receiving the payment voice signal, the backend server may perform speech recognition on it to obtain the corresponding text content, compare the text content with the payment read-along content, and judge whether the two match. If the text content does not match the payment read-along content, no voice payment operation is performed.
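The comparison between the recognized text and the payment read-along content can be sketched as below. The specification does not define what counts as a match; this sketch assumes exact equality after lowercasing and removing spaces and punctuation, and the function name is illustrative.

```python
from typing import Sequence


def _normalize(text: str) -> str:
    # Keep only letters and digits so spacing and punctuation do not matter.
    return "".join(ch for ch in text.lower() if ch.isalnum())


def matches_prompt(recognized_text: str, prompt_tokens: Sequence[str]) -> bool:
    """Return True if the recognized text equals the prompt after normalization."""
    return _normalize(recognized_text) == _normalize("".join(prompt_tokens))
```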

In this embodiment, where the text content matches the payment read-along content, the backend server may split the payment voice signal to obtain its word speech part and its digit speech part. For the word speech part, the backend server may input it into a pre-trained word recognition model and obtain, as the model's output, the first probability that the word speech part belongs to each voiceprint account; the word recognition model may be trained on the voice signals of each user and the corresponding voiceprint accounts. For the digit speech part, the backend server may input it into a pre-trained digit recognition model and obtain, as the model's output, the second probability that the digit speech part belongs to each voiceprint account; the digit recognition model may be trained on the voice signals of each user and the corresponding voiceprint accounts.
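How the payment voice signal is split is not detailed in the specification. One possible approach, sketched here under the assumption that speech recognition returns each token with start and end times and transcribes digits as numerals, is to route each token's time span to the word part or the digit part. The `AlignedToken` type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds)


@dataclass
class AlignedToken:
    """A recognized token with its assumed time alignment in the audio."""
    text: str
    start: float
    end: float


def split_word_and_digit_parts(
    tokens: Sequence[AlignedToken],
) -> Tuple[List[Span], List[Span]]:
    """Return (word_spans, digit_spans) covering the two parts of the signal."""
    word_spans: List[Span] = []
    digit_spans: List[Span] = []
    for token in tokens:
        target = digit_spans if token.text.isdigit() else word_spans
        target.append((token.start, token.end))
    return word_spans, digit_spans
```

The audio samples inside the word spans would then go to the word recognition model, and those inside the digit spans to the digit recognition model.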

After obtaining the first probability that the word speech part belongs to each voiceprint account and the second probability that the digit speech part belongs to each voiceprint account, the backend server may, for each voiceprint account, take the weighted sum of the corresponding first probability and the corresponding second probability as the probability that the payment voice signal belongs to that voiceprint account. When that probability satisfies a preset probability threshold, and exactly one voiceprint account satisfies the threshold, that voiceprint account is determined to be the voiceprint account to be paid that matches the payment voice signal, and the payment operation is performed on the voiceprint account to be paid according to the payment instruction.

Further, on the basis of the above embodiment, the method may further include: reporting a registration instruction; receiving registration read-along content; and obtaining a registration voice signal and reporting the registration voice signal.

The registration instruction may be the user's registration voice instruction captured by listening while the voice device interacts with the user, or the text content obtained by performing speech recognition on that voice instruction. The registration voice instruction may be a voice instruction that includes a registration-related word, such as "register", "open an account" or "activate"; the registration-related words may be set according to actual needs.

In this embodiment, after obtaining the registration instruction, the voice payment apparatus may report it to the backend server, so that the backend server obtains the pre-stored registration read-along content and sends it to the voice payment apparatus.

In this embodiment, after receiving the registration read-along content sent by the backend server, the voice payment apparatus may play it to the user through a loudspeaker or the like; alternatively, where the voice payment apparatus has a display screen, the registration read-along content may be shown on that screen; alternatively, where the voice payment apparatus communicates with another smart device that has a display screen, the registration read-along content may be shown on the display screen of that smart device. The user then reads the registration read-along content aloud, and a registration voice signal is obtained.

In this embodiment, the voice payment apparatus may capture the user's registration voice signal through a microphone or the like and report it to the backend server, so that the backend server: creates a voiceprint account according to the registration instruction when the registration voice signal matches the registration read-along content; trains a word recognition model according to the word speech part of the registration voice signal and the corresponding voiceprint account, the word recognition model being used to determine the first probability that the word speech part of a payment voice signal belongs to each voiceprint account; and trains a digit recognition model according to the digit speech part of the registration voice signal and the corresponding voiceprint account, the digit recognition model being used to determine the second probability that the digit speech part of a payment voice signal belongs to each voiceprint account.

The voice payment method provided in this embodiment improves the recognition accuracy of the payment voice signal and thus the accuracy of voice payment. Because the payment read-along content includes digits, the possibility that another person misuses a recording of the user to make a voice payment is reduced, which improves the security of voice payment.

Fig. 4 is a schematic structural diagram of a voice payment apparatus provided by an embodiment of the present invention. As shown in Fig. 4, the apparatus includes a sending module 41, a determining module 42 and a payment module 43.

The sending module 41 is configured to send payment read-along content after receiving a payment instruction; the payment read-along content includes some or all of the read-along words in the registration read-along content, and some or all of the digits in the registration read-along content;

the determining module 42 is configured to receive a payment voice signal and, when the payment voice signal matches the payment read-along content, determine a first probability that the word speech part of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech part of the payment voice signal belongs to each voiceprint account;

the determining module 42 is further configured to determine, according to the first probability and the second probability, the voiceprint account to be paid that matches the payment voice signal;

the payment module 43 is configured to perform a payment operation on the voiceprint account to be paid according to the payment instruction.

The voice payment apparatus provided by the present invention may specifically be a voice device or the backend server corresponding to the voice device. The voice device may be, for example, a smart speaker, smart air conditioner, smart washing machine, smart television or other device that can interact with a user by voice and perform corresponding operations according to the user's instructions.

Where the voice payment apparatus is the backend server corresponding to the voice device, the payment instruction may be obtained as follows: while interacting with the user, the voice device listens for and captures the user's payment voice instruction and sends it directly to the backend server; alternatively, the voice device performs speech recognition on the payment voice instruction to obtain text content and sends the text content to the backend server.

Further, on the basis of the above embodiment, the determining module 42 is specifically configured to: for each voiceprint account, take the weighted sum of the corresponding first probability and the corresponding second probability as the probability that the payment voice signal belongs to that voiceprint account; and, when that probability satisfies a preset probability threshold and exactly one voiceprint account satisfies the threshold, determine that voiceprint account to be the voiceprint account to be paid that matches the payment voice signal.

Further, on the basis of the above embodiment, the apparatus may further include an obtaining module and a judging module;

the obtaining module is configured to obtain the currently logged-in user account;

The voice payment apparatus provided in this embodiment calculates both the probability that the word speech part belongs to each voiceprint account and the probability that the digit speech part belongs to each voiceprint account, and uses read-along words and digits from the registration read-along content as the payment read-along content. This improves the recognition accuracy of the payment voice signal and thus the accuracy of voice payment. Because the payment read-along content includes digits, the possibility that another person misuses a recording of the user to make a voice payment is reduced, which improves the security of voice payment.

Further, with reference to Fig. 5, on the basis of the embodiment shown in Fig. 4, the apparatus further includes a creation module 44 and a training module 45;

the sending module 41 is further configured to send registration read-along content after receiving a registration instruction;

the creation module 44 is configured to receive a registration voice signal and, when the registration voice signal matches the registration read-along content, create a voiceprint account according to the registration instruction;

the training module 45 is configured to train a word recognition model according to the word speech part of the registration voice signal and the corresponding voiceprint account; the word recognition model is used to determine the first probability that the word speech part of the payment voice signal belongs to each voiceprint account;

the training module 45 is further configured to train a digit recognition model according to the digit speech part of the registration voice signal and the corresponding voiceprint account; the digit recognition model is used to determine the second probability that the digit speech part of the payment voice signal belongs to each voiceprint account.

Where the voice payment apparatus is the backend server corresponding to the voice device, the registration instruction may be obtained as follows: while interacting with the user, the voice device listens for and captures the user's registration voice instruction and sends it directly to the backend server; alternatively, the voice device performs speech recognition on the registration voice instruction to obtain text content and sends the text content to the backend server.

In this embodiment, the registration read-along content may be pre-stored in the voice payment apparatus. Where the voice payment apparatus is the voice device, after receiving the registration instruction the voice payment apparatus may play the registration read-along content to the user through a loudspeaker or the like; alternatively, where the voice payment apparatus has a display screen, the registration read-along content may be shown on that screen; alternatively, where the voice payment apparatus communicates with another smart device that has a display screen, the registration read-along content may be shown on the display screen of that smart device. After the user reads the registration read-along content aloud, the voice payment apparatus may control a microphone or the like to capture the user's registration voice signal.

In this embodiment, the registration voice signals obtained when a user reads the registration read-along content aloud several times differ from one another, and the word recognition model needs a large number of registration voice signals for training in order to be accurate. Therefore, during registration the voice payment apparatus may send the registration read-along content multiple times to obtain a large number of registration voice signals, and train the word recognition model on the voiceprint account and the word speech parts of those signals, thereby ensuring the recognition accuracy of the word recognition model. In addition, the voice payment apparatus may train the digit recognition model on the voiceprint account and the digit speech parts of those signals, thereby ensuring the recognition accuracy of the digit recognition model.

The voice payment apparatus provided in this embodiment trains the word recognition model and the digit recognition model on the registration read-along content, which further improves the security of voice payment.

Fig. 6 is a schematic structural diagram of another voice payment apparatus provided by an embodiment of the present invention. As shown in Fig. 6, the apparatus includes a reporting module 61 and a receiving module 62.

The reporting module 61 is configured to report a payment instruction;

the receiving module 62 is configured to receive payment read-along content; the payment read-along content includes some or all of the read-along words in the registration read-along content, and some or all of the digits in the registration read-along content;

the reporting module 61 is further configured to obtain a payment voice signal and report the payment voice signal.

The voice payment apparatus provided by the present invention may specifically be a voice device. The voice device may be, for example, a smart speaker, smart air conditioner, smart washing machine, smart television or other device that can interact with a user by voice and perform corresponding operations according to the user's instructions.

Further, on the basis of the above embodiment, the reporting module 61 is further configured to report a registration instruction;

the receiving module 62 is further configured to receive registration read-along content;

the reporting module 61 is further configured to obtain a registration voice signal and report the registration voice signal.

The voice payment apparatus provided in this embodiment improves the recognition accuracy of the payment voice signal and thus the accuracy of voice payment. Because the payment read-along content includes digits, the possibility that another person misuses a recording of the user to make a voice payment is reduced, which improves the security of voice payment.

Fig. 7 is a schematic structural diagram of a voice payment system provided by an embodiment of the present invention. As shown in Fig. 7, the system includes a voice device 71 and a backend server 72 connected to the voice device.

The backend server 72 is configured to send payment read-along content to the voice device after receiving a payment instruction sent by the voice device; the payment read-along content includes some or all of the read-along words in the registration read-along content, and some or all of the digits in the registration read-along content;

the backend server 72 is further configured to: receive the payment voice signal sent by the voice device; when the payment voice signal matches the payment read-along content, determine a first probability that the word speech part of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech part of the payment voice signal belongs to each voiceprint account; determine, according to the first probability and the second probability, the voiceprint account to be paid that matches the payment voice signal; and perform a payment operation on the voiceprint account to be paid according to the payment instruction. The digits in the registration read-along content may include all single-digit integers.

Further, the backend server 72 is further configured to: send registration read-along content to the voice device after receiving a registration instruction sent by the voice device; receive the registration voice signal sent by the voice device; and, when the registration voice signal matches the registration read-along content, create a voiceprint account according to the registration instruction.

Further, the backend server 72 is further configured to: train a word recognition model according to the word speech part of the registration voice signal and the corresponding voiceprint account, the word recognition model being used to determine the first probability that the word speech part of the payment voice signal belongs to each voiceprint account; and train a digit recognition model according to the digit speech part of the registration voice signal and the corresponding voiceprint account, the digit recognition model being used to determine the second probability that the digit speech part of the payment voice signal belongs to each voiceprint account.

Further, the backend server 72 is specifically configured to: for each voiceprint account, take the weighted sum of the corresponding first probability and the corresponding second probability as the probability that the payment voice signal belongs to that voiceprint account; and, when that probability satisfies a preset probability threshold and exactly one voiceprint account satisfies the threshold, determine that voiceprint account to be the voiceprint account to be paid that matches the payment voice signal.

Further, the backend server 72 is specifically configured to: obtain the currently logged-in user account; judge whether the voiceprint account to be paid has payment permission for the user account; and, when the voiceprint account to be paid has payment permission for the user account, perform the payment operation on the user account according to the payment instruction.

It should be noted that, for the specific functions of the voice device and the backend server in this embodiment, reference may be made to the embodiments shown in Fig. 1 to Fig. 3; they are not described again here.

Fig. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. The electronic device includes:

a memory 1001, a processor 1002, and a computer program stored in the memory 1001 and executable on the processor 1002.

When executing the program, the processor 1002 implements the voice payment method provided in the embodiment shown in Fig. 1 or Fig. 2, or the voice payment method provided in the embodiment shown in Fig. 3.

Further, the electronic device also includes:

a communication interface 1003, used for communication between the memory 1001 and the processor 1002.

The memory 1001 is used to store the computer program executable on the processor 1002.

The memory 1001 may include high-speed RAM, and may further include non-volatile memory, for example at least one magnetic disk storage device.

The processor 1002 is configured to implement, when executing the program, the voice payment method provided in the embodiment shown in Fig. 1 or Fig. 2, or the voice payment method provided in the embodiment shown in Fig. 3.

If the memory 1001, the processor 1002 and the communication interface 1003 are implemented independently, they may be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus and so on. For ease of illustration, the bus is represented by a single thick line in Fig. 8, but this does not mean that there is only one bus or only one type of bus.

Optionally, in a specific implementation, if the memory 1001, the processor 1002 and the communication interface 1003 are integrated on a single chip, they may communicate with one another through an internal interface.

The processor 1002 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.

The present invention also provides a kind of computer readable storage mediums, are stored thereon with computer program, and the program is processed It realizes when device executes such as the voice payment method in Fig. 1 or embodiment illustrated in fig. 2.

The present invention also provides a kind of computer readable storage mediums, are stored thereon with computer program, and the program is processed Device realizes the voice payment method in embodiment as shown in Figure 3 when executing.

The present invention also provides a computer program product; when the instructions in the computer program product are executed by a processor, the voice payment method of the embodiment shown in Fig. 1 or Fig. 2 is performed.

The present invention also provides a computer program product; when the instructions in the computer program product are executed by a processor, the voice payment method of the embodiment shown in Fig. 3 is performed.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", "some examples" or the like means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those different embodiments or examples.

In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.

Any process or method described in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present invention also includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention pertain.

The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logic functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or, if necessary, otherwise suitably processing it, and can then be stored in a computer memory.

It should be understood that the parts of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

Those of ordinary skill in the art will understand that all or some of the steps of the above embodiment methods may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the present invention.

Claims

1. A voice payment method, characterized by comprising:

after receiving a payment instruction, sending payment follow-reading content, the payment follow-reading content comprising: some or all of the follow-reading words in registration follow-reading content, and some or all of the digits in the registration follow-reading content;

receiving a payment voice signal and, when the payment voice signal matches the payment follow-reading content, determining a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account;

determining, according to the first probability and the second probability, a to-be-paid voiceprint account matching the payment voice signal;

2. A voice payment method, characterized by comprising:

reporting a payment instruction;

obtaining a payment voice signal, and reporting the payment voice signal.

3. A voice payment apparatus, characterized by comprising:

a sending module, configured to send payment follow-reading content after receiving a payment instruction, the payment follow-reading content comprising: some or all of the follow-reading words in registration follow-reading content, and some or all of the digits in the registration follow-reading content;

a determining module, configured to receive a payment voice signal and, when the payment voice signal matches the payment follow-reading content, determine a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account;

the determining module being further configured to determine, according to the first probability and the second probability, a to-be-paid voiceprint account matching the payment voice signal;

4. A voice payment apparatus, characterized by comprising:

a reporting module, configured to report a payment instruction;

a receiving module, configured to receive payment follow-reading content, the payment follow-reading content comprising: some or all of the follow-reading words in registration follow-reading content, and some or all of the digits in the registration follow-reading content;

5. A voice payment system, characterized by comprising: a voice device, and a backend server connected to the voice device;

the backend server is configured to send payment follow-reading content to the voice device after receiving a payment instruction sent by the voice device, the payment follow-reading content comprising: some or all of the words in registration follow-reading content, and some or all of the digits in the registration follow-reading content;

the backend server is further configured to: receive a payment voice signal sent by the voice device; when the payment voice signal matches the payment follow-reading content, determine a first probability that the word speech portion of the payment voice signal belongs to each voiceprint account and a second probability that the digit speech portion of the payment voice signal belongs to each voiceprint account; determine, according to the first probability and the second probability, a to-be-paid voiceprint account matching the payment voice signal; and perform a payment operation on the to-be-paid voiceprint account according to the payment instruction.

6. An electronic device, characterized by comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein when the processor executes the program, the voice payment method according to any one of claims 1-2 is implemented.

7. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the voice payment method according to any one of claims 1-2.

8. A computer program product, characterized in that, when instructions in the computer program product are executed by a processor, the voice payment method according to any one of claims 1-2 is performed.
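As an informal illustration only, and not part of the claims, the server-side steps recited in claims 1 and 5 might be sketched in Python as follows. The claims state only that the to-be-paid voiceprint account is determined "according to" the first and second probabilities; the weighted average, the acceptance threshold, and the names `build_payment_prompt`, `select_account`, `word_weight` and `threshold` are all assumptions introduced here for the example. Speech recognition and the voiceprint models that produce the per-account probabilities are outside the scope of the sketch.

```python
import random
from typing import Dict, List, Optional


def build_payment_prompt(reg_words: List[str], reg_digits: List[str],
                         n_words: int, n_digits: int,
                         rng: random.Random) -> Dict[str, List[str]]:
    """Pick some or all of the registration words and digits as the payment prompt.

    Random sampling is an assumption; the claims only require that the payment
    follow-reading content reuse part or all of the registration content.
    """
    return {
        "words": rng.sample(reg_words, min(n_words, len(reg_words))),
        "digits": rng.sample(reg_digits, min(n_digits, len(reg_digits))),
    }


def select_account(word_probs: Dict[str, float],
                   digit_probs: Dict[str, float],
                   word_weight: float = 0.5,
                   threshold: float = 0.6) -> Optional[str]:
    """Fuse the first (word) and second (digit) probabilities per account.

    Hypothetical fusion rule: a weighted average of the two probabilities,
    accepted only if the best score reaches `threshold`.
    """
    # Only accounts scored on both the word and the digit portion are compared.
    accounts = sorted(word_probs.keys() & digit_probs.keys())
    scores = {
        a: word_weight * word_probs[a] + (1.0 - word_weight) * digit_probs[a]
        for a in accounts
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical registration content: four words and the digits 0-9.
    prompt = build_payment_prompt(["hello", "orion", "star", "pay"],
                                  list("0123456789"), 2, 4, rng)
    print("prompt:", prompt)
    # Hypothetical per-account probabilities from the two voiceprint models.
    account = select_account({"alice": 0.9, "bob": 0.2},
                             {"alice": 0.8, "bob": 0.3})
    print("to-be-paid account:", account)
```

Returning `None` when no fused score reaches the threshold lets the caller refuse the payment rather than charge the closest-sounding account; that rejection behaviour is a design choice of this sketch and is not something the claims specify.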