CN108122555B

CN108122555B - Communication method, voice recognition device and terminal device

Info

Publication number: CN108122555B
Application number: CN201711364939.6A
Authority: CN
Inventors: 鞠才
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2017-12-18
Filing date: 2017-12-18
Publication date: 2021-07-23
Anticipated expiration: 2037-12-18
Also published as: CN108122555A

Abstract

The invention provides a communication method, a voice recognition device, and a terminal device. The method comprises: collecting an address book text from a terminal device, wherein at least one contact is stored in the address book text; training a language model using the address book text as a training corpus to obtain a target language model corresponding to the terminal device; collecting, from the terminal device, voice for calling a target contact; recognizing the voice based on the target language model corresponding to the terminal device to obtain the target contact; and sending the target contact to the terminal device for calling. When the method is used to recognize voice for calling a target contact, the target contact can be accurately recognized and dialed. This solves the problem of dialing the wrong contact when names are homophones written with different characters, and improves both the recognition accuracy for contacts and the probability of dialing the correct contact.

Description

Communication method, voice recognition device and terminal device

Technical Field

The present invention relates to the field of terminal device technologies, and in particular, to a communication method, a voice recognition device, and a terminal device.

Background

With the development of speech recognition technology, speech recognition has been applied to a variety of fields. For example, the name of a contact person is input through voice to make a call, and the method for making the call brings great convenience to the life of a user.

However, the address books on different terminal devices contain different contacts, and different terminal devices may store names that are homophones written with different characters. As a result, when a contact's name is input by voice, the name may be recognized incorrectly and the contact cannot be dialed accurately.

Disclosure of Invention

The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.

Therefore, a first object of the present invention is to provide a communication method that trains a language model on the address book text of a terminal device to obtain a target language model corresponding to that terminal device, recognizes voice for calling a target contact according to the target language model, and sends the obtained target contact to the terminal device for calling, thereby improving the accuracy of contact recognition.

A second object of the present invention is to provide another communication method.

A third object of the invention is to propose a speech recognition device.

A fourth object of the present invention is to provide a terminal device.

A fifth object of the invention is to propose a computer device.

A sixth object of the invention is to propose a computer program product.

A seventh object of the invention is to propose a non-transitory computer-readable storage medium.

To achieve the above object, an embodiment of a first aspect of the present invention provides a communication method, including:

collecting an address book text from a terminal device, wherein at least one contact is stored in the address book text;

training a language model using the address book text as a training corpus to obtain a target language model corresponding to the terminal device;

collecting, from the terminal device, voice for calling a target contact; and

recognizing the voice based on the target language model to obtain the target contact, and sending the target contact to the terminal device for calling.

In the communication method of the embodiment of the present invention, an address book text in which at least one contact is stored is collected from a terminal device; a language model is trained using the address book text as a training corpus to obtain a target language model corresponding to the terminal device; voice for calling a target contact is collected from the terminal device; and the voice is recognized based on the target language model corresponding to the terminal device to obtain the target contact, which is sent to the terminal device for calling. Because the language model is trained on the address book text collected from the terminal device, the resulting target language model can accurately recognize the target contact in the voice, so that the target contact is dialed accurately. This solves the problem of dialing the wrong contact because a name is misrecognized as a homophone written with different characters, and improves both the recognition accuracy for contacts and the probability of dialing the correct contact.

To achieve the above object, an embodiment of a second aspect of the present invention provides a communication method, including:

sending an address book text to a voice recognition device, wherein at least one contact is stored in the address book text;

collecting voice of a user for calling a target contact and sending the voice to the voice recognition device for recognition;

receiving the target contact returned by the voice recognition device, wherein the target contact is recognized by the voice recognition device using a target language model, and the target language model is obtained by training a language model using the address book text; and

calling the target contact.

According to the communication method of the embodiment of the present invention, the address book text is sent to the voice recognition device, the voice of a user for calling the target contact is collected and sent to the voice recognition device for recognition, the target contact returned by the voice recognition device is received, and the target contact is called, wherein the target contact is recognized by the voice recognition device using the target language model, and the target language model is obtained by training a language model using the address book text. Because the language model is trained on the address book text, the voice recognition device can use the resulting target language model to accurately recognize the target contact in the voice, so that the target contact is dialed accurately. This solves the problem of dialing the wrong contact because a name is misrecognized as a homophone written with different characters, and improves both the recognition accuracy for contacts and the probability of dialing the correct contact.

To achieve the above object, an embodiment of a third aspect of the present invention provides a voice recognition device, including:

a first collection module, configured to collect an address book text from a terminal device, wherein at least one contact is stored in the address book text;

a training module, configured to train a language model using the address book text as a training corpus to obtain a target language model corresponding to the terminal device;

a second collection module, configured to collect, from the terminal device, voice for calling a target contact; and

a recognition module, configured to recognize the voice based on the target language model to obtain the target contact, and to send the target contact to the terminal device for calling.

The voice recognition device of the embodiment of the present invention collects, from a terminal device, an address book text in which at least one contact is stored; trains a language model using the address book text as a training corpus to obtain a target language model corresponding to the terminal device; collects, from the terminal device, voice for calling a target contact; and recognizes the voice based on the target language model corresponding to the terminal device to obtain the target contact, which is sent to the terminal device for calling. Because the language model is trained on the address book text collected from the terminal device, the resulting target language model can accurately recognize the target contact in the voice, so that the target contact is dialed accurately. This solves the problem of dialing the wrong contact because a name is misrecognized as a homophone written with different characters, and improves both the recognition accuracy for contacts and the probability of dialing the correct contact.

In order to achieve the above object, an embodiment of a fourth aspect of the present invention provides a terminal device, including:

a sending module, configured to send an address book text to a voice recognition device, wherein at least one contact is stored in the address book text;

a collecting and sending module, configured to collect voice of a user for calling a target contact and to send the voice to the voice recognition device for recognition;

a receiving module, configured to receive the target contact returned by the voice recognition device, wherein the target contact is recognized by the voice recognition device using a target language model, and the target language model is obtained by training a language model using the address book text; and

a calling module, configured to call the target contact.

The terminal device of the embodiment of the invention sends an address book text to the voice recognition device, collects voice of a user for calling a target contact and sends the voice to the voice recognition device for recognition, receives the target contact returned by the voice recognition device, and calls the target contact, wherein the target contact is recognized by the voice recognition device using a target language model, and the target language model is obtained by training a language model using the address book text. Because the language model is trained on the address book text, the voice recognition device can use the resulting target language model to accurately recognize the target contact in the voice, so that the target contact is dialed accurately. This solves the problem of dialing the wrong contact because a name is misrecognized as a homophone written with different characters, and improves both the recognition accuracy for contacts and the probability of dialing the correct contact.

In order to achieve the above object, an embodiment of a fifth aspect of the present invention provides a computer device, including a processor and a memory;

the processor reads the executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the communication method according to the embodiment of the first aspect or the communication method according to the embodiment of the second aspect.

In order to achieve the above object, an embodiment of a sixth aspect of the present invention provides a computer program product, wherein instructions of the computer program product, when executed by a processor, implement the communication method according to the embodiment of the first aspect or the communication method according to the embodiment of the second aspect.

In order to achieve the above object, an embodiment of a seventh aspect of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the communication method according to the embodiment of the first aspect or the communication method according to the embodiment of the second aspect.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

Fig. 1 is a schematic flow chart of a communication method according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of another communication method according to an embodiment of the present invention;

Fig. 3 is a schematic flow chart of another communication method according to an embodiment of the present invention;

Fig. 4 is a schematic process diagram of a communication method according to an embodiment of the present invention;

Fig. 5 is a schematic flow chart of another communication method according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a speech recognition device according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present invention;

Fig. 8 is a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.

A communication method, a voice recognition apparatus, and a terminal apparatus according to embodiments of the present invention are described below with reference to the accompanying drawings.

A communication method proposed by an embodiment of the present invention is described below from the speech recognition device side. Fig. 1 is a flowchart illustrating a communication method according to an embodiment of the present invention.

As shown in fig. 1, the communication method includes the following steps:

Step 101, collecting an address book text from a terminal device, wherein at least one contact is stored in the address book text.

In this embodiment, a terminal device capable of dialing contacts, such as a mobile phone, a tablet computer, or a smart watch, can send its address book text to the voice recognition device in a wireless or wired manner.

It is understood that the address book text may be a telephone address book text, a QQ address book text, a WeChat address book text, etc. At least one contact is stored in the address book text, and the stored contact information includes, but is not limited to, the contact's name, work address, email address, etc.

Of course, the terminal device may also integrate information such as the contact's name, telephone number, QQ number, and WeChat ID to form an address book text, and send that address book text to the voice recognition device.

Step 102, training a language model using the address book text as a training corpus to obtain a target language model corresponding to the terminal device.

In this embodiment, the voice recognition device extracts the names of the contacts from the received address book text and uses them as a training corpus to train the language model, obtaining a target language model corresponding to the terminal device.

Step 103, collecting, from the terminal device, voice for calling the target contact.

In this embodiment, when the user speaks a request to call the target contact on the terminal device, such as "call XXX", the terminal device may collect the voice through its microphone and send it to the voice recognition device.

Step 104, recognizing the voice based on the target language model to obtain the target contact, and sending the target contact to the terminal device for calling.

After receiving the voice to be recognized, the voice recognition device recognizes the voice using the target language model corresponding to the terminal device to obtain the name of the target contact, and sends the target contact to the terminal device. After receiving the target contact, the terminal device looks up the corresponding telephone number, QQ number, or the like according to the name of the target contact and dials it.

In the embodiment, the language model is trained by taking the address book text collected from the terminal device as the training corpus to obtain the target language model corresponding to the terminal device, so that when the voice of the calling target contact is recognized, the name of the target contact can be accurately recognized, and the target contact can be accurately dialed.
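The effect of a per-device model on homophones can be illustrated with a deliberately simplified sketch. It is not the language model of the embodiment: it only restricts recognition to names that appear in one device's address book. The pronunciation table, the field name `name`, and both function names are assumptions made for this illustration.

```python
# Hypothetical pronunciation table; a real system would use a pronunciation lexicon.
PRONUNCIATION = {"张伟": "zhang wei", "章玮": "zhang wei", "李娜": "li na"}


def build_name_index(address_book):
    """Map each pronunciation to the names stored on ONE terminal device."""
    index = {}
    for contact in address_book:
        name = contact["name"]
        pron = PRONUNCIATION.get(name)
        if pron is not None:
            index.setdefault(pron, []).append(name)
    return index


def recognize(pronunciation, index):
    """Return the first stored name with this pronunciation, or None."""
    candidates = index.get(pronunciation, [])
    return candidates[0] if candidates else None


# Two devices store different names that sound the same.
index_a = build_name_index([{"name": "张伟"}])
index_b = build_name_index([{"name": "章玮"}])
```

With these two indexes, the same spoken input resolves to the name stored on the device that sent it, which is the behaviour the embodiment aims for.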

The following describes a situation in which the speech recognition device collects address book texts from one or more terminal devices and trains to obtain a target language model corresponding to each terminal device. Fig. 2 is a flowchart illustrating another communication method according to an embodiment of the present invention.

As shown in fig. 2, the communication method includes the following steps:

Step 201, collecting an address book text and identification information of the terminal device from the terminal device.

In this embodiment, when the address book text is collected from one or more terminal devices, the identification information of each terminal device is collected at the same time. The identification information uniquely identifies the terminal device and includes, but is not limited to, a user identification number (Called User Identification Number, abbreviated as CUID), an international mobile subscriber identity (International Mobile Subscriber Identity, abbreviated as IMSI), and the like.

Step 202, processing the address book text to obtain the coded data of the address book text.

Because the collected address book text may contain illegal characters, the training corpus may first be preprocessed in order to improve its quality and the recognition accuracy of the target language model. Specifically, the address book text undergoes processing such as letter-case conversion, conversion between simplified and traditional Chinese characters, and removal of illegal characters, yielding a text that contains only predefined legal characters such as Chinese characters and letters.

Then, each preprocessed character is encoded so that each character has a unique number, yielding the coded data of each character. The coded data carries the position information of the characters in the address book text.
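The preprocessing and encoding of step 202 might look like the following minimal sketch. It assumes that the legal characters are CJK unified ideographs and lowercase ASCII letters, and it omits simplified/traditional conversion because that requires a mapping table. The function names and the numbering scheme are illustrative, not taken from the patent.

```python
import re

# Assumption: anything other than a CJK ideograph or a-z counts as illegal.
_ILLEGAL = re.compile(r"[^\u4e00-\u9fffa-z]")


def preprocess(name: str) -> str:
    """Lowercase the text and strip characters outside the legal set."""
    return _ILLEGAL.sub("", name.lower())


def encode(names):
    """Give each distinct character a unique number.

    Returns (vocab, encoded), where encoded[i] lists the character numbers
    of the i-th non-empty name in order, so character positions are kept.
    """
    vocab = {}
    encoded = []
    for name in names:
        clean = preprocess(name)
        if not clean:
            continue  # nothing legal left after cleaning
        ids = []
        for ch in clean:
            if ch not in vocab:
                vocab[ch] = len(vocab)
            ids.append(vocab[ch])
        encoded.append(ids)
    return vocab, encoded


vocab, encoded = encode(["张伟", "张三", "!!!"])
```

Here the third entry is dropped because it contains no legal characters, and the shared first character of the two names receives the same number.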

Step 203, training the language model using the coded data to obtain a target language model.

In this embodiment, the voice recognition device may hold an original language model, that is, a default language model, for each terminal device. The original language model may be incrementally updated or retrained using the coded data to obtain the target language model corresponding to the terminal device.

The trained target language model is used to obtain, for each character, the position information (such as the number) of its related characters and a first probability that each related character appears after the character.
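One simple model with this property is a character bigram model, sketched below. The patent does not state which model family is used, so the bigram choice, the function name `train_bigram`, and the input format (lists of character numbers) are assumptions.

```python
from collections import Counter, defaultdict


def train_bigram(encoded_names):
    """Estimate P(next character | character) from encoded contact names.

    The result maps each character number to a dict of its related
    (following) character numbers and their probabilities.
    """
    counts = defaultdict(Counter)
    for ids in encoded_names:
        for prev, nxt in zip(ids, ids[1:]):
            counts[prev][nxt] += 1

    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {nxt: n / total for nxt, n in followers.items()}
    return model


# Character 0 is followed by 1 twice and by 2 once.
model = train_bigram([[0, 1], [0, 2], [0, 1]])
```

For each character, the inner dictionary plays the role of the related characters and their first probabilities.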

Step 204, establishing a mapping relation between the target language model and the identification information.

A target language model is trained for each terminal device. So that the target language model corresponding to a given terminal device can be looked up later, the voice recognition device establishes a mapping relation between each target language model and the identification information of its terminal device.

Step 205, collecting, from the terminal device, voice for calling the target contact.

When a user speaks a request to call the target contact on a terminal device, the terminal device may collect the voice through its microphone and send it to the voice recognition device. In this way, the voice recognition device collects the voice for calling the target contact from the terminal device.

Step 206, extracting the identification information of the terminal device from the voice.

After the voice recognition device receives the voice sent by the terminal device, the identification information of the terminal device sending the voice can be extracted from the voice.
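The patent does not specify how the identification information travels with the voice. As one hypothetical framing, the sketch below prefixes the audio bytes with a length-tagged device identifier; the wire format and both function names are invented for illustration only.

```python
import struct


def pack_request(device_id: str, audio: bytes) -> bytes:
    """Prefix the audio with a 2-byte big-endian length and the UTF-8 device id."""
    id_bytes = device_id.encode("utf-8")
    return struct.pack(">H", len(id_bytes)) + id_bytes + audio


def unpack_request(payload: bytes):
    """Split a payload built by pack_request into (device_id, audio)."""
    (id_len,) = struct.unpack_from(">H", payload, 0)
    device_id = payload[2:2 + id_len].decode("utf-8")
    audio = payload[2 + id_len:]
    return device_id, audio


payload = pack_request("cuid-001", b"\x00\x01\x02")
device_id, audio = unpack_request(payload)
```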

Step 207, replacing the original language model in the decoder with the target language model matched with the identification information of the terminal device.

The voice recognition device queries the mapping relation between target language models and identification information according to the identification information of the terminal device, and finds, among the one or more target language models, the target language model that matches the identification information, thereby obtaining the target language model corresponding to the terminal device. This target language model then replaces the original language model in the decoder.
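The mapping of step 204 and the lookup of step 207 can be sketched as a small registry keyed by identification information. The class name, its methods, and the fallback to the default model for an unknown device are assumptions; the patent only states that the matching model replaces the original one.

```python
class ModelRegistry:
    """Maps terminal device identification information to a target language model."""

    def __init__(self, default_model):
        self._default = default_model
        self._by_device = {}

    def register(self, device_id, model):
        """Record the mapping between a device id and its target model."""
        self._by_device[device_id] = model

    def model_for(self, device_id):
        """Return the device's target model, or the default if none is registered."""
        return self._by_device.get(device_id, self._default)


registry = ModelRegistry(default_model="default-lm")
registry.register("cuid-001", "lm-for-cuid-001")
```

The strings stand in for real model objects; a decoder would call `model_for` with the identifier extracted from the incoming voice before decoding.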

Step 208, performing voice recognition based on the target language model matched with the identification information of the terminal device to obtain the target contact, and sending the target contact to the terminal device for calling.

After the target language model matched with the identification information of the terminal device is obtained, the voice for calling the target contact collected from that terminal device is recognized based on the target language model.

Specifically, the target language model is used to recognize the voice to be recognized, yielding a first recognized character for the current speech frame and the position information of the related characters of the first recognized character. The character of the next speech frame is then predicted using the position information of the related characters of the first recognized character and the first probability of each related character.

Further, speech features of the next speech frame, such as Mel-frequency cepstral coefficients (MFCC), are extracted. The predicted character is then updated according to the extracted speech features to improve prediction accuracy, yielding a second recognized character for the next speech frame and the position information of its related characters. This continues until the last speech frame of the voice has been recognized, at which point the target contact, that is, the name of the target contact, is obtained.
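The frame-by-frame procedure can be approximated by a greedy decoder that multiplies an acoustic score by the language model probability. This is a toy sketch under strong assumptions: each frame is reduced to a hand-written dictionary of candidate characters and acoustic scores (a real system would derive these from MFCC features with an acoustic model), and the `floor` value for unseen characters is an arbitrary choice.

```python
def decode(frames, bigram, start_probs, floor=1e-6):
    """Greedy decoding: choose, per frame, the character that maximizes
    acoustic score times language model probability given the previous character."""
    result = []
    prev = None
    for acoustic in frames:
        # Language model prediction for this frame.
        prior = start_probs if prev is None else bigram.get(prev, {})
        # Update the prediction with the frame's acoustic evidence.
        best = max(acoustic, key=lambda ch: acoustic[ch] * prior.get(ch, floor))
        result.append(best)
        prev = best
    return "".join(result)


# Model trained on an address book containing 张伟 and 李娜 (values written by hand).
bigram = {"张": {"伟": 1.0}, "李": {"娜": 1.0}}
start_probs = {"张": 0.5, "李": 0.5}

# Homophones receive identical acoustic scores, so only the model can separate them.
frames = [{"张": 0.5, "章": 0.5}, {"伟": 0.5, "玮": 0.5}]
name = decode(frames, bigram, start_probs)
```

Because 章 and 玮 never occur in this device's address book, their language model probability falls to the floor and the stored name wins.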

After the target contact is recognized, it is fed back to the terminal device. After receiving the target contact, the terminal device looks up the stored mobile phone number, QQ number, or WeChat ID of the target contact and calls the target contact.

For example, after a user opens the call interface on a mobile phone and says "call XXX", the mobile phone sends the collected voice to the voice recognition device. The voice recognition device finds the target language model corresponding to the mobile phone according to the identification information of the mobile phone carried in the voice, replaces the original language model with it, and performs voice recognition to obtain the target contact, which it sends to the mobile phone. After receiving the target contact, the mobile phone looks up the corresponding mobile phone number and dials the target contact.

For another example, if the user opens the chat interface of an application on the mobile phone, such as Baidu hi, and says "video call XXX", the mobile phone sends the collected voice to the voice recognition device for recognition. The voice recognition device finds the target language model corresponding to the mobile phone according to the identification information of the mobile phone carried in the voice, performs voice recognition to obtain the target contact, and sends the target contact to the mobile phone. After receiving the target contact, the mobile phone finds the account of the target contact in the user's Baidu hi address book and places a video call.
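On the terminal side, the two examples differ only in which stored identifier is looked up for the returned name. A minimal sketch follows, assuming contacts are dictionaries with hypothetical keys such as `phone` and `baidu_hi`.

```python
def resolve_callee(address_book, name, channel="phone"):
    """Return the stored identifier of `name` for the given channel, or None."""
    for contact in address_book:
        if contact.get("name") == name and channel in contact:
            return contact[channel]
    return None


# Illustrative data only; the numbers and account names are made up.
address_book = [
    {"name": "张伟", "phone": "13800000000", "baidu_hi": "zhangwei_hi"},
    {"name": "李娜", "phone": "13900000000"},
]
```

A phone call would use the default channel, while a video call from a chat application would pass that application's channel key.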

According to the communication method provided by the embodiment of the invention, when the voice recognition device recognizes voice for calling a target contact collected on a terminal device, it can look up the target language model corresponding to that terminal device by querying the mapping relation between target language models and terminal device identification information, and recognize the voice according to that target language model. The target contact on the terminal device can thus be accurately recognized, which solves the problem that names may be misrecognized because the address books on different terminal devices differ and may contain homophones written with different characters. In this embodiment, voice from different terminal devices is no longer recognized using a single default language model, so recognition accuracy is improved.

To reduce the memory that the coded data occupies on the voice recognition device, the coded data may be cached in a reading platform. Fig. 3 is a flowchart illustrating another communication method according to an embodiment of the present invention.

As shown in fig. 3, the communication method includes the following steps:

Step 301, collecting an address book text from a terminal device, wherein at least one contact is stored in the address book text.

In this embodiment, the address book text may be a telephone address book text, a QQ address book text, a WeChat address book text, and the like. At least one contact is stored in the address book text, and the stored contact information includes, but is not limited to, the contact's name, telephone number, and the like.

Of course, the terminal device may also integrate information such as the contact's name, telephone number, QQ number, and WeChat ID to form an address book text, and send that address book text to the voice recognition device. As shown in Fig. 4, the terminal device sends the address book text to the voice recognition device. In Fig. 4, the solid lines represent the uploading process, and the dotted lines represent the recognition process.

Step 302, processing the address book text to obtain the coded data of the address book text.

Step 303, caching the coded data in the reading platform.

After the voice recognition device obtains the coded data of the address book text, as shown in Fig. 4, the coded data is cached in the reading platform for storage, which reduces the storage space that the coded data occupies on the voice recognition device.
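The patent does not describe the reading platform's interface. As a stand-in, the sketch below models it as a directory of JSON files keyed by device identifier, so that coded data lives outside the recognizer's memory. The class name, the `put`/`get` methods, and the assumption that identifiers are safe to use as file names are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class ReadingPlatform:
    """Toy external store for coded data, one JSON file per device id."""

    def __init__(self, root):
        self._root = Path(root)
        self._root.mkdir(parents=True, exist_ok=True)

    def put(self, device_id, coded_data):
        """Cache the coded data for a device (assumes a filename-safe id)."""
        (self._root / f"{device_id}.json").write_text(json.dumps(coded_data))

    def get(self, device_id):
        """Read back the coded data for a device, or None if nothing is cached."""
        path = self._root / f"{device_id}.json"
        if not path.exists():
            return None
        return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    platform = ReadingPlatform(tmp)
    platform.put("cuid-001", [[0, 1], [0, 2]])
    cached = platform.get("cuid-001")
    missing = platform.get("cuid-999")
```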

Step 304, collecting, from the terminal device, voice for calling the target contact.

At some point, the user speaks a request on the terminal device to call the target contact, and the terminal device collects the voice through its microphone. As shown in Fig. 4, the terminal device sends the collected voice for calling the target contact to the voice recognition device.

Step 305, extracting the identification information of the terminal device from the voice.

Step 306, reading the coded data corresponding to the identification information from the reading platform.

As shown in Fig. 4, the voice recognition device reads, from the reading platform, the coded data corresponding to the identification information of the terminal device, such as the CUID; that is, it reads the coded data corresponding to the terminal device.

Step 307, training the language model using the coded data to obtain a target language model corresponding to the terminal device.

In this embodiment, the language model is trained using the coded data corresponding to the terminal device to obtain a target language model corresponding to the terminal device. The trained target language model is used to obtain, for each character, the position information (such as the number) of its related characters and a first probability that each related character appears after the character.

Step 308, performing voice recognition based on the target language model corresponding to the terminal device to obtain the target contact, and sending the target contact to the terminal device for calling.

After the target language model corresponding to the terminal device is obtained through training, the voice for calling the target contact collected from the terminal device is recognized based on the target language model. For the specific recognition process, reference may be made to the related description in the above embodiments. After the target contact is recognized, as shown in Fig. 4, the voice recognition device sends the target contact to the terminal device, and the terminal device places the call according to the target contact.

It can be understood that the address book texts collected from a plurality of terminal devices can each be processed to obtain the coded data corresponding to each terminal device, and the coded data can then be cached in the reading platform according to the identification information of each terminal device. When the voice recognition device recognizes, for the first time, voice for calling a target contact collected on a given terminal device, it reads the coded data corresponding to that terminal device from the reading platform according to the identification information of the terminal device, trains the language model using the coded data to obtain the target language model corresponding to the terminal device, and finally recognizes the collected voice using the target language model.

When the voice recognition device recognizes the voice of the calling target contact collected from the terminal device again, the recognition can be performed according to the trained target language model.

According to the communication method, the coded data are cached in the reading platform, so that the memory of the voice recognition device is saved, the voice recognition device can accurately recognize the target contact from the voice of the calling target contact collected from the terminal device, and the target contact can be accurately dialed.
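The caching and first-use training described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: `ReadingPlatform`, `ModelRegistry`, `train_fn`, and the device identifier are hypothetical names, and the training function is a trivial placeholder.

```python
class ReadingPlatform:
    """Hypothetical stand-in for the reading platform: a cache of coded
    data keyed by the identification information of a terminal device."""

    def __init__(self):
        self._store = {}

    def put(self, device_id, encoded_data):
        self._store[device_id] = encoded_data

    def get(self, device_id):
        return self._store[device_id]


class ModelRegistry:
    """Trains a target language model the first time a device is seen
    and reuses the trained model on later requests."""

    def __init__(self, platform, train_fn):
        self._platform = platform
        self._train_fn = train_fn  # assumed callable: encoded data -> model
        self._models = {}          # device_id -> trained target language model
        self.training_runs = 0

    def model_for(self, device_id):
        if device_id not in self._models:
            # First recognition for this device: read the cached coded data
            # from the platform and train the model.
            encoded = self._platform.get(device_id)
            self._models[device_id] = self._train_fn(encoded)
            self.training_runs += 1
        return self._models[device_id]


platform = ReadingPlatform()
platform.put("device-A", ["张伟", "李娜"])

# Placeholder training function; a real one would train a language model.
registry = ModelRegistry(platform, train_fn=lambda encoded: set(encoded))
first = registry.model_for("device-A")   # trains the model
second = registry.model_for("device-A")  # reuses the trained model
print(registry.training_runs)
```

Keeping the coded data outside the recognizer until it is needed reflects the memory saving mentioned above; only devices that actually place a voice call cause a model to be trained.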

In order to implement the above embodiments, the present invention further provides a communication method. A communication method provided in an embodiment of the present invention is described below from a terminal device side. Fig. 5 is a flowchart illustrating another communication method according to an embodiment of the present invention.

As shown in fig. 5, the communication method includes:

Step 501, sending an address book text to a voice recognition device; at least one contact person is stored in the address book text.

In this embodiment, the terminal device may send the locally stored address book text, such as the phone address book text, the QQ address book text, the Baidu hi address book text, and the like, to the voice recognition device. At least one contact person is stored in the address book text, and the contact person includes but is not limited to information such as a name, a telephone number and the like of the contact person.

Step 502, collecting the voice of the user for calling the target contact person and sending the voice to the voice recognition device for recognition.

When a user speaks a request to call the target contact on the terminal device, the terminal device can collect the voice input by the user through the microphone and send the voice to the voice recognition device for recognition. After receiving the voice to be recognized, the voice recognition device recognizes the voice according to the target language model corresponding to the terminal device to obtain the target contact, and returns the target contact to the terminal device. The target language model is obtained in advance by the voice recognition device by training a language model using the address book text collected from the terminal device.

Step 503, receiving the target contact returned by the voice recognition device.

Step 504, call the target contact.

The terminal device can look up the returned target contact in the address book text and call the target contact.
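Steps 501 to 504 can be sketched from the terminal device side as follows. This is only an illustrative sketch: `FakeRecognitionDevice`, `call_by_voice`, and `dial` are hypothetical names, the phone numbers are made up, and the "voice" is represented by already-transcribed text because acoustic recognition is outside the scope of the example.

```python
class FakeRecognitionDevice:
    """Hypothetical stand-in for the remote voice recognition device."""

    def __init__(self):
        self.address_books = {}

    def upload_address_book(self, device_id, address_book):
        self.address_books[device_id] = address_book

    def recognize(self, device_id, voice):
        # Placeholder recognition: the "voice" is plain text here, and the
        # first contact name it contains is returned.
        for name in self.address_books[device_id]:
            if name in voice:
                return name
        return None


def call_by_voice(device_id, address_book, voice, recognizer, dial):
    # Step 501: send the address book text to the voice recognition device.
    recognizer.upload_address_book(device_id, address_book)
    # Steps 502 and 503: send the collected voice and receive the target contact.
    target = recognizer.recognize(device_id, voice)
    if target is None or target not in address_book:
        return None
    # Step 504: look up the target contact and place the call.
    dial(address_book[target])
    return target


dialed = []
book = {"张伟": "13800000000", "李娜": "13900000000"}  # made-up numbers
result = call_by_voice("device-A", book, "呼叫李娜", FakeRecognitionDevice(), dialed.append)
print(result, dialed)
```

In practice the address book would be sent ahead of time rather than on every call; the sketch sends it inline only to keep the four steps visible in one function.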

In order to implement the above embodiments, the present invention further provides a speech recognition device. Fig. 6 is a schematic structural diagram of a speech recognition device according to an embodiment of the present invention.

As shown in fig. 6, the voice recognition device includes: a first acquisition module 610, a training module 620, a second acquisition module 630, and a recognition module 640.

The first acquisition module 610 is configured to acquire an address book text from a terminal device; at least one contact person is stored in the address book text.

The training module 620 is configured to perform language model training by using the address book text as a training corpus to obtain a target language model corresponding to the terminal device.

The second acquisition module 630 is configured to collect the voice for calling the target contact from the terminal device.

The recognition module 640 is configured to recognize the voice based on the target language model to obtain a target contact, and to send the target contact to the terminal device for calling.

It should be noted that the foregoing explanation of the communication method embodiment described from the speech recognition device side is also applicable to the speech recognition device of this embodiment, and therefore is not described herein again.
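One possible way to wire the four modules together is sketched below, including how an address-book-specific model can separate homophones written with different characters. The sketch rests on several assumptions that are not part of the embodiment: the class and method names are hypothetical, the model is a simple character bigram model, and the acoustic stage is assumed to have already produced a list of candidate transcriptions.

```python
from collections import Counter, defaultdict


class VoiceRecognitionDevice:
    def __init__(self):
        self._address_books = {}  # device_id -> list of contact names
        self._models = {}         # device_id -> (first characters, bigram table)

    def collect_address_book(self, device_id, names):
        # Role of the first acquisition module 610.
        self._address_books[device_id] = list(names)

    def train(self, device_id):
        # Role of the training module 620 (assumed bigram model).
        names = self._address_books[device_id]
        counts = defaultdict(Counter)
        for name in names:
            for current, following in zip(name, name[1:]):
                counts[current][following] += 1
        table = {
            ch: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for ch, c in counts.items()
        }
        starts = {name[0] for name in names if name}
        self._models[device_id] = (starts, table)

    def recognize(self, device_id, candidates):
        # Roles of modules 630 and 640: take candidate transcriptions of the
        # collected voice and keep the one the device's model supports best.
        starts, table = self._models[device_id]

        def score(text):
            if not text or text[0] not in starts:
                return 0.0
            p = 1.0
            for current, following in zip(text, text[1:]):
                p *= table.get(current, {}).get(following, 0.0)
            return p

        best = max(candidates, key=score)
        return best if score(best) > 0 else None


device = VoiceRecognitionDevice()
device.collect_address_book("device-A", ["张伟", "李娜"])
device.train("device-A")
# "章伟" and "张伟" are pronounced the same; only "张伟" is in the address book.
chosen = device.recognize("device-A", ["章伟", "张伟"])
print(chosen)
```

Because the model is trained only on the address book of one terminal device, a homophone that is not a stored contact scores zero and is rejected.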

In order to implement the above embodiments, the present invention further provides a terminal device. Fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.

As shown in fig. 7, the terminal device includes: a sending module 710, a collecting and sending module 720, a receiving module 730 and a calling module 740.

The sending module 710 is configured to send an address book text to the voice recognition device; at least one contact person is stored in the address book text.

The collecting and sending module 720 is configured to collect the voice used by the user to call the target contact and send the voice to the voice recognition device for recognition.

The receiving module 730 is configured to receive a target contact returned by the voice recognition device; the target contact is recognized by the voice recognition device through a target language model, and the target language model is obtained by training a language model using the address book text.

The calling module 740 is used to call the target contact.

In order to implement the above embodiments, the present invention also provides a computer device including a processor and a memory.

Wherein the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory for realizing the communication method as described above from the voice recognition device side or realizing the communication method as described above from the terminal device side.

FIG. 8 illustrates a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present application. The computer device 12 shown in fig. 8 is only an example, and should not bring any limitation to the function and the scope of use of the embodiments of the present application.

As shown in FIG. 8, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.

Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. These architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus, to name a few.

Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 8, and commonly referred to as a "hard drive"). Although not shown in FIG. 8, a disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the application.

A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally perform the functions and/or methodologies of the embodiments described herein.

Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

The processing unit 16 executes various functional applications and data processing, for example, implementing the methods mentioned in the foregoing embodiments, by executing programs stored in the system memory 28.

To achieve the above embodiments, the present invention further proposes a computer program product, wherein instructions in the computer program product, when executed by a processor, implement the communication method described from the voice recognition device side as in the foregoing embodiments, or implement the communication method described from the terminal device side as in the foregoing embodiments.

To achieve the above embodiments, the present invention also proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the communication method as described above from the voice recognition device side, and implements the communication method as described above from the terminal device side.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the related hardware. The program may be stored in a computer readable storage medium, and when executed, the program performs one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. A method of communication, comprising:

training a language model by using the address book text as a training corpus to obtain a target language model corresponding to the terminal device, wherein the target language model is used for acquiring position information of relevant characters of each character and a first probability of each relevant character appearing behind the character;

collecting voice for calling a target contact from the terminal equipment;

2. The method according to claim 1, wherein the training of the language model by using the address book text as a training corpus to obtain the target language model corresponding to the terminal device comprises:

processing the address book text to obtain coded data of the address book text; the coded data carries position information of characters in the address book text;

and training a language model by using the coded data to obtain the target language model.

3. The method of claim 2, further comprising:

acquiring identification information of terminal equipment when an address book text of a user is acquired from the terminal equipment;

after the training of the language model by using the encoded data to obtain the target language model, the method further includes:

and establishing a mapping relation between the target language model and the identification information.

4. The method of claim 3, wherein prior to said recognizing the speech based on the target language model, further comprising:

extracting identification information of the terminal equipment from the voice;

inquiring the mapping relation according to the identification information of the terminal equipment, and acquiring the target language model matched with the identification information of the terminal equipment;

and replacing the original language model in the decoder by using the target language model matched with the identification information of the terminal equipment.

5. The method of claim 4, wherein the recognizing the voice based on the target language model to obtain the target contact and sending the target contact to a terminal device for calling comprises:

inputting the speech into the decoder;

recognizing the voice by using the target language model, and acquiring a first recognition character of a current voice frame of the voice and position information of a related character of the first recognition character;

predicting to obtain a predicted character of a next voice frame of the voice according to the position information of the related character of the first recognition character and the first probability of the related character;

extracting the voice characteristics of the next voice frame to update the predicted characters to obtain second recognition characters of the next voice frame and position information of related characters of the second recognition characters until the last voice frame of the voice is recognized, and obtaining the target contact person;

and feeding back the target contact person to the terminal equipment so as to enable the terminal equipment to initiate a call to the target contact person.

6. The method of claim 2, wherein after processing the address book text and obtaining the encoded data of the address book text, further comprising:

caching the coded data into a reading platform;

the training of the language model by using the coded data to obtain the target language model comprises:

reading the encoded data from the reading platform;

7. A method of communication, comprising:

the terminal equipment sends an address book text to the voice recognition equipment; at least one contact person is stored in the address book text;

receiving the target contact person returned by the voice recognition equipment; the target contact is recognized by the voice recognition equipment through a target language model, the target language model is obtained by utilizing the address book text to train a language model, and the target language model is used for acquiring position information of relevant characters of each character and a first probability of the occurrence of each relevant character behind the character;

and calling the target contact.

8. The method of claim 7, further comprising:

and when the address book text is sent to the voice recognition equipment, the identification information of the terminal equipment is sent at the same time.

9. A speech recognition device, comprising:

the training module is used for training a language model by using the address book text as a training corpus to obtain a target language model corresponding to the terminal device, wherein the target language model is used for acquiring position information of relevant characters of each character and a first probability of the relevant character appearing behind the character;

10. A terminal device, comprising:

the receiving module is used for receiving the target contact person returned by the voice recognition equipment; the target contact is recognized by the voice recognition equipment through a target language model, the target language model is obtained by utilizing the address book text to train a language model, and the target language model is used for acquiring position information of relevant characters of each character and a first probability of the occurrence of each relevant character behind the character;

and the calling module is used for calling the target contact person.

11. A computer device comprising a processor and a memory;

wherein the processor executes a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to implement the communication method according to any one of claims 1 to 6, or implement the communication method according to any one of claims 7 to 8.

12. A non-transitory computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the communication method according to any one of claims 1 to 6 and implementing the communication method according to any one of claims 7 to 8.