CN111161732B

CN111161732B - Voice acquisition method and device, electronic equipment and storage medium

Info

Publication number: CN111161732B
Application number: CN201911404996.1A
Authority: CN
Inventors: 张瑞华; 徐世超; 梁志婷
Original assignee: Miaozhen Information Technology Co Ltd
Current assignee: Miaozhen Information Technology Co Ltd
Priority date: 2019-12-30
Filing date: 2019-12-30
Publication date: 2022-12-13
Anticipated expiration: 2039-12-30
Also published as: CN111161732A

Abstract

The invention discloses a voice acquisition method, a voice acquisition device, electronic equipment and a storage medium, wherein the voice acquisition method comprises the following steps: receiving a binding instruction triggered by touch operation in an interactive interface displayed by application program software; responding to the binding instruction, and sending out sound waves prestored in the application program software to enable the voice acquisition equipment to acquire the sound waves and transmit the sound waves to the server, wherein the sound waves carry identity identification information prestored in the application program software; under the condition that the server binds the identity identification information corresponding to the sound waves with the voice acquisition equipment, the voice acquisition equipment is controlled to correspondingly store the acquired voice and the identity identification information. The invention solves the problem of low accuracy of the acquired identity information corresponding to the voice.

Description

Voice acquisition method and device, electronic equipment and storage medium

Technical Field

The invention relates to the field of electronic equipment, in particular to a voice acquisition method and device, electronic equipment and a storage medium.

Background

At present, when voice is collected, the identity information corresponding to the voice must be entered manually to mark the speaker, and the voice is then stored in correspondence with the speaker's identity information.

In practice, manual entry of identity information is prone to incorrect or missing input, so the finally acquired voice matches the identity information poorly and the speaker is difficult to determine accurately. Voice acquisition that relies on manually entered identity information therefore suffers from low accuracy of the identity information corresponding to the acquired voice.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiment of the invention provides a voice acquisition method, a voice acquisition device, electronic equipment and a storage medium, and at least solves the technical problem of low accuracy of acquired identity information corresponding to voice.

According to an aspect of an embodiment of the present invention, there is provided a voice collecting method, including: receiving a binding instruction triggered by touch operation in an interactive interface displayed by application program software; responding to the binding instruction, and sending out a sound wave prestored in the application program software so that a voice acquisition device acquires the sound wave and transmits the sound wave to a server, wherein the sound wave carries identity identification information prestored in the application program software; and under the condition that the server binds the identity identification information corresponding to the sound wave with the voice acquisition equipment, controlling the voice acquisition equipment to correspondingly store the acquired voice and the identity identification information.

According to another aspect of the embodiments of the present invention, there is also provided a voice collecting apparatus, including: the receiving unit is used for receiving a binding instruction triggered by touch operation in an interactive interface displayed by application program software; the sending unit is used for responding to the binding instruction and sending sound waves stored in the application program software in advance so that the sound waves are collected by the voice collecting equipment and transmitted to the server, wherein the sound waves carry identity identification information stored in the application program software in advance; and the storage unit is used for controlling the voice acquisition equipment to correspondingly store the acquired voice and the identity identification information under the condition that the server binds the identity identification information corresponding to the sound wave and the voice acquisition equipment.

According to another aspect of the embodiments of the present invention, there is also provided a storage medium, in which a computer program is stored, wherein the computer program is configured to execute the above-mentioned voice collecting method when running.

According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above-mentioned voice collecting method through the computer program.

In the embodiment of the invention, a binding instruction triggered by a touch operation is received in an interactive interface displayed by application software; in response to the binding instruction, a sound wave prestored in the application software is emitted so that the voice acquisition device acquires the sound wave and transmits it to the server, wherein the sound wave carries identity identification information prestored in the application software; and, under the condition that the server binds the identity identification information corresponding to the sound wave with the voice acquisition device, the voice acquisition device is controlled to store the acquired voice in correspondence with the identity identification information. In this process, the application software emits the prestored sound wave, the voice acquisition device receives the sound wave and transmits it to the server, and the server binds the identity identification information carried in the sound wave with the voice acquisition device, so that the voice acquired by the voice acquisition device is stored in correspondence with the identity identification information. The identity identification information does not need to be entered manually; the voice acquisition device only needs to acquire the sound wave. This reduces the probability of incorrect or missing input that arises with manual entry and solves the technical problem of low accuracy of the identity identification information corresponding to the acquired voice.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a schematic diagram of a network environment for an alternative voice capture method according to an embodiment of the present invention;

FIG. 2 is a flow chart of an alternative method of speech acquisition according to an embodiment of the present invention;

FIG. 3 is a flow chart of an alternative method of speech acquisition according to an embodiment of the present invention;

FIG. 4 is a flow chart of an alternative method of speech acquisition according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an alternative voice capture device according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an alternative voice capture device according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

According to an aspect of the embodiments of the present invention, a voice acquisition method is provided. As an optional implementation, the voice acquisition method may be, but is not limited to being, applied to a voice acquisition system in the network environment shown in FIG. 1, where the voice acquisition system includes user equipment 102, a network 110, and a server 112. The user equipment 102 includes a display 108, a processor 106, and a memory 104. The display 108 is configured to display the interactive interface presented by the application software and may also detect human-computer interaction operations on that interface; for example, a user may perform a touch operation such as clicking a virtual key on the interactive interface, which is not limited in the embodiment of the present invention. The processor 106 is configured to generate a corresponding operation instruction (e.g., a binding instruction for instructing binding of the voice acquisition device) according to the human-computer interaction operation. The memory 104 is used for storing the operation instructions. The method proceeds through the following steps:

S101, the user equipment 102 receives a binding instruction triggered by a touch operation in an interactive interface displayed by application software;

S102, the user equipment 102, in response to the binding instruction, emits a sound wave prestored in the application software so that the voice acquisition device acquires the sound wave and transmits it to the server, wherein the sound wave carries identity identification information prestored in the application software;

S103, the server 112 binds the identity identification information corresponding to the sound wave with the voice acquisition device;

S104, the server 112 sends a control instruction to the network 110;

S105, the network 110 sends the control instruction to the user equipment 102;

S106, the user equipment 102, in response to the control instruction, controls the voice acquisition device to store the acquired voice in correspondence with the identity identification information under the condition that the server has bound the identity identification information corresponding to the sound wave with the voice acquisition device.

In the embodiment of the present invention, with reference to steps S101 to S106 above, the user equipment 102 may be a terminal device that supports running application software, such as a tablet computer, a notebook computer, or a PC, which is not limited in the embodiment of the present invention. The user equipment 102 runs the application software, and the interactive interface presented by the application software is displayed on the display 108 of the user equipment 102. The interactive interface includes at least a virtual binding key, which the user clicks to trigger a binding instruction. After receiving the binding instruction, the user equipment 102 emits, in response, a sound wave prestored in the application software, and the sound wave carries identity identification information prestored in the application software. In addition, when the user equipment 102 runs the application software for the first time, the user is required to register and input a sound wave that carries the user's identity identification information. The sound wave input by the user is stored in the application software, so that the user is logged in automatically when running the application software on the user equipment 102 again. The voice acquisition device receives the sound wave emitted by the user equipment 102 and transmits it to the server 112.

The server 112 includes a database 114 and a processing engine 116. The database 114 may be configured to store each sound wave together with the identity identification information matched with that sound wave, and the processing engine 116 may be configured to bind the identity identification information corresponding to the sound wave with the voice acquisition device. In this process, the application software emits the prestored sound wave, the voice acquisition device receives the sound wave and transmits it to the server, and the server binds the identity identification information carried in the sound wave with the voice acquisition device, so that the voice acquired by the voice acquisition device is stored in correspondence with the identity identification information. The identity identification information does not need to be entered manually; the voice acquisition device only needs to acquire the sound wave. This reduces the probability of incorrect or missing input that arises with manual entry and solves the technical problem of low accuracy of the identity identification information corresponding to the acquired voice.
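The division of work between the database 114 and the processing engine 116 can be illustrated with a minimal Python sketch. It assumes that the server has already reduced the received sound wave to a lookup key; the patent does not specify how the sound wave is decoded or matched. The names `BindingServer`, `wave_to_id`, `bindings`, and `bind`, as well as the sample keys and IDs, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BindingServer:
    """Hypothetical stand-in for server 112 (database 114 plus processing engine 116)."""

    # Role of database 114: sound-wave key -> identity identification information.
    wave_to_id: Dict[str, str] = field(default_factory=dict)
    # Result of processing engine 116: device id -> bound identification information.
    bindings: Dict[str, str] = field(default_factory=dict)

    def bind(self, device_id: str, wave_key: str) -> Optional[str]:
        """Bind the device to the ID matched by the sound wave.

        Returns the bound identification information, or None when the
        sound wave matches no stored entry (binding fails).
        """
        identification = self.wave_to_id.get(wave_key)
        if identification is None:
            return None
        self.bindings[device_id] = identification
        return identification


# Example with made-up values: one registered sound wave, one recorder.
server = BindingServer(wave_to_id={"wave-abc": "employee-001"})
print(server.bind("recorder-7", "wave-abc"))
print(server.bind("recorder-8", "wave-unknown"))
```

Returning `None` for an unmatched sound wave is one possible way to signal a failed binding; the patent only states that the server may indicate binding failure, not how.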

Optionally, in this embodiment, the user equipment may be, but is not limited to, a tablet computer, a notebook computer, a PC, or other computer equipment that supports running an application client. The server and the user equipment may, but are not limited to, exchange data through a network, which may include, but is not limited to, a wireless network or a wired network. The wireless network includes Bluetooth, WIFI, and other networks that enable wireless communication. The wired network may include, but is not limited to, wide area networks, metropolitan area networks, and local area networks. The above is merely an example, and this is not limited in this embodiment.

As an optional implementation, as shown in FIG. 2, the voice acquisition method includes:

S201, receiving a binding instruction triggered by a touch operation in an interactive interface displayed by application software;

S202, in response to the binding instruction, emitting a sound wave prestored in the application software so that the voice acquisition device acquires the sound wave and transmits it to the server, wherein the sound wave carries identity identification information prestored in the application software;

S203, under the condition that the server binds the identity identification information corresponding to the sound wave with the voice acquisition device, controlling the voice acquisition device to store the acquired voice in correspondence with the identity identification information.

In the embodiment of the invention, the terminal device is loaded with application software for binding the voice acquisition device with the identity identification information. When a user runs the application software on the terminal device for the first time, a user registration interface may be displayed, which includes at least a sound wave input key. After the user clicks the sound wave input key, a sound wave input by the user, which carries the user's identity identification information, can be acquired by the voice acquisition module of the terminal device. When the application software is subsequently run again, it enters a user login mode, in which the sound wave carrying the identity identification information that the user input on the user registration interface can be obtained. Optionally, the user registration interface may further include a registration name input box in which the user may enter a registration name, so that when the application software is subsequently run again and enters the user login mode, the identity information and the like that the user entered in the user registration interface can also be obtained, which is not limited in the embodiment of the present invention.

After completing registration, when the user runs the application software on the terminal device again, the interactive interface presented by the application software may be displayed. The user triggers an instruction by performing the corresponding interactive operation on the interactive interface; for example, the user may trigger a binding instruction by clicking a virtual binding key or a virtual binding icon. In response to the binding instruction, the terminal device emits a sound wave prestored in the application software, so that the voice acquisition device acquires the sound wave and transmits it to the server. The voice acquisition device may be a recording device worn by an offline store employee, specifically an earphone, a recording pen, or an electronic device with a recording function, where the electronic device with a recording function may include, but is not limited to, a mobile phone or a tablet with a recording function, which is not limited in the embodiment of the present invention. The server receives the sound wave transmitted by the voice acquisition device and determines the identity identification information corresponding to the sound wave, so that this identity identification information is bound with the voice acquisition device, where the identity identification information may be an employee's ID information, number information that distinguishes employees, or the like. Once the identity identification information is bound with the voice acquisition device, the voice acquired by the voice acquisition device is stored in correspondence with the identity identification information.

Under the condition that the identity identification information is bound with the voice acquisition device, the voice acquisition device may add the identity identification information to the acquired voice and store the voice and the identity identification information correspondingly in the voice acquisition device. Alternatively, the voice acquisition device may transmit the acquired voice to the server, so that the server stores the voice and the identity identification information correspondingly in the server. Alternatively, the voice acquisition device may transmit the acquired voice to the terminal device, so that the terminal device stores the voice and the identity identification information correspondingly in the terminal device. Optionally, after the terminal device has stored them, the voice acquired by the voice acquisition device and the identity identification information can be obtained by running the application software on the terminal device.
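The three storage options described above (on the voice acquisition device, on the server, or on the terminal device) can be sketched in Python as follows. This is only an illustration under stated assumptions: the patent does not define a record format, so `VoiceRecord`, `StorageTarget`, `store_voice`, and the in-memory `stores` dictionary are hypothetical names, and audio is represented as raw bytes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class StorageTarget(Enum):
    """Where the voice and its identification information are stored."""
    DEVICE = "voice acquisition device"
    SERVER = "server"
    TERMINAL = "terminal device"


@dataclass(frozen=True)
class VoiceRecord:
    """Acquired voice stored in correspondence with the bound identification."""
    identification: str
    audio: bytes


Stores = Dict[StorageTarget, List[VoiceRecord]]


def store_voice(stores: Stores, target: StorageTarget,
                bound_id: str, audio: bytes) -> VoiceRecord:
    """Tag the acquired audio with the bound ID and append it to the chosen store."""
    record = VoiceRecord(identification=bound_id, audio=audio)
    stores.setdefault(target, []).append(record)
    return record


# Example with made-up values: the same binding, stored on the server.
stores: Stores = {}
store_voice(stores, StorageTarget.SERVER, "employee-001", b"\x00\x01\x02")
print(len(stores[StorageTarget.SERVER]))
```

Because the identification is attached at storage time from the binding rather than typed in, the record cannot carry a mistyped or missing ID, which is the benefit the embodiment describes.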

In the embodiment of the invention, a binding instruction triggered by a touch operation is received in an interactive interface displayed by application software; in response to the binding instruction, a sound wave prestored in the application software is emitted so that the voice acquisition device acquires the sound wave and transmits it to the server, wherein the sound wave carries identity identification information prestored in the application software; and, under the condition that the server binds the identity identification information corresponding to the sound wave with the voice acquisition device, the voice acquisition device is controlled to store the acquired voice in correspondence with the identity identification information. In this process, the application software emits the prestored sound wave, the voice acquisition device receives the sound wave and transmits it to the server, and the server binds the identity identification information carried in the sound wave with the voice acquisition device, so that the voice acquired by the voice acquisition device is stored in correspondence with the identity identification information. The identity identification information does not need to be entered manually; the voice acquisition device only needs to acquire the sound wave. This reduces the probability of incorrect or missing input that arises with manual entry and solves the technical problem of low accuracy of the identity identification information corresponding to the acquired voice.

As an optional implementation, FIG. 3 shows another optional voice acquisition method disclosed in the embodiment of the present invention, which may include the following steps:

S301, receiving a binding instruction triggered by a touch operation in an interactive interface displayed by application software;

S302, in response to the binding instruction, emitting a sound wave prestored in the application software so that the voice acquisition device acquires the sound wave and transmits it to a server, wherein the sound wave carries identity identification information prestored in the application software;

S303, under the condition that the server binds the identity information corresponding to the sound wave with the voice acquisition device, controlling the voice acquisition device to store the acquired voice in correspondence with the identity information, wherein the identity information is the identity information that the server found to match the identity identification information corresponding to the sound wave;

S304, determining a voice processing type corresponding to the identity information;

S305, processing the voice corresponding to the identity identification information according to the voice processing mode indicated by the voice processing type.

In the embodiment of the present invention, the voice processing type corresponding to the identity information may also be determined, and the voice corresponding to the identity information is processed according to the voice processing mode indicated by the voice processing type.

In the embodiment of the invention, after the identity identification information is sent to the server, the server can also search the database for the identity information corresponding to the identity identification information, wherein the database can store a plurality of pieces of identity identification information and the identity information corresponding to each. The identity information may include, but is not limited to, name, position, age, years of service, and the like. Specifically, the server may bind the identity information with the voice acquisition device and store the voice acquired by the voice acquisition device in correspondence with the identity information. In addition, the voice processing type corresponding to the identity information can be determined from the identity information, and the voice corresponding to the identity information is processed according to the voice processing mode indicated by the voice processing type. Optionally, determining the voice processing type corresponding to the identity information may include determining the voice processing type that corresponds to the position information contained in the identity information. When the voices of employees acquired by the voice acquisition device are processed, different voice processing types can be adopted for employees in different positions.

For example, the work requirements for an employee whose position is supervisor differ from those for an employee whose position is sales. The voice processing type for a supervisor may be one for monitoring management instructions, and the voice processing type for a sales employee may be one for monitoring service operation. The voice processing mode indicated by the type for monitoring management instructions may match the acquired voice against prestored management instruction voice, and the voice processing mode indicated by the type for monitoring service operation may match the acquired voice against prestored service operation. In this way, voice is classified and processed by means of the identity information corresponding to it, which improves voice processing efficiency.
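The position-based selection of a voice processing type can be sketched in Python as a pair of lookup tables. This is a simplified illustration, not the patented matching procedure: it assumes the acquired voice has already been converted to a text transcript (speech recognition is out of scope here), and the type names, the reference phrases, and the functions `processing_type_for` and `process_voice` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical mapping from the position in the identity information
# to a voice processing type.
PROCESSING_TYPE_BY_POSITION: Dict[str, str] = {
    "supervisor": "monitor_management_instruction",
    "sales": "monitor_service_operation",
}

# Hypothetical prestored reference phrases for each processing type.
REFERENCE_PHRASES: Dict[str, List[str]] = {
    "monitor_management_instruction": ["restock shelf", "open register"],
    "monitor_service_operation": ["welcome to our store", "anything else"],
}


def processing_type_for(identity_info: Dict[str, str]) -> str:
    """Determine the processing type from the position in the identity information."""
    return PROCESSING_TYPE_BY_POSITION[identity_info["position"]]


def process_voice(transcript: str, identity_info: Dict[str, str]) -> List[str]:
    """Return the prestored phrases of the selected type found in the transcript."""
    phrases = REFERENCE_PHRASES[processing_type_for(identity_info)]
    text = transcript.lower()
    return [phrase for phrase in phrases if phrase in text]


# Example with made-up values.
supervisor = {"name": "A", "position": "supervisor"}
print(process_voice("Please restock shelf three", supervisor))
```

Keeping the position-to-type mapping in data rather than in branching code means a new position only requires a new table entry.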

In the embodiment of the invention, a binding instruction triggered by a touch operation is received in an interactive interface displayed by application software; in response to the binding instruction, a sound wave prestored in the application software is emitted so that the voice acquisition device acquires the sound wave and transmits it to the server, wherein the sound wave carries identity identification information prestored in the application software; and, under the condition that the server binds the identity identification information corresponding to the sound wave with the voice acquisition device, the voice acquisition device is controlled to store the acquired voice in correspondence with the identity identification information. In this process, the application software emits the prestored sound wave, the voice acquisition device receives the sound wave and transmits it to the server, and the server binds the identity identification information carried in the sound wave with the voice acquisition device, so that the voice acquired by the voice acquisition device is stored in correspondence with the identity identification information. The identity identification information does not need to be entered manually; the voice acquisition device only needs to acquire the sound wave. This reduces the probability of incorrect or missing input that arises with manual entry and solves the technical problem of low accuracy of the identity identification information corresponding to the acquired voice.

In addition, the identity identification information can be sent to the server so that the server searches for the identity information matched with the identity identification information. The identity information can then be bound with the voice acquisition device, and under that condition the acquired voice is stored in correspondence with the identity information. More identity information corresponding to the voice can thus be obtained, which makes it convenient to process the voice according to different types of identity information and increases the diversity of the acquired information. Furthermore, the voice can be processed according to the voice processing mode indicated by the voice processing type corresponding to the identity identification information, which meets the need to process voice by category and improves voice processing efficiency.

As an optional implementation, FIG. 4 shows another optional voice acquisition method disclosed in the embodiment of the present invention, which may include the following steps:

S401, receiving a binding instruction triggered by a touch operation in an interactive interface displayed by application software;

S402, in response to the binding instruction, emitting a sound wave prestored in the application software so that the voice acquisition device acquires the sound wave by using a sound wave receiving module and transmits it to the server, wherein the sound wave carries identity identification information prestored in the application software;

S403, under the condition that a retry instruction indicating that the binding has failed is received from the server, issuing a prompt message prompting that the sound wave be emitted again;

S404, under the condition that the server binds the identity information corresponding to the sound wave with the voice acquisition device, controlling the voice acquisition device to store the acquired voice in correspondence with the identity information, wherein the identity information is the identity information that the server found to match the identity identification information corresponding to the sound wave;

S405, determining a voice processing type corresponding to the identity information;

S406, processing the voice corresponding to the identity information according to the voice processing mode indicated by the voice processing type.

In the embodiment of the invention, the voice acquisition equipment is provided with the sound wave receiving module, and the sound wave receiving module can be used for receiving the sound wave which is sent by the terminal equipment and carries the identity identification information. After the voice collecting device receives the sound waves by using the sound wave receiving module and transmits the sound waves to the server, if the server returns a retry instruction for indicating that the binding fails, a prompt message for prompting to re-send the sound waves is sent out on the terminal device in response to the retry instruction.
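The retry behavior can be sketched in Python as a small loop on the terminal side. This is a minimal sketch under stated assumptions: `bind_with_retry`, its callback parameters, the prompt text, and the `max_attempts` limit are hypothetical; the patent only states that a prompt to re-emit the sound wave is issued when the server returns a retry instruction, and does not cap the number of attempts.

```python
from typing import Callable


def bind_with_retry(emit_and_bind: Callable[[], bool],
                    prompt: Callable[[str], None],
                    max_attempts: int = 3) -> bool:
    """Emit the sound wave until the server reports a successful binding.

    emit_and_bind: emits the prestored sound wave and returns True when the
        server bound the device, False when it returned a retry instruction.
    prompt: shows a message to the user on the terminal device.
    max_attempts: assumed upper bound on attempts (not from the patent).
    """
    for _ in range(max_attempts):
        if emit_and_bind():
            return True
        prompt("Binding failed; please emit the sound wave again.")
    return False


# Example with a simulated server that fails once and then succeeds.
responses = iter([False, True])
messages = []
print(bind_with_retry(lambda: next(responses), messages.append))
print(len(messages))
```

Passing the emission and prompt steps in as callables keeps the sketch independent of any particular audio or user interface library.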

According to another aspect of the embodiment of the invention, a voice acquisition device for implementing the voice acquisition method is also provided. As shown in fig. 5, the apparatus includes:

a receiving unit 501, configured to receive a binding instruction triggered by a touch operation in an interactive interface displayed by application software;

the sending unit 502 is configured to send, in response to the binding instruction, a sound wave pre-stored in the application software, so that the voice collecting device collects the sound wave and transmits the sound wave to the server, where the sound wave carries identity information pre-stored in the application software;

the storage unit 503 is configured to control the voice acquisition device to store the acquired voice and the identity information correspondingly under the condition that the server binds the identity information corresponding to the sound wave and the voice acquisition device.

In the embodiment of the invention, the terminal device is loaded with application software for binding the voice acquisition device to the identity identification information. When a user runs the application software on the terminal device for the first time, a user registration interface may be displayed. The user registration interface includes at least a sound wave input key; after the user clicks this key, a sound wave input by the user, which carries the identity identification information of the user, may be acquired by a voice acquisition module of the terminal device. When the application software is subsequently run again, it enters a user login mode, in which the sound wave carrying the identity identification information that the user input on the user registration interface can be obtained. Optionally, the user registration interface may further include a registration name input box in which the user may enter a registration name, so that when the application software is subsequently run again and enters the user login mode, the identity identification information and other information entered on the user registration interface can also be obtained; this is not limited in the embodiment of the present invention.
When the user runs the application software on the terminal device again after completing registration, the interactive interface presented by the application software may be displayed, and the user may trigger an instruction by performing the corresponding interactive operation on it. For example, the user may trigger a binding instruction by clicking a virtual binding key or a virtual binding icon in the interactive interface, and the terminal device may, in response to the binding instruction, emit the sound wave pre-stored in the application software, so that the voice acquisition device acquires the sound wave and transmits it to the server. The voice acquisition device may be a recording device worn by an employee of an offline store, specifically an earphone, a recording pen, or an electronic device with a recording function; such an electronic device may include, but is not limited to, a mobile phone or a tablet with a recording function, which is not limited in the embodiment of the present invention. The server may receive the sound wave emitted by the terminal device (as forwarded by the voice acquisition device) and determine the identity identification information corresponding to the sound wave, so as to bind that identity identification information to the voice acquisition device. The identity identification information may be ID information of an employee, number information distinguishing different employees, or the like, which is not limited in the embodiment of the present invention. Once the server has bound the identity identification information to the voice acquisition device, the voice acquired by the voice acquisition device is stored in correspondence with the identity identification information.
When the identity identification information is bound to the voice acquisition device, the voice acquisition device may attach the identity identification information to the acquired voice, so that the voice and the identity identification information are stored correspondingly on the voice acquisition device. Alternatively, the voice acquisition device may transmit the acquired voice to the server, so that the server stores the voice and the identity identification information correspondingly on the server. As a further alternative, the voice acquisition device may transmit the acquired voice to the terminal device, so that the terminal device stores the voice and the identity identification information correspondingly on the terminal device. Optionally, after the terminal device has stored them, the voice acquired by the voice acquisition device and the identity identification information may be retrieved by running the application software on the terminal device.
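The corresponding storage described above can be illustrated with a minimal sketch. All names here (`VoiceRecord`, `VoiceStore`, `VoiceAcquisitionDevice`) are hypothetical and chosen only for illustration; the single `VoiceStore` stands in for whichever location holds the records (the voice acquisition device, the server, or the terminal device), since the embodiment leaves that choice open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VoiceRecord:
    """One acquired voice segment stored together with its identity."""

    identity_id: str  # identity identification information bound to the device
    audio: bytes      # acquired voice data


class VoiceStore:
    """Stands in for storage on the device, the server, or the terminal."""

    def __init__(self) -> None:
        self._records: List[VoiceRecord] = []

    def save(self, record: VoiceRecord) -> None:
        self._records.append(record)

    def by_identity(self, identity_id: str) -> List[VoiceRecord]:
        """Return every stored record tagged with the given identity."""
        return [r for r in self._records if r.identity_id == identity_id]


class VoiceAcquisitionDevice:
    """Tags each acquired voice segment with the bound identity."""

    def __init__(self, device_id: str, store: VoiceStore) -> None:
        self.device_id = device_id
        self.store = store
        self.bound_identity: Optional[str] = None

    def bind(self, identity_id: str) -> None:
        # Called once the server has confirmed the binding (assumed flow).
        self.bound_identity = identity_id

    def collect(self, audio: bytes) -> VoiceRecord:
        if self.bound_identity is None:
            raise RuntimeError("device is not bound to any identity")
        record = VoiceRecord(self.bound_identity, audio)
        self.store.save(record)
        return record
```

In this sketch, voice collected before any binding is rejected rather than stored without an identity; that is a design choice of the example, not something the embodiment prescribes.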

In the embodiment of the invention, a binding instruction triggered by a touch operation is received in an interactive interface displayed by application software; in response to the binding instruction, a sound wave pre-stored in the application software is emitted, so that the voice acquisition device acquires the sound wave and transmits it to the server, where the sound wave carries identity identification information pre-stored in the application software; and when the server binds the identity identification information corresponding to the sound wave to the voice acquisition device, the voice acquisition device is controlled to store the acquired voice in correspondence with the identity identification information. In this process, the application software emits the pre-stored sound wave, the voice acquisition device receives the sound wave and transmits it to the server, and the server binds the identity identification information carried in the sound wave to the voice acquisition device, so that the voice acquired by the voice acquisition device is stored in correspondence with that identity identification information. The identity identification information does not need to be entered manually; it is only necessary to read a near field communication card through the voice acquisition device. This reduces the probability of incorrect or omitted entries that arises with manual input, and solves the technical problem that the identity information corresponding to the acquired voice has low accuracy.

As an optional implementation, the storage unit 503 being configured to control the voice acquisition device to store the acquired voice in correspondence with the identity information, when the server binds the identity information corresponding to the sound wave to the voice acquisition device, is specifically as follows:

the storage unit 503 is specifically configured to, under the condition that the server binds the identity information corresponding to the sound wave with the voice acquisition device, control the voice acquisition device to correspondingly store the acquired voice and the identity information, where the identity information is identity information matched with the identity identification information corresponding to the sound wave found by the server.

As an optional implementation, the voice acquisition apparatus shown in fig. 6 is an improvement on the voice acquisition apparatus shown in fig. 5. Compared with the apparatus shown in fig. 5, the apparatus shown in fig. 6 further includes:

a determining unit 504, configured to determine a voice processing type corresponding to the identity information after the voice acquired by the voice acquisition device is stored in correspondence with the identity information;

and the processing unit 505 is configured to process the voice corresponding to the identity information according to the voice processing manner indicated by the voice processing type.

In the embodiment of the present invention, after the identity identification information is sent to the server, the server may further search a database for the identity information corresponding to the identity identification information, where the database may store a plurality of pieces of identity identification information and the identity information corresponding to each piece. The identity information may include, but is not limited to, name, position, age, length of service, and the like. Specifically, the server may bind the identity information to the voice acquisition device and store the voice acquired by the voice acquisition device in correspondence with the identity information. In addition, a voice processing type corresponding to the identity information may be determined according to the identity information, and the voice corresponding to the identity information may be processed according to the voice processing manner indicated by the voice processing type. Optionally, determining the voice processing type corresponding to the identity information may include determining, according to position information in the identity information, the voice processing type corresponding to that position information. Where the voice acquisition device acquires employees' voices for processing, different voice processing types may be adopted according to the employees' different positions.
For example, the work requirements differ for an employee whose position is supervisor and an employee whose position is sales. The voice processing type for a supervisor may be one for monitoring management instructions, and the voice processing type for a sales employee may be one for monitoring service operations. The voice processing manner indicated by the type for monitoring management instructions may match the acquired voice against pre-stored management instruction voice, and the manner indicated by the type for monitoring service operations may match the acquired voice against pre-stored service operations. In this way, voice is processed by category using the identity information corresponding to the voice, which improves voice processing efficiency.
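The lookup of identity information and the position-based choice of voice processing type can be sketched as below. This is a simplified illustration under several assumptions: the database contents, the employee identifiers, the phrase lists and the function names are all hypothetical, and text transcripts stand in for the acquired voice, whereas the embodiment matches against pre-stored voice.

```python
from typing import Callable, Dict, List

# Hypothetical database: identity identification information -> identity information.
IDENTITY_DB: Dict[str, Dict[str, str]] = {
    "E001": {"name": "Employee A", "position": "supervisor"},
    "E002": {"name": "Employee B", "position": "sales"},
}

# Hypothetical pre-stored reference phrases (text stands in for stored voice).
MANAGEMENT_INSTRUCTIONS = ["open the second checkout", "restock aisle three"]
SERVICE_PHRASES = ["welcome to our store", "may i help you"]


def match_phrases(transcript: str, phrases: List[str]) -> List[str]:
    """Return the reference phrases that occur in the transcript."""
    lowered = transcript.lower()
    return [p for p in phrases if p in lowered]


# Voice processing type, keyed by the position in the identity information.
PROCESSORS: Dict[str, Callable[[str], List[str]]] = {
    "supervisor": lambda t: match_phrases(t, MANAGEMENT_INSTRUCTIONS),
    "sales": lambda t: match_phrases(t, SERVICE_PHRASES),
}


def process_voice(identity_id: str, transcript: str) -> List[str]:
    """Look up the identity information and apply the matching processor."""
    info = IDENTITY_DB[identity_id]           # server-side lookup
    processor = PROCESSORS[info["position"]]  # determine voice processing type
    return processor(transcript)              # process in the indicated manner
```

With these assumed tables, `process_voice("E001", "Please restock aisle three")` selects the management instruction matcher because the looked-up position is supervisor, while the same call for `"E002"` would use the service phrase matcher instead.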

As an optional implementation, the sending unit 502 being configured to emit a sound wave pre-stored in the application software, so that the voice acquisition device acquires the sound wave and transmits it to the server, is specifically as follows:

the sending unit 502 is configured to send out sound waves pre-stored in the application software, so that the voice collecting device collects the sound waves by using the sound wave receiving module and transmits the sound waves to the server.
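The embodiment does not specify how the sound wave carries the identity identification information. As one possible illustration only, the sketch below encodes an identifier as a sequence of two tones (simple frequency-shift keying) and recovers it with the Goertzel algorithm on the receiving side. The sample rate, bit duration, tone frequencies and function names are all assumptions made for this example and are not taken from the source.

```python
import math
from typing import List, Sequence

SAMPLE_RATE = 44100   # samples per second (assumed)
BIT_SAMPLES = 441     # 10 ms of audio per bit (assumed)
FREQ_ZERO = 18000.0   # tone for bit 0, in Hz (assumed)
FREQ_ONE = 19000.0    # tone for bit 1, in Hz (assumed)


def encode(identity_id: str) -> List[float]:
    """Turn an identifier into audio samples, one tone burst per bit."""
    samples: List[float] = []
    for byte in identity_id.encode("utf-8"):
        for shift in range(7, -1, -1):  # most significant bit first
            freq = FREQ_ONE if (byte >> shift) & 1 else FREQ_ZERO
            samples.extend(
                math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
                for n in range(BIT_SAMPLES)
            )
    return samples


def tone_power(chunk: Sequence[float], freq: float) -> float:
    """Goertzel algorithm: signal power of `chunk` at frequency `freq`."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s_prev = s_prev2 = 0.0
    for x in chunk:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def decode(samples: Sequence[float]) -> str:
    """Recover the identifier by comparing the two tone powers per bit."""
    bits: List[int] = []
    for start in range(0, len(samples) - BIT_SAMPLES + 1, BIT_SAMPLES):
        chunk = samples[start:start + BIT_SAMPLES]
        one = tone_power(chunk, FREQ_ONE)
        zero = tone_power(chunk, FREQ_ZERO)
        bits.append(1 if one > zero else 0)
    data = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8")
```

The two tone frequencies are chosen so that each bit interval contains a whole number of cycles of either tone, which keeps the two tones cleanly separable in this noise-free sketch. A practical sound wave receiving module would additionally need synchronisation and error checking, which are omitted here.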

As an optional implementation, the sending unit 502 is further configured to: after emitting, in response to the binding instruction, the sound wave pre-stored in the application software so that the voice acquisition device acquires the sound wave and transmits it to the server, issue a prompt message prompting the user to emit the sound wave again when a retry instruction indicating that the binding has failed is received from the server.

In the embodiment of the invention, a binding instruction triggered by a touch operation is received in an interactive interface displayed by application software; in response to the binding instruction, a sound wave pre-stored in the application software is emitted, so that the voice acquisition device acquires the sound wave and transmits it to the server, where the sound wave carries identity identification information pre-stored in the application software; and when the server binds the identity identification information corresponding to the sound wave to the voice acquisition device, the voice acquisition device is controlled to store the acquired voice in correspondence with the identity identification information. In this process, the application software emits the pre-stored sound wave, the voice acquisition device receives the sound wave and transmits it to the server, and the server binds the identity identification information carried in the sound wave to the voice acquisition device, so that the voice acquired by the voice acquisition device is stored in correspondence with that identity identification information. The identity identification information does not need to be entered manually; it is only necessary to read a near field communication card through the voice acquisition device. This reduces the probability of incorrect or omitted entries that arises with manual input, and solves the technical problem that the identity information corresponding to the acquired voice has low accuracy.
In addition, the identity identification information may be sent to the server so that the server searches for identity information matching the identity identification information. The identity information may then be bound to the voice acquisition device, and, once bound, the acquired voice is stored in correspondence with the identity information. More identity information corresponding to the voice can thus be obtained, which makes it convenient to perform processing according to different types of identity information and increases the diversity of the information obtained. Furthermore, the voice may be processed according to the voice processing manner indicated by the voice processing type corresponding to the identity identification information, which meets the need for classified voice processing and improves voice processing efficiency.

According to another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above-mentioned voice collecting method, as shown in fig. 7, the electronic device includes a memory 702 and a processor 704, the memory 702 stores a computer program therein, and the processor 704 is configured to execute the steps in any of the above-mentioned method embodiments through the computer program.

S1, receiving a binding instruction triggered by touch operation in an interactive interface displayed by application program software;

S2, in response to the binding instruction, emitting a sound wave pre-stored in the application software, so that the voice acquisition device acquires the sound wave and transmits the sound wave to the server, wherein the sound wave carries identity identification information pre-stored in the application software;

S3, when the server binds the identity identification information corresponding to the sound wave to the voice acquisition device, controlling the voice acquisition device to store the acquired voice in correspondence with the identity identification information.

Optionally, those skilled in the art will understand that the structure shown in fig. 7 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. Fig. 7 is a diagram illustrating one possible structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 7, or have a configuration different from that shown in fig. 7.

The memory 702 may be used to store software programs and modules, such as program instructions/modules corresponding to the voice acquisition method and apparatus in the embodiments of the present invention, and the processor 704 executes various functional applications and data processing by running the software programs and modules stored in the memory 702, so as to implement the above-mentioned voice acquisition method. The memory 702 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 702 can further include memory located remotely from the processor 704, which can be coupled to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. As an example, as shown in fig. 7, the memory 702 may include, but is not limited to, the receiving unit 501, the sending unit 502, and the storage unit 503 in the voice capturing apparatus. In addition, other module units in the above-mentioned voice acquisition device may also be included, but are not limited to these, and are not described in this example again.

Optionally, the transmission device 706 is used for receiving or sending data via a network. Examples of the network may include wired networks and wireless networks. In one example, the transmission device 706 includes a Network Interface Controller (NIC), which can be connected to other network devices and a router via a network cable so as to communicate with the internet or a local area network. In one example, the transmission device 706 is a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.

In addition, the electronic device further includes: a display 708 for displaying an interactive interface displayed by the running application software; and a connection bus 714 for connecting the respective module parts in the above-described electronic apparatus.

According to a further aspect of embodiments of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.

Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:

Optionally, in this embodiment, those skilled in the art will understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, which may include a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.

In the above embodiments of the present invention, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described in detail in a certain embodiment.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be implemented in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.

The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims

1. A method for speech acquisition, comprising:

receiving a binding instruction triggered by touch operation in an interactive interface displayed by application program software;

responding to the binding instruction, and sending out sound waves prestored in the application program software to enable voice acquisition equipment to acquire the sound waves and transmit the sound waves to a server, wherein the sound waves carry identity identification information prestored in the application program software and corresponding to a user account number currently logged in by the application program software;

under the condition that the server binds the identity identification information corresponding to the sound wave with the voice acquisition equipment, controlling the voice acquisition equipment to correspondingly store the acquired voice and the identity identification information;

after controlling the voice collected by the voice collecting device and the identity information to be correspondingly stored, the method further comprises the following steps:

determining a voice processing type corresponding to the identity information;

and processing the voice corresponding to the identity identification information according to the voice processing mode indicated by the voice processing type.

2. The method according to claim 1, wherein, in a case where the server binds the identification information corresponding to the sound wave with the voice collecting device, controlling the voice collecting device to store the collected voice corresponding to the identification information includes:

and under the condition that the server binds the identity information corresponding to the sound wave with the voice acquisition equipment, controlling the voice acquisition equipment to correspondingly store the acquired voice and the identity information, wherein the identity information is the identity information matched with the identity identification information corresponding to the sound wave searched by the server.

3. The method according to claim 1, wherein the emitting the sound waves stored in the application software in advance to enable a voice collecting device to collect the sound waves and transmit the sound waves to a server comprises:

and sending out sound waves prestored in the application program software so that the voice acquisition equipment acquires the sound waves by using a sound wave receiving module and transmits the sound waves to a server.

4. The method according to any one of claims 1 to 3, wherein after the sound wave pre-stored in the application software is emitted in response to the binding instruction, so that the voice collecting device collects the sound wave and transmits the sound wave to the server, the method further comprises:

and sending out a prompt message for prompting to send out the sound wave again under the condition of receiving a retry instruction which is returned by the server and used for indicating that the binding fails.

5. A speech acquisition device, comprising:

the receiving unit is used for receiving a binding instruction triggered by touch operation in an interactive interface displayed by application program software;

the sending unit is used for responding to the binding instruction and sending sound waves prestored in the application program software so as to enable a voice collecting device to collect the sound waves and transmit the sound waves to a server, wherein the sound waves carry the identity information prestored in the application program software and corresponding to the user account number currently logged in by the application program software;

the storage unit is used for controlling the voice acquisition equipment to correspondingly store the acquired voice and the identity identification information under the condition that the server binds the identity identification information corresponding to the sound wave and the voice acquisition equipment;

wherein the apparatus further comprises:

the determining unit is used for determining a voice processing type corresponding to the identity identification information after the voice acquired by the voice acquisition equipment is correspondingly stored with the identity identification information;

and the processing unit is used for processing the voice corresponding to the identity identification information according to the voice processing mode indicated by the voice processing type.

6. The apparatus according to claim 5, wherein the storage unit is configured to, when the server binds the identification information corresponding to the sound wave and the voice acquisition device, control the voice acquisition device to store the acquired voice and the identification information in a corresponding manner, specifically:

the storage unit is specifically configured to control the voice acquisition device to correspondingly store the acquired voice and the identity information under the condition that the server binds the identity information corresponding to the sound wave and the voice acquisition device, where the identity information is identity information matched with the identity identification information corresponding to the sound wave and found by the server.

7. A storage medium comprising a stored program, wherein the program when executed performs the method of any of claims 1 to 4.

8. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 4 by means of the computer program.