CN112307161A - Method and apparatus for playing audio

Info

Publication number: CN112307161A
Authority: CN (China)
Prior art keywords: voice input, audio, frequency band, indication information, input indication
Prior art date
Legal status: Granted
Application number: CN202010120432.1A
Other languages: Chinese (zh)
Other versions: CN112307161B
Inventor: Not disclosed (不公告发明人)
Current Assignee: Beijing ByteDance Network Technology Co Ltd
Original Assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2020-02-26
Filing date: 2020-02-26
Publication date: 2021-02-02
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN202010120432.1A
Publication of CN112307161A
Application granted
Publication of CN112307161B
Legal status: Active

Classifications

    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3343 Query execution using phonetics
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiments of the application disclose a method and an apparatus for playing audio. One embodiment of the method comprises: acquiring audio to be played; acquiring voice input indication information of a target device, wherein the voice input indication information indicates whether the target device is allowed to receive voice input in a preset time period; determining a frequency band matching the voice input indication information; and playing the audio to be played in the determined frequency band. This implementation raises the audio output frequency band when human-computer interaction is expected, thereby reducing interference with the human voice.

Description

Method and apparatus for playing audio
Technical Field
Embodiments of the present application relate to the field of computer technologies, and in particular, to a method and an apparatus for playing audio.
Background
With the development of computer technology, intelligent voice interaction devices are becoming more and more widely used.
To reduce noise in the speech recognition process, a filter circuit is usually built into the smart device to filter out noise in frequency bands close to the human voice.
Disclosure of Invention
The embodiments of the application provide a method and an apparatus for playing audio.
In a first aspect, an embodiment of the present application provides a method for playing audio, the method including: acquiring audio to be played; acquiring voice input indication information of a target device, wherein the voice input indication information indicates whether the target device is allowed to receive voice input in a preset time period; determining a frequency band matching the voice input indication information; and playing the audio to be played in the determined frequency band.
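To make the four steps of the first aspect concrete, the following sketch walks through them end to end. It is an illustration only, not the patented implementation: every function is a hypothetical stub, and the band values (20 Hz to 20 kHz as the normal band, 4 kHz to 20 kHz as the raised band) are assumptions chosen for the example.

```python
# Minimal sketch of the four-step flow; all names and band values are
# illustrative assumptions, not taken from the patent text.

NORMAL_BAND = (20.0, 20_000.0)     # full audible range: no voice input expected
RAISED_BAND = (4_000.0, 20_000.0)  # above the typical human-voice band

def get_audio_to_play() -> str:
    return "prompt.wav"            # step 1: stub audio source

def get_voice_input_indication(device_id: str) -> bool:
    return True                    # step 2: stub; True = voice input expected soon

def determine_band(voice_input_expected: bool) -> tuple:
    # step 3: pick the frequency band matching the indication information
    return RAISED_BAND if voice_input_expected else NORMAL_BAND

def play(audio: str, band: tuple) -> None:
    # step 4: stub playback in the determined band
    print(f"playing {audio} in band {band[0]:.0f}-{band[1]:.0f} Hz")

play(get_audio_to_play(), determine_band(get_voice_input_indication("target-device")))
```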
In some embodiments, determining the frequency band matching the voice input indication information includes: in response to determining that the voice input indication information indicates that the target device is allowed to receive voice input in the preset time period, selecting a target number of frequency bands from a first preset correspondence table in order of frequency from high to low; and determining, from the target number of frequency bands, a frequency band matching the voice input indication information.
In some embodiments, determining the frequency band matching the voice input indication information includes: in response to determining that the voice input indication information indicates that the target device is not allowed to receive voice input in the preset time period, selecting, from a second preset correspondence table, a frequency band consistent with the frequency band of the audio to be played as the frequency band matching the voice input indication information.
In some embodiments, acquiring the voice input indication information of the target device includes: in response to receiving an instruction for playing audio, generating voice input indication information indicating that the target device is not allowed to receive voice input in the preset time period.
In some embodiments, playing the audio to be played in the determined frequency band includes: performing sound wave transformation on the audio to be played to generate a target audio, wherein the frequency band of the target audio is a subset of the determined frequency band; and playing the target audio.
In a second aspect, an embodiment of the present application provides an apparatus for playing audio, the apparatus including: a first acquisition unit configured to acquire audio to be played; a second acquisition unit configured to acquire voice input indication information of the target device, wherein the voice input indication information is used for indicating whether the target device is allowed to receive voice input in a preset time period; a determination unit configured to determine a frequency band matching the voice input indication information; and the playing unit is configured to play the audio to be played according to the determined frequency band.
In some embodiments, the determining unit includes: a selecting module configured to select, in response to determining that the voice input indication information indicates that the target device is allowed to receive voice input in the preset time period, a target number of frequency bands from a first preset correspondence table in order of frequency from high to low; and a determining module configured to determine, from the target number of frequency bands, a frequency band matching the voice input indication information.
In some embodiments, the determining unit is further configured to: in response to determining that the voice input indication information indicates that the target device is not allowed to receive voice input in the preset time period, select, from a second preset correspondence table, a frequency band consistent with the frequency band of the audio to be played as the frequency band matching the voice input indication information.
In some embodiments, the second acquisition unit is further configured to: in response to receiving an instruction for playing audio, generate voice input indication information indicating that the target device is not allowed to receive voice input in the preset time period.
In some embodiments, the playing unit includes: the generating module is configured to perform sound wave transformation on the audio to be played to generate target audio, wherein the frequency band of the target audio is a subset of the determined frequency band; a playing module configured to play the target audio.
In a third aspect, an embodiment of the present application provides a terminal, including: one or more processors; and a storage device having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable medium on which a computer program is stored, which when executed by a processor, implements the method as described in any of the implementations of the first aspect.
According to the method and apparatus for playing audio provided by the embodiments of the application, the audio to be played is first acquired; voice input indication information of the target device is then acquired, the information indicating whether the target device is allowed to receive voice input in a preset time period; a frequency band matching the voice input indication information is then determined; and finally the audio to be played is played in the determined frequency band. The audio output frequency band is thus raised when human-computer interaction is expected, reducing interference with the human voice. Moreover, when no human-computer interaction is expected, the audio is played normally, so that the user's listening experience is affected as little as possible.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for playing audio according to the present application;
FIG. 3 is a schematic diagram of one application scenario of a method for playing audio according to an embodiment of the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for playing audio according to the present application;
FIG. 5 is a schematic block diagram illustrating one embodiment of an apparatus for playing audio according to the present application;
FIG. 6 is a schematic block diagram of an electronic device suitable for use in implementing embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary architecture 100 to which the method for playing audio or the apparatus for playing audio of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include a terminal device 101, a network 102, and a server 103. The network 102 is the medium used to provide a communication link between the terminal device 101 and the server 103, and may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The terminal apparatus 101 interacts with the server 103 through the network 102 to receive or transmit messages and the like. The terminal device 101 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, a voice interaction application, and the like.
The terminal apparatus 101 may be hardware or software. When the terminal device 101 is hardware, it may be various electronic devices having a speaker and supporting human-computer interaction, including but not limited to a smart phone, a tablet computer, a smart speaker, a laptop portable computer, a desktop computer, and the like. When the terminal apparatus 101 is software, it can be installed in the electronic apparatuses listed above. It may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.
The server 103 may be a server providing various services, such as a background server supporting the playing of audio on the terminal device 101. The background server may analyze and process the received voice input indication information, generate a processing result (for example, a frequency band matching the voice input indication information), and may feed the generated processing result back to the terminal device.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the terminal device 101 may also analyze the voice input indication information and generate the processing result directly; in that case, the network 102 and the server 103 may be absent. The method for playing audio provided by the embodiments of the present application is generally executed by the terminal device 101, and accordingly the apparatus for playing audio is generally disposed in the terminal device 101. Alternatively, the method may be executed by the server 103, in which case the apparatus may be disposed in the server 103.
it should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for playing audio in accordance with the present application is shown. The method for playing audio includes the steps of:
step 201, acquiring an audio to be played.
In this embodiment, the execution body of the method for playing audio (such as the terminal device 101 shown in fig. 1) may acquire the audio to be played through a wired or wireless connection. As an example, the execution body may acquire pre-stored audio to be played locally. As another example, the execution body may acquire the audio to be played from a communicatively connected electronic device. The audio to be played may include any playable audio. As an example, it may include an audio file, such as song audio. As another example, it may include audio generated by speech synthesis, such as an "I'm listening" speech file.
Step 202, acquiring voice input indication information of the target device.
In this embodiment, the execution body may acquire the voice input indication information of the target device through a wired or wireless connection. The voice input indication information indicates whether the target device is allowed to receive voice input in a preset time period. Voice input generally refers to the execution body receiving a user's speech for voice interaction. In practice, whether voice input is allowed may be embodied as whether a microphone is turned on, or as whether collected speech is transmitted to a processor in the execution body, which is not limited herein. The preset time period is generally a time period counted from now or from a later moment (e.g., after 1 minute). Optionally, the preset time period may also be related to the time at which the audio is played; for example, if the audio is played after 2 s, the preset time period may be the period from 2 s to 1 min. The target device may be any device specified in advance according to actual application requirements, or a device determined according to rules, for example, a device that is controlled to play the audio. When the execution body is a terminal, the target device may be the execution body itself.
As an example, in response to receiving an instruction for switching to a voice interaction mode, the execution body may generate voice input indication information indicating that the target device is allowed to receive voice input in the preset time period. The instruction for switching to the voice interaction mode may include a preset wake-up word.
In this embodiment, the execution body may acquire the voice input indication information from the target device. As an example, the voice input indication information may be preset, for example, indication information indicating by default that the target device is not allowed to receive voice input in the preset time period. As another example, the indication of whether the target device is allowed to receive voice input in the preset time period may be switched at preset times.
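For illustration, voice input indication information might be generated and checked as in the following sketch. The dictionary layout, the wake-up word, and the 60-second window are assumptions made for the example; the wake-up word matches the fig. 3 scenario described later.

```python
import time

WAKE_WORD = "small A"  # the wake-up word used in the fig. 3 scenario below

def make_indication(utterance: str, window_seconds: float = 60.0) -> dict:
    """Generate voice input indication information: whether the target device may
    receive voice input, and for which wall-clock window (an assumed layout)."""
    now = time.time()
    return {
        "allowed": utterance.strip() == WAKE_WORD,  # wake-up word enables voice input
        "window": (now, now + window_seconds),      # the preset time period
    }

def allows_voice_input(indication: dict) -> bool:
    start, end = indication["window"]
    return indication["allowed"] and start <= time.time() <= end

print(allows_voice_input(make_indication("small A")))      # True within the window
print(allows_voice_input(make_indication("play a song")))  # False
```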
Step 203, determining a frequency band matched with the voice input indication information.
In this embodiment, the execution body may determine the frequency band matching the voice input indication information acquired in step 202 in various ways. As an example, in response to determining that the acquired voice input indication information indicates that the target device is not allowed to receive voice input in the preset time period, the execution body may determine the frequency band of the audio to be played as the frequency band matching the voice input indication information.
As another example, in response to determining that the acquired voice input indication information indicates that the target device is allowed to receive voice input in the preset time period, the execution body may determine the matching frequency band as follows. First, the frequency band of the audio to be played is determined. Then, the intersection between that frequency band and a preset frequency band is determined, where the preset frequency band may be the band in which a person speaks in a speech recognition scenario. Then, in response to determining that the intersection satisfies a preset playing condition, the execution body may determine the frequency band of the audio to be played as the frequency band matching the voice input indication information; in response to determining that the intersection does not satisfy the preset playing condition, the execution body may determine, as the matching frequency band, a band higher than the frequency band of the audio to be played. The preset playing condition may include, but is not limited to, at least one of the following: the audio duration falling within the intersection is less than a preset duration threshold; the minimum frequency of the intersection is greater than a preset frequency threshold.
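The intersection test above can be sketched as follows. The human-voice band of 85-3400 Hz, both threshold values, and the assumption that the overlap duration is estimated elsewhere (e.g., from a short-time spectral analysis) are illustrative choices, not values fixed by this embodiment.

```python
VOICE_BAND = (85.0, 3_400.0)  # assumed speaking band for a speech recognition scenario

def band_intersection(a: tuple, b: tuple):
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None

def matching_band(audio_band: tuple, overlap_duration_s: float,
                  duration_threshold_s: float = 1.0,
                  min_freq_threshold_hz: float = 3_000.0) -> tuple:
    """Pick the band to play in when voice input is expected.

    overlap_duration_s is assumed to be estimated elsewhere, e.g. via an STFT:
    how long the audio's energy actually sits inside the intersection band.
    """
    inter = band_intersection(audio_band, VOICE_BAND)
    # Preset playing condition: the overlap is brief, or it starts above the
    # frequency threshold -> the audio may be played in its own band.
    if (inter is None
            or overlap_duration_s < duration_threshold_s
            or inter[0] > min_freq_threshold_hz):
        return audio_band
    # Otherwise answer with a band higher than the audio's current band.
    width = audio_band[1] - audio_band[0]
    return (audio_band[1], audio_band[1] + width)

print(matching_band((100.0, 3_000.0), overlap_duration_s=5.0))  # raised band
print(matching_band((100.0, 3_000.0), overlap_duration_s=0.2))  # unchanged
```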
In some optional implementations of this embodiment, in response to determining that the voice input indication information indicates that the target device is not allowed to receive voice input in the preset time period, the execution body may further select, from the second preset correspondence table, a frequency band consistent with the frequency band of the audio to be played. The second preset correspondence table may represent a correspondence between frequency bands and voice input indication information. For example, indication information indicating that the target device is not allowed to receive voice input in the preset time period corresponds to a first frequency band, and indication information indicating that it is allowed corresponds to a second frequency band, where the second frequency band may be a subset of the first.
And step 204, playing the audio to be played according to the determined frequency band.
In this embodiment, the execution body may play the audio to be played in the determined frequency band in various ways. As an example, in response to determining that the determined frequency band is consistent with the frequency band of the audio to be played, the execution body may play the audio to be played directly. As another example, the execution body may first transform the audio to be played into the determined frequency band and then play it.
In some optional implementations of this embodiment, building on the optional implementation above, the execution body may play the audio to be played through the following steps (a sketch follows them):
First, sound wave transformation is performed on the audio to be played to generate a target audio.
In these implementations, the execution body may perform sound wave transformation on the audio to be played in various ways to generate the target audio, whose frequency band is usually a subset of the determined frequency band. The sound wave transformation maps the frequency band of the audio to be played into the determined frequency band, and may include an amplification operation on each frame frequency of the audio, which may include, but is not limited to, at least one of: multiplying by a preset value greater than 1, adding a preset value greater than 0, and so on.
Second, the target audio is played.
In these implementations, the execution body plays the target audio generated in the first step.
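A toy frequency-domain version of this sound wave transformation is sketched below: each spectral component is moved upward by a constant factor, i.e., multiplied by a preset value greater than 1. This is a simplified illustration that ignores the phase coherence and windowing a production pitch shifter would need; numpy and the factor of 2 are assumptions.

```python
import numpy as np

def shift_band(samples: np.ndarray, factor: float) -> np.ndarray:
    """Map each frequency component upward by `factor` (> 1 raises the band)."""
    spectrum = np.fft.rfft(samples)
    shifted = np.zeros_like(spectrum)
    for src in range(len(spectrum)):
        dst = int(src * factor)  # frame frequency multiplied by a preset value > 1
        if dst < len(spectrum):
            shifted[dst] += spectrum[src]
    return np.fft.irfft(shifted, n=len(samples))

# Example: a 440 Hz tone comes out at 880 Hz.
sr = 16_000
t = np.arange(sr) / sr
target_audio = shift_band(np.sin(2 * np.pi * 440.0 * t), factor=2.0)
```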
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for playing audio according to an embodiment of the present application. In the application scenario of fig. 3, the user 301 says the wake-up word "small A" 303 to the terminal device 302. In response to the wake-up word 303, the terminal device 302 locally acquires the audio to be played, "please talk …, I am listening" 304. Based on the received wake-up word 303, the terminal device 302 acquires voice input indication information 305 indicating that the terminal device is allowed to receive voice input in a preset time period. The terminal device 302 may then determine that the frequency band matching the voice input indication information 305 is a band 306 higher than that of the audio to be played 304. Finally, the terminal device 302 may play the audio "please talk …, I am listening" 307 in the determined higher frequency band 306.
At present, one prior-art approach builds a filter circuit into the intelligent device to filter out noise close to the human voice band, which increases hardware manufacturing and maintenance costs. The method provided by the embodiments of the application instead determines the frequency band for playing audio according to the voice input indication information, so that the audio output band is raised when human-computer interaction is expected, reducing interference with the human voice. Moreover, when no human-computer interaction is expected, the audio is played normally, so that the user's listening experience is affected as little as possible.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for playing audio is shown. The process 400 of the method for playing audio includes the following steps:
step 401, obtaining an audio to be played.
Step 402, acquiring voice input indication information of a target device.
In some optional implementations of this embodiment, in response to receiving an instruction for playing audio, the execution body of the method for playing audio (e.g., the terminal device 101 shown in fig. 1) may generate voice input indication information indicating that the target device is not allowed to receive voice input in the preset time period. The instruction for playing audio may be, for example, "play a song" or "play a funny video" received during voice interaction.
It should be noted that, since most video files have corresponding audio integrated therein, the instruction for playing audio may also include an instruction for playing a video file with audio.
Step 403, in response to determining that the voice input indication information indicates that the target device is allowed to receive voice input in a preset time period, selecting a target number of frequency bands from the first preset correspondence table in order of frequency from high to low.
In this embodiment, in response to determining that the voice input indication information indicates that the target device is allowed to receive voice input in the preset time period, the execution body may select the target number of frequency bands from the first preset correspondence table in order of frequency from high to low. The first preset correspondence table may represent a correspondence between frequency bands and voice input indication information, and may be the same as or different from the second preset correspondence table, which is not limited herein. As an example, the frequency range audible to the human ear is 20 Hz to 20 kHz, and the first preset correspondence table may divide this range into a preset number of frequency bands. Optionally, to enrich the frequency range of the played audio, the frequency bands may intersect. The target number may be any value specified in advance (e.g., 2 or 3), or a value determined according to actual application requirements (e.g., 50% of the preset number).
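The table construction and the high-to-low selection described above can be sketched as follows; the band count of 8, the 10% overlap between neighbouring bands, and the target number of 3 are illustrative assumptions.

```python
def build_band_table(low_hz: float = 20.0, high_hz: float = 20_000.0,
                     count: int = 8, overlap: float = 0.1) -> list:
    """Divide the audible range into `count` bands; neighbouring bands
    intersect by `overlap` of a step, to enrich the playable range."""
    step = (high_hz - low_hz) / count
    table = []
    for i in range(count):
        lo = low_hz + i * step
        table.append((lo, min(lo + step * (1.0 + overlap), high_hz)))
    return table

def select_highest(table: list, target_number: int) -> list:
    # Select the target number of bands in order of frequency from high to low.
    return sorted(table, key=lambda band: band[0], reverse=True)[:target_number]

candidates = select_highest(build_band_table(), target_number=3)
print(candidates)  # the three highest bands of the table
```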
In step 404, a frequency band matching the voice input indication information is determined from the target number of frequency bands.
In this embodiment, the execution body may determine, by various methods, a frequency band matching the voice input indication information from the target number of frequency bands selected in step 403. As an example, the execution body may randomly select one of the target number of frequency bands as the matching band. As another example, the execution body may first determine a statistic of the frequency band of the audio to be played, where the statistic may include, but is not limited to, at least one of: maximum, minimum, mean, variance, median. The execution body may then select, from the target number of frequency bands, the band whose statistic is closest to that of the audio to be played as the frequency band matching the voice input indication information.
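As a sketch of the statistic-based choice, the example below uses the energy-weighted mean frequency of the audio's spectrum as the statistic and picks the candidate band whose centre is closest to it. Both the choice of statistic and the FFT-based estimate are assumptions made for illustration.

```python
import numpy as np

def mean_frequency(samples: np.ndarray, sample_rate: int) -> float:
    """Energy-weighted mean frequency of the audio, one possible band statistic."""
    magnitude = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

def closest_band(candidates: list, samples: np.ndarray, sample_rate: int) -> tuple:
    target = mean_frequency(samples, sample_rate)
    # Band whose centre is closest to the audio's statistic.
    return min(candidates, key=lambda band: abs((band[0] + band[1]) / 2.0 - target))

sr = 16_000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440.0 * t)
print(closest_band([(20.0, 2_000.0), (2_000.0, 8_000.0)], audio, sr))  # (20.0, 2000.0)
```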
In some optional implementations of this embodiment, in response to determining that the voice input indication information indicates that the target device is not allowed to receive voice input in the preset time period, the execution body may further select, from the second preset correspondence table, a frequency band consistent with the frequency band of the audio to be played as the frequency band matching the voice input indication information. The second preset correspondence table may represent a correspondence between frequency bands and voice input indication information. "Consistent" here may mean identical, or that the proportion of the overlapping portion of the two bands exceeds a preset threshold, which is not limited herein.
And 405, playing the audio to be played according to the determined frequency band.
Steps 401, 402, and 405 correspond to steps 201, 202, and 204 and their optional implementations in the foregoing embodiment; the descriptions of steps 201, 202, and 204 above also apply to steps 401, 402, and 405 and are not repeated here.
As can be seen from fig. 4, the flow 400 of the method for playing audio in this embodiment refines the step of determining the frequency band matching the voice input indication information. The scheme described in this embodiment can therefore set an appropriate output frequency band when human-computer interaction is expected, reducing interference with the user's voice input while keeping the discomfort that raising the audio frequency causes to the listening experience to a minimum.
With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present application provides an embodiment of an apparatus for playing audio, which corresponds to the embodiment of the method shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 5, the apparatus 500 for playing audio provided by the present embodiment includes a first acquiring unit 501, a second acquiring unit 502, a determining unit 503, and a playing unit 504. The first obtaining unit 501 is configured to obtain audio to be played; a second obtaining unit 502 configured to obtain voice input indication information of the target device, wherein the voice input indication information is used for indicating whether the target device is allowed to receive voice input in a preset time period; a determination unit 503 configured to determine a frequency band matching the voice input indication information; and a playing unit 504 configured to play the audio to be played according to the determined frequency band.
In the present embodiment, in the apparatus 500 for playing audio: the specific processing of the first obtaining unit 501, the second obtaining unit 502, the determining unit 503 and the playing unit 504 and the technical effects thereof can refer to the related descriptions of step 201, step 202, step 203 and step 204 in the corresponding embodiment of fig. 2, which are not described herein again.
In some optional implementations of the present embodiment, the determining unit 503 may include a selecting module (not shown in the figure) and a determining module (not shown in the figure). The selecting module may be configured to select, in response to determining that the voice input indication information indicates that the target device is allowed to receive the voice input in the preset time period, the target number of frequency bands from the first preset correspondence table in an order from high to low. The determining module may be configured to determine a frequency band matching the voice input indication information from the target number of frequency bands.
In some optional implementations of this embodiment, the determining unit 503 may be further configured to: in response to determining that the voice input indication information indicates that the target device is not allowed to receive voice input in the preset time period, select, from the second preset correspondence table, a frequency band consistent with the frequency band of the audio to be played as the frequency band matching the voice input indication information.
In some optional implementations of the present embodiment, the second obtaining unit 502 may be further configured to: in response to receiving an instruction characterizing playing of audio, generating voice input indication information for indicating that the target device is not allowed to receive voice input for a preset time period.
In some optional implementations of this embodiment, the playing unit 504 may include: a generating module (not shown in the figure) and a playing module (not shown in the figure). The generating module may be configured to perform sound wave transformation on the audio to be played to generate the target audio. Wherein the frequency bands of the target audio may be a subset of the determined frequency bands. The playing module may be configured to play the target audio.
The apparatus provided by the foregoing embodiment of the present application first acquires the audio to be played through the first acquiring unit 501. The second acquiring unit 502 then acquires the voice input indication information of the target device, the information indicating whether the target device is allowed to receive voice input in a preset time period. The determining unit 503 then determines a frequency band matching the voice input indication information, and the playing unit 504 plays the audio to be played in the determined frequency band. The audio output frequency band is thus raised when human-computer interaction is expected, reducing interference with the human voice. Moreover, when no human-computer interaction is expected, the audio is played normally, so that the user's listening experience is affected as little as possible.
Referring now to fig. 6, shown is a schematic diagram of an electronic device (e.g., the terminal device in fig. 1) 600 suitable for implementing embodiments of the present application. The terminal device in the embodiments of the present application may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a smart speaker, and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the electronic device 600 may include a processing apparatus (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 602 or a program loaded from a storage apparatus 608 into a Random Access Memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The processing apparatus 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; a storage device 608 including, for example, a flash memory; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present application.
It should be noted that the computer readable medium described in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (Radio Frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be included in the terminal device; or may exist separately without being assembled into the terminal device. The computer readable medium carries one or more programs which, when executed by the terminal device, cause the terminal device to: acquiring audio to be played; acquiring voice input indication information of target equipment, wherein the voice input indication information is used for indicating whether the target equipment allows receiving voice input in a preset time period; determining a frequency band matched with the voice input indication information; and playing the audio to be played according to the determined frequency band.
Computer program code for carrying out operations of embodiments of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor comprises a first acquisition unit, a second acquisition unit, a determination unit and a playing unit. Where the names of the units do not in some cases constitute a limitation on the unit itself, for example, the first acquisition unit may also be described as a "unit that acquires audio to be played".
The above description is only a preferred embodiment of the application and an illustration of the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present application is not limited to the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, a solution in which the above features are replaced by technical features of similar function disclosed in the embodiments of the present application.

Claims (12)

1. A method for playing audio, comprising:
acquiring audio to be played;
acquiring voice input indication information of a target device, wherein the voice input indication information is used for indicating whether the target device is allowed to receive voice input in a preset time period;
determining a frequency band matched with the voice input indication information;
and playing the audio to be played according to the determined frequency band.
2. The method of claim 1, wherein the determining a frequency band matching the voice input indication information comprises:
in response to determining that the voice input indication information is used for indicating that the target device is allowed to receive voice input in a preset time period, selecting a target number of frequency bands from a first preset correspondence table in order of the frequency bands from high to low;
and determining, from the target number of frequency bands, a frequency band matching the voice input indication information.
3. The method of claim 1 or 2, wherein the determining a frequency band matching the voice input indication information comprises:
and in response to determining that the voice input indication information is used for indicating that the target device is not allowed to receive voice input in a preset time period, selecting, from a second preset correspondence table, a frequency band consistent with the frequency band corresponding to the audio to be played as the frequency band matching the voice input indication information.
4. The method of claim 3, wherein the obtaining of the voice input indication information of the target device comprises:
in response to receiving an instruction for playing audio, generating voice input indication information for indicating that the target device is not allowed to receive voice input for a preset time period.
5. The method of claim 2, wherein the playing the audio to be played according to the determined frequency band comprises:
performing sound wave transformation on the audio to be played to generate a target audio, wherein the frequency band of the target audio is a subset of the determined frequency band;
and playing the target audio.
6. An apparatus for playing audio, comprising:
a first acquisition unit configured to acquire audio to be played;
a second acquisition unit configured to acquire voice input indication information of a target device, wherein the voice input indication information is used for indicating whether the target device is allowed to receive voice input in a preset time period;
a determination unit configured to determine a frequency band matching the voice input indication information;
and the playing unit is configured to play the audio to be played according to the determined frequency band.
7. The apparatus of claim 6, wherein the determining unit comprises:
a selecting module configured to select, in response to determining that the voice input indication information is used for indicating that the target device is allowed to receive voice input in a preset time period, a target number of frequency bands from a first preset correspondence table in order of the frequency bands from high to low;
a determining module configured to determine a frequency band matching the voice input indication information from the target number of frequency bands.
8. The apparatus of claim 6 or 7, wherein the determining unit is further configured to:
in response to determining that the voice input indication information is used for indicating that the target device is not allowed to receive voice input in a preset time period, select, from a second preset correspondence table, a frequency band consistent with the frequency band corresponding to the audio to be played as the frequency band matching the voice input indication information.
9. The apparatus of claim 8, wherein the second obtaining unit is further configured to:
in response to receiving an instruction for playing audio, generating voice input indication information for indicating that the target device is not allowed to receive voice input for a preset time period.
10. The apparatus of claim 7, wherein the play unit comprises:
a generating module configured to perform sound wave transformation on the audio to be played to generate a target audio, wherein a frequency band of the target audio is a subset of the determined frequency band;
a playback module configured to play the target audio.
11. A terminal, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-5.
Priority Applications (1)

Application Number: CN202010120432.1A (granted as CN112307161B)
Priority Date / Filing Date: 2020-02-26
Title: Method and apparatus for playing audio

Publications (2)

CN112307161A: published 2021-02-02
CN112307161B: published 2022-11-22

Family

ID: 74336686

Family Applications (1)

CN202010120432.1A (Active, granted as CN112307161B): priority/filing date 2020-02-26, Method and apparatus for playing audio

Country Status (1)

CN: CN112307161B

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010156738A (en) * 2008-12-26 2010-07-15 Pioneer Electronic Corp Sound volume adjusting device, sound volume adjustment method, sound volume adjustment program, and recording medium storing the sound volume adjustment program
US9087520B1 (en) * 2012-12-13 2015-07-21 Rawles Llc Altering audio based on non-speech commands
US9830924B1 (en) * 2013-12-04 2017-11-28 Amazon Technologies, Inc. Matching output volume to a command volume
US20180091913A1 (en) * 2016-09-27 2018-03-29 Sonos, Inc. Audio Playback Settings for Voice Interaction
CN108091330A (en) * 2017-12-13 2018-05-29 北京小米移动软件有限公司 Output sound intensity adjusting method, device, electronic equipment and storage medium
CN108369805A (en) * 2017-12-27 2018-08-03 深圳前海达闼云端智能科技有限公司 Voice interaction method and device and intelligent terminal
CN108307022A (en) * 2018-01-23 2018-07-20 青岛海信移动通信技术股份有限公司 Method for controlling volume and device
US20190237070A1 (en) * 2018-01-31 2019-08-01 Beijing Baidu Netcom Science And Technology Co., Ltd. Voice interaction method, device, apparatus and server
CN109671429A (en) * 2018-12-02 2019-04-23 腾讯科技(深圳)有限公司 Voice interactive method and equipment
CN110211599A (en) * 2019-06-03 2019-09-06 Oppo广东移动通信有限公司 Using awakening method, device, storage medium and electronic equipment

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113793621A (en) * 2021-09-22 2021-12-14 Oppo广东移动通信有限公司 Audio playing method and device, electronic equipment and computer readable storage medium

Also Published As

CN112307161B: published 2022-11-22


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant