CN110719545B - Audio playing device and method for playing audio - Google Patents


Info

Publication number
CN110719545B
CN110719545B CN201910864212.7A
Authority
CN
China
Prior art keywords
information
sound information
audio
volume
playing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910864212.7A
Other languages
Chinese (zh)
Other versions
CN110719545A (en)
Inventor
梁文昭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lianshang Xinchang Network Technology Co Ltd
Original Assignee
Lianshang Xinchang Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lianshang Xinchang Network Technology Co Ltd filed Critical Lianshang Xinchang Network Technology Co Ltd
Priority to CN201910864212.7A priority Critical patent/CN110719545B/en
Publication of CN110719545A publication Critical patent/CN110719545A/en
Application granted granted Critical
Publication of CN110719545B publication Critical patent/CN110719545B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04R — LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 — Circuits for transducers, loudspeakers or microphones
    • H04R2430/00 — Signal processing covered by H04R, not provided for in its groups
    • H04R2430/01 — Aspects of volume control, not necessarily automatic, in sound systems

Abstract

The purpose of this application is to provide an audio playing device and a method for playing audio. The audio playing device plays the current audio based on a first output level and collects environmental sound information; in response to detecting a control trigger event with respect to the ambient sound information, the audio playing device plays the current audio based on a second output level or pauses playing the current audio. The method and device help a user react in time when people nearby speak to him or her, improve team communication efficiency, and let the user join discussions of interest promptly.

Description

Audio playing device and method for playing audio
Technical Field
The present application relates to the field of communications, and more particularly, to a technique for playing audio.
Background
With the improvement of living standards, people have ever higher expectations of music playing equipment; for example, people can obtain an immersive listening experience by wearing listening devices such as earphones and earplugs, and in some cases people also wear earphones and earplugs to isolate outside noise. In addition, listening device manufacturers improve the ambient sound blocking performance (passive noise reduction) of listening devices as much as possible, and even add an active noise reduction function for ambient sound, so as to improve the user experience.
Disclosure of Invention
An object of the present application is to provide an audio playback apparatus and a method for playing back audio.
According to one aspect of the present application, a method for playing audio is provided, which is applied to an audio playing device. Wherein, the method comprises the following steps:
playing the current audio based on the first output level;
collecting environmental sound information; and
in response to detecting a control-triggering event with respect to the ambient sound information, playing the current audio based on a second output level, wherein the volume of playing the current audio based on the second output level is different from the volume of playing the current audio based on the first output level; or alternatively,
in response to detecting a control-triggering event with respect to the ambient sound information, pausing playing the current audio.
Accordingly, according to another aspect of the present application, there is provided an audio playback apparatus, wherein the apparatus comprises:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform the operations of the above-described method.
The present application also provides a computer-readable medium storing instructions that, when executed by a computer, cause the computer to perform the operations of the above-described method.
According to another aspect of the present application, the present application further provides an audio playing device. Wherein, this equipment includes:
a first module for playing a current audio based on a first output level;
the second module is used for collecting environmental sound information; and
a third module for playing the current audio based on a second output level in response to detecting a control trigger event with respect to the ambient sound information, wherein a volume of playing the current audio based on the second output level is different from a volume of playing the current audio based on the first output level; or alternatively,
a third module for pausing playback of the current audio in response to detecting a control trigger event with respect to the ambient sound information.
As the ability of earphones/earplugs and the like to block outside sound improves, a user enjoying music may fail to hear people nearby speaking to him or her, or miss the chance to join in when friends nearby discuss a topic of interest, which hurts the experience and takes away much of the fun. In view of this, the present application provides an audio playing device and a method for playing audio, which collect environmental sounds, detect a specific control trigger event based on the environmental sounds, and, when the control trigger event is detected, change (e.g., reduce) the playing volume of the current audio or pause playing the current audio, so that the user can respond when someone nearby speaks, team communication efficiency is improved, and the user can join a discussion of interest in time.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1a and FIG. 1b are flow charts of a method for playing audio according to an embodiment of the present application;
FIGS. 2a to 2c respectively show an implementation scenario of an embodiment of the present application;
fig. 3a and fig. 3b respectively show functional modules of an audio playing device in an embodiment of the present application;
FIG. 4 illustrates functional modules of an exemplary system for use in various embodiments of the present application.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present application is described in further detail below with reference to the attached figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (e.g., central processing units (CPUs)), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory, in a computer-readable medium. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PCM), programmable random access memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store information accessible by a computing device.
The devices referred to in this application include, but are not limited to, user equipment, network devices, or devices formed by integrating user equipment and network devices through a network. The user equipment includes, but is not limited to, any mobile electronic product capable of human-computer interaction with a user (e.g., through a touch panel), such as a smartphone or a tablet computer; the mobile electronic product may employ any operating system, such as the Android or iOS operating system. The network device includes an electronic device capable of automatically performing numerical calculation and information processing according to preset or stored instructions, whose hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like. The network device includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud of multiple servers; here, the cloud is composed of a large number of computers or network servers based on cloud computing, a kind of distributed computing in which one virtual supercomputer consists of a collection of loosely coupled computers. The network includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks, VPNs, wireless ad hoc networks, and the like. Preferably, the device may also be a program running on the user equipment, the network device, or a device formed by integrating the user equipment and the network device, the touch terminal, or the network device and the touch terminal through a network.
Of course, those skilled in the art will understand that the above-described apparatus is merely exemplary, and that other existing or future existing apparatus, as may be suitable for use in the present application, are intended to be encompassed within the scope of the present application and are hereby incorporated by reference.
In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
With the improvement of the blocking performance of the earphones/earplugs and the like to the outside sound, the user is likely to not hear the surrounding people speaking with the user when enjoying music or to not participate in discussion when the surrounding friends talk about the topics of interest, so the experience is not good and much fun is reduced. In view of the above, the present application provides an audio playing device and a method for playing audio, which will be described in detail below with reference to the accompanying drawings.
The following describes specific embodiments of the present application based on an audio playing device. In some embodiments the audio playing device is a mobile phone, a tablet computer, a personal computer, or another electronic product; in some embodiments it includes a main body portion for performing data processing, a sound collecting unit (e.g., a portion including a microphone and corresponding peripheral circuitry) for collecting ambient sound, and optionally an audio output unit (e.g., a headset, earplugs, a speaker, etc.). In some embodiments the sound collecting unit is attached to the audio output unit; for example, the sound collecting unit comprises one or several microphones arranged on the audio output unit. It will be understood by those skilled in the art that existing and future audio playing devices, as may be applicable to the present application, are included within the scope of the present application and are incorporated herein by reference. For example, without limitation, the audio output unit and/or the sound collecting unit may be disposed inside the main body, may be connected to the main body by a cable, or may communicate with the main body based on a communication protocol such as Bluetooth or Wi-Fi.
According to one aspect of the present application, a method for playing audio is provided, which is applied to an audio playing device. Wherein the method comprises step S100, step S200 and step S310, refer to fig. 1a; or the method comprises step S100, step S200 and step S320, see fig. 1b. In some embodiments, the method is implemented based on the scenario shown in fig. 2a, in which the audio playing device 10 outputs audio through a pair of headphones, and captures ambient sounds at the position of the user through a microphone (for example, the first microphone 201 in fig. 2 a).
Specifically, in step S100, the audio playing device plays the current audio (e.g., a piece of music) based on the first output level. For example, the first output level corresponds to an output volume of the audio output unit, e.g., 60dB.
In step S200, the audio playback apparatus collects ambient sound information through a microphone. In some embodiments, the microphone is built into the audio playback device, and in other embodiments, the microphone is coupled to the audio playback device in a wired or wireless manner. In particular, in some embodiments, the microphone is attached to an external audio output unit (e.g., the headset), for example, the microphone is fixed to the headset. It will be understood by those skilled in the art that the above-described microphone arrangements are exemplary only and not intended to limit the scope of the present application, and that other existing or future arrangements of microphones, such as may be suitable for use in the present application, are also included within the scope of the present application and are hereby incorporated by reference.
In step S310, the audio playing device plays the current audio based on the second output level in response to detecting the control trigger event related to the environmental sound information. Where the volume at which the current audio is played based on the second output level is different from the volume at which the current audio is played based on the first output level, in some exemplary embodiments the audio output volume corresponding to the second output level is lower than the audio output volume corresponding to the first output level, for example, the output volume corresponding to the second output level is 30dB (or other smaller value). In particular, in some embodiments, the output volume of the audio output unit is reduced to 0dB based on the second output level.
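The relationship between the example output volumes (e.g., 60 dB for the first output level, 30 dB for the second) and a linear sample gain can be pictured with a small sketch. The patent only specifies example volumes; the helper names and the dB-to-gain mapping below are assumptions for illustration.

```python
def level_to_gain(target_db: float, reference_db: float) -> float:
    """Linear amplitude gain that lowers playback from reference_db to target_db.

    Hypothetical helper: uses the standard 20*log10 amplitude convention,
    an assumption not stated in the patent.
    """
    return 10.0 ** ((target_db - reference_db) / 20.0)


def apply_second_output_level(samples, target_db=30.0, reference_db=60.0):
    """Scale raw audio samples so playback drops to the second output level."""
    g = level_to_gain(target_db, reference_db)
    return [s * g for s in samples]
```

For instance, ducking from 60 dB to 40 dB corresponds to multiplying each sample by 0.1 under this convention.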
In step S320, the audio playing device pauses playing the current audio in response to detecting the control trigger event related to the environmental sound information.
In order to facilitate the user to resume listening to audio after the conversation or discussion is finished, thereby improving the user experience, in some embodiments, the method further includes step S400 (not shown). In step S400, the audio playing device plays the current audio based on the first output level in response to detecting a recovery triggering event. For the case that the audio playing device plays the current audio at the second output level after detecting the control trigger event, the audio playing device resumes playing the current audio at the first output level, for example, the output volume of the audio output unit after adjusting the output level is restored from 30dB (or other smaller value) to 60dB. For the case that the audio playing device pauses playing the current audio after detecting the control trigger event, the audio playing device continues playing the current audio at the first output level from the playing breakpoint position of the current audio or the vicinity of the breakpoint position, for example, the audio output unit outputs the current audio at an output volume of 60dB.
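The control flow of steps S310/S320 and the recovery step S400 can be sketched as a small state machine. The class and attribute names are hypothetical; `pause_mode` selects the S320 (pause) variant over the S310 (duck) variant, and the breakpoint position stands in for "the playing breakpoint position or its vicinity."

```python
class PlaybackController:
    """Sketch of the S100/S310/S320/S400 control flow (hypothetical API)."""

    def __init__(self, first_level=60, second_level=30, pause_mode=False):
        self.first_level = first_level    # first output level (e.g. 60 dB)
        self.second_level = second_level  # second output level (e.g. 30 dB)
        self.pause_mode = pause_mode      # True -> S320 variant, False -> S310
        self.level = first_level          # S100: play at the first output level
        self.paused = False
        self.breakpoint_ms = None         # playback position saved on pause

    def on_control_trigger(self, position_ms):
        if self.pause_mode:               # S320: pause and remember position
            self.paused = True
            self.breakpoint_ms = position_ms
        else:                             # S310: switch to the second level
            self.level = self.second_level

    def on_recovery_trigger(self):        # S400: restore the first output level
        self.level = self.first_level
        if self.paused:
            self.paused = False
            resume_at = self.breakpoint_ms
            self.breakpoint_ms = None
            return resume_at              # resume at (or near) the breakpoint
        return None
```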
Wherein to facilitate user action, in some embodiments the recovery triggering event comprises any one of:
a user performs a level restoration operation, such as pressing a physical/virtual key on the audio playing device, issuing a voice command, or making a somatosensory input (e.g., the audio playing device detects a gesture or shaking motion of the user through a camera; or detects the user shaking or flipping the device based on a built-in gyroscope; or detects the operation through a peripheral device, such as the audio output unit detecting a somatosensory operation like shaking, or the user's operation of a physical/virtual key, and sending the detection result to the main body of the audio playing device);
the user does not answer within a preset time period, for example, the user does not speak for a preset time period (e.g., 10 seconds), or does not operate any physical/virtual key or the like, or the user does not perform any somatosensory operation.
In some embodiments, in step S310, the audio playing device provides level adjustment prompt information in response to detecting the control trigger event related to the environmental sound information, and plays the current audio based on the second output level. For example, in some embodiments the level adjustment prompt information is provided by sound (including voice, a preset ring tone, etc.), text push, and the like, to draw the user's attention to the conversation or discussion content, so that the user can focus on the conversation or discussion more quickly, thereby improving the user experience.
In some embodiments, the control trigger event described above includes at least any one of:
-the volume of the ambient sound information is larger than a preset volume threshold;
-the volume of the ambient sound information increases over time, e.g. the speaker gradually approaches the user or the speaker increases the volume;
-a sound property of the ambient sound information fulfils a preset property condition, e.g. a human voice is detected from the ambient sound information, or a sound complying with a preset frequency or voiceprint (e.g. a person-specific sound) is detected from the ambient sound information.
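The first two example conditions above (volume above a threshold, volume rising over time) can be sketched on a sequence of per-frame RMS volumes. The threshold and rise ratio below are illustrative values, not taken from the patent; voiceprint/frequency matching is omitted.

```python
def detect_control_trigger(frames, volume_threshold=0.2, rise_ratio=1.5):
    """Check the example trigger conditions on per-frame RMS volumes.

    frames: list of RMS volume values of the ambient sound, oldest first.
    volume_threshold / rise_ratio: hypothetical tuning values.
    """
    if not frames:
        return False
    latest = frames[-1]
    if latest > volume_threshold:        # volume exceeds preset threshold
        return True
    if len(frames) >= 2 and frames[0] > 0 and latest / frames[0] >= rise_ratio:
        return True                      # volume increasing over time
    return False
```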
In some embodiments, referring to fig. 2b, the audio playing device 10 collects ambient sounds through the first microphone 201 and the second microphone 202, and detects the control trigger event based on sound information collected by the first microphone 201 and the second microphone 202, respectively, so as to reduce misjudgment and be suitable for implementing a specific logic function in some embodiments. Specifically, the audio playback device collects first ambient sound information based on the first microphone 201, and collects second ambient sound information based on the second microphone 202. Accordingly, for the case where the audio playing device plays the current audio at the second output level after detecting the above-mentioned control trigger event, in step S310, the audio playing device plays the current audio based on the second output level in response to detecting the control trigger event with respect to the first ambient sound information and the second ambient sound information. For the case that the audio playing device pauses playing the current audio after detecting the control trigger event, in step S320, the audio playing device pauses playing the current audio in response to detecting the control trigger event related to the first environmental sound information and the second environmental sound information.
In some embodiments, in the case that the audio playing apparatus 10 collects the ambient sound through the first microphone 201 and the second microphone 202, the method further includes step S500 and step S600 (both not shown). In step S500, the audio playing device determines sound source orientation information based on the first environmental sound information and the second environmental sound information; then, in step S600, the audio playback apparatus indicates the sound source position to the user based on the sound source position information. Wherein, the sound source orientation is provided to the user in the form of voice broadcast based on the audio output unit, indication symbols/indication words presented on the screen of the audio playback apparatus, etc., so that the user can quickly locate the sound source and more quickly focus on the conversation or discussion, thereby improving the user experience.
Specifically, in some embodiments, the sound source orientation information is determined based on the volumes at which the first microphone 201 and the second microphone 202 collect the same environmental sound (two sounds collected by the microphones 201 and 202 are judged to be the same environmental sound if, for example, their frequencies are the same or close), or based on the time difference between the first microphone 201 and the second microphone 202 receiving the same environmental sound. Take the case where the first microphone 201 and the second microphone 202 are separately disposed on two sides of the user's head (for example, on the two earphone units of a headset worn by the user) and the audio playing device determines the sound source orientation based on the time difference. Referring to fig. 2c, when the distances from the speaker to the first microphone 201 and to the second microphone 202 differ by d, the ambient sound arrives at the two microphones at different times, and this time difference can be used to calculate the distance difference d; based on the time difference, it is possible to determine whether the sound source is located roughly on the left side or the right side of the user. Taking the orientation shown in fig. 2c as an example, if the environmental sound reaches the first microphone 201 first and then the second microphone 202, the sound source is roughly determined to be on the user's left, and the user can visually search for the speaker in the left half of the space instead of the whole space, helping the user quickly focus on the conversation or discussion.
Further, in theory, the set of points on the plane of the two microphones whose distances to the first microphone 201 and the second microphone 202 differ by d forms one branch of a hyperbola whose foci are at the positions of the two microphones. The direction of the sound source can therefore be narrowed to the region between the asymptotes of that branch, so that, based on the hyperbola determined by the microphone positions and the distance difference d, the efficiency with which the user searches for the speaker is further improved.
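The time-difference geometry above can be sketched numerically: the path difference is d = c·Δt; with the microphones at the foci (separation = 2f) and d = 2a, the asymptote of the hyperbola branch makes an angle atan(b/a) with the microphone axis, where b = sqrt(f² − a²). Function names and the sign convention for Δt are assumptions of this sketch.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C


def source_side(dt_seconds):
    """Rough left/right decision; dt = t_left_mic - t_right_mic (assumed sign)."""
    if dt_seconds < 0:
        return "left"          # sound reached the left microphone first
    if dt_seconds > 0:
        return "right"
    return "front_or_back"     # equidistant: on the perpendicular bisector


def asymptote_angle_deg(dt_seconds, mic_spacing):
    """Angle (degrees, measured from the microphone axis) of the asymptote
    of the hyperbola branch on which the source lies.

    Foci sit at the two microphones (separation mic_spacing = 2*f);
    the path-length difference |d| = 2*a selects the branch.
    """
    d = abs(dt_seconds) * SPEED_OF_SOUND   # path-length difference
    a = d / 2.0
    f = mic_spacing / 2.0
    if a >= f:
        return 0.0                         # source essentially on the mic axis
    b = math.sqrt(f * f - a * a)
    return math.degrees(math.atan2(b, a))
```

A zero time difference yields 90° (the perpendicular bisector), and larger time differences shrink the angle toward the microphone axis, matching the narrowing cone described above.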
It should be understood that the above-described embodiments for determining the orientation of the sound source based on the time difference are only examples and are not intended to limit the present application in any way. In other embodiments, the orientation of the sound source may also be determined by other means. For example, although the two ears are close together, the sound-blocking effect of the user's head causes a level difference between the same ambient sound as received by the first microphone 201 and the second microphone 202, and the sound source should be located on the side with the larger level. For another example, since a sound wave has different phases at different positions in space, the phase difference of the sound wave vibrations can also be used to identify the sound source orientation. For yet another example, a sound wave transmitted from one side of the user bypasses the user to reach the other side; the diffraction capability of a sound wave is related to the ratio between its wavelength and the size of the obstacle, and for the same obstacle, the higher the sound frequency, the greater the attenuation of the corresponding component, so the timbre of the sound received by the microphones on different sides also differs. In addition, in some embodiments, these parameters are combined to obtain a more accurate sound source orientation. It should be understood by those skilled in the art that these ways of determining the orientation of the sound source are only examples and not a limitation of the present application, and other existing or future ways of determining the sound source orientation based on the first ambient sound information and the second ambient sound information, as may be applicable to the present application, are also included in the scope of protection of the present application and are incorporated herein by reference.
In some embodiments, in order to improve the positioning accuracy of the sound source position, in step S500, the audio playing device tracks the first environmental sound information and the second environmental sound information to determine the sound source position information. For example, the audio playing device first collects first environmental sound information and second environmental sound information by using the first microphone 201 and the second microphone 202, and determines first orientation information of the sound source based on any one of the above manners; while the user rotates the head based on the first orientation information, the audio playing device continuously collects the first environmental sound information and the second environmental sound information by the first microphone 201 and the second microphone 202, and continuously determines the second orientation information of the sound source based on any one of the above modes; the audio playing device then determines the azimuth information of the sound source based on the first azimuth information and the second azimuth information, for example, the azimuth information of the sound source is determined by the intersection of the spatial ranges covered by the first azimuth information and the second azimuth information, so as to improve the measurement accuracy of the azimuth of the sound source.
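The intersection of the spatial ranges covered by the first and second orientation information can be sketched as an interval intersection. This sketch assumes both coarse azimuth ranges have already been expressed in a shared, head-rotation-compensated frame, which the patent does not specify.

```python
def intersect_bearings(range1, range2):
    """Intersect two coarse azimuth estimates (degrees) from two measurements.

    Each range is a (lo, hi) pair in a shared reference frame (an assumption
    of this sketch). Returns the overlap, or None if the estimates disagree.
    """
    lo = max(range1[0], range2[0])
    hi = min(range1[1], range2[1])
    return (lo, hi) if lo <= hi else None
```

The narrower intersection is what improves the measurement accuracy of the sound source azimuth as described above.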
In some embodiments, the method further comprises step S700 (not shown). In step S700, the audio playing device obtains attitude information of the audio playing device, where the attitude information is used to characterize an angular state (attitude) including (but not limited to) pitch, roll, etc. of the audio playing device in space, for example, the attitude information includes that the audio playing device is in a landscape/portrait state. In some embodiments, the gesture information is obtained based on a sensing device such as a gyroscope or a gravity sensor built in the audio playback device. Then, in step S600, the audio playing device presents the orientation of the sound source relative to the audio playing device based on the posture information and the sound source orientation information, so as to indicate the sound source orientation to the user, for example, visually indicate the sound source orientation at a corresponding angle in the landscape/portrait state of the audio playing device, refer to fig. 2c.
In order to avoid unnecessary disturbance in the case of determining the sound source direction information based on the first environmental sound information and the second environmental sound information, and only detecting whether a person speaks to the user or talks about a topic of interest to the user in a spatial direction of interest to the user, in some embodiments, the control trigger event includes: the sound source direction detected by the audio playing equipment meets the preset direction condition. For example, the direction of the sound source detected by the audio playback device is included in the spatial range specified by the user in advance, or the approximate range of the direction of the sound source detected by the audio playback device intersects with the spatial range specified by the user in advance. On the basis, the control trigger event also optionally comprises any one of the following items to avoid the system from being sensitive to generate misoperation:
-the volume of the first ambient sound information is larger than a preset first volume threshold;
-the volume of the second ambient sound information is larger than a preset second volume threshold;
-the volume of the first ambient sound information increases over time;
-the volume of the second ambient sound information increases over time;
-the difference between the volume of the first ambient sound information and the second ambient sound information decreases over time, e.g. the sound source gradually becomes facing the user.
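The combination just described (a preset direction condition, optionally joined with one of the volume conditions) can be sketched as a single predicate. All thresholds, the sector representation, and the function name are assumptions for illustration.

```python
def dual_mic_trigger(bearing_deg, allowed, vol1, vol2,
                     thresh1=0.2, thresh2=0.2, prev_diff=None):
    """Trigger only when the detected bearing falls inside the user's chosen
    sector AND one of the example volume conditions holds.

    allowed: (lo, hi) sector in degrees, user-specified in advance.
    vol1 / vol2: current volumes of the first and second ambient sound info.
    prev_diff: earlier |vol1 - vol2|, to detect the source turning to face
    the user (hypothetical bookkeeping).
    """
    lo, hi = allowed
    if not (lo <= bearing_deg <= hi):      # preset direction condition
        return False
    if vol1 > thresh1 or vol2 > thresh2:   # volume above preset thresholds
        return True
    diff = abs(vol1 - vol2)
    if prev_diff is not None and diff < prev_diff:
        return True                        # volume gap shrinking over time
    return False
```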
Of course, the sound source direction meeting the preset direction condition is not necessarily a prerequisite for the above control trigger events; accordingly, in some embodiments, where the first and second ambient sound information is collected with the first and second microphones 201 and 202, the control trigger event includes any of:
-the volume of the first ambient sound information is larger than a preset first volume threshold;
-the volume of the second ambient sound information is larger than a preset second volume threshold;
-the volume of the first ambient sound information increases over time;
-the volume of the second ambient sound information increases over time;
-the difference between the volume of the first ambient sound information and the second ambient sound information decreases over time, e.g. the sound source gradually becomes facing the user.
In some embodiments, the control trigger event comprises: the environmental sound information includes predetermined keyword information, so that the user is notified in time to listen or join the discussion when someone nearby mentions a corresponding keyword (e.g., a topic of interest to the user); accordingly, in step S310, the audio playing device plays the current audio based on the second output level in response to detecting that the ambient sound information includes the predetermined keyword information. In particular, if the predetermined keywords include stop-announcement vocabulary for vehicles, the user can be reminded in time to get off upon arriving at a stop. For example, when the keywords contain the user's name, the user can react quickly when someone calls or talks about him or her; when the keywords contain a station name preset by the user (such as a subway station), the user can get off in time when riding a vehicle; when the keywords include a station-arrival forecast (such as a "next station" prompt in a stop announcement), the user can prepare to get off as soon as the vehicle approaches the station; and so on.
In some embodiments, the audio playing device first obtains text information corresponding to the ambient sound information (for example, the device locally converts speech in the ambient sound information into text, or sends the ambient sound information to the cloud and receives the converted text in return), and then detects whether the text contains the preset keywords. Alternatively, the audio playing device can send the ambient sound information to the cloud, have the cloud detect whether it contains the preset keywords, and receive the detection result in return. In some embodiments, the predetermined keyword information is entered by the user in advance locally on the audio playing device; in other embodiments, it is delivered to the device by a cloud server (for example, the device initiates a synchronization operation, or the cloud server pushes the predetermined keyword information to the device).
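The keyword-trigger path just described can be sketched roughly as follows. The keyword set and the `speech_to_text` placeholder are assumptions; a real device would run local ASR or call a cloud recognition API at that point.

```python
# Illustrative sketch of the keyword-trigger path: transcribe ambient speech,
# then check the transcript against the user's preset keywords.
# PRESET_KEYWORDS and speech_to_text are assumptions, not from the patent.

PRESET_KEYWORDS = {"next station", "Alice"}  # e.g. a stop announcement, the user's name

def speech_to_text(ambient_sound):
    """Placeholder recognizer: this sketch treats the input as already-transcribed text.
    A real device would run local ASR here or send the audio to a cloud service."""
    return ambient_sound

def keyword_trigger(ambient_sound, keywords=PRESET_KEYWORDS):
    """Return True if the transcribed ambient sound contains any preset keyword."""
    text = speech_to_text(ambient_sound)
    return any(kw in text for kw in keywords)
```

When `keyword_trigger` fires, the device would then play the current audio based on the second output level, per step S310.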
It should be understood by those skilled in the art that the above-described embodiments are merely exemplary and not restrictive of the present application, and that other corresponding embodiments, which are currently or later become known, may be applied to the present application and are included within the scope of the present application and are incorporated herein by reference.
In this application, a microphone may have different directivities in different embodiments. For example, the microphones used to implement the present application may be omnidirectional microphones, unidirectional microphones, bidirectional microphones, or microphone arrays (the latter enabling more precise localization of the sound source); common unidirectional microphones include cardioid and hypercardioid microphones.
Corresponding to the method, according to another aspect of the present application, an audio playing device is also provided. Referring to FIG. 3a, the audio playing device includes a first module 100, a second module 200, and a third module 310, which are respectively configured to perform the operations of steps S100, S200, and S310 in the embodiment corresponding to FIG. 1a; for specific implementations, reference is made to the related embodiments above, which are not repeated here. Alternatively, referring to FIG. 3b, the audio playing device includes a first module 100, a second module 200, and a third module 320, which are respectively configured to perform the operations of steps S100, S200, and S320 in the embodiment corresponding to FIG. 1b; for specific implementations, reference is made to the related embodiments above, which are not repeated here.
In some embodiments, the audio playing device further includes a fourth module 400 (not shown) configured to perform step S400 in the foregoing embodiments; reference is made to the related embodiments above, which are not repeated here.
In some embodiments, the audio playing device further includes a fifth module 500 (not shown) configured to perform step S500 in the foregoing embodiments; reference is made to the related embodiments above, which are not repeated here.
In some embodiments, the audio playing device further includes a sixth module 600 (not shown) configured to perform step S600 in the foregoing embodiments; reference is made to the related embodiments above, which are not repeated here.
In some embodiments, the audio playing device further includes a seventh module 700 (not shown) configured to perform step S700 in the foregoing embodiments; reference is made to the related embodiments above, which are not repeated here.
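The step-to-module mapping described above can be sketched as follows; the class, method, and attribute names are hypothetical, chosen only to show how each module wraps one method step.

```python
# Hypothetical sketch of the module decomposition: each module corresponds
# to one method step (S100 play, S200 collect, S310 adjust).
# All names here are illustrative assumptions, not from the patent.

class AudioPlayingDevice:
    def __init__(self):
        self.output_level = "first"
        self.ambient = None

    def module_100(self):
        # Step S100: play the current audio based on the first output level.
        self.output_level = "first"

    def module_200(self, mic_samples):
        # Step S200: collect ambient sound information from the microphones.
        self.ambient = mic_samples

    def module_310(self, trigger_detected):
        # Step S310: on a control trigger event, prompt the user and switch
        # playback to the second output level.
        if trigger_detected:
            self.output_level = "second"
```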
Some specific embodiments of the present application are detailed above. It should be understood that the above embodiments are only examples and are not intended to limit the specific embodiments of the present application.
The present application also provides a computer readable storage medium having stored thereon computer code which, when executed, performs the method of any of the preceding claims.
The present application also provides a computer program product, which when executed by a computer device, performs the method of any of the preceding claims.
The present application further provides a computer device, comprising:
one or more processors;
a memory for storing one or more computer programs;
the one or more computer programs, when executed by the one or more processors, cause the one or more processors to implement the method of any preceding claim.
FIG. 4 illustrates an exemplary system that can be used to implement the various embodiments described in this application.
As shown, in some embodiments, the system 1000 can be implemented as any one of the audio playback devices in the embodiments. In some embodiments, system 1000 may include one or more computer-readable media (e.g., system memory or NVM/storage 1020) having instructions and one or more processors (e.g., processor(s) 1005) coupled with the one or more computer-readable media and configured to execute the instructions to implement modules to perform the actions described herein.
For one embodiment, system control module 1010 may include any suitable interface controllers to provide any suitable interface to at least one of the processor(s) 1005 and/or to any suitable device or component in communication with system control module 1010.
The system control module 1010 may include a memory controller module 1030 to provide an interface to the system memory 1015. Memory controller module 1030 may be a hardware module, a software module, and/or a firmware module.
System memory 1015 may be used to load and store data and/or instructions for system 1000, for example. For one embodiment, system memory 1015 may include any suitable volatile memory, such as suitable DRAM. In some embodiments, the system memory 1015 may include double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, system control module 1010 may include one or more input/output (I/O) controllers to provide an interface to NVM/storage 1020 and communication interface(s) 1025.
For example, NVM/storage 1020 may be used to store data and/or instructions. NVM/storage 1020 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more Hard Disk drive(s) (HDD (s)), one or more Compact Disc (CD) drive(s), and/or one or more Digital Versatile Disc (DVD) drive (s)).
NVM/storage 1020 may include storage resources that are physically part of a device on which system 1000 is installed or may be accessed by the device and not necessarily part of the device. For example, NVM/storage 1020 may be accessed over a network via communication interface(s) 1025.
Communication interface(s) 1025 may provide an interface for system 1000 to communicate over one or more networks and/or with any other suitable device. System 1000 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols.
For one embodiment, at least one of the processor(s) 1005 may be packaged together with logic for one or more controller(s) of the system control module 1010, such as the memory controller module 1030. For one embodiment, at least one of the processor(s) 1005 may be packaged together with logic for one or more controller(s) of the system control module 1010 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 1005 may be integrated on the same die with logic for one or more controller(s) of the system control module 1010. For one embodiment, at least one of the processor(s) 1005 may be integrated on the same die with logic of one or more controllers of the system control module 1010 to form a system on a chip (SoC).
In various embodiments, system 1000 may be, but is not limited to being: a server, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.). In various embodiments, system 1000 may have more or fewer components and/or different architectures. For example, in some embodiments, system 1000 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, implemented using Application Specific Integrated Circuits (ASICs), general purpose computers or any other similar hardware devices. In one embodiment, the software programs of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software programs (including associated data structures) of the present application may be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Additionally, some of the steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, some of the present application may be implemented as a computer program product, such as computer program instructions, which when executed by a computer, may invoke or provide methods and/or techniques in accordance with the present application through the operation of the computer. Those skilled in the art will appreciate that the form in which the computer program instructions reside on a computer-readable medium includes, but is not limited to, source files, executable files, installation package files, and the like, and that the manner in which the computer program instructions are executed by a computer includes, but is not limited to: the computer directly executes the instruction, or the computer compiles the instruction and then executes the corresponding compiled program, or the computer reads and executes the instruction, or the computer reads and installs the instruction and then executes the corresponding installed program. In this regard, computer readable media can be any available computer readable storage media or communication media that can be accessed by a computer.
Communication media includes media whereby communication signals, including, for example, computer readable instructions, data structures, program modules, or other data, are transmitted from one system to another. Communication media may include conductive transmission media such as cables and wires (e.g., fiber optics, coaxial, etc.) and wireless (non-conductive transmission) media capable of propagating energy waves such as acoustic, electromagnetic, RF, microwave, and infrared. Computer readable instructions, data structures, program modules or other data may be embodied in a modulated data signal, such as a carrier wave or similar mechanism that is embodied in a wireless medium, such as part of spread-spectrum techniques, for example. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. The modulation may be analog, digital, or hybrid modulation techniques.
By way of example, and not limitation, computer-readable storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. For example, computer-readable storage media include, but are not limited to, volatile memory such as random access memory (RAM, DRAM, SRAM); and non-volatile memory such as flash memory, various read-only memories (ROM, PROM, EPROM, EEPROM), magnetic and ferromagnetic/ferroelectric memories (MRAM, feRAM); and magnetic and optical storage devices (hard disk, tape, CD, DVD); or other now known media or later developed that are capable of storing computer-readable information/data for use by a computer system.
An embodiment according to the present application comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a method and/or a solution according to the aforementioned embodiments of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (13)

1. A method for playing audio, applied to an audio playing device, wherein the method comprises:
playing the current audio based on the first output level;
collecting environmental sound information, wherein the environmental sound information comprises first environmental sound information and second environmental sound information, the first environmental sound information is collected based on a first microphone, the second environmental sound information is collected based on a second microphone, the first environmental sound information and the second environmental sound information are used for detecting a control trigger event, the first microphone and the second microphone are respectively arranged on two side earphone units of an earphone worn by a user, the control trigger event comprises that the difference between the volumes of the first environmental sound information and the second environmental sound information is reduced along with time, and the difference between the volumes of the first environmental sound information and the second environmental sound information is reduced along with time to determine that a sound source is opposite to the user;
in response to detecting the control trigger event regarding the first and second ambient sound information, providing level adjustment prompt information, and playing the current audio based on a second output level, wherein the level adjustment prompt information is used for notifying a user of attention to conversation or discussion content, and the volume of playing the current audio based on the second output level is different from the volume of playing the current audio based on the first output level.
2. The method of claim 1, wherein the method further comprises:
in response to detecting a recovery trigger event, playing the current audio based on the first output level.
3. The method of claim 2, wherein the recovery triggering event comprises any one of:
the user executes the level recovery operation;
the user does not respond within a preset time length.
4. The method of claim 1, wherein the control trigger event comprises at least any one of:
the volume of the environmental sound information is larger than a preset volume threshold;
the volume of the ambient sound information increases over time;
and the sound attribute of the environmental sound information meets a preset attribute condition.
5. The method of claim 1, wherein the method further comprises:
determining sound source orientation information based on the first ambient sound information and the second ambient sound information;
and indicating the direction of the sound source to the user based on the sound source direction information.
6. The method of claim 5, wherein the step of determining sound source orientation information based on the first ambient sound information and the second ambient sound information comprises:
tracking the first environmental sound information and the second environmental sound information to determine sound source orientation information.
7. The method of claim 5, wherein the step of indicating the audio source position to the user based on the audio source position information is preceded by:
acquiring attitude information of the audio playing equipment;
the step of indicating the direction of the sound source to the user based on the sound source direction information includes:
and presenting the position of the sound source relative to the audio playing equipment based on the attitude information and the sound source position information, thereby indicating the position of the sound source to the user.
8. The method of claim 5, wherein the control trigger event comprises:
the sound source azimuth information meets a preset azimuth condition.
9. The method of claim 8, wherein the control trigger event further comprises any one of:
the volume of the first environmental sound information is larger than a preset first volume threshold;
the volume of the second environment sound information is larger than a preset second volume threshold;
the volume of the first ambient sound information increases over time;
the volume of the second ambient sound information increases with time;
a difference between the volumes of the first ambient sound information and the second ambient sound information decreases with time.
10. The method of claim 1, wherein the control trigger event further comprises any one of:
the volume of the first environmental sound information is larger than a preset first volume threshold;
the volume of the second environment sound information is larger than a preset second volume threshold;
the volume of the first ambient sound information increases over time;
the volume of the second ambient sound information increases over time.
11. The method of claim 1, wherein the control trigger event comprises: the ambient sound information includes predetermined keyword information, and the playing the current audio based on a second output level in response to detecting a control trigger event with respect to the ambient sound information includes:
playing the current audio based on a second output level in response to detecting that the ambient sound information includes the predetermined keyword information.
12. An audio playback apparatus, wherein the apparatus comprises:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform operations according to the method of any one of claims 1 to 11.
13. A computer-readable medium storing instructions that, when executed by a computer, cause the computer to perform operations according to the method of any one of claims 1 to 11.
CN201910864212.7A 2019-09-12 2019-09-12 Audio playing device and method for playing audio Active CN110719545B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910864212.7A CN110719545B (en) 2019-09-12 2019-09-12 Audio playing device and method for playing audio

Publications (2)

Publication Number Publication Date
CN110719545A CN110719545A (en) 2020-01-21
CN110719545B true CN110719545B (en) 2022-11-08

Family

ID=69210442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910864212.7A Active CN110719545B (en) 2019-09-12 2019-09-12 Audio playing device and method for playing audio

Country Status (1)

Country Link
CN (1) CN110719545B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113194383A (en) * 2021-04-29 2021-07-30 歌尔科技有限公司 Sound playing method and device, electronic equipment and readable storage medium
CN113852905A (en) * 2021-09-24 2021-12-28 联想(北京)有限公司 Control method and control device

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003304515A (en) * 2002-04-10 2003-10-24 Sumitomo Electric Ind Ltd Voice output method, terminal device, and two-way interactive system
CN101478614A (en) * 2009-01-19 2009-07-08 深圳华为通信技术有限公司 Method, apparatus and communication terminal for adaptively tuning volume
CN103024630A (en) * 2011-09-21 2013-04-03 联想(北京)有限公司 Volume regulating method of first electronic equipment and first electronic equipment
CN103137155B (en) * 2011-12-05 2016-01-20 英顺源(上海)科技有限公司 According to Play System and the method thereof of external voice signal change adjustment voice signal
CN106205628B (en) * 2015-05-06 2018-11-02 小米科技有限责任公司 Voice signal optimization method and device
CN104978980B (en) * 2015-07-03 2018-03-02 上海斐讯数据通信技术有限公司 A kind of method for controlling sound to play and sound playing system
CN105262452A (en) * 2015-10-29 2016-01-20 小米科技有限责任公司 Method and apparatus for adjusting volume, and terminal
CN105810219B (en) * 2016-03-11 2018-03-16 宇龙计算机通信科技(深圳)有限公司 Player method, play system and the voice frequency terminal of multimedia file
CN105847566A (en) * 2016-04-01 2016-08-10 乐视控股(北京)有限公司 Mobile terminal audio volume adjusting method and device
CN107643509B (en) * 2016-07-22 2019-01-11 腾讯科技(深圳)有限公司 Localization method, positioning system and terminal device
CN106604167B (en) * 2016-11-21 2019-07-26 捷开通讯(深圳)有限公司 A kind of method and mobile terminal of adjust automatically earphone left and right acoustic channels output volume
CN106954125A (en) * 2017-03-29 2017-07-14 联想(北京)有限公司 Information processing method and audio frequency apparatus
CN107122161B (en) * 2017-04-27 2019-12-27 维沃移动通信有限公司 Audio data playing control method and terminal
CN107105367B (en) * 2017-05-24 2020-07-10 维沃移动通信有限公司 Audio signal processing method and terminal
CN107302638A (en) * 2017-08-18 2017-10-27 广东欧珀移动通信有限公司 A kind of volume adjusting method, device, storage medium and mobile terminal



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant