CN109522425A

CN109522425A - Method, apparatus, and storage device for adjusting a multimedia environment

Info

Publication number: CN109522425A
Application number: CN201811346948.7A
Authority: CN
Inventors: 薄海硕; 顾嘉唯
Original assignee: Beijing Genius Intelligent Technology Co Ltd
Current assignee: Luka Beijing Intelligent Technology Co ltd
Priority date: 2018-11-13
Filing date: 2018-11-13
Publication date: 2019-03-26
Anticipated expiration: 2038-11-13
Also published as: CN109522425B

Abstract

This application discloses a method, apparatus, and storage device for adjusting a multimedia environment. The method comprises: detecting ambient sound and outputting feature values; and adjusting audio track volume and/or lighting effects according to the feature values. The beneficial effect of the application is that it builds a complete integrated sound-and-light atmosphere that is real-time, dynamically variable, and interactive, which substantially enhances the product's perceived intelligence and personification.

Description

Method, apparatus, and storage device for adjusting a multimedia environment

Technical field

This application relates to the field of artificial intelligence, and more particularly to a method, apparatus, and storage device for adjusting a multimedia environment.

Background technique

Deep neural networks (DNNs) are used more and more widely across industries. As quality of life improves, people demand more from audio-visual products. When people enjoy such products, the various sounds in the environment affect their listening experience. Simple lighting effects and sound changes can no longer satisfy people's audio-visual expectations. The prior art cannot combine sound and light to build an integrated sound-and-light atmosphere.

Summary of the invention

The embodiments of the present application provide a method, apparatus, and storage device for adjusting a multimedia environment, solving the problem that sound and light cannot be combined to build an integrated sound-and-light atmosphere.

The embodiments of the present application provide a method for adjusting a multimedia environment, the method comprising:

Detecting ambient sound and outputting feature values;

Adjusting audio track volume and/or lighting effects according to the feature values.

Further, adjusting audio track volume and/or lighting effects according to the feature values includes:

Dividing the ambient sound into noise and specific audio according to the feature values;

Adjusting audio track volume and/or lighting effects according to the feature values corresponding to the specific audio.

Further, the parameters of the feature values include: volume value, time-domain signal phase amplitude, and time-frequency signal harmonic energy.

Further, adjusting audio track volume and/or lighting effects according to the feature values corresponding to the specific audio includes:

Sorting the feature values in descending order, using one of volume value, time-domain signal phase amplitude, and time-frequency signal harmonic energy as the index;

Adjusting audio track volume and/or lighting effects according to the feature values that, after sorting, have the maximum value of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index.

Further, adjusting audio track volume and/or lighting effects according to the feature values that, after sorting, have the maximum value of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index further includes:

Where more than one feature value shares the maximum of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index, comparing one of the parameters not used as the index, and selecting the feature values with the maximum of that parameter to adjust audio track volume and/or lighting effects;

Where more than one feature value also shares the maximum of that parameter, comparing the other parameter not used as the index, and selecting the feature values with the maximum of that other parameter to adjust audio track volume and/or lighting effects.

Where the parameter of the feature values that, after sorting, has the maximum value of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index reaches its corresponding preset threshold, adjusting audio track volume and/or lighting effects.

Further, adjusting audio track volume includes:

Adjusting the volume ratio between different audio tracks.

Further, adjusting lighting effects includes:

Calling a preset lighting control protocol to display the lighting effect.

The embodiments of the present application also provide a storage device on which program data is stored, the program data being used to implement the above method for adjusting a multimedia environment when executed by a processor.

The embodiments of the present application also provide an apparatus for adjusting a multimedia environment, the apparatus comprising:

A storage device for storing program data;

A processor for executing the program data in the storage device to implement the above method for adjusting a multimedia environment.

The beneficial effect of the present application is that it builds a complete integrated sound-and-light atmosphere that is real-time, dynamically variable, and interactive, which substantially enhances the product's perceived intelligence and personification.

Detailed description of the invention

The drawings described here provide a further understanding of the present application and form part of it. The illustrative embodiments of the present application and their descriptions explain the application and do not unduly limit it. In the drawings:

Fig. 1 is a block diagram of the structure of a computer;

Fig. 2 is a flowchart of a method for adjusting a multimedia environment provided by an embodiment of the present application;

Fig. 3 is a schematic structural diagram of an apparatus for adjusting a multimedia environment provided by an embodiment of the present application.

Specific embodiment

To make the purposes, technical solutions, and advantages of the present application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments of the application and the corresponding drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments, without creative work, fall within the scope of protection of the present application.

Fig. 1 is a block diagram of the structure of a computer, showing its main components. In Fig. 1, the processor 110, internal memory 105, bus bridge 120, and network interface 115 are connected to the system bus 140; the bus bridge 120 bridges the system bus 140 and the I/O bus 145; the I/O interface is connected to the I/O bus 145; and the USB interface and external memory are connected to the I/O interface. The processor 110 may be one or more processors, and each processor may have one or more processor cores. The internal memory 105 is volatile memory, such as registers, buffers, and various types of random access memory. When the computer is powered on and running, the data in the internal memory 105 includes the operating system and application programs. The network interface 115 may be an Ethernet interface, an optical fiber interface, or the like. The system bus 140 may be used to transmit data information, address information, and control information. The bus bridge 120 may be used for protocol conversion, converting the system bus protocol to the I/O protocol or the I/O protocol to the system bus protocol to enable data transfer. The I/O bus 145 is used to transmit data information and control information, and may be fitted with bus termination resistors or circuits to reduce signal reflection interference. The I/O interface 130 mainly connects to various external devices, such as a keyboard, mouse, and sensors; flash memory can be connected to the I/O bus through the USB interface; and the external memory is non-volatile memory, such as a hard disk or optical disc. After the computer is powered on, the processor can read data stored in the external memory into the internal memory and process the computer instructions stored in the internal memory to carry out the functions of the operating system and application programs. This example computer may be a desktop computer, laptop, tablet computer, smartphone, or the like.

First, an audio event detection framework is obtained by training a deep neural network as a deep feature extractor, combined with an audio event classifier based on conventional audio signal analysis. Multiple conventional audio features are transformed through learning by the multilayer deep neural network to obtain deep audio features. In this embodiment, the deep neural network is a convolutional neural network. The conventional audio signal analysis is short-time feature extraction, such as single-track mel-frequency cepstral coefficients and multi-track mel-frequency cepstral coefficients.
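As a minimal illustration of short-time feature extraction, the sketch below splits a signal into overlapping frames and computes one value per frame. It uses root-mean-square energy as a simple stand-in and does not compute the mel-frequency cepstral coefficients named above. The frame length, hop size, sample rate, and function names are assumptions made for illustration only.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames (assumed sizes)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

def short_time_rms(samples, frame_len=400, hop=160):
    """Return one RMS energy value per frame, a simple short-time feature."""
    return [
        math.sqrt(sum(x * x for x in frame) / len(frame))
        for frame in frame_signal(samples, frame_len, hop)
    ]

# Example input: a 440 Hz tone sampled at 16 kHz (both values illustrative).
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
features = short_time_rms(tone)
```

A real detector would feed richer per-frame features into the neural network, but the framing step would look broadly similar.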

The atmosphere mode is started, and the audio event detection framework, the multi-track player, and the lighting control system are started at the same time. In this embodiment, the user opens the app, enters the list of original atmospheres, and taps to select an atmosphere, which is sent to the device. The device then presents the atmosphere. The atmosphere mode can also be started by voice command. The atmosphere mode is a mode in which music and light together create an atmosphere.

Fig. 2 is a flowchart of a method for adjusting a multimedia environment provided by an embodiment of the present application. The flowchart includes:

Step 205: detect ambient sound and output feature values;

A microphone receives the sound in the environment. The audio event detection framework detects the feature values of the different sounds received by the microphone.

In this embodiment, the microphone front end uses 6-microphone ring array hardware. The captured audio data comprises 6 channels of normal audio data, 2 channels of echo cancellation signal data, 1 channel of duplicated analog data, and 3 channels of empty data, for a total of 12 channels of audio data. Audio signal processing and analysis of the multichannel audio data separates noise from specific audio events. The noise includes: multiple people talking in a bedroom, the sound of a television playing, the sounds produced by people's activities, and so on. The specific audio includes: clapping, blowing on the device, tapping a desk near the device, closing a door, and similar sounds.
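The 12-channel layout described above can be sketched as a de-interleaving step that groups channels by role. The channel counts come from the embodiment; the order of the groups within each frame, the group names, and the function name are assumptions, since the text gives only the counts.

```python
# Channel counts come from the embodiment; the ORDER of the groups within
# a frame is an assumption made only for this sketch.
CHANNEL_LAYOUT = [
    ("normal", 6),
    ("echo_cancellation", 2),
    ("duplicated_analog", 1),
    ("empty", 3),
]
TOTAL_CHANNELS = sum(count for _, count in CHANNEL_LAYOUT)  # 12

def split_channels(interleaved):
    """Group interleaved samples (frame by frame) by channel role."""
    if len(interleaved) % TOTAL_CHANNELS != 0:
        raise ValueError("sample count must be a multiple of 12")
    # De-interleave: channel c holds samples c, c+12, c+24, ...
    channels = [interleaved[c::TOTAL_CHANNELS] for c in range(TOTAL_CHANNELS)]
    groups, start = {}, 0
    for name, count in CHANNEL_LAYOUT:
        groups[name] = channels[start:start + count]
        start += count
    return groups

# Two frames of dummy samples: values 0..11, then 100..111.
frames = list(range(12)) + list(range(100, 112))
groups = split_channels(frames)
```

Only the six normal channels and the two echo cancellation channels would carry signal useful for separating noise from specific audio events.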

For example, the microphone receives clapping, a car horn, a door slam, and the sound of multiple people talking in a bedroom. The audio event detection framework detects the feature values of each of these sounds. The clapping has a volume value of 98, a time-domain signal phase amplitude of 26, and a time-frequency signal harmonic energy of 45. The car horn has a volume value of 98, a time-domain signal phase amplitude of 26, and a time-frequency signal harmonic energy of 32. The door slam has a volume value of 98, a time-domain signal phase amplitude of 26, and a time-frequency signal harmonic energy of 21. The bedroom conversation has a volume value of 56, a time-domain signal phase amplitude of 16, and a time-frequency signal harmonic energy of 11.

Step 210: adjust audio track volume and/or lighting effects according to the feature values;

Each multi-track audio file contains 4 to 5 audio tracks; for example, four tracks of rain, thunder, bonfire, and singing make up one multi-track audio file. The multi-track audio file is played back as a fused mix of its tracks. During playback, the volume ratio between the tracks is adjusted dynamically, smoothly, and in real time according to the parameters input by the audio event detection framework, so that the same category of audio produces different playback effects in different environments.
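One way to picture the smooth adjustment of volume ratios is the sketch below, which moves each track gain a small step toward a target on every update and mixes the tracks as a weighted sum. Exponential smoothing, the smoothing factor, and the target ratios are assumptions; the text says only that the ratios are adjusted smoothly in real time.

```python
def smooth_gains(current, target, alpha=0.2):
    """Move each track gain a fraction alpha of the way toward its target."""
    return {name: g + alpha * (target[name] - g) for name, g in current.items()}

def mix_sample(track_samples, gains):
    """Weighted sum of one sample from each track."""
    return sum(gains[name] * s for name, s in track_samples.items())

# Track names follow the example in the text; the ratios are illustrative.
gains = {"rain": 0.25, "thunder": 0.25, "bonfire": 0.25, "singing": 0.25}
target = {"rain": 0.10, "thunder": 0.10, "bonfire": 0.60, "singing": 0.20}

for _ in range(30):  # one update per audio block
    gains = smooth_gains(gains, target)
```

Because both the starting and target ratios sum to 1, every intermediate set of gains also sums to 1, so the overall loudness stays steady while the balance shifts.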

Optionally, the different sounds in the environment are classified according to the feature values: the ambient sound is divided into noise and specific audio according to the feature values, and audio track volume and/or lighting effects are adjusted according to the feature values corresponding to the specific audio.

In this embodiment, according to the feature values from step 205, the clapping is classified as a clapping event, the car horn as a car horn event, the door slam as a door-closing event, and the bedroom conversation as noise. According to the preset feature value priority, the volume value is compared first, then the time-domain signal phase amplitude, and finally the time-frequency signal harmonic energy. The clapping, car horn, and door slam have the same volume value and the same time-domain signal phase amplitude, and the clapping has the highest time-frequency signal harmonic energy, so the clapping is taken as the classification result. The audio event detection framework inputs the clapping feature values (volume value 98, time-domain signal phase amplitude 26, and time-frequency signal harmonic energy 45) as parameters to the multi-track player and the lighting control system.
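The priority comparison in this example can be sketched as a lexicographic maximum over the three parameters, using the feature values given in the text. The class name, field names, and labels are hypothetical; only the numbers and the comparison order come from the embodiment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AudioEvent:
    label: str
    volume: int
    phase_amplitude: int
    harmonic_energy: int
    is_noise: bool = False

def select_event(events):
    """Pick the non-noise event that ranks highest by the preset priority:
    volume first, then phase amplitude, then harmonic energy."""
    candidates = [e for e in events if not e.is_noise]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda e: (e.volume, e.phase_amplitude, e.harmonic_energy),
    )

events = [
    AudioEvent("clap", 98, 26, 45),
    AudioEvent("car_horn", 98, 26, 32),
    AudioEvent("door_slam", 98, 26, 21),
    AudioEvent("bedroom_conversation", 56, 16, 11, is_noise=True),
]
selected = select_event(events)
print(selected.label)  # clap
```

Comparing tuples is equivalent to the cascade described above: a later parameter is consulted only when all earlier ones are tied.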

Optionally, in this embodiment, the feature value parameters include: volume value, time-domain signal phase amplitude, and time-frequency signal harmonic energy.

Still further, adjusting audio track volume and/or lighting effects according to the feature values that, after sorting, have the maximum value of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index further includes:

Optionally, adjusting audio track volume includes: adjusting the volume ratio between different audio tracks. In this embodiment, according to the feature value parameters output by the audio detection framework, the random() function is called to generate a random number; according to that random number, the volume ratio between different audio tracks is adjusted following the mapping rules shown in Table 1.

In this embodiment, after the multi-track player receives the clapping feature values input by the audio event detection framework (volume value 98, time-domain signal phase amplitude 26, and time-frequency signal harmonic energy 45), it adjusts the track volumes dynamically, smoothly, and in real time, in combination with the attribute features labeled in advance for its own tracks. These attribute features are the audio attribute features of the tracks, labeled in advance in the server-side configuration file. For example, according to the audio attribute features, the rainforest track in the multi-track audio specifies that when the energy value of the clapping reaches a preset threshold, the random() function is called to generate a random number, and the volume of this track is raised according to that random number. In this embodiment, the preset threshold is 80. Because the energy value of the clapping is 98, which is greater than the preset threshold of 80, the volume of the rainforest track is raised. As another example, according to the audio attribute features, the bonfire track specifies that when the time-domain signal phase amplitude of the clapping reaches a preset threshold, the random() function is called to generate random number 1, the volume ratio of this track is raised to 70% according to random number 1, and the volumes of the other tracks are reduced accordingly. In this embodiment, the preset threshold is 18. Because the time-domain signal phase amplitude of the clapping is 26, which is greater than the preset threshold of 18, the random() function is called to generate random number 2; according to random number 2, the volume of the bonfire track is raised to 60% and the volume of the singing track is reduced to 40%.
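The threshold check and ratio adjustment described above can be sketched as follows. The thresholds 80 and 18 and the 70% and 60% shares come from the embodiment. The mapping from random number to share is a hypothetical stand-in for Table 1, and rescaling the remaining tracks proportionally is an assumption about how the other volumes are "reduced accordingly".

```python
import random

# Thresholds from the embodiment: 80 for the energy value, 18 for the
# time-domain signal phase amplitude.
THRESHOLDS = {"volume": 80, "phase_amplitude": 18}

# Hypothetical stand-in for Table 1: random number -> target share.
RATIO_BY_RANDOM = {1: 0.70, 2: 0.60}

def boost_track(ratios, track, target_share):
    """Set one track to target_share and rescale the rest to sum to 1."""
    others = {n: r for n, r in ratios.items() if n != track}
    scale = (1.0 - target_share) / sum(others.values())
    adjusted = {n: r * scale for n, r in others.items()}
    adjusted[track] = target_share
    return adjusted

def adjust_on_event(ratios, track, parameter, value, rng):
    """Boost a track only when the event parameter reaches its threshold."""
    if value < THRESHOLDS[parameter]:
        return ratios
    number = rng.choice(sorted(RATIO_BY_RANDOM))
    return boost_track(ratios, track, RATIO_BY_RANDOM[number])

ratios = {"rain": 0.25, "thunder": 0.25, "bonfire": 0.25, "singing": 0.25}
rng = random.Random(0)  # seeded only to make the sketch repeatable
ratios = adjust_on_event(ratios, "bonfire", "phase_amplitude", 26, rng)
```

With a phase amplitude of 26 against a threshold of 18, the bonfire track is boosted; a value below the threshold leaves the ratios unchanged.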

Table 1: Volume mapping rules

Optionally, adjusting lighting effects includes: calling a preset lighting control protocol to display the lighting effect. In this embodiment, according to the feature value parameters output by the audio detection framework, the random() function is called to generate a random number; according to the mapping rules shown in Table 2, a preset dominant hue, direction of motion, and motion trajectory are called.

Table 2: Lighting effect mapping rules

The lighting control system controls the lighting display through a host computer and a slave computer working together over a serial port. The lighting effect differs from one atmosphere to another. According to the parameters input by the audio event detection framework, the lighting effect is varied on the basis of the original atmosphere lighting effect, adding to the product's perceived intelligence.

In this embodiment, after the lighting control system receives the clapping feature values input by the audio event detection framework (volume value 98, time-domain signal phase amplitude 26, and time-frequency signal harmonic energy 45), it randomly generates different lighting effects dynamically and in real time on the basis of the original lighting effect, in combination with its own attribute features. These attribute features are the lighting attribute features labeled in advance in the server-side configuration file. The lighting attribute features (dominant light hues) include: warm light, cool light, free random (gradual color change), grassland light, and blue-sky-and-white-cloud light. When the energy value of the clapping reaches a preset threshold, the random() function is called to generate a random number, and the lighting effect is adjusted according to that random number. In this embodiment, the preset threshold is 80. Because the energy value of the clapping is 98, which is greater than the preset threshold of 80, the random() function is called to generate random number 1, and the preset lighting control protocol is called according to random number 1 (the dominant hue is set to warm light or free random light, the motion trajectory is a spiral, and the direction of motion is diagonal toward the lower right) to display the lighting effect.
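The lighting selection described above can be sketched as a threshold check followed by a random lookup. The threshold of 80 and the first preset (warm light, spiral trajectory, diagonal toward the lower right) come from the embodiment. The second preset, the class and field names, and the lookup structure are hypothetical stand-ins for Table 2 and for the lighting control protocol, neither of which is specified in the text.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class LightingPreset:
    dominant_hue: str
    trajectory: str
    direction: str

# Hypothetical stand-in for Table 2. Entry 1 follows the example in the
# text; entry 2 is invented for illustration.
PRESETS = {
    1: LightingPreset("warm", "spiral", "diagonal toward lower right"),
    2: LightingPreset("cool", "linear", "left to right"),
}
VOLUME_THRESHOLD = 80  # preset threshold from the embodiment

def choose_lighting(volume, rng):
    """Return a preset when the volume reaches the threshold, else None."""
    if volume < VOLUME_THRESHOLD:
        return None  # keep the original atmosphere lighting
    return PRESETS[rng.choice(sorted(PRESETS))]

preset = choose_lighting(98, random.Random())
```

In a device, the chosen preset would then be encoded and sent from the host computer to the slave computer over the serial port.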

Fig. 3 is a schematic structural diagram of an apparatus for adjusting a multimedia environment provided by an embodiment of the present application. The apparatus includes a storage device 305 and a processor 310.

The storage device 305 is used to store program data;

The processor 310 is used to execute the program data in the storage device to implement: detecting ambient sound and outputting feature values; and adjusting audio track volume and/or lighting effects according to the feature values.

The embodiments of the present application also provide a storage device on which program data is stored, the program data being used, when executed by a processor, to implement: detecting ambient sound and outputting feature values; and adjusting audio track volume and/or lighting effects according to the feature values.

It should also be noted that the terms "include", "comprise", or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The above is only an embodiment of the present application and is not intended to limit it. Those skilled in the art may make various modifications and changes to the application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of its claims.

Claims

1. A method for adjusting a multimedia environment, characterized in that the method comprises:

Detecting ambient sound and outputting feature values;

2. The method for adjusting a multimedia environment according to claim 1, characterized in that adjusting audio track volume and/or lighting effects according to the feature values includes:

3. The method for adjusting a multimedia environment according to claim 2, characterized in that the parameters of the feature values include: volume value, time-domain signal phase amplitude, and time-frequency signal harmonic energy.

4. the method for adjustment multimedia environment according to claim 3, which is characterized in that described according to the specific audio Corresponding characteristic value adjustment track volume and/or lamp effect include:

With one of volume value, time-domain signal phase-amplitude and time frequency signal harmonic energy be index to characteristic value according to from big To small sequence；

According to volume value, time-domain signal phase-amplitude or the time frequency signal harmonic energy in the characteristic value after sequence as index The corresponding characteristic value adjustment track volume of maximum value and/or lamp effect.

5. The method for adjusting a multimedia environment according to claim 4, characterized in that adjusting audio track volume and/or lighting effects according to the feature values that, after sorting, have the maximum value of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index further includes:

Where more than one feature value shares the maximum of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index, comparing one of the parameters of the feature values not used as the index, and selecting the feature values with the maximum of that parameter to adjust audio track volume and/or lighting effects;

Where more than one feature value also shares the maximum of that parameter, comparing the other parameter of the feature values not used as the index, and selecting the feature values with the maximum of that other parameter to adjust audio track volume and/or lighting effects.

6. The method for adjusting a multimedia environment according to claim 4, characterized in that adjusting audio track volume and/or lighting effects according to the feature values that, after sorting, have the maximum value of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index further includes:

Where the parameter of the feature values that, after sorting, has the maximum value of the volume value, time-domain signal phase amplitude, or time-frequency signal harmonic energy used as the index reaches its corresponding preset threshold, adjusting audio track volume and/or lighting effects.

7. The method for adjusting a multimedia environment according to claim 6, characterized in that adjusting audio track volume includes:

Adjusting the volume ratio between different audio tracks.

8. The method for adjusting a multimedia environment according to claim 4, characterized in that adjusting lighting effects includes:

Calling a preset lighting control protocol to display the lighting effect.

9. A storage device on which program data is stored, characterized in that the program data is used to implement the method for adjusting a multimedia environment according to any one of claims 1-8 when executed by a processor.

10. An apparatus for adjusting a multimedia environment, characterized in that the apparatus comprises:

A storage device for storing program data;

A processor for executing the program data in the storage device to implement the method for adjusting a multimedia environment according to any one of claims 1-8.