CN107147792A - Method, device, mobile terminal and storage device for automatically configuring sound effects

Info

Publication number: CN107147792A
Application number: CN201710367042.2A
Authority: CN
Inventors: 陈琼
Original assignee: Huizhou TCL Mobile Communication Co Ltd
Current assignee: Xiamen Reliable Intellectual Property Service Co ltd
Priority date: 2017-05-23
Filing date: 2017-05-23
Publication date: 2017-09-08
Anticipated expiration: 2037-05-23
Also published as: CN107147792B

Abstract

The invention discloses a method, a device, a mobile terminal and a storage device for automatically configuring sound effects. The mobile terminal stores in advance a voice command recorded by the user. When the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, it analyzes the voice command with a voiceprint algorithm to recognize and determine the identity of the current user. When the mobile terminal has identified the user and a command to play music is triggered, it automatically sets the listening style of the music, or a personalized playlist, according to the current user's listening preferences. By means of custom voice commands, executing a voice command to play music not only starts the related music application but also submits the voice signal to the voiceprint algorithm for user identification, so that sound effects matching the user's preferences are configured and loaded automatically according to the user's identity.

Description

Method, device, mobile terminal and storage device for automatically configuring sound effects

Technical field

The present invention relates to the technical field of mobile terminals, and in particular to a method, a device, a mobile terminal and a storage device for automatically configuring sound effects.

Background art

With the rapid development of consumer electronics, smart mobile terminals have become an indispensable entertainment tool in daily life. People increasingly use smart mobile terminals such as smartphones and tablet computers to obtain information, work and entertain themselves.

New technologies have emerged alongside this trend, and voice, as a basic, main and key mode of interaction, is increasingly used in the interactive applications of smart devices. Power consumption is the design weak point of current consumer electronics as whole systems, so system designers, platform vendors and key-component manufacturers all work to reduce the power consumption of smart devices: they apply low-power design to products and systems as far as possible without sacrificing performance, and extend the standby time of electronic products to the greatest extent. The aim is that users can use product functions normally and more often, without worrying that the device will fail to work when needed because of power consumption.

Existing speech recognition technology mainly runs the recognition engine on the application processor. Starting the speech recognition function therefore requires waking the application processor, which also wakes peripheral components such as the display, touch panel, LEDs and sensors, and of course the audio system. Because the application processor is the main control module that manages user interaction, an increase in power consumption is unavoidable.

Current low-power schemes do not allow voice commands to be customized. To achieve low-power processing, a limited number of commands are usually combined with the speech recognition engine, packaged into firmware, and burned into a DSP. The purpose is to reduce the hardware units of the hub as far as possible, limit the required internal capacity, and reduce cost. The consequence is that users cannot personalize voice commands: the voice command firmware is defined and burned by the original manufacturer when the chip is supplied, and cannot be changed afterwards.

In the prior art, when a user opens the music player on a mobile terminal to play music and wants to use a preferred music style or sound effect mode, the user must make further settings. These cumbersome settings increase the power consumption of the user's mobile terminal and reduce standby time.

Therefore, the prior art still needs to be improved and developed.

Summary of the invention

The technical problem to be solved by the present invention is to provide, in view of the above defects of the prior art, a method, a device, a mobile terminal and a storage device for automatically configuring sound effects. The aim is that, through custom voice commands, executing a voice command to play music not only starts the related music application but also submits the voice signal to a voiceprint algorithm for user identification, so that sound effects matching the user's preferences are configured and loaded automatically according to the user's identity.

The technical solution adopted by the present invention to solve the technical problem is as follows:

A method for automatically configuring sound effects, wherein the method comprises the following steps:

Step A: The mobile terminal stores in advance a voice command recorded by the user; the voice command serves as a voice signal that is analyzed by a voiceprint algorithm to determine the user's identity;

Step B: When the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, the identity of the current user is recognized and determined by the voiceprint algorithm;

Step C: When the mobile terminal has identified the user and a command to play music is triggered, the mobile terminal automatically sets the listening style of the music, or a personalized playlist, according to the current user's listening preferences.

In the method for automatically configuring sound effects, step A specifically comprises:

Step A1: The mobile terminal receives a voice command prerecorded by the user, completing the customization of the voice command;

Step A2: When the mobile terminal is started or a related application is opened, the voice command is analyzed by the voiceprint algorithm to determine the user's identity;

Step A3: The mobile terminal accepts operations that add multiple different voice commands and, on receiving an instruction to modify a preset voice command, updates the corresponding voice command.

In the method for automatically configuring sound effects, step B specifically comprises:

Step B1: When the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, the user's voice command is received through the microphone;

Step B2: The voice command is analyzed by the voiceprint algorithm loaded in the audio decoder, and the analysis result is matched against the previously stored voice command to recognize and determine the user's identity information.

In the method for automatically configuring sound effects, step C specifically comprises:

Step C1: After the user's identity information has been identified, the mobile terminal receives the user's operation instruction to play music;

Step C2: The sound effect parameters are updated according to the current user's listening preferences, and the listening style of the music or a personalized playlist is configured automatically.

A device for automatically configuring sound effects, wherein the device comprises:

Application processor, for interacting with upper-layer applications and with the user interface;

File system, for storing the data exchanged between the user and peripheral devices, and providing the interface for read, write and storage operations on application data;

Power management module, for supplying power to each external device connected to the application processor, and managed and monitored by the application processor;

Audio decoder, for loading software algorithms or providing analog-to-digital or digital-to-analog conversion, and providing interfaces for connecting audio peripheral devices;

Sensor hub, for monitoring the working state of each peripheral device and for loading and running various software algorithms;

Microphone, for capturing the human voice signal and transmitting it to the sensor hub;

The microphone, sensor hub, audio decoder, application processor and file system are connected in sequence, and the power management module is connected to, and supplies power to, the sensor hub, the audio decoder and the application processor respectively.

In the device for automatically configuring sound effects, the application processor handles screen display, image processing, audio encoding and decoding, the startup and shutdown of system-level and application-level applications, and read/write interaction with peripheral devices, and stores the acquired data in the file system.

In the device for automatically configuring sound effects, the audio decoder specifically comprises: an audio processor unit for loading software algorithms, and a decoder unit for providing analog-to-digital or digital-to-analog conversion;

The sensor hub specifically comprises: a low-power processor unit, which monitors the working state of each peripheral device and is responsible for loading and running various software algorithms, and a memory unit for storage.

In the device for automatically configuring sound effects, the audio decoder and the sensor hub module are connected by an I2S bus.

A mobile terminal, comprising: a processor and a memory communicatively connected to the processor, wherein the memory stores a computer program which, when executed, implements the method for automatically configuring sound effects; the processor is configured to call the computer program in the memory to implement the method for automatically configuring sound effects.

A storage device, wherein the storage device stores a computer program that can be executed to implement the method described above.

The invention discloses a method, a device, a mobile terminal and a storage device for automatically configuring sound effects. The method includes: the mobile terminal stores in advance a voice command recorded by the user, the voice command serving as a voice signal that is analyzed by a voiceprint algorithm to determine the user's identity; when the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, the identity of the current user is recognized and determined by the voiceprint algorithm; when the mobile terminal has identified the user and a command to play music is triggered, the mobile terminal automatically sets the listening style of the music, or a personalized playlist, according to the current user's listening preferences. By means of custom voice commands, executing a voice command to play music not only starts the related music application but also submits the voice signal to the voiceprint algorithm for user identification, so that sound effects matching the user's preferences are configured and loaded automatically according to the user's identity.

Brief description of the drawings

Fig. 1 is a flow chart of a preferred embodiment of the method for automatically configuring sound effects according to the present invention.

Fig. 2 is a structural diagram of the device for automatically configuring sound effects according to the present invention.

Fig. 3 is a functional block diagram of the mobile terminal for automatically configuring sound effects according to the present invention.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the present invention clearer and more definite, the present invention is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.

As shown in Fig. 1, the method for automatically configuring sound effects according to the preferred embodiment of the present invention comprises the following steps:

Step S100, the mobile terminal stores in advance a voice command recorded by the user; the voice command serves as a voice signal that is analyzed by a voiceprint algorithm to determine the user's identity.

Further, step S100 specifically comprises:

Step S101, the mobile terminal receives a voice command prerecorded by the user, completing the customization of the voice command;

Step S102, when the mobile terminal is started or a related application is opened, the voice command is analyzed by the voiceprint algorithm to determine the user's identity;

Step S103, the mobile terminal accepts operations that add multiple different voice commands and, on receiving an instruction to modify a preset voice command, updates the corresponding voice command.

Specifically, the hub used in the present invention includes a low-power main control unit with enough ROM to load the speech recognition algorithm. The training flow for custom keywords is placed on the application processor, because training does not need to be carried out frequently. The user can train a voice command step by step through the user interface, and the result is stored in the processor's file system. Finally, the user issues an application instruction that updates the firmware of the speech recognition algorithm on the main control module of the sensor hub, thereby realizing the custom voice command function.

Step S200, when the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, the identity of the current user is recognized and determined by the voiceprint algorithm.

Further, step S200 specifically comprises:

Step S201, when the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, the user's voice command is received through the microphone;

Step S202, the voice command is analyzed by the voiceprint algorithm loaded in the audio decoder, and the analysis result is matched against the previously stored voice command to recognize and determine the user's identity information.

In a specific implementation, speech recognition technology mainly covers three aspects: feature extraction, pattern matching criteria and model training. Depending on the object to be recognized, speech recognition tasks fall roughly into three classes: isolated word recognition, keyword spotting (also called keyword detection) and continuous speech recognition. Isolated word recognition recognizes isolated words known in advance, such as "power on" or "power off". Continuous speech recognition recognizes arbitrary continuous speech, such as a sentence or a passage. Keyword spotting also operates on a continuous speech stream, but it does not recognize every word; it only detects where certain known keywords occur, for example detecting the two words "computer" and "world" in a passage.
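The difference between keyword spotting and full recognition can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an upstream recognizer has already turned the speech stream into word tokens; a real keyword spotter works on acoustic features instead. The function name `spot_keywords` and the sample sentence are hypothetical.

```python
def spot_keywords(words, keywords):
    """Return (position, word) pairs for each known keyword in a word stream.

    Every other word is ignored, which is what separates keyword spotting
    from recognizing the whole utterance.
    """
    wanted = {k.lower() for k in keywords}
    return [(i, w) for i, w in enumerate(words) if w.lower() in wanted]


# Hypothetical utterance containing the two keywords used in the text.
utterance = "my computer can show the whole world in one window".split()
hits = spot_keywords(utterance, ["computer", "world"])
```

Here `hits` records only where "computer" and "world" occur; the remaining words of the passage are never interpreted.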

Depending on the speaker, speech recognition can be divided into speaker-dependent and speaker-independent recognition. The former can only recognize the speech of one or a few people, whereas the latter can be used by anyone. Speaker-independent systems clearly correspond better to practical needs, but they are much harder to build than speaker-dependent ones.

In addition, depending on the speech device and channel, speech recognition can be divided into desktop (PC) speech recognition, telephone speech recognition and embedded-device (mobile phone, PDA, etc.) speech recognition. Different acquisition channels distort the acoustic characteristics of human speech, so a separate recognition system must be built for each.

The present invention mainly uses the speech recognition solution of an established speech technology provider to implement the smart device system function. The speech recognition function can be improved through training, and the requirements on system processing speed are low: the MCU clock frequency exceeds 100 MHz, the RAM is larger than 64 KB, and the offline vocabulary can be trimmed to fit the MCU's ROM.

The voiceprint algorithm loaded in the audio decoder analyzes the voice command as follows: speech detection is performed first, followed by noise suppression and feature extraction; voiceprint matching is then performed after voiceprint confirmation, and the user's identity information is recognized and determined.
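The four stages above (speech detection, noise suppression, feature extraction, voiceprint matching) can be sketched as a toy Python pipeline. This is not the voiceprint algorithm of the invention, which the text does not specify. The sketch assumes a 16 kHz sample rate, energy-based speech detection, a 3-point moving average as noise suppression, a two-value feature vector (mean energy and zero-crossing rate) and a Euclidean distance threshold; all names and constants are hypothetical, and synthetic tones stand in for recorded voice commands.

```python
import math

RATE = 16000   # assumed sample rate in Hz
FRAME = 160    # 10 ms frames at that rate


def split_frames(signal):
    return [signal[i:i + FRAME] for i in range(0, len(signal) - FRAME + 1, FRAME)]


def energy(frame):
    return sum(s * s for s in frame) / len(frame)


def detect_speech(signal, threshold=0.01):
    """Stage 1: keep only frames whose mean energy exceeds the threshold."""
    return [f for f in split_frames(signal) if energy(f) > threshold]


def suppress_noise(frame):
    """Stage 2: crude noise suppression with a 3-point moving average."""
    out = []
    for i in range(len(frame)):
        window = frame[max(0, i - 1):i + 2]
        out.append(sum(window) / len(window))
    return out


def extract_features(voiced):
    """Stage 3: mean energy and mean zero-crossing rate over voiced frames."""
    def zcr(f):
        return sum(1 for a, b in zip(f, f[1:]) if a * b < 0) / (len(f) - 1)
    n = len(voiced)
    return (sum(energy(f) for f in voiced) / n, sum(zcr(f) for f in voiced) / n)


def voiceprint(signal):
    voiced = [suppress_noise(f) for f in detect_speech(signal)]
    return extract_features(voiced) if voiced else None


def identify(signal, enrolled, max_distance=0.05):
    """Stage 4: return the nearest enrolled user, or None if nobody is close."""
    probe = voiceprint(signal)
    if probe is None:
        return None
    name, template = min(enrolled.items(), key=lambda kv: math.dist(probe, kv[1]))
    return name if math.dist(probe, template) <= max_distance else None


def tone(freq, amp, n=1600):
    """Synthetic stand-in for a recorded voice command."""
    return [amp * math.sin(2 * math.pi * freq * i / RATE + 0.3) for i in range(n)]


# Enrollment: store one template per user (hypothetical users).
enrolled = {"alice": voiceprint(tone(200, 0.5)), "bob": voiceprint(tone(1500, 0.5))}
silence = [0.0] * 320
who = identify(silence + tone(200, 0.48), enrolled)
```

The leading silence is dropped by the detection stage, and the slightly quieter probe still lands closest to the first enrolled template. A practical system would use far richer features and a trained model; only the ordering of the stages is taken from the text.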

Step S300, when the mobile terminal has identified the user and a command to play music is triggered, the mobile terminal automatically sets the listening style of the music, or a personalized playlist, according to the current user's listening preferences.

Further, step S300 specifically comprises:

Step S301, after the user's identity information has been identified, the mobile terminal receives the user's operation instruction to play music;

Step S302, the sound effect parameters are updated according to the current user's listening preferences, and the listening style of the music or a personalized playlist is configured automatically.
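Steps S301 and S302 amount to looking up stored preferences by the identified user and applying them when playback is requested. The following Python sketch shows one possible shape for that lookup. The profile fields (`eq_preset`, `bass_boost_db`, `playlist`), the `MusicPlayer` class and the fallback to a default profile are all assumptions made for illustration; the text does not define how preferences are stored.

```python
from dataclasses import dataclass, field


@dataclass
class ListeningProfile:
    """Hypothetical per-user listening preferences."""
    eq_preset: str
    bass_boost_db: float = 0.0
    playlist: list = field(default_factory=list)


DEFAULT = ListeningProfile(eq_preset="flat")


class MusicPlayer:
    def __init__(self, profiles):
        self.profiles = profiles   # user id -> ListeningProfile
        self.active = DEFAULT

    def on_play_command(self, user_id):
        """Load the identified user's profile when playback is triggered.

        Unidentified users fall back to the default profile (an assumption).
        """
        self.active = self.profiles.get(user_id, DEFAULT)
        return self.active


player = MusicPlayer({
    "alice": ListeningProfile("rock", 4.0, ["Song A", "Song B"]),
    "bob": ListeningProfile("classical"),
})
applied = player.on_play_command("alice")
```

With this arrangement the user never opens a settings screen: the identity produced by the voiceprint stage selects the sound effect parameters and the playlist in a single step.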

In a specific implementation, because the present invention supports custom voice recognition commands, the user can wake the machine with a voice recognition command and then start an application, for example a music playback application. Each user has different tastes in music and different hearing ability, and therefore prefers different sound effect settings. Listening style is generally set manually in the settings of the music application. With the present invention, when the user performs a low-power wake-up of the device and starts the application associated with the keyword, the spoken voice command is recorded and the voice signal is sent to the voiceprint algorithm for analysis, which identifies the current user. The sound effects matching the current user's listening preferences, or a personalized playlist filtered to the user's liking, are then set automatically. Several existing steps are thus replaced by a single step, giving a more intelligent and user-friendly listening experience.

Through its hardware design, the present invention needs only 30% of the total power consumption of current comparable products and thereby achieves a very long standby time. The MCU control unit is removed and the custom voice command function is placed on the mobile terminal; only a low-power DSP (Digital Signal Processing) and a low-power analog microphone dedicated to speech recognition are used. This removes the earlier dependence on complex mobile-terminal hardware platforms, equipment and products for realizing intelligent functions. The cost is lower, the result is more practical, longer in standby and more convenient, and the design can also be used in further fields such as simple smart devices, wristbands and smart homes.

The present invention associates the currently popular low-power wake-up function and custom voice commands with different applications, and uses voiceprint technology to identify the speaker from the voice signal. Once the identity of the user currently using the device has been determined, triggering the command to play music automatically configures the listening style of the music according to the current user's preferences. Operations that previously took several steps become simpler, making listening more intelligent and user-friendly.

The present invention also provides a device for automatically configuring sound effects. As shown in Fig. 2, the device includes:

Application processor 10, for interacting with upper-layer applications and with the user interface;

File system 11, for storing the data exchanged between the user and peripheral devices, and providing the interface for read, write and storage operations on application data;

Power management module 12, for supplying power to each external device connected to the application processor, and managed and monitored by the application processor 10;

Audio decoder 13, for loading software algorithms or providing analog-to-digital or digital-to-analog conversion, and providing interfaces for connecting audio peripheral devices;

Sensor hub 14, for monitoring the working state of each peripheral device and for loading and running various software algorithms;

Microphone 15, for capturing the human voice signal and transmitting it to the sensor hub;

The microphone 15, sensor hub 14, audio decoder 13, application processor 10 and file system 11 are connected in sequence, and the power management module 12 is connected to, and supplies power to, the sensor hub 14, the audio decoder 13 and the application processor 10 respectively.

Specifically, the application processor 10 handles screen display, image processing, audio encoding and decoding, the startup and shutdown of system-level and application-level applications, and read/write interaction with peripheral devices, and stores the acquired data in the file system 11 for reading and writing. It manages the other peripheral devices in a unified way on the basis of a multitasking system, scheduling and running tasks according to the priority of each system task, so that the whole system works in an orderly manner.

The file system 11 stores the data exchanged between the user and peripheral devices, provides the interface for read, write and storage operations on application data, and is the module that manages all system data in the form of files.

The power management module 12 mainly supplies power to each external device connected to the application processor 10, and is managed and monitored by the application processor 10. To better reduce the power consumption of the whole system, the application processor 10, on the basis of the operating system, manages the currently running devices and applications in a unified way. The management policy takes into account priority, device usage, resource consumption, memory and similar conditions, ensuring that every application and hardware device works normally and is powered appropriately.

The audio decoder 13 specifically comprises an audio processor unit 131 for loading software algorithms and a decoder unit 132 for providing analog-to-digital or digital-to-analog conversion. The audio processor unit 131 is a processor used mainly to load software algorithms, such as the voiceprint analysis algorithm of the present invention. The decoder unit, also called the CODEC unit, provides analog-to-digital or digital-to-analog conversion and the interfaces for connecting audio peripheral devices, such as the microphone interface, the loudspeaker amplifier interface and the earphone interface.

The present invention connects the audio decoder 13 to the sensor hub 14 through an I2S bus. I2S (Inter-IC Sound), also called the inter-IC audio bus, is a bus standard defined by Philips for transmitting audio data between digital audio devices. The bus is dedicated to data transmission between audio devices and is widely used in multimedia systems. It carries the clock and data signals on separate wires; separating data from clock avoids distortion caused by timing differences and saves users the cost of professional equipment for countering audio jitter.

The sensor hub 14 specifically comprises a low-power processor unit 141, which monitors the working state of each peripheral device and is responsible for loading and running various software algorithms, and a memory unit 142 for storage. The low-power processor unit 141 mainly coordinates the peripheral devices of the module, monitors their working state, and loads and runs the software algorithms of the unit. In the present invention it mainly loads and updates the speech recognition firmware sent from the application processor 10.

The specific workflow of the device is as follows. External sound enters through the external input device and is analyzed by the computation module of the low-power processor unit in the sensor hub, which judges whether a keyword matches. A different interrupt command is then sent automatically to the application processor through the INT1 pin, and the path to the audio decoder module is opened. The application processor is in a low-power standby state at this point, but the kernel process running on it monitors in real time any hardware interrupt signal that could wake the whole system.

When a new hardware interrupt signal is detected, its priority and type are determined. The corresponding BootLoader then initializes the external device hardware on the application processor, powers it, sets the corresponding clocks and mounts the file system, and the saved system information is read, such as the data read or written by the previous task, or the data that currently needs to be displayed is fetched from the file system. Because the sensor hub can send different interrupt responses, and the file system of the application processor stores the application to be started for each of them, the matching application is started. After the application has started, the voice signal collected by the sensor hub can be sent to the audio decoder over the I2S bus.
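The wake-up sequence above is essentially a dispatch from an interrupt code to a stored startup application. The Python sketch below models that dispatch only; it is a simplified illustration, not driver code. The interrupt codes, the application names and the contents of the mapping are hypothetical, since the text says only that different interrupts exist and that the file system stores the application associated with each.

```python
from enum import IntEnum


class WakeInterrupt(IntEnum):
    """Hypothetical interrupt codes the sensor hub might raise on INT1."""
    WAKE_DEVICE = 1
    PLAY_MUSIC = 2


# Assumed mapping, standing in for what the file system stores.
STARTUP_APPS = {
    WakeInterrupt.WAKE_DEVICE: "home_screen",
    WakeInterrupt.PLAY_MUSIC: "music_player",
}


def handle_interrupt(code, log):
    """Record the wake-up order described in the text, then pick the app."""
    try:
        irq = WakeInterrupt(code)
    except ValueError:
        return None  # unknown wake source: remain in standby
    log.extend(["init_hardware", "power_on", "set_clocks", "mount_filesystem"])
    app = STARTUP_APPS[irq]
    log.append(f"start:{app}")
    return app


events = []
started = handle_interrupt(2, events)
```

Keeping the mapping in data rather than in code mirrors the description, in which the association between voice commands and applications can be changed without altering the wake-up logic.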

Because the present invention uses a digital microphone, no conversion from analog to digital signals is needed; the signal is sent directly to the digital processor unit in the audio decoder, which is loaded with the voiceprint algorithm. After the voice has been analyzed by this algorithm, the result is sent to the application processor for comparison. When an identity is enrolled for the first time, the application processor interacts with the user and stores the original voiceprint matching signal in the file system. Once the user's information has been identified, the sound effect parameters are updated automatically, thereby achieving the purpose of the present invention.

The algorithm of the low-power processor's computation unit is updated when the user first customizes a voice command. The device supports 4 voice commands at the same time, which can be customized one after another, in order, through custom software. Because the hardware is designed for low power, because users do not update commands very often, and because the interface for updating voice commands should be user-friendly, the custom voice is recorded by the mobile terminal. The mobile terminal opens a voice training application, and a new voice command is captured through the terminal's microphone under button control. A voice packing algorithm converts the voice signal into voice feature code data and fills it into the stored wake-up algorithm firmware. The speech recognition algorithm on the low-power processor is then erased, and the updated wake-up algorithm is written back into the DSP computation module through the data bus and control bus, thereby updating the voice instruction.
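The update cycle described above (fill feature codes into a command slot, rebuild the wake-up firmware, erase, rewrite) can be modeled in a few lines of Python. Only the limit of 4 simultaneous commands comes from the text. The image layout (a count byte followed by length-prefixed feature codes), the class name `WakeFirmware` and the in-memory "flash" are assumptions made for illustration; feature codes are limited to 255 bytes here because of the one-byte length prefix.

```python
MAX_COMMANDS = 4  # the text states that 4 voice commands are supported at once


class WakeFirmware:
    """Toy model of the wake-up algorithm firmware with four command slots."""

    def __init__(self):
        self.slots = [None] * MAX_COMMANDS
        self.flashed = None   # image currently "running" on the DSP

    def set_command(self, slot, feature_codes):
        """Store the feature code data of one custom voice command."""
        if not 0 <= slot < MAX_COMMANDS:
            raise IndexError("slot out of range")
        self.slots[slot] = bytes(feature_codes)

    def build_image(self):
        """Pack used slots: one count byte, then length-prefixed codes."""
        used = [s for s in self.slots if s is not None]
        image = bytearray([len(used)])
        for codes in used:
            image += bytes([len(codes)]) + codes
        return bytes(image)

    def reflash(self):
        """Erase the old algorithm image, then write the updated one."""
        self.flashed = None
        self.flashed = self.build_image()
        return self.flashed


fw = WakeFirmware()
fw.set_command(0, [1, 2, 3])   # hypothetical feature codes for command 1
fw.set_command(1, [9])         # hypothetical feature codes for command 2
image = fw.reflash()
```

Rebuilding the whole image on each change suits the stated assumption that commands are updated rarely: the low-power side only ever receives a complete, consistent firmware.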

The present invention also provides a mobile terminal. As shown in Fig. 3, the mobile terminal includes: a processor 10, a memory 20, a communication interface 30 and a bus 40, wherein:

The processor 10, the memory 20 and the communication interface 30 communicate with one another through the bus 40;

The communication interface 30 is used for information transmission between the communication devices of the mobile terminal;

The processor 10 is configured to call the computer program in the memory 20 to perform the methods provided by the above method embodiments, for example including: the mobile terminal stores in advance a voice command recorded by the user, the voice command serving as a voice signal that is analyzed by a voiceprint algorithm to determine the user's identity; when the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, the identity of the current user is recognized and determined by the voiceprint algorithm; when the mobile terminal has identified the user and a command to play music is triggered, the mobile terminal automatically sets the listening style of the music, or a personalized playlist, according to the current user's listening preferences.

The present invention also provides a storage device, wherein the storage device stores a computer program that can be executed to implement the method for automatically configuring sound effects described above.

In summary, the present invention provides a method, a device, a mobile terminal and a storage device for automatically configuring sound effects. The method includes: the mobile terminal stores in advance a voice command recorded by the user, the voice command serving as a voice signal that is analyzed by a voiceprint algorithm to determine the user's identity; when the mobile terminal receives an operation instruction in which the user, by voice command, performs a low-power wake-up of the device or starts a music playback application, the identity of the current user is recognized and determined by the voiceprint algorithm; when the mobile terminal has identified the user and a command to play music is triggered, the mobile terminal automatically sets the listening style of the music, or a personalized playlist, according to the current user's listening preferences. By means of custom voice commands, executing a voice command to play music not only starts the related music application but also submits the voice signal to the voiceprint algorithm for user identification, so that sound effects matching the user's preferences are configured and loaded automatically according to the user's identity.

Of course, a person of ordinary skill in the art will understand that all or part of the flows in the above method embodiments can be completed by a computer program instructing the relevant hardware (such as a processor or controller). The program can be stored in a computer-readable storage medium and, when executed, may include the flows of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disc, or the like.

It should be understood that the application of the present invention is not limited to the above examples. A person of ordinary skill in the art can make improvements or variations based on the above description, and all such improvements and variations shall fall within the scope of protection of the appended claims of the present invention.

Claims

1. A method for automatically configuring sound effects, characterized in that the method comprises the following steps:

2. The method for automatically configuring sound effects according to claim 1, characterized in that step A specifically comprises:

3. The method for automatically configuring sound effects according to claim 1, characterized in that step B specifically comprises:

4. The method for automatically configuring sound effects according to claim 1, characterized in that step C specifically comprises:

5. A device for automatically configuring sound effects, characterized in that the device comprises:

6. The device for automatically configuring sound effects according to claim 5, characterized in that the application processor handles screen display, image processing, audio encoding and decoding, the startup and shutdown of system-level and application-level applications, and read/write interaction with peripheral devices, and stores the acquired data in the file system.

7. The device for automatically configuring sound effects according to claim 5, characterized in that the audio decoder specifically comprises: an audio processor unit for loading software algorithms, and a decoder unit for providing analog-to-digital or digital-to-analog conversion;

8. The device for automatically configuring sound effects according to claim 5, characterized in that the audio decoder and the sensor hub module are connected by an I2S bus.

9. A mobile terminal, characterized by comprising: a processor and a memory communicatively connected to the processor, wherein the memory stores a computer program which, when executed, implements the method according to any one of claims 1-4; the processor is configured to call the computer program in the memory to implement the method according to any one of claims 1-4.

10. A storage device, characterized in that the storage device stores a computer program that can be executed to implement the method according to any one of claims 1-4.