CN202855297U - Background music control device based on expression - Google Patents

Background music control device based on expression

Info

Publication number
CN202855297U
CN202855297U CN201220371686U
Authority
CN
China
Prior art keywords
expression
background music
image
micro
image acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201220371686
Other languages
Chinese (zh)
Inventor
郭雷 (Guo Lei)
陈智慧 (Chen Zhihui)
赵天云 (Zhao Tianyun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN 201220371686 priority Critical patent/CN202855297U/en
Application granted granted Critical
Publication of CN202855297U publication Critical patent/CN202855297U/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The utility model discloses a background music control device based on facial expression. The device comprises a power supply unit, an image acquisition unit, a main processor (DSP), a storage unit, and a background music adjusting unit. The main processor (DSP) is connected to the image acquisition unit, the storage unit, and the background music adjusting unit, and the power supply unit is connected to all of these units. The device identifies the user's facial expression and infers the user's mood from it, then adjusts the background-music mode accordingly, achieving interaction between the home environment and the user; this makes the home more humanized and improves the user experience. The system is simple to operate, convenient to use, and highly practical.

Description

Background music control device based on expression
Technical field
The present invention relates to the field of video image processing and control technology, and in particular to a background music control device based on facial expression.
Background technology
At present, to create a comfortable and warm living atmosphere, background music has become a very important part of the modern home, and background music control systems are widely installed; such systems can effectively mask environmental noise and create a relaxed, comfortable environment. However, current household background music control systems can only adjust the volume and select songs manually. They cannot automatically adjust the volume, select songs, or change the sound effect according to a person's mood, and therefore cannot match the user's emotional state.
Summary of the invention
To solve the above problem, the object of the present invention is to provide a background music control device based on facial expression that can automatically adjust the sound effect and select songs according to a person's mood.
A background music control device based on expression, characterized by comprising a power supply unit, an image acquisition unit, a main processor (DSP), a storage unit, a power amplifier, a sound effect processor, an MP3 decoder, and a micro-control unit (MCU). The main processor (DSP) is connected to the image acquisition unit, the storage unit, and the MCU respectively; the MCU is connected in sequence to the MP3 decoder, the sound effect processor, and the power amplifier, and is also connected directly to the sound effect processor. The power supply unit is connected to all of the above units and provides their working power.
The image acquisition unit employs a CMOS image sensor.
The main processor (DSP) and the MCU are connected via an RS-485 bus, with data transmitted at a baud rate of 9600 bps.
The present invention is simple to operate and easy to use, requiring no cumbersome operation. By detecting the user's facial expression, it can automatically adjust the music mode of the household background music, realizing humanized interaction in the background music control system. It is also extensible: multiple background music adjusting units can be controlled over the RS-485 bus. The device is highly practical.
Description of drawings
Fig. 1 is system hardware structure figure of the present invention.
Embodiment
As shown in Fig. 1, a background music control device based on expression according to the present invention is composed of a power supply unit, an image acquisition unit, a main processor (DSP), a storage unit, and a background music adjusting unit. The main processor (DSP) is connected to the image acquisition unit, the storage unit, and the background music adjusting unit respectively; the power supply unit is connected to the image acquisition unit, the main processor (DSP), the storage unit, and the background music adjusting unit.
The main processor (DSP) and the background music adjusting unit are connected via an RS-485 bus.
The main processor (DSP) uses TI's TMS320DM6467T chip.
The image acquisition unit uses a CMOS image sensor.
The storage unit consists of memory devices and chips, including a synchronous DRAM (SDRAM) and a flash chip; the SDRAM stores intermediate image data, and the flash chip stores the program. The background music adjusting unit consists of background-music control components and chips, including a micro-control unit (MCU), an MP3 decoder, a sound effect processor, a power amplifier, and speakers.
The control process is as follows:
The image acquisition unit captures images of the user's face, encodes the collected 24-bit true-color images in a given picture format, and stores the encoded images in the SDRAM. The main processor (DSP) first converts the captured image to grayscale according to the system program, using the weighted-average method:

f(i, j) = 0.30·R(i, j) + 0.59·G(i, j) + 0.11·B(i, j)

where f(i, j) is the gray value of the converted grayscale image at (i, j), and R(i, j), G(i, j), and B(i, j) are the R, G, and B component values of the original image at (i, j).
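The weighted-average graying step can be sketched as follows (a minimal illustration with NumPy; the function and variable names are ours, not from the patent):

```python
import numpy as np

def rgb_to_gray(img):
    """Weighted-average graying per the formula above:
    f(i,j) = 0.30*R(i,j) + 0.59*G(i,j) + 0.11*B(i,j).
    `img` is an H x W x 3 uint8 RGB array."""
    r = img[..., 0].astype(np.float64)
    g = img[..., 1].astype(np.float64)
    b = img[..., 2].astype(np.float64)
    gray = 0.30 * r + 0.59 * g + 0.11 * b
    return gray.astype(np.uint8)  # truncate back to 8-bit gray

# Example: a single pure-red pixel maps to 0.30 * 255 = 76 (truncated)
pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
```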
The integral-image algorithm is used to extract Haar features from the grayscale image, and a cascade classifier applied to these Haar features detects the face region in the image. The cascade classifier is obtained by training several simple classifiers on the face Haar features of sample images with the AdaBoost algorithm and then combining them into a cascade.
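The patent trains its own AdaBoost cascade; as a minimal sketch of the underlying mechanism only, the integral image below lets any rectangle sum, and hence any Haar-like feature, be evaluated from four corner lookups in constant time (function names are ours):

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with a zero row/column prepended, so
    rect_sum needs no boundary checks."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(gray, axis=0), axis=1)
    return ii

def rect_sum(ii, top, left, h, w):
    """Sum of gray values in the h x w rectangle at (top, left),
    computed from 4 lookups of the integral image."""
    return (ii[top + h, left + w] - ii[top, left + w]
            - ii[top + h, left] + ii[top, left])

def haar_two_rect_vertical(ii, top, left, h, w):
    """A simple two-rectangle Haar-like feature: upper half minus
    lower half (h must be even)."""
    half = h // 2
    return (rect_sum(ii, top, left, half, w)
            - rect_sum(ii, top + half, left, half, w))
```

On a patch whose top half is bright and bottom half dark, this feature is large and positive; a cascade thresholds many such features in sequence.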
After the face region is detected, the approximate eye regions are estimated from the proportional structure of the face and framed with search windows. Let the height of the face image be h and its width be u, with the origin at the upper-left corner. In our experiments, the origin coordinates of the two windows are: left eye (formula given as image BDA00001949542300021, not reproduced); right eye (formula given as image BDA00001949542300022, not reproduced). The window size is given as image BDA00001949542300023 (not reproduced).
Then, using the fact that the pupils and eyebrows are the darkest features within the window, a histogram analysis is performed on the window region: the darkest 5% of pixels are kept, and the gray value of the remaining pixels is set to 255. After this thresholding step, the eyes and eyebrows are clearly segmented. Next, a horizontal projection of the window image is computed, with projection function

pv(y) = Σ_{x=1}^{N} I(x, y)

where I(x, y) is the gray value at point (x, y) and N is the number of projected pixels. This yields a one-dimensional curve with two obvious troughs, corresponding to the eyebrow region and the eye region respectively. The vertical coordinate of the eyes is obtained by one-dimensional signal processing. The gray value of the eyebrow region is then set to 255 to remove the eyebrows, and a vertical projection curve is computed with projection function

pv(x) = Σ_{y=1}^{N} I(x, y)

where I(x, y) and N are as above. This determines the horizontal coordinate of the eyes.
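The darkest-5% thresholding and projection-trough search above can be sketched as follows (NumPy, with names of our own choosing; a real eye-window image replaces the synthetic one):

```python
import numpy as np

def darkest_fraction_mask(window, frac=0.05):
    """Keep the darkest `frac` of pixels and set the rest to 255,
    as in the patent's histogram thresholding step."""
    thresh = np.percentile(window, frac * 100)
    out = np.full_like(window, 255)
    out[window <= thresh] = window[window <= thresh]
    return out

def horizontal_projection(window):
    """pv(y) = sum over x of I(x, y): one value per image row."""
    return window.sum(axis=1)

def trough_rows(window, k=2):
    """Rows with the k smallest projection values; dark features
    (eyebrows, eyes) produce troughs in the projection curve."""
    pv = horizontal_projection(window)
    return np.argsort(pv)[:k]
```

A vertical projection (`window.sum(axis=0)`) locates the horizontal eye coordinate the same way once the eyebrow rows are whitened out.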
Rotation correction: once the eye positions are determined, the midpoint of the line connecting the two eyes is computed and taken as the origin of a face-image coordinate system. The face image is rotated so that the eyes are level, straightening the face image.
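A minimal sketch of the rotation-correction geometry (the patent does not name a resampling routine; applying the rotation with, e.g., OpenCV's warpAffine around the midpoint is our assumption):

```python
import math

def eye_angle_degrees(left_eye, right_eye):
    """Angle of the line joining the eye centers, in degrees.
    Rotating the image by -angle about the midpoint levels the eyes."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def eye_midpoint(left_eye, right_eye):
    """Midpoint of the two eye centers: the rotation origin."""
    return ((left_eye[0] + right_eye[0]) / 2.0,
            (left_eye[1] + right_eye[1]) / 2.0)
```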
Scale normalization: after the distance between the two eye centers is obtained, the position of the mouth is determined, giving the vertical distance from the eye midpoint to the mouth. These two distances are used to cut the main part of the face out of the face-image coordinate system and scale-normalize it to a common size. The mouth position is first estimated roughly from facial proportions: a mouth window is taken from the face image, with origin coordinates given as image BDA00001949542300033 (not reproduced) and window size given as image BDA00001949542300034 (not reproduced). A horizontal projection of the window region is computed, and the minimum of the resulting curve gives the vertical coordinate of the mouth. Let the distance between the two eyes be w and the distance from the eye midpoint to the mouth be h. Centered on the eye midpoint, the crop extends left and right by an amount given as image BDA00001949542300035, upward by an amount given as image BDA00001949542300036, and downward by an amount given as image BDA00001949542300037 (none reproduced), cutting out the main part of the face. The cropped face image is scaled to a common size; in this system, 100 × 100 pixels.
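The crop-and-scale step can be sketched as follows; nearest-neighbor sampling stands in for the unspecified scaling method (an assumption), and the crop offsets would come from the eye/mouth geometry above:

```python
import numpy as np

def crop_and_resize(gray, top, left, height, width, out_size=100):
    """Cut the face region and scale it to out_size x out_size with
    nearest-neighbor sampling (stand-in for the unspecified scaler)."""
    face = gray[top:top + height, left:left + width]
    rows = np.arange(out_size) * face.shape[0] // out_size
    cols = np.arange(out_size) * face.shape[1] // out_size
    return face[np.ix_(rows, cols)]
```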
Gray-level normalization: regard the face image as a two-dimensional matrix M[w][h] of size w × h. The mean of the image is

μ = (1 / (w·h)) · Σ_{i=0}^{w−1} Σ_{j=0}^{h−1} M[i][j]

and the variance of the image is

σ² = (1 / (w·h)) · Σ_{i=0}^{w−1} Σ_{j=0}^{h−1} (M[i][j] − μ)²

The face image is then gray-normalized so that its mean and variance become preset values μ₀ and σ₀ (the normalization formula itself appears only as an image in the original and is not reproduced here). By this method, all normalized face images have the same mean and variance.
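Because the normalization formula is given only as an image in the original, the sketch below uses the standard linear mean/variance normalization, which is an assumption consistent with the surrounding text (μ₀ = 128 and σ₀ = 40 are illustrative values, not from the patent):

```python
import numpy as np

def gray_normalize(img, mu0=128.0, sigma0=40.0):
    """Map the image so its mean becomes mu0 and its standard
    deviation sigma0. Assumed formula (not from the patent):
    M'(i,j) = mu0 + (M(i,j) - mu) * sigma0 / sigma."""
    mu = img.mean()
    sigma = img.std()
    return mu0 + (img - mu) * (sigma0 / sigma)
```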
A two-dimensional discrete cosine transform (2D-DCT) is applied to the normalized face image to extract frequency-domain expression features; the low-frequency coefficients in the upper-left corner of the transform form the observation vectors.
A traversal method is used: a sampling window of width P and length L slides over the normalized face-image plane from left to right and top to bottom, producing sampled image blocks. Each image block is transformed with the following 2D-DCT formula:
C(u, v) = a(u) · a(v) · Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) · cos((2x+1)uπ / (2M)) · cos((2y+1)vπ / (2N))

(u = 0, 1, 2, ..., M−1; v = 0, 1, 2, ..., N−1)

where C(u, v) is the result of the 2D-DCT, i.e. the 2D-DCT coefficient, and a(u) and a(v) are defined as:

a(u) = √(1/M) for u = 0; a(u) = √(2/M) for u = 1, 2, ..., M−1
a(v) = √(1/N) for v = 0; a(v) = √(2/N) for v = 1, 2, ..., N−1
A sampling window of P × L = 16 × 16 with a sliding step of 4 × 4 is used, and M = N = 8, i.e. an 8 × 8 2D-DCT is applied; the 4 × 4 low-frequency block of the transform coefficients is taken as the observation vector for the embedded hidden Markov model (EHMM).
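The 8 × 8 2D-DCT and the 4 × 4 low-frequency observation vector can be sketched directly from the formula above (a naive O((MN)²) implementation, adequate for 8 × 8 blocks; names are ours):

```python
import numpy as np

def dct2(block):
    """Direct 2D-DCT per the formula above. Naive double loop:
    fine for 8 x 8 blocks, not meant for large images."""
    M, N = block.shape
    x = np.arange(M)
    y = np.arange(N)
    a_u = np.where(np.arange(M) == 0, np.sqrt(1.0 / M), np.sqrt(2.0 / M))
    a_v = np.where(np.arange(N) == 0, np.sqrt(1.0 / N), np.sqrt(2.0 / N))
    C = np.zeros((M, N))
    for u in range(M):
        for v in range(N):
            cu = np.cos((2 * x + 1) * u * np.pi / (2 * M))[:, None]
            cv = np.cos((2 * y + 1) * v * np.pi / (2 * N))[None, :]
            C[u, v] = a_u[u] * a_v[v] * (block * cu * cv).sum()
    return C

def observation_vector(block):
    """4 x 4 low-frequency corner of the 8 x 8 DCT, flattened:
    the EHMM observation vector described in the text."""
    return dct2(block)[:4, :4].ravel()
```

For a constant block, all energy lands in the DC coefficient C(0, 0), which is why the upper-left corner captures the coarse (low-frequency) appearance of each patch.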
After the observation vectors are extracted with the 2D-DCT, the likelihood of producing the observation sequence is computed under the embedded hidden Markov model (EHMM) of each of the three expressions, and the model with the highest probability is selected; this identifies the expression of the face in the captured image. After obtaining the expression recognition result, the main processor (DSP) sends the result signal over the RS-485 bus at 9600 bps to the micro-control unit (MCU) of the background music adjusting unit. According to the received result signal, the MCU calls the corresponding program module and adjusts the MP3 decoder and sound effect processor to obtain the desired music effect.
The system builds an embedded hidden Markov model (EHMM) for each of three expressions: "happy", "normal", and "sad". "Happy" covers the basic expressions of surprise and happiness; "sad" covers the basic expressions of anger and sadness; "normal" refers to the remaining expressions outside those definitions among the six basic expressions.
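The grouping of the six basic expressions into the three model classes can be written as a small lookup table; assigning fear and disgust to "normal" is our reading, since the text only says "normal" covers expressions outside "happy" and "sad":

```python
# Grouping of the six basic expressions into the three EHMM classes,
# as described in the text (label strings are our own).
EXPRESSION_CLASS = {
    "surprise": "happy",
    "happiness": "happy",
    "anger": "sad",
    "sadness": "sad",
    "fear": "normal",     # remaining basic expressions -> "normal"
    "disgust": "normal",  # (our reading of the text)
}

def classify_basic_expression(basic):
    """Map a basic-expression label to the model class it belongs to;
    anything unrecognized falls back to "normal"."""
    return EXPRESSION_CLASS.get(basic, "normal")
```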
The steps for training the embedded hidden Markov models of the three expressions "happy", "normal", and "sad" are as follows:
EHMM training for "happy": select sample images of the "happy" expression, where "happy" refers to the surprised and happy expressions among the six basic expressions. The number of super-states is 5, corresponding to the forehead, eyes, nose, mouth, and chin; the numbers of embedded states within the super-states are set to {3, 5, 3, 5, 3}. Train the EHMM of the "happy" expression.
EHMM training for "sad": select sample images of the "sad" expression, where "sad" refers to the angry and sad expressions among the six basic expressions. The number of super-states is 5, corresponding to the forehead, eyes, nose, mouth, and chin; the numbers of embedded states within the super-states are set to {3, 5, 3, 5, 3}. Train the EHMM of the "sad" expression.
EHMM training for "normal": select sample images of the "normal" expression, where "normal" refers to expressions not included in the "happy" and "sad" definitions. The number of super-states is 5, corresponding to the forehead, eyes, nose, mouth, and chin; the numbers of embedded states within the super-states are set to {3, 5, 3, 5, 3}. Train the EHMM of the "normal" expression.
The sample images of the three expressions are selected from the Cohn-Kanade expression database of Carnegie Mellon University.
When the recognized facial expression is "happy", the MCU calls the happy program module, which enhances the music mode; when the recognized expression is "normal", the MCU calls the gentle program module, which sets the music mode to intermediate values; when the recognized expression is "sad", the MCU calls the subdued program module, which softens the music mode.
The happy program module raises the background-music volume by 5 dB, boosts the super bass by 3 dB, raises the treble by 3 dB, and turns on surround sound.
The gentle program module reduces the background-music volume to 45 dB, the super bass to 30 dB, and the treble to 25 dB, and turns on surround sound.
The subdued program module reduces the background-music volume by 5 dB, turns off the super bass, reduces the treble to 20 dB, and turns off surround sound.
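The three program modules can be summarized as a lookup table (a host-side sketch; on the real device this logic runs on the MCU, and the relative/absolute encoding of the dB values is our own):

```python
# Sound-effect settings of the three program modules, transcribed from
# the text. "+5"/"-5" are relative dB changes, "=45" etc. are absolute
# dB targets, "off" disables the band (encoding is our own).
PROGRAM_MODULES = {
    "happy":  {"volume": "+5",  "super_bass": "+3",  "treble": "+3",  "surround": True},
    "normal": {"volume": "=45", "super_bass": "=30", "treble": "=25", "surround": True},
    "sad":    {"volume": "-5",  "super_bass": "off", "treble": "=20", "surround": False},
}

def module_for(expression):
    """Select the program module the MCU would call for a
    recognized expression."""
    return PROGRAM_MODULES[expression]
```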
In the present invention, enhancing the music mode includes but is not limited to: playing a happy-type playlist, raising the volume, boosting the super bass, raising the treble, and turning on surround sound. Setting the music mode to intermediate values includes but is not limited to: playing a gentle-type playlist, reducing the volume, super bass, and treble to suitable intermediate values, and turning on surround sound. Softening the music mode includes but is not limited to: playing a sad-type playlist, reducing the volume and treble to lower values, and turning off surround sound.

Claims (3)

1. A background music control device based on expression, characterized by comprising a power supply unit, an image acquisition unit, a main processor (DSP), a storage unit, a power amplifier, a sound effect processor, an MP3 decoder, and a micro-control unit (MCU); the main processor (DSP) is connected to the image acquisition unit, the storage unit, and the MCU respectively; the MCU is connected in sequence to the MP3 decoder, the sound effect processor, and the power amplifier, and is also connected directly to the sound effect processor; the power supply unit is connected to all of the above units and provides their working power.
2. The background music control device based on expression according to claim 1, characterized in that the image acquisition unit employs a CMOS image sensor.
3. The background music control device based on expression according to claim 1, characterized in that the main processor (DSP) and the MCU are connected via an RS-485 bus, with data transmitted at a baud rate of 9600 bps.
CN 201220371686 2012-07-30 2012-07-30 Background music control device based on expression Expired - Fee Related CN202855297U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201220371686 CN202855297U (en) 2012-07-30 2012-07-30 Background music control device based on expression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201220371686 CN202855297U (en) 2012-07-30 2012-07-30 Background music control device based on expression

Publications (1)

Publication Number Publication Date
CN202855297U true CN202855297U (en) 2013-04-03

Family

ID=47986468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201220371686 Expired - Fee Related CN202855297U (en) 2012-07-30 2012-07-30 Background music control device based on expression

Country Status (1)

Country Link
CN (1) CN202855297U (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750964A (en) * 2012-07-30 2012-10-24 西北工业大学 Method and device used for controlling background music and based on facial expression
CN102750964B (en) * 2012-07-30 2014-10-29 西北工业大学 Method and device used for controlling background music based on facial expression
CN104864354A (en) * 2015-06-08 2015-08-26 浙江农林大学 LED mood passing lamp and method
CN104864354B (en) * 2015-06-08 2017-05-10 浙江农林大学 LED mood passing lamp and method
CN108242238A (en) * 2018-01-11 2018-07-03 广东小天才科技有限公司 A kind of audio file generation method and device, terminal device
CN108242238B (en) * 2018-01-11 2019-12-31 广东小天才科技有限公司 Audio file generation method and device and terminal equipment

Similar Documents

Publication Publication Date Title
CN102750964B (en) Method and device used for controlling background music based on facial expression
CN105976809B (en) Identification method and system based on speech and facial expression bimodal emotion fusion
Zeng et al. Image retrieval using spatiograms of colors quantized by gaussian mixture models
WO2022033150A1 (en) Image recognition method, apparatus, electronic device, and storage medium
CN105469065B (en) A kind of discrete emotion identification method based on recurrent neural network
CN109472198B (en) Gesture robust video smiling face recognition method
CN103810490B (en) A kind of method and apparatus for the attribute for determining facial image
CN102332095B (en) Face motion tracking method, face motion tracking system and method for enhancing reality
US8421885B2 (en) Image processing system, image processing method, and computer readable medium
CN104680121B (en) Method and device for processing face image
CN103020992B (en) A kind of video image conspicuousness detection method based on motion color-associations
CN108198130B (en) Image processing method, image processing device, storage medium and electronic equipment
CN101615245A (en) Expression recognition method based on AVR and enhancing LBP
CN104794693B (en) A kind of portrait optimization method of face key area automatic detection masking-out
CN107911643B (en) Method and device for showing scene special effect in video communication
Türkan et al. Human eye localization using edge projections.
CN110175526A (en) Dog Emotion identification model training method, device, computer equipment and storage medium
CN109389076B (en) Image segmentation method and device
CN104636097A (en) Font size adaptive adjustment method and mobile terminal based on eyes
CN105956570B (en) Smiling face's recognition methods based on lip feature and deep learning
WO2021203880A1 (en) Speech enhancement method, neural network training method, and related device
CN103077506A (en) Local and non-local combined self-adaption image denoising method
CN104008364A (en) Face recognition method
CN104881852B (en) Image partition method based on immune clone and fuzzy kernel clustering
CN111341350A (en) Man-machine interaction control method and system, intelligent robot and storage medium

Legal Events

Date Code Title Description
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130403

Termination date: 20140730

EXPY Termination of patent right or utility model