WO2021187645A1

WO2021187645A1 - Mobile terminal

Info

Publication number: WO2021187645A1
Application number: PCT/KR2020/003862
Authority: WO
Inventors: 유주현; 조현학; 김정곤; 이건섭; 송호성
Original assignee: 엘지전자 주식회사
Priority date: 2020-03-20
Filing date: 2020-03-20
Publication date: 2021-09-23

Abstract

The present disclosure relates to a mobile terminal capable of adjusting the input amount of ambient noise. The mobile terminal can comprise: one or more microphones for receiving a speech signal including an original sound signal and a noise signal; a camera for acquiring a video; a display for displaying a preview screen including the image acquired by the camera and a mixing level adjustment menu for adjusting the input amount of ambient noise; and a processor for receiving a request for the ambient noise adjustment through the mixing level adjustment menu, determining a noise mixing level according to the received request, and adjusting the input amount of ambient noise according to the determined noise mixing level.

Description

mobile terminal

The present invention relates to a mobile terminal, and more particularly, to a mobile terminal capable of controlling the inflow of ambient noise.

The terminal may be divided into a mobile/portable terminal and a stationary terminal according to whether the terminal can be moved. Again, the mobile terminal can be divided into a handheld terminal and a vehicle mounted terminal depending on whether the user can carry it directly.

The functions of mobile terminals are diversifying. For example, there are functions for data and voice communication, photography and video recording through a camera, voice recording, music file playback through a speaker system, and outputting an image or video to the display unit. Some terminals add an electronic game play function or perform a multimedia player function. In particular, recent mobile terminals can receive multicast signals that provide broadcast and visual content such as video or television programs.

As the functions of such a terminal are diversified, the terminal is being implemented in the form of a multimedia player equipped with complex functions such as taking pictures or videos, playing music or video files, playing games, and receiving broadcasts.

Recently, a video shot by an individual through a terminal is uploaded to a content providing server or a server providing a social network service and shared with other users.

However, the conventional terminal is equipped with only a noise canceling function that removes ambient noise when shooting a video. Accordingly, all sounds other than the voice output by the desired object (person or object) are removed, and there is a problem in that the original sound output by the object is distorted.

In addition, when only the noise canceling function is provided, there is a problem in that the realism of the shooting environment of the moving picture is not conveyed.

An object of the present disclosure is to provide a mobile terminal that allows a user to introduce as much ambient noise as desired without distorting the original sound when shooting a video.

An object of the present disclosure is to provide a mobile terminal capable of producing content having a quality suitable for sharing through personal broadcasting and social network service (SNS) without separate voice editing.

A mobile terminal according to an embodiment of the present disclosure may include: one or more microphones for receiving a voice signal including an original sound signal and a noise signal; a camera for acquiring an image; a display for displaying a preview screen including the image acquired by the camera and a mixing level adjustment menu for adjusting the inflow amount of ambient noise; and a processor that receives a request for adjusting the ambient noise through the mixing level adjustment menu, determines a noise mixing level according to the received request, and adjusts the inflow amount of the ambient noise according to the determined noise mixing level.

The processor may mix the original sound signal and the audio signal according to the determined noise mixing level to adjust the amount of inflow of the ambient noise.

The processor may remove the noise signal from the voice signal to obtain an estimated original sound signal obtained by estimating the original sound signal, and may mix the estimated original sound signal and the voice signal according to the determined noise mixing level.

According to an embodiment of the present disclosure, the user may control the amount of ambient noise inflow with only a simple touch input when shooting a video. Accordingly, there is an effect that a video can be captured regardless of the surrounding environment.

In addition, it is possible to shoot a video even in a noisy place, and there is an effect of excellent noise removal performance.

In addition, after shooting a video, it can be uploaded directly to a server for personal broadcasting or SNS without editing, so that the user's convenience can be greatly improved.

FIG. 1 shows a mobile terminal according to an embodiment of the present disclosure.

FIG. 2 is a view for explaining a noise removal method according to the prior art.

FIG. 3 is a view for explaining an example of adjusting the amount of ambient noise inflow according to an embodiment of the present disclosure.

FIG. 4 is a view for explaining in detail a process in which a removal rate of a noise signal from an original sound signal input through a microphone is adjusted according to an embodiment of the present disclosure.

FIG. 5 is a flowchart illustrating a method of operating a mobile terminal according to an embodiment of the present disclosure.

FIG. 6 is a diagram illustrating an example of a preview screen according to an embodiment of the present disclosure.

FIG. 7 is a table illustrating a relationship between a scaling factor and an ambient noise mixing level according to an embodiment of the present disclosure.

FIG. 1 shows a mobile terminal 100 according to an embodiment of the present disclosure.

The mobile terminal 100 may be implemented as a stationary device or a movable device, such as a TV, a projector, a mobile phone, a smartphone, a desktop computer, a notebook computer, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a navigation system, a tablet PC, a wearable device, a set-top box (STB), a DMB receiver, a radio, a washing machine, a refrigerator, a digital signage, a robot, or a vehicle.

Referring to FIG. 1, the mobile terminal 100 may include a communication unit 110, an input unit 120, a learning processor 130, a sensing unit 140, an output unit 150, a memory 170, and a processor 180.

The communication unit 110 may transmit/receive data to and from external devices such as another mobile terminal or an external server using wired/wireless communication technology. For example, the communication unit 110 may transmit/receive sensor information, a user input, a learning model, a control signal, and the like with external devices.

At this time, the communication technologies used by the communication unit 110 include GSM (Global System for Mobile communication), CDMA (Code Division Multiple Access), LTE (Long Term Evolution), 5G, WLAN (Wireless LAN), Wi-Fi (Wireless-Fidelity), Bluetooth, RFID (Radio Frequency Identification), IrDA (Infrared Data Association), ZigBee, NFC (Near Field Communication), and the like.

The input unit 120 may acquire various types of data.

In this case, the input unit 120 may include a camera for inputting an image signal, a microphone for receiving an audio signal, a user input unit for receiving information from a user, and the like. Here, the camera or microphone may be treated as a sensor, and a signal obtained from the camera or microphone may be referred to as sensing data or sensor information.

The input unit 120 may acquire training data for model training and input data to be used when acquiring an output using the training model. The input unit 120 may acquire raw input data, and in this case, the processor 180 or the learning processor 130 may extract an input feature as a preprocessing for the input data.

The input unit 120 may include a camera 121 for inputting an image signal, a microphone 122 for receiving an audio signal, and a user input unit 123 for receiving information from a user.

The voice data or image data collected by the input unit 120 may be analyzed and processed as a user's control command.

The input unit 120 is for inputting image information (or signal), audio information (or signal), data, or information input from a user. For input of image information, the mobile terminal 100 may be provided with one or more cameras 121.

The camera 121 processes an image frame such as a still image or a moving image obtained by an image sensor in a video call mode or a photographing mode. The processed image frame may be displayed on the display unit 151 or stored in the memory 170 .

The microphone 122 processes an external sound signal as electrical voice data. The processed voice data may be utilized in various ways according to a function (or a running application program) being performed by the mobile terminal 100 . Meanwhile, various noise removal algorithms for removing noise generated in the process of receiving an external sound signal may be applied to the microphone 122 .

The user input unit 123 is for receiving information from a user, and when information is input through the user input unit 123, the processor 180 may control the operation of the mobile terminal 100 to correspond to the input information.

The user input unit 123 may include a mechanical input means (or a mechanical key, for example, a button located on the front, rear, or side of the terminal 100, a dome switch, a jog wheel, a jog switch, etc.) and a touch input means. As an example, the touch input means may consist of a virtual key, a soft key, or a visual key displayed on the touch screen through software processing, or of a touch key disposed on a part other than the touch screen.

The learning processor 130 may train a model composed of an artificial neural network by using the training data. Here, the learned artificial neural network may be referred to as a learning model. The learning model may be used to infer a result value with respect to new input data other than the training data, and the inferred value may be used as a basis for a decision to perform a certain operation.

In this case, the learning processor 130 may include a memory integrated or implemented in the mobile terminal 100 . Alternatively, the learning processor 130 may be implemented using the memory 170 , an external memory directly coupled to the mobile terminal 100 , or a memory maintained in an external device.

The sensing unit 140 may acquire at least one of internal information of the mobile terminal 100 , surrounding environment information of the mobile terminal 100 , and user information by using various sensors.

At this time, the sensors included in the sensing unit 140 include a proximity sensor, an illuminance sensor, an acceleration sensor, a magnetic sensor, a gyro sensor, an inertial sensor, an RGB sensor, an IR sensor, a fingerprint recognition sensor, an ultrasonic sensor, an optical sensor, a microphone, a lidar, a radar, and the like.

The output unit 150 may generate an output related to visual, auditory or tactile sense.

In this case, the output unit 150 may include a display unit that outputs visual information, a speaker that outputs auditory information, and a haptic module that outputs tactile information.

The output unit 150 may include at least one of a display unit 151, a sound output unit 152, a haptic module 153, and an optical output unit 154.

The display unit 151 displays (outputs) information processed by the mobile terminal 100 . For example, the display unit 151 may display execution screen information of an application program driven in the mobile terminal 100 or UI (User Interface) and GUI (Graphic User Interface) information according to the execution screen information.

The display unit 151 may implement a touch screen by forming a layer structure with the touch sensor or being formed integrally with the touch sensor. Such a touch screen may function as the user input unit 123 providing an input interface between the mobile terminal 100 and the user, and may provide an output interface between the terminal 100 and the user.

The sound output unit 152 may output audio data received from the communication unit 110 or stored in the memory 170 in a call signal reception, a call mode or a recording mode, a voice recognition mode, a broadcast reception mode, and the like.

The sound output unit 152 may include at least one of a receiver, a speaker, and a buzzer.

The haptic module 153 generates various tactile effects that the user can feel. A representative example of the tactile effect generated by the haptic module 153 may be vibration.

The optical output unit 154 outputs a signal for notifying the occurrence of an event by using the light of the light source of the mobile terminal 100. Examples of the event generated in the mobile terminal 100 may be message reception, call signal reception, missed call, alarm, schedule notification, email reception, information reception through an application, and the like.

The memory 170 may store data supporting various functions of the mobile terminal 100 . For example, the memory 170 may store input data obtained from the input unit 120 , learning data, a learning model, a learning history, and the like.

The processor 180 may determine at least one executable operation of the mobile terminal 100 based on information determined or generated using a data analysis algorithm or a machine learning algorithm. Then, the processor 180 may control the components of the mobile terminal 100 to perform the determined operation.

To this end, the processor 180 may request, search, receive, or utilize the data of the learning processor 130 or the memory 170, and may control the components of the mobile terminal 100 to execute a predicted operation or an operation determined to be desirable among the at least one executable operation.

In this case, when the connection of the external device is required to perform the determined operation, the processor 180 may generate a control signal for controlling the corresponding external device and transmit the generated control signal to the corresponding external device.

The processor 180 may obtain intention information with respect to a user input and determine a user's requirement based on the obtained intention information.

In this case, the processor 180 may obtain intention information corresponding to the user input by using at least one of a speech-to-text (STT) engine for converting a voice input into a character string or a natural language processing (NLP) engine for obtaining intention information of a natural language.

At this time, at least one of the STT engine and the NLP engine may be configured as an artificial neural network, at least a part of which is learned according to a machine learning algorithm. And, at least one or more of the STT engine or the NLP engine may be learned by the learning processor 130 , learned by an external server, or learned by distributed processing thereof.

The processor 180 may collect history information including the operation contents of the mobile terminal 100 or user feedback on the operation, and may store it in the memory 170 or the learning processor 130 or transmit it to an external device such as an external server. The collected history information may be used to update the learning model.

The processor 180 may control at least some of the components of the mobile terminal 100 in order to drive an application program stored in the memory 170 . Furthermore, in order to drive the application program, the processor 180 may operate two or more of the components included in the mobile terminal 100 in combination with each other.

FIG. 2 is a view for explaining a noise removal method according to the prior art, and FIG. 3 is a view for explaining an example of adjusting the amount of ambient noise inflow according to an embodiment of the present disclosure.

Referring to FIG. 2 , the noise removal module 200 according to the related art removes the noise signal n from the voice signal y including the original sound signal s0 and the noise signal n.

Accordingly, the noise removal module 200 may output the estimated original sound signal s1 similar to the input original sound signal s0.

The noise removal module 200 may identify the noise signal n, generate a signal having a waveform opposite to that of the identified noise signal n, and cancel the noise signal n.

According to the prior art, although the noise signal n can be effectively removed, there is a problem in that the noise signal n corresponding to the ambient noise signal is always removed.

That is, as the noise signal n is removed, distortion of the original sound signal s0 may occur, and there is a problem in that the surrounding environment is not recognized.

In addition, as all the noise signals n are removed, there is a problem in that the realism of moving pictures is not conveyed.

In order to solve this problem, in an embodiment of the present disclosure, an inflow amount of ambient noise is to be adjusted.

Referring to FIG. 3 , the mobile terminal 100 may include a noise removal module 310 and a mixer 330 .

The noise removal module 310 and the mixer 330 may be included in the processor 180 of FIG. 1 or may exist separately from the processor 180 .

The microphone 122 may receive a voice signal y from the outside. The voice signal y may include an original sound signal s0 corresponding to the voice output by the target object and a noise signal n corresponding to ambient noise.

The noise removal module 310 may output the estimated original sound signal s1 obtained by removing the noise signal n from the voice signal y.

The noise removal module 310 may separate the original sound signal s0 and the noise signal n from the voice signal y.

The noise removal module 310 may generate an opposite signal having a waveform opposite to that of the noise signal n, and cancel the noise signal n by using the generated opposite signal. Accordingly, an estimated original sound signal s1 similar to the original sound signal s0 may be obtained.
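The cancellation step described above can be sketched as follows. This is only an illustrative sketch, assuming that an estimate of the noise signal n is already available as a list of samples; how the noise is identified is not shown here, and the function names `opposite_signal` and `cancel_noise` are hypothetical.

```python
def opposite_signal(noise):
    """Return a signal whose waveform is opposite (sign-inverted) to the noise."""
    return [-v for v in noise]


def cancel_noise(y, noise):
    """Add the opposite signal to the voice signal y to obtain the estimate s1.

    Assumption: `noise` is a sample-aligned estimate of the noise signal n.
    """
    if len(y) != len(noise):
        raise ValueError("y and noise must have the same length")
    anti = opposite_signal(noise)
    return [a + b for a, b in zip(y, anti)]


# Example: voice signal y = s0 + n, with a known noise estimate.
s1 = cancel_noise([1.5, -0.5, 2.0], [0.5, -0.5, 1.0])
```

In practice the noise estimate is imperfect, which is why the result is only an estimated original sound signal s1 rather than s0 itself.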

The mixer 330 may mix the estimated original sound signal s1 and the voice signal y, and output the mixed result.

The mixer 330 may mix the estimated original sound signal s1 and the voice signal y using the scaling factor α.

The mixing voice signal, which is the mixing result of the mixer 330, may be expressed as Equation 1 below.

[Equation 1]

mixed voice signal = α·s1 + (1-α)·y

Here, the scaling factor α is a factor used to adjust the amount of ambient noise, and may take any value from 0 to 1 inclusive.

The reason that the voice signal y is used instead of the noise signal n in the (1-α)·y term corresponding to the amount of ambient noise is that the estimated original sound signal s1 was distorted in the process of removing the noise signal n.

That is, the voice signal y is used in the (1-α)·y term in order to compensate for the distortion of the estimated original sound signal s1. Because the voice signal y includes the original sound signal s0, it can compensate for that distortion.

When the value of the scaling factor α is 1, the amount of ambient noise introduced may be 0.

When the value of the scaling factor α is 0, the amount of ambient noise introduced may be 1.
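A minimal sketch of this mixing follows, assuming the mixed signal is computed sample by sample as α·s1 + (1-α)·y, which is consistent with the (1-α)·y term and the two boundary cases described above. The function name `mix_signals` and the list-of-floats representation are assumptions made for illustration.

```python
def mix_signals(s1, y, alpha):
    """Blend the estimated original sound s1 with the voice signal y.

    alpha = 1 keeps only s1 (no ambient noise introduced);
    alpha = 0 keeps only y (ambient noise fully introduced).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if len(s1) != len(y):
        raise ValueError("s1 and y must have the same length")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(s1, y)]


# Example: halfway between the denoised signal and the raw voice signal.
mixed = mix_signals([1.0, 2.0], [3.0, 4.0], 0.5)
```

With α between 0 and 1, the output moves continuously between the fully denoised signal and the unprocessed voice signal.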

The value of the scaling factor α may be set as a default or may be set according to a user input. The value of the scaling factor α may be associated with an ambient noise mixing level determined through manipulation of a mixing level adjustment menu, which will be described later.

The setting of the value of the scaling factor α will be described later.

As described above, according to an embodiment of the present disclosure, the amount of ambient noise can be adjusted, so that the user can remove the ambient noise to a desired degree.

Accordingly, a sense of presence appropriate to the recording environment of the video may be delivered to the viewer of the video.

The mobile terminal 100 may include a plurality of microphones.

In FIG. 4, two microphones 122a and 122b are used as an example.

Referring to FIG. 4 , the processor 180 may include a noise removal module 310 , a preprocessor 320 , a mixer 330 , and a postprocessor 350 .

The noise removal module 310 may remove a noise signal from a voice signal input through the first microphone 122a or the second microphone 122b.

The preprocessor 320 may preprocess the voice signal input through the first microphone 122a or the second microphone 122b.

The mixer 330 may mix the original sound signal from which the noise signal is removed and the audio signal.

The mixer 330 may mix an original sound signal and an audio signal based on the ambient noise mixing level.

The post-processing unit 350 may post-process the mixed voice signal representing the output result of the mixer 330 .

Hereinafter, the function of each configuration will be described in more detail.

The noise removal module 310 may include a first amplifier 311 , a first digital filter 313 , a signal separator 315 , and a first dynamic range compressor 317 .

The first amplifier 311 may amplify a voice signal input through the first microphone 122a or the second microphone 122b.

The first digital filter 313 may filter the amplified voice signal. The first digital filter 313 may correct the tone characteristics of the voice signal.

The signal separator 315 may separate the filtered voice signal into an original sound signal and a noise signal.

The signal separator 315 may separate a voice signal into an original sound signal and a noise signal by using a well-known deep learning algorithm or machine learning algorithm for noise cancellation. The noise signal may be a signal corresponding to ambient sound.

The signal separator 315 may obtain an estimated original sound signal obtained by estimating the original sound signal by removing the separated noise signal.

The first dynamic range compressor 317 may compress the dynamic range of the estimated original sound signal. The dynamic range of the estimated original sound signal may be a range between the largest magnitude and the smallest magnitude of the estimated original sound signal.
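The compression algorithm itself is not specified here. As one possible illustration only, the sketch below uses a simple static hard-knee compressor: any sample whose magnitude exceeds a threshold has its excess divided by a ratio. The `threshold` and `ratio` parameters and both function names are assumptions, not part of the disclosure.

```python
def dynamic_range(samples):
    """Range between the largest and smallest magnitude of the signal."""
    magnitudes = [abs(x) for x in samples]
    return max(magnitudes) - min(magnitudes)


def compress_dynamic_range(samples, threshold=0.5, ratio=4.0):
    """Reduce the part of each magnitude above `threshold` by `ratio`."""
    out = []
    for x in samples:
        magnitude = abs(x)
        if magnitude > threshold:
            magnitude = threshold + (magnitude - threshold) / ratio
        # Restore the original sign of the sample.
        out.append(magnitude if x >= 0 else -magnitude)
    return out


signal = [0.25, 1.0, -1.0]
compressed = compress_dynamic_range(signal)
```

For the example signal, the dynamic range as defined above shrinks from 0.75 to 0.375 while quiet samples are left unchanged.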

The preprocessor 320 may include a delay time compensator 321 , a second amplifier 323 , and a second digital filter 325 .

The delay time compensator 321 may compensate for the difference between the time it takes for the voice signal to be output to the mixer 330 through the noise removal module 310 and the time it takes for the voice signal to be output to the mixer 330 through the preprocessor 320. The delay time compensator 321 may compensate for the delay time through phase shifting of the voice signal.
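The idea of aligning the two paths can be illustrated with a simplified sketch. It assumes the path difference is a whole number of samples and delays the preprocessor path by prepending zeros; this is a stand-in for the phase shifting mentioned above, not the disclosed method, and `compensate_delay` is a hypothetical name.

```python
def compensate_delay(samples, delay_samples):
    """Delay a signal by `delay_samples`, keeping the original length.

    Assumption: the noise-removal path lags the preprocessor path by an
    integer number of samples, so the faster path is padded with zeros.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    padded = [0.0] * delay_samples + list(samples)
    return padded[:len(samples)]


# Example: the noise-removal path is 2 samples slower than the preprocessor path.
aligned = compensate_delay([1.0, 2.0, 3.0, 4.0], 2)
```

After this step, sample k of the preprocessed voice signal lines up with sample k of the estimated original sound signal at the mixer input.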

The second amplifier 323 may amplify the audio signal.

The second digital filter 325 may filter the amplified voice signal. The second digital filter 325 may correct distortion of the amplified voice signal.

The mixer 330 may mix the estimated original sound signal output from the noise removal module 310 and the filtered audio signal output from the preprocessor 320 .

The mixer 330 may mix the estimated original sound signal and the audio signal based on the ambient noise mixing level, and may output a mixed audio signal indicating the mixing result.

The post-processing unit 350 may include a second dynamic range compressor 351 , a third amplifier 353 , and an encoder 355 .

The second dynamic range compressor 351 may compress the dynamic range of the mixed voice signal output from the mixer 330 .

The third amplifier 353 may amplify a mixed voice signal having a compressed dynamic range.

The encoder 355 may encode the amplified speech signal.

The encoded mixed voice signal may be matched with a moving picture and stored in the memory 170 .

As another example, the encoded mixed voice signal, the moving picture, and the ambient noise mixing level may be stored together in the memory 170 .

Referring to FIG. 5 , the processor 180 of the mobile terminal 100 displays a preview screen on the display unit 151 ( S501 ).

The processor 180 may display a preview screen on the display unit 151 according to the execution of the camera application installed in the mobile terminal 100 .

The preview screen may include an image capturing button for capturing an image and a video capturing button for capturing a moving picture.

The processor 180 may start recording a video when a video recording button is selected.

The preview screen will be described with reference to FIG. 6 .

Referring to FIG. 6 , the preview screen 600 may include a preview image 610 acquired through the camera 121 , a mixing level adjustment menu 630 , and a video recording button 601 .

The mixing level adjustment menu 630 may be a menu for adjusting the amount of ambient noise introduced through one or more microphones while shooting a video.

The mixing level adjustment menu 630 will be described later in detail.

The video recording button 601 may be a button for starting or ending recording of a video.

Again, FIG. 5 will be described.

While the preview screen is displayed, the processor 180 of the mobile terminal 100 determines whether a request for adjusting the ambient noise has been received (S503), and if the request has been received, determines the mixing level of the ambient noise according to the received request (S505).

In an embodiment, the processor 180 may receive a request for adjusting the ambient noise after shooting of a video has started.

In another embodiment, the processor 180 may receive a request for adjusting ambient noise even before shooting a video. That is, the processor 180 may receive a request for adjusting the ambient noise even when the preview image 610 of FIG. 6 is displayed and the video capture button 601 is not selected.

The ambient noise control request may be received through manipulation of the mixing level control menu 630 on the preview screen 600 of FIG. 6 . This will be described later.

The processor 180 may determine the mixing level of the ambient noise through manipulation of the mixing level adjustment menu 630 .

This will be described again with reference to FIG. 6 .

Referring to FIG. 6 , the preview screen 600 may include a mixing level adjustment menu 630 .

The mixing level adjustment menu 630 may be displayed when a video recording command is received.

The mixing level adjustment menu 630 may include one or more of a minimum level icon 631, a maximum level icon 633, a mixing level adjustment guide bar 635, a mixing level adjustment button 637, and a mixing level indicator 639.

The minimum level icon 631 may be an icon for maximally reducing ambient noise. When the minimum level icon 631 is selected, the ambient noise mixing level may be set to the minimum.

The minimum value of the ambient noise mixing level may be 0, and the maximum value of the ambient noise mixing level may be 100. However, this is only an example, and may vary according to user settings.

The maximum level icon 633 may be an icon for maximally increasing ambient noise. When the maximum level icon 633 is selected, the ambient noise mixing level may be set to the maximum.

The mixing level adjustment guide bar 635 may guide selection of a mixing level of ambient noise. The mixing level adjusting guide bar 635 may be divided into a plurality of levels.

The mixing level adjustment button 637 may move on the mixing level adjustment guide bar 635 and may be a button for selecting a specific mixing level.

The mixing level adjustment button 637 may be located at any one of a plurality of levels partitioned on the mixing level adjustment guide bar 635 .

A user may select a mixing level of ambient noise through a touch input to the mixing level adjustment button 637 .

The mixing level indicator 639 may be an indicator indicating the value of the mixing level selected through the mixing level control button 637 . The user may check how much of the ambient noise is introduced through the mixing level indicator 639 .

As the value of the mixing level indicator 639 increases, the amount of ambient noise may be increased, and as the value of the mixing level indicator 639 decreases, the amount of ambient noise may decrease.

The relationship between [Equation 1] below and the ambient noise mixing level will now be described.

[Equation 1]

mixed voice signal = α·s1 + (1-α)·y

When the value of the ambient noise mixing level is 0, which is the minimum, the value of the scaling factor α may be 1.

When the value of the ambient noise mixing level is 100, which is the maximum, the value of the scaling factor α may be 0.

Again, FIG. 5 will be described.

The processor 180 of the mobile terminal 100 separates the voice signal input through the microphone 122 into an original sound signal and an ambient noise signal (S507).

The processor 180 may separate the voice signal into an original sound signal and an ambient noise signal. The ambient noise signal may be a noise signal.

The noise removal module 310 of the processor 180 may separate the voice signal into an original sound signal and an ambient noise signal and remove the ambient noise signal. Accordingly, the processor 180 may obtain an estimated original sound signal similar to the original sound signal.

The processor 180 may use a well-known deep learning algorithm or machine learning algorithm for noise cancellation to separate an original sound signal and an ambient noise signal from a voice signal, and may remove the ambient noise signal.

The processor 180 of the mobile terminal 100 mixes the separated original sound signal and the voice signal input through the microphone 122 based on the determined ambient noise mixing level (S509).

The mixer 330 of the processor 180 may generate a mixed voice signal by mixing the separated original sound signal and the voice signal. Here, the separated original sound signal may be the estimated original sound signal s1.

The mixed voice signal representing the mixing result may be expressed as in [Equation 1] below.

[Equation 1]

mixed voice signal = α·s1 + (1-α)·y

The scaling factor α may be a value corresponding to the ambient noise mixing level. As the value of the ambient noise mixing level increases, the value of the scaling factor α may decrease. As the value of the ambient noise mixing level decreases, the value of the scaling factor α may increase.

The processor 180 of the mobile terminal 100 determines whether a video recording end command has been received (S511), and upon receiving the video recording end command, stores the captured video, the mixed voice signal representing the mixing result, and the ambient noise mixing level in the memory 170 (S513).

When receiving a request to play a stored video, the processor 180 may output a mixed voice signal reflecting the mixing result through a speaker provided in the mobile terminal 100 when playing the video.

Meanwhile, if the processor 180 of the mobile terminal 100 does not receive the ambient noise control request, it obtains a preset ambient noise mixing level (S515).

That is, if the processor 180 does not receive the ambient noise control request, the processor 180 may determine the amount of ambient noise introduced by using a preset ambient noise mixing level.

The preset ambient noise mixing level may be the most recently stored ambient noise mixing level before shooting a video.

As another example, the preset ambient noise mixing level may be a default level. The level set by default may be 50, but this is only an example.
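The selection of a mixing level in steps S515 and earlier can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `resolve_mixing_level`, the 0 to 100 range, and the idea of chaining the two "preset" alternatives (most recently stored level first, then the default of 50) are assumptions made for the example.

```python
from typing import Optional

# Example default level mentioned in the text ("may be 50, but this is only an example").
DEFAULT_MIXING_LEVEL = 50


def resolve_mixing_level(requested: Optional[int],
                         last_stored: Optional[int]) -> int:
    """Pick the ambient noise mixing level to use for a recording.

    requested   -- level from an ambient noise control request, or None
                   if no request was received
    last_stored -- most recently stored level, or None if none exists

    Assumption: the two preset alternatives are tried in order
    (last stored level, then the default). Levels are clamped to an
    assumed 0..100 range.
    """
    if requested is not None:
        level = requested
    elif last_stored is not None:
        level = last_stored
    else:
        level = DEFAULT_MIXING_LEVEL
    return max(0, min(100, level))
```

For example, `resolve_mixing_level(None, 20)` falls back to the stored level 20, and `resolve_mixing_level(None, None)` falls back to the default 50.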

The scaling factor α is a factor described in [Equation 1], and the ambient noise mixing level is a level selected from the mixing level adjustment menu 630 of FIG. 6.

When the ambient noise mixing level is set to 100, the value of the scaling factor α may be set to 0.

When the ambient noise mixing level is set to 80, the value of the scaling factor α may be set to 0.2.

When the ambient noise mixing level is set to 50, the value of the scaling factor α may be set to 0.5.

When the ambient noise mixing level is set to 20, the value of the scaling factor α may be set to 0.8.

When the ambient noise mixing level is set to 0, the value of the scaling factor α may be set to 1.

The processor 180 may obtain the ambient noise mixing level selected from the mixing level adjustment menu 630 and determine a scaling factor α corresponding to the obtained ambient noise mixing level.

The processor 180 may obtain a mixed voice signal as in [Equation 1] by using the determined value of the scaling factor α.
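The mapping from mixing level to scaling factor and the subsequent mixing can be sketched as below. Two assumptions are made for illustration: the mapping is taken to be linear, α = 1 − level/100, which reproduces the example values listed for levels 100, 80, 50 and 20 and follows the stated rule that α increases as the level decreases; and the mix is taken to be the weighted sum α·s1 + (1 − α)·y, which is one form consistent with the definitions of α, s1 and y rather than the confirmed equation. The function names are hypothetical.

```python
from typing import List, Sequence


def level_to_alpha(level: float) -> float:
    """Map an ambient noise mixing level (assumed 0..100) to a scaling factor.

    Assumed linear mapping: a higher level yields a smaller alpha.
    """
    if not 0 <= level <= 100:
        raise ValueError("mixing level must be between 0 and 100")
    return 1.0 - level / 100.0


def mix(s1: Sequence[float], y: Sequence[float], level: float) -> List[float]:
    """Mix the estimated original sound s1 with the raw voice signal y.

    Assumed form: alpha * s1 + (1 - alpha) * y, applied sample by sample.
    """
    if len(s1) != len(y):
        raise ValueError("signals must have the same number of samples")
    alpha = level_to_alpha(level)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(s1, y)]
```

With this sketch, a level of 100 returns the raw voice signal unchanged (all ambient noise kept), and a level of 0 returns only the estimated original sound.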

The present disclosure described above can be implemented as computer-readable code on a medium in which a program is recorded. The computer-readable medium includes all kinds of recording devices in which data readable by a computer system is stored. Examples of computer-readable media include a hard disk drive (HDD), solid state disk (SSD), silicon disk drive (SDD), ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage device. In addition, the computer may include the processor 180 of the artificial intelligence device.

Claims

1. A mobile terminal comprising:

one or more microphones for receiving a voice signal including an original sound signal and a noise signal;

a camera for acquiring an image;

a display for displaying a preview screen including the image acquired by the camera and a mixing level adjustment menu for adjusting an inflow amount of ambient noise; and

a processor configured to receive a request for adjusting the ambient noise through the mixing level adjustment menu, determine a noise mixing level according to the received request, and adjust the inflow amount of the ambient noise according to the determined noise mixing level.
2. The mobile terminal of claim 1,

wherein the processor mixes the original sound signal and the voice signal according to the determined noise mixing level to adjust the inflow amount of the ambient noise.
3. The mobile terminal of claim 2,

wherein the processor:

removes the noise signal from the voice signal to obtain an estimated original sound signal that estimates the original sound signal; and

mixes the estimated original sound signal and the voice signal according to the determined noise mixing level.
4. The mobile terminal of claim 3,

wherein the processor mixes the estimated original sound signal and the voice signal using [Equation 1] below,

[Equation 1]

where α is a scaling factor having a value from 0 to 1, s1 is the estimated original sound signal, and y is the voice signal, and

the scaling factor is a value determined according to the noise mixing level.
5. The mobile terminal of claim 4,

wherein the value of the scaling factor decreases as the noise mixing level increases, and increases as the noise mixing level decreases.
6. The mobile terminal of claim 4,

further comprising a memory,

wherein the processor stores, in the memory, a mixed voice signal representing a result of mixing the estimated original sound signal and the voice signal, a video captured by the camera, and the noise mixing level.
7. The mobile terminal of claim 6,

wherein the processor, upon receiving a request to play the video, outputs the mixed voice signal while the video is played.
8. The mobile terminal of claim 3,

wherein the processor comprises:

a noise removal module that removes the noise signal from the voice signal and outputs the estimated original sound signal;

a pre-processing unit that pre-processes the voice signal; and

a mixer that mixes the estimated original sound signal and the pre-processed voice signal according to the noise mixing level.
9. The mobile terminal of claim 1,

wherein the mixing level adjustment menu comprises:

a mixing level adjustment guide bar for guiding selection of the noise mixing level; and

a mixing level adjustment button that is movable along the mixing level adjustment guide bar and selects a specific noise mixing level.
10. The mobile terminal of claim 9,

wherein the mixing level adjustment menu further comprises a mixing level indicator that displays, as a number, the noise mixing level selected through the mixing level adjustment button.
11. The mobile terminal of claim 10,

wherein the mixing level adjustment menu further comprises a minimum level icon for setting the noise mixing level to a minimum and a maximum level icon for setting the noise mixing level to a maximum.
12. The mobile terminal of claim 9,

wherein the request for adjusting the ambient noise is received when the mixing level adjustment button is operated through a user's touch input.
13. The mobile terminal of claim 1,

wherein the processor, upon receiving a video recording command, displays the mixing level adjustment menu on the display.