KR102647545B1 - Electronic device having open speaker

Info

Publication number: KR102647545B1
Application number: KR1020220149986A
Authority: KR
Inventors: 김재영; 김광영
Original assignee: 주식회사 이엠텍
Priority date: 2022-11-10
Filing date: 2022-11-10
Publication date: 2024-03-14

Abstract

An electronic device having an open speaker according to an embodiment includes: an open speaker that receives an output sound signal including a playback sound source from a processor and emits sound; a microphone that acquires external sound including environmental sound and the playback sound source and applies it to the processor; and the processor, which calculates the environmental sound by subtracting the playback sound source from the external sound applied from the microphone, infers and classifies the acoustic environment characteristics of the calculated environmental sound, extracts the sound source characteristics of the playback sound source, infers a gain value for each frequency from the acoustic environment characteristics of the environmental sound and the sound source characteristics of the playback sound source, corrects the playback sound source using the inferred per-frequency gain values, and applies an output sound signal including the corrected playback sound source to the open speaker.

Description

Electronic device having an open speaker {ELECTRONIC DEVICE HAVING OPEN SPEAKER}

The embodiment relates to an electronic device having an open speaker, and more particularly to an electronic device having an open speaker that calculates a gain value for each frequency of a playback sound source in consideration of the characteristics of the surrounding environment and of the playback sound source, thereby minimizing the influence of environmental sound so that the playback sound source can be heard.

Audio devices such as headphones and earphones can use a variety of noise removal technologies. For example, an audio device can acquire the sound around it through a microphone connected to a noise removal circuit and, by removing the noise contained in that sound, output an audio signal of improved quality to the user.

By using active noise cancellation (hereinafter "ANC") technology, an audio device can assess the surrounding noise environment and actively remove noise. An audio device using ANC technology may be designed to cancel ambient noise, by actively removing noise based on the surrounding noise environment, when an audio signal provided from an electronic device is delivered to the user.

Meanwhile, audio devices according to the prior art are limited to noise removal in the canal type (in-ear type) that is inserted into the user's ear, so the noise removal function is weak in devices to which an open speaker is applied.

An object of the embodiment is to provide an electronic device having an open speaker that calculates a gain value for each frequency of a playback sound source in consideration of the characteristics of the surrounding environment and of the playback sound source, thereby minimizing the influence of environmental sound so that the playback sound source can be heard.

In addition, the processor preferably includes an environment classification module, trained by machine learning using artificial intelligence, that takes the frequency characteristics of the frequency-domain environmental sound as input, infers and classifies the acoustic environment characteristics, and outputs the acoustic environment characteristics. The processor preferably converts the time-domain environmental sound into frequency-domain environmental sound, analyzes the frequency characteristics of the frequency-domain environmental sound, and applies the analyzed frequency characteristics to the environment classification module.

In addition, the processor preferably includes a sound source characteristic extraction module that converts the time-domain playback sound source into a frequency-domain playback sound source and extracts the sound source characteristics of the frequency-domain playback sound source.

In addition, the processor preferably includes a frequency characteristic adjustment module, trained by machine learning using artificial intelligence, that takes the acoustic environment characteristics from the environment classification module and the sound source characteristics from the sound source characteristic extraction module as inputs, and infers and outputs a gain value for each frequency.

In addition, the processor preferably calculates a corrected frequency-domain playback sound source by multiplying the frequency-domain playback sound source by the per-frequency gain values, converts the corrected frequency-domain playback sound source into a corrected time-domain playback sound source, and generates an output sound signal including the corrected time-domain playback sound source and applies it to the open speaker.

The embodiment calculates a gain value for each frequency of the playback sound source in consideration of the characteristics of the surrounding environment and of the playback sound source, and corrects the playback sound source with the calculated per-frequency gain values. This has the effect of minimizing the influence of environmental sound so that the playback sound source can be heard.

FIG. 1 is a flowchart of sound processing in an electronic device having an open speaker.

Hereinafter, embodiments are described in detail with reference to the drawings. This is not intended to limit the disclosure to specific embodiments, and the described embodiments should be understood to include various modifications, equivalents, and/or alternatives of those embodiments. In the description of the drawings, similar reference numerals may be used for similar components.

In this document, expressions such as "has," "may have," "includes," or "may include" indicate the presence of the corresponding feature (e.g., a numerical value, function, operation, or component such as a part) and do not exclude the presence of additional features.

In this document, expressions such as "A or B," "at least one of A and/or B," or "one or more of A and/or B" may include all possible combinations of the items listed together. For example, "A or B," "at least one of A and B," or "at least one of A or B" may refer to any of the following cases: (1) including at least one A, (2) including at least one B, or (3) including both at least one A and at least one B.

Expressions such as "first" and "second" used in this document may modify various components regardless of order and/or importance; they are used only to distinguish one component from another and do not limit those components. For example, a first user device and a second user device may represent different user devices regardless of order or importance. For example, without departing from the scope of rights described in this document, a first component may be named a second component, and similarly a second component may be renamed a first component.

When a component (e.g., a first component) is said to be "(operatively or communicatively) coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be connected to the other component directly or through yet another component (e.g., a third component). In contrast, when a component (e.g., a first component) is said to be "directly coupled" or "directly connected" to another component (e.g., a second component), it may be understood that no other component (e.g., a third component) exists between the two.

The expression "configured to" (or "set to") used in this document may, depending on the context, be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of." The term "configured to" does not necessarily mean only "specifically designed to" in hardware. Instead, in some contexts, the expression "a device configured to" may mean that the device is "capable of" operating together with other devices or components. For example, the phrase "a processor configured (or set) to perform A, B, and C" may mean a dedicated processor (e.g., an embedded processor) for performing those operations, or a general-purpose processor (e.g., a CPU or application processor) capable of performing those operations by executing one or more software programs stored in a memory device.

The terms used in this document are used only to describe specific embodiments and may not be intended to limit the scope of other embodiments. Singular expressions may include plural expressions unless the context clearly indicates otherwise. Terms used herein, including technical or scientific terms, may have the same meaning as commonly understood by a person of ordinary skill in the technical field described in this document. Among the terms used in this document, terms defined in general dictionaries may be interpreted as having the same or a similar meaning to the one they have in the context of the related art and, unless clearly defined in this document, are not interpreted in an idealized or excessively formal sense. In some cases, even terms defined in this document cannot be interpreted to exclude the embodiments of this document.

Electronic devices having an open speaker according to the embodiment include, for example, neckband-type sound conversion devices, artificial intelligence (AI) speakers placed in indoor or outdoor spaces, vehicle speakers, military speakers, and open-type wireless earsets.

The electronic device includes a microphone, an open speaker, and a processor, among other components. The microphone acquires external sound including environmental sound and playback sound, generates an input sound signal including the external sound, and applies it to the processor. The open speaker receives an output sound signal including a playback sound source from the processor and emits sound. The processor plays a pre-stored or received sound source, generates an output sound signal for the playback sound source, and applies it to the open speaker so that the playback sound is emitted. At the same time, the processor calculates a gain value for each frequency that reflects the characteristics of the environmental sound and of the playback sound source, both taken from the external sound that includes the environmental sound and the playback sound source. It then corrects the playback sound source by changing its frequency characteristics with the calculated per-frequency gain values, and applies an output sound signal including the corrected playback sound source to the open speaker.

The processor is an electronic and/or electrical circuit device that has a fast Fourier transform (FFT) function, an inverse fast Fourier transform (IFFT) function, and a function for analyzing the Mel spectrogram as frequency features, together with arithmetic functions (for example, multiplication and the Hadamard product, that is, element-wise multiplication) and a storage function (for example, memory).

The processor includes an environment classification module that is trained by machine learning with artificial intelligence, takes the frequency characteristics of the environmental sound as input, infers and classifies the acoustic environment characteristics of the environmental sound, and outputs those acoustic environment characteristics.

The environment classification module classifies the environmental sound into a plurality of acoustic environments (or acoustic noise environments), for example wind, vehicle noise, cleaning noise, wind noise at a window, engine sound, and others. It does so by inferring acoustic environment characteristics that contain, for each acoustic environment, an environment probability that the environment is present or may be present in the environmental sound. For example, the acoustic environment characteristics may include an environment probability of 0.6 for the first acoustic environment, 0.2 for the second, 0.05 for the third, 0.15 for the Nth, and 0 for each of the remaining fourth through (N-1)th acoustic environments.

The environment classification module may be implemented as an algorithm that performs deep learning or machine learning for each of the first through Nth acoustic environments using, for example, an RNN, LSTM, CNN, or DNN model, or as an executor that performs the computation of such an algorithm and outputs the result.

In addition, the processor includes a sound source characteristic extraction module that extracts sound source characteristics (for example, beat rate, tempo, and musical scale) from the playback sound source or from the frequency-domain playback sound source.

In addition, the processor includes a frequency characteristic adjustment module that is trained by machine learning with artificial intelligence, takes the acoustic environment characteristics of the environmental sound and the sound source characteristics of the playback sound source as inputs, and outputs per-frequency gain values that can balance the playback sound source when it is played through the open speaker.

Given the frequency characteristics contained in the acoustic environment characteristics of the environmental sound, the frequency characteristic adjustment module infers and outputs per-frequency gain values that balance the playback sound source. The aim is that the playback sound source, which has the sound source characteristics supplied by the sound source characteristic extraction module, travels through the external space, reaches the user's ear, and is heard well, or that its sound source characteristics are conveyed better; in other words, that good sound quality is provided.

The frequency characteristic adjustment module may be implemented as an algorithm that performs deep learning or machine learning on the acoustic environment characteristics of the environmental sound and on the sound source characteristics of the playback sound source using, for example, an RNN, LSTM, CNN, or DNN model, or as an executor that performs the computation of such an algorithm and outputs the result.

The gain adjustment of the playback sound source performed in this electronic device is described in detail below.

FIG. 1 is a flowchart of sound processing in an electronic device having an open speaker. The processor executes the flowchart of FIG. 1 at preset time intervals, or once per unit of external sound and playback sound source of a preset size.

The processor applies an output sound signal including the time-domain playback sound source to the open speaker so that sound is emitted. The microphone acquires, from the external space, external sound (S_T) that includes environmental sound (S_env) and the playback sound source (S_out), and applies it to the processor.

In step T1, the processor, which has stored the playback sound source (S_out) that it previously applied to the speaker for emission, calculates the environmental sound (S_env) by subtracting the stored playback sound source (S_out) from the external sound (S_T), and proceeds to step T3.
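The subtraction in step T1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes that the stored playback frame and the microphone frame are time-aligned, equally long, and that the playback sound reaches the microphone unaltered. The description does not state how alignment or the acoustic path is handled, and the function name, sample rate, and test signals are hypothetical.

```python
import numpy as np

def estimate_environmental_sound(external_sound, stored_playback):
    """Step T1 sketch: S_env = S_T - S_out, sample by sample."""
    external_sound = np.asarray(external_sound, dtype=np.float64)
    stored_playback = np.asarray(stored_playback, dtype=np.float64)
    if external_sound.shape != stored_playback.shape:
        raise ValueError("frames must have the same length")
    return external_sound - stored_playback

# Toy frame (assumed values): a 440 Hz tone as the playback sound source
# plus low-level noise standing in for the environmental sound.
rng = np.random.default_rng(0)
t = np.arange(1024) / 16000.0
s_out = 0.5 * np.sin(2 * np.pi * 440.0 * t)
s_env_true = 0.1 * rng.standard_normal(1024)
s_t = s_out + s_env_true            # what the microphone would pick up
s_env = estimate_environmental_sound(s_t, s_out)
```

In a real device the playback sound is filtered and delayed on its way from the speaker to the microphone, so a plain subtraction like this would leave a residual.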

In step T3, the processor performs the FFT function to convert the environmental sound (S_env), which is a time-domain signal, into frequency-domain environmental sound, and proceeds to step T5.

In step T5, the processor performs the Mel spectrogram analysis function to analyze the frequency characteristics (features) of the frequency-domain environmental sound, applies the analyzed frequency characteristics as the input of the environment classification module, and proceeds to step T7.
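Steps T3 and T5 can be illustrated for a single frame with a small NumPy sketch. The description names the FFT and the Mel spectrogram but gives no parameters, so the sample rate, frame length, number of mel bands, Hann window, triangular filter shape, and all function names below are assumptions. One frame yields one column of a mel spectrogram; stacking frames over time would give the full spectrogram.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=np.float64) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=np.float64) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mel_features(frame, sample_rate=16000, n_mels=20):
    """Step T3 (FFT) followed by step T5 (log mel-band energies) for one frame."""
    frame = np.asarray(frame, dtype=np.float64)
    spectrum = np.fft.rfft(frame * np.hanning(frame.size))   # T3
    power = np.abs(spectrum) ** 2
    bank = mel_filterbank(n_mels, frame.size, sample_rate)
    return np.log(bank @ power + 1e-10)                      # T5

# Usage with an assumed 1 kHz test tone in place of real environmental sound.
features = mel_features(np.sin(2 * np.pi * 1000.0 * np.arange(1024) / 16000.0))
```

The resulting vector is the kind of input the environment classification module would receive in step T7.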

In step T7, the environment classification module infers, from the frequency characteristics of the frequency-domain environmental sound, acoustic environment characteristics that contain the environment probability that each acoustic environment is present or may be present, classifies them, and outputs them to the processor. The processor receives the acoustic environment characteristics (F_env) output from the environment classification module. In another embodiment, the environment classification module may output to the processor acoustic environment characteristics (F_env) that contain only a preset number of the largest environment probabilities (for example, the top three) among the plurality of environment probabilities.
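The top-three variant of step T7 can be sketched as below. The description does not say whether the discarded probabilities are set to zero, dropped from the vector, or whether the remainder is renormalized; this sketch assumes zeroing without renormalization. The function name and the choice of N = 6 are hypothetical, while the probability values are the ones given as an example in the description.

```python
import numpy as np

def top_k_environment_probabilities(probabilities, k=3):
    """Keep the k largest environment probabilities and zero the rest."""
    probabilities = np.asarray(probabilities, dtype=np.float64)
    k = min(k, probabilities.size)
    keep = np.argsort(probabilities)[::-1][:k]   # indices of the k largest values
    reduced = np.zeros_like(probabilities)
    reduced[keep] = probabilities[keep]
    return reduced

# Example values from the description with N = 6 (assumed):
# environments 1, 2, 3 and N have probabilities 0.6, 0.2, 0.05 and 0.15.
f_env_full = np.array([0.6, 0.2, 0.05, 0.0, 0.0, 0.15])
f_env = top_k_environment_probabilities(f_env_full, k=3)
```

With these values the third environment (0.05) falls outside the top three, so only the first, second, and Nth probabilities remain non-zero.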

While performing steps T1 through T7, the processor performs steps T9 and T11 independently.

In step T9, the processor performs the FFT function on the time-domain playback sound source to convert it into a frequency-domain playback sound source, and applies it to the sound source characteristic extraction module.

In step T11, the sound source characteristic extraction module extracts the sound source characteristics (F_music) of the playback sound source from the applied frequency-domain playback sound source, and applies the extracted sound source characteristics (F_music) to the processor.

In step T13, the processor applies the acoustic environment characteristics (F_env) received in step T7 and the sound source characteristics (F_music) of the playback sound source received in step T11 to the frequency characteristic adjustment module as inputs. From the acoustic environment characteristics (F_env) and the sound source characteristics (F_music), the frequency characteristic adjustment module infers per-frequency gain values that can balance the playback sound source, and outputs them to the processor.
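The interface of step T13 (two feature vectors in, one gain per frequency bin out) can be illustrated with a small feed-forward network. This is only a stand-in: the description lists RNN, LSTM, CNN, and DNN models as options but discloses no architecture, training data, or output range. The class name, layer sizes, random untrained weights, the ±6 dB limit, and the example feature values below are all hypothetical.

```python
import numpy as np

class GainModelSketch:
    """Untrained stand-in for the frequency characteristic adjustment module."""

    def __init__(self, n_env, n_music, n_bins, hidden=16, max_gain_db=6.0, seed=0):
        rng = np.random.default_rng(seed)
        # Random weights only demonstrate the data flow; a real module would be trained.
        self.w1 = 0.1 * rng.standard_normal((hidden, n_env + n_music))
        self.b1 = np.zeros(hidden)
        self.w2 = 0.1 * rng.standard_normal((n_bins, hidden))
        self.b2 = np.zeros(n_bins)
        self.max_gain_db = max_gain_db

    def infer_gains(self, f_env, f_music):
        """Map (F_env, F_music) to one positive linear gain per frequency bin."""
        x = np.concatenate([np.asarray(f_env, dtype=np.float64),
                            np.asarray(f_music, dtype=np.float64)])
        h = np.tanh(self.w1 @ x + self.b1)
        gain_db = self.max_gain_db * np.tanh(self.w2 @ h + self.b2)  # bounded in dB
        return 10.0 ** (gain_db / 20.0)

# Assumed sizes: 6 environment probabilities, 3 sound source features,
# and 513 bins (the rfft of a 1024-sample frame).
model = GainModelSketch(n_env=6, n_music=3, n_bins=513)
gains = model.infer_gains([0.6, 0.2, 0.0, 0.0, 0.0, 0.15], [120.0, 0.5, 0.3])
```

Bounding the output in decibels is a design choice of this sketch; it keeps every gain positive and prevents an untrained model from producing extreme boosts.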

In step T15, the processor generates a corrected frequency-domain playback sound source by multiplying the frequency-domain playback sound source obtained in step T9 by the per-frequency gain values. After performing step T15, the processor proceeds to step T17.

In step T17, the processor performs the IFFT function on the corrected frequency-domain playback sound source to generate a corrected time-domain playback sound source (S_out-a), and applies an output sound signal including the corrected time-domain playback sound source (S_out-a) to the open speaker so that sound is emitted.
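Steps T9, T15, and T17 together amount to an FFT, an element-wise (Hadamard) multiplication, and an inverse FFT. The sketch below shows this for one frame under stated assumptions: a real-valued signal, NumPy's real FFT, and a hand-written gain curve in place of the inferred gains. The function name, sample rate, and test tones are hypothetical.

```python
import numpy as np

def apply_frequency_gains(playback_frame, gains):
    """Steps T9, T15 and T17 for one frame: FFT, per-bin gain, IFFT."""
    playback_frame = np.asarray(playback_frame, dtype=np.float64)
    gains = np.asarray(gains, dtype=np.float64)
    spectrum = np.fft.rfft(playback_frame)                           # T9
    if gains.shape != spectrum.shape:
        raise ValueError("need exactly one gain per rfft bin")
    corrected_spectrum = spectrum * gains                            # T15, element-wise
    return np.fft.irfft(corrected_spectrum, n=playback_frame.size)   # T17

# Toy frame (assumed): tones at 500 Hz and 4 kHz, both exactly on FFT bins.
t = np.arange(1024) / 16000.0
low = np.sin(2 * np.pi * 500.0 * t)
high = np.sin(2 * np.pi * 4000.0 * t)
s_out = low + high

# Hand-written gains for illustration: double everything at or above 2 kHz.
freqs = np.fft.rfftfreq(s_out.size, d=1.0 / 16000.0)
gains = np.where(freqs >= 2000.0, 2.0, 1.0)
s_out_a = apply_frequency_gains(s_out, gains)   # equals low + 2 * high
```

Real-valued gains change only the magnitude of each bin, not its phase. Continuous playback would normally also need windowing and overlap-add between frames, which the description does not discuss.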

After performing step T17, the processor again receives external sound (environmental sound and the corrected playback sound source) from the microphone and performs steps T1 through T17 again.

At least part of a device (e.g., the processor or its functions) or a method (e.g., operations) according to various embodiments may be implemented as instructions stored in a computer-readable storage medium, for example in the form of a program module. When the instructions are executed by a processor, the one or more processors may perform the functions corresponding to the instructions. The computer-readable storage medium may be, for example, a memory.

Computer-readable recording media may include hard disks, floppy disks, magnetic media (e.g., magnetic tape), optical media (e.g., CD-ROM, DVD (Digital Versatile Disc)), magneto-optical media (e.g., floptical disks), and hardware devices (e.g., ROM, RAM, or flash memory). Program instructions may include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the various embodiments, and vice versa.

The processor, or the functions performed by the processor, according to various embodiments may include at least one of the components described above, may omit some of them, or may further include additional components. Operations performed by a module, a program module, or another component according to various embodiments may be executed sequentially, in parallel, iteratively, or heuristically. In addition, some operations may be executed in a different order or omitted, or other operations may be added.

As described above, the invention is not limited to the specific preferred embodiments described above. Anyone with ordinary skill in the technical field to which the invention belongs can make various modifications without departing from the gist claimed in the claims, and such modifications fall within the scope of the claims.

Claims

An electronic device having an open speaker, comprising:
an open speaker that receives an output sound signal including a playback sound source from a processor and emits sound;
a microphone that acquires external sound including environmental sound and the playback sound source and applies it to the processor; and
the processor, which calculates the environmental sound by subtracting the playback sound source from the external sound applied from the microphone, infers and classifies the acoustic environment characteristics of the calculated environmental sound, extracts the sound source characteristics of the playback sound source, infers a gain value for each frequency from the acoustic environment characteristics of the environmental sound and the sound source characteristics of the playback sound source, corrects the playback sound source using the inferred per-frequency gain values, and applies an output sound signal including the corrected playback sound source to the open speaker,
wherein the processor includes an environment classification module, trained by machine learning using artificial intelligence, that takes the frequency characteristics of the frequency-domain environmental sound as input, infers and classifies the acoustic environment characteristics, and outputs the acoustic environment characteristics,
wherein the processor converts the time-domain environmental sound into frequency-domain environmental sound, analyzes the frequency characteristics of the frequency-domain environmental sound, and applies the analyzed frequency characteristics to the environment classification module,
wherein the processor includes a sound source characteristic extraction module that converts the time-domain playback sound source into a frequency-domain playback sound source and extracts the sound source characteristics of the frequency-domain playback sound source, and
wherein the processor includes a frequency characteristic adjustment module, trained by machine learning using artificial intelligence, that takes the acoustic environment characteristics from the environment classification module and the sound source characteristics from the sound source characteristic extraction module as inputs, and infers and outputs the gain value for each frequency.

(Deleted)

The electronic device having an open speaker according to claim 1,
wherein the processor calculates a corrected frequency-domain playback sound source by multiplying the frequency-domain playback sound source by the gain value for each frequency, converts the corrected frequency-domain playback sound source into a corrected time-domain playback sound source, and generates an output sound signal including the corrected time-domain playback sound source and applies it to the open speaker.