KR102380540B1

KR102380540B1 - Electronic device for detecting audio source and operating method thereof

Info

Publication number: KR102380540B1
Application number: KR1020200117628A
Authority: KR
Inventors: 박종은; 김대황; 서동우; 김동환; 전지수; 이장희
Original assignee: 네이버 주식회사
Priority date: 2020-09-14
Filing date: 2020-09-14
Publication date: 2022-04-01
Also published as: JP7314221B2; KR20220035635A; JP2022048130A

Abstract

Various embodiments relate to an electronic device for detecting a sound source used in multimedia content and an operating method thereof. The device may be configured to divide a fingerprint of the multimedia content into a plurality of search sections according to a preset time interval, detect at least one sound source having a detection section that matches at least one of the search sections, determine location information indicating the time position of the detection section within the multimedia content and the time position of the detection section within the sound source, and provide information related to the sound source together with the location information.

Description

Electronic device for detecting a sound source and an operating method thereof

Various embodiments relate to an electronic device for detecting at least one sound source (audio source) used in multimedia content, and an operating method thereof.

Sound source detection is a technique for detecting a sound source used in multimedia content. In general, a plurality of sound sources are registered in a server, and a fingerprint of each sound source is stored. Using sound source detection, the server detects, from among the registered sound sources, a sound source used in the multimedia content based on a fingerprint of the multimedia content. The server then provides information about the sound source and the start position, within the sound source, of the part used in the multimedia content.

However, such a server performs poorly when detecting a sound source used in multimedia content. Specifically, because the server must compare the fingerprint of the entire multimedia content with the fingerprints of the registered sound sources, its computational load increases and its operating efficiency is low. In addition, it is difficult for the server to accurately detect which part of the sound source is used in the multimedia content.

Various embodiments provide an electronic device capable of efficiently detecting at least one sound source used in multimedia content, and an operating method thereof.

Various embodiments provide an electronic device capable of efficiently detecting the part of a sound source used in multimedia content by using only part of the fingerprint of the multimedia content, and an operating method thereof.

Various embodiments provide an electronic device that can specify the part of a sound source used in multimedia content more accurately by detecting, for that part, not only its time position within the sound source but also its time position within the multimedia content, and an operating method thereof.

Various embodiments provide an electronic device capable of detecting a confidence for a sound source based on the time position within the sound source and the time position within the multimedia content of the part of the sound source used in the multimedia content, and an operating method thereof.

An operating method of an electronic device according to various embodiments may include dividing a fingerprint of multimedia content into a plurality of search sections according to a preset time interval, detecting at least one sound source having a detection section that matches at least one of the search sections, determining location information indicating the time position of the detection section within the multimedia content and the time position of the detection section within the sound source, and providing information related to the sound source and the location information.

A computer program according to various embodiments may be stored in a non-transitory computer-readable recording medium in order to cause the electronic device to execute the operating method.

A non-transitory computer-readable recording medium according to various embodiments has recorded thereon a program for causing the electronic device to execute the operating method.

An electronic device according to various embodiments includes a memory and a processor connected to the memory and configured to execute at least one instruction stored in the memory. The processor may be configured to divide a fingerprint of multimedia content into a plurality of search sections according to a preset time interval, detect at least one sound source having a detection section that matches at least one of the search sections, determine location information indicating the time position of the detection section within the multimedia content and the time position of the detection section within the sound source, and provide information related to the sound source and the location information.

According to various embodiments, the electronic device can efficiently detect at least one sound source used in multimedia content. Specifically, the electronic device can efficiently detect the detection section of the sound source that matches the multimedia content by extending the time range outward from one of the search sections in the fingerprint of the multimedia content. In addition, by detecting not only the time position of the detection section within the sound source but also its time position within the multimedia content, the electronic device can specify the detection section more accurately in both the sound source and the multimedia content. Furthermore, the electronic device can detect a confidence for the sound source by comparing the multimedia content with the sound source based on the offset difference between the time offset of the detection section from the start point of the multimedia content and its time offset from the start point of the sound source. The electronic device can thus provide the user with the confidence as well as the information related to the sound source and the location information.

FIG. 1 is a diagram illustrating an electronic device according to various embodiments.
FIGS. 2, 3A, 3B, 3C, and 3D are diagrams illustrating, by way of example, operating characteristics of the processor of FIG. 1.
FIG. 4 is a diagram illustrating the processor of FIG. 1 in detail.
FIG. 5 is a diagram illustrating an operating method of an electronic device according to various embodiments.
FIG. 6 is a diagram illustrating in detail the step of detecting the confidence of the sound source in FIG. 5.
FIGS. 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17 are diagrams illustrating, by way of example, an operating method of an electronic device according to various embodiments.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating an electronic device 100 according to various embodiments. FIGS. 2, 3A, 3B, 3C, and 3D are diagrams illustrating, by way of example, operating characteristics of the processor 160 of FIG. 1. FIG. 4 is a diagram illustrating the processor 160 of FIG. 1 in detail.

Referring to FIG. 1, the electronic device 100 according to various embodiments may include at least one of a connection terminal 110, a communication module 120, an input module 130, an output module 140, a memory 150, or a processor 160. In some embodiments, at least one of the components of the electronic device 100 may be omitted, and at least one other component may be added. In some embodiments, at least two of the components of the electronic device 100 may be implemented as a single integrated circuit. For example, the electronic device 100 may include at least one of a server, a smartphone, a mobile phone, a navigation device, a computer, a laptop, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a tablet PC, a game console, a wearable device, an Internet of things (IoT) device, a home appliance, a medical device, or a robot.

The connection terminal 110 may physically connect the electronic device 100 to an external device 102. For example, the external device 102 may include another electronic device. To this end, the connection terminal 110 may include at least one connector. For example, the connector may include at least one of an HDMI connector, a USB connector, an SD card connector, or an audio connector.

The communication module 120 may allow the electronic device 100 to communicate with external devices 102 and 104. The communication module 120 may establish a communication channel between the electronic device 100 and the external devices 102 and 104 and communicate with them over that channel. Here, the external devices 102 and 104 may include at least one of a satellite, a base station, or another electronic device. The communication module 120 may include at least one of a wired communication module or a wireless communication module. The wired communication module may be connected to the external device 102 by wire through the connection terminal 110 and communicate over that wired connection. The wireless communication module may include at least one of a short-range communication module or a long-range communication module. The short-range communication module may communicate with the external device 102 using a short-range communication scheme. For example, the short-range communication scheme may include at least one of Bluetooth, WiFi Direct, or infrared data association (IrDA). The long-range communication module may communicate with the external device 104 using a long-range communication scheme. Here, the long-range communication module may communicate with the external device 104 through a network 190. For example, the network 190 may include at least one of a cellular network, the Internet, or a computer network such as a local area network (LAN) or a wide area network (WAN).

The input module 130 may receive a signal to be used by at least one component of the electronic device 100. The input module 130 may include at least one of an input device configured to let a user input a signal directly to the electronic device 100, a sensor device configured to sense the surrounding environment and generate a signal, or a camera module configured to capture an image and generate image data. For example, the input device may include at least one of a microphone, a mouse, or a keyboard. In some embodiments, the sensor device may include at least one of touch circuitry configured to sense a touch or sensor circuitry configured to measure the strength of the force generated by a touch.

The output module 140 may output information. The output module 140 may include at least one of a display module configured to present information visually or an audio module configured to reproduce information audibly. For example, the display module may include at least one of a display, a hologram device, or a projector. As one example, the display module may be combined with at least one of the touch circuitry or the sensor circuitry of the input module 130 and implemented as a touch screen. For example, the audio module may include at least one of a speaker or a receiver.

The memory 150 may store various data used by at least one component of the electronic device 100. For example, the memory 150 may include at least one of a volatile memory or a non-volatile memory. The data may include at least one program and input data or output data related to it. The program may be stored in the memory 150 as software including at least one instruction, and may include, for example, at least one of an operating system, middleware, or an application.

The processor 160 may execute a program in the memory 150 to control at least one component of the electronic device 100. In doing so, the processor 160 may perform data processing or computation, executing instructions stored in the memory 150. The processor 160 may attempt to detect at least one sound source (audio source) used in multimedia content. Here, the multimedia content may consist of at least one of video data or audio data. As one example, the multimedia content consists of video data and audio data, and may include a music video, a video shared over a network, and the like. As another example, the multimedia content consists of audio data, and may be produced as a podcast, by a broadcasting station, and the like. As shown in FIG. 2, at least one sound source may be used in the audio data of the multimedia content, and at least part of each sound source may be used.

According to various embodiments, the processor 160 may detect at least part of a sound source as a detection section 200 corresponding to the multimedia content. An offset difference may be defined for the detection section 200. The offset difference (ΔT_m - ΔT_a) is the difference between the time offset ΔT_m from the start point T_m0 of the multimedia content to the start point T_d0 of the detection section 200 and the time offset ΔT_a from the start point T_a0 of the sound source to the start point T_d0 of the detection section 200. The offset difference (ΔT_m - ΔT_a) may be defined as a single value or as the values within a certain range. As one example, it may be defined as the values within a range centered on the difference between the time offset ΔT_m from the start point T_m0 of the multimedia content and the time offset ΔT_a from the start point T_a0 of the sound source. When the offset difference (ΔT_m - ΔT_a) is defined as the values within a range, different playback speeds of the same sound source can be taken into account.
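
The offset difference defined above can be illustrated with a minimal Python sketch. The class and function names, the use of seconds as the unit, and the `tolerance` parameter that sets the half-width of the range are all assumptions made for illustration; the document does not specify how wide the range is.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DetectionSection:
    """Start of a detection section measured on both timelines, in seconds."""
    offset_in_content: float  # ΔT_m: from content start T_m0 to section start T_d0
    offset_in_source: float   # ΔT_a: from source start T_a0 to section start T_d0


def offset_difference(section: DetectionSection) -> float:
    """Return the single-value offset difference ΔT_m - ΔT_a."""
    return section.offset_in_content - section.offset_in_source


def offset_difference_range(section: DetectionSection,
                            tolerance: float) -> Tuple[float, float]:
    """Return a range centered on ΔT_m - ΔT_a.

    `tolerance` is a hypothetical half-width, in seconds, meant to absorb
    small timing drift such as a slightly different playback speed.
    """
    center = offset_difference(section)
    return (center - tolerance, center + tolerance)
```

For instance, a section that starts 30 seconds into the content and 10 seconds into the sound source has an offset difference of 20 seconds, and with a tolerance of 0.5 seconds the range is 19.5 to 20.5 seconds.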

As a first example, as shown in FIG. 3A, the detection section 200 may be the entire sound source and may be used in part of the multimedia content. Here, since the time offset ΔT_a from the start point T_a0 of the sound source to the start point T_d0 of the detection section 200 is 0, the offset difference (ΔT_m - ΔT_a) may be determined as the time offset ΔT_m from the start point T_m0 of the multimedia content to the start point T_d0 of the detection section 200 (offset difference = ΔT_m). As a second example, as shown in FIG. 3B, the detection section 200 may be part of the sound source and may be used in part of the multimedia content. Here, since the time offset ΔT_m from the start point T_m0 of the multimedia content to the start point T_d0 of the detection section 200 is 0, the offset difference (ΔT_m - ΔT_a) may be determined from the time offset ΔT_a from the start point T_a0 of the sound source to the start point T_d0 of the detection section 200 (offset difference = -ΔT_a). As a third example, as shown in FIG. 3C, the detection section 200 may be part of the sound source and may be used in part of the multimedia content. Here, since the time offset ΔT_a from the start point T_a0 of the sound source to the start point T_d0 of the detection section 200 is 0, the offset difference (ΔT_m - ΔT_a) may be determined as the time offset ΔT_m from the start point T_m0 of the multimedia content to the start point T_d0 of the detection section 200 (offset difference = ΔT_m). As a fourth example, as shown in FIG. 3D, the detection section 200 may be part of the sound source and may be used in the entire multimedia content. Here, since the time offset ΔT_m from the start point T_m0 of the multimedia content to the start point T_d0 of the detection section 200 is 0, the offset difference (ΔT_m - ΔT_a) may be determined from the time offset ΔT_a from the start point T_a0 of the sound source to the start point T_d0 of the detection section 200 (offset difference = -ΔT_a).

According to various embodiments, the processor 160 may detect a confidence for the sound source using the offset difference (ΔT_m - ΔT_a). The confidence indicates how accurately the detected sound source can be said to have been used in the multimedia content: the higher the confidence, the higher the accuracy. Specifically, the processor 160 may align the detection section 200 with the multimedia content based on the offset difference (ΔT_m - ΔT_a), and then compare the multimedia content with the detection section 200 to detect the confidence of the sound source. According to one embodiment, the processor 160 may detect the confidence through bit operations. The processor 160 may calculate bit error rates (BERs) of the detection section 200 by comparing the fingerprint of the multimedia content with the fingerprint of the detection section 200. The processor 160 may convert each bit error rate into a score using a predetermined score function. The processor 160 may then detect the confidence from the sum of the scores using a predetermined confidence function.
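
The bit error rate, score, and confidence chain described above might look like the following Python sketch. The document only says that the score function and the confidence function are predetermined, so `linear_score`, `mean_confidence`, the 0.35 threshold, the 32-bit frame width, and the block-wise BER computation are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence

BITS_PER_FRAME = 32  # assumed width of one fingerprint frame


def bit_error_rate(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of differing bits between two equally long runs of frames."""
    if len(a) != len(b) or not a:
        raise ValueError("blocks must be non-empty and equally long")
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (len(a) * BITS_PER_FRAME)


def block_bers(content: Sequence[int], source: Sequence[int],
               content_start: int, source_start: int,
               length: int, block: int) -> List[float]:
    """BER of each full block of the aligned detection section."""
    bers = []
    for i in range(0, length - block + 1, block):
        c = content[content_start + i:content_start + i + block]
        s = source[source_start + i:source_start + i + block]
        bers.append(bit_error_rate(c, s))
    return bers


def linear_score(ber: float, threshold: float = 0.35) -> float:
    """Hypothetical score function: 1 at BER 0, 0 at or above the threshold."""
    return max(0.0, 1.0 - ber / threshold)


def mean_confidence(score_sum: float, count: int) -> float:
    """Hypothetical confidence function: score sum normalized to [0, 1]."""
    return score_sum / count if count else 0.0


def confidence(bers: Sequence[float],
               score_fn: Callable[[float], float] = linear_score) -> float:
    """Convert BERs to scores, sum them, and map the sum to a confidence."""
    return mean_confidence(sum(score_fn(b) for b in bers), len(bers))
```

The alignment step corresponds to passing `content_start` and `source_start`, whose difference (in frames) plays the role of the offset difference.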

According to various embodiments, as shown in FIG. 4, the processor 160 may include at least one of an application programming interface (API) 461, a process-API 463, a control unit 465, a content acquisition unit 467, a fingerprint unit 469, a matching unit 471, a comparison unit 473, or a clustering unit 475. In some embodiments, at least one of the components of the processor 160 may be omitted, and at least one other component may be added. In some embodiments, at least two of the components of the processor 160 may be implemented as a single integrated circuit.

The API 461 may detect a user's request. The process-API 463 may generate a command based on the user's request. The control unit 465 may control at least one of the components of the processor 160. The control unit 465 may act as an intermediary between at least two of the components of the processor 160 and may perform work on behalf of at least one of them. The content acquisition unit 467 may acquire the multimedia content based on the command. The fingerprint unit 469 may obtain the fingerprint of the multimedia content, for instance by extracting it directly from the audio data of the multimedia content. The matching unit 471 may detect at least one sound source based on the fingerprint of the multimedia content. A plurality of sound sources may be registered in advance in the memory 150, with the fingerprint of each registered sound source stored. The matching unit 471 may detect at least one of the fingerprints of the registered sound sources by matching the fingerprint of the multimedia content against them. The comparison unit 473 may compare the fingerprint of the multimedia content with the fingerprint of the detected sound source to detect the confidence of the detected sound source. Based on the detected sound source, the clustering unit 475 may extend at least one of the comparison targets for the multimedia content or the comparison results for the multimedia content to cover sound sources that are the same as or similar to the detected sound source. Specifically, the clustering unit 475 may obtain information on sound sources that are the same as or similar to the detected sound source and extend the comparison targets for the multimedia content to those sound sources. The clustering unit 475 may also gather sound sources that are the same as or similar to the detected sound source based on the comparison results of the comparison unit 473.

FIG. 5 is a diagram illustrating an operating method of the electronic device 100 according to various embodiments. FIG. 6 is a diagram illustrating in detail the step of detecting the confidence of the sound source (step 550) in FIG. 5. FIGS. 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17 are diagrams illustrating, by way of example, the operating method of the electronic device 100 according to various embodiments.

Referring to FIG. 5, in step 510 the electronic device 100 may divide the fingerprint 710 of the multimedia content into a plurality of search sections 720. Here, the multimedia content may consist of at least one of video data or audio data. As one example, the multimedia content consists of video data and audio data and may include a music video, a video shared over a network, and the like. As another example, the multimedia content consists of audio data and may be produced by a podcast, a broadcasting station, or the like. At least one sound source may be used in the audio data, and at least a portion of each sound source may be included. The processor 160 may acquire the fingerprint 710 of the multimedia content. According to one embodiment, the processor 160 may extract the fingerprint 710 directly from the audio data of the multimedia content. For example, when multimedia content is selected by the user, the processor 160 may extract the fingerprint 710 from the audio data of that content. According to another embodiment, the processor 160 may receive the fingerprint 710 of the multimedia content from the external devices 102 and 104 as a query. Here, the fingerprint may represent the frequency distribution of the audio data over time. As shown in FIG. 7, the processor 160 may divide the fingerprint 710 of the multimedia content into the plurality of search sections 720 according to a preset time interval. As one example, the time interval may be set within a range of several seconds.
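The division in step 510 can be illustrated with a minimal sketch. It assumes, purely for illustration, that a fingerprint is a sequence of integer frames, each covering a fixed slice of time; the function name `split_into_search_sections` and the parameter `frames_per_section` are hypothetical and do not appear in the source.

```python
from typing import List, Sequence, Tuple


def split_into_search_sections(
    fingerprint: Sequence[int],
    frames_per_section: int,
) -> List[Tuple[int, List[int]]]:
    """Split a fingerprint into consecutive search sections.

    Returns (start_frame, frames) pairs. The last section may be shorter
    when the fingerprint length is not a multiple of the section size.
    """
    if frames_per_section <= 0:
        raise ValueError("frames_per_section must be positive")
    return [
        (start, list(fingerprint[start:start + frames_per_section]))
        for start in range(0, len(fingerprint), frames_per_section)
    ]


# Ten frames split into sections of four frames each (assumed sizes).
sections = split_into_search_sections(list(range(10)), 4)
```

Keeping the start frame with each section makes it possible to recover the time position of a matched section within the multimedia content later on.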

In step 520, the electronic device 100 may detect at least one sound source having a detection section 1010 to which at least one of the search sections 720 is matched. A plurality of sound sources may be registered in advance in the memory 150, and the fingerprints 910 of the registered sound sources may each be stored there. As shown in FIG. 8, the processor 160 may compare each of the search sections 720 with the fingerprints 910 of the registered sound sources. In this way, the processor 160 may detect at least one of the fingerprints 910 of the registered sound sources based on one of the search sections 720. The processor 160 may then compare the fingerprint 710 of the multimedia content with the fingerprint 910 of the detected sound source while extending the time range outward from that search section 720, as shown in FIG. 9. Accordingly, as shown in FIG. 10, the processor 160 may detect, in the fingerprint 910 of the detected sound source, the detection section 1010 to which at least one of the search sections 720 is matched.
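The match-then-extend idea of step 520 can be sketched as follows. This is only an illustration under strong assumptions: frames are compared for exact equality (a practical system would likely tolerate bit errors), the range is extended in both directions (the source does not state the direction), and the name `find_detection_section` is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def find_detection_section(
    content_fp: Sequence[int],
    source_fp: Sequence[int],
    section_start: int,
    section_len: int,
) -> Optional[Tuple[int, int, int]]:
    """Locate a search section in a registered fingerprint and extend the match.

    Returns (content_start, source_start, length) in frames, or None.
    """
    section = list(content_fp[section_start:section_start + section_len])
    n = len(section)
    if n == 0:
        return None
    for src_start in range(len(source_fp) - n + 1):
        if list(source_fp[src_start:src_start + n]) != section:
            continue
        # Extend the matched range backwards while frames still agree.
        c_lo, s_lo = section_start, src_start
        while c_lo > 0 and s_lo > 0 and content_fp[c_lo - 1] == source_fp[s_lo - 1]:
            c_lo -= 1
            s_lo -= 1
        # Extend the matched range forwards while frames still agree.
        c_hi, s_hi = section_start + n, src_start + n
        while (c_hi < len(content_fp) and s_hi < len(source_fp)
               and content_fp[c_hi] == source_fp[s_hi]):
            c_hi += 1
            s_hi += 1
        return c_lo, s_lo, c_hi - c_lo
    return None


content = [9, 9, 1, 2, 3, 4, 5, 9]   # hypothetical content fingerprint
source = [7, 1, 2, 3, 4, 5, 8]       # hypothetical registered fingerprint
match = find_detection_section(content, source, section_start=3, section_len=2)
```

Here the two-frame search section `[2, 3]` seeds the match, which then grows to the five-frame run shared by both fingerprints.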

In step 530, the electronic device 100 may determine location information of the detection section 1010 within the multimedia content and within the detected sound source. The location information may indicate the time position of the detection section 1010 within the fingerprint 710 of the multimedia content and the time position of the detection section 1010 within the fingerprint 910 of the detected sound source. Based on the location information of the detection section 1010, the processor 160 may detect an offset difference (ΔT_m - ΔT_a) for the detection section 1010. The offset difference (ΔT_m - ΔT_a) may represent the difference between the time offset (ΔT_m) from the start point (T_m0) of the fingerprint 710 of the multimedia content to the start point (T_d0) of the detection section 1010 and the time offset (ΔT_a) from the start point (T_a0) of the fingerprint 910 of the detected sound source to the start point (T_d0) of the detection section 1010.
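The offset difference is simple arithmetic, shown here as a short sketch. The function name and argument names are hypothetical, and times are assumed to be in seconds; the 42-second position inside the sound source is an invented value used only to make the example concrete.

```python
def offset_difference(
    content_start: float,          # T_m0: start of the content fingerprint
    detection_in_content: float,   # T_d0 measured on the content timeline
    source_start: float,           # T_a0: start of the sound source fingerprint
    detection_in_source: float,    # T_d0 measured on the sound source timeline
) -> float:
    """Return the offset difference (delta_T_m - delta_T_a)."""
    delta_t_m = detection_in_content - content_start
    delta_t_a = detection_in_source - source_start
    return delta_t_m - delta_t_a


# Detection section starts 513 s into the content and 42 s into the source.
shift = offset_difference(0.0, 513.0, 0.0, 42.0)
```

Shifting the sound source fingerprint by this amount places the detection section on top of the matching part of the multimedia content.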

In step 540, the electronic device 100 may align the detection section 1010 with respect to the fingerprint 710 of the multimedia content, based on the offset difference (ΔT_m - ΔT_a) for the detection section 1010. As shown in FIG. 10, the processor 160 may align the detection section 1010 so that the detection section 1010 and the at least one search section 720 matched to it correspond to each other. The processor 160 may align the start point of the detection section 1010 with the start point of the at least one search section 720.

In step 550, the electronic device 100 may compare the fingerprint 710 of the multimedia content with the detection section 1010 to detect the reliability of the detected sound source. The reliability indicates how accurately the detected sound source can be said to have been used in the multimedia content; the higher the reliability, the higher the accuracy. The processor 160 may detect the reliability of the detected sound source by comparing the detection section 1010 with the at least one search section 720 matched to it. According to one embodiment, the processor 160 may detect the reliability of the detected sound source through a bitwise operation between the detection section 1010 and the at least one search section 720. This is described in more detail below with reference to FIG. 6.

Referring to FIG. 6, in step 651 the electronic device 100 may generate a comparison section 1110 through a comparison operation between the fingerprint 710 of the multimedia content and the detection section 1010. As one example, the comparison operation may include an exclusive OR (XOR). As shown in FIG. 11, the processor 160 may generate the comparison section 1110 through the comparison operation between the detection section 1010 and the at least one search section 720 matched to it.
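A minimal sketch of the XOR comparison in step 651 follows. It assumes that each fingerprint frame is stored as an integer bit pattern and that the two sections are already aligned and equal in length; `xor_compare` is a hypothetical name.

```python
from typing import List, Sequence


def xor_compare(search_frames: Sequence[int], detection_frames: Sequence[int]) -> List[int]:
    """XOR two aligned runs of fingerprint frames.

    A set bit in the result marks a position where the two fingerprints differ.
    """
    if len(search_frames) != len(detection_frames):
        raise ValueError("sections must be aligned to the same length")
    return [a ^ b for a, b in zip(search_frames, detection_frames)]


comparison = xor_compare([0b1010, 0b1111], [0b1010, 0b0101])
```

The first frame pair is identical, so its result is 0; the second pair differs in two bit positions, giving `0b1010`.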

In step 653, the electronic device 100 may divide the comparison section 1110 into a plurality of bit sections 1210. As shown in FIG. 12, the processor 160 may divide the comparison section 1110 into the plurality of bit sections 1210 according to a preset time interval. As one example, the time interval may be about 1 second.

In step 655, the electronic device 100 may calculate a bit error rate for each of the bit sections 1210. The processor 160 may treat each bit section 1210 as a run of consecutive bits and calculate the bit error rate from the bits of that bit section 1210. Each bit error rate is expressed as a value between 0 and 1, and the lower the bit error rate, the higher the similarity. The similarity refers to the similarity between the detection section 1010 and the at least one search section 720 matched to it. That is, a bit error rate of 0 may mean that the detection section 1010 and the at least one search section 720 are identical. As one example, when the bit error rates of the bit sections 1210 are calculated as shown in FIG. 13, this may indicate that the detection section 1010 of the detected sound source was used from 513 seconds to 551 seconds of the multimedia content. As another example, when the bit error rates of the bit sections 1210 are calculated as shown in FIG. 14, this may indicate that detection sections 1010 of a plurality of sound sources were used in the multimedia content, and that detection sections 1010 of a plurality of sound sources were used even within the same time range. In this case, the processor 160 may extract the highest bit error rates from among the bit error rates of the detection sections 1010.
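Steps 653 and 655 can be sketched together: the XOR result is cut into bit sections and the fraction of set bits in each section is taken as its bit error rate. The frame width of 32 bits and the names `bit_error_rates`, `frames_per_section`, and `bits_per_frame` are assumptions made for this illustration.

```python
from typing import List, Sequence


def bit_error_rates(
    comparison: Sequence[int],
    frames_per_section: int,
    bits_per_frame: int = 32,   # assumed frame width, not given in the source
) -> List[float]:
    """Split an XOR result into bit sections and return each section's bit error rate."""
    if frames_per_section <= 0:
        raise ValueError("frames_per_section must be positive")
    rates = []
    for start in range(0, len(comparison), frames_per_section):
        chunk = comparison[start:start + frames_per_section]
        # Each set bit in the XOR result is one mismatching bit.
        errors = sum(bin(frame).count("1") for frame in chunk)
        rates.append(errors / (len(chunk) * bits_per_frame))
    return rates


# Two bit sections of two frames each: one perfect, one with half the bits wrong.
rates = bit_error_rates([0, 0, 0xFFFFFFFF, 0], frames_per_section=2)
```

A rate of 0.0 means the two sections are bit-for-bit identical, while a rate near 0.5 is what two unrelated bit patterns would be expected to produce.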

In step 657, the electronic device 100 may convert the bit error rates into scores for the respective bit sections 1210. The processor 160 may convert each bit error rate into a score using a predetermined score function. As shown in FIG. 15, a lower bit error rate may be converted into a higher score, and a higher bit error rate into a lower score. If at least one of the bit error rates exceeds a threshold, the processor 160 may assign a score of 0 to that bit error rate. For the remaining bit error rates, which are at or below the threshold, the processor 160 may calculate a score from each bit error rate and assign the calculated score to it. For example, the score function may be expressed as [Equation 1] below.

Here, x denotes the bit error rate and y denotes the score. The threshold may be greater than 0 and at most 0.5; as one example, it may be at least 0.35 and at most 0.45. The closer the threshold is set to 0, the lower the likelihood of falsely detecting a sound source, while the likelihood of detecting a sound source containing noise may be higher. Conversely, the closer the threshold is set to 0.5, the lower the likelihood of detecting a sound source containing noise, while the likelihood of falsely detecting a sound source may be higher.
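Equation 1 itself is not reproduced in this text, so the exact score function cannot be shown. The sketch below only illustrates the behavior the description states: a score of 0 above the threshold, and a score that rises as the bit error rate falls. The linear form `1 - x / threshold` is an assumption chosen for simplicity, not the patent's formula, and `score` is a hypothetical name.

```python
def score(bit_error_rate: float, threshold: float = 0.4) -> float:
    """Map a bit error rate to a score (assumed linear form, not Equation 1)."""
    if not 0.0 < threshold <= 0.5:
        raise ValueError("threshold must be in (0, 0.5]")
    if bit_error_rate > threshold:
        # Above the threshold the bit section contributes nothing.
        return 0.0
    # Assumed mapping: 1.0 at a bit error rate of 0, falling to 0.0 at the threshold.
    return 1.0 - bit_error_rate / threshold


scores = [score(x) for x in (0.0, 0.2, 0.5)]
```

The default threshold of 0.4 sits inside the 0.35 to 0.45 range given as an example in the description.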

In step 659, the electronic device 100 may detect the reliability of the detected sound source from the sum of the scores. The reliability indicates how accurately the detected sound source can be said to have been used in the multimedia content; the higher the reliability, the higher the accuracy. The processor 160 may detect the reliability from the sum of the scores using a predetermined confidence function. As shown in FIG. 16, the reliability may be expressed as a value between 0 and 1. When the sum of the scores lies within a certain range, the sum strongly affects the reliability, so that a larger sum yields a markedly higher reliability. When the sum of the scores lies outside that range, its effect on the reliability may be reduced. For example, the confidence function may be expressed as [Equation 2] below.

Here, x denotes the sum of the scores and y denotes the reliability. The weight may be determined according to at least one of the threshold of the score function or the reference value for the reliability described later, and may be, for example, at least 0.1 and at most 0.2.
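Equation 2 is likewise not reproduced in this text. The sketch below is purely illustrative: it uses a hyperbolic tangent, an assumed saturating curve that stays between 0 and 1 for non-negative sums, rises quickly over a limited range, and flattens outside it. The actual confidence function may differ, and `confidence` is a hypothetical name.

```python
import math


def confidence(score_sum: float, weight: float = 0.15) -> float:
    """Map a sum of scores to a reliability in [0, 1] (assumed tanh form, not Equation 2)."""
    if score_sum < 0:
        raise ValueError("score_sum must be non-negative")
    # tanh grows almost linearly for small inputs and saturates towards 1.
    return math.tanh(weight * score_sum)


low, mid, high = confidence(0.0), confidence(10.0), confidence(20.0)
```

The default weight of 0.15 is taken from the middle of the 0.1 to 0.2 range mentioned in the description.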

Thereafter, the electronic device 100 may return to FIG. 5 and proceed to step 560.

Referring again to FIG. 5, in step 560 the electronic device 100 may provide information related to the detected sound source, the location information, and the reliability. The information related to the sound source may include at least one of an identifier, a name, or an artist of the sound source. The location information may indicate the time position of the detection section 1010 within the fingerprint 710 of the multimedia content and the time position of the detection section 1010 within the fingerprint 910 of the detected sound source. As shown in FIG. 17, the processor 160 may provide, for the multimedia content, the information related to the detected sound source, the location information, and the reliability. When a plurality of sound sources are detected from the multimedia content, the processor 160 may provide the related information, location information, and reliability of the detected sound sources as a list of sound sources. As one example, the processor 160 may provide the related information, location information, and reliability of a detected sound source regardless of its reliability. As another example, the processor 160 may provide the related information, location information, and reliability of a detected sound source only if its reliability is at or above a reference value. In other words, if the reliability of the detected sound source is below the reference value, the processor 160 may not provide its related information, location information, and reliability. The processor 160 may provide the information related to the detected sound source, the location information, and the reliability as a response to a query from the external devices 102 and 104. According to one embodiment, the processor 160 may transmit the information related to the detected sound source, the location information, and the reliability to the external devices 102 and 104. According to another embodiment, the processor 160 may directly output the information related to the detected sound source, the location information, and the reliability through the output module 140.
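The two reporting options in step 560 (report everything, or report only results at or above a reference value) can be sketched as follows. The `Detection` record, its field names, the function `build_response`, and the sample values are all hypothetical; only 513 seconds comes from the FIG. 13 example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    source_id: str          # identifier of the sound source
    title: str              # name of the sound source
    artist: str
    content_start_s: float  # position of the detection section in the content
    source_start_s: float   # position of the detection section in the sound source
    reliability: float      # value between 0 and 1


def build_response(
    detections: List[Detection],
    reference_value: Optional[float] = None,
) -> List[Detection]:
    """Return the list of detected sound sources to report.

    With no reference value, every detection is reported regardless of
    reliability; otherwise only those at or above the reference value.
    """
    if reference_value is None:
        return list(detections)
    return [d for d in detections if d.reliability >= reference_value]


results = [
    Detection("src-001", "Song A", "Artist A", 513.0, 42.0, 0.93),
    Detection("src-002", "Song B", "Artist B", 120.0, 5.0, 0.31),
]
reported = build_response(results, reference_value=0.5)
```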

According to various embodiments, the user may identify the sound sources used in the multimedia content and make use of them in various ways. As one example, when the multimedia content is a video of a broadcast or a performance, the user may obtain a cue sheet of the multimedia content based on the sound sources used in it. As another example, the user may use the results for copyright protection or royalty settlement of the sound sources used in the multimedia content.

According to various embodiments, after providing the related information, location information, and reliability of the detected sound source, the electronic device 100 may provide various services associated with the detected sound source. According to one embodiment, the processor 160 may provide the detected sound source to the external devices 102 and 104. When the information related to the detected sound source is selected by the external devices 102 and 104, the processor 160 may provide the detected sound source to the external devices 102 and 104. According to another embodiment, the processor 160 may provide other multimedia content associated with the detected sound source. When the information related to the detected sound source is selected by the external devices 102 and 104, the processor 160 may search for other multimedia content based on that information and provide the retrieved multimedia content to the external devices 102 and 104. According to yet another embodiment, the processor 160 may provide additional information associated with the detected sound source. When the information related to the detected sound source is selected by the external devices 102 and 104, the processor 160 may search for additional information based on that information, for example through news or a social network service (SNS), and provide the retrieved additional information to the external devices 102 and 104.

According to various embodiments, the electronic device 100 may efficiently detect at least one sound source used in multimedia content. Specifically, the electronic device 100 may efficiently detect the detection section 1010 within a sound source that matches the multimedia content, by extending the time range outward from one of the search sections 720 in the fingerprint 710 of the multimedia content. In addition, by detecting not only the time position of the detection section 1010 within the sound source but also its time position within the multimedia content, the electronic device 100 may specify the detection section 1010 more precisely within both the sound source and the multimedia content. Furthermore, the electronic device 100 may detect the reliability of the sound source by comparing the multimedia content with the sound source based on the offset difference (ΔT_m - ΔT_a) between the time offset (ΔT_m) of the detection section 1010 from the start point (T_m0) of the multimedia content and its time offset (ΔT_a) from the start point (T_a0) of the sound source. In this way, the electronic device 100 may provide the user not only with the information related to the sound source and the location information, but also with the reliability.

A method of operating the electronic device 100 according to various embodiments may include: dividing the fingerprint 710 of multimedia content into a plurality of search sections 720 according to a preset time interval (step 510); detecting at least one sound source having a detection section 1010 to which at least one of the search sections 720 is matched (step 520); determining location information indicating the time position of the detection section 1010 within the multimedia content and the time position of the detection section 1010 within the sound source (step 530); and providing information related to the sound source and the location information (step 560).

According to various embodiments, the method of operating the electronic device 100 may further include: aligning the detection section 1010 with respect to the fingerprint 710, based on the offset difference (ΔT_m - ΔT_a) between the time offset (ΔT_m) from the start point (T_m0) of the multimedia content and the time offset (ΔT_a) from the start point (T_a0) of the sound source (step 540); and detecting the reliability of the sound source by comparing the fingerprint 710 with the detection section 1010 (step 550).

According to various embodiments, providing the information related to the sound source and the location information (step 560) may include providing the reliability together with the information related to the sound source and the location information.

According to various embodiments, detecting the reliability (step 550) may include: generating a comparison section 1110 through a comparison operation between the fingerprint 710 and the detection section 1010 (step 651); dividing the comparison section 1110 into a plurality of bit sections 1210 (step 653); calculating a bit error rate for each of the bit sections 1210 (step 655); converting the bit error rates into scores for the respective bit sections 1210 (step 657); and detecting the reliability from the sum of the scores (step 659).

According to various embodiments, converting into scores (step 657) may include: if at least one of the bit error rates exceeds a threshold, assigning a score of 0 to that bit error rate; and, if the remaining bit error rates are at or below the threshold, assigning to each of them a score calculated from that bit error rate.

According to various embodiments, providing the information related to the sound source and the location information (step 560) may include, when the multimedia content is associated with a plurality of sound sources, providing the information related to the sound sources and the location information as a list.

According to various embodiments, the multimedia content may consist of at least one of video data or audio data.

According to various embodiments, the method of operating the electronic device 100 may further include at least one of: providing the sound source when the information related to the sound source is selected; or providing other multimedia content associated with the sound source when the information related to the sound source is selected.

According to various embodiments, at least one of the search sections 720 may be matched to a plurality of sound sources.

The electronic device 100 according to various embodiments may include a memory 150 and a processor 160 connected to the memory 150 and configured to execute at least one instruction stored in the memory 150.

According to various embodiments, the processor 160 may be configured to divide the fingerprint 710 of multimedia content into a plurality of search sections 720 according to a preset time interval, detect at least one sound source having a detection section 1010 to which at least one of the search sections 720 is matched, determine location information indicating the time position of the detection section 1010 within the multimedia content and the time position of the detection section 1010 within the sound source, and provide information related to the sound source and the location information.

According to various embodiments, the processor 160 may be configured to align the detection section 1010 with respect to the fingerprint 710, based on the offset difference (ΔT_m - ΔT_a) between the time offset (ΔT_m) from the start point (T_m0) of the multimedia content and the time offset (ΔT_a) from the start point (T_a0) of the sound source, and to detect the reliability of the sound source by comparing the fingerprint 710 with the detection section 1010.

According to various embodiments, the processor 160 may be configured to detect the reliability of the sound source corresponding to the detection section 1010 and to provide the reliability together with the information related to the sound source and the location information.

According to various embodiments, the processor 160 may be configured to generate a comparison section 1110 through a comparison operation between the fingerprint 710 and the detection section 1010, divide the comparison section 1110 into a plurality of bit sections 1210, calculate a bit error rate for each of the bit sections 1210, convert the bit error rates into scores for the respective bit sections 1210, and detect the reliability from the sum of the scores.

According to various embodiments, the processor 160 may be configured to assign a score of 0 to any bit error rate that exceeds a threshold and, for the remaining bit error rates at or below the threshold, to assign to each a score calculated from that bit error rate.

According to various embodiments, the processor 160 may be configured to provide the information related to the sound sources and the location information as a list when the multimedia content is associated with a plurality of sound sources.

According to various embodiments, the processor 160 may be configured to provide at least one of the sound source or other multimedia content associated with the sound source when the information related to the sound source is selected.

The device described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications executed on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, or computer storage medium or device, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to various embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The medium may store a computer-executable program persistently, or may store it temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and media configured to store program instructions, including ROM, RAM, and flash memory. Other examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software.

The various embodiments of this document and the terms used in them are not intended to limit the technology described in this document to a specific embodiment, and should be understood to include various modifications, equivalents, and/or alternatives of the embodiments. In the description of the drawings, similar reference numerals may be used for similar components. A singular expression may include the plural unless the context clearly indicates otherwise. In this document, expressions such as “A or B”, “at least one of A and/or B”, “A, B, or C”, or “at least one of A, B, and/or C” may include all possible combinations of the items listed together. Expressions such as “1st”, “2nd”, “first”, or “second” may modify the corresponding components regardless of order or importance; they are used only to distinguish one component from another and do not limit the components. When a (e.g., first) component is described as being “(functionally or communicatively) connected” or “coupled” to another (e.g., second) component, the first component may be connected to the other component directly or through yet another component (e.g., a third component).

As used in this document, the term “module” includes a unit composed of hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed component, or a minimum unit, or part of a unit, that performs one or more functions. For example, a module may be implemented as an application-specific integrated circuit (ASIC).

According to various embodiments, each of the described components (e.g., a module or a program) may include a single entity or a plurality of entities. According to various embodiments, one or more of the components or steps described above may be omitted, or one or more other components or steps may be added. Alternatively or additionally, a plurality of components (e.g., modules or programs) may be integrated into a single component. In that case, the integrated component may perform one or more functions of each of the plurality of components in the same or a similar manner as the corresponding component performed them before the integration. According to various embodiments, the steps performed by a module, program, or other component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the steps may be executed in a different order or omitted; or one or more other steps may be added.

Claims

1. A method of operating an electronic device, the method comprising:
dividing a fingerprint of multimedia content into a plurality of search sections according to a preset time interval;
detecting at least one sound source having a detection section that matches at least one of the search sections;
determining location information indicating a time position of the detection section within the multimedia content and a time position of the detection section within the sound source; and
providing information related to the sound source and the location information to a user.
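The steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: it assumes a fingerprint is a list of integer sub-fingerprints (one per frame), that each frame covers a hypothetical `FRAME_SEC` seconds, and that matching is done by exact lookup in a dictionary index, whereas a practical system would tolerate bit errors. The names `Match`, `split_sections`, `build_index`, and `detect` are invented for illustration.

```python
from dataclasses import dataclass

FRAME_SEC = 0.5  # assumed duration covered by one sub-fingerprint (hypothetical)


@dataclass(frozen=True)
class Match:
    source_id: str
    content_time: float  # time position of the detection section in the content
    source_time: float   # time position of the detection section in the sound source


def split_sections(fingerprint, frames_per_section):
    """Divide a fingerprint into consecutive search sections of a preset length."""
    last_start = len(fingerprint) - frames_per_section
    return [
        (start, tuple(fingerprint[start:start + frames_per_section]))
        for start in range(0, last_start + 1, frames_per_section)
    ]


def build_index(sources, frames_per_section):
    """Map every section-length window of each registered source to its position."""
    index = {}
    for source_id, fp in sources.items():
        for pos in range(len(fp) - frames_per_section + 1):
            window = tuple(fp[pos:pos + frames_per_section])
            index.setdefault(window, []).append((source_id, pos))
    return index


def detect(content_fp, index, frames_per_section):
    """Return one Match per (search section, sound source) pair that matches exactly."""
    matches = []
    for start, section in split_sections(content_fp, frames_per_section):
        for source_id, pos in index.get(section, []):
            matches.append(Match(source_id, start * FRAME_SEC, pos * FRAME_SEC))
    return matches


# Toy data: two registered sources and a content item that reuses parts of both.
sources = {"song_a": [1, 2, 3, 4, 5, 6], "song_b": [9, 8, 7, 6]}
content = [0, 0, 3, 4, 8, 7]
matches = detect(content, build_index(sources, 2), 2)
# matches == [Match("song_a", 1.0, 1.0), Match("song_b", 2.0, 0.5)]
```

Because the index stores a list of positions per window, a single search section can map to several sound sources, and each resulting `Match` carries both time positions that make up the location information.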

2. The method of claim 1, further comprising:
aligning the detection section with the fingerprint based on an offset difference between a time offset from a start point of the multimedia content and a time offset from a start point of the sound source; and
detecting a reliability of the sound source by comparing the fingerprint with the detection section.
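One way to picture the alignment step is the following Python sketch. It assumes both offsets are expressed as frame indices and that the fingerprints are lists of integer sub-fingerprints; the function name `align` and this representation are illustrative assumptions, not taken from the specification.

```python
def align(content_fp, source_fp, content_offset, source_offset):
    """Return the overlapping (content, source) frames after shifting the
    source fingerprint by the offset difference."""
    # Source frame j lines up with content frame j + shift.
    shift = content_offset - source_offset
    start = max(0, shift)
    end = min(len(content_fp), shift + len(source_fp))
    if end <= start:
        return [], []  # no overlap after alignment
    return content_fp[start:end], source_fp[start - shift:end - shift]


# The source starts three frames into the content.
content_part, source_part = align([7, 7, 7, 1, 2, 3], [1, 2, 3, 4], 3, 0)
# content_part == [1, 2, 3] and source_part == [1, 2, 3]
```

The two returned lists have equal length, so they can be compared frame by frame when the reliability is evaluated.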

3. The method of claim 2, wherein providing the information related to the sound source and the location information comprises:
providing the reliability together with the information related to the sound source and the location information.

4. The method of claim 2, wherein detecting the reliability comprises:
generating a comparison section through a comparison operation between the fingerprint and the detection section;
dividing the comparison section into a plurality of bit sections;
calculating a bit error rate for each of the bit sections;
converting each of the bit error rates into a score for the corresponding bit section; and
detecting the reliability from a sum of the scores.
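The first three of these steps can be sketched in Python as follows. The sketch assumes that the comparison operation is a bitwise XOR of aligned integer sub-fingerprints and that each bit section spans a fixed number of frames; neither detail is stated in the claim, and the function name `bit_error_rates` is hypothetical.

```python
def bit_error_rates(content_frames, source_frames, frames_per_bit_section, bits=32):
    """XOR aligned frames, split the result into bit sections, and return
    the bit error rate (fraction of differing bits) of each section."""
    # Comparison section: a set bit marks a position where the two differ.
    comparison = [c ^ s for c, s in zip(content_frames, source_frames)]
    rates = []
    for i in range(0, len(comparison), frames_per_bit_section):
        chunk = comparison[i:i + frames_per_bit_section]
        errors = sum(bin(x).count("1") for x in chunk)
        rates.append(errors / (len(chunk) * bits))
    return rates


# Four 4-bit frames split into two bit sections of two frames each.
rates = bit_error_rates(
    [0b1111, 0b0000, 0b1010, 0b0101],
    [0b1111, 0b0001, 0b0000, 0b0101],
    frames_per_bit_section=2,
    bits=4,
)
# rates == [0.125, 0.25]
```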

5. The method of claim 4, wherein converting the bit error rates into the scores comprises:
when at least one of the bit error rates exceeds a threshold, assigning 0 as the score corresponding to the at least one bit error rate; and
when the remaining bit error rates are less than or equal to the threshold, assigning scores calculated according to the remaining bit error rates.
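The thresholded conversion can be sketched in Python as follows. The claim does not say how a score is calculated from a bit error rate at or below the threshold, so the sketch assumes the simple mapping `1 - rate`; the threshold value `0.35` and the function names are likewise hypothetical.

```python
def scores_from_rates(rates, threshold=0.35):
    """Assign 0 to bit sections whose error rate exceeds the threshold and
    an assumed score of (1 - rate) to the remaining sections."""
    return [0.0 if rate > threshold else 1.0 - rate for rate in rates]


def reliability(rates, threshold=0.35):
    """Reliability taken as the sum of the per-section scores."""
    return sum(scores_from_rates(rates, threshold))


scores = scores_from_rates([0.125, 0.25, 0.5])
# scores == [0.875, 0.75, 0.0]; reliability([0.125, 0.25, 0.5]) == 1.625
```

Zeroing the score of a section above the threshold keeps a heavily mismatched stretch from contributing to the sum, so only sections that plausibly match raise the reliability.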

6. The method of claim 1, wherein providing the information related to the sound source and the location information comprises:
when the multimedia content is associated with a plurality of sound sources, providing the information related to the sound sources and the location information as a list.

7. The method of claim 1, wherein the multimedia content consists of at least one of image data or audio data.

8. The method of claim 1, further comprising at least one of:
providing the sound source when the information related to the sound source is selected; or
providing other multimedia content related to the sound source when the information related to the sound source is selected.

9. The method of claim 1, wherein at least one of the search sections matches a plurality of sound sources.

10. A computer program stored in a non-transitory computer-readable recording medium for executing, on the electronic device, the method of any one of claims 1 to 9.

11. A non-transitory computer-readable recording medium on which a program for executing, on the electronic device, the method of any one of claims 1 to 9 is recorded.

12. An electronic device comprising:
a memory; and
a processor coupled to the memory and configured to execute at least one instruction stored in the memory,
wherein the processor is configured to:
divide a fingerprint of multimedia content into a plurality of search sections according to a preset time interval;
detect at least one sound source having a detection section that matches at least one of the search sections;
determine location information indicating a time position of the detection section within the multimedia content and a time position of the detection section within the sound source; and
provide information related to the sound source and the location information to a user.

13. The device of claim 12, wherein the processor is configured to:
align the detection section with the fingerprint based on an offset difference between a time offset from a start point of the multimedia content and a time offset from a start point of the sound source; and
detect a reliability of the sound source by comparing the fingerprint with the detection section.

14. The device of claim 12, wherein the processor is configured to:
detect a reliability of the sound source corresponding to the detection section; and
provide the reliability together with the information related to the sound source and the location information.

15. The device of claim 14, wherein the processor is configured to:
generate a comparison section through a comparison operation between the fingerprint and the detection section;
divide the comparison section into a plurality of bit sections;
calculate a bit error rate for each of the bit sections;
convert each of the bit error rates into a score for the corresponding bit section; and
detect the reliability from a sum of the scores.

16. The device of claim 15, wherein the processor is configured to:
when at least one of the bit error rates exceeds a threshold, assign 0 as the score corresponding to the at least one bit error rate; and
when the remaining bit error rates are less than or equal to the threshold, assign scores calculated according to the remaining bit error rates.

17. The device of claim 12, wherein the processor is configured to:
when the multimedia content is associated with a plurality of sound sources, provide the information related to the sound sources and the location information as a list.

18. The device of claim 12, wherein the multimedia content consists of at least one of image data or audio data.

19. The device of claim 12, wherein the processor is configured to:
provide at least one of the sound source or other multimedia content related to the sound source when the information related to the sound source is selected.

20. The device of claim 12, wherein at least one of the search sections matches a plurality of sound sources.