KR102608935B1

KR102608935B1 - Method and apparatus for providing real-time audio mixing service based on user information

Info

Publication number: KR102608935B1
Application number: KR1020230045535A
Authority: KR
Inventors: 김태형; 김근형; 이종필; 금상은
Original assignee: 뉴튠(주)
Priority date: 2023-04-06
Filing date: 2023-04-06
Publication date: 2023-12-04

Abstract

A method of providing an audio mixing service to a user terminal through a processor may include: an information receiving step of receiving sound source information and mixing information about the sound source information; an audio generation step of selecting, based on the mixing information, audio blocks corresponding to at least one stem item included in the sound source information, and combining the audio information contained in the selected audio blocks into one session audio; and a mixed sound source playback step of generating a mixed sound source based on the session audio and playing the generated mixed sound source.

Description

Method and apparatus for providing real-time audio mixing service based on user information

The present invention relates to a method and apparatus for providing a real-time audio mixing service based on user information. More specifically, it relates to a technique that uses user information, including information input by the user or information about the user's surrounding environment, as mixing information, and selects and mixes stem data accordingly to provide real-time mixed audio.

With advances in technology, large-capacity media can now be digitized and converted into low-capacity media. Today, users can store various kinds of media on portable user terminal devices and easily select and enjoy the media they want while on the move. In addition, media digitized through digital compression technology can be shared between users over a network, which has driven rapid growth in online media services, and many related applications and programs are being developed.

Music accounts for a considerable share of this vast body of media. Compared with other media types, music files are small and impose a low communication load, so real-time streaming is easy to support, and satisfaction is high among both service providers and users. Accordingly, services that provide online music to users in various ways are now emerging.

Conventional online music services merely delivered sound sources in real time to connected users, for example by providing the sound source to the user terminal device or by streaming it. More recently, services have used big data or artificial intelligence technology to recommend media that users are likely to prefer.

However, current recommendation in online music services builds a music chart by simply counting the sound sources that users have purchased, listened to, or searched for, and recommends music based on that chart. Because this is a statistical criterion based on raw access counts, it ignores the diversity and variability of user preferences. Moreover, because the recommendations rest on accumulated access counts, the chart changes little, so previously recommended music largely overlaps with currently recommended music, which greatly reduces the usefulness of the recommendations.

In addition, users sometimes want to hear the same piece of music in different versions according to their tastes. At present, unless the company distributing the sound source releases another version, the user cannot listen to a version of the music with a different feel.

Korean Patent Publication No. 10-2015-0084133 (published 2015.07.22.) - 'Pitch recognition using sound interference phenomenon and scale notation method using the same'
Korean Patent No. 10-1696555 (2019.06.05.) - 'Text location search system and method through voice recognition in image or geographic information'

Accordingly, the method and apparatus for providing a real-time audio mixing service based on user information according to an embodiment were devised to solve the problems described above. Their purpose is to provide a method and apparatus that can generate a new mixed sound source by mixing sound sources to suit the user's taste or based on the user's surrounding environment.

More specifically, the purpose is to provide an audio mixing service that generates mixing information from music characteristic information and music tag information input by the user and from information about the user's surrounding environment collected by sensors, and that uses the generated mixing information to mix audio freely.

A method of providing an audio mixing service according to an embodiment is a method of providing an audio mixing service to a user terminal through a processor, and may include: an information receiving step of receiving sound source information and mixing information about the sound source information; an audio generation step of selecting, based on the mixing information, audio blocks corresponding to at least one stem item included in the sound source information, and combining the audio information contained in the selected audio blocks into one session audio; and a mixed sound source playback step of generating a mixed sound source based on the session audio and playing the generated mixed sound source.
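
The selection and combination steps described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the `AudioBlock` class, the `select_blocks` and `combine` functions, the version labels, and the choice to represent mixing information as a stem-to-version mapping and to combine blocks by summing samples are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioBlock:
    """Hypothetical audio block: one stem of one audio version."""
    version: str    # audio version the block belongs to (assumed label)
    stem: str       # stem item, e.g. "rhythm" or "bass"
    samples: tuple  # mono samples as floats


def select_blocks(blocks, mixing_info):
    """Pick one block per stem; mixing_info maps stem -> desired version (assumed form)."""
    chosen = []
    for stem, version in mixing_info.items():
        for block in blocks:
            if block.stem == stem and block.version == version:
                chosen.append(block)
                break
    return chosen


def combine(blocks):
    """Sum the selected blocks sample by sample into one session audio."""
    if not blocks:
        return []
    length = max(len(b.samples) for b in blocks)
    session = [0.0] * length
    for b in blocks:
        for i, s in enumerate(b.samples):
            session[i] += s
    return session


library = [
    AudioBlock("original", "rhythm", (0.5, 0.5, 0.0)),
    AudioBlock("acoustic", "rhythm", (0.25, 0.0, 0.25)),
    AudioBlock("original", "bass", (0.25, 0.25, 0.25)),
]
mixing_info = {"rhythm": "acoustic", "bass": "original"}
session_audio = combine(select_blocks(library, mixing_info))
```

A real implementation would likely operate on decoded audio buffers and handle clipping and differing sample rates; the sketch only shows the select-then-combine flow.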

The mixing information may include at least one of music characteristic information and music tag information input by the user.

The music tag information may include at least one of genre information, mood information, and brightness information.

The music tag information may be input directly by the user, or may be generated based on the user's voice information and the user's conversation with a chatbot.

The mixing information may include information generated based on the user's surrounding environment information collected from sensors of the user terminal.

The user's surrounding environment information may include at least one of the user's location information, weather information, temperature information, and movement information.
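
As a rough illustration of how surrounding environment information might be turned into mixing information, the following Python sketch maps a few environment fields to music tags. The patent does not specify any mapping rules; the `Environment` class, the `environment_to_tags` function, the tag values, and the 20-degree threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Environment:
    """Hypothetical container for sensor-derived environment information."""
    weather: Optional[str] = None          # e.g. "rain" or "clear"
    temperature_c: Optional[float] = None  # degrees Celsius
    moving: Optional[bool] = None          # whether the user is in motion


def environment_to_tags(env):
    """Derive music tags (genre, mood, brightness) using assumed rules."""
    tags = {}
    if env.weather == "rain":
        tags["mood"] = "calm"
    elif env.weather == "clear":
        tags["mood"] = "upbeat"
    if env.temperature_c is not None:
        # Assumed threshold, chosen only for illustration.
        tags["brightness"] = "bright" if env.temperature_c >= 20 else "dark"
    if env.moving:
        tags["genre"] = "dance"
    return tags


tags = environment_to_tags(Environment(weather="rain", temperature_c=12.0, moving=False))
```

Fields that are not available simply contribute no tag, which matches the "at least one of" wording used for the environment information.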

The audio generation step may further include: generating a plurality of session audios; arranging the order of the plurality of session audios; adjusting at least one of the volume, speed, key, effects, and playback start point of the plurality of session audios; and generating one mixed sound source from the adjusted session audios.
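
The ordering, adjustment, and merging of several session audios can be sketched as follows. This is an assumption-laden illustration, not the patented implementation: `SessionAudio`, `render`, and `build_mix` are hypothetical names, only volume and playback start point are modeled, and speed, key, and effects are left out because they require signal processing beyond a short example.

```python
from dataclasses import dataclass


@dataclass
class SessionAudio:
    """Hypothetical session audio with two adjustable properties."""
    name: str
    samples: list
    volume: float = 1.0  # linear gain
    start: int = 0       # playback start point, in samples


def render(session):
    """Apply the start point, then the volume, to one session audio."""
    return [s * session.volume for s in session.samples[session.start:]]


def build_mix(sessions, order):
    """Arrange the session audios in the given order and join them into one mix."""
    by_name = {s.name: s for s in sessions}
    mix = []
    for name in order:
        mix.extend(render(by_name[name]))
    return mix


intro = SessionAudio("intro", [1.0, 1.0, 1.0], volume=0.5, start=1)
verse = SessionAudio("verse", [0.5, 0.25])
mix = build_mix([intro, verse], order=["verse", "intro"])
```

Here the session audios are joined end to end; whether an actual system concatenates, crossfades, or overlaps them is not stated in the text above.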

An apparatus for providing an audio mixing service according to an embodiment may include: a communication module that receives sound source information and mixing information about the sound source information; a memory module that stores audio block information corresponding to at least one stem item preset for at least one audio version of the sound source information; a mixing information generation module that generates, as the mixing information, user information including at least one of information input by the user and the user's surrounding environment information collected from sensors; and a mixing audio generation module that selects audio blocks based on the mixing information, combines the audio information contained in the selected audio blocks into one session audio, and generates a mixed sound source based on the session audio.

The mixing audio generation module may generate a plurality of session audios, arrange their order, adjust at least one of the volume, speed, key, effects, and playback start point of the session audios, and generate one mixed sound source from the adjusted session audios.

The user's surrounding environment information may include at least one of the user's location information, weather information, temperature information, and movement information.

The method and apparatus for providing a real-time audio mixing service based on user information according to an embodiment have the advantage that users can actively mix and produce audio to their own taste, and that the music tag information input by the user is used as mixing information, providing an audio streaming service with improved user convenience.

The method and apparatus for providing a real-time audio mixing service based on user information according to an embodiment have the advantage that the user can adjust the sound source even while it is playing, and the adjusted mixed sound source can be output in real time.

The method and apparatus for providing a real-time audio mixing service based on user information according to an embodiment have the advantage that the user can directly adjust various musical characteristics of the session audio, providing an audio streaming service that reflects the user's tastes more fully.

FIG. 1 is a diagram illustrating part of the configuration of a system for providing a real-time audio mixing service based on user information according to an embodiment of the disclosed invention.
FIG. 2 is a diagram illustrating part of the configuration of an apparatus for providing a real-time audio mixing service based on user information according to an embodiment of the disclosed invention.
FIG. 3 is a diagram illustrating a method of providing a real-time audio mixing service based on user information according to an embodiment of the disclosed invention.
FIG. 4 is a diagram illustrating, in a method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, how music characteristic information input by the user is generated as mixing information and how a mixed sound source is generated based on that mixing information.
FIG. 5 is a diagram illustrating, in a method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, how music tag information input by the user is generated as mixing information and how a mixed sound source is generated based on that mixing information.
FIG. 6 is a diagram illustrating, in a method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, how the user's surrounding environment information is generated as mixing information and how a mixed sound source is generated based on that mixing information.
FIG. 7 is a diagram illustrating, in a method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, how a real-time mixed sound source is generated by adjusting session audios individually.
FIG. 8 is a diagram illustrating a multi-client management method of an apparatus for providing a real-time audio mixing service according to an embodiment of the disclosed invention.
FIG. 9 is a diagram illustrating an interface screen displayed on a user terminal when a mixing audio service actually implemented by applying the disclosed invention is provided.

Hereinafter, embodiments of the present invention are described with reference to the accompanying drawings. In assigning reference numerals to the components in each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the embodiments, detailed descriptions of related known configurations or functions are omitted where they would obscure understanding of the embodiments. Although embodiments of the present invention are described below, the technical idea of the present invention is not limited to them and may be modified and implemented in various ways by those skilled in the art.

The terms used in this specification are used to describe the embodiments and are not intended to restrict or limit the disclosed invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this specification, terms such as "comprise," "include," or "have" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and do not preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "indirectly connected" with another element in between. Terms containing ordinal numbers, such as "first" and "second," may be used to describe various components, but the components are not limited by these terms.

Below, embodiments of the present invention are described in detail with reference to the accompanying drawings so that a person of ordinary skill in the art to which the invention pertains can readily practice them. Parts unrelated to the description are omitted from the drawings so that the invention is described clearly.

Although the title of the present invention is 'Method and apparatus for providing real-time audio mixing service based on user information', for convenience the 'apparatus for providing a real-time audio mixing service based on user information' is hereinafter abbreviated as the 'audio mixing service providing device', and the 'method of providing a real-time audio mixing service based on user information' is abbreviated as the 'audio mixing service providing method'.

FIG. 1 is a diagram illustrating part of the configuration of a system for providing a real-time audio mixing service based on user information according to an embodiment of the disclosed invention.

Referring to FIG. 1, an audio mixing service providing system according to an embodiment may include an audio mixing service providing device 200 that provides an audio mixing service to user terminals 300A, 300B, and 300C, and the user terminals 300A, 300B, and 300C, which display on their displays the audio mixing interface received from the audio mixing service providing device 200.

As shown in the drawing, there may be a plurality of user terminals 300A, 300B, and 300C.

The audio mixing service providing device 200 may generate an interface through which the user can mix and edit, to the user's taste, audio stored in the user terminals 300A, 300B, and 300C or in an external server (not shown) linked to the user terminals 300A, 300B, and 300C. It may provide the generated interface to the user through the user terminals 300A, 300B, and 300C, receive user information, and generate mixed audio based on the user information. Details are described later.

The audio mixing service providing device 200 may be implemented as a server so that it can generate the audio mixing interface, transmit the generated interface to the user terminals 300A, 300B, and 300C, and transmit the mixed audio produced by mixing audio to the user terminals 300A, 300B, and 300C.

A server in the present invention means a server in the ordinary sense: computer hardware on which a program runs, which may monitor or control the entire network (for example, printer control or file management), connect to other networks through a mainframe or public network, and support the sharing of software resources such as data, programs, and files and of hardware resources such as modems, fax machines, printers, and other equipment. The user terminals 300A, 300B, and 300C may display the audio mixing service provided by the audio mixing service providing device 200 on their displays using a specific program or application installed on them.

Although FIG. 1 has been described on the basis that the audio mixing service providing device 200 is implemented as a server and the user receives from the server an interface for mixing and editing audio, embodiments of the present invention are not limited to a server implementation; the audio mixing service providing device 200 may instead be implemented as a user terminal 300A, 300B, or 300C.

When the audio mixing service providing device 200 is implemented as a user terminal 300A, 300B, or 300C, the processor included in the user terminal may directly generate the audio mixing interface screen and display it on the display of the user terminal.

Specifically, the user terminals 300A, 300B, and 300C include a processor capable of generating the mixing interface screen provided by the audio mixing service providing device 200; the processor generates the audio mixing interface screen and presents it to the user through the display of the user terminal. The user can therefore use the audio mixing interface screen to input at least one of music characteristic information, edited to the user's taste for the audio to be mixed, and music tag information relating to the genre and mood of the sound source, and transmit it to the audio mixing service providing device 200.

Accordingly, the user terminals 300A, 300B, and 300C may be implemented as any of various terminal devices that include a processor so that this algorithm can be realized, for example, as shown in FIG. 1, a PC (personal computer, 300A), a smart pad (300B), or a notebook computer (300C). Although not shown in the drawings, the user device 300 may also be implemented as any kind of handheld wireless communication device, such as a PDA (Personal Digital Assistant) terminal, a WiBro (Wireless Broadband Internet) terminal, a smartphone, a tablet PC, a smart watch, smart glasses, or another wearable device.

FIG. 2 is a diagram illustrating part of the configuration of an apparatus 200 for providing a real-time audio mixing service based on user information according to an embodiment of the disclosed invention.

Referring to FIG. 2, the audio mixing service providing device 200 according to an embodiment may include a communication module 210, an audio mixing screen generation module 220, a mixing information generation module 230, a mixing audio generation module 240, a playlist generation module 250, and a memory module 260.

For convenience of explanation, FIG. 2 shows the communication module 210, the audio mixing screen generation module 220, the mixing information generation module 230, the mixing audio generation module 240, and the playlist generation module 250 as separate components. However, embodiments of the present invention are not limited to such independent components; these modules may be implemented together as a single processing module that acts as a processor.

When the audio mixing service providing device 200 is implemented as a device such as a server, the communication module 210 may communicate wirelessly with the user terminals 300A, 300B, and 300C and with an external server (not shown) that stores audio data and the like. It may transmit to the user terminals 300A, 300B, and 300C the audio mixing interface that the audio mixing screen generation module 220 and the mixing audio generation module 240 generate based on audio data received from at least one of the user terminals 300A, 300B, and 300C and the external server.

In another embodiment of the present invention, when the audio mixing service providing device 200 is implemented as a user terminal 300A, 300B, or 300C, the communication module 210 of the audio mixing service providing device 200 may receive data that the user has stored in advance on an external server, or audio data stored in advance by the company operating the external server, and the received audio data may be stored in the memory module 260.

The audio mixing screen generation module 220 may generate the various screens or panels shown on the displays of the user terminals 300A, 300B, and 300C and display the generated screens there. A panel, in the present invention, means a portion of the interface into which the content shown on the display screen is divided according to the nature of that content. Accordingly, a plurality of panels may be generated according to the nature of the content, and the generated panels may be displayed on the display screen simultaneously.

The size of each panel may be adjusted automatically according to the number of panels generated, and may also be made smaller or larger by user manipulation.

The audio mixing screen generation module 220 according to the present invention may generate screens of different kinds and display them on the displays of the user terminals 300A, 300B, and 300C.

Specifically, when the user launches audio to be mixed, the audio mixing screen generation module 220 may load at least one audio version stored in advance for that audio, generate, for each loaded version, audio blocks corresponding to at least one preset stem item, and display an audio block screen containing the generated audio blocks on the display of the user device.

The audio referred to here is music data containing everything in the music we ordinarily listen to, such as the vocals and the accompaniment. A stem is data obtained by classifying the individual audio tracks that make up a piece of music according to their register and function, and then composing each class into a single audio track.

Specifically, a sound source constituting audio is a single result in which a human vocal and the sounds of various instruments are combined, and a stem here refers to the data for a single item constituting that sound source. For example, stem types may include a rhythm stem, a bass stem, a mid stem, a high stem, an FX stem, and a melody stem.
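The version, stem, and audio block relationship described above can be pictured as a small data model. The following is only an illustrative sketch: the class names, the version labels, and the representation of audio as a tuple of float samples are assumptions made for the example and do not appear in the specification.

```python
from dataclasses import dataclass
from enum import Enum


class Stem(Enum):
    # Stem types named in the description.
    RHYTHM = "rhythm"
    BASS = "bass"
    MID = "mid"
    HIGH = "high"
    FX = "fx"
    MELODY = "melody"


@dataclass(frozen=True)
class AudioBlock:
    version: str    # hypothetical version label, e.g. "original"
    stem: Stem
    samples: tuple  # mono samples as floats (assumed representation)


def build_block_grid(versions):
    """Index blocks by (version, stem), as an audio block screen might lay them out."""
    grid = {}
    for version, stems in versions.items():
        for stem, samples in stems.items():
            grid[(version, stem)] = AudioBlock(version, stem, tuple(samples))
    return grid


# Two hypothetical versions of one sound source.
grid = build_block_grid({
    "original": {Stem.RHYTHM: [0.1, 0.2], Stem.BASS: [0.0, 0.3]},
    "remix": {Stem.RHYTHM: [0.5, 0.5]},
})
```

Keying the grid by (version, stem) mirrors the idea that each loaded version contributes one block per stem item.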

In addition, when one of the audio blocks shown on the audio mixing screen has been selected by the user or by the mixed audio generation module 240, the audio mixing screen generation module 220 may display the selected block in a shade different from the other audio blocks, or may display a check mark only on the selected block, and may display the generated waveform information on the audio mixing screen.

The mixing information generation module 230 may generate, as mixing information, user information that includes at least one of information input by the user and information about the user's surrounding environment collected from sensors.

Based on the mixing information generated by the mixing information generation module 230, the mixed audio generation module 240 may select audio blocks corresponding to at least one stem item included in the sound source information, and may combine the audio information contained in the selected audio blocks to generate one session audio.

When the user has finished selecting audio blocks, or when an artificial intelligence server has finished selecting audio blocks based on the user information, the mixed audio generation module 240 may combine the audio information contained in the selected audio blocks to generate one session audio.
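One way to picture "combining the audio information contained in the selected audio blocks" is a sample-wise sum of the selected stems. The specification does not state how the combination is performed, so the summing and clipping below are assumptions, as are the function name and the toy sample values.

```python
def combine_blocks(blocks):
    """Sum mono sample lists sample by sample and clip the result to [-1.0, 1.0]."""
    if not blocks:
        return []
    length = max(len(b) for b in blocks)
    mixed = []
    for i in range(length):
        # Shorter blocks simply contribute nothing past their end.
        total = sum(b[i] for b in blocks if i < len(b))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed


# Hypothetical selection: one block per chosen stem.
selected = {
    "rhythm": [0.5, 0.5, 0.5],
    "bass": [0.25, -0.25],
    "melody": [0.5, 0.0, -0.75],
}
session_audio = combine_blocks(list(selected.values()))
```

Clipping is the crudest possible guard against overload; a real mixer would more likely scale gains before summing.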

Session audio, as used in the present invention, may refer to one part of a piece of audio divided into fixed time units. The audio may be divided into session audios at equal time intervals, or it may be divided at the points where its characteristics change, taking the overall flow of the audio into account.

Accordingly, the criterion for dividing the audio into session audios may be based on information set and stored in advance by the music producer, or may be changed freely by user manipulation.
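The two division criteria described above (equal time units, or boundaries where the character of the audio changes) can be sketched as two small functions. This is an illustration only: the function names are hypothetical, and the change points are assumed to be supplied by a producer or a user rather than detected automatically.

```python
def split_equal(total_seconds, session_seconds):
    """Return (start, end) pairs covering the track in equal time units."""
    if session_seconds <= 0:
        raise ValueError("session_seconds must be positive")
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + session_seconds, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds


def split_at(total_seconds, change_points):
    """Return (start, end) pairs cut at the given change points (in seconds)."""
    inner = sorted(p for p in change_points if 0.0 < p < total_seconds)
    cuts = [0.0] + inner + [total_seconds]
    return list(zip(cuts, cuts[1:]))


equal_sessions = split_equal(10.0, 4.0)
custom_sessions = split_at(10.0, [7.0, 3.0])
```

The last equal-length session is shortened when the track length is not a multiple of the session length.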

In addition, reverberation may be placed between the session audios set in this way so that the music connects naturally from one session audio to the next.

In addition, the mixed audio generation module 240 may generate a plurality of such session audios and combine them into one mixed sound source. In this description, "mixed audio" and "mixed sound source" have the same meaning.

Data on the mixed audio generated in this way, or on each session audio, may be stored in the memory module 260.

The playlist generation module 250 may build a list of the mixed audios produced on the basis of user information and then play the audios in that list.

The playlist generated by the playlist generation module 250 may contain mixed audios produced by mixing audio blocks on the basis of user information, and may also include sound sources whose characteristics are similar to those mixed audios. The playlist generation module 250 may search for audios similar to an audio block containing a specific stem within such a mixed audio and add them to the playlist, or it may randomly remix the audio blocks of such a mixed audio and add the resulting audio to the playlist. This is described in detail later.

The memory module 260 is a module in which data on the sound sources and mixed sound sources previously saved by the user may be stored.

When the audio mixing service providing device 200 is implemented as a user terminal 300A, 300B, or 300C, the memory module 260 may be omitted from the user terminal, and the various data that would be stored in the memory module 260 may instead be stored on an external server. In that case, the user terminals 300A, 300B, and 300C may receive data on the various audios from the external server through the communication module 210.

FIG. 3 is a diagram illustrating a method of providing a real-time audio mixing service based on user information according to an embodiment of the disclosed invention.

Referring to FIG. 3, the present invention may include an information receiving step (S100) of receiving sound source information and mixing information about the sound source information; an audio generation step (S110) of selecting, based on the mixing information, audio blocks corresponding to at least one stem item included in the sound source information and combining the audio information contained in the selected audio blocks to generate one session audio; and a mixed sound source playback step (S120) of generating a mixed sound source based on the session audio and playing the generated mixed sound source.

In the audio generation step (S110) of the method according to an embodiment, audio blocks included in the sound source information may be selected based on the user information, and the audio information contained in the selected audio blocks may be combined to generate one session audio.

Step S110 may also include repeating the process of generating one session audio to generate a plurality of session audios, and individually adjusting the characteristics of those session audios to generate the mixed sound source.

In addition, the present invention may include a playlist generation step (S130) in which, while a mixed sound source is being played, the similarity between the data stored in the memory module 260 and the mixed sound source being played is determined in view of the musical characteristics of the mixed sound source, and the sound source to be played next is selected.

In the playlist generation step (S130), a playlist may be generated by selecting audio blocks or session audios similar to the mixed audio that was generated from the audio blocks selected on the basis of user information. The similarity determination and the selection of the next sound source may be performed by the playlist generation module 250.
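The similarity determination in step S130 can be illustrated with a simple vector comparison. The specification does not name a similarity measure, so cosine similarity is used here purely as a stand-in, and the two-dimensional feature vectors and track names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_next(current_features, candidates):
    """Return the name of the stored item most similar to what is playing."""
    return max(
        candidates,
        key=lambda name: cosine_similarity(current_features, candidates[name]),
    )


# Hypothetical feature vectors for items held in the memory module.
library = {
    "track_a": [1.0, 0.0],
    "track_b": [0.6, 0.8],
    "track_c": [0.0, 1.0],
}
next_track = pick_next([0.5, 0.9], library)
```

Any measure that ranks stored items against the currently playing mixed sound source would fit the same slot.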

The information receiving step (S100) and the audio generation step (S110) of the method for providing a real-time audio mixing service are described in detail below.

The audio mixing service providing device 200 according to an embodiment may receive sound source information and mixing information about the sound source information. The sound source information may be information included in the music characteristic information shown in FIG. 3.

As to how the audio mixing service providing device 200 receives sound source information in the information receiving step (S100), the user may select the music to be played through the audio mixing interface displayed on the user terminals 300A, 300B, and 300C, whereby the sound source information is transmitted to the audio mixing service providing device 200.

Alternatively, when the user inputs voice information into a user terminal 300A, 300B, or 300C, the processor of the user terminal may process the natural language contained in the voice information to select the music to be played, and may transmit the selected sound source information to the audio mixing service providing device 200.

In this way, the audio mixing service providing device 200 may receive from the user the information on the sound source to be played on the user terminals 300A, 300B, and 300C.

The audio mixing service providing device 200 may then transmit playback sound source information to the user terminals 300A, 300B, and 300C so that the audio corresponding to the input sound source information is played on them. The playback sound source information received by the user terminals 300A, 300B, and 300C may be displayed on their audio mixing interface.

Turning to how the audio mixing service providing device 200 receives mixing information in the information receiving step (S100), the device may receive music characteristic information from the user and generate mixing information from it.

Specifically, using the audio mixing interface displayed on the user terminals 300A, 300B, and 300C, the user may mix audio by selecting audio blocks to suit his or her taste. That is, the user may click the various audio blocks on and off, leaving on only the blocks he or she wants, and thereby input music characteristic information to the audio mixing service providing device 200. The audio mixing service providing device 200 may then generate mixing information based on the music characteristic information received from the user.
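The On/Off interaction described above amounts to maintaining a per-block boolean state and reporting the blocks that remain on. The sketch below assumes hypothetical block identifiers of the form "version/stem" and a flat list as the shape of the resulting music characteristic information; neither is specified in the description.

```python
def toggle(state, block_id):
    """Return a new on/off map with the clicked block flipped."""
    new_state = dict(state)
    new_state[block_id] = not new_state.get(block_id, False)
    return new_state


def to_music_characteristic_info(state):
    """List the blocks currently switched on, in a stable order."""
    return sorted(block_id for block_id, is_on in state.items() if is_on)


# Simulated clicks: "v1/rhythm" is turned on and then off again.
state = {}
for clicked in ["v1/rhythm", "v2/bass", "v1/rhythm", "v1/melody"]:
    state = toggle(state, clicked)
characteristic_info = to_music_characteristic_info(state)
```

Returning a new map on each click keeps earlier states intact, which would make an undo feature straightforward.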

In addition, the audio mixing service providing device 200 may receive music tag information from the user and generate mixing information from it.

Specifically, the music tag information input by the user may be music tag information that the user selected directly through the audio mixing interface displayed on the user terminals 300A, 300B, and 300C.

In other words, the music tag information may be information entered directly by the user. The user may selectively click the tags displayed on the audio mixing interface to input music tag information matching his or her taste to the audio mixing service providing device 200, which may then generate mixing information from the received music tag information.

Alternatively, the music tag information input by the user may be information generated from the user's voice information and the user's conversation with a chatbot.

Specifically, the user may input voice information and chatbot conversation information into the user terminals 300A, 300B, and 300C through the audio mixing interface displayed on them. The user terminals 300A, 300B, and 300C may transmit this voice information and conversation information to the audio mixing service providing device 200, which may generate mixing information based on the information received.

For example, the mixing information generation module 230 of the audio mixing service providing device 200 may process the user's voice information and chatbot conversation information, map it to music tag information, and generate mixing information based on the mapped music tag information.

In addition, the audio mixing service providing device 200 may generate mixing information from information about the user's surrounding environment collected by the sensors of the user terminals 300A, 300B, and 300C.

Specifically, the user's surrounding environment information may include at least one of the user's location information, weather information, temperature information, and movement information, and the audio mixing service providing device 200 may generate mixing information based on at least one of these.

For example, the mixing information generation module 230 of the audio mixing service providing device 200 may process at least one of the user's location information, weather information, temperature information, and movement information, map it to music tag information, and generate mixing information based on the mapped music tag information.
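The mapping from surrounding environment information to music tag information can be pictured as a small rule table. The specification does not disclose the mapping itself, so every key, threshold, and tag name below is invented for illustration.

```python
def environment_to_tags(env):
    """Map sensor readings to music tags using made-up rules and thresholds."""
    tags = set()
    if env.get("weather") == "rain":
        tags.add("calm")
    temperature = env.get("temperature_c")
    if temperature is not None and temperature >= 28:
        tags.add("summer")
    speed = env.get("speed_kmh")
    if speed is not None and speed >= 8:
        tags.add("upbeat")      # assumed: moving fast, e.g. running
    elif speed is not None and speed > 0:
        tags.add("walking")
    return sorted(tags)


# Hypothetical reading: light rain, mild temperature, walking pace.
tags = environment_to_tags(
    {"weather": "rain", "temperature_c": 18, "speed_kmh": 4}
)
```

A learned model could replace the hand-written rules without changing the interface: environment readings in, music tags out.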

The user's surrounding environment information is not limited to the examples above, however; any information that can be collected from sensors attached to the user terminals 300A, 300B, and 300C may constitute surrounding environment information.

As described above, the audio mixing service providing device 200 according to an embodiment of the disclosed invention may receive information in various ways, select, based on the mixing information thus generated, audio blocks corresponding to at least one stem item included in the sound source information, and combine the audio information contained in the selected audio blocks to generate one session audio.

Accordingly, in the method according to an embodiment of the disclosed invention, the step of generating mixing information based on user information and selecting audio blocks based on the mixing information may proceed in one of two ways: the user inputs music characteristic information and thereby selects the audio blocks directly (S111), or artificial intelligence technology is applied and the audio blocks are selected automatically (S112).

In addition, based on the input mixing information, the audio mixing service providing device 200 may generate a plurality of session audios belonging to different sections of one sound source and may adjust those session audios (S113).

Specifically, the audio mixing service providing device 200 may arrange the order of the plurality of session audios. It may also adjust at least one of the volume, speed, key, effects, and playback start point of the session audios, and generate one mixed sound source from the adjusted session audios. Arranging the order of the session audios and adjusting their individual characteristics may be performed directly by the user, or automatically by applying artificial intelligence technology. This is described in detail later.
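The ordering and per-session adjustment described above can be sketched for two of the listed properties, volume and playback start point. Speed, key, and effects are left out because they require real signal processing. The class and field names are hypothetical, and joining the sessions by plain concatenation is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SessionAudio:
    name: str
    samples: list
    volume: float = 1.0   # linear gain applied to every sample
    start_index: int = 0  # playback start point, counted in samples


def render_mix(sessions, order):
    """Apply each session's adjustments and join the sessions in the given order."""
    by_name = {session.name: session for session in sessions}
    mix = []
    for name in order:
        session = by_name[name]
        trimmed = session.samples[session.start_index:]
        mix.extend(sample * session.volume for sample in trimmed)
    return mix


sessions = [
    SessionAudio("intro", [0.5, 0.5], volume=0.5),
    SessionAudio("chorus", [1.0, 0.5, 0.25], volume=2.0, start_index=1),
]
# Play the chorus first, then the intro.
mixed_source = render_mix(sessions, ["chorus", "intro"])
```

Because each session carries its own settings, reordering the mix never disturbs the adjustments already made to an individual session.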

Once the adjustment is complete, the audio mixing service providing device 200 may finish audio generation by generating the mixed sound source from the adjusted session audios.

When generation of the mixed sound source is complete, the audio mixing service providing device 200 may output it. Specifically, the device may transmit information on the finally generated mixed sound source to the user terminals 300A, 300B, and 300C, and the user terminals may play the mixed sound source so that the user hears the fully mixed audio.

FIG. 4 is a diagram illustrating, for the method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, how music characteristic information input by the user is turned into mixing information and how a mixed sound source is generated based on that mixing information.

Referring to FIG. 4, the audio mixing service providing device 200 according to the present invention may divide one completed sound source into several versions. Specifically, for one sound source, the device may generate a plurality of session audios in which a plurality of audio blocks corresponding to a plurality of stem items are divided by version and combined.

This allows the user to create new mixed audio by recombining audio blocks to suit his or her taste.

The session audio described in the present invention refers to data that gathers, for the several versions generated from one sound source, the stem data constituting each of those versions.

Meanwhile, a completed sound source within the meaning of the present invention may be audio produced in block format from the outset, or audio that has already been released and was subsequently produced in block format with the composer's consent.

The audio mixing service providing device 200 of the present invention may display a list of playable sound sources on the user terminals 300A, 300B, and 300C through the audio mixing screen generation module (S210).

The user may then directly select the sound source to be played through the audio mixing interface displayed on the screen of the user terminals 300A, 300B, and 300C. Accordingly, the audio mixing service providing device 200 of the present invention may receive from the user terminal the sound source information input by the user (S220).

The user may also use an artificial-intelligence-based music recommendation service, either by selecting music tags for the mood he or she wants to hear, thereby inputting music tag information, or by inputting voice information and conversation information through natural language or a chatbot.

The user's voice information and conversation information are input by the user through the input unit of the user terminal 300, and may be processed into text information on the user terminal 300. For example, suppose the user says "Recommend music that is good to listen to on a walk", the chatbot replies "What BPM would you like?", and the user answers "Recommend music with a BPM of 100 or less"; the user's voice information may then be information containing that answer. In such a case, the audio mixing service providing device 200 of the present invention may select the sound source to be played from the data stored in the memory module, based on at least one of the music tag information, voice information, and conversation information input by the user. Accordingly, the audio mixing service providing device 200 of the present invention may receive sound source information selected by means of artificial intelligence (S220).
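The BPM example above can be followed through in miniature: extract the limit from the text of the user's answer, then filter the stored sound sources. The specification leaves the natural language processing unspecified, so the regular expression below is a deliberately narrow stand-in that only recognizes English phrasing like the translated example, and the catalog contents are hypothetical.

```python
import re


def parse_max_bpm(utterance):
    """Extract N from phrases like 'BPM of N or less'; return None if absent."""
    match = re.search(
        r"BPM\s*(?:of\s*)?(\d+)\s*or\s*(?:less|below)",
        utterance,
        re.IGNORECASE,
    )
    return int(match.group(1)) if match else None


def select_sources(catalog, max_bpm):
    """Return the names of stored sound sources at or under the BPM limit."""
    return sorted(name for name, bpm in catalog.items() if bpm <= max_bpm)


# Hypothetical catalog held in the memory module: name -> BPM.
catalog = {"morning_walk": 92, "night_drive": 128, "slow_rain": 70}
limit = parse_max_bpm("Recommend music with a BPM of 100 or less")
candidates = select_sources(catalog, limit)
```

In practice the constraint would more likely come from a language model or an intent classifier; the filtering step would be unchanged.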

The user may input music characteristic information for the sound source being played through the user terminals 300A, 300B, and 300C. In other words, the audio mixing service providing device 200 of the present invention may receive music characteristic information from the user (S230).

Specifically, the present invention may display a mixing interface screen on the user terminals 300A, 300B, and 300C, and the mixing interface screen may show the audio blocks corresponding to the stem data of the sound source currently being played. The user terminals 300A, 300B, and 300C may thus generate music characteristic information based on the audio blocks selected by the user and transmit it to the audio mixing service providing device 200 of the present invention.

The audio mixing service providing device 200 of the present invention may generate mixing information from the music characteristic information input by the user, and may generate session audio based on that mixing information (S240). More specifically, the device may generate a plurality of session audios.

Session audio, as used in the present invention, refers to one part of a piece of audio divided into fixed time units. The audio may be divided into session audios at equal time intervals, or it may be divided at the points where its characteristics change, taking the overall flow of the audio into account. The criterion for dividing the sessions may be set and stored in advance by the music producer, or may be changed freely by user manipulation.

When generation of the session audio is complete, the audio mixing service providing device 200 of the present invention may output the mixed sound source, which may then be played on the user terminals 300A, 300B, and 300C.

The audio mixing service providing device 200 of the present invention may then adjust the session audio under the user's control (S250).

Specifically, the audio mixing service providing device 200 of the present invention may generate a new mixed sound source by arranging the order of the plurality of session audios and adjusting at least one of their volume, speed, key, effects, and playback start point.

When adjustment of the session audio is complete, the audio mixing service providing device 200 of the present invention may generate one mixed sound source from the adjusted session audios and transmit it to the user terminal, and the user terminals 300A, 300B, and 300C may then play the mixed sound source in real time (S260). This is described in detail later with reference to FIG. 7.

FIG. 5 is a diagram illustrating, for the method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, how music tag information input by the user is turned into mixing information and how a mixed sound source is generated based on that mixing information.

Referring to FIG. 5, the audio mixing service providing device 200 of the present invention may display, through the audio mixing screen generation module, a list of sound sources that can be played on the user terminals 300A, 300B, and 300C (S310).

The user can then directly select the sound source to be played through the audio mixing interface displayed on the screen of the user terminal 300A, 300B, or 300C. In this way, the audio mixing service providing device 200 of the present invention can receive sound source information from the user (S320).

In addition, the user may use an artificial intelligence-based music recommendation service either by selecting a music tag for the desired mood, thereby entering music tag information, or by entering voice information and conversation information in natural language or through a chatbot. In this case, the audio mixing service providing device 200 of the present invention may select the sound source to be played by using data stored in the memory module, based on at least one of the music tag information, voice information, and conversation information received from the user terminal. Accordingly, the audio mixing service providing device 200 of the present invention can receive sound source information selected using artificial intelligence (S320).

The user may enter music tag information for the sound source played through the user terminals 300A, 300B, and 300C. In other words, the audio mixing service providing device 200 of the present invention can receive music tag information from the user (S330).

Specifically, the present invention may display a mixing interface screen on the user terminals 300A, 300B, and 300C, and the mixing interface screen may display music tags that include genre information, mood information, brightness information, and the like for the sound source.

Accordingly, the user can generate music tag information by selecting a tag displayed on the screen of the user terminal 300A, 300B, or 300C, and can input the generated music tag information to the audio mixing service providing device 200 of the present invention.

Alternatively, when the user inputs voice information and chatbot conversation information to the audio mixing service providing device 200, the audio mixing service providing device 200 may generate mixing information based on the input voice information and chatbot conversation information.

The user's voice information and conversation information are input by the user through the input unit of the user terminal 300, and the input information may be processed into text information in the user terminal 300. For example, the user's voice information may contain the sentence 'Recommend music that is good to listen to while walking,' and the chatbot conversation information may contain the user's reply 'Recommend music with a BPM of 100 or less' to the chatbot's follow-up question 'What BPM range would you like?'

Thereafter, the audio mixing service providing device 200 may analyze and process the input voice information and chatbot conversation information and map them to corresponding music tag information. In this way, the audio mixing service providing device 200 can generate music tag information through mapping.

For example, the audio mixing service providing device 200 may generate music tag information such as Lo-fi or Jazz based on the user's voice information requesting music that is good to listen to while walking.
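The mapping from text to music tags could be sketched as follows. This is a deliberately naive keyword lookup used as a stand-in, since the specification does not say how the analysis is performed. Only the 'walking' to Lo-fi/Jazz pair and the BPM sentence come from the description; the other keywords, the table `KEYWORD_TAGS`, and both function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Hypothetical keyword-to-tag table. Only the "walk" entry reflects the
# example in the description; the other entries are invented for illustration.
KEYWORD_TAGS: Dict[str, List[str]] = {
    "walk": ["Lo-fi", "Jazz"],
    "workout": ["EDM"],
    "sleep": ["Ambient"],
}


def tags_from_text(text: str) -> List[str]:
    """Collect tags for every known keyword found in the text, without duplicates."""
    lowered = text.lower()
    tags: List[str] = []
    for keyword, mapped in KEYWORD_TAGS.items():
        if keyword in lowered:
            for tag in mapped:
                if tag not in tags:
                    tags.append(tag)
    return tags


def bpm_limit_from_text(text: str) -> Optional[int]:
    """Extract an upper BPM bound from phrases such as 'BPM of 100 or less'."""
    match = re.search(r"bpm\s+(?:of\s+)?(\d+)\s+or\s+less", text.lower())
    return int(match.group(1)) if match else None


voice_text = "Recommend music that is good to listen to while walking"
chat_text = "Recommend music with a BPM of 100 or less"
tags = tags_from_text(voice_text)
bpm_limit = bpm_limit_from_text(chat_text)
```

A production system would more plausibly use a language model or intent classifier for this step; the lookup table only makes the input and output of the mapping concrete.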

Thereafter, the audio mixing service providing device 200 of the present invention uses the music tag information, generated from the voice information and chatbot conversation information input by the user, as mixing information, and may generate session audio based on that mixing information (S340). More specifically, the audio mixing service providing device 200 of the present invention may generate a plurality of session audios.

When generation of the session audio is complete, the audio mixing service providing device 200 of the present invention may output the mixed sound source, which may then be played on the user terminals 300A, 300B, and 300C.

Thereafter, the audio mixing service providing device 200 of the present invention may adjust the session audio under the user's control (S350).

When adjustment of the session audio is complete, the audio mixing service providing device 200 of the present invention combines the adjusted plurality of session audios into one mixed sound source and transmits the generated mixed sound source to the user terminals 300A, 300B, and 300C, so that the user terminals 300A, 300B, and 300C can play the mixed sound source in real time (S360).

FIG. 6 is a diagram illustrating, in a method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, how information about the user's surrounding environment is turned into mixing information and how a mixed sound source is generated based on that mixing information.

Referring to FIG. 6, the audio mixing service providing device 200 of the present invention may display on the user terminal, through the audio mixing screen generation module, a list of sound sources that can be played on the user terminals 300A, 300B, and 300C (S410).

The user can then directly select the sound source to be played through the audio mixing interface displayed on the screen of the user terminal 300A, 300B, or 300C. In this way, the audio mixing service providing device 200 of the present invention can receive sound source information from the user (S420).

In addition, the user may use an artificial intelligence-based music recommendation service either by selecting a music tag for the desired mood, thereby entering music tag information, or by entering voice information and conversation information in natural language or through a chatbot. In this case, the audio mixing service providing device 200 of the present invention may select the sound source to be played by using data stored in the memory module, based on at least one of the music tag information, voice information, and conversation information input by the user. Accordingly, the audio mixing service providing device 200 of the present invention can receive sound source information selected using artificial intelligence (S420).

Thereafter, the audio mixing service providing device 200 of the present invention may receive surrounding environment information collected by a sensor (S430).

Specifically, the mixing interface screen of the user terminals 300A, 300B, and 300C may display a button for using the mixing service based on the user's surrounding environment.

Accordingly, by selecting the surrounding environment-based mixing service button displayed on the screen of the user terminal 300A, 300B, or 300C, the user can provide information about the user's surrounding environment, which is then input to the audio mixing service providing device 200 of the present invention.

As described above, the user may directly provide surrounding environment information to the audio mixing service providing device 200 of the present invention. Alternatively, the audio mixing service providing device 200 may automatically receive the user's surrounding environment information collected by a sensor.

The user's surrounding environment information may include at least one of the user's location information, weather information, temperature information, and movement information.

When the audio mixing service providing device 200 of the present invention receives the user's surrounding environment information through a sensor, the audio mixing service providing device 200 may generate mixing information based on the input surrounding environment information.

More specifically, the audio mixing service providing device 200 may analyze and process the input surrounding environment information and generate the corresponding music tag information through mapping. For example, when the weather in the user's surroundings is clear and the user's movement speed is slow, the audio mixing service providing device 200 may generate, through mapping, corresponding music tag information such as Lo-fi or Jazz.
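The environment-to-tag mapping can be illustrated with a single rule. The sketch below is an assumption-laden example: the `Environment` fields mirror the four kinds of information listed above, but the field names, the units, and the 6 km/h threshold for 'slow' are hypothetical, and only the clear-weather, slow-movement rule comes from the description.

```python
from dataclasses import dataclass
from typing import List, Optional

SLOW_SPEED_KMH = 6.0  # assumed threshold for "slow" movement; not from the source


@dataclass
class Environment:
    """Hypothetical container for sensor-derived surrounding environment information."""
    location: Optional[str] = None
    weather: Optional[str] = None          # e.g. "clear", "rain"
    temperature_c: Optional[float] = None
    speed_kmh: Optional[float] = None      # movement information


def tags_from_environment(env: Environment) -> List[str]:
    """Map environment information to music tags using one illustrative rule."""
    is_slow = env.speed_kmh is not None and env.speed_kmh < SLOW_SPEED_KMH
    if env.weather == "clear" and is_slow:
        return ["Lo-fi", "Jazz"]
    return []  # no rule matched; a real system would have further rules


strolling = Environment(weather="clear", speed_kmh=3.0)
tags = tags_from_environment(strolling)
```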

In this way, the audio mixing service providing device 200 can generate music tag information through mapping. Thereafter, the audio mixing service providing device 200 of the present invention generates mixing information from the surrounding environment information collected by the sensor, and may generate session audio based on the generated mixing information (S440). More specifically, the audio mixing service providing device 200 of the present invention may generate a plurality of session audios.

Thereafter, the audio mixing service providing device 200 of the present invention may adjust the session audio under the user's control (S450).

When adjustment of the session audio is complete, the audio mixing service providing device 200 of the present invention combines the adjusted plurality of session audios into one mixed sound source and transmits the generated mixed sound source to the user terminals 300A, 300B, and 300C, so that the user terminals 300A, 300B, and 300C can play the mixed sound source in real time (S460).

FIG. 7 is a diagram illustrating, in a method of providing a real-time audio mixing service according to an embodiment of the disclosed invention, a method of generating a real-time mixed sound source by individually adjusting session audios.

Referring to FIG. 7, the user can change the playback start point, audio blocks, arrangement order, and the like of the session audio using the mixing interface displayed on the user terminals 300A, 300B, and 300C. As one example, as shown in FIG. 7, the user can change the playback order of the sessions by clicking a session audio that contains at least one audio block and moving it to the desired position.

The user can also click a session audio to adjust at least one of its volume, speed, key, and effect.

The user may also select an audio block corresponding to at least one stem item included in the session audio and individually change the volume, speed, key, and effect of that audio block.

In addition, the real-time audio mixing service providing device 200 according to an embodiment may automatically adjust the playback start point, length, position, arrangement, and the like of the session audio.

For example, the audio mixing service providing device 200 may automatically adjust the length of each of the plurality of session audios, and the arrangement or position among the plurality of session audios, based on the input mixing information. Accordingly, the audio mixing service providing device 200 according to the present invention can provide the user with a variety of mixed sound sources that meet the user's needs even without user input.
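One simple way to picture automatic length and position adjustment is to cap each session at a maximum length and place the sessions back to back. This is only a sketch under assumptions: the specification does not state the adjustment rule, and the function `auto_arrange` and its `max_len_sec` parameter are hypothetical.

```python
from typing import List, Tuple


def auto_arrange(lengths_sec: List[float], max_len_sec: float) -> List[Tuple[float, float]]:
    """Trim each session to at most max_len_sec and lay the sessions out back to back.

    Returns a list of (start_position, length) pairs in seconds, in input order.
    """
    placed: List[Tuple[float, float]] = []
    position = 0.0
    for length in lengths_sec:
        trimmed = min(length, max_len_sec)
        placed.append((position, trimmed))
        position += trimmed
    return placed


timeline = auto_arrange([8.0, 20.0, 8.0], max_len_sec=16.0)
```

In practice the cap and the order would presumably be derived from the mixing information (for example, shorter sessions for an energetic tag); here they are passed in directly to keep the sketch small.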

While the session audio is being adjusted, whether under the user's control or under the control of the audio mixing service providing device, the present invention can transmit the real-time mixed sound source to the user terminals 300A, 300B, and 300C based on WebRTC (Web Real-Time Communication) technology.

WebRTC is a technology that allows web applications and sites to capture and freely stream audio or video media between browsers, and to exchange arbitrary data, without an intermediary. Data is exchanged over a P2P connection between web browsers without installing any separate driver or plug-in.

Because WebRTC has low latency, it can provide a near-real-time service with almost no delay. WebRTC is therefore used mainly in media transmission and reception services where real-time processing is important.

Conventional music streaming services that use a mixdown approach to stream multiple tracks simultaneously have had the drawback that, even when a mixed sound source is generated using mixing information, it is difficult to transmit it to the user terminals 300A, 300B, and 300C in real time. The present invention, by contrast, uses WebRTC technology, which is designed so that web browsers can communicate with each other without the help of plug-ins, to provide a service that can mix and play in real time a plurality of audios that change interactively from moment to moment.

Accordingly, the audio mixing service providing device 200 according to the present invention can customize the session audio in real time to suit the user's needs, either by adjusting at least one of the volume, speed, key, effect, and playback start point of each session audio, or by adjusting at least one of the volume, speed, key, effect, and playback start point of the audio block corresponding to each stem that makes up the session audio. It can also recommend to the user a mixed sound source generated by automatically adjusting the session audio.

A sound source containing the plurality of session audios adjusted by the user or by the audio mixing service providing device 200 may be mixed down into a single sound source, and a global effect may then be applied to produce the final mixed sound source.
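The mixdown followed by a global effect can be sketched on plain lists of samples. The example below makes several assumptions: audio is represented as floating-point samples in the range -1.0 to 1.0, mixing is a gain-weighted sum, and a hard clip stands in for the global effect, which the specification does not identify. The function names `mix_down` and `global_limiter` are hypothetical.

```python
from typing import List, Sequence


def mix_down(tracks: Sequence[Sequence[float]], gains: Sequence[float]) -> List[float]:
    """Sum the tracks sample by sample, scaling each by its gain.

    Shorter tracks are treated as silent after they end.
    """
    length = max((len(t) for t in tracks), default=0)
    mixed = [0.0] * length
    for track, gain in zip(tracks, gains):
        for i, sample in enumerate(track):
            mixed[i] += gain * sample
    return mixed


def global_limiter(samples: Sequence[float], ceiling: float = 1.0) -> List[float]:
    """Hard-clip every sample to [-ceiling, ceiling]; a stand-in for a global effect."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


drums = [0.5, 0.5, -0.5]
bass = [0.75, -0.25]
final_mix = global_limiter(mix_down([drums, bass], gains=[1.0, 1.0]))
```

Applying the effect after the sum, rather than per track, is what makes it global: it acts once on the combined signal.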

FIG. 8 is a diagram illustrating a multi-client management method of the real-time audio mixing service providing device 200 according to an embodiment of the disclosed invention.

Referring to FIG. 8, the audio mixing service providing device 200 of the present invention may include a plurality of mixing servers so that a mixing server can be matched to each user.

As a result, when a user uses the audio mixing service, the final mixed sound source can be stored in the mixing server that corresponds one-to-one with that user. The present invention can therefore use mixed sound source data that the user created in the past as base data when providing the audio mixing service.
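The one-to-one matching between users and mixing servers can be pictured as a registry keyed by user. The following sketch is hypothetical throughout: the class `MixingServerPool`, its method names, and the use of an in-memory list per user in place of an actual server are assumptions made only to illustrate the idea.

```python
from typing import Dict, List


class MixingServerPool:
    """Hypothetical registry that gives each user a dedicated store of past mixes."""

    def __init__(self) -> None:
        # user_id -> list of mix identifiers; a list stands in for a mixing server.
        self._servers: Dict[str, List[str]] = {}

    def server_for(self, user_id: str) -> List[str]:
        """Return the store matched to this user, creating it on first use."""
        return self._servers.setdefault(user_id, [])

    def store_mix(self, user_id: str, mix_id: str) -> None:
        """Save a finished mixed sound source on the user's own store."""
        self.server_for(user_id).append(mix_id)

    def history(self, user_id: str) -> List[str]:
        """Return a copy of the user's past mixes, usable as base data for later mixing."""
        return list(self._servers.get(user_id, []))


pool = MixingServerPool()
pool.store_mix("user_a", "mix_001")
pool.store_mix("user_a", "mix_002")
pool.store_mix("user_b", "mix_101")
```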

The mixing server shown in FIG. 8 may correspond to the memory module of the audio mixing service providing device 200 of the present invention, but it is not limited to this and may instead be an external server provided separately from the audio mixing service providing device 200.

FIG. 9 is a diagram illustrating an interface screen displayed on a user terminal when the mixing audio service, as actually implemented by applying the disclosed invention, is provided.

Referring to FIG. 9, when a sound source is played by the streaming service based on the sound source information, the interface screen may display the cover image of that audio, as in the left screen of FIG. 9. If the user clicks the audio cover in this state, an audio block screen containing the audio block information for the current section of the playing audio may be displayed on the interface screen of the user terminals 300A, 300B, and 300C, as in the right screen of FIG. 9.

The user can thus see at a glance which audio blocks are currently playing. To switch to a different audio block, the user selects another audio block displayed on the screen, which inputs the changed music characteristic information to the audio mixing service providing device 200.

In addition, as shown on the left side of FIG. 9, the user can select among the various music tags displayed on the interface screen to input the music tag information used for mixing to the audio mixing service providing device 200.

In addition, as shown on the left side of FIG. 9, when the user selects AI Mix on the interface screen, the audio mixing service providing device 200 generates mixing information based on the user's surrounding environment information and selects the audio blocks corresponding to the generated mixing information, thereby providing an artificial intelligence-based mixing service.
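Selecting the audio blocks that correspond to the mixing information could look like the sketch below. It assumes, hypothetically, that each stem item has several candidate audio blocks labelled with tags and that the block sharing the most tags with the mixing information is chosen; the `LIBRARY` contents, the block identifiers, and the scoring rule are all illustrative and not taken from the specification.

```python
from typing import Dict, List, Set

# Hypothetical block library: stem item -> candidate audio blocks with tags.
LIBRARY: Dict[str, List[dict]] = {
    "drums": [
        {"id": "drums_lofi", "tags": {"Lo-fi", "Jazz"}},
        {"id": "drums_edm", "tags": {"EDM"}},
    ],
    "bass": [
        {"id": "bass_jazz", "tags": {"Jazz"}},
        {"id": "bass_rock", "tags": {"Rock"}},
    ],
}


def select_blocks(library: Dict[str, List[dict]], wanted_tags: Set[str]) -> Dict[str, str]:
    """Pick, for each stem item, the block whose tags overlap most with wanted_tags.

    Ties go to the earlier block in the list. Every stem is assumed to have
    at least one candidate block.
    """
    selection: Dict[str, str] = {}
    for stem, blocks in library.items():
        best = max(blocks, key=lambda block: len(block["tags"] & wanted_tags))
        selection[stem] = best["id"]
    return selection


session_blocks = select_blocks(LIBRARY, {"Lo-fi", "Jazz"})
```

The selected blocks, one per stem item, would then be combined into one session audio as described in the audio generation step.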

The method and apparatus for providing a real-time audio mixing service based on user information according to the present invention have been described in detail above.

The device described above may be implemented with hardware components, software components, and/or a combination of hardware components and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that execute on the operating system.

The method according to the embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known to and usable by those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described above with reference to a limited set of examples and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in a different order from the described method, and/or if components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents. Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

200: Audio mixing service providing device
300A, 300B, 300C: User terminal
210: Communication module
220: Audio mixing screen generation module
230: Mixing information generation module
240: Mixing audio generation module
250: Playlist generation module
260: Memory module

Claims

A method of providing an audio mixing service to a user terminal through a processor, the method comprising:
an information receiving step of receiving sound source information and mixing information about the sound source information;
an audio generation step of selecting, based on the mixing information, an audio block corresponding to at least one stem item included in the sound source information, and combining the audio information included in the selected audio block to generate one session audio; and
a mixed sound source reproduction step of generating a mixed sound source based on the session audio and playing the generated mixed sound source,
wherein the mixing information
includes information generated based on information about the user's surrounding environment collected from a sensor of the user terminal.

The method of claim 1,
wherein the mixing information
includes at least one of music characteristic information and music tag information input by a user.

The method of claim 2,
wherein the music tag information
includes at least one of genre information, mood information, and brightness information.

The method of claim 2,
wherein the music tag information
is information received directly from the user, or information generated based on the user's voice information and conversation information with a chatbot.

(Deleted)

The method of claim 1,
wherein the user's surrounding environment information
includes at least one of the user's location information, weather information, temperature information, and movement information.

The method of claim 1,
wherein the audio generation step further comprises:
generating a plurality of the session audios;
arranging the order of the plurality of session audios;
adjusting at least one of the volume, speed, key, effect, and playback start point of the plurality of session audios; and
generating one mixed sound source from the adjusted plurality of session audios.

An audio mixing service providing device comprising:
a communication module that receives sound source information and mixing information about the sound source information;
a memory module that stores audio block information corresponding to at least one stem item preset for at least one audio version of the sound source information;
a mixing information generation module that generates, as the mixing information, user information including at least one of information input by the user and information about the user's surrounding environment collected from a sensor; and
a mixing audio generation module that selects an audio block based on the mixing information, combines the audio information included in the selected audio block to generate one session audio, and generates a mixed sound source based on the session audio,
wherein the mixing information includes information generated based on information about the user's surrounding environment collected from a sensor of the user terminal.

The device of claim 8,
wherein the mixing audio generation module
generates a plurality of session audios, arranges the order of the plurality of session audios, adjusts at least one of the volume, speed, key, effect, and playback start point of the plurality of session audios, and generates one mixed sound source from the adjusted plurality of session audios.

(Deleted)