KR101793184B1

KR101793184B1 - Apparatus of Fast lyric area extraction from images of printed music scores and method therefor

Info

Publication number: KR101793184B1
Application number: KR1020160082018A
Authority: KR
Inventors: 양형정; 민딩콩
Original assignee: 전남대학교산학협력단
Priority date: 2016-06-29
Filing date: 2016-06-29
Publication date: 2017-11-03

Abstract

The present invention relates to an apparatus and method for extracting a lyric region for automatic performance of a photographed music score image. More particularly, it extracts the lyric region during the optical music recognition (OMR) process of a score, using information such as frequency, projection, and the height and position of the staff. The invention achieves a high recognition rate with a small amount of computation when extracting the lyric region of a music score, and extracts the lyric region after correcting distortion using staff and bar line information.

Description

Apparatus and method for extracting a lyric region for automatic performance of a photographed music score image

The present invention relates to an apparatus and method for extracting a lyric region for automatic performance of a photographed music score image, and more particularly, to an apparatus and method that extract the lyric region during the optical music recognition (OMR) process of a score, using information such as frequency, projection, and the height and position of the staff, so that the photographed score image can be played automatically.

Conventionally, Korean Laid-open Patent Publication No. 10-2014-0144876 discloses a method of drawing a score using handwriting recognition, characterized by including a process of converting a score image into a digital score using optical character recognition (OCR) and a process of editing the digital score according to handwriting input.

In recent years, as technology has advanced, digitization has been required in many areas for fast and accurate information, and music is no exception. Through the digitization of printed scores, optical music recognition makes computers usable in a variety of fields, from playing, analyzing, and comparing music to arranging and composing it. However, many optical music recognition applications have not considered the extraction or recognition of the lyric region, which affects the accuracy of score recognition. When lyrics touch or overlap musical symbols during optical music recognition, the recognition rate of the score drops and the computation becomes complicated.

Accordingly, it is necessary to consider extracting or recognizing the lyric region, which affects the accuracy of optical music recognition.

The present invention was devised in view of the above, and its object is to provide an apparatus and method for extracting a lyric region for automatic performance of a photographed music score image that achieve a high recognition rate with a small amount of computation by using a frequency method and empirical rules when extracting the lyric region of Korean scores.

Another object is to provide an apparatus and method for extracting a lyric region for automatic performance of a photographed music score image that correct distortion using staff and bar line information and extract the lyric region using the frequency method and empirical rules.

To achieve these technical objects, the present invention includes: a preprocessing unit (100) that binarizes and preprocesses a score image; a staff information extraction unit (200) that extracts staff information from the preprocessed score image; and a lyric line extraction unit (300) that sets a potential lyric region in the preprocessed score image, finds and filters zones and run-lengths, and finds candidate boundary information to extract the lyric region.

Preferably, the preprocessing unit (100) may include: a binarization module (110) that binarizes the input score image; and a correction module (120) that corrects distortion of the score image using staff and bar line information obtained from the binarization result.

The staff information extraction unit (200) may include: a staff detection module (210) that detects the staff; and a staff information extraction module (220) that evaluates the space and height information of the staff.

The lyric line extraction unit (300) may include: a lyric region setting module (310) that sets the area between one staff and the next as a potential lyric region; a zone division module (320) that divides the potential lyric region set by the lyric region setting module into N zones; a run-length search module (330) that performs vertical projection to find consecutive non-zero run-lengths in the divided zones and, when the space between run-lengths is less than or equal to the staff line height, merges the run-lengths into one zone; a candidate boundary information extraction module (340) that corrects and extracts the candidate boundary when noise remains after binarization by the preprocessing unit; a candidate boundary filtering module (350) that filters the candidate boundaries of the candidate boundary information extraction module; and a lyric region extraction module (360) that marks the lyric region and the score region as O or I based on the result filtered by the candidate boundary filtering module.

Meanwhile, a method using the lyric region extraction apparatus for automatic performance of a photographed music score image may include: (a) a step in which the control unit of the lyric region extraction apparatus preprocesses the score image; (b) a step in which the control unit extracts staff information from the preprocessed score image; and (c) a step in which the control unit sets a potential lyric region in the preprocessed score image, finds and filters zones and run-lengths, and finds candidate boundary information to extract the lyric region.

Preferably, step (a) may include: (a-1) a step in which the control unit binarizes the input score image; and (a-2) a step in which the control unit corrects distortion of the score image using staff and bar line information obtained from the binarization result.

Preferably, step (b) may include: (b-1) a step in which the control unit detects the staff; and (b-2) a step in which the control unit evaluates the space and height information of the staff to extract the staff information.

Preferably, step (c) may include: (c-1) a step in which the control unit sets the area between one staff and the next as a potential lyric region; (c-2) a step in which the control unit divides the potential lyric region into N zones; (c-3) a step in which the control unit performs vertical projection to find consecutive non-zero run-lengths in the divided zones and, when the space between run-lengths is less than or equal to the staff line height, merges the run-lengths into one zone; (c-4) a step in which the control unit corrects and extracts the candidate boundary when noise remains after the preprocessing of step (a); (c-5) a step in which the control unit filters the candidate boundary of step (c-4); and (c-6) a step in which the control unit marks the lyric region and the score region as O or I based on the filtering result of step (c-5).

As described above, the lyric region of a score can be extracted with a high recognition rate at a small computational cost, and the lyric region can be extracted after correcting distortion using staff and bar line information.

FIG. 1 is a block diagram of a lyric region extraction apparatus for automatic performance of a photographed music score image according to an embodiment of the present invention.
FIG. 2 is an overall flowchart of a method using the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 3 is an exemplary view showing the potential lyric region used by the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 4 is an exemplary view showing zone division in the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 5 is an exemplary view showing staff information extraction in the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 6 is an exemplary view showing the run-length search in the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 7 is an exemplary view showing run-length filtering in the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 8 is an exemplary view showing correction of candidate lines in the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 9 is an overall flowchart showing a method using the lyric region extraction apparatus according to an embodiment of the present invention.
FIG. 10 is a detailed flowchart of the lyric region extraction step of the method using the lyric region extraction apparatus according to an embodiment of the present invention.

Specific features and advantages of the present invention will become clearer from the following detailed description based on the accompanying drawings. Note that detailed descriptions of known functions and configurations related to the present invention are omitted where they could unnecessarily obscure the gist of the invention.

The apparatus and method for extracting a lyric region for automatic performance of a photographed music score image according to an embodiment of the present invention are characterized by a technical configuration that solves the rotation problem (corrects distortion) using the staff and bar lines, and then extracts the lyric region from the result through run-length search and filtering based on a frequency method and empirical rules.

Hereinafter, the invention is described with reference to the accompanying drawings.

FIG. 1 is a block diagram of a lyric region extraction apparatus for automatic performance of a photographed music score image according to an embodiment of the present invention.

As shown in FIG. 1, the lyric region extraction apparatus according to an embodiment of the present invention may include a preprocessing unit (100), a staff information extraction unit (200), and a lyric line extraction unit (300).

The preprocessing unit (100) preprocesses the score image: it binarizes the input score image and corrects distortion of the score image using staff and bar line information.

To perform these functions, the preprocessing unit (100) may include a binarization module (110) that binarizes the input score image and a correction module (120) that corrects distortion of the score image using staff and bar line information obtained from the binarization result.

To correct distortion of the score image, the staff is detected first, and the bars are then detected using the property that a bar line connects the first and fifth lines of the staff. Next, the staff area is divided at the bar lines into local regions, one between each pair of adjacent bar lines, and each region is repositioned so that it is horizontal, which corrects the distortion.

The staff information extraction unit (200) extracts staff information from the preprocessed score image. It detects the staff and evaluates the space and height information of the staff to extract the staff information.

The staff information extraction unit (200) may include a staff detection module (210) that detects the staff and a staff information extraction module (220) that evaluates the space and height information of the staff.
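The description does not spell out how the staff space and height are evaluated. As a minimal sketch, assuming the widely used run-length frequency approach (the most frequent vertical black run approximates the staff line height sl_h, and the most frequent vertical white run approximates the staff space height ss_h), the estimate could look like the following. The function name `estimate_staff_metrics` and the 0/1 list-of-rows image layout are illustrative assumptions, not taken from the patent.

```python
from collections import Counter


def estimate_staff_metrics(binary):
    """Estimate (sl_h, ss_h) from a binarized score image.

    binary: list of rows, each a list of 0/1 values (1 = ink).
    Assumption: the most frequent vertical black run is the staff line
    height and the most frequent vertical white run is the staff space.
    """
    black, white = Counter(), Counter()
    height = len(binary)
    width = len(binary[0]) if height else 0
    for x in range(width):
        run_val, run_len = binary[0][x], 1
        for y in range(1, height):
            value = binary[y][x]
            if value == run_val:
                run_len += 1
            else:
                # Close the current run and count it by colour.
                (black if run_val else white)[run_len] += 1
                run_val, run_len = value, 1
        (black if run_val else white)[run_len] += 1
    sl_h = black.most_common(1)[0][0] if black else 0
    ss_h = white.most_common(1)[0][0] if white else 0
    return sl_h, ss_h
```

Counting runs column by column keeps the cost linear in the number of pixels, which fits the stated goal of a small amount of computation.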

The lyric line extraction unit (300) sets a potential lyric region in the preprocessed score image, finds and filters zones and run-lengths, and finds candidate boundary information to extract the lyric region.

The lyric line extraction unit (300) according to this embodiment may include a lyric region setting module (310), a zone division module (320), a run-length search module (330), a candidate boundary information extraction module (340), a candidate boundary filtering module (350), and a lyric region extraction module (360).

The lyric region setting module (310) sets the area between one staff and the next as a potential lyric region.

The reason for setting a potential lyric region is as follows. In a score, lyrics are placed between one staff and the next. However, the whole area between two staves is not always lyrics, so the area between staves is only a candidate: a potential lyric region that contains the lyric region. Therefore, a potential lyric region is set first, and the exact lyric region is then extracted through more detailed analysis.

The zone division module (320) divides the potential lyric region set by the lyric region setting module into N zones.

The reason for dividing the region into N zones is as follows.

Because the potential lyric region may contain parts of the notes of the score, it is divided into N small zones and a histogram is computed for each zone to distinguish character regions from note regions.
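The zone division can be sketched as follows. This is only an illustration under the assumption that the N zones are equal-width vertical strips of the potential lyric region (the text does not state the split direction or the zone widths); `split_into_zones` is a hypothetical helper name.

```python
def split_into_zones(region, n):
    """Split a binary region (list of rows of 0/1) into n column strips.

    Assumption: zones are contiguous, near-equal-width vertical strips.
    """
    width = len(region[0])
    # Integer boundaries; the last strip absorbs any remainder.
    bounds = [i * width // n for i in range(n + 1)]
    return [[row[a:b] for row in region] for a, b in zip(bounds, bounds[1:])]
```

Each returned zone keeps all rows of the region, so a per-row histogram can be computed independently for every strip.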

The run-length search module (330) performs vertical projection to find consecutive non-zero run-lengths in the zones produced by the zone division module and, when the space between run-lengths is less than or equal to the staff line height, merges the run-lengths into one zone.

First, regarding zone division and run-length search, the potential lyric region is divided into N zones and vertical projection is performed to find consecutive non-zero run-lengths.

In general, each word is expected to produce a single run-length, but in Korean there are cases, as in FIG. 6(a), where run-lengths must be merged. When the space between run-lengths is less than or equal to sl_h (the height of a stave line), the run-lengths are merged into one zone.

When two run-lengths satisfy this condition, they are replaced by a single merged run-length. FIG. 6(a) is merged as shown in FIG. 6(b).
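The run-length search and merge step can be sketched as below. This is a hedged illustration, not the patented implementation: it assumes the projection yields one ink count per row of a zone, that a run-length is a maximal span of consecutive non-zero rows described by its start and end row, and that the "space" between two run-lengths is the number of blank rows separating them. The function names are hypothetical.

```python
def row_profile(zone):
    """Project a zone (list of rows of 0/1) onto the row axis."""
    return [sum(row) for row in zone]


def find_runs(profile):
    """Return inclusive (start_row, end_row) pairs of non-zero runs."""
    runs, start = [], None
    for y, value in enumerate(profile):
        if value and start is None:
            start = y
        elif not value and start is not None:
            runs.append((start, y - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def merge_runs(runs, sl_h):
    """Merge neighbouring runs separated by at most sl_h blank rows."""
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] - 1 <= sl_h:
            # Gap is small enough: extend the previous run.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged
```

For a Korean syllable whose components are stacked with a thin blank gap, the two short runs collapse into one, which is the situation FIG. 6 describes.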

Regarding run-length filtering: because high or low notes may lie inside the potential region, not every run-length comes from the lyric region. The following filtering rules are needed for this situation. FIG. 7 shows run-length filtering through FR1 to FR4.

Rules FR1 to FR4 are as follows.

FR1 removes the parts of notes that straddle the boundary. FR2 removes run-lengths shorter than the height between staff lines. FR3 and FR4 handle run-lengths that straddle the boundary but lie in the same row as the previous block: these are not removed, and their start-row and end-row values are changed instead.

The candidate boundary information extraction module (340) corrects and extracts the candidate boundary when noise remains after binarization by the preprocessing unit.

Regarding candidate boundary extraction: when noise remains after binarization, the boundary is unlikely to cover the whole of the actual lyric region. In that case the boundary is corrected using the following rules. FIG. 8 shows the candidate boundary changing from green to red through LE1 and LE2. Rules LE1 and LE2 are as follows.

LE1: If the top edge minus the start row of the run-length is less than or equal to 2*sl_h, and the start row of the run-length is smaller than the new top edge, the new top edge is changed to the start row of the run-length.

LE2: If the bottom edge minus the end row of the run-length is less than or equal to 2*sl_h, and the end row of the run-length is smaller than the new bottom edge, the new bottom edge is changed to the end row of the run-length.

The candidate boundary filtering module (350) filters the candidate boundaries produced by the candidate boundary information extraction module.

Regarding candidate boundary filtering: pixel density is high in lyric lines, but musical symbols often produce high density as well, and these become outliers. In such cases the lyric lines are corrected through rules FL1 to FL3 below.

Rules FL1 to FL3 filter out parts of the candidate lyric region that are text but not lyrics, such as chord symbols or numbers indicating note length.

FL1: A line is excluded if the average of its horizontal histogram is smaller than a certain fraction of the histogram average of all lines.

FL2: The target line is excluded if it lies too close to the staff below it.

FL3: The target line is excluded if it lies too far from the text line above it.
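Rules FL1 to FL3 can be sketched as a single filtering pass. The thresholds are not given in the text, so `alpha`, `min_staff_gap`, and `max_line_gap` below are hypothetical parameters, as are the `Line` record and the function name. The sketch also assumes that FL2 is checked only on the first candidate line and that FL3 measures the distance to the previously kept line; both are simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class Line:
    top: int        # first row of the candidate line
    bottom: int     # last row of the candidate line
    density: float  # mean of the line's projection histogram


def filter_lines(lines, next_staff_top, alpha, min_staff_gap, max_line_gap):
    """Apply simplified FL1-FL3 checks to candidate lines (top to bottom)."""
    if not lines:
        return []
    overall = sum(line.density for line in lines) / len(lines)
    kept = []
    for i, line in enumerate(lines):
        # FL1 (assumed form): density below a fraction of the overall mean.
        if line.density < alpha * overall:
            continue
        # FL2 (assumed form): first line too close to the staff below.
        if i == 0 and next_staff_top - line.bottom < min_staff_gap:
            continue
        # FL3 (assumed form): too far from the text line above.
        if kept and line.top - kept[-1].bottom > max_line_gap:
            continue
        kept.append(line)
    return kept
```

In practice the three thresholds would be tuned experimentally, in line with the description of the FL1 constant as experimentally determined.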

The lyric region extraction module (360) marks the lyric region and the score region as O or I based on the result filtered by the candidate boundary filtering module. Rules LA1 to LA4 used here are as follows.

LA1: If the height and width of the out-part are smaller than the height and width of the in-part, both the out-part and the in-part are lyric regions.

LA2: If the out-part height is greater than τ, the in-part height is less than l_h/2, and the out-part width is smaller than the in-part width, both the out-part and the in-part are notes.

LA3: If the out-part height is greater than τ, the in-part height is less than l_h/2, and both the difference between the left boundaries of the joint part and the in-part and the difference between their right boundaries are smaller than a certain value, both the out-part and the in-part are notes.

LA4: If none of LA1, LA2, and LA3 is satisfied, the out-part is a note and the in-part is a lyric region.
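The four labeling rules form a simple decision cascade, sketched below. The values of τ, l_h, and the edge tolerance are not specified in the text, so they appear as parameters (`tau`, `l_h`, `edge_tol`); the `Part` record, the function name, and the use of absolute differences for the edge comparison in LA3 are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Part:
    left: int
    right: int
    width: int
    height: int


def label_parts(out_part, in_part, joint, tau, l_h, edge_tol):
    """Return (label for O, label for I) following rules LA1-LA4."""
    # LA1: out-part smaller than in-part in both dimensions.
    if out_part.height < in_part.height and out_part.width < in_part.width:
        return "lyrics", "lyrics"
    tall_out_short_in = out_part.height > tau and in_part.height < l_h / 2
    # LA2: tall out-part, short in-part, out-part narrower.
    if tall_out_short_in and out_part.width < in_part.width:
        return "note", "note"
    # LA3: tall out-part, short in-part, joint edges aligned with in-part.
    if (tall_out_short_in
            and abs(joint.left - in_part.left) < edge_tol
            and abs(joint.right - in_part.right) < edge_tol):
        return "note", "note"
    # LA4: default case.
    return "note", "lyrics"
```

Ordering the checks this way makes LA4 the fallback, matching its definition as the case where LA1 to LA3 all fail.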

The control unit (400) controls the preprocessing unit (100), the staff information extraction unit (200), and the lyric line extraction unit (300).

A method using the lyric region extraction apparatus for automatic performance of a photographed music score image is described below with reference to FIG. 9.

The control unit of the lyric region extraction apparatus preprocesses the score image (a).

In step (a), the control unit binarizes the input score image (a-1) and corrects distortion of the score image using staff and bar line information obtained from the binarization result (a-2).

Next, the control unit extracts staff information from the preprocessed score image (b). In step (b), the control unit detects the staff (b-1) and evaluates the space and height information of the staff to extract the staff information (b-2).

Next, the control unit sets a potential lyric region in the preprocessed score image, finds and filters zones and run-lengths, and finds candidate boundary information to extract the lyric region (c).

FIG. 10 is a detailed flowchart of the lyric region extraction step of the method using the lyric region extraction apparatus according to an embodiment of the present invention.

Step (c) is described below with reference to FIG. 10.

In step (c), the control unit sets the area between one staff and the next as the potential lyric region (c-1). It then divides the potential lyric region into N zones (c-2) and performs vertical projection to find consecutive non-zero run-lengths in the divided zones, merging run-lengths into one zone when the space between them is less than or equal to the staff line height (c-3).

Next, when noise remains after the preprocessing of step (a), the control unit may correct and extract the candidate boundary (c-4). The control unit then filters the candidate boundary of step (c-4) (c-5) and, based on the filtering result, marks the lyric region and the score region as O or I (c-6).

FIG. 2 shows the process of extracting the lyric region from an input score image. First, the image is binarized and its distortion is corrected using staff and bar line information. Next, the staff is located and information such as its space and height is obtained. Then the potential lyric region is set, zones and run-lengths are found, and the lyric region is extracted through run-length filtering and candidate boundary information.

The details of each step are as follows.

Regarding the potential lyric region: except for the last part, lyrics always lie between staves, so this area is set as the potential lyric region as shown in FIG. 3. The top of the lyric region lies ss_h (height of stave-space) + sl_h (height of stave-line) below the fifth line of a staff, and the bottom of the lyric region lies ss_h + sl_h above the first line of the following staff.
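The bounds of the potential lyric region follow directly from the staff positions. The sketch below assumes that each staff is given as the row of its first (top) and fifth (bottom) line, that staves are ordered top to bottom, and that the lower margin mirrors the upper one (ss_h + sl_h). The function name and the tuple layout are hypothetical.

```python
def potential_lyric_areas(staves, ss_h, sl_h):
    """Return (top_row, bottom_row) of the area between consecutive staves.

    staves: list of (first_line_y, fifth_line_y), ordered top to bottom.
    """
    margin = ss_h + sl_h
    areas = []
    for (_, upper_fifth), (lower_first, _) in zip(staves, staves[1:]):
        top = upper_fifth + margin      # just below the upper staff
        bottom = lower_first - margin   # just above the lower staff
        if top < bottom:                # skip staves that are too close
            areas.append((top, bottom))
    return areas
```

For example, with staves at rows (100, 140) and (300, 340), ss_h = 8, and sl_h = 2, the region spans rows 150 to 290.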

Regarding zone division and run-length search, the potential lyric region is divided into N zones and vertical projection is performed to find consecutive non-zero run-lengths.

When two run-lengths satisfy the merge condition, they are replaced by a single merged run-length. FIG. 6(a) is merged as shown in FIG. 6(b).

The quantities used in the condition are the start and end rows of each run-length and the top and bottom edges of the potential region.

Regarding candidate boundary extraction: when noise remains after binarization, the boundary is unlikely to cover the whole of the actual lyric region. In that case the boundary is corrected using the following rules. FIG. 8 shows the candidate boundary changing from green to red through LE1 and LE2.

The quantities used in these rules are the top edge, the bottom edge, the new top edge, and the new bottom edge of the candidate boundary.

Regarding candidate boundary filtering: pixel density is high in lyric lines, but music symbols often produce high density as well, and these become outliers. In such cases, the lyric line is corrected using the following rules.

Rule FL1:

In Rule FL1, avg(h) is the average of the vertical histogram of the candidate line, and the constant in the rule is determined experimentally.

Rule FL2:

In Rule FL2, the symbols denote the top of the next staff, the top and bottom of the candidate line, and the bottom of the staff. This rule is applied to the first line (i = 1) of the potential lyric region.

Rule FL3:

In Rule FL3, the symbol denotes the i-th line of the potential lyric region below the j-th staff.

Regarding lyric region extraction: based on the result filtered by the candidate boundary filtering module, the lyric region and the score region are marked as O or I.

Rule LA1: if the condition of this rule holds, both O and I are marked as lyrics.

Rule LA2:

Rule LA3:

Rule LA4: if none of the above rules (LA1, LA2, and LA3) is satisfied, I is marked as lyrics and O as a music symbol.

For the in-part I, the symbols denote its width, height, left edge, and right edge. For the out-part O, the symbols denote its width and height. For the joint-part, the symbols denote its left edge and right edge.

According to an embodiment of the present invention, which removes the lyric region quickly and with little computation during optical music recognition, the computation is simplified, which is an advantage in mobile environments with limited computing power.

Further, according to the present invention, the recognition rate of the score can be improved by resolving cases in which lyrics touch or overlap music symbols during optical music recognition. By digitizing printed scores regardless of time and place, optical music recognition also enables fast work in fields ranging from performance, analysis, and comparison to arrangement and composition.

Although the present invention has been described and illustrated in connection with preferred embodiments that exemplify its technical idea, it is not limited to the configuration and operation exactly as shown and described. Those skilled in the art will understand that many changes and modifications may be made to the invention without departing from the scope of its technical idea. Accordingly, all such appropriate changes, modifications, and equivalents should be regarded as falling within the scope of the present invention.

100: preprocessing unit 110: binarization module
120: correction module 200: staff information extraction unit
210: staff detection module 220: staff information extraction module
300: lyric line extraction unit 310: lyric region setting module
320: zone division module 330: run-length search module
340: candidate boundary information extraction module 350: candidate boundary filtering module
360: lyric region extraction module

Claims

A lyric region extraction apparatus for automatic performance of a photographed music score image, comprising:
a preprocessing unit for binarizing the music score image;
a staff information extraction unit for extracting staff information from the preprocessed music score image; and
a lyric line extraction unit for setting a potential lyric region in the preprocessed music score image, finding and filtering zones and run-lengths, and finding candidate boundary information to extract the lyric region,
wherein the preprocessing unit comprises: a binarization module for binarizing the input score image; and
a correction module for correcting distortion of the score image using staff-line and bar-line information obtained from the binarized result,
and wherein the correction module detects the staff lines, detects the bar lines using the line characteristics of the staff region, divides the staff into local regions bounded by the staff lines and bar lines using the staff-line and bar-line information, and rearranges each region so that it is horizontal, thereby correcting the distortion.

(deleted)

The apparatus according to claim 1,
wherein the staff information extraction unit comprises:
a staff detection module for detecting the staff; and
a staff information extraction module for evaluating the space and height information of the staff lines.

The apparatus according to claim 1,
wherein the lyric line extraction unit comprises:
a lyric region setting module for setting the area between one staff and the next as a potential lyric region;
a zone division module for dividing the potential lyric region set by the lyric region setting module into N zones;
a run-length search module for performing a vertical projection to find consecutive non-zero run-lengths in the zones divided by the zone division module and, when the space between run-lengths is less than or equal to the staff line height, merging those run-lengths;
a candidate boundary information extraction module for correcting and extracting the candidate boundary when noise is present after the binarization performed by the preprocessing unit;
a candidate boundary filtering module for filtering the candidate boundary obtained by the candidate boundary information extraction module; and
a lyric region extraction module for marking the lyric region and the score region as O or I according to the result of filtering by the candidate boundary filtering module.

A method using a lyric region extraction apparatus for automatic performance of a photographed music score image, the method comprising:
(a) preprocessing, by a control unit of the lyric region extraction apparatus, the score image;
(b) extracting, by the control unit, staff information from the preprocessed music score image; and
(c) setting, by the control unit, a potential lyric region in the preprocessed score image, finding and filtering zones and run-lengths, and finding candidate boundary information to extract the lyric region,
wherein step (a) comprises: (a-1) binarizing, by the control unit, the input score image; and (a-2) correcting, by the control unit, distortion of the score image using staff-line and bar-line information obtained from the binarized result,
and wherein, in step (a-2), the control unit detects the staff lines, detects the bar lines using the line characteristics of the staff region, divides the staff into local regions bounded by the staff lines and bar lines, and rearranges each region so that it is horizontal, thereby correcting the distortion.

(deleted)

The method according to claim 5,
wherein step (b) comprises:
(b-1) detecting, by the control unit, the staff; and
(b-2) extracting, by the control unit, the staff information by evaluating the space and height information of the staff lines.

The method according to claim 5,
wherein step (c) comprises:
(c-1) setting, by the control unit, the area between one staff and the next as a potential lyric region;
(c-2) dividing, by the control unit, the potential lyric region thus set into N zones;
(c-3) performing, by the control unit, a vertical projection to find consecutive non-zero run-lengths in the divided zones and, when the space between run-lengths is less than or equal to the staff line height, merging those run-lengths;
(c-4) correcting and extracting, by the control unit, the candidate boundary when noise is present after the preprocessing of step (a);
(c-5) filtering, by the control unit, the candidate boundary of step (c-4); and
(c-6) marking, by the control unit, the lyric region and the score region as O or I according to the result of the filtering in step (c-5).