KR100659542B1

KR100659542B1 - Method and system for searching the Korean alphabet, and computer readable storage medium

Info

Publication number: KR100659542B1
Application number: KR1020060022954A
Authority: KR
Inventors: 박영철; 박종철
Original assignee: 주식회사 퓨전소프트
Priority date: 2006-03-13
Filing date: 2006-03-13
Publication date: 2006-12-19

Abstract

A method and system for searching Hangul, and a computer-readable recording medium therefor, are provided so that a Hangul search program, such as the LIKE operation of SQL (Structured Query Language), can search with a keyword consisting of a consonant, a vowel, or a consonant plus a vowel. The keyword is entered using a Hangul search form that comprises a searcher, representing at least one of a consonant, a vowel, and a syllable formed of a consonant and a vowel, and a searcher indicator placed in front of, behind, or on both sides of the searcher to mark it as a searcher. The range of Hangul code values that can contain a character used as a component of the searcher in the entered keyword is calculated (27). Syllables that fall within the calculated code value range are extracted from the search targets (28). Character strings containing the extracted syllables are output as the search result (30).

Description

Method and system for searching the Korean alphabet, and computer readable storage

FIG. 1 is an explanatory diagram illustrating the normalization of an input form.

FIG. 2 is a flowchart illustrating the flag-setting process for an input form.

FIG. 3 is a flowchart illustrating the process of searching data in a database using the normalized form and the form flags.

* Explanation of reference numerals for the main parts of the drawings *

50: input form    50a: normalized form

52: searcher indicator    54: Hangul search form

The present invention relates to a Hangul search form that diversifies Hangul searching in Hangul search programs such as the LIKE operation of SQL (Structured Query Language), to a Hangul search method using the form, to a data storage medium storing a program that uses the Hangul search method, and to a system therefor.

The LIKE operation of SQL takes a search keyword and returns the data whose codes match that keyword. In Hangul, an initial consonant alone, a medial vowel alone, an initial + medial syllable, and an initial + medial + final syllable all have different codes, so a keyword made up only of an initial consonant cannot find syllables composed of initial + medial or of initial + medial + final.

The LIKE operator was developed for English and operates on the individual letters of the English alphabet. When LIKE was applied to Hangul, the pattern that had applied to each English letter was made to apply to each Hangul syllable. For example, to find strings that begin with the English letter 'P', the pattern is set to 'P%'; to find strings that begin with the Hangul syllable '박', the pattern is set to '박%'. Considering that the Hangul syllable '박' is written 'Park' in English, LIKE can find strings beginning with 'P' using 'P%', strings beginning with 'Pa' using 'Pa%', and strings whose second letter is 'a' using '_a%'. Yet it provides no way to find syllables beginning with 'ㅂ' (corresponding to English 'P'), syllables beginning with '바' (corresponding to 'Pa'), or syllables containing 'ㅏ' (corresponding to 'a'), and therefore fails to take full advantage of the structure of Hangul.

An object of the present invention is to provide a Hangul search form, and a Hangul search method using it, that exploit the structure of Hangul so that a search can be performed with a keyword consisting of a Hangul initial consonant, a medial vowel, or an initial + medial syllable.

Another object of the present invention is to provide a Hangul search method that can search with a keyword consisting of a Hangul initial consonant.

Still another object of the present invention is to provide a Hangul search method that can search with a keyword consisting of a Hangul medial vowel.

Still another object of the present invention is to provide a Hangul search method that can search with a keyword consisting of a Hangul initial + medial syllable.

Still another object of the present invention is to provide a Hangul search method that can search with keywords consisting of Hangul initial consonants, medial vowels, and initial + medial syllables.

Still another object of the present invention is to provide a data storage medium storing a program that uses the Hangul search method of the present invention.

Still another object of the present invention is to provide a Hangul search system that implements the Hangul search method of the present invention.

The Hangul search method using a Hangul search form according to the present invention uses a Hangul search form comprising a searcher, which represents at least one of a Hangul initial consonant, a syllable composed of a Hangul initial consonant and a medial vowel, and a Hangul vowel, and a searcher indicator, which is placed in front of, behind, or both in front of and behind the searcher so that the searcher can be identified as a searcher.

In this case, the form preferably comprises a searcher representing one or more of a Hangul initial consonant, a syllable composed of a Hangul initial consonant and a medial vowel, and a Hangul vowel, together with a searcher indicator that is placed in front of or behind the searcher when the searcher is exactly one character, and both in front of and behind it when the searcher is one or more characters.

One or more reserved ordinary or special characters may be used as the searcher indicator.

In some cases, the searcher indicator may be the escape character of SQL.

In the Hangul search form, besides the cases '/ㄱ', 'ㄱ/', and '/ㄱ/', the searcher may consist of more than one character, as in '/ㄱ가ㅏ/'. The searcher indicator may also be varied.

The Hangul search method according to the present invention is configured to determine whether a Hangul syllable in the search target belongs to the set of Hangul characters that can contain a character used as an element of the searcher.

The Hangul search method using Hangul code value ranges according to the present invention outputs a search result according to whether the code value of a character in the search target falls within the range of code values of the Hangul characters that can contain a character used as an element of the searcher.

The Hangul search method using the code value range comprises:

a code value range calculation step of calculating the range of code values of the Hangul characters that can contain the character used as an element of the searcher, the character being one of the Hangul initial consonants, the Hangul vowels, and the syllables composed only of an initial consonant and a vowel; a Hangul syllable extraction step of extracting, from the search target, the Hangul syllables that fall within the calculated code value range; and a search result output step of outputting the character strings containing the extracted Hangul syllables as the search result.
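The extraction and output steps above can be sketched as a simple range test over each candidate string. This is only an illustrative sketch, not the patent's implementation: it assumes the strings have already been decoded into Unicode code points, and the function name `contains_in_range` is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if any code point in text[0..len)
 * lies in the half-open range [lb, ub). A string for which this holds
 * would be reported as a search result.
 */
bool contains_in_range(const uint32_t *text, size_t len,
                       uint32_t lb, uint32_t ub)
{
    for (size_t k = 0; k < len; k++) {
        if (text[k] >= lb && text[k] < ub)
            return true;
    }
    return false;
}
```

For instance, with the range of syllables whose initial consonant is 'ㅂ' in Unicode (U+BC14 up to, but not including, U+BC14 + 588), a string containing '박' (U+BC15) matches, while a string consisting only of '김' (U+AE40) does not.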

The character used as an element of the searcher is a Hangul Wansung (precomposed-code) initial consonant, and

the code value range calculation step preferably includes:

an initial-consonant order value setting step of setting an order value for each initial consonant so that the search program, or a program linked to it, can identify the position, among the Hangul initial consonants, of the initial consonant entered as an element of the searcher; an input form setting step of defining an input form of the Hangul search form that includes a searcher indicator indicating that an element of the searcher may be an initial consonant, so that the input form can be used as a searcher; a normalization step of normalizing the searcher entered in the input form into a string form from which the searcher indicator has been removed; and a code value reading step of using the set order values to read the minimum and maximum code values of the Hangul syllables that can have the entered initial consonant as their initial.

In the syllable extraction step, the minimum and maximum code values that were read are preferably compared with the code value of each syllable in the search target, and the Hangul syllables whose code value is greater than or equal to the minimum and less than the maximum are extracted.

In the normalization step, a form flag is preferably generated so that it can be identified whether the Hangul character following the searcher indicator in the Hangul search form is an initial consonant.

In the step of setting the order value of each initial consonant, 'ㄱ' among the Hangul consonants is assigned the minimum value CON_START, each subsequent consonant in dictionary order is assigned a value one greater, and 'ㅎ' is assigned the maximum value CON_END. So that the position of the initial consonant entered as a searcher can be identified among the Hangul consonants (CON_START to CON_END), the order values of the initial consonants are preferably set according to the following array, in which -1 marks a consonant that cannot be used as an initial:

static const int Initial_Consonant[] =
{
    0, 1, -1, 2, -1, -1, 3, 4, 5, -1,        /* 'ㄱ', 'ㄲ', 'ㄳ', 'ㄴ', 'ㄵ', 'ㄶ', 'ㄷ', 'ㄸ', 'ㄹ', 'ㄺ' */
    -1, -1, -1, -1, -1, -1, 6, 7, 8, -1,     /* 'ㄻ', 'ㄼ', 'ㄽ', 'ㄾ', 'ㄿ', 'ㅀ', 'ㅁ', 'ㅂ', 'ㅃ', 'ㅄ' */
    9, 10, 11, 12, 13, 14, 15, 16, 17, 18    /* 'ㅅ', 'ㅆ', 'ㅇ', 'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ' */
};

Hereinafter this array is also referred to as the array IC_Index.
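As an illustration of how such an order-value table might be consulted, the sketch below maps a consonant to its initial-consonant index. It assumes, purely for the example, that the input is a Unicode Hangul Compatibility Jamo code point, so that CON_START corresponds to U+3131 ('ㄱ') and CON_END to U+314E ('ㅎ'); the function name `initial_consonant_index` is hypothetical and the patent itself works on Wansung code values here.

```c
#include <stdint.h>

/* Assumed for this sketch: Unicode Hangul Compatibility Jamo code points. */
#define CON_START 0x3131u /* 'ㄱ' */
#define CON_END   0x314Eu /* 'ㅎ' */

/* Order values from the text; -1 marks consonants that cannot be initials. */
static const int Initial_Consonant[] = {
    0, 1, -1, 2, -1, -1, 3, 4, 5, -1,
    -1, -1, -1, -1, -1, -1, 6, 7, 8, -1,
    9, 10, 11, 12, 13, 14, 15, 16, 17, 18
};

/*
 * Hypothetical helper: returns the initial-consonant index (0..18) of x,
 * or -1 if x is not a consonant usable as an initial.
 */
int initial_consonant_index(uint32_t x)
{
    if (x < CON_START || x > CON_END)
        return -1;
    return Initial_Consonant[x - CON_START];
}
```

Under these assumptions 'ㅂ' (U+3142) yields index 7, while the cluster 'ㄳ' (U+3133), which occurs only as a final consonant, yields -1.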

In the code value range calculation step, let X be the initial consonant contained in the searcher, with a value between CON_START and CON_END, and let Initial_Consonant[X - CON_START] = i. If the range between the minimum and maximum code values that any syllable S having the i-th initial consonant as its initial can take is written LB_Inclusive ≤ S < UB_Exclusive, then LB_Inclusive is ICV[i*VOWEL_SIZE] and UB_Exclusive is ICV[i*VOWEL_SIZE + VOWEL_SIZE],

where VOWEL_SIZE is the number of Hangul vowels, and

ICV is the following array:

static const unsigned short ICV[] =
{
    '가', '개', '갸', '걔', '거', '게', '겨', '계', '고', '과', '괘', '괴', '교', '구', '궈', '궤', '귀', '규', '그', '긔', '기',
    '까', '깨', '꺄', '꺼', '꺼', '께', '껴', '꼐', '꼬', '꽈', '꽤', '꾀', '꾜', '꾸', '꿔', '꿰', '뀌', '뀨', '끄', '끼', '끼',
    '나', '내', '냐', '너', '너', '네', '녀', '녜', '노', '놔', '뇌', '뇌', '뇨', '누', '눠', '눼', '뉘', '뉴', '느', '늬', '니',
    '다', '대', '댜', '더', '더', '데', '뎌', '뎨', '도', '돠', '돼', '되', '됴', '두', '둬', '뒈', '뒤', '듀', '드', '듸', '디',
    '따', '때', '떠', '떠', '떠', '떼', '뗘', '또', '또', '똬', '뙈', '뙤', '뚜', '뚜', '뛔', '뛔', '뛰', '뜨', '뜨', '띄', '띠',
    '라', '래', '랴', '러', '러', '레', '려', '례', '로', '롸', '뢨', '뢰', '료', '루', '뤄', '뤠', '뤼', '류', '르', '리', '리',
    '마', '매', '먀', '머', '머', '메', '며', '몌', '모', '뫄', '뫼', '뫼', '묘', '무', '뭐', '뭬', '뮈', '뮤', '므', '미', '미',
    '바', '배', '뱌', '버', '버', '베', '벼', '볘', '보', '봐', '봬', '뵈', '뵤', '부', '붜', '붸', '뷔', '뷰', '브', '비', '비',
    '빠', '빼', '뺘', '뻐', '뻐', '뻬', '뼈', '뽀', '뽀', '뾔', '뾔', '뾔', '뾰', '뿌', '쀼', '쀼', '쀼', '쀼', '쁘', '삐', '삐',
    '사', '새', '샤', '섀', '서', '세', '셔', '셰', '소', '솨', '쇄', '쇠', '쇼', '수', '숴', '쉐', '쉬', '슈', '스', '시', '시',
    '싸', '쌔', '썅', '써', '써', '쎄', '쏀', '쏀', '쏘', '쏴', '쐐', '쐬', '쑈', '쑤', '쒀', '쒜', '쒸', '쓩', '쓰', '씌', '씨',
    '아', '애', '야', '얘', '어', '에', '여', '예', '오', '와', '왜', '외', '요', '우', '워', '웨', '위', '유', '으', '의', '이',
    '자', '재', '쟈', '쟤', '저', '제', '져', '졔', '조', '좌', '좨', '죄', '죠', '주', '줘', '줴', '쥐', '쥬', '즈', '지', '지',
    '짜', '째', '쨔', '쩌', '쩌', '쩨', '쪄', '쪼', '쪼', '쫘', '쫴', '쬐', '쭁', '쭈', '쭤', '쮜', '쮜', '쮸', '쯔', '찌', '찌',
    '차', '채', '챠', '처', '처', '체', '쳐', '쳬', '초', '촤', '최', '최', '쵸', '추', '춰', '췌', '취', '츄', '츠', '치', '치',
    '카', '캐', '캬', '커', '커', '케', '켜', '켸', '코', '콰', '쾌', '쾨', '쿄', '쿠', '쿼', '퀘', '퀴', '큐', '크', '키', '키',
    '타', '태', '탸', '터', '터', '테', '텨', '톄', '토', '톼', '퇘', '퇴', '툐', '투', '퉈', '퉤', '튀', '튜', '트', '틔', '티',
    '파', '패', '퍄', '퍼', '퍼', '페', '펴', '폐', '포', '퐈', '푀', '푀', '표', '푸', '풔', '퓌', '퓌', '퓨', '프', '피', '피',
    '하', '해', '햐', '허', '허', '헤', '혀', '혜', '호', '화', '홰', '회', '효', '후', '훠', '훼', '휘', '휴', '흐', '희', '히'
};

The array is built from the characters used in the Hangul Wansung code. For a syllable that the Wansung code does not use, the syllable with the smallest code value among those greater than or equal to it is preferably repeated in its place.

In some cases, the character used as an element of the searcher is a Unicode Hangul initial consonant x, and

the code value range calculation step preferably includes a code value reading step in which:

if x is a value between UNI_IC_START and UNI_IC_END, the range of Unicode initial consonants, x - UNI_IC_START is taken as the initial consonant index of x;

if x is a value between UNI_CONSONANT_START and UNI_CONSONANT_END, the range of Unicode consonants, x is judged to be an initial consonant only when the value of the KS X 1001 array IC_Index[x - UNI_CONSONANT_START] is not -1, and that value is taken as its initial consonant index; and

the minimum and maximum code values of the Hangul syllables that can have, as their initial, the initial consonant corresponding to the initial consonant index are read.

In this case, in the Hangul syllable extraction step, letting R be the range of values that any syllable S whose initial consonant has the index ic_idx can take, and letting m be ('가' + ic_idx * 588), the range m ≤ R < (m + 588) is preferably set and the Hangul syllables belonging to this range are extracted.
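The Unicode range just described follows from the layout of the precomposed Hangul syllable block, where each initial consonant owns 21 × 28 = 588 consecutive code points starting at '가' (U+AC00). A minimal sketch of the computation follows; the function and macro names are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define HANGUL_BASE           0xAC00u /* code point of '가' */
#define SYLLABLES_PER_INITIAL 588u    /* 21 medial vowels * 28 final slots */

/* Computes the half-open range [*lb, *ub) for initial index ic_idx (0..18). */
void initial_range(unsigned ic_idx, uint32_t *lb, uint32_t *ub)
{
    *lb = HANGUL_BASE + ic_idx * SYLLABLES_PER_INITIAL; /* m       */
    *ub = *lb + SYLLABLES_PER_INITIAL;                   /* m + 588 */
}

/* Returns true if syllable s has the initial consonant with index ic_idx. */
bool has_initial(uint32_t s, unsigned ic_idx)
{
    uint32_t lb, ub;
    initial_range(ic_idx, &lb, &ub);
    return s >= lb && s < ub;
}
```

With ic_idx = 7 ('ㅂ'), the range runs from '바' (U+BC14) up to, but not including, '빠' (U+BE60), so '박' (U+BC15) is accepted and '김' (U+AE40) is rejected.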

In some cases, after the minimum and maximum code values of the Hangul code value range R have been obtained, those values may be expressed in any encoding scheme of any character encoding form of the Unicode standard, and the Hangul syllables belonging to the resulting range may then be extracted.

Also, in some cases, the character used as an element of the searcher is a Hangul Wansung vowel, and

the code value range calculation step preferably includes:

a Hangul vowel code value setting step of setting the minimum and maximum code values of the Hangul vowels so that the search program, or a program linked to it, can identify whether the character entered as an element of the searcher is a Hangul vowel; an input form setting step of defining an input form of the Hangul search form that includes a searcher indicator indicating that an element of the searcher may be a vowel, so that the input form can be used as a searcher; a normalization step of normalizing the searcher entered in the input form into a string form from which the searcher indicator has been removed; a vowel order calculation step in which the search program, or a program linked to it, calculates the position, among the Hangul vowels, of the vowel entered as the searcher; and a step of reading the minimum and maximum code values of the Hangul syllables that can have, as their medial, the vowel at the position calculated in the vowel order calculation step.

In this case, in the syllable extraction step, the minimum and maximum code values that were read are preferably compared with the code value of each syllable in the search target, and the Hangul syllables whose code value is greater than or equal to the minimum and less than the maximum are extracted.

In the normalization step, a form flag is preferably generated so that it can be identified whether the Hangul character following the searcher indicator in the Hangul search form is a vowel.

In the vowel order calculation step, when the Hangul character X entered as the searcher has a value between VOWEL_START, the minimum value among the Hangul vowels, and VOWEL_END, the maximum value, the vowel order is preferably calculated as X - VOWEL_START.

In the code value range calculation step, letting X - VOWEL_START = i, the code value ranges that any syllable S having the i-th Hangul vowel as its medial can take are preferably calculated as the ranges of code values of the syllables formed by each initial consonant + the i-th vowel.

In some cases, in the Hangul search method according to the present invention, when the medial vowel x used as an element of the searcher is a value between MV_START ('ㅏ') and MV_END ('ㅣ'), the range of vowels in the Unicode Hangul Jamo, x is judged to be a Hangul vowel and x - MV_START is taken as the medial vowel index vowel_idx of x;

when x is a value between UNI_VOWEL_START ('ㅏ') and UNI_VOWEL_END ('ㅣ'), the range of vowels in the Unicode Hangul Compatibility Jamo, x is judged to be a Hangul vowel and x - UNI_VOWEL_START is taken as the medial vowel index vowel_idx; and

if the range between the minimum and maximum code values that any syllable S whose medial vowel has the index vowel_idx can take is written LB_Inclusive ≤ R < UB_Exclusive, then LB_Inclusive is ('가' + vowel_idx * 28) and UB_Exclusive is ('가' + (INITIAL_CONSONANT_SIZE - 1) * 588 + (vowel_idx + 1) * 28); this matching range is set and the Hangul syllables belonging to it are extracted, where INITIAL_CONSONANT_SIZE is the number of consonants usable as an initial, namely 19.
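The bounds above can be sketched as follows for Unicode precomposed syllables. Note that the single range [LB_Inclusive, UB_Exclusive) spans all 19 initial consonants and therefore also covers syllables with other medial vowels; the exact per-syllable check in `has_medial` is an addition made for this sketch (it is equivalent to testing the 19 per-initial ranges of 28 code points each), and all function and macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define HANGUL_BASE            0xAC00u /* code point of '가' */
#define FINAL_SLOTS            28u     /* 27 finals + "no final" */
#define VOWEL_SIZE             21u
#define INITIAL_CONSONANT_SIZE 19u
#define SYLLABLES_PER_INITIAL  (VOWEL_SIZE * FINAL_SLOTS) /* 588 */

/* Outer bounds as given in the text for medial index vowel_idx (0..20). */
void medial_outer_range(unsigned vowel_idx, uint32_t *lb, uint32_t *ub)
{
    *lb = HANGUL_BASE + vowel_idx * FINAL_SLOTS;
    *ub = HANGUL_BASE
        + (INITIAL_CONSONANT_SIZE - 1u) * SYLLABLES_PER_INITIAL
        + (vowel_idx + 1u) * FINAL_SLOTS;
}

/* Outer-range prefilter followed by an exact test of the medial index. */
bool has_medial(uint32_t s, unsigned vowel_idx)
{
    uint32_t lb, ub;
    medial_outer_range(vowel_idx, &lb, &ub);
    if (s < lb || s >= ub)
        return false;
    return ((s - HANGUL_BASE) / FINAL_SLOTS) % VOWEL_SIZE == vowel_idx;
}
```

For vowel_idx = 0 ('ㅏ'), both '아' (U+C544) and '박' (U+BC15) pass, whereas '김' (U+AE40) lies inside the outer range but fails the exact test because its medial is 'ㅣ' (index 20).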

Also, in some cases, the character used as an element of the searcher is a Hangul Wansung syllable composed of initial + medial, and

the code value range calculation step preferably includes:

an initial + medial syllable order value setting step of setting order values for the syllables composed of initial + medial so that the search program, or a program linked to it, can identify whether the Hangul syllable entered as an element of the searcher is composed of initial + medial; a syllable type checking step in which the search program, or a program linked to it, checks whether the syllable entered as an element of the searcher consists only of initial + medial; an input form setting step of defining an input form of the Hangul search form that includes a searcher indicator indicating that an element of the searcher may be a syllable consisting only of an initial and a medial, so that the input form can be used as a searcher; a normalization step of normalizing the searcher entered in the input form into a string form from which the searcher indicator has been removed; and a step of reading the minimum and maximum code values of the Hangul syllables that can contain the initial + medial syllable included in the searcher.

In this case, in the syllable extraction step, the minimum and maximum code values that were read are preferably compared with the code value of each syllable in the search target, and the Hangul syllables whose code value is greater than or equal to the minimum and less than the maximum are extracted.

In the normalization step, a form flag is preferably generated so that it can be identified whether the Hangul character following the searcher indicator in the Hangul search form is an initial + medial syllable.

In the step of setting the code values of the Hangul syllables composed of initial + medial, an array ICV is provided in order to identify that a two-byte Hangul character is a syllable composed of initial + medial.

The array ICV is:

static const unsigned short ICV[] =

{

In this case, in the code value range calculation step, with the code value of '가' denoted HANGUL_START and the code value of '힝' denoted HANGUL_END, the values of the array ICV are searched by binary search for the initial + medial Hangul syllable X contained in the searcher. If i is the position in the array of the largest value that is less than or equal to the code value of X, the range of code values that syllables containing X can take is preferably calculated as ICV[i] ≤ X < ICV[i+1].
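The binary search described above, which finds the position of the largest table value less than or equal to a given code, can be sketched generically. This is an illustrative sketch rather than the patent's code: `floor_index` is a hypothetical name, and the table is only assumed to be sorted in non-decreasing order, which allows for the repeated entries in ICV.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the largest index i such that
 * table[i] <= x, or -1 if x is smaller than every element.
 * The table must be sorted in non-decreasing order.
 */
long floor_index(const unsigned short *table, size_t n, unsigned short x)
{
    size_t lo = 0, hi = n; /* invariant: table[0..lo) <= x < table[hi..n) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid] <= x)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (long)lo - 1;
}
```

Because the search returns the last of any run of equal entries, table[i + 1] is strictly greater than x whenever i + 1 is a valid index, so the half-open range from table[i] to table[i + 1] is never empty. A caller would still need to handle the case where i is the final index of the table.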

Also, in some cases, the character used as an element of the searcher is a Unicode Hangul syllable composed of initial + medial, and

the code value range calculation step may be configured such that:

when an arbitrary Hangul syllable S has a value between the Unicode minimum code value UNI_HANGUL_START and maximum code value UNI_HANGUL_END, and the value of (S - '가') is equal to ((((S - '가')/588)*588) + ((((S - '가')/28)%21)*28)), the syllable S is judged to be a syllable consisting only of an initial and a medial; and

the code value range of the initial consonant of the syllable S and the code value range of the medial of the syllable S are read.

In the Hangul syllable extraction step,

letting R be the code value range of the syllable S whose initial consonant has the index ic_idx and whose medial vowel has the index vowel_idx, and letting m be ('가' + ic_idx*588 + vowel_idx*28), the matching range m ≤ R < (m + 28) is set and the Hangul syllables belonging to this range are extracted. The initial consonant index ic_idx may be (S - '가')/(VOWEL_SIZE*(FINAL_CONSONANT_SIZE + 1)) = (S - '가')/588, and the medial vowel index vowel_idx may be ((S - '가')/(FINAL_CONSONANT_SIZE + 1))%VOWEL_SIZE = ((S - '가')/28)%21.
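The test for a syllable with no final consonant and the resulting 28-code-point range can be sketched as below for Unicode precomposed syllables ('가' = U+AC00 through '힣' = U+D7A3). The function names are hypothetical and used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define UNI_HANGUL_START 0xAC00u /* '가' */
#define UNI_HANGUL_END   0xD7A3u /* '힣' */

/* True if s is a precomposed syllable made of initial + medial only. */
bool is_initial_medial_only(uint32_t s)
{
    if (s < UNI_HANGUL_START || s > UNI_HANGUL_END)
        return false;
    uint32_t off = s - UNI_HANGUL_START;
    return off == (off / 588u) * 588u + ((off / 28u) % 21u) * 28u;
}

/*
 * For a syllable s in the Hangul syllable block, computes the range
 * [*lb, *ub) of all syllables sharing its initial and medial.
 */
void initial_medial_range(uint32_t s, uint32_t *lb, uint32_t *ub)
{
    uint32_t off = s - UNI_HANGUL_START;
    uint32_t ic_idx = off / 588u;           /* initial consonant index */
    uint32_t vowel_idx = (off / 28u) % 21u; /* medial vowel index      */
    *lb = UNI_HANGUL_START + ic_idx * 588u + vowel_idx * 28u; /* m      */
    *ub = *lb + 28u;                                          /* m + 28 */
}
```

For the searcher '바' (U+BC14), the check succeeds and the range is U+BC14 up to, but not including, U+BC30, which contains '박' (U+BC15) and excludes '배' (U+BC30).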

The Hangul search method using Hangul code value ranges according to the present invention may comprise: an initial consonant order value setting step of setting the order value of each initial consonant so that the search program, or a program linked to it, can identify which initial consonant, among the Hangul consonants, is entered as an element of the searcher; a Hangul vowel code value setting step of setting the minimum and maximum code values of the Hangul vowels so that the program can identify whether a character entered as an element of the searcher is a Hangul vowel; an initial + medial syllable code value setting step of setting, in order, the code values of the syllables composed of an initial consonant and a medial vowel so that the program can identify whether a syllable entered as an element of the searcher is such a syllable; an input form setting step of defining the input form of the Hangul search form, which includes a searcher indicator showing that an element of the searcher is at least one of a Hangul initial consonant, a Hangul medial vowel, and a syllable composed of an initial consonant and a medial vowel, so that this input form can be used as a searcher; a normalization step of converting the searcher entered in the input form into a string form from which the searcher indicator is removed and generating a form flag indicating the type of the string form; a code value range calculating step of calculating the Hangul range of the string form according to the generated form flag; a code value comparing step of checking whether the code value of a syllable in the search target falls within the calculated code value range; an extraction step of extracting the Hangul syllables whose code values fall within the calculated range; and a search result output step of outputting the strings containing the extracted syllables as the search result.

In this case, the normalization step preferably follows these rules:

(1) An array zPatternFlag is provided with as many entries as the number of bytes in the array zPattern, which holds the normalized form. zPatternFlag[i] identifies the Hangul search form made up of zPattern[i] and zPattern[i+1]. The default value of every entry of zPatternFlag is NULL(0), and the value is also NULL(0) when the form is not Hangul.

(2) For a pair consisting of a searcher indicator and one Hangul character:

(2-1) the searcher indicator is not stored in the array zPattern;

(2-2) the two bytes of the Hangul character are stored at the next storage positions of the array zPattern, whose indexes are denoted i and i+1; and

(2-3) zPatternFlag[i] is set to NULL, NULL, INITIAL_CONSONANT(1), INITIAL_MEDIAL_SYLLABLE(2), or MEDIAL_VOWEL(3) when the Hangul character is, respectively, a consonant used only as a final (the final consonants minus the initial consonants), an initial + medial + final syllable, an initial consonant, an initial + medial syllable, or a vowel, and zPatternFlag[i+1] is set to NULL.
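Rules (1) to (2-3) can be illustrated with the following C sketch. It rests on several assumptions that are not in the text: the pattern is encoded as UTF-16BE so that every character occupies two bytes, characters are classified by their Unicode ranges (compatibility jamo U+3131 to U+3163 and precomposed syllables U+AC00 to U+D7A3), and the names classify, normalize, and IC_INDEX are hypothetical. The check `(c - 0xAC00) % 28 == 0` is an equivalent shortcut for "no final consonant".

```c
#include <stddef.h>
#include <stdint.h>

enum { FLAG_NULL = 0, INITIAL_CONSONANT = 1, INITIAL_MEDIAL_SYLLABLE = 2, MEDIAL_VOWEL = 3 };

/* Initial-consonant index for each compatibility consonant U+3131..U+314E;
 * -1 marks a consonant that is used only as a final. */
static const signed char IC_INDEX[30] = {
     0,  1, -1,  2, -1, -1,  3,  4,  5, -1,
    -1, -1, -1, -1, -1, -1,  6,  7,  8, -1,
     9, 10, 11, 12, 13, 14, 15, 16, 17, 18
};

/* Form flag for the character that follows a searcher indicator. */
static unsigned char classify(uint16_t c) {
    if (c >= 0x3131 && c <= 0x314E)
        return IC_INDEX[c - 0x3131] >= 0 ? INITIAL_CONSONANT : FLAG_NULL;
    if (c >= 0x314F && c <= 0x3163)
        return MEDIAL_VOWEL;
    if (c >= 0xAC00 && c <= 0xD7A3)
        return (c - 0xAC00) % 28 == 0 ? INITIAL_MEDIAL_SYLLABLE : FLAG_NULL;
    return FLAG_NULL;                      /* not Hangul */
}

/* Copies the UTF-16BE pattern `in` (n bytes) to zPattern without searcher
 * indicators and fills zPatternFlag in parallel. Both output buffers must
 * hold at least n bytes. Returns the number of bytes written. */
size_t normalize(const unsigned char *in, size_t n, uint16_t indicator,
                 unsigned char *zPattern, unsigned char *zPatternFlag) {
    size_t out = 0;
    for (size_t k = 0; k + 1 < n; k += 2) {
        uint16_t c = (uint16_t)((in[k] << 8) | in[k + 1]);
        unsigned char flag = FLAG_NULL;
        if (c == indicator && k + 3 < n) { /* rule (2-1): drop the indicator */
            k += 2;
            c = (uint16_t)((in[k] << 8) | in[k + 1]);
            flag = classify(c);            /* rule (2-3) */
        }
        zPattern[out] = in[k];             /* rule (2-2): store both bytes */
        zPattern[out + 1] = in[k + 1];
        zPatternFlag[out] = flag;
        zPatternFlag[out + 1] = FLAG_NULL;
        out += 2;
    }
    return out;
}
```

For the pattern "$가%" with '$' as the searcher indicator, this sketch stores '가' with the flag INITIAL_MEDIAL_SYLLABLE and '%' with the flag NULL. How an indicator followed by a non-Hangul character should be handled is not stated in the text; here the indicator is simply dropped.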

In the Hangul search method according to the present invention, when an initial consonant x used as an element of the searcher lies between UNI_IC_START and UNI_IC_END, the range of the Unicode initial consonants, x - UNI_IC_START is taken as the initial consonant index ic_idx of x. When x lies between UNI_CONSONANT_START and UNI_CONSONANT_END, the range of the Unicode consonants, x is judged to be an initial consonant only if the entry IC_Index[x - UNI_CONSONANT_START] of the KSX 1001 array is not -1, and that entry is taken as its initial consonant index ic_idx. With S1 denoting a syllable to be compared, the method may extract every S1 for which (S1 - '가')/588 equals the initial consonant index ic_idx of x.

In some cases, when a medial vowel x used as an element of the searcher lies between MV_START and MV_END, the range of the vowels in the Unicode Hangul Jamo, x is judged to be a Hangul vowel and x - MV_START is taken as its medial vowel index vowel_idx. When x lies between UNI_VOWEL_START and UNI_VOWEL_END, the range of the vowels in the Unicode Hangul Compatibility Jamo, x is likewise judged to be a Hangul vowel and x - UNI_VOWEL_START is taken as vowel_idx. With S1 denoting a syllable to be compared, the method may extract every S1 for which ((S1 - '가')/28)%21 equals the medial vowel index vowel_idx of x.

Here, '%' is the operator that divides the number before it by the number after it and takes the remainder.
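The index calculations and the two syllable tests described above might look like this in C. The numeric ranges are assumptions taken from the Unicode code charts, since the text names the constants without giving their values, and the function names are hypothetical. The lookup through IC_Index for compatibility consonants is left out of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed Unicode ranges (the text gives only the constant names). */
#define UNI_IC_START    0x1100u  /* Hangul Jamo initial consonants U+1100..U+1112 */
#define UNI_IC_END      0x1112u
#define MV_START        0x1161u  /* Hangul Jamo medial vowels U+1161..U+1175 */
#define MV_END          0x1175u
#define UNI_VOWEL_START 0x314Fu  /* Hangul Compatibility Jamo vowels U+314F..U+3163 */
#define UNI_VOWEL_END   0x3163u
#define GA              0xAC00u  /* '가' */
#define LAST_SYLLABLE   0xD7A3u

/* Initial consonant index of x, or -1 if x is not in the initial-consonant range. */
int initial_index_of_jamo(uint32_t x) {
    return (x >= UNI_IC_START && x <= UNI_IC_END) ? (int)(x - UNI_IC_START) : -1;
}

/* Medial vowel index of x from either vowel range, or -1 if x is not a vowel. */
int vowel_index_of_jamo(uint32_t x) {
    if (x >= MV_START && x <= MV_END) return (int)(x - MV_START);
    if (x >= UNI_VOWEL_START && x <= UNI_VOWEL_END) return (int)(x - UNI_VOWEL_START);
    return -1;
}

/* (S1 - '가')/588 == ic_idx */
bool has_initial(uint32_t s1, int ic_idx) {
    return s1 >= GA && s1 <= LAST_SYLLABLE && (int)((s1 - GA) / 588u) == ic_idx;
}

/* ((S1 - '가')/28)%21 == vowel_idx */
bool has_vowel(uint32_t s1, int vowel_idx) {
    return s1 >= GA && s1 <= LAST_SYLLABLE && (int)(((s1 - GA) / 28u) % 21u) == vowel_idx;
}
```

For instance, the compatibility vowel 'ㅠ' (U+3160) yields vowel_idx 17, and the syllable '쁑' (U+C051) passes has_vowel with 17 and has_initial with 8.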

In some cases, when the character used as an element of the searcher is a single Unicode Hangul syllable S, the method judges S to consist only of an initial consonant and a medial vowel if S lies between the minimum Unicode code value UNI_HANGUL_START and the maximum code value UNI_HANGUL_END and (S - '가') equals ((S - '가')/588)*588 + (((S - '가')/28)%21)*28.

The initial consonant index ic_idx of S is calculated as (S - '가')/(VOWEL_SIZE*(FINAL_CONSONANT_SIZE + 1)) = (S - '가')/588,

the medial vowel index vowel_idx is calculated as ((S - '가')/(FINAL_CONSONANT_SIZE + 1))%VOWEL_SIZE = ((S - '가')/28)%21,

and, with S1 denoting a syllable to be compared, every S1 for which (S1 - '가')/588 equals ic_idx and ((S1 - '가')/28)%21 equals vowel_idx is extracted.

The Hangul search method according to the present invention may comprise: finding, in the array ICV, the largest index m among the entries holding the largest value that is less than or equal to the syllable to be compared; taking m % VOWEL_SIZE as the medial vowel index of that syllable; and comparing this index with the medial vowel index of the searcher and extracting the syllable when the two are equal.

Alternatively, when the searcher is the j-th medial vowel of the precomposed (Wansung) Hangul set, the method guesses that the medial vowel index of the syllable S to be compared is the medial vowel index j of the Hangul search form, finds the initial consonant index of S, guesses the ICV index of S from these, and then verifies the guess.

ICV_2D[i][j] is treated as identical to ICV[m], where m = i * VOWEL_SIZE + j, i = m / VOWEL_SIZE, and j = m % VOWEL_SIZE.

The verification proceeds as follows:

a. Among the IC_SIZE entries of the j-th column of the array ICV_2D, that is, ICV_2D[0][j], ICV_2D[1][j], …, ICV_2D[IC_SIZE-1][j], the entry holding the largest value that is less than or equal to S is found by binary search.

b. With m denoting the index of that entry, the syllable S is extracted if entry m is the last entry of the array ICV or if S < ICV[m+1].

The data storage medium according to the present invention stores a program that uses any one of the Hangul search methods described above.

The Hangul search system according to the present invention comprises means for performing each step of any one of the Hangul search methods described above.

In the present invention, a Hangul search form is one kind of string form used to search character strings. It consists of a predecessor and a searcher, a searcher and a successor, or a predecessor, a searcher, and a successor. The predecessor, the successor, or the two together act as the searcher indicator, which lets the search program recognize the searcher.

The predecessor, the successor, or both, that is, the searcher indicator, may be a reserved character or characters, or a special character. In an SQL LIKE query, for example, the special character '$' may be reserved as the searcher indicator, or a searcher indicator may be declared explicitly. Unless stated otherwise, the Hangul search form in the remainder of this description takes the form of a searcher indicator followed by a searcher.

The searcher of a Hangul search form is a Hangul initial consonant, a syllable composed of an initial consonant and a medial vowel, or a vowel.

The search method using the Hangul search form presented here is intended for Hangul search: for a Hangul keyword, it determines the range of all syllables (initial + medial, or initial + medial + final) that can be composed from that keyword and checks whether the data falls within that range.

In the present invention, "Hangul search form" is another name for "Hangul range form".

An initial consonant or a vowel can be distinguished by itself from the syllables handled by a conventional search, which looks for syllables with an identical code value. A Hangul syllable composed of an initial consonant and a medial vowel, however, always requires a searcher indicator.

The searcher indicator signals that the search is not a conventional one for syllables identical to the given initial + medial syllable, but one for every syllable that has both that initial consonant and that medial vowel.

The searcher indicator is a reserved word that marks the searcher as a Hangul initial consonant, an initial + medial syllable, or a vowel. In the string form of the SQL LIKE operation it is called an escape character, but other languages may use other names; the SQL escape character is thus one example of the searcher indicator of the present invention. A character such as '$' that is not the escape character of the LIKE string form can also serve as the searcher indicator.

The present invention is not limited to the SQL LIKE operation. It applies equally to operations in other languages that search with any one of, or a combination of, a Hangul initial consonant, an initial + medial Hangul syllable, and a Hangul vowel.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Before turning to the drawings, some preliminaries are in order. The present invention assumes that the Hangul codes used for searching are arranged in Korean dictionary order. A searcher indicator can be declared in the form of a search keyword. A Hangul keyword immediately following the searcher indicator is searched using the Hangul search form, whereas a Hangul keyword in a form with no declared escape character is searched exactly as in a conventional operation such as LIKE. The search method normalizes the search keyword and searches the data through the normalized form; the range of the Hangul keyword is determined during normalization.

In the present invention, a Hangul search form is a Hangul initial consonant, an initial + medial Hangul syllable, or a Hangul vowel that follows the searcher indicator in a search expression using an operator such as LIKE.

These match, respectively, all Hangul syllables whose initial is that consonant, all Hangul syllables whose initial and medial are that consonant and vowel, and all Hangul syllables whose medial is that vowel.

Among the Hangul keywords that can follow the searcher indicator in such a search expression, consonants used only as finals and initial + medial + final syllables are not classified as Hangul search forms. For a final-only consonant, it is ambiguous whether the search is for syllables having that consonant as an initial or as a final. An initial + medial + final syllable can be searched in the same way as an ordinary LIKE search.

Table 1. Classification of the Hangul that can follow the searcher indicator, with examples of Hangul search forms.

Table 1 classifies, by type, the Hangul that can follow the searcher indicator and shows which of them are Hangul search forms. The third column of Table 1, the matching range, gives the range of the value x in the comparison string that matches each Hangul search form; |x| = 1 means that the value consists of a single syllable. Here '/' is used as the searcher indicator.

In Table 1, for the Hangul search form '/ㄱ', the consonant 'ㄱ' is an initial consonant, so the form qualifies as a Hangul search form; the matching values are single syllables greater than or equal to '가' and less than '까'. For '/ㄻ', 'ㄻ' is used only as a final, so '/ㄻ' is not a Hangul search form and is handled the same as 'ㄻ'; that is, the consonant 'ㄻ' itself is searched for in the comparison string. For the Hangul search form '/가', every Hangul syllable whose initial is 'ㄱ' and whose medial is 'ㅏ' matches, so the matching range is greater than or equal to '가' and less than '개'. For '/감', the syllable '감' consists of initial + medial + final, so its range in Hangul has the same minimum and maximum, '감', and there is no reason to treat it as a Hangul search form. For the Hangul search form '/ㅏ', every syllable whose medial is 'ㅏ' matches, so the matching range equals the union of the Hangul search forms that combine 'ㅏ' with each initial consonant, that is, "/가 U /나 U … U /하"; '닮', '날', and '할', for example, match.

The following facts are used to decide whether a two-byte Hangul character in the string form of a LIKE operation forms a Hangul search form.

(1) There are 30 (CON_SIZE) Hangul consonants. Their order is fixed lexicographically, and in the computer representation they take consecutive values increasing by 1, from the minimum 'ㄱ', called CON_START, to the maximum 'ㅎ', called CON_END. Of these, 19 (IC_SIZE) are initial consonants, that is, consonants used as an initial, and 27 can be used as a final (all consonants except 'ㄸ', 'ㅃ', and 'ㅉ').

(2) There are 21 (VOWEL_SIZE) Hangul vowels. All of them are used as medials. Their order is fixed lexicographically, and in the computer representation they take consecutive values increasing by 1, from the minimum 'ㅏ', called VOWEL_START, to the maximum 'ㅣ', called VOWEL_END.

(3) The Hangul syllables are assigned consecutive values in sorted order; the minimum, '가', is called HANGUL_START and the maximum, '힝', is called HANGUL_END. Which of these syllables consist of initial + medial only is fixed.

The identification of Hangul search forms is described below.

(1) Identification of initial consonants

To identify which initial consonant each of the Hangul consonants (CON_START to CON_END) is, an array Initial_Consonant is provided. It has CON_SIZE entries. When the i-th consonant is an initial consonant, the i-th entry gives its position among the initial consonants; otherwise the entry is -1. The array Initial_Consonant is expressed as follows.

static const int Initial_Consonant[] =
{
}

In the array Initial_Consonant, entry 0 indicates that consonant 0, 'ㄱ', is initial consonant 0; entry 10 indicates that consonant 10, 'ㄻ', is not an initial consonant; and entry 21 indicates that consonant 21, 'ㅆ', is initial consonant 10. Therefore, when a two-byte Hangul character X in the string form of a LIKE operation lies between CON_START and CON_END, the range of the Hangul consonants, Initial_Consonant[X - CON_START] gives its position among the initial consonants. If Initial_Consonant[X - CON_START] is -1, X is a consonant used only as a final and does not form a Hangul search form.
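A table consistent with the three entries described above (0 maps to 0, 10 maps to -1, 21 maps to 10) can be sketched as follows. The contents are an assumption, derived from the usual order of the 30 compatibility consonants (ㄱ ㄲ ㄳ ㄴ ㄵ ㄶ ㄷ ㄸ ㄹ ㄺ ㄻ ㄼ ㄽ ㄾ ㄿ ㅀ ㅁ ㅂ ㅃ ㅄ ㅅ ㅆ ㅇ ㅈ ㅉ ㅊ ㅋ ㅌ ㅍ ㅎ), not a copy of the patent's array. The value 0xA4A1 for CON_START is the EUC-KR code of 'ㄱ' and is also assumed, as is the helper name initial_consonant_index.

```c
#define CON_SIZE  30
#define CON_START 0xA4A1u                      /* assumed: EUC-KR code of 'ㄱ' */
#define CON_END   (CON_START + CON_SIZE - 1u)  /* 'ㅎ' */

/* Position among the 19 initial consonants, or -1 for final-only consonants. */
static const int Initial_Consonant[CON_SIZE] = {
     0,  1, -1,  2, -1, -1,  3,  4,  5, -1,   /* ㄱ ㄲ ㄳ ㄴ ㄵ ㄶ ㄷ ㄸ ㄹ ㄺ */
    -1, -1, -1, -1, -1, -1,  6,  7,  8, -1,   /* ㄻ ㄼ ㄽ ㄾ ㄿ ㅀ ㅁ ㅂ ㅃ ㅄ */
     9, 10, 11, 12, 13, 14, 15, 16, 17, 18    /* ㅅ ㅆ ㅇ ㅈ ㅉ ㅊ ㅋ ㅌ ㅍ ㅎ */
};

/* Returns the initial consonant index of the two-byte code x, or -1 when
 * x is not a consonant or is a consonant used only as a final. */
int initial_consonant_index(unsigned int x) {
    if (x < CON_START || x > CON_END) return -1;
    return Initial_Consonant[x - CON_START];
}
```

A return value of -1 means that the character does not form a Hangul search form, so the caller would fall back to an ordinary comparison.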

(2) Identification of vowels

All Hangul vowels are used as medials and their order is fixed lexicographically. Therefore, when a two-byte Hangul character x in the string form of a LIKE operation lies between VOWEL_START and VOWEL_END, the range of the Hangul vowels, the character is a Hangul vowel and x - VOWEL_START gives its position among the vowels.

(3) Identification of initial + medial syllables

When a two-byte Hangul character in the string form of a LIKE operation has a value between '가' (HANGUL_START) and '힝' (HANGUL_END), it represents one Hangul syllable, consisting either of initial + medial or of initial + medial + final. The array ICV is provided to identify the initial + medial syllables. The array below is built from the characters of the precomposed (Wansung) Hangul set; for an initial + medial syllable that the set does not contain, the entry repeats the syllable with the smallest value among those greater than or equal to it. The array ICV is expressed as follows.

static const unsigned short ICV[] =
{
};

The array ICV has one row per initial consonant and one column per vowel (IC_SIZE * VOWEL_SIZE), for 399 (ICV_SIZE) entries. Its construction relies on the fact that the dictionary order of Hangul syllables follows the order <initial consonant, vowel, final consonant>: syllables are ordered by initial consonant, then, for the same initial consonant, by medial vowel, and then, for the same initial consonant and vowel, by final consonant.

As a result of this construction, the array ICV has the following properties.

a. Every initial + medial Hangul syllable is contained in the array ICV.

b. If a syllable is looked up in the array ICV and not found, it is an initial + medial + final syllable.

The comparison of Hangul search forms with Hangul syllables is described below.

(1) Syllable comparison for an initial consonant

Suppose a given keyword X lies between CON_START and CON_END, and let Initial_Consonant[X - CON_START] = i. If the range of any syllable S whose initial is the i-th initial consonant is written LB_Inclusive ≤ S < UB_Exclusive, then LB_Inclusive is ICV[i*VOWEL_SIZE] and UB_Exclusive is ICV[i*VOWEL_SIZE+VOWEL_SIZE]. To decide whether a syllable matches an initial-consonant Hangul search form, it therefore suffices to compute LB_Inclusive and UB_Exclusive for that form and test whether the syllable lies in the range.
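The range test above can be sketched in C as follows. Because the contents of the KSX 1001 array ICV are not reproduced in this text, the sketch substitutes a stand-in table built from Unicode code points (entry m is 0xAC00 + 28*m), which is an assumption. It also appends one sentinel entry at index ICV_SIZE, since for the last initial consonant (i = 18) the upper bound ICV[i*VOWEL_SIZE+VOWEL_SIZE] would otherwise fall outside a 399-entry array; how the original handles that case is not stated. The names icv and matches_initial are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum { IC_SIZE = 19, VOWEL_SIZE = 21, ICV_SIZE = IC_SIZE * VOWEL_SIZE };

/* Stand-in for the array ICV: entry m is the initial + medial syllable with
 * index m in Unicode. Entry ICV_SIZE is a sentinel one past the last syllable. */
static const uint32_t *icv(void) {
    static uint32_t table[ICV_SIZE + 1];
    if (table[0] == 0) {                       /* fill on first use */
        for (int m = 0; m <= ICV_SIZE; ++m)
            table[m] = 0xAC00u + 28u * (uint32_t)m;
    }
    return table;
}

/* True when syllable s has the i-th initial consonant, i.e.
 * LB_Inclusive <= s < UB_Exclusive. */
bool matches_initial(int i, uint32_t s) {
    if (i < 0 || i >= IC_SIZE) return false;
    const uint32_t *t = icv();
    uint32_t lb_inclusive = t[i * VOWEL_SIZE];
    uint32_t ub_exclusive = t[i * VOWEL_SIZE + VOWEL_SIZE];
    return lb_inclusive <= s && s < ub_exclusive;
}
```

With i = 0 ('ㄱ') the range is ['가', '까'), which agrees with the '/ㄱ' example given for Table 1.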

(2) Syllable comparison for a vowel

Suppose a given keyword X lies between VOWEL_START and VOWEL_END, and let X - VOWEL_START = i. The exact range of any syllable S whose medial is the i-th vowel is the union of the range of syllables made of the initial consonant 'ㄱ' + the i-th vowel, the range for 'ㄴ' + the i-th vowel, …, and the range for 'ㅎ' + the i-th vowel. These ranges are computed and stored, and a syllable is tested for membership in any of them. The range of syllables made of an initial consonant + the i-th vowel is obtained by the initial + medial syllable comparison.

Expressed with initial + medial Hangul search forms, the exact range of any syllable S whose medial is the vowel with medial vowel index j is /(the syllable made of 'ㄱ' and vowel j) U /(the syllable made of 'ㄴ' and vowel j) U … U /(the syllable made of 'ㅎ' and vowel j). Depending on the algorithm used to decide whether the vowel of the syllable being compared is vowel j, there are two ways to compare a vowel Hangul search form with a Hangul syllable.

The first method finds the ICV index m of the syllable being compared and takes m % VOWEL_SIZE as the medial vowel index of that syllable. For example, when the syllable is '쁑', entries 182, 183, 184, and 185 of the array ICV hold the largest value that is less than or equal to '쁑'; the largest of these indexes is 185, and 185 % VOWEL_SIZE = 17 is the medial vowel index of '쁑'.
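The first method amounts to an upper-bound binary search, sketched below in C. The stand-in table (entry m is 0xAC00 + 28*m, from Unicode) is an assumption, because the KSX 1001 contents of ICV are not reproduced in this text; unlike the original array it has no repeated entries. The search is nevertheless written so that, among equal entries, it returns the largest index, which is what the '쁑' example relies on. Function names are hypothetical.

```c
#include <stdint.h>

enum { IC_SIZE = 19, VOWEL_SIZE = 21, ICV_SIZE = IC_SIZE * VOWEL_SIZE };

/* Stand-in for the array ICV (assumed Unicode values, filled on first use). */
static const uint32_t *icv(void) {
    static uint32_t table[ICV_SIZE];
    if (table[0] == 0) {
        for (int m = 0; m < ICV_SIZE; ++m)
            table[m] = 0xAC00u + 28u * (uint32_t)m;
    }
    return table;
}

/* Largest index m with ICV[m] <= s, or -1 when s is below every entry.
 * Ties between equal entries resolve to the largest index. */
int icv_floor_index(uint32_t s) {
    const uint32_t *t = icv();
    int lo = 0, hi = ICV_SIZE;                 /* invariant: answer + 1 is in [lo, hi] */
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (t[mid] <= s) lo = mid + 1;
        else             hi = mid;
    }
    return lo - 1;
}

/* Medial vowel index of syllable s, or -1 when s is not a Hangul syllable. */
int vowel_index_of(uint32_t s) {
    if (s < 0xAC00u || s > 0xD7A3u) return -1;
    return icv_floor_index(s) % VOWEL_SIZE;
}
```

For '쁑' (U+C051) the floor index is 185 and the medial vowel index is 185 % 21 = 17, matching the worked example.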

The second method guesses that the medial vowel index of the syllable being compared equals that of the Hangul search form, finds the initial consonant index of the syllable, thereby guesses its ICV index, and then verifies the guess. It proceeds as follows.

a. Among the IC_SIZE entries of the j-th column of the array ICV_2D, that is, ICV_2D[0][j], ICV_2D[1][j], …, ICV_2D[IC_SIZE-1][j], the entry holding the largest value that is less than or equal to S is found by binary search. Let m be the index of that entry in the array ICV.

b. If entry m is the last entry of the array ICV, or if S < ICV[m+1], the medial vowel index of S is j; otherwise it is not j.

For example, to check the syllable '쁑' against vowel index 17 of the vowel 'ㅠ', a binary search over ICV_2D[0][17], ICV_2D[1][17], …, ICV_2D[IC_SIZE-1][17] finds that the entry holding the largest value less than or equal to '쁑' is ICV_2D[8][17], whose value is '쀼'. Its index in the array ICV is 185, and since '쁑' < ICV[186], the syllable '쁑' has the same vowel as 'ㅠ'.
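Steps a and b can be sketched in C as below. As before, the table is a stand-in filled with Unicode values (entry m is 0xAC00 + 28*m), which is an assumption, and ICV_2D[i][j] is read as ICV[i*VOWEL_SIZE + j]. The explicit range check on s and the function name syllable_has_vowel are additions of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

enum { IC_SIZE = 19, VOWEL_SIZE = 21, ICV_SIZE = IC_SIZE * VOWEL_SIZE };

/* Stand-in for the array ICV (assumed Unicode values, filled on first use). */
static const uint32_t *icv(void) {
    static uint32_t table[ICV_SIZE];
    if (table[0] == 0) {
        for (int m = 0; m < ICV_SIZE; ++m)
            table[m] = 0xAC00u + 28u * (uint32_t)m;
    }
    return table;
}

/* Guess-and-verify test: does syllable s have medial vowel index j? */
bool syllable_has_vowel(uint32_t s, int j) {
    if (j < 0 || j >= VOWEL_SIZE) return false;
    if (s < 0xAC00u || s > 0xD7A3u) return false;   /* not a Hangul syllable */
    const uint32_t *t = icv();

    /* Step a: binary search down column j for the last row whose entry <= s. */
    int lo = 0, hi = IC_SIZE;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (t[mid * VOWEL_SIZE + j] <= s) lo = mid + 1;
        else                              hi = mid;
    }
    if (lo == 0) return false;             /* s is below every entry in column j */

    /* Step b: verify that s lies before the next ICV entry. */
    int m = (lo - 1) * VOWEL_SIZE + j;
    return m == ICV_SIZE - 1 || s < t[m + 1];
}
```

The column search touches at most about five entries (log2 of 19), against about nine for a search over all 399 entries, which is presumably the motivation for the second method.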

(3) Comparison of initial + medial syllables

When a given keyword X lies between HANGUL_START and HANGUL_END, the values of array ICV are searched by binary search, and i denotes the position of the largest value less than or equal to X. The range that a syllable S containing keyword X can take is then from ICV[i] to ICV[i+1]. Therefore, if the value of a particular syllable lies between ICV[i] and ICV[i+1], that syllable is included in the range form of keyword X.

Identification of the Hangul search form and comparison with Hangul syllables in Unicode

For the case where the precomposed Hangul (wansung) of Unicode is used, this section presents, in turn, a method of identifying each type of Hangul search form in a string form and an algorithm for matching Hangul syllables against each type of Hangul search form.

(1) Characteristics of Hangul in Unicode

In Unicode, the consonants, vowels, and syllables of Hangul have the following characteristics.

a. Unicode places the 11,172 precomposed Hangul syllables, which cover every combination of 19 initial consonants, 21 medial vowels, and 27 final consonants, in modern Korean dictionary order in the area from UNI_HANGUL_START (0xAC00) to UNI_HANGUL_END (0xD7AF). For example, '가', the smallest Hangul syllable, is placed at 0xAC00, and '힣', the largest, at 0xD7A3. KSX 1001 supports only some of the combinations of initial consonant, medial vowel, and final consonant, whereas Unicode supports all of them. The array ICV was therefore needed for KSX 1001, but no similar array is needed for Unicode.

b. In the Hangul Jamo of Unicode, the 19 initial consonants of modern Hangul are placed in the area from UNI_IC_START (0x1100) to UNI_IC_END (0x1112), and in the Hangul Compatibility Jamo the consonants of modern Hangul are placed in the area from UNI_CONSONANT_START (0x3131) to UNI_CONSONANT_END (0x314E).

c. In the Hangul Jamo of Unicode, the 21 vowels of modern Hangul are placed in the area from MV_START (0x1161) to MV_END (0x1175), and in the Hangul Compatibility Jamo the vowels of modern Hangul are placed in the area from UNI_VOWEL_START (0x314F) to UNI_VOWEL_END (0x3163).

In Unicode, each of the 19 initial consonants from 'ㄱ' to 'ㅎ' has, for each of the 21 vowels from 'ㅏ' to 'ㅣ', one syllable without a final consonant and one syllable for each of the 27 final consonants, 28 syllables in all. Each initial consonant therefore has 21 * 28, that is 588, Hangul syllables that have it as their initial. These facts allow the following two considerations.

Consideration 1. In Unicode, let ic_idx be the initial consonant index of the initial of a Hangul syllable S, and let vowel_idx be the medial vowel index of its medial vowel. Then ic_idx is (S - '가') / (VOWEL_SIZE * (FINAL_CONSONANT_SIZE + 1)) and vowel_idx is ((S - '가') / (FINAL_CONSONANT_SIZE + 1)) % VOWEL_SIZE, using integer division. That is, ic_idx = (S - '가') / 588 and vowel_idx = ((S - '가') / 28) % 21.
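
As a minimal sketch of Consideration 1, the following C functions compute both indices by integer division. The function names and the macros UNI_HANGUL_FIRST and UNI_HANGUL_LAST are illustrative assumptions that do not appear in the specification; code points are written in hexadecimal to avoid depending on the source file encoding.

```c
#include <assert.h>

#define UNI_HANGUL_FIRST     0xAC00u  /* '가', first precomposed syllable */
#define UNI_HANGUL_LAST      0xD7A3u  /* '힣', last precomposed syllable  */
#define VOWEL_SIZE           21u
#define FINAL_CONSONANT_SIZE 27u

/* Initial consonant index (0..18) of the precomposed syllable s. */
int ic_index(unsigned s)
{
    assert(s >= UNI_HANGUL_FIRST && s <= UNI_HANGUL_LAST);
    return (int)((s - UNI_HANGUL_FIRST) /
                 (VOWEL_SIZE * (FINAL_CONSONANT_SIZE + 1u)));
}

/* Medial vowel index (0..20) of the precomposed syllable s. */
int vowel_index(unsigned s)
{
    assert(s >= UNI_HANGUL_FIRST && s <= UNI_HANGUL_LAST);
    return (int)(((s - UNI_HANGUL_FIRST) / (FINAL_CONSONANT_SIZE + 1u)) %
                 VOWEL_SIZE);
}
```

For '쁑' (U+C051) these return 8 and 17, the indices of 'ㅃ' and 'ㅠ'.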

Consideration 2. In Unicode, let ic_idx be the initial consonant index of the initial of a Hangul syllable S, and let vowel_idx be the medial vowel index of its medial vowel. If the value of (S - '가') equals ((ic_idx * 588) + (vowel_idx * 28)), syllable S consists only of an initial and a medial; otherwise it consists of an initial, a medial, and a final.
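
The test in Consideration 2 might be written in C as in the sketch below. The function name has_no_final is hypothetical, and the constants 0xAC00 ('가') and 0xD7A3 ('힣') are the Unicode bounds of the precomposed syllables.

```c
#include <assert.h>

/* Returns 1 if precomposed syllable s consists only of an initial and a
 * medial, and 0 if it also has a final consonant. Hypothetical helper. */
int has_no_final(unsigned s)
{
    assert(s >= 0xAC00u && s <= 0xD7A3u);

    unsigned off = s - 0xAC00u;              /* S - '가' */
    unsigned ic_idx = off / 588u;            /* Consideration 1 */
    unsigned vowel_idx = (off / 28u) % 21u;  /* Consideration 1 */

    /* Consideration 2: equality holds exactly when no final is present. */
    return off == ic_idx * 588u + vowel_idx * 28u;
}
```

Because the offset decomposes as ic_idx * 588 + vowel_idx * 28 + (final index), the same result could be obtained from off % 28 == 0; the longer form is kept here to mirror the text.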

(2) Identification of the Hangul search form for an initial consonant and comparison with Hangul syllables

It is difficult for a two-set (dubeolsik) keyboard to support initial consonants and final consonants as distinct inputs. If they are entered distinctly, however, checking the code value of the entered character is enough to tell an initial consonant from a final consonant. The method proposed in this section for identifying an initial consonant in the Hangul search form is therefore carried out as follows, so that it applies both to the Hangul Jamo of the Unicode code system, which defines initial and final consonants separately, and to the Hangul Compatibility Jamo of the Unicode code system.

In the string form of a like operation, when a Hangul character x consisting of two bytes

a. has a value between UNI_IC_START and UNI_IC_END, x - UNI_IC_START is the initial consonant index of that initial consonant, and

b. has a value between UNI_CONSONANT_START and UNI_CONSONANT_END, the array IC_Index used for KSX 1001 is reused as is: x is an initial consonant only if the value of IC_Index[x - UNI_CONSONANT_START] is not -1, and that value is its initial consonant index.
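
Cases a and b can be illustrated with the C sketch below. It assumes the Unicode values 0x1100 to 0x1112 for the Hangul Jamo initial consonants and fills IC_Index with the 30-entry table that the claims give as Initial_Consonant; the function name initial_consonant_index is hypothetical.

```c
#include <assert.h>

#define UNI_IC_START        0x1100u  /* Hangul Jamo initial 'ㄱ' */
#define UNI_IC_END          0x1112u  /* Hangul Jamo initial 'ㅎ' */
#define UNI_CONSONANT_START 0x3131u  /* Compatibility Jamo 'ㄱ'  */
#define UNI_CONSONANT_END   0x314Eu  /* Compatibility Jamo 'ㅎ'  */

/* Initial consonant index for each compatibility consonant, -1 for
 * consonants that are used only as finals (for example 'ㄳ'). */
static const int IC_Index[] = {
     0,  1, -1,  2, -1, -1,  3,  4,  5, -1,
    -1, -1, -1, -1, -1, -1,  6,  7,  8, -1,
     9, 10, 11, 12, 13, 14, 15, 16, 17, 18
};

/* Returns the initial consonant index (0..18) of x, or -1 if x is not
 * an initial consonant in either block. Hypothetical helper. */
int initial_consonant_index(unsigned x)
{
    assert(sizeof IC_Index / sizeof IC_Index[0] ==
           UNI_CONSONANT_END - UNI_CONSONANT_START + 1u);

    if (x >= UNI_IC_START && x <= UNI_IC_END)               /* case a */
        return (int)(x - UNI_IC_START);
    if (x >= UNI_CONSONANT_START && x <= UNI_CONSONANT_END) /* case b */
        return IC_Index[x - UNI_CONSONANT_START];
    return -1;
}
```

Under these assumptions, the compatibility letter 'ㅁ' (U+3141) and the Hangul Jamo initial at U+1106 both map to index 6, while 'ㄳ' (U+3133) is rejected.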

When the Hangul search form denotes the initial consonant whose initial consonant index is ic_idx, the Hangul syllables that satisfy that type can be identified in the following two ways.

The first method, with S1 the syllable to be compared, follows Consideration 1: it compares (S1 - '가') / 588 with ic_idx and reports a match only if they are equal. The second method lets R be the range of values that any syllable S having the initial consonant with index ic_idx as its initial can take, and lets m be ('가' + ic_idx * 588). It sets the matching range m ≤ R < (m + 588) and checks whether the syllable to be compared falls within that range.

For example, syllables whose initial is 'ㅁ', the initial consonant with index 6, are greater than or equal to '마', which is '가' + 6 * 588, and less than '바', which is '가' + 7 * 588.
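
The second (range) method could look like the following C sketch. The name matches_initial is an assumption made for illustration; the constants 0xAC00 and 588 are taken from the text.

```c
#include <assert.h>

/* Returns 1 if code value s lies in the 588-syllable range of the initial
 * consonant with index ic_idx, that is m <= s < m + 588. Hypothetical helper. */
int matches_initial(unsigned s, int ic_idx)
{
    assert(ic_idx >= 0 && ic_idx < 19);

    unsigned m = 0xAC00u + (unsigned)ic_idx * 588u;  /* '가' + ic_idx * 588 */
    return s >= m && s < m + 588u;
}
```

For ic_idx 6 the range runs from '마' (U+B9C8) up to, but not including, '바' (U+BC14), as in the example above. A range check of this kind needs no division per compared syllable, which may be why the text offers it as an alternative to the first method.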

(3) Identification of the Hangul search form for a syllable consisting only of an initial and a medial, and comparison with Hangul syllables

In the string form of a like operation, when a Hangul character consisting of two bytes has a value between UNI_HANGUL_START and UNI_HANGUL_END, the character represents one Hangul syllable, and that syllable consists either of an initial and a medial or of an initial, a medial, and a final. Whether an arbitrary syllable S consists only of an initial and a medial is determined as follows. By Considerations 1 and 2, if the value of (S - '가') equals ((S - '가') / 588) * 588 + (((S - '가') / 28) % 21) * 28, syllable S consists only of an initial and a medial; otherwise it consists of an initial, a medial, and a final.

When the Hangul search form denotes a syllable whose initial is the initial consonant with index ic_idx and whose medial is the vowel with medial vowel index vowel_idx, the Hangul syllables that satisfy that type can be identified in the following two ways.

The first method, with S1 the syllable to be compared, reports a match only if (S1 - '가') / 588 equals ic_idx and ((S1 - '가') / 28) % 21 equals vowel_idx. The second method lets R be the range of values that any syllable S1 having that initial and medial can take, and lets m be ('가' + ic_idx * 588 + vowel_idx * 28). It sets the matching range m ≤ R < (m + 28) and checks whether the syllable to be compared falls within that range. For example, when the Hangul search form is the syllable '조', the initial 'ㅈ' has initial consonant index 12 and the medial 'ㅗ' has medial vowel index 8, so every syllable having 'ㅈ' and 'ㅗ' as its initial and medial is greater than or equal to '가' + 12 * 588 + 8 * 28, that is '조', the syllable at offset 7,280, and less than '좌', the syllable at offset 7,280 + 28, that is 7,308.
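
A minimal C sketch of the second method for an initial + medial form follows. The function name matches_initial_medial is hypothetical; the arithmetic is the one stated in the text.

```c
#include <assert.h>

/* Returns 1 if code value s lies in the 28-syllable range
 * m <= s < m + 28 for the given initial and medial. Hypothetical helper. */
int matches_initial_medial(unsigned s, int ic_idx, int vowel_idx)
{
    assert(ic_idx >= 0 && ic_idx < 19);
    assert(vowel_idx >= 0 && vowel_idx < 21);

    /* m = '가' + ic_idx * 588 + vowel_idx * 28 */
    unsigned m = 0xAC00u + (unsigned)ic_idx * 588u + (unsigned)vowel_idx * 28u;
    return s >= m && s < m + 28u;
}
```

With ic_idx 12 and vowel_idx 8 the range starts at '조' (U+C870) and ends just before '좌' (U+C88C), matching the worked example.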

(4) Identification of the Hangul search form for a vowel and comparison with Hangul syllables

Hangul vowels are all used as medials, and their order is fixed by the dictionary. Therefore, in the string form of a like operation, when a Hangul character x consisting of two bytes

a. has a value between MV_START and MV_END, the vowel range in the Hangul Jamo, the character is a Hangul vowel and x - MV_START is its medial vowel index, and

b. has a value between UNI_VOWEL_START and UNI_VOWEL_END, the vowel range in the Hangul Compatibility Jamo, the character is a Hangul vowel and x - UNI_VOWEL_START is its medial vowel index.

When the Hangul search form denotes the vowel whose medial vowel index is vowel_idx, a syllable S1 to be compared matches only if its medial vowel index, ((S1 - '가') / 28) % 21, equals vowel_idx.
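
The vowel identification and the vowel match can be sketched in C as below. The two function names are assumptions for illustration; the range constants are those given in the text.

```c
#include <assert.h>

#define MV_START        0x1161u  /* Hangul Jamo medial 'ㅏ'  */
#define MV_END          0x1175u  /* Hangul Jamo medial 'ㅣ'  */
#define UNI_VOWEL_START 0x314Fu  /* Compatibility Jamo 'ㅏ' */
#define UNI_VOWEL_END   0x3163u  /* Compatibility Jamo 'ㅣ' */
#define VOWEL_SIZE      21

/* Returns the medial vowel index (0..20) of x, or -1 if x is not a vowel. */
int medial_vowel_index(unsigned x)
{
    if (x >= MV_START && x <= MV_END)                 /* case a */
        return (int)(x - MV_START);
    if (x >= UNI_VOWEL_START && x <= UNI_VOWEL_END)   /* case b */
        return (int)(x - UNI_VOWEL_START);
    return -1;
}

/* Returns 1 if s1 is a precomposed syllable whose medial vowel index
 * equals vowel_idx, else 0. */
int matches_vowel(unsigned s1, int vowel_idx)
{
    assert(vowel_idx >= 0 && vowel_idx < VOWEL_SIZE);

    if (s1 < 0xAC00u || s1 > 0xD7A3u)
        return 0;                                     /* not a Hangul syllable */
    return (int)(((s1 - 0xAC00u) / 28u) % 21u) == vowel_idx;
}
```

Unlike the initial consonant and initial + medial cases, the syllables sharing one vowel do not form a single contiguous code range, so this sketch compares indices instead of testing a range.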

The checking methods described above for a Hangul initial consonant, a Hangul syllable consisting of initial + medial, and a medial vowel are summarized in Table 2. In Table 2, P = form (pattern), F = flag, and V = code value.

Table 2. Methods of matching Hangul syllables against the Hangul search form

Hereinafter, the matters required to carry out the present invention and the procedure for carrying it out are described in more detail with reference to the accompanying drawings.

Normalization of the input form and generation of the form flag are described below with reference to FIG. 1.

FIG. 1 is an explanatory diagram for explaining the normalization process of an input form.

Here, the search keyword of a like query may include the special characters '%' and '_' and may have a searcher indicator.

The input form (50) of FIG. 1 means: search for character strings in which '홍' is followed by arbitrary characters, then by a Hangul syllable whose initial consonant is 'ㄱ', then by a Hangul syllable whose medial is 'ㅗ', then by 'a', then by any one character, and then by 'c'. Here, '%' indicates that one or more arbitrary characters may appear, '/' is the searcher indicator, and '_' indicates that any single character may appear.

In the Hangul search form, besides the cases '/ㄱ', 'ㄱ/', and '/ㄱ/', the searcher may consist of more than one character, as in '/ㄱ가ㅏ/'.

In the input form (50) of FIG. 1, the searcher indicator (52) together with the Hangul character immediately following it is called the Hangul search form. Normalization removes the searcher indicator (52) from the input form (50) to produce the normalized form (50a). During normalization, a form flag (pattern flag, 60) is generated that tells the type of character (initial, medial, or initial + medial) corresponding to the Hangul range form (54). The form flag (60) generated by normalization reveals the Hangul range of the keyword currently being searched for.

The Hangul range form of the present invention means an initial consonant (a consonant used only as the initial of a Hangul syllable), a Hangul syllable consisting of an initial and a medial, or a Hangul vowel, appearing after a searcher indicator in a like operator or the like. These match, respectively, all Hangul syllables having that initial consonant as their initial, all Hangul syllables having that initial consonant and vowel as their initial and medial, and all Hangul syllables having that vowel as their medial.

FIG. 2 is a flowchart illustrating the flag setting process of an input form.

First, starting from the first field and proceeding in order, it is determined whether a searcher indicator is present (step 1), and the flag of each field is set to 0 until a searcher indicator (52) appears (step 2). After step 2, it is determined whether the field is the last field (step 3); if it is, the process ends, and otherwise the process returns to step 1 to determine whether the next field contains a searcher indicator (52).

If step 1 finds a searcher indicator (52), it is determined whether the next field is Hangul (step 4).

If step 4 finds that it is not Hangul, the process goes to step 2; if it is Hangul, it is determined whether the Hangul character is an initial consonant (step 5).

If step 5 finds an initial consonant, the flag corresponding to an initial consonant is set in the field and the process goes to step 3 (step 6).

If step 5 finds that it is not an initial consonant, it is determined whether it is a vowel (step 7).

If step 7 finds a vowel, the flag corresponding to a vowel is set and the process goes to step 3 (step 8).

If step 7 finds that it is not a vowel, it is determined whether it is an 'initial + medial' syllable (step 9).

If step 9 finds that it is not an 'initial + medial' syllable either, the process goes to step 2; if it is, the flag for an 'initial + medial' syllable is set in the field and the process goes to step 3 (step 10).

The above process is then repeated up to the last field, and ends at the last field.

The normalization rules for a Hangul range query, as they relate to setting the form flag described above, are as follows.

(1) An array zPatternFlag is provided whose size equals the number of bytes of the array zPattern that represents the normalized form. zPatternFlag[i] holds the identification value of the Hangul search form made up of zPattern[i] and zPattern[i+1]. The default value of each entry of zPatternFlag is NULL (0), and the value of zPatternFlag is also NULL (0) when the form is not Hangul.

(2) For a pair consisting of a searcher indicator and one Hangul character:

(2-2) The two bytes of the Hangul character are stored at the next storage position of the array zPattern. With i and i+1 the indices of that position,

when the Hangul character is a 'consonant used only as a final' (a member of 'the set of final consonants minus the set of initial consonants'), an 'initial + medial + final syllable', an 'initial consonant', an 'initial + medial syllable', or a 'vowel', the value NULL, NULL, INITIAL_CONSONANT(1), INITIAL_MEDIAL_SYLLABLE(2), or MEDIAL_VOWEL(3), respectively, is set in zPatternFlag[i], and the value NULL is set in zPatternFlag[i+1].
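
The normalization and flag rules can be sketched in C as follows. This is a simplified illustration, not the patented implementation: it works on 16-bit Unicode code units with one flag per character instead of the two-byte zPattern layout, it handles only a leading '/' indicator in front of a single searcher character, and it keeps an indicator that is not followed by a searcher as an ordinary character. The names classify, normalize, FLAG_NONE, and INDICATOR are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { FLAG_NONE = 0, INITIAL_CONSONANT = 1,
       INITIAL_MEDIAL_SYLLABLE = 2, MEDIAL_VOWEL = 3 };

#define INDICATOR 0x002Fu  /* '/', the searcher indicator of FIG. 1 */

/* Initial consonant index per compatibility consonant, -1 if final only. */
static const int IC_Index[30] = {
     0,  1, -1,  2, -1, -1,  3,  4,  5, -1,
    -1, -1, -1, -1, -1, -1,  6,  7,  8, -1,
     9, 10, 11, 12, 13, 14, 15, 16, 17, 18
};

/* Returns the flag value for character x used as a searcher. */
static int classify(unsigned short x)
{
    if (x >= 0x1100 && x <= 0x1112)
        return INITIAL_CONSONANT;
    if (x >= 0x3131 && x <= 0x314E)
        return IC_Index[x - 0x3131] >= 0 ? INITIAL_CONSONANT : FLAG_NONE;
    if ((x >= 0x1161 && x <= 0x1175) || (x >= 0x314F && x <= 0x3163))
        return MEDIAL_VOWEL;
    if (x >= 0xAC00 && x <= 0xD7A3 && (x - 0xAC00) % 28 == 0)
        return INITIAL_MEDIAL_SYLLABLE;   /* syllable without a final */
    return FLAG_NONE;                     /* final-only consonant, full syllable, non-Hangul */
}

/* Copies in[0..n) to out without searcher indicators and records one flag
 * per output character. out and flags must each hold at least n entries.
 * Returns the length of the normalized form. */
size_t normalize(const unsigned short *in, size_t n,
                 unsigned short *out, unsigned char *flags)
{
    assert(in != NULL && out != NULL && flags != NULL);

    size_t w = 0;
    for (size_t r = 0; r < n; r++) {
        int f = FLAG_NONE;
        if (in[r] == INDICATOR && r + 1 < n) {
            f = classify(in[r + 1]);
            if (f != FLAG_NONE)
                r++;                      /* drop the indicator, keep the searcher */
        }
        out[w] = in[r];
        flags[w] = (unsigned char)f;
        w++;
    }
    return w;
}
```

For an input resembling the form described for FIG. 1, '홍%/ㄱ/ㅗa_c', this sketch would yield the normalized form '홍%ㄱㅗa_c' with flags 0, 0, 1, 3, 0, 0, 0.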

After the normalization described above, the search program reads the first field of the normalized string form and of the form flag (step 21) and determines whether the form is Hangul and the form flag has a value other than "0" (step 22). If step 22 is not satisfied, the data of the field is compared with the data in the database in the same way as in an ordinary like search (step 23), and the ordinary search data is extracted (step 24).

It is then determined whether there is a form for the next field (step 24), the form of the next field is read (step 25), and step 22 is performed.

If step 22 is satisfied, the Hangul range of the Hangul search form is calculated in the manner described above (step 27), data matching the Hangul range is extracted by comparison with the data in the database (step 28), and step 25 is performed.

If step 25 is not satisfied, the extracted data is merged (step 29) and output as the search result (step 30), and the process ends.

The Hangul search method according to the present invention is carried out by the above process.

The algorithm for matching a comparison string against the normalized string form is further explained as follows.

Searching with a like query requires the normalized string form (pattern), the form flag (pattern flag) that tells the type of the form, the data to be compared, and the length of that data. If the form currently sought is not an ASCII value, the form flag of that form is consulted to obtain the Hangul range of the current form, and it is determined whether the data to be compared falls within that Hangul range.
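
How one character of the normalized form and its form flag might be compared with one character of the data is sketched below in C for the Unicode case. The function name field_matches and the one-flag-per-character layout are assumptions made for illustration; wildcard handling ('%' and '_') of the like operation is outside this sketch.

```c
#include <assert.h>

enum { FLAG_NONE = 0, INITIAL_CONSONANT = 1,
       INITIAL_MEDIAL_SYLLABLE = 2, MEDIAL_VOWEL = 3 };

/* Initial consonant index per compatibility consonant, -1 if final only. */
static const int IC_Index[30] = {
     0,  1, -1,  2, -1, -1,  3,  4,  5, -1,
    -1, -1, -1, -1, -1, -1,  6,  7,  8, -1,
     9, 10, 11, 12, 13, 14, 15, 16, 17, 18
};

static int is_syllable(unsigned s)
{
    return s >= 0xAC00u && s <= 0xD7A3u;
}

/* Returns 1 if data character s matches pattern character p under flag. */
int field_matches(unsigned p, int flag, unsigned s)
{
    assert(flag >= FLAG_NONE && flag <= MEDIAL_VOWEL);

    if (flag == FLAG_NONE)
        return p == s;                    /* ordinary character comparison */
    if (!is_syllable(s))
        return 0;                         /* Hangul forms match syllables only */

    unsigned off = s - 0xAC00u;
    switch (flag) {
    case INITIAL_CONSONANT: {
        int ic = (p >= 0x1100u && p <= 0x1112u) ? (int)(p - 0x1100u)
               : (p >= 0x3131u && p <= 0x314Eu) ? IC_Index[p - 0x3131u]
               : -1;
        return ic >= 0 && off / 588u == (unsigned)ic;
    }
    case INITIAL_MEDIAL_SYLLABLE:
        return s >= p && s < p + 28u;     /* p itself is the lower bound m */
    default: {                            /* MEDIAL_VOWEL */
        int v = (p >= 0x1161u && p <= 0x1175u) ? (int)(p - 0x1161u)
              : (p >= 0x314Fu && p <= 0x3163u) ? (int)(p - 0x314Fu)
              : -1;
        return v >= 0 && (off / 28u) % 21u == (unsigned)v;
    }
    }
}
```

Under these assumptions, the form 'ㄱ' with flag 1 matches '가' but not '마', the form '조' with flag 2 matches '좋' but not '좌', and the form 'ㅗ' with flag 3 matches '홍'.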

The present invention described above is implemented by an application program that is stored on a data storage medium and is loaded into the main memory and CPU of a database system or the like to perform the search function.

As can be seen from the above description, the present invention provides the groundbreaking effect of making it possible to search for Hangul text containing a Hangul initial consonant, a Hangul vowel, or a syllable of Hangul consonant + vowel.

If necessary, the present invention may implement only the search function for Hangul containing an initial consonant, only the search function for Hangul containing a Hangul vowel, or only the search function for Hangul containing a syllable of Hangul vowel + consonant, or it may be implemented in a form that includes two or more of these search functions.

The present invention, which uses the Hangul search form, will open new possibilities in the field of Hangul search.

Claims

A step of inputting a search word using a Hangul search form that includes a searcher representing at least one of a Hangul initial consonant, a syllable consisting of a Hangul initial consonant and a medial vowel, and a Hangul vowel, and a searcher indicator placed before, after, or both before and after the searcher to identify it as a searcher;

A step of calculating a code value range of Hangul characters that may include a character used as an element of the searcher included in the input search word;

A step of extracting, from a search target, Hangul syllables belonging to the calculated code value range; and

A Hangul search method using a Hangul search form, comprising a step of outputting a character string containing the extracted Hangul syllables as a search result.

The method of claim 1, wherein the searcher indicator is one or more reserved general characters or special characters.

The method of claim 1, wherein the searcher indicator is an escape character of SQL.

Inputting a search word including a searcher;

A code value range calculating step of calculating a code value range of Korean characters which may include a character used as an element of the searcher;

A Hangul syllable extraction step of extracting Hangul syllables belonging to the calculated code value range from a search target; And

Hangul search method using a range of Hangul code values including a search result output step of outputting a string containing the extracted Hangul syllables as a search result.

The Hangul search method using a range of Hangul code values of claim 4, wherein the code value range calculating step calculates the code value range of Hangul characters that may include a character used as an element of the searcher, the character being one of a Hangul initial consonant, a Hangul vowel, and a syllable of a Hangul initial consonant and a vowel.

The method of claim 5, wherein the character used as an element of the searcher is an initial consonant of precomposed (wansung) Hangul, and

The code value range calculating step comprises:

An initial consonant order value setting step of setting an order value for each initial consonant so that the search program, or a program linked to the search program, can identify an initial consonant entered as an element of the searcher among the Hangul consonants;

An input form setting step of setting the input form of the Hangul search form, which includes a searcher indicator indicating that an element of the searcher may be an initial consonant, so that an initial consonant can be used as the searcher;

A normalization step of normalizing the searcher entered in the input form into a string form from which the searcher indicator is excluded; and

A code value reading step of reading, by using the set order value of the initial consonant, the minimum code value and the maximum code value of the Hangul syllables that can have, as their initial, the initial consonant entered as an element of the searcher.

The Hangul search method using a range of Hangul code values of claim 6, wherein the Hangul syllable extraction step compares the read minimum code value and maximum code value with the code value of a syllable to be searched and extracts the Hangul syllables that are greater than the minimum code value and less than the maximum code value.

The Hangul search method using a range of Hangul code values of claim 6, wherein the normalization step includes generating a form flag that identifies whether the Hangul character of the Hangul search form following the searcher indicator is an initial consonant.

The Hangul search method using a range of Hangul code values of claim 6, wherein the step of setting the order value of each initial consonant takes 'ㄱ' as the minimum value CON_START among the Hangul consonants, increases by 1 for each subsequent consonant in dictionary order, takes 'ㅎ' as the maximum value CON_END, and sets the order value of each initial consonant by the following array, which marks the initial consonants among the Hangul consonants (CON_START to CON_END) so that an initial consonant entered in the searcher can be identified:

static const int Initial_Consonant[] =
{
    0, 1, -1, 2, -1, -1, 3, 4, 5, -1,       /* 'ㄱ', 'ㄲ', 'ㄳ', 'ㄴ', 'ㄵ', 'ㄶ', 'ㄷ', 'ㄸ', 'ㄹ', 'ㄺ' */
    -1, -1, -1, -1, -1, -1, 6, 7, 8, -1,    /* 'ㄻ', 'ㄼ', 'ㄽ', 'ㄾ', 'ㄿ', 'ㅀ', 'ㅁ', 'ㅂ', 'ㅃ', 'ㅄ' */
    9, 10, 11, 12, 13, 14, 15, 16, 17, 18   /* 'ㅅ', 'ㅆ', 'ㅇ', 'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ' */
};

The method of claim 9, wherein, in the code value range calculating step, when an initial consonant included in the searcher has a value X between CON_START and CON_END, Initial_Consonant[X - CON_START] = i, and the range between the minimum code value and the maximum code value that any syllable S having the i-th initial consonant as its initial can take is LB_Inclusive ≤ S < UB_Exclusive, LB_Inclusive is ICV[i * VOWEL_SIZE] and UB_Exclusive is ICV[i * VOWEL_SIZE + VOWEL_SIZE],

The VOWEL_SIZE is the number of Hangul vowels, and

The ICV is the array

static const unsigned short ICV [] =

{

'Ga', 'Dog', 'Ga', 'Her', 'Geo', 'Crab', 'Off', 'Gye', 'Go', 'Family', 'Tu', 'Ko', 'Gyo' ',' Gu ',' gou ',' ark ',' ear ',' gyu ',' he ',' 긔 ',' ki ',

'Ka', 'seam', '꺄', 'off', 'off', 'ke', 'cuddly', '꼐', 'ko', '꽈', 'pretty', 'chi', '꾜 ',' Ku ',' cuddle ',' sew ',' squeezed ',', ',' off ',' kitten ',' kitten ',

'I', 'my', 'nya', 'you', 'you', 'yes', 'woman', '녜', 'no', 'let', 'brain', 'brain', 'urine' ',' Nu ',' hand ',' 눼 ',' nu ',' new ',' ne ',' ni ',' ni ',

'Da', 'large', '댜', 'more', 'more', 'de', 'dee', '뎨', 'do', '돠', 'can', 'be', '됴 ',' Two ',' leave ',' 뒈 ',' back ',' du ',' de ',' 듸 ',' D ',

'Ta', 'time', 'float', 'float', 'float', 'flock', '뗘', 't', 't', '똬', '뙈', '뙤', 't ',' Do ',' 뛔 ',' 뛔 ',' Run ',' Tu ',' Tu ',' Off ',' Zi ',

'La', 'Ra', 'Lah', 'R', 'R', 'Le', 'Ryeo', 'Yes', 'Ro', '롸', '뢨', 'Rho', 'Ryo' ',' Lu ',' luo ',' 뤠 ',' lu ',' ryu ',' le ',' li ',' li ',

'Ma', 'Mae', '먀', 'Mur', 'Mur', 'Me', 'Come', '몌', 'Mo', '뫄', 'Moo', 'Moo', 'Tomb ',' Mu ',' what ',' 뭬 ',' 뮈 ',' mu ',' me ',' mi ',' mi ',

'Bar', 'boat', '뱌', 'burr', 'burr', 'bee', 'rice', '볘', 'bo', 'look', '봬', '뵈', '뵤 ',' Boo ',' 붜 ',' 붸 ',' v ',' view ',' b ',' rain ',' rain ',

'Pa', 'po', '뺘', 'good', 'good', 'pe', 'bone', 'po', 'po', '뾔', '뾔', '뾔', 'tip ',' Pu ',' 쀼 ',' 쀼 ',' 쀼 ',' 쀼 ',' pretty ',' beep ',' beep ',

'Sa', 'bird', 'sha', 'sha', 'seo', 'three', 'sher', 'she', 'cow', '솨', 'chain', 'iron', 'show ',' Number ',' 숴 ',' she ',' she ',' shu ',' su ',' shi ',' shi ',

'Che', 'sah', '썅', 'sur', 'sur', 'se', '쏀', '쏀', 'saw', 'shoot', 'wedge', 'dip', '쑈 ',' Xu ',' 쒀 ',' 쒜 ',' 쒸 ',' 쓩 ',' Tsu ',' Tu ',' Sea ',

'Ah', 'Ah', 'Hey', 'Hey', 'U', 'E', 'F', 'Yes', 'O', 'Wah', 'Why', 'Other', 'Yo' ',' U ',' wo ',' we ',' up ',' u ',' u ',' of ',' yi ',

'Ja', 'Ja', 'Ja', '쟤', 'Me', 'Je', 'Jer', '졔', 'Jo', 'Left', '좨', 'Sin', 'Joe' ',' Ju ',' give ',' 줴 ',' rat ',' ju ',' z ',' ji ',' ji ',

'Cha', 'th', '쨔', 'ze', 'ze', '쩨', '쪄', 'speck', 'squash', '쫘', '쫴', '' ',' 쭁 ' ',' Chuu ',' 쭤 ',' 쮜 ',' 쮜 ',' 쮸 ',' tsu ',' chi ',' chi ',

'Cha', 'chae', 'cha', 'cher', 'cher', 'che', 'cher', '쳬', 'second', '촤', 'choi', 'choi', 'cho ',' Chu ',' chu ',' pancreas', 'odor', 'chu', 'tsu', 'chi', 'chi',

'Ka', 'ca', 'ky', 'ker', 'ker', 'ke', 'turn', ',', 'nose', 'quan', 'joy', '쾨', 'kyo' ',' Cu ',' qua ',' qua ',' qui ',' cue ',' large ',' key ',' key ',

'Ta', 'tae', '탸', 'ter', 'terr', 'te', '텨', '톄', 'sat', '톼', '퇘', 'tung', '툐 ',' Two ',' 퉈 ',' 퉤 ',' throw ',' tu ',' t ',' 틔 ',' tee ',

'Par', 'l', '퍄', 'fur', 'fur', 'pe', 'unfold', 'lung', 'po', '퐈', '푀', '푀', 'table' ',' Fu ',' 풔 ',' Fu ',' Fu ',' Fu ',' F ',' Blood ',' Blood ',

'Ha', 'sun', '햐', 'huh', 'huh', 'he', 'tongue', 'hye', 'ho', 'hwa', '화', 'hoe', 'hyo ',' Hu ',' 훠 ',' fe ',' hui ',' hugh ',' he ',' hee ',' he '

} in which syllables used in precomposed (wansung) Hangul appear as they are, and each syllable not used in precomposed Hangul is replaced by a repetition of the syllable having the smallest value among the values greater than or equal to that syllable; a Hangul search method using a range of Hangul code values characterized by the above.

The Hangul search method using a range of Hangul code values of claim 5, wherein the character used as an element of the searcher is a Unicode Hangul initial consonant x, and

the code value range calculating step comprises:

if x lies between UNI_IC_START ('ㄱ') and UNI_IC_END ('ㅎ') of the Unicode initial consonants, setting x - UNI_IC_START as the initial consonant index of x;

if x lies between UNI_CONSONANT_START ('ㄱ') and UNI_CONSONANT_END ('ㅎ') of the Unicode consonants, judging x to be an initial consonant only if the value of the array entry IC_Index[x - UNI_CONSONANT_START] of KS X 1001 is not -1, and taking that value as the initial consonant index of x; and

a code value reading step of reading the minimum code value and the maximum code value of the Hangul syllables whose initial consonant corresponds to the initial consonant index.

The Hangul search method using a range of Hangul code values of claim 11, wherein, in the step of extracting the Hangul syllables, when R denotes the code value range of any syllable S whose initial consonant has the initial consonant index ic_idx, and m denotes ('가' + ic_idx * 588), the range m ≤ R < (m + 588) is set and the Hangul syllables belonging to this range are extracted.
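The index lookup and the 588-wide range described in the two claims above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the macro names come from the claim text, but their numeric values are the standard Unicode code points filled in here as an assumption, and the contents of IC_Index, initial_index and initial_range are hypothetical helpers written for this sketch.

```c
/* Values are standard Unicode code points; the claims only give the names. */
#define UNI_IC_START        0x1100u  /* Hangul Jamo initial consonant ㄱ */
#define UNI_IC_END          0x1112u  /* Hangul Jamo initial consonant ㅎ */
#define UNI_CONSONANT_START 0x3131u  /* Hangul Compatibility Jamo ㄱ */
#define UNI_CONSONANT_END   0x314Eu  /* Hangul Compatibility Jamo ㅎ */
#define UNI_HANGUL_START    0xAC00u  /* '가' */

/* Assumed table: maps each compatibility consonant (U+3131..U+314E) to its
   initial consonant index, or -1 when it can only be a final (e.g. ㄳ). */
static const int IC_Index[] = {
     0,  1, -1,  2, -1, -1,  3,  4,  5, -1,
    -1, -1, -1, -1, -1, -1,  6,  7,  8, -1,
     9, 10, 11, 12, 13, 14, 15, 16, 17, 18
};

/* Returns the initial consonant index of x, or -1 if x cannot be one. */
static int initial_index(unsigned int x)
{
    if (x >= UNI_IC_START && x <= UNI_IC_END)
        return (int)(x - UNI_IC_START);
    if (x >= UNI_CONSONANT_START && x <= UNI_CONSONANT_END)
        return IC_Index[x - UNI_CONSONANT_START];
    return -1;
}

/* Sets [*lo, *hi) to m <= R < m + 588 with m = '가' + ic_idx * 588. */
static void initial_range(int ic_idx, unsigned int *lo, unsigned int *hi)
{
    *lo = UNI_HANGUL_START + (unsigned int)ic_idx * 588u;
    *hi = *lo + 588u;
}
```

For the searcher 'ㅂ' (U+3142) the index is 7, so the range runs from '바' (U+BC14) up to, but not including, '빠' (U+BE60).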

The Hangul search method using a range of Hangul code values of claim 5, wherein the character used as an element of the searcher is a Hangul complete-type medial vowel, and

the code value range calculating step comprises:

a Hangul vowel code value setting step of setting the minimum code value and the maximum code value of the Hangul vowels so that the search program, or a program linked to the search program, can identify whether a character input as an element of the searcher is a Hangul vowel;

an input form setting step of setting the input form of the Hangul search form, including a searcher indicator which indicates that an element of the searcher may be a vowel, so that the input form can be used as the searcher;

a normalization step of normalizing the searcher entered in the input form into a string that excludes the searcher indicator;

a vowel order calculating step in which the search program, or a program linked to the search program, calculates the ordinal position of the vowel input as the searcher among the Hangul vowels; and

a step of reading the minimum and maximum code values of the Hangul syllables that can have, as their medial vowel, the vowel whose order was calculated in the vowel order calculating step.

The Hangul search method using a range of Hangul code values of claim 13, wherein the syllable extraction step compares the read minimum code value and maximum code value with the code value of a syllable to be searched and extracts the Hangul syllables whose code value is greater than the minimum code value and less than the maximum code value.

The Hangul search method using a range of Hangul code values of claim 13, wherein the normalization step generates a form flag for identifying whether the Hangul character of the Hangul search form that follows the searcher indicator is a vowel.

The Hangul search method using a range of Hangul code values of claim 13, wherein, in the vowel order calculating step, when the Hangul character X input as the searcher has a value between the minimum value VOWEL_START ('ㅏ') and the maximum value VOWEL_END ('ㅣ') of the Hangul vowels, the medial vowel index is calculated as X - VOWEL_START.

The Hangul search method using a range of Hangul code values of claim 16, wherein, with i equal to X - VOWEL_START, the code value range calculating step calculates the code value range that any Hangul syllable S having the vowel with medial vowel index i as its medial vowel can have, that is, the range of code values that syllables composed of an initial consonant plus the i-th vowel can have.

A Hangul search method comprising:

inputting a search word including a searcher;

if the medial vowel x used as an element of the searcher lies between MV_START ('ㅏ') and MV_END ('ㅣ'), the range of medial vowels in the Unicode Hangul Jamo block, determining x to be a Hangul vowel and taking x - MV_START as the medial vowel index vowel_idx of x, and, if x lies between UNI_VOWEL_START ('ㅏ') and UNI_VOWEL_END ('ㅣ'), the range of vowels in the Unicode Hangul Compatibility Jamo block, determining x to be a Hangul vowel and taking x - UNI_VOWEL_START as the medial vowel index vowel_idx;

a matching range setting step in which, when the range R of minimum and maximum code values that any syllable S having the vowel with index vowel_idx as its medial vowel can take is LB_Inclusive ≤ R < UB_Exclusive, LB_Inclusive is set to ('가' + vowel_idx * 28) and UB_Exclusive is set to ('가' + (INITIAL_CONSONANT_SIZE - 1) * 588 + (vowel_idx + 1) * 28); and

extracting the Hangul syllables belonging to the set matching range,

wherein INITIAL_CONSONANT_SIZE is 19, the number of consonants usable as an initial consonant.
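The vowel index and the LB_Inclusive/UB_Exclusive bounds of this claim can be sketched as below. The macro names follow the claim; their numeric values are standard Unicode code points supplied here as an assumption, and vowel_index and vowel_bounds are hypothetical helper names.

```c
/* Values are standard Unicode code points; the claim only gives the names. */
#define MV_START               0x1161u  /* Hangul Jamo medial ㅏ */
#define MV_END                 0x1175u  /* Hangul Jamo medial ㅣ */
#define UNI_VOWEL_START        0x314Fu  /* Hangul Compatibility Jamo ㅏ */
#define UNI_VOWEL_END          0x3163u  /* Hangul Compatibility Jamo ㅣ */
#define UNI_HANGUL_START       0xAC00u  /* '가' */
#define INITIAL_CONSONANT_SIZE 19u

/* Returns the medial vowel index (0..20) of x, or -1 if x is not a vowel. */
static int vowel_index(unsigned int x)
{
    if (x >= MV_START && x <= MV_END)
        return (int)(x - MV_START);
    if (x >= UNI_VOWEL_START && x <= UNI_VOWEL_END)
        return (int)(x - UNI_VOWEL_START);
    return -1;
}

/* Computes LB_Inclusive and UB_Exclusive exactly as written in the claim. */
static void vowel_bounds(int vowel_idx,
                         unsigned int *lb_inclusive,
                         unsigned int *ub_exclusive)
{
    unsigned int v = (unsigned int)vowel_idx;
    *lb_inclusive = UNI_HANGUL_START + v * 28u;
    *ub_exclusive = UNI_HANGUL_START
                  + (INITIAL_CONSONANT_SIZE - 1u) * 588u
                  + (v + 1u) * 28u;
}
```

For 'ㅏ' the bounds are '가' (U+AC00) and '해' (U+D574). Syllables with other medial vowels, such as '개' (U+AC1C), also fall between these bounds, so the range reads as an outer bound that narrows the candidates rather than an exact filter.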

The Hangul search method using a range of Hangul code values of claim 5, wherein the character used as an element of the searcher is a Hangul complete-type syllable composed of an initial consonant and a medial vowel, and

the code value range calculating step comprises:

a step of setting the order values of Hangul syllables composed of an initial consonant and a medial vowel, in which the order value of each such syllable is set so that the search program, or a program linked to the search program, can identify whether a Hangul syllable input as an element of the searcher is composed of an initial consonant and a medial vowel;

a syllable type checking step in which the search program, or a program linked to the search program, confirms whether the syllable is composed only of an initial consonant and a medial vowel;

an input form setting step of setting the input form of the Hangul search form, including a searcher indicator which indicates that an element of the searcher may be a syllable consisting only of an initial consonant and a medial vowel, so that the input form can be used as the searcher; and

a step of reading the minimum code value and the maximum code value of the Hangul syllables that can contain the initial-plus-medial syllable included in the searcher.

The Hangul search method using a range of Hangul code values of claim 19, wherein the syllable extraction step compares the read minimum code value and maximum code value with the code value of a syllable to be searched and extracts the Hangul syllables whose code value is greater than the minimum code value and less than the maximum code value.

The Hangul search method using a range of Hangul code values of claim 19, wherein the normalization step generates a form flag for identifying whether the Hangul character of the Hangul search form that follows the searcher indicator is a syllable composed of an initial consonant and a medial vowel.

The Hangul search method using a range of Hangul code values of claim 19, wherein the step of setting the code values of Hangul syllables composed of an initial consonant and a medial vowel has an array ICV for identifying that a two-byte Hangul character is a syllable composed of an initial consonant and a medial vowel, and

the array ICV,

static const unsigned short ICV[] =

{

} is built from the characters used in the Hangul complete type and, for each syllable not used in the Hangul complete type, repeats the syllable having the minimum value among the values greater than or equal to that syllable.

The Hangul search method using a range of Hangul code values of claim 22, wherein, when the code value of '가' is HANGUL_START and the code value of '힝' is HANGUL_END, the code value range calculating step looks up the code value X of the initial-plus-medial syllable included in the searcher in the array ICV by binary search and, when i is the array index of the maximum value among the entries less than or equal to X, calculates the range of code values that syllables containing X can have as ICV[i] ≤ X < ICV[i + 1].
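The binary search in this claim amounts to a "floor" lookup in a sorted table. The sketch below is an illustration under assumptions: icv_floor_index and icv_range are hypothetical names, the table is passed in rather than hard-coded because the ICV contents are not reproduced here, and the end_exclusive parameter for the last entry is an assumption (the claim does not say how the final range is closed).

```c
#include <stddef.h>

/* Returns the largest index i with icv[i] <= x, or -1 if x < icv[0].
   icv must be sorted in non-decreasing order; repeated entries are allowed. */
static long icv_floor_index(const unsigned short *icv, size_t n,
                            unsigned short x)
{
    size_t lo = 0, hi = n;              /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (icv[mid] <= x)
            lo = mid + 1;               /* answer is at mid or to its right */
        else
            hi = mid;                   /* answer is to the left of mid */
    }
    return (long)lo - 1;                /* lo is the first index with icv > x */
}

/* Sets [*lo, *hi) to [icv[i], icv[i + 1]) for the floor index i of x.
   Returns 0 when x lies below the table. For the last entry there is no
   icv[i + 1], so the caller supplies the exclusive bound (assumed). */
static int icv_range(const unsigned short *icv, size_t n, unsigned short x,
                     unsigned short end_exclusive,
                     unsigned short *lo, unsigned short *hi)
{
    long i = icv_floor_index(icv, n, x);
    if (i < 0)
        return 0;
    *lo = icv[i];
    *hi = ((size_t)i + 1 < n) ? icv[i + 1] : end_exclusive;
    return 1;
}
```

Because the search returns the highest index among equal entries, a repeated entry (a slot filled in for a syllable missing from the complete type) never produces an empty range.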

The Hangul search method using a range of Hangul code values of claim 5, wherein the character used as an element of the searcher is a Hangul syllable composed of a Unicode initial consonant and medial vowel, and

the code value range calculating step comprises:

judging any single syllable S to be a syllable composed only of an initial consonant and a medial vowel if S lies between the Unicode minimum code value UNI_HANGUL_START ('가') and maximum code value UNI_HANGUL_END ('힣') and the value of (S - '가') equals (((S - '가') / 588) * 588) + ((((S - '가') / 28) % 21) * 28); and

reading the initial consonant index value and the medial vowel index value of the syllable S.

The Hangul search method using a range of Hangul code values of claim 24, wherein the step of extracting the Hangul syllables comprises:

when R is the code value range of a syllable S having an initial consonant with initial consonant index ic_idx and a medial vowel with medial vowel index vowel_idx, and m is ('가' + ic_idx * 588 + vowel_idx * 28), setting the matching range m ≤ R < (m + 28) and extracting the Hangul syllables belonging to this matching range, wherein the initial consonant index ic_idx is (S - '가') / (VOWEL_SIZE * (FINAL_CONSONANT_SIZE + 1)), that is, (S - '가') / 588, and the medial vowel index vowel_idx is ((S - '가') / (FINAL_CONSONANT_SIZE + 1)) % VOWEL_SIZE, that is, ((S - '가') / 28) % 21.
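The arithmetic in the two claims above can be checked with a short sketch. The constant names follow the claim text; the numeric code points are the standard Unicode values, supplied here as an assumption, and the function names are hypothetical.

```c
#define UNI_HANGUL_START     0xAC00u  /* '가' */
#define UNI_HANGUL_END       0xD7A3u  /* '힣' */
#define VOWEL_SIZE           21u
#define FINAL_CONSONANT_SIZE 27u

/* Initial consonant index: (S - '가') / (21 * 28) = (S - '가') / 588. */
static unsigned int ic_index(unsigned int s)
{
    return (s - UNI_HANGUL_START) / (VOWEL_SIZE * (FINAL_CONSONANT_SIZE + 1u));
}

/* Medial vowel index: ((S - '가') / 28) % 21. */
static unsigned int medial_index(unsigned int s)
{
    return ((s - UNI_HANGUL_START) / (FINAL_CONSONANT_SIZE + 1u)) % VOWEL_SIZE;
}

/* True when S is a precomposed syllable with no final consonant, using the
   equality test stated in the claim. */
static int is_initial_medial_only(unsigned int s)
{
    unsigned int off;
    if (s < UNI_HANGUL_START || s > UNI_HANGUL_END)
        return 0;
    off = s - UNI_HANGUL_START;
    return off == (off / 588u) * 588u + ((off / 28u) % 21u) * 28u;
}

/* Sets [*lo, *hi) to m <= R < m + 28 for the initial and medial of S. */
static void initial_medial_range(unsigned int s,
                                 unsigned int *lo, unsigned int *hi)
{
    *lo = UNI_HANGUL_START + ic_index(s) * 588u + medial_index(s) * 28u;
    *hi = *lo + 28u;
}
```

With S = '바' (U+BC14) the indices are 7 and 0, and the matching range covers '바' through the last syllable before '배' (U+BC30), which is every syllable formed by adding a final consonant to '바'.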

A Hangul search method using a range of Hangul code values, comprising:

an initial consonant order value setting step of setting the order value of each initial consonant so that a search program, or a program linked to the search program, can identify the initial consonant input as an element of the searcher among the Hangul consonants;

a Hangul syllable code value setting step of sequentially setting the code values of the syllables composed of an initial consonant and a medial vowel so that the search program, or the program linked to the search program, can identify whether a syllable input as an element of the searcher is composed of an initial consonant and a medial vowel;

an input form setting step of setting the input form of the Hangul search form, including a searcher indicator which indicates that an element of the searcher is at least one of a Hangul initial consonant, a Hangul medial vowel, and a syllable consisting of a Hangul initial consonant and medial vowel, so that the input form can be used as the searcher;

a normalization step of converting the searcher entered in the input form into a string form without the searcher indicator and generating a form flag indicating the type of the string form;

a code value range calculating step of calculating the Hangul code value range of the string form according to the generated form flag;

a code value comparison step of comparing whether the code value of a syllable of a search target belongs to the calculated code value range;

an extraction step of extracting, as a result of the comparison, the Hangul syllables of the search target whose code values are within the calculated code value range; and

a search result output step of outputting, as a search result, a string including the extracted syllables.

The Hangul search method using a range of Hangul code values of claim 26, wherein the normalization step follows these rules:

(1) The array zPatternFlag has as many entries as the number of bytes in the array zPattern that holds the normalized form, and zPatternFlag[i] holds the identification value of the Hangul search form made up of zPattern[i] and zPattern[i + 1]. Each entry of zPatternFlag defaults to NULL (0), and the value of zPatternFlag is also NULL (0) when the form is not a Hangul search form.

(2) For each pair of a searcher indicator and one Hangul character,

(2-1) the searcher indicator is not stored in the array zPattern,

(2-2) the two bytes of the Hangul character are stored at the next storage positions of the array zPattern, whose indices are taken as i and i + 1, and

(2-3) zPatternFlag[i] is set to NULL, NULL, INITIAL_CONSONANT (1), INITIAL_MEDIAL_SYLLABLE (2), or MEDIAL_VOWEL (3) according to whether the Hangul character is, respectively, a consonant used only as a final consonant, a syllable of initial consonant + medial vowel + final consonant, an initial consonant, a syllable of initial consonant + medial vowel, or a vowel.
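The rules above can be illustrated with a simplified sketch. It departs from the claim in several labelled ways: it works on Unicode code points and keeps one flag per code point instead of per byte, the indicator character is passed in as a parameter because this section does not name it, and an indicator followed by a non-Hangul character is simply dropped with a NULL flag. All function names are hypothetical.

```c
#include <stddef.h>

enum {
    FLAG_NULL = 0,
    INITIAL_CONSONANT = 1,
    INITIAL_MEDIAL_SYLLABLE = 2,
    MEDIAL_VOWEL = 3
};

/* Compatibility Jamo consonants that can only be a final consonant
   (ㄳ ㄵ ㄶ ㄺ..ㅀ ㅄ); standard Unicode code points. */
static int is_final_only(unsigned int ch)
{
    return ch == 0x3133u || ch == 0x3135u || ch == 0x3136u ||
           (ch >= 0x313Au && ch <= 0x3140u) || ch == 0x3144u;
}

/* Rule (2-3): classify the character that follows a searcher indicator. */
static int classify(unsigned int ch)
{
    if (ch >= 0x3131u && ch <= 0x314Eu)           /* ㄱ..ㅎ */
        return is_final_only(ch) ? FLAG_NULL : INITIAL_CONSONANT;
    if (ch >= 0x314Fu && ch <= 0x3163u)           /* ㅏ..ㅣ */
        return MEDIAL_VOWEL;
    if (ch >= 0xAC00u && ch <= 0xD7A3u)           /* 가..힣 */
        return ((ch - 0xAC00u) % 28u == 0u)       /* no final consonant */
               ? INITIAL_MEDIAL_SYLLABLE : FLAG_NULL;
    return FLAG_NULL;
}

/* Copies in[0..n) to pattern, dropping each indicator (rule 2-1) and
   flagging the character after it. pattern and flags need room for n
   entries. Returns the number of code points written. */
static size_t normalize_pattern(const unsigned int *in, size_t n,
                                unsigned int indicator,
                                unsigned int *pattern, unsigned char *flags)
{
    size_t i, out = 0;
    for (i = 0; i < n; i++) {
        int flag = FLAG_NULL;                     /* rule (1): default */
        if (in[i] == indicator && i + 1 < n) {
            i++;                                  /* skip the indicator */
            flag = classify(in[i]);
        }
        pattern[out] = in[i];
        flags[out] = (unsigned char)flag;
        out++;
    }
    return out;
}
```

The range calculation can then switch on each flag: 1 selects the 588-wide initial consonant range, 2 the 28-wide initial-plus-medial range, and 3 the vowel handling.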

A Hangul search method comprising:

inputting a search word including a searcher;

determining the initial consonant index ic_idx by, if the initial consonant x that is an element of the searcher lies between UNI_IC_START and UNI_IC_END of the Unicode initial consonants, setting x - UNI_IC_START as the initial consonant index ic_idx of x, and, if x lies between UNI_CONSONANT_START and UNI_CONSONANT_END of the Unicode consonants, determining x to be an initial consonant only when the value of the array entry IC_Index[x - UNI_CONSONANT_START] of KS X 1001 is not -1 and setting IC_Index[x - UNI_CONSONANT_START] as the initial consonant index ic_idx; and

extracting, when the syllable to be compared is S1, each syllable S1 for which the value of (S1 - '가') / 588 matches the determined initial consonant index ic_idx of x.

A Hangul search method comprising:

inputting a search word including a searcher;

determining the medial vowel index vowel_idx by, if the medial vowel x used as an element of the searcher lies between MV_START and MV_END, the range of medial vowels in the Unicode Hangul Jamo block, determining x to be a Hangul vowel and taking x - MV_START as the medial vowel index vowel_idx of x, and, if x lies between UNI_VOWEL_START and UNI_VOWEL_END, the range of vowels in the Unicode Hangul Compatibility Jamo block, determining x to be a Hangul vowel and taking x - UNI_VOWEL_START as the medial vowel index vowel_idx; and

extracting, when the syllable to be compared is S1, each syllable S1 for which the value of ((S1 - '가') / 28) % 21 matches vowel_idx, the medial vowel index of x.

A Hangul search method comprising:

inputting a search word including a searcher;

judging a single Hangul syllable S, composed of a Unicode initial consonant and medial vowel and used as an element of the searcher, to be a syllable composed only of an initial consonant and a medial vowel if S lies between the Unicode minimum code value UNI_HANGUL_START and maximum code value UNI_HANGUL_END and the value of (S - '가') equals (((S - '가') / 588) * 588) + ((((S - '가') / 28) % 21) * 28);

calculating ic_idx, the initial consonant index of S, as ((S - '가') / (VOWEL_SIZE * (FINAL_CONSONANT_SIZE + 1))), that is, (S - '가') / 588;

calculating vowel_idx, the medial vowel index of S, as (((S - '가') / (FINAL_CONSONANT_SIZE + 1)) % VOWEL_SIZE), that is, ((S - '가') / 28) % 21; and

extracting, when the syllable to be compared is S1, each syllable S1 for which the value of (S1 - '가') / 588 equals the initial consonant index ic_idx and the value of ((S1 - '가') / 28) % 21 equals the medial vowel index vowel_idx.
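The three comparison rules in the preceding claims (match by initial consonant, by medial vowel, or by both) can be written as per-syllable predicates. This is a sketch with hypothetical function names; HANGUL_BASE and HANGUL_LAST are the standard Unicode values for '가' and '힣', and the range guard is an addition of this sketch so that non-Hangul input is rejected.

```c
#define HANGUL_BASE 0xAC00u  /* '가' */
#define HANGUL_LAST 0xD7A3u  /* '힣' */

/* Guard added for the sketch: only precomposed syllables are compared. */
static int is_syllable(unsigned int s1)
{
    return s1 >= HANGUL_BASE && s1 <= HANGUL_LAST;
}

/* (S1 - '가') / 588 equals the searcher's initial consonant index. */
static int matches_initial(unsigned int s1, unsigned int ic_idx)
{
    return is_syllable(s1) && (s1 - HANGUL_BASE) / 588u == ic_idx;
}

/* ((S1 - '가') / 28) % 21 equals the searcher's medial vowel index. */
static int matches_medial(unsigned int s1, unsigned int vowel_idx)
{
    return is_syllable(s1) && ((s1 - HANGUL_BASE) / 28u) % 21u == vowel_idx;
}

/* Both indices match; the final consonant of S1 is ignored. */
static int matches_initial_medial(unsigned int s1, unsigned int ic_idx,
                                  unsigned int vowel_idx)
{
    return matches_initial(s1, ic_idx) && matches_medial(s1, vowel_idx);
}
```

For example, '한' (U+D55C) has initial index 18 (ㅎ) and medial index 0 (ㅏ), so it matches the searchers 'ㅎ', 'ㅏ', and '하'. Unlike a single contiguous code range, the medial vowel predicate picks out exactly the syllables with the requested vowel.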

When the searcher is a Hangul complete-type medial vowel, the method comprises:

finding, in the array ICV, the maximum index m among the entries having the maximum value less than or equal to the syllable to be compared;

taking m % VOWEL_SIZE as the medial vowel index of the syllable; and

comparing the medial vowel index of the syllable to be compared with the medial vowel index of the searcher and extracting the syllable if the two indices are equal.

When the searcher is the j-th medial vowel of the Hangul complete type, the method consists of a process of assuming that the medial vowel of the syllable S to be compared is the medial vowel with index j of the Hangul search form, obtaining the corresponding index for the syllable, inferring the ICV index of the syllable, and verifying that the assumption is correct, wherein

ICV_2D[i][j] is treated as identical to ICV[m], where m is i * VOWEL_SIZE + j, i is m / VOWEL_SIZE, and j is m % VOWEL_SIZE, and

the verifying process comprises:

a. for the IC_SIZE entries of the j-th column of the array ICV_2D, i.e. ICV_2D[0][j], ICV_2D[1][j], ..., ICV_2D[IC_SIZE - 1][j], finding by binary search the entry having the maximum value among the values less than or equal to S;

b. when the index of the found entry is m,

extracting the compared syllable S if entry m is the last entry of the array ICV or S < ICV[m + 1].
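The column search in steps a and b can be sketched as follows. The real ICV table holds Hangul complete-type code values and is not reproduced here, so this sketch fills ICV with a stand-in built from Unicode code points (an assumption made only so the example runs); build_standin_icv and has_medial are hypothetical names, and S is assumed to be a Hangul syllable in the same encoding as the table.

```c
#include <stddef.h>

#define IC_SIZE    19
#define VOWEL_SIZE 21

/* ICV[i * VOWEL_SIZE + j] plays the role of ICV_2D[i][j]. */
static unsigned short ICV[IC_SIZE * VOWEL_SIZE];

/* Stand-in table (assumption): every initial + medial syllable in Unicode
   order, computed as '가' + i * 588 + j * 28. */
static void build_standin_icv(void)
{
    int i, j;
    for (i = 0; i < IC_SIZE; i++)
        for (j = 0; j < VOWEL_SIZE; j++)
            ICV[i * VOWEL_SIZE + j] =
                (unsigned short)(0xAC00 + i * 588 + j * 28);
}

/* Returns 1 if syllable s has the j-th medial vowel according to ICV. */
static int has_medial(unsigned short s, int j)
{
    int lo = 0, hi = IC_SIZE;           /* candidate rows are [lo, hi) */
    size_t m;

    /* Step a: binary search down column j for the last entry <= s. */
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (ICV[mid * VOWEL_SIZE + j] <= s)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;                       /* s is below every entry in column j */

    /* Step b: m is the flat index of the found entry; verify the guess. */
    m = (size_t)(lo - 1) * VOWEL_SIZE + (size_t)j;
    return m == (size_t)(IC_SIZE * VOWEL_SIZE - 1) || s < ICV[m + 1];
}
```

Searching one column takes about five comparisons (19 rows) instead of the roughly nine needed for a search over all 399 entries, which appears to be the motivation for guessing the vowel first and verifying afterwards.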

The Hangul search method using a range of Hangul code values according to claim 12 or 25, wherein the minimum code value and the maximum code value of the Hangul code value range R are obtained, these values are expressed in any encoding scheme of any character encoding form of the Unicode standard, and the Hangul syllables belonging to the range are then extracted.

A computer-readable recording medium on which is recorded a program that uses the Hangul search method according to any one of claims 1 to 32.

A Hangul search system comprising means for performing each step of the method of any one of claims 1 to 32.

(Deleted)