KR101459285B1

KR101459285B1 - Device and method for determining sentence similarity and device and method for marking exam

Info

Publication number: KR101459285B1
Application number: KR20140040422A
Authority: KR
Inventors: 김종명
Original assignee: 김종명
Priority date: 2014-04-04
Filing date: 2014-04-04
Publication date: 2014-11-12

Abstract

According to an embodiment of the present invention, a sentence entered by a test taker is automatically compared with the correct-answer sentence and scored, so that accurate scoring can be achieved without human effort. Accordingly, the scoring system can be digitized and automated, and scoring results can be provided promptly to the test taker. Also, according to an embodiment of the present invention, corrections for the incorrect part of an answer entered by the user can be provided using the scoring result.

Description

The present invention relates to a method and apparatus for determining sentence similarity and a method and apparatus for scoring tests, and more particularly to a sentence similarity determination method and apparatus and a test scoring method and apparatus for automatically scoring answers entered by a test taker.

In recent years, as the job market has tightened and the need and desire for self-development have grown, applicants have been flocking to the many kinds of examinations hosted by various institutions and organizations.

Such examinations are often taken offline by visiting a test site. Recently, examinations have been administered in such a way that the test is taken on a computer at the test site and the scoring result is received by computer. Furthermore, the prior art, such as Korean Patent Laid-Open Publication No. 10-2012-0090564, has proposed a system that uses a cloud system so that a test can be taken remotely and its results checked without visiting a test site.

The scoring of such examinations is generally automated by computer for multiple-choice questions, for which answers are entered on an OMR card, whereas for open-ended (subjective) questions a grader reviews the test taker's answers directly.

In particular, when writing and speaking tests are included as test items, as in the IBT, a grader checks each answer and assigns a score one by one. The grader's scoring criteria are inevitably subjective, and because the scoring is done manually, it takes a long time for the results to come out. A technique for solving these problems is therefore needed.

Meanwhile, the background art described above is technical information that the inventor possessed for the derivation of the present invention or acquired in the course of deriving it, and is not necessarily known art disclosed to the general public before the filing of the present application.

Accordingly, to solve the problems of the prior art described above, an object of one embodiment of the present invention is to provide a method and apparatus for determining sentence similarity by comparing an answer entered by a test taker with the correct answer.

Another object of one embodiment of the present invention is to provide a method and apparatus that use the sentence similarity determination method to score an answer entered by a user in a writing or speaking test.

As a technical means for achieving the above technical objects, a sentence similarity determination method performed by a server according to a first aspect of the present invention includes: receiving, from a user, a first sentence composed of a plurality of words; generating at least one word combination using a plurality of words included in a second sentence stored in the server in correspondence with the first sentence; and comparing each word combination with the first sentence to calculate the similarity between the first and second sentences.

In addition, in the step of generating the at least one word combination, when the second sentence consists of n words, the at least one word combination is generated by extracting any number of words from 1 to n from among the words of the second sentence.

In addition, the total number of word combinations generated may be 2ⁿ-1.

In addition, the step of generating the at least one word combination includes extracting at least one word from the second sentence in the same order as the word order of the second sentence, thereby generating the at least one word combination.

In addition, the step of calculating the similarity includes: calculating a similarity for each word combination as the ratio of the number of words of the first sentence that match the words of that word combination, taking the word order of the word combination into account, to the total number of words of the first sentence; and determining the highest of the calculated similarities as the similarity between the first sentence and the second sentence.

In addition, in the step of calculating the similarity, the words constituting the first and second sentences are encoded, and the similarity is output on the basis of the encoded values.

In addition, a test scoring method performed by a server according to a second aspect of the present invention includes: receiving, from a user, an input sentence composed of a plurality of words as a response to a test question; generating at least one word combination using a plurality of words included in a correct-answer sentence stored in advance in the server in correspondence with the input sentence; comparing each word combination with the input sentence to calculate the similarity between the input sentence and the correct-answer sentence; and calculating a score for the input sentence on the basis of the calculated similarity.

In addition, the total number of word combinations generated may be 2ⁿ-1.

In addition, the step of generating the at least one word combination includes extracting at least one word from the correct-answer sentence in the same order as the word order of the correct-answer sentence, thereby generating the at least one word combination.

In addition, the step of calculating the similarity includes: calculating a similarity for each word combination as the ratio of the number of words of the input sentence that match the words of that word combination, taking the word order of the word combination into account, to the total number of words of the input sentence; and determining the highest of the calculated similarities as the similarity between the input sentence and the correct-answer sentence.

In addition, the step of calculating the score includes: calculating an evaluation factor as the ratio of the number of matching words between the input sentence and the word combination paired with it at the determined similarity to the total number of words of the correct-answer sentence; calculating a deduction factor when the total number of words of the input sentence is greater than the total number of words of the correct-answer sentence; and calculating the user's score for the test question by subtracting the deduction factor from the evaluation factor.

In addition, the deduction factor is proportional to the number of words by which the input sentence exceeds the correct-answer sentence and inversely proportional to the total number of words of the correct-answer sentence.
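The score calculation described above can be sketched as follows. The text states only that the deduction factor is proportional to the number of excess words and inversely proportional to the length of the correct-answer sentence; this sketch assumes a proportionality constant of 1. The function and parameter names are hypothetical and do not come from the specification.

```python
def score_answer(matched_words: int, input_length: int, answer_length: int) -> float:
    """Return evaluation factor minus deduction factor for one answer.

    matched_words: words shared by the input sentence and its best word combination.
    input_length:  total number of words in the input sentence.
    answer_length: total number of words in the correct-answer sentence (must be > 0).
    """
    # Evaluation factor: matching words relative to the correct-answer length.
    evaluation = matched_words / answer_length

    # Deduction factor: applied only when the input is longer than the answer.
    # Assumption: proportionality constant of 1 (not specified in the text).
    excess_words = max(0, input_length - answer_length)
    deduction = excess_words / answer_length

    # The text does not say whether negative scores are clamped, so none is applied.
    return evaluation - deduction


print(score_answer(3, 4, 4))  # three of four words match, no excess words
print(score_answer(4, 6, 4))  # all words match, but two excess words are penalized
```

Under these assumptions, an input that contains every word of the correct answer but adds two extra words to a four-word answer loses half of the available score.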

In addition, in the step of calculating the similarity value, the words constituting the response sentence and the correct-answer sentence are encoded, and the similarity value is output on the basis of the encoded values.

In addition, the method further includes, when the score calculated for the test question is not the maximum value, providing correction information on the part of the input sentence entered by the user that does not match the correct-answer sentence.

In addition, the test scoring method further includes, when the input sentence is in speech form, converting the input sentence into text form by applying speech recognition technology.

In addition, the test scoring method further includes applying natural language processing to the input sentence in text form to separate the individual words constituting the input sentence.

Meanwhile, a sentence similarity determination apparatus according to a third aspect of the present invention includes: a first sentence receiving unit that receives, from a user, a first sentence composed of a plurality of words; a word combination generation unit that generates at least one word combination using a plurality of words included in a second sentence stored in advance in the server in correspondence with the first sentence; and a similarity calculation unit that compares each word combination with the first sentence to calculate a similarity value between the first and second sentences.

Meanwhile, a test scoring apparatus according to a fourth aspect of the present invention includes: an input sentence receiving unit that receives, from a user, an input sentence composed of a plurality of words as a response to a test question; a word combination generation unit that generates at least one word combination using a plurality of words included in a correct-answer sentence stored in advance in the server in correspondence with the input sentence; a similarity calculation unit that compares each word combination with the input sentence to calculate a similarity value between the input sentence and the correct-answer sentence; and a score providing unit that calculates a score for the input sentence on the basis of the calculated similarity.

According to any one of the above-described means for solving the problems, one embodiment of the present invention automatically compares the sentence entered by a test taker as an answer with the correct-answer sentence and scores it, so that accurate scoring can be performed without human effort. Accordingly, the scoring system can be digitized and automated, and scoring results can be provided quickly to the test taker.

In addition, according to any one of the means for solving the problems of the present invention, corrections can be provided, from the scoring result, for the incorrect part of the answer entered by the test taker. From the correction result, the test taker can receive immediate feedback on which part is wrong.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention pertains from the description below.

FIG. 1 is a configuration diagram of a test scoring system according to an embodiment of the present invention.
FIG. 2 is a structural diagram showing the internal configuration of a test scoring apparatus according to an embodiment of the present invention.
FIG. 3 is a table showing examples of word combinations generated according to an embodiment of the present invention.
FIGS. 4A to 4D are conceptual diagrams illustrating a sentence similarity determination method according to an embodiment of the present invention.
FIG. 5 is a conceptual diagram illustrating a sentence similarity determination method according to the prior art.
FIG. 6A is a conceptual diagram illustrating how sentence similarity is determined according to the prior art when the correct-answer sentence contains duplicate words, and FIG. 6B is a conceptual diagram illustrating how sentence similarity is determined in the same case according to an embodiment of the present invention.
FIG. 7 is an exemplary diagram showing a result in which correction information is provided according to an embodiment of the present invention.
FIG. 8 is a flowchart for explaining a test scoring method according to an embodiment of the present invention.
FIG. 9 is a flowchart detailing step S140 of FIG. 8.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice them. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly, and similar parts are denoted by similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between. Also, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Before an embodiment of the present invention is described, the meanings of the terms used below are first defined.

Hereinafter, a “sentence” is a concept that includes a phrase formed by combining a plurality of words, or the minimum unit expressing a complete feeling or thought (that is, a combination of words that can be delimited by a period, question mark, or exclamation mark).

Also, an “input sentence” (or first sentence) means the answer entered by a test taker for a test question.

Also, a “correct-answer sentence” (or second sentence) means a sentence defined as a correct answer to a test question; one or more correct-answer sentences may exist.

A “word combination” means a combination of words obtained by extracting at least one of the words constituting the correct-answer sentence and arranging the extracted words in the order in which they appear in the correct-answer sentence. The number of word combinations that can be generated depends on the number of words constituting the correct-answer sentence. For example, when the correct-answer sentence consists of four words, a word combination may consist of one, two, three, or four words. In this case, the number of word combinations generated is 15 (2⁴-1); expressed as a formula, the number of possible word combinations is 2ⁿ-1 (n = the number of words constituting the correct-answer sentence).
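The definition of a word combination can be illustrated with a minimal sketch. It assumes that a word combination is any non-empty selection of the words of the correct-answer sentence kept in their original order, which is what `itertools.combinations` produces for a sequence. The function name `word_combinations` is hypothetical.

```python
from itertools import combinations


def word_combinations(words):
    """Return every non-empty, order-preserving selection of the given words."""
    combos = []
    # Sizes 1 through n; combinations() keeps elements in their original order.
    for size in range(1, len(words) + 1):
        combos.extend(combinations(words, size))
    return combos


answer_words = "I am a boy".split()
combos = word_combinations(answer_words)

# For n = 4 words the count is 2**4 - 1.
print(len(combos), 2 ** len(answer_words) - 1)

# The three-word combinations keep the word order of the correct-answer sentence.
for combo in combos:
    if len(combo) == 3:
        print(" ".join(combo))
```

The count grows exponentially with n, so this enumeration is practical only for short correct-answer sentences.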

Hereinafter, the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a configuration diagram illustrating a test scoring system according to an embodiment of the present invention.

The network (N) may be implemented as any kind of wired or wireless network, such as a local area network (LAN), a wide area network (WAN), a value added network (VAN), a personal area network (PAN), a mobile radio communication network, WiBro (Wireless Broadband Internet), Mobile WiMAX, HSDPA (High Speed Downlink Packet Access), or a satellite communication network.

The user terminal 100 receives a test question from a question providing server (not shown), receives the input sentence entered by the user as an answer to the test question, transmits the answer to the scoring server 200, and receives the test scoring result from the scoring server 200 and displays it. The terminal 100 may be implemented as a computer, a portable terminal, or a television that can connect to a remote server through the network (N) or can be connected to other terminals and servers. Here, the computer includes, for example, a notebook, desktop, or laptop equipped with a web browser. The portable terminal is, for example, a wireless communication device that guarantees portability and mobility, and may include any kind of handheld wireless communication device such as PCS (Personal Communication System), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), GSM (Global System for Mobile communications), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (W-Code Division Multiple Access), WiBro (Wireless Broadband Internet), a smartphone, or Mobile WiMAX (Mobile Worldwide Interoperability for Microwave Access). The television may include IPTV (Internet Protocol Television), Internet TV, terrestrial TV, cable TV, and the like.

The scoring server 200 receives the input sentence entered by the user as an answer, determines the similarity between the input sentence and the correct-answer sentence using the sentence similarity determination method, and then performs scoring on that basis. The scoring server 200 can provide the user's terminal 100 not only with the scoring result but also with a correction result for the input sentence. The scoring server 200 may also provide the scoring result and/or the correction result to a database server (not shown). In this case, the user may later access the database server (not shown) to check the scoring result. The invention is not limited to this example, however, and the scoring server 200 may itself serve as the database server (not shown).

Meanwhile, the functions of the scoring server 200 may instead be performed by the user terminal 100. For example, the user terminal 100 may be configured to score the input sentence and transmit the scoring result to the database server (not shown).

Hereinafter, a specific configuration of the test scoring apparatus according to an embodiment of the present invention is described with reference to FIG. 2. The test scoring apparatus described below is described as being the scoring server 200, but it may also be implemented so as to be included in the user terminal 100. The test scoring apparatus can score writing or speaking tests, preferably English writing or English speaking tests.

The test scoring apparatus according to an embodiment of the present invention includes an input sentence receiving unit 210, an input sentence analysis unit 220, a word combination generation unit 230, a similarity calculation unit 240, a score providing unit 250, and a correction information providing unit 260.

The input sentence receiving unit 210 receives an input sentence from the user. The received input sentence is the answer to a test question provided to the user, and may be in the form of text or speech. That is, when the test question is a writing question, the input sentence is text that the user enters through an input means such as the terminal's keyboard; when the test question is a speaking question, the input sentence may be the user's voice.

The input sentence analysis unit 220 performs speech recognition, depending on the form of the input sentence, to recognize the text of the input sentence. The input sentence analysis unit 220 can recognize an input sentence in speech form and convert it into text form, so the speech entered by the user in a speaking test can be converted into text. An input sentence entered in a writing test, on the other hand, is in text form from the outset. As a result, the input sentence analysis unit 220 can recognize the input sentence entered by the user in text form for both writing and speaking tests.

Next, the input sentence analysis unit 220 performs natural language processing on the input sentence in text form. Natural language processing means mechanically analyzing language uttered by humans and processing it into a form that a computer can understand. Here, natural language processing includes (i) morphological analysis, (ii) syntactic analysis, and (iii) semantic analysis. (i) Morphological analysis is the process of separating morphemes, the smallest units that carry meaning, from a sentence. For example, given an English sentence such as “I bought an apple.”, the words constituting the sentence, “I”, “bought”, “an”, and “apple”, can each be separated as a morpheme. (ii) Syntactic analysis is the process of separating out constructs such as noun phrases and verb phrases on the basis of the morphological analysis result. (iii) Semantic analysis is the process of logically identifying the semantic relationships between the constituents of a sentence to grasp the overall meaning of the sentence. Preferably, in one embodiment of the present invention, the input sentence analysis unit 220 extracts the words constituting the input sentence by separating each morpheme from the input sentence entered by the user through morphological analysis. Accordingly, an input sentence that has passed through the input sentence analysis unit 220 can be divided word by word.
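The word-separation step can be approximated with a short sketch. This is not real morphological analysis: it is a simple regular-expression split, assumed here to be sufficient for short English answers such as the example above. The function name `split_words` is hypothetical.

```python
import re


def split_words(sentence):
    """Split an English sentence into words, dropping punctuation.

    Assumption: a word is a run of letters or apostrophes. A production system
    would use a proper morphological analyzer instead of this pattern.
    """
    return re.findall(r"[A-Za-z']+", sentence)


print(split_words("I bought an apple."))
```

The resulting word list is the form in which the input sentence is passed on to the comparison with the word combinations.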

To extract the objects to be compared with the input sentence, the word combination generation unit 230 generates at least one word combination from the correct-answer sentence. Specifically, the word combination generation unit 230 generates word combinations for every way in which the words constituting the correct-answer sentence can be combined. Assuming that the correct-answer sentence consists of n words, a word combination may therefore consist of 1, 2, ..., or n words. Expressed as a formula, the number of cases is 2ⁿ-1 (n = the number of words constituting the correct-answer sentence).

Here, the word combination generation unit 230 may encode each word constituting the correct-answer sentence and then generate the word combinations for all cases. For example, suppose the correct-answer sentence is “I am a boy”. The words constituting the correct-answer sentence are then “I”, “am”, “a”, and “boy”, and they can be encoded as “0”, “1”, “2”, and “3”. In this case, all the word combinations that can be generated for the correct-answer sentence are as shown in the table of FIG. 3. The same principle applies when the correct-answer sentence contains duplicate words. For example, when the words of the correct-answer sentence are “good”, “boy”, “and”, “good”, and “girl”, they can be encoded as “0”, “1”, “2”, “0”, and “3”.
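The encoding described above can be sketched as follows, assuming that each distinct word receives the next unused integer code in order of first appearance, so that a repeated word reuses its earlier code. The function name `encode_words` is hypothetical.

```python
def encode_words(words):
    """Map each word to an integer code assigned in order of first appearance."""
    codes = {}      # word -> code
    encoded = []
    for word in words:
        if word not in codes:
            # A word seen for the first time gets the next unused code.
            codes[word] = len(codes)
        encoded.append(codes[word])
    return encoded


print(encode_words(["I", "am", "a", "boy"]))
print(encode_words(["good", "boy", "and", "good", "girl"]))
```

Comparing integer codes instead of strings is one plausible reason for the encoding step, although the text does not state the motivation.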

In addition, the word combination generation unit 230 may generate the word combinations so that the words are arranged in the same order as in the correct-answer sentence. For example, referring to FIG. 3, when three words are extracted at random from “I am a boy”, word combinations such as “I am a”, “I am boy”, “I a boy”, and “am a boy” are generated so that the order of the words serving as subject, complement, and predicate in the correct-answer sentence does not change. This is because the order of the words in a sentence can also be a factor in determining the similarity between sentences, so the word order is kept the same as in the correct-answer sentence.

Next, the similarity calculation unit 240 compares the generated word combinations with the input sentence to determine the similarity between the correct answer sentence and the input sentence. Specifically, the similarity calculation unit 240 calculates the similarity between each word combination and the input sentence in consideration of the kinds of words and the order of the words. To do so, the similarity calculation unit 240 sequentially compares the input sentence and each word combination, from the word at the beginning to the word at the end. The similarity calculation unit 240 then determines the highest of the calculated similarities as the similarity between the input sentence and the correct answer sentence.

For example, assume that the input sentence is "am a I boy". Referring to FIG. 4A, when one word combination is "am a boy" and its words are compared sequentially, from beginning to end, with the words of the input sentence, three of the four words of the input sentence match, so the similarity may be calculated as 0.75. According to FIG. 4B, when another word combination is "I a boy", sequential comparison with the words of the input sentence yields a similarity of 0.50. According to FIG. 4C, when yet another word combination is "I am boy", comparison with the input sentence likewise yields a similarity of 0.50. When all word combinations are compared with the input sentence in this manner, the similarity between the correct answer sentence "I am a boy" and the input sentence "am a I boy" may be determined to be 0.75, as shown in FIG. 4D; that is, three of the four words constituting the input sentence are the same as in the correct answer.
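The comparison can be sketched as follows. The description does not spell out the exact matching procedure, so this sketch assumes a left-to-right scan in which each word of the combination is searched for in the input sentence after the position of the previous match; this assumption reproduces the values 0.75, 0.50, and 0.50 quoted for FIGS. 4A to 4C. All function names are hypothetical.

```python
from itertools import combinations


def sequential_matches(combo, input_words):
    """Count combo words found in the input while scanning left to right (assumed procedure)."""
    pos = 0
    matched = 0
    for word in combo:
        try:
            pos = input_words.index(word, pos) + 1  # continue after this match
            matched += 1
        except ValueError:
            pass  # word does not occur after the last match; skip it
    return matched


def similarity(combo, input_words):
    """Matched words divided by the total number of words in the input sentence."""
    return sequential_matches(combo, input_words) / len(input_words)


def best_similarity(answer_words, input_words):
    """Highest similarity over all order-preserving combinations of the answer."""
    n = len(answer_words)
    return max(
        similarity([answer_words[i] for i in idx], input_words)
        for size in range(1, n + 1)
        for idx in combinations(range(n), size)
    )


user_input = "am a I boy".split()
print(similarity("am a boy".split(), user_input))         # 0.75
print(similarity("I a boy".split(), user_input))          # 0.5
print(similarity("I am boy".split(), user_input))         # 0.5
print(best_similarity("I am a boy".split(), user_input))  # 0.75
```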

Meanwhile, according to the prior art, the similarity between the correct answer sentence and the input sentence in the above case may not be calculated as 0.75. Referring to FIG. 5, the prior-art sentence comparison algorithm compares the words sequentially in the order ① to ⑥ and determines that only "I" and "boy" among the words of the input sentence match the correct answer sentence. Because only two of the four words of the input sentence are judged correct, the similarity is calculated as 0.50. In other words, although the input sentence "am a I boy" merely has "I" in the wrong position, so that a similarity of 0.75 should be obtained, the prior art yields a similarity of 0.50.

Furthermore, when the same word appears more than once in the correct answer sentence, the prior art may yield a similarity lower than the actual similarity, whereas an embodiment of the present invention can yield a more accurate similarity. For example, assume that the input sentence "good girl and good boy" is entered for the correct answer sentence "good boy and good girl". According to the prior art, as shown in FIG. 6A, the words are compared in the order ① to ⑤ and the similarity is 0.40. When the sentence similarity is determined according to an embodiment of the present invention, however, as shown in FIG. 6B, comparing the word combination "good and good" with the input sentence yields the highest similarity, 0.60, so the similarity is determined to be 0.60. That is, constructing various word combinations and comparing them with the input sentence allows a more accurate similarity to be determined even when duplicate words are present in the correct answer sentence.

In the above example, however, the input sentence and the correct answer sentence do not match word for word but are in fact equivalent, and yet the result is 0.60 rather than 1.00. Accordingly, as a further embodiment, when the correct answer sentence contains a coordinating conjunction with a parallel meaning (for example, "and" or "or"), word combinations may additionally be generated for the case in which the word sequences before and after the coordinating conjunction are exchanged. According to this further embodiment, the similarity in the above example is 1.00.
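The further embodiment can be sketched by producing reordered variants of the correct answer sentence before word combinations are generated. The conjunction set, the function name `swapped_variants`, and the rule of exchanging everything before and after a single conjunction are assumptions made for illustration only.

```python
# Assumed, non-exhaustive set of coordinating conjunctions.
COORDINATING_CONJUNCTIONS = {"and", "or"}


def swapped_variants(answer_words):
    """Return the answer plus variants with the parts around a conjunction exchanged."""
    variants = [list(answer_words)]
    last = len(answer_words) - 1
    for i, word in enumerate(answer_words):
        # Only swap when words exist on both sides of the conjunction.
        if word.lower() in COORDINATING_CONJUNCTIONS and 0 < i < last:
            before, after = list(answer_words[:i]), list(answer_words[i + 1:])
            variants.append(after + [word] + before)
    return variants


answer = "good boy and good girl".split()
user_input = "good girl and good boy".split()
print(swapped_variants(answer)[1])             # ['good', 'girl', 'and', 'good', 'boy']
print(user_input in swapped_variants(answer))  # True
```

Because one variant is identical to the input sentence, the combination consisting of all words of that variant matches every input word, which corresponds to the similarity of 1.00 stated above.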

The score providing unit 250 calculates a score for the input sentence based on the finally determined similarity. From the similarity, the score providing unit 250 can identify the maximum number of words in the input sentence that match words of the correct answer sentence. For example, when the input sentence is "am a I boy" and the correct answer sentence is "I am a boy", the similarity is determined to be 0.75, which indicates that at most three words of the input sentence match the correct answer sentence. The evaluation factor may then be calculated as follows.
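The equation itself is not reproduced in this text. Judging from the 75-point example and from the description of step S150, where the evaluation factor is the ratio of the number of matching words to the number of words in the correct answer sentence, it is presumably of the following form. The symbols N_match and N_answer and the scale factor of 100 are assumptions made for illustration.

$$\text{evaluation factor} = \frac{N_{\text{match}}}{N_{\text{answer}}} \times 100$$

With N_match = 3 and N_answer = 4 this gives 75.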

For example, in the above case, the evaluation factor is calculated as 75 points.

Meanwhile, when the correct answer sentence is "I am a boy" and the input sentence is "I am a my boy", a calculation such as Equation 1 yields 100 points. In practice, however, the additional word "my" makes the input sentence grammatically incorrect. To prevent 100 points from being awarded in such a case, the score providing unit 250 may determine whether the input sentence contains additional words that do not match the correct answer sentence and may apply a deduction factor for those additional words. As in Equation 2, the deduction factor may be set to be proportional to the number of words in the input sentence that do not match the correct answer sentence and inversely proportional to the number of words in the correct answer sentence.
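The scoring rule can be sketched as below. Equations 1 and 2 are not reproduced in this text, so the sketch rests on stated assumptions: a scale of 100 points, a deduction counted per surplus word (input words beyond the length of the correct answer sentence, following the wording of the claims), and a final score obtained by subtracting the deduction from the evaluation factor. The function name `score` and its parameters are hypothetical.

```python
def score(matched, answer_len, input_len, scale=100.0):
    """Evaluation factor minus deduction factor (assumed forms, see lead-in)."""
    evaluation = scale * matched / answer_len
    surplus = max(0, input_len - answer_len)  # extra words beyond the answer's length
    deduction = scale * surplus / answer_len  # proportional to surplus, inverse to answer length
    return evaluation - deduction


# "I am a my boy" against "I am a boy": 4 matches, 1 surplus word.
print(score(matched=4, answer_len=4, input_len=5))  # 75.0
# "am a I boy" against "I am a boy": 3 matches, no surplus word.
print(score(matched=3, answer_len=4, input_len=4))  # 75.0
```

Under these assumptions the surplus word "my" lowers the score from 100 to 75 instead of leaving it at full marks.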

When the input sentence contains a part that does not match the correct answer sentence, the correction information providing unit 260 may mark the mismatched part and provide the correct content as correction information. For example, referring to FIG. 7, the answer entered by the user is displayed in blue, and the incorrect parts of that answer are displayed underlined and in red. The correct content corresponding to each incorrect part is written below the underline, so that the user can see the incorrect parts and the correct answers at a glance.

The calculated score and correction information may first be transmitted to a database server (not shown), and the user may later access the database server (not shown) after the test to check the calculated score and correction information. However, the invention is not limited to this example; the scoring server 200 may itself store the calculated score and correction information and provide them to the user immediately.

Next, a test scoring method according to an embodiment of the present invention is described in detail with reference to FIGS. 8 and 9. In the following, the test scoring device may be implemented as the user terminal 100 or the scoring server 200.

First, the test scoring device may receive an input sentence from the user (S100). The input sentence is the sentence the user enters as the answer to a test question, and it may be received in text or voice form. For a speaking question, an input sentence in voice form may be received; for a writing question, an input sentence in text form may be received.

Next, the test scoring device may perform natural language processing on the input sentence to separate it into its constituent words (S110). If the input sentence is in voice form, it may first be converted into text form using speech recognition technology so that natural language processing can be performed. The test scoring device then performs natural language processing on the text-form input sentence to separate the words constituting it.

The test scoring device then generates at least one word combination using the plurality of words constituting the correct answer sentence corresponding to the test question (S130). The correct answer sentence may consist of a plurality of words, and one word combination is generated by extracting and combining an arbitrary number of those words. The word combinations are generated so that their word order is consistent with the word order of the correct answer sentence. For example, when the correct answer sentence has n words, a word combination may consist of anywhere from 1 to n words.

Next, the test scoring device compares each word combination with the input sentence to calculate the similarity between the input sentence and the correct answer sentence (S140). Specifically, referring to FIG. 9, the test scoring device compares the input sentence with each of the at least one word combination and calculates a similarity for each case (S141). The similarity may be calculated as the ratio of the number of words whose order and kind match between the input sentence and the word combination to the total number of words in the input sentence. For example, between the input sentence "am a I boy" and the word combination "am a boy", the three words "am", "a", and "boy" match. Because three of the four words constituting the input sentence match, the similarity is 0.75. The test scoring device then determines the highest of the calculated similarities as the final similarity between the input sentence and the correct answer sentence and outputs it (S142). For example, when similarities of 0.50, 0.45, and 0.75 have been calculated, the highest value, 0.75, is determined as the similarity between the correct answer sentence and the input sentence.
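As an implementation note that is not part of the described method: if the per-combination comparison is assumed to be a left-to-right match that respects word order, the highest similarity over all order-preserving word combinations equals the length of the longest common subsequence (LCS) of the correct answer sentence and the input sentence, divided by the number of words in the input sentence. The sketch below substitutes this standard dynamic-programming technique for the enumeration of all 2^n - 1 combinations; the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def final_similarity(answer_words, input_words):
    """Final similarity (S142) computed without enumerating combinations."""
    return lcs_length(answer_words, input_words) / len(input_words)


print(final_similarity("I am a boy".split(), "am a I boy".split()))  # 0.75
print(final_similarity("good boy and good girl".split(),
                       "good girl and good boy".split()))            # 0.6
```

Both results agree with the values 0.75 and 0.60 given in the description, while the running time grows with the product of the two sentence lengths rather than exponentially in n.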

The test scoring device calculates a score for the input sentence based on the similarity (S150). Based on the similarity, the test scoring device can identify the number of words in the input sentence that match the correct answer sentence, and it may calculate the evaluation factor as the ratio of the identified number to the number of words constituting the correct answer sentence. If the input sentence contains unnecessary additional words that are not in the correct answer sentence, the test scoring device may calculate a deduction factor for the additional words. The test scoring device may calculate the final score by summing the evaluation factor and the deduction factor.

Furthermore, the test scoring device may provide correction information for the incorrect parts of the input sentence (S160).

As described above, an embodiment of the present invention calculates the similarity between sentences more accurately than the prior art and can therefore perform accurate scoring. As a result, the scoring of speaking and writing test questions, which used to be done manually by graders, can be automated, and test takers can check their test results more quickly.

Of the components described above, the input sentence receiving unit through the similarity calculation unit (210 to 240) may be included as components of the sentence similarity determination device, and steps S100 to S140 of FIG. 8 may be included as sub-steps of the sentence similarity determination method.

The sentence similarity determination method and the test scoring method according to the embodiments described with reference to FIGS. 8 and 9 may also be implemented in the form of a recording medium containing computer-executable instructions, such as program modules executed by a computer. A computer-readable medium can be any available medium that can be accessed by a computer and includes volatile and nonvolatile media and removable and non-removable media. A computer-readable medium may also include both computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, other data in a modulated data signal such as a carrier wave, or other transport mechanisms, and include any information delivery media.

The foregoing description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can readily be modified into other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present invention is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: user terminal 200: scoring server
210: input sentence receiving unit 220: input sentence analysis unit
230: word combination generation unit 240: similarity calculation unit
250: score providing unit 260: correction information providing unit

Claims

A method for determining sentence similarity, performed by a server, the method comprising:
receiving, from a user, a first sentence composed of a plurality of words;
generating at least one word combination using a plurality of words included in a second sentence that is stored in the server and corresponds to the first sentence; and
comparing each word combination with the first sentence to calculate a similarity between the first sentence and the second sentence,
wherein the generating of the at least one word combination comprises:
Wherein the first sentence includes at least one word and the second sentence includes at least one word and the second sentence includes at least one word, How to determine similarity.

The method according to claim 1,
wherein the generating of the at least one word combination comprises:
when the number of words constituting the second sentence is n, extracting any number of words from 1 to n from the words of the second sentence to generate the at least one word combination.

The method according to claim 2,
wherein the total number of word combinations generated is 2^n - 1.

The method according to claim 1,
wherein the generating of the at least one word combination comprises:
extracting at least one word from the second sentence such that the extracted words keep the same order as in the second sentence, to generate the at least one word combination.

The method according to claim 4,
wherein the calculating of the similarity comprises:
calculating, for each word combination, a similarity as the ratio of the number of words in the first sentence that are the same as the words included in that word combination to the total number of words in the first sentence; and
determining the highest value among the calculated similarities as the similarity between the first sentence and the second sentence.

(Deleted)

A test scoring method performed by a server, the method comprising:
receiving, from a user, an input sentence composed of a plurality of words as an answer to a test question;
generating at least one word combination using a plurality of words included in a correct answer sentence that is stored in advance in the server and corresponds to the input sentence;
comparing each word combination with the input sentence to calculate a similarity between the input sentence and the correct answer sentence; and
calculating a score for the input sentence based on the calculated similarity,
wherein the generating of the at least one word combination comprises:
encoding each word constituting the correct answer sentence such that identical words have the same code value, and then generating the at least one word combination using the plurality of encoded words included in the correct answer sentence.

The method according to claim 7,
wherein the generating of the at least one word combination comprises:
when the number of words constituting the correct answer sentence is n, extracting any number of words from 1 to n from the words of the correct answer sentence to generate the at least one word combination.

The method according to claim 8,
wherein the total number of word combinations generated is 2^n - 1.

The method according to claim 7,
wherein the generating of the at least one word combination comprises:
extracting at least one word from the correct answer sentence such that the extracted words keep the same order as in the correct answer sentence, to generate the at least one word combination.

The method according to claim 10,
wherein the calculating of the similarity comprises:
calculating, for each word combination, a similarity as the ratio of the number of words in the input sentence that match the word combination, taking the word order of the word combination into account, to the total number of words in the input sentence; and
determining the highest value among the calculated similarities as the similarity between the input sentence and the correct answer sentence.

The method according to claim 11,
wherein the calculating of the score comprises:
calculating an evaluation factor as the ratio of the number of words that match between the input sentence and the word combination corresponding to the determined similarity to the total number of words in the correct answer sentence;
calculating a deduction factor when the total number of words in the input sentence is greater than the total number of words in the correct answer sentence; and
calculating the user's score for the test question by subtracting the deduction factor from the evaluation factor.

The method according to claim 12,
wherein the deduction factor
is proportional to the number of words by which the input sentence exceeds the correct answer sentence and inversely proportional to the total number of words in the correct answer sentence.

(Deleted)

The method according to claim 7,
further comprising providing correction information on the part of the input sentence entered by the user that does not match the correct answer sentence when the score calculated for the test question is not the maximum value.

The method according to claim 7,
further comprising converting the input sentence into text form by applying speech recognition technology when the input sentence is in voice form.

The method according to claim 7,
further comprising separating the words constituting the input sentence by applying natural language processing to the input sentence in text form.

A sentence similarity determination device comprising:
a first sentence receiving unit for receiving, from a user, a first sentence composed of a plurality of words;
a word combination generation unit for generating at least one word combination using a plurality of words included in a second sentence that corresponds to the first sentence and is stored in advance in the sentence similarity determination device or in a server communicating with the sentence similarity determination device; and
a similarity calculation unit for comparing each word combination with the first sentence to calculate a similarity between the first sentence and the second sentence,
wherein the word combination generation unit:
Wherein the first sentence includes at least one word and the second sentence includes at least one word and the second sentence includes at least one word, Similarity determination device.

A test scoring device comprising:
an input sentence receiving unit for receiving, from a user, an input sentence composed of a plurality of words as an answer to a test question;
a word combination generation unit for generating at least one word combination using a plurality of words included in a correct answer sentence that corresponds to the input sentence and is stored in advance in the test scoring device or in a server communicating with the test scoring device;
a similarity calculation unit for comparing each word combination with the input sentence to calculate a similarity between the input sentence and the correct answer sentence; and
a score providing unit for calculating a score for the input sentence based on the calculated similarity,
wherein the word combination generation unit
encodes each word constituting the correct answer sentence such that identical words have the same code value, and then generates the at least one word combination using the plurality of encoded words included in the correct answer sentence.