TW579470B - Chinese article evaluation method and system and computer reading medium - Google Patents

Chinese article evaluation method and system and computer reading medium

Info

Publication number
TW579470B
Authority
TW
Taiwan
Prior art keywords
article
chinese
word
patent application
scope
Prior art date
Application number
TW91116800A
Other languages
Chinese (zh)
Inventor
Wen-Chih Chen
Lu-Ping Chang
Yuei-Lin Chiang
I-Heng Meng
Original Assignee
Inst Information Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inst Information Industry
Priority to TW91116800A
Application granted
Publication of TW579470B

Landscapes

  • Machine Translation (AREA)

Abstract

A Chinese article evaluation method first performs word segmentation and part-of-speech tagging on a Chinese article according to a Chinese corpus/lexicon, determines whether each word is a typo, and counts the typos in the article. The article is then divided into sentences, each sentence is grammatically analyzed against a Chinese structure tree, and the number of ungrammatical sentences is counted. A quantitative "substance" score is computed from the number of typos and the number of ungrammatical sentences. Next, the completeness of the article's structure is assessed according to an article structure ontology to obtain a quantitative "order" score, and the article's writing techniques are assessed according to an article technique ontology to obtain a quantitative "flavor" score. Finally, the evaluation score of the article is determined from the substance, order, and flavor scores.

Description

The present invention relates to a method and system for evaluating Chinese articles, and in particular to a method and system that combine a Chinese corpus/lexicon, Chinese structure tree matching, and ontology-based knowledge representation to evaluate Chinese articles automatically.

At present, Chinese articles such as Chinese compositions are mostly evaluated by manual scoring, which takes a great deal of manpower and time. Moreover, a large batch of compositions cannot be graded by one person alone, and when several people grade different articles, each grader applies a somewhat different standard, which easily leads to unfair evaluation.

In view of this, the main object of the invention is to provide a Chinese article evaluation method and system that combine a Chinese lexicon and corpus, Chinese structure tree matching, and ontology-based knowledge representation, so that a Chinese article is automatically evaluated on three criteria: "having substance" (言之有物), "having order" (言之有序), and "having flavor" (言之有味).
The above object can be achieved by the Chinese article evaluation method and system provided by the invention.

In the Chinese article evaluation method according to an embodiment of the invention, the Chinese article is first segmented into words and tagged with parts of speech according to a Chinese corpus/lexicon. Each word is checked to determine whether it is a typo, and the number of typos in the article is counted.

Next, the article is divided into sentences, each sentence is grammatically analyzed against a Chinese structure tree, and the number of ungrammatical sentences in the article is counted.

0213-8280TWF(N);Yianhou.ptd

A quantitative "substance" score is then computed from the number of typos and the number of ungrammatical sentences.

Next, the completeness of the article's structure is judged according to an article structure ontology, which yields a quantitative "order" score, and the article's writing techniques are judged according to an article technique ontology, which yields a quantitative "flavor" score.

Finally, the evaluation score of the article is determined from the substance, order, and flavor scores.

The Chinese article evaluation system according to an embodiment of the invention includes a data storage device and a processor. The data storage device holds a Chinese corpus/lexicon, a Chinese structure tree, an article structure ontology that records the article structure concepts of articles and the relations among those concepts, and an article technique ontology that records the article technique concepts of articles and the relations among those concepts.
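The four resources held by the data storage device can be pictured as simple in-memory structures. The following is only a minimal sketch: the class names `Ontology` and `KnowledgeStore`, the part-of-speech tags `N` and `V`, the `precedes` relation, and the sample entries are all hypothetical illustrations and are not taken from the patent.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Concepts plus (subject, relation, object) links between them (assumed layout)."""
    concepts: set[str] = field(default_factory=set)
    relations: set[tuple[str, str, str]] = field(default_factory=set)


@dataclass
class KnowledgeStore:
    """Hypothetical container for the four resources in the data storage device."""
    lexicon: dict[str, str]                # word -> part of speech
    structure_tree: set[tuple[str, ...]]   # allowed part-of-speech sequences
    structure_ontology: Ontology
    technique_ontology: Ontology


store = KnowledgeStore(
    lexicon={"我們": "N", "喜歡": "V", "音樂": "N"},   # sample entries, not real corpus data
    structure_tree={("N", "V", "N")},
    structure_ontology=Ontology(
        concepts={"起", "承", "轉", "合"},
        relations={("起", "precedes", "承"), ("承", "precedes", "轉"), ("轉", "precedes", "合")},
    ),
    technique_ontology=Ontology(concepts={"擬人法", "譬喻法", "層遞法", "映襯法", "暗喻法"}),
)
print(sorted(store.lexicon))
```

Keeping the ontologies as concept sets plus relation triples mirrors the description of "concepts and the relations among them" without committing to any particular ontology language.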
The processor is coupled to the data storage device. It segments a Chinese article into words and tags parts of speech according to the Chinese corpus/lexicon, obtaining a plurality of words that each contain at least one character and each have a corresponding part of speech. Based on the number of characters in each word, the processor determines whether the word is a typo, and it counts the typos in the article.

The processor then divides the article into sentences at punctuation marks and grammatically analyzes each sentence against the Chinese structure tree to determine whether the sentence is grammatical, counting the ungrammatical sentences in the article. A quantitative "substance" score is computed from the number of typos and the number of ungrammatical sentences.

Next, the processor judges the completeness of the article's structure according to the article structure ontology to obtain a quantitative "order" score, and judges the article's writing techniques according to the article technique ontology to obtain a quantitative "flavor" score. Finally, the processor determines the evaluation score of the article from the substance, order, and flavor scores.

Embodiment

Figure 1 is a schematic diagram showing the system architecture of the Chinese article evaluation system according to an embodiment of the invention.

The Chinese article evaluation system according to an embodiment of the invention includes a data storage device 100 and a processor 110 coupled to the data storage device 100. The data storage device 100 holds a Chinese corpus/lexicon 101, a Chinese structure tree 102, an article structure ontology 103 that records the article structure concepts of articles and the relations among those concepts, and an article technique ontology 104 that records the article technique concepts of articles and the relations among those concepts. The article structure concepts in the article structure ontology 103 include opening (起), development (承), turn (轉), and conclusion (合). The article technique concepts in the article technique ontology 104 include personification (擬人法), simile (譬喻法), gradation (層遞法), contrast (映襯法), and metaphor (暗喻法).

The Chinese corpus/lexicon 101 may be the "Balanced Corpus" and "Chinese Lexicon" produced by the CKIP lexicon group at Academia Sinica, which contain a large number of words and sentences together with the part of speech of each word. The Chinese structure tree 102 may be the "Chinese Structure Tree" produced by the same CKIP group, which contains the grammatical part-of-speech combinations of Chinese sentences.

The article structure ontology 103 and the article technique ontology 104 may be knowledge systems built according to the Resource Description Framework (RDF), the Knowledge Interchange Format (KIF), or the DAML+OIL (DARPA Agent Markup Language + Ontology Inference Layer) standard. The article structure ontology 103 is an ontology-based knowledge representation that lets a machine automatically judge whether the structure of an article is complete. Similarly, the article technique ontology 104 is an ontology-based knowledge representation that lets a machine automatically judge the knowledge cognition and writing techniques of an article. The ontologies 103 and 104 are built in much the same way as general ontologies, so their construction is not detailed here.

The processor 110 can segment the Chinese article into words and tag parts of speech according to the Chinese corpus/lexicon 101, obtaining a plurality of words; the part of speech of each word is obtained by looking the word up in the Chinese corpus/lexicon 101.
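The description does not say which segmentation algorithm is used. As one possible illustration, the sketch below uses forward maximum matching against a small lexicon; the function name `segment`, the `max_len` parameter, the `UNK` tag, and the sample lexicon are assumptions made for this example only.

```python
from __future__ import annotations


def segment(text: str, lexicon: dict[str, str], max_len: int = 4) -> list[tuple[str, str]]:
    """Split text into (word, part-of-speech) pairs by forward maximum matching.

    Assumption: the longest lexicon entry starting at the current position wins;
    a character that starts no known word becomes a one-character word tagged 'UNK'.
    """
    result = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                # Part of speech comes from a lexicon lookup, as in the description.
                result.append((candidate, lexicon.get(candidate, "UNK")))
                i += size
                break
    return result


# Hypothetical lexicon entries with made-up tags.
LEXICON = {"我們": "N", "喜歡": "V", "音樂": "N"}
print(segment("我們喜歡音樂", LEXICON))
# [('我們', 'N'), ('喜歡', 'V'), ('音樂', 'N')]
```

With this scheme a miswritten character tends to break a known word into one-character pieces, which is what the character-count rule for typos relies on.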
The processor 110 determines whether each word is a typo from the number of characters the word contains, and counts the typos in the article. Because words in practice usually contain two or more characters, the processor 110 judges a segmented word that contains only one character to be a typo.

The processor 110 then divides the article into sentences at punctuation marks such as commas and/or periods, and grammatically analyzes each sentence against the Chinese structure tree 102 to determine whether the sentence is grammatical, counting the ungrammatical sentences in the article.
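The two rules above (a one-character word counts as a typo, and sentences are cut at commas and periods) are simple enough to sketch directly. The function names `count_typos` and `split_sentences` are hypothetical, and the choice of the full-width marks `,` and `。` as delimiters is an assumption about the input text.

```python
from __future__ import annotations
import re


def count_typos(words: list[str]) -> int:
    """Count segmented words that consist of a single character."""
    return sum(1 for word in words if len(word) == 1)


def split_sentences(text: str) -> list[str]:
    """Split an article at full-width commas and periods, dropping empty pieces."""
    parts = re.split(r"[,。]", text)
    return [part.strip() for part in parts if part.strip()]


print(count_typos(["我們", "喜", "觀", "音樂"]))
# 2
print(split_sentences("我們喜歡音樂,他們喜歡運動。"))
# ['我們喜歡音樂', '他們喜歡運動']
```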

The processor 110 then computes a quantitative "substance" score from the number of typos and the number of ungrammatical sentences.

Next, the processor 110 judges the completeness of the article's structure according to the article structure ontology 103 to obtain a quantitative "order" score, and judges the article's writing techniques according to the article technique ontology 104 to obtain a quantitative "flavor" score. As noted above, the article structure ontology 103 is an ontology-based knowledge representation that lets a machine automatically judge whether the structure of an article is complete, and the article technique ontology 104 is an ontology-based knowledge representation that lets a machine automatically judge the knowledge cognition and writing techniques of an article.

The processor 110 can then determine the evaluation score of the article from the substance, order, and flavor scores.

Figure 2 is a flowchart showing the operation of the Chinese article evaluation method according to an embodiment of the invention.
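The description states that the evaluation score is determined from the three quantitative scores but gives no formula. Purely as an illustration, the sketch below combines them as a weighted average; the function name `evaluation_score`, the equal default weights, and the 0 to 100 scale are assumptions, not part of the patent.

```python
from __future__ import annotations


def evaluation_score(substance: float, order: float, flavor: float,
                     weights: tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Combine the three scores as a weighted average (assumed formula)."""
    w_substance, w_order, w_flavor = weights
    if abs(w_substance + w_order + w_flavor - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_substance * substance + w_order * order + w_flavor * flavor


# Example with hypothetical weights that favour substance over flavor.
print(round(evaluation_score(80, 70, 60, weights=(0.5, 0.3, 0.2)), 2))
```

A grader who cares more about correct wording than about rhetorical devices could raise the first weight, which is the kind of policy choice the source leaves open.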
In the Chinese article evaluation method according to an embodiment of the invention, a Chinese article is first received in step S200. In step S201, the article is segmented into words and tagged with parts of speech according to the Chinese corpus/lexicon, obtaining a plurality of words; the part of speech of each word is obtained by looking the word up in the corpus/lexicon. The Chinese corpus/lexicon may be the "Balanced Corpus" and "Chinese Lexicon" produced by the CKIP lexicon group at Academia Sinica, which contain a large number of words and sentences together with the part of speech of each word.

Next, in step S202, each word is checked to determine whether it is a typo, and the number of typos in the article is counted.

A segmented word that contains only one character is judged to be a typo.

Next, in step S203, the article is divided into sentences at commas and/or periods. In step S204, the parts of speech of the words in a sentence are compared with the grammatical part-of-speech combinations built into the Chinese structure tree in order to analyze the sentence grammatically. The Chinese structure tree may be the "Chinese Structure Tree" produced by the CKIP lexicon group at Academia Sinica, which contains the grammatical part-of-speech combinations of Chinese sentences.

Then, in step S205, it is determined whether the sentence is ungrammatical.
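Steps S204 and S205 amount to checking a sentence's part-of-speech sequence against the combinations stored in the structure tree. The sketch below assumes, for illustration only, that the tree can be flattened into a set of allowed tag sequences; the tags `N` and `V`, the sample patterns, and the function name `is_grammatical` are hypothetical.

```python
from __future__ import annotations

# Hypothetical structure tree: each entry is one allowed part-of-speech sequence.
STRUCTURE_TREE = {("N", "V", "N"), ("N", "V")}


def is_grammatical(tagged_sentence: list[tuple[str, str]],
                   structure_tree: set[tuple[str, ...]] = STRUCTURE_TREE) -> bool:
    """Compare the sentence's part-of-speech sequence with the stored combinations."""
    pos_sequence = tuple(pos for _word, pos in tagged_sentence)
    return pos_sequence in structure_tree


print(is_grammatical([("我們", "N"), ("喜歡", "V"), ("音樂", "N")]))  # True
print(is_grammatical([("喜歡", "V"), ("喜歡", "V")]))                # False
```

An exact-match lookup is the simplest reading of "comparison"; a real treebank would more likely need structural or partial matching, which is outside what the text specifies.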
If the sentence is ungrammatical (yes in step S205), the count of ungrammatical sentences in the article is incremented in step S206. If the sentence is grammatical (no in step S205), or after the count has been incremented in step S206, step S207 determines whether all sentences have been grammatically analyzed. If some sentences have not yet been analyzed (no in step S207), the flow returns to step S204 to analyze the next sentence. Once all sentences have been analyzed (yes in step S207), step S208 computes the quantitative "substance" score from the number of typos and the number of ungrammatical sentences.

Next, in step S209, the completeness of the article's structure is judged according to the article structure ontology to obtain the quantitative "order" score, and in step S210 the article's writing techniques are judged according to the article technique ontology to obtain the quantitative "flavor" score. As before, the article structure ontology and the article technique ontology may be knowledge systems built according to the Resource Description Framework (RDF), the Knowledge Interchange Format (KIF), or the DAML+OIL standard.
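The loop over steps S204 to S207 and the scoring in step S208 can be sketched as follows. The counting loop follows the flow described above; the scoring formula (start from 100 and subtract a fixed penalty per typo and per ungrammatical sentence) is an assumption, since the source gives no formula, and the names `count_ungrammatical`, `substance_score`, and both penalty parameters are hypothetical.

```python
from __future__ import annotations


def count_ungrammatical(tagged_sentences: list[list[tuple[str, str]]],
                        structure_tree: set[tuple[str, ...]]) -> int:
    """Steps S204 to S207: visit every sentence and count the ungrammatical ones."""
    bad = 0
    for sentence in tagged_sentences:                  # S207 decides whether to continue
        pos_sequence = tuple(pos for _word, pos in sentence)   # S204: grammar analysis
        if pos_sequence not in structure_tree:         # S205: ungrammatical?
            bad += 1                                   # S206: increment the count
    return bad


def substance_score(typo_count: int, bad_sentence_count: int,
                    typo_penalty: float = 2.0, sentence_penalty: float = 5.0) -> float:
    """Step S208 with an assumed penalty formula, floored at zero."""
    return max(0.0, 100.0 - typo_penalty * typo_count - sentence_penalty * bad_sentence_count)


tree = {("N", "V", "N")}   # hypothetical part-of-speech pattern
article = [
    [("我們", "N"), ("喜歡", "V"), ("音樂", "N")],
    [("喜歡", "V"), ("喜歡", "V")],
]
bad = count_ungrammatical(article, tree)
print(bad, substance_score(typo_count=2, bad_sentence_count=bad))
# 1 91.0
```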

The article structure ontology is an ontology-based knowledge representation that lets a machine automatically judge whether the structure of an article is complete. Similarly, the article technique ontology is an ontology-based knowledge representation that lets a machine automatically judge the knowledge cognition and writing techniques of an article. The article structure ontology contains structure concepts such as opening (起), development (承), turn (轉), and conclusion (合), and the article technique ontology contains technique concepts such as personification (擬人法), simile (譬喻法), gradation (層遞法), contrast (映襯法), and metaphor (暗喻法).

Finally, in step S211, the evaluation score of the article is determined from the substance, order, and flavor scores.

In addition, according to another aspect of the invention, a computer program encoded on a computer-readable medium can enable the Chinese article evaluation described in the embodiments of the invention.
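The structure completeness judgment that yields the "order" score can be illustrated with the four structure concepts named above. This sketch assumes that each paragraph has already been labelled with one structure concept (the source does not say how that labelling is done) and scores the share of concepts that appear in the expected sequence; the function name `order_score` and the 0 to 100 scale are hypothetical.

```python
from __future__ import annotations

# The four structure concepts from the article structure ontology, in expected order.
STRUCTURE_CONCEPTS = ("起", "承", "轉", "合")


def order_score(paragraph_roles: list[str]) -> float:
    """Share of structure concepts found in the expected order, scaled to 0-100.

    Assumption: paragraph_roles holds one concept label per paragraph, in article order.
    """
    matched = 0
    position = 0
    for concept in STRUCTURE_CONCEPTS:
        try:
            # Only look at paragraphs after the previously matched concept.
            position = paragraph_roles.index(concept, position) + 1
            matched += 1
        except ValueError:
            continue   # concept missing or out of order
    return 100.0 * matched / len(STRUCTURE_CONCEPTS)


print(order_score(["起", "承", "轉", "合"]))  # 100.0
print(order_score(["起", "承", "合"]))        # 75.0
```

An article that opens, develops, and concludes but never "turns" therefore loses a quarter of the order score under this assumed scheme.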
With the Chinese article evaluation method and system proposed by the invention, a Chinese lexicon and corpus, Chinese structure tree matching, and ontology-based knowledge representation can therefore be combined to evaluate a Chinese article automatically on "having substance", "having order", and "having flavor". This saves a great deal of manpower and time and ensures that every article is scored against the same standard.

Although the invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Anyone skilled in the art may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the appended claims.

Brief description of the drawings

To make the above objects, features, and advantages of the invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings:

Figure 1 is a schematic diagram showing the system architecture of the Chinese article evaluation system according to an embodiment of the invention.

Figure 2 is a flowchart showing the operation of the Chinese article evaluation method according to an embodiment of the invention.

Explanation of symbols

  • 100: data storage device
  • 101: Chinese corpus/lexicon
  • 102: Chinese structure tree
  • 103: article structure ontology
  • 104: article technique ontology
  • 110: processor
  • S200, ..., S211: operation steps

Claims (20)

579470 六、申請專利範圍 1. 一種中 一資料儲 構樹、一文章 結構概念與該 體庫,用以紀 技巧概念間之 一處理器 文文章評量系統,包括: 存裝置,具有一中文語料/詞庫、一中文結 結構本體庫,用以紀錄相應文章之複數文章 等文章結構概念間之關聯、與一文章技巧本 錄相應文章之複數文章技巧概念與該等文章 關聯;以及 ,耦接至該資料儲存裝置,致使進行下列步 依據該中 性標註 具有相 依 否為一 依 依 從而得 中之不 依 計算一 依 完整性 依 判斷, 依 ,從而 應之一 據每一 錯別字 據標點 據該中 知每一 符合文 據該錯 言之有 據該文 判斷, 據該文 從而得 據該言 文語料/詞庫 得到複數包含 詞性; 該等詞所包含 ,並計算該中 符號,將該中 文結構樹’將 該等句子是否 法之該等句子 別字的數目與 物量化評分, 章結構本體庫 從而得到相應 章技巧本體庫 將一中文文章進行斷詞與詞 至少一字之詞,且每一該等詞 字的數目,判斷每一該等詞是 文文章中之該錯別字的數目; 文文章分割為複數句子; 每一該等句子進行文法分析, 符合文法,並計算該中文文章 的數目; 不符合文法之該等句子的數目 ,將該中文文章進行文章架構 之一言之有序量化評分; ,將該中文文章進行文章技巧 到相應之一言之有味量化評分;以及 之有物量化評分、該言之有序量化評分與該579470 VI. Scope of patent application 1. A S1 data storage structure tree, an article structure concept and the body library, a processor for evaluating articles and texts, including: a storage device with a Chinese language Material / thesaurus, a Chinese knot structure ontology library, used to record the relationship between the article structure concepts such as plural articles of the corresponding article, and the article article concepts of the corresponding article in this record are related to these articles; and, Connected to the data storage device, causing the following steps to be performed according to the neutral labeling, whether the dependency is a dependency or not, and the obtained non-dependency is calculated, the completeness is based on judgment, and the response is based on each typo, Knowing that each mismatch is based on the judgment of the text, and based on the text, the plural corpus of part-of-speech can be obtained from the corpus / thesaurus of the language; the words are included, the Chinese symbol is calculated, and the Chinese structure is The tree 'quantifies the number of sentences and the number of words in these sentences, and the chapter structure ontology library is obtained accordingly. 
The skill ontology library performs word segmentation and at least one word in a Chinese article, and the number of each of these words determines whether each of these words is the number of typos in the article; the article is divided into plural sentences Grammatical analysis of each such sentence, consistent with the grammar, and calculating the number of Chinese articles; number of such sentences that do not conform to the grammar, the Chinese article is subjected to an orderly quantified score in the article structure; Chinese articles carry the article skills to the corresponding quantified scoring; and the quantified tangible scoring, the ordered quantified scoring and the 0213-8280TWF(N);Yianhou.ptd 第12頁 579470 六、申請專利範圍 言之有味量化評分,決定相應該中文文章之绰曰、 早〈冲夏分數。 2 ·如申請專利範圍第1項所述之中文文章嘴b / &gt; 八平汁5:系統, 其中該文章結構本體庫與該文章技巧本體庫係遵彳盾資源描 述架構(Resource Description Framework,rdf )、挖 準ο0213-8280TWF (N); Yianhou.ptd Page 12 579470 VI. Scope of patent application The quantified scoring of the wording determines the nickname, early <Chongxia score of the corresponding Chinese article. 2 · Chinese article mouth b / &gt; as described in item 1 of the scope of the patent application, & Happing Juice 5: system, where the article structure ontology library and the article skills ontology library are in accordance with the Resource Description Framework (Resource Description Framework, rdf) 3 ·如申請專利範圍第1項所述之中文文章評量系統, 其中該文章結構本體庫與該文章技巧本體庫係遵|循知識交 換格式(Knowledge Interchange Format,ΚΙ F)標準 〇 4·如申請專利範圍第1項所述之中文文章評^系統, 其中該文章結構本體庫與該文章技巧本體庫係遵循' DAML+OILCDARPA Agent Markup Language + Ontology Inference Layer)標準 〇 5 ·如申請專利範圍第1項所述之中文文章評量系統, 其中該處理器係判斷該詞是否僅包含一個字,若該詞僅包 含一個字,則判斷該詞為一錯別字。 6·如申請專利範圍第1項所述之中文文章評量系統, 其中該處理器係依據逗點與/或句點,將該中文文章分割 為複數句子。 7·如申請專利範圍第1項所述之中文文章評量系統,3 · The Chinese article evaluation system described in item 1 of the scope of patent application, wherein the article structure ontology library and the article skill ontology library comply with the Knowledge Interchange Format (KIF) standard. 
The Chinese article review system described in item 1 of the scope of patent application, wherein the article structure ontology library and the article skill ontology library follow the 'DAML + OILCDARPA Agent Markup Language + Ontology Inference Layer) standard. The Chinese article evaluation system according to item 1, wherein the processor determines whether the word contains only one word, and if the word contains only one word, determines that the word is a typo. 6. The Chinese article evaluation system according to item 1 of the scope of patent application, wherein the processor divides the Chinese article into plural sentences based on commas and / or periods. 7. The Chinese article evaluation system described in item 1 of the scope of patent application, 其中該處理器係依據將該句子中該等詞之詞性與該中文結 構樹中内建之文法詞性組合進行比對,以列斷該^子是 符合文法。 8 ·如申請專利範圍第1項所述之中文文章評量系統, 其中該文章結構本體庫中之該等文章結構概念包^含起結構The processor compares the part-of-speech of the words in the sentence with the built-in grammatical part-of-speech combination in the Chinese structure tree to determine whether the ^ is in line with the grammar. 8 · The Chinese article evaluation system described in item 1 of the scope of patent application, wherein the article structure concept library in the article structure ontology library contains the structure 0213-8280TWF(N);Yianhou.ptd 第13頁 579470 六、申請專利範圍 - 概念、承結構概念、轉結構概念、與合結構概念。 9 ·如申請專利範圍第1項所述之中文文章評量系統, 其中該文章技巧本體庫中之該等文章技巧概念包含擬人法 概念、譬喻法概念、層遞法概念、映襯法概念、與暗喻法 . 概念。 1 0 · —種中文文章評量方法,包括下列步驟: * 依據一中文語料/詞庫,將一中文文章進行斷詞與詞 性標註,從而得到複數包含至少一字之詞,且每一該等詞 具有相應之一詞性; 依據每一該等詞所包含字的數目,判斷每一該等詞是 _ 否為一錯別字,並計算該中文文章中之該錯別字的數目; _ 依據標點符號,將該中文文章分割為複數句子; 依據一中文結構樹,將每一該等句子進行文法分析, 從而得知每一該等句子是否符合文法,並計算該中文文章 中之不符合文法之該等句子的數目; 依據該錯別字的數目與不符合文法之該等句子的數目 計算一言之有物量化評分; 依據一文章結構本體庫’將該中文文章進行文章架構 完整性判斷,從而得到相應之一言之有序量化評分; 依據一文章技巧本體庫,將該中文文章進行文章技巧Is 判斷,從而得到相應之一言之有味量化評分;以及 依據該言之有物量化評分、該言之有序量化評分與該 言之有味量化評分,決定相應該中文文章之評量分數。 11.如申請專利範圍第1項所述之中文文章評量方法,0213-8280TWF (N); Yianhou.ptd Page 13 579470 6. 
Scope of Patent Application-Concept, concept of bearing structure, concept of transfer structure, and concept of joint structure. 9 · The Chinese article evaluation system described in item 1 of the scope of patent application, wherein the article technique concepts in the article technique ontology library include anthropomorphic concepts, metaphorical concepts, layering concepts, mapping concepts, and Metaphor. Concept. 1 0 · —A method for evaluating Chinese articles, including the following steps: * According to a Chinese corpus / thesaurus, perform word segmentation and part-of-speech tagging on a Chinese article to obtain plural words containing at least one word, and each Equal words have a corresponding part of speech; according to the number of words contained in each such word, determine whether each of these words is a typo, and calculate the number of typos in the Chinese article; _ based on punctuation, Divide the Chinese article into plural sentences; Perform a grammatical analysis of each of these sentences according to a Chinese structure tree, so as to know whether each of these sentences conforms to the grammar, and calculate the non-grammatical ones in the Chinese article. Number of sentences; Quantitative quantified scores based on the number of typos and the number of sentences that do not conform to the grammar; The Chinese article is judged on the integrity of the article structure based on an article structure ontology library to obtain the corresponding In a word, orderly quantified scoring; According to an article skill ontology library, the Chinese article is judged by the article skill Is to get the corresponding A word flavored quantitative score; and a quantitative score based on the substance in speech, the words orderly quantitative score with the words flavored quantitative score, assessment should determine the relative fraction of the Chinese article. 11. 
The Chinese article evaluation method described in item 1 of the scope of patent application, 0213-8280TWF(N);Yianhou.ptd 第14頁 579470 六、申請專利範圍 其中該文章結構本體庫與該文章技巧本體庫係遵循資源描 述架構(Resource Description Framework,RDF )標 準。 1 2 ·如申請專利範圍第1項所述之中文文章評量方法, 其中該文章結構本體庫與該文章技巧本體庫係遵循知識交 換格式(Knowledge Interchange Format,KIF)標準。0213-8280TWF (N); Yianhou.ptd Page 14 579470 6. Scope of patent application The article structure ontology library and article technique ontology library follow the Resource Description Framework (RDF) standard. 1 2 · The Chinese article evaluation method described in item 1 of the scope of patent application, wherein the article structure ontology library and the article skill ontology library follow the Knowledge Interchange Format (KIF) standard. 1 3 ·如申請專利範圍第1項所述之中文文章評量方法, 其中該文章結構本體庫與該文章技巧本體庫係遵循 DAML+0IL(DARPA Agent Markup Language + Ontology Inference Layer)標準 ° 1 4 ·如申請專利範圍第i項所述之中文文章評量方法, 其中該處理器係判斷該詞是否僅包含一個字,若該詞僅包 含一個字,則判斷該詞為一錯別字。 1 5·如申請專利範圍第丨項所述之中文文章評量方法, 其中該處理器係依據逗點與/或句點,將該中文文章分割 為複數句子。 16.如申請專利範圍第丨項所述之中文文章評量方法, 其中該處理器係依據將該句子中該等詞之詞性盥該中文社 構樹中内建之文法詞性組合進行比對,以判斷該^ 符合文法。1 3 · The Chinese article evaluation method described in item 1 of the scope of patent application, wherein the article structure ontology library and the article technique ontology library follow the DAML + 0IL (DARPA Agent Markup Language + Ontology Inference Layer) standard ° 1 4 · The Chinese article evaluation method described in item i of the patent application scope, wherein the processor judges whether the word contains only one word, and if the word contains only one word, judges the word as a typo. 15. The method for evaluating a Chinese article as described in item 丨 of the patent application scope, wherein the processor divides the Chinese article into plural sentences based on commas and / or periods. 16. 
The Chinese article evaluation method described in item 10 of the scope of patent application, wherein the processor compares the parts of speech of the words in the sentence with the grammatical part-of-speech combinations built into the Chinese structure tree, to determine whether the sentence conforms to the grammar.
17. The Chinese article evaluation method described in item 10 of the scope of patent application, wherein the article structure ontology library includes an opening (起) structure concept, a development (承) structure concept, a turning (轉) structure concept, and a concluding (合) structure concept.
18. The Chinese article evaluation method described in item 10 of the scope of patent application, wherein the article technique ontology library includes a personification concept, an analogy concept, a gradation concept, a contrast concept, and a metaphor concept.
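The sentence-splitting and part-of-speech comparison steps in the dependent method claims can be illustrated with a minimal Python sketch. It assumes the article has already been segmented and tagged by some external tool; the tag names and the `ALLOWED_POS_PATTERNS` set are hypothetical stand-ins for the combinations built into the Chinese structure tree, which the claims do not enumerate.

```python
import re

# Hypothetical stand-in for the part-of-speech combinations built into the
# Chinese structure tree; the claims do not list the actual combinations.
ALLOWED_POS_PATTERNS = {
    ("N", "V"),
    ("N", "V", "N"),
    ("N", "ADV", "V", "N"),
}


def split_sentences(article):
    """Split an article on commas and/or periods (full-width and ASCII)."""
    parts = re.split(r"[，。,.]", article)
    return [p.strip() for p in parts if p.strip()]


def is_grammatical(tagged_sentence):
    """Check a sentence given as a list of (word, part_of_speech) pairs.

    The sentence is accepted only if its part-of-speech sequence matches
    one of the allowed combinations exactly.
    """
    pos_sequence = tuple(pos for _, pos in tagged_sentence)
    return pos_sequence in ALLOWED_POS_PATTERNS


def count_ungrammatical(tagged_sentences):
    """Count the sentences whose tag sequence matches no allowed pattern."""
    return sum(1 for s in tagged_sentences if not is_grammatical(s))
```

For example, `split_sentences("我喜歡貓，貓喜歡魚。")` yields two sentences, and a sentence tagged `N V N` passes the check while one tagged `V N` does not. Exact sequence matching is the simplest possible reading of "comparing with built-in combinations"; a real structure tree would presumably match nested phrase structures rather than flat tag tuples.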
19. A computer-readable medium encoding a computer program to enable Chinese article evaluation, comprising the following steps:
performing word segmentation and part-of-speech tagging on a Chinese article according to a Chinese corpus/lexicon, thereby obtaining a plurality of words each containing at least one character, each of the words having a corresponding part of speech;
determining, according to the number of characters contained in each of the words, whether each of the words is a typo (wrongly written character), and calculating the number of typos in the Chinese article;
dividing the Chinese article into a plurality of sentences according to punctuation marks;
performing grammatical analysis on each of the sentences according to a Chinese structure tree, thereby determining whether each of the sentences conforms to the grammar, and calculating the number of sentences in the Chinese article that do not conform to the grammar;
calculating a substance (言之有物) quantitative score according to the number of typos and the number of sentences that do not conform to the grammar;
judging the completeness of the article structure of the Chinese article according to an article structure ontology library, thereby obtaining a corresponding orderliness (言之有序) quantitative score;
judging the article techniques of the Chinese article according to an article technique ontology library, thereby obtaining a corresponding flavor (言之有味) quantitative score; and
determining an evaluation score corresponding to the Chinese article according to the substance quantitative score, the orderliness quantitative score, and the flavor quantitative score.
20. The computer-readable medium described in item 19 of the scope of patent application, wherein the processor determines whether the word contains only one character, and if the word contains only one character, determines that the word is a typo.
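As a rough illustration of the typo count and the score combination recited above, the following Python sketch applies the single-character rule literally and then combines the three quantitative scores. The claims do not state any formula, so the equal penalty weights in `substance_score` and the plain average in `evaluation_score` are assumptions made only for this example.

```python
def count_typos(words):
    """Count segmented words that consist of exactly one character.

    This follows the claim literally: a word with only one character is
    treated as a typo.
    """
    return sum(1 for w in words if len(w) == 1)


def substance_score(typos, total_words, bad_sentences, total_sentences):
    """Hypothetical substance score on a 0-100 scale.

    Assumption: the typo rate and the ungrammatical-sentence rate are
    penalised equally. The claims only say the score depends on both counts.
    """
    typo_rate = typos / total_words if total_words else 0.0
    bad_rate = bad_sentences / total_sentences if total_sentences else 0.0
    score = 100.0 * (1.0 - 0.5 * typo_rate - 0.5 * bad_rate)
    return max(0.0, min(100.0, score))


def evaluation_score(substance, orderliness, flavor):
    """Hypothetical combination: the unweighted mean of the three scores."""
    return (substance + orderliness + flavor) / 3.0
```

With the segmented words `["今天", "天氣", "很", "好"]`, `count_typos` returns 2, which shows that a literal reading also flags legitimate single-character words; a practical system would presumably need some exception handling, which is not covered here.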
21. The computer-readable medium described in item 19 of the scope of patent application, wherein the processor divides the Chinese article into a plurality of sentences according to commas and/or periods.
22. The computer-readable medium described in item 19 of the scope of patent application, wherein the processor compares the parts of speech of the words in the sentence with the grammatical part-of-speech combinations built into the Chinese structure tree, to determine whether the sentence conforms to the grammar.
23. The computer-readable medium described in item 19 of the scope of patent application, wherein the article structure ontology library includes an opening (起) structure concept, a development (承) structure concept, a turning (轉) structure concept, and a concluding (合) structure concept.
24. The computer-readable medium described in item 19 of the scope of patent application, wherein the article technique ontology library includes a personification concept, an analogy concept, a gradation concept, a contrast concept, and a metaphor concept.
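One way to picture the structure and technique judgments is the Python sketch below. It assumes each paragraph has already been labelled with a structure concept and that rhetorical techniques have already been detected by some other component; how the ontology libraries perform that labelling is not described in the claims, so the label strings and both scoring functions are hypothetical.

```python
# Concept names taken from the claims; using them as plain string labels
# is an assumption of this sketch, not the ontology format itself.
STRUCTURE_CONCEPTS = ("起", "承", "轉", "合")
TECHNIQUE_CONCEPTS = ("擬人", "譬喻", "層遞", "映襯", "暗喻")


def structure_completeness(paragraph_labels):
    """Fraction of the four structure concepts present in the article."""
    present = [c for c in STRUCTURE_CONCEPTS if c in paragraph_labels]
    return len(present) / len(STRUCTURE_CONCEPTS)


def follows_order(paragraph_labels):
    """True if the four concepts occur in order as a subsequence."""
    remaining = iter(paragraph_labels)
    # 'concept in remaining' consumes the iterator up to the first match,
    # so each concept must appear after the previous one.
    return all(concept in remaining for concept in STRUCTURE_CONCEPTS)


def technique_coverage(detected_techniques):
    """Fraction of the five technique concepts detected at least once."""
    found = set(detected_techniques) & set(TECHNIQUE_CONCEPTS)
    return len(found) / len(TECHNIQUE_CONCEPTS)
```

An article labelled `["起", "承", "承", "轉", "合"]` is complete and in order, while `["起", "轉", "承", "合"]` contains all four concepts but fails the order check. Either value could be scaled to feed the orderliness and flavor scores, though the actual scaling is not given in the claims.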

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW91116800A TW579470B (en) 2002-07-26 2002-07-26 Chinese article evaluation method and system and computer reading medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW91116800A TW579470B (en) 2002-07-26 2002-07-26 Chinese article evaluation method and system and computer reading medium

Publications (1)

Publication Number Publication Date
TW579470B (en) 2004-03-11

Family

ID=32924101

Family Applications (1)

Application Number Title Priority Date Filing Date
TW91116800A TW579470B (en) 2002-07-26 2002-07-26 Chinese article evaluation method and system and computer reading medium

Country Status (1)

Country Link
TW (1) TW579470B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193805A (en) * 2017-06-06 2017-09-22 北京百度网讯科技有限公司 Article Valuation Method, device and storage medium based on artificial intelligence
CN107506360A (en) * 2016-06-14 2017-12-22 科大讯飞股份有限公司 A kind of essay grade method and system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506360A (en) * 2016-06-14 2017-12-22 科大讯飞股份有限公司 A kind of essay grade method and system
CN107506360B (en) * 2016-06-14 2020-09-11 科大讯飞股份有限公司 Article scoring method and system
CN107193805A (en) * 2017-06-06 2017-09-22 北京百度网讯科技有限公司 Article Valuation Method, device and storage medium based on artificial intelligence
US11481572B2 (en) 2017-06-06 2022-10-25 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for evaluating article value based on artificial intelligence, and storage medium

Similar Documents

Publication Publication Date Title
Sharma et al. Prediction of Indian election using sentiment analysis on Hindi Twitter
Over et al. DUC in context
Balahur et al. Computational approaches to subjectivity and sentiment analysis: Present and envisaged methods and applications
Bosco et al. Developing corpora for sentiment analysis: The case of irony and senti-tut
Balahur et al. Sentiment analysis system adaptation for multilingual processing: The case of tweets
Banea et al. Sense-level subjectivity in a multilingual setting
CA2807494C (en) Method and system for integrating web-based systems with local document processing applications
WO2020199600A1 (en) Sentiment polarity analysis method and related device
Brown et al. Mechanized margin to digitized center: black feminism's contributions to combatting erasure within the digital humanities
TW201220088A (en) Text conversion method and system
Sardinha An assessment of metaphor retrieval methods
Laparra et al. eXtended WordFrameNet.
Jain et al. Text independent root word identification in Hindi language using natural language processing
Balahur et al. Summarizing threads in blogs using opinion polarity
TW579470B (en) Chinese article evaluation method and system and computer reading medium
De Clercq et al. Rude waiter but mouthwatering pastries! An exploratory study into Dutch Aspect-Based Sentiment Analysis
Tonelli Semi-automatic techniques for extending the FrameNet lexical database to new languages
Sterckx et al. Assessing quality of unsupervised topics in song lyrics
Saini et al. EmoXract: Domain independent emotion mining model for unstructured data
Ye et al. Interpreting logical metonymy through dense paraphrasing
CN102999485A (en) Real emotion analyzing method based on public Chinese network text
Di Felippo et al. Applying lexical-conceptual knowledge for multilingual multi-document summarization
Lazarinis et al. Improving non-English web searching (iNEWS07)
Zuraw Allomorphs of French de in coordination: A reproducible study
Katz The Gothic Resultative: Non-agentive Verbs and Perfect Expression in Early Germanic

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent
MM4A Annulment or lapse of patent due to non-payment of fees