TH87129B - Automatic sentence segmentation system for languages without explicit sentence markers

Info

Publication number: TH87129B
Application number: TH602003101F
Authority: TH
Inventors: ตั้งเวียงวัง นายจิระ; ทรัพย์นิธิ นายเทพชัย; เรืองรจิตปกรณ์ นายธเนศ
Original assignee: สำนักงานพัฒนาวิทยาศาสตร์และเทคโนโลยีแห่งชาติ; แอสโซซิเอทเต็ต บริติช ฟูดส์ พีแอลซี
Filing date: 2010-03-25
Publication date: 2022-03-24

Abstract

An automatic sentence segmentation system for languages without explicit sentence markers. The invention comprises a mechanism for automatic sentence segmentation that supports languages lacking explicit cues to where a sentence begins and ends. It introduces two special mechanisms: a method for finding ambiguous sentence-marker cues, and a process for determining sentence boundaries from those ambiguous cues. Together these address the sentence segmentation problem in languages without explicit sentence markers. The segmentation decision combines linguistic rules with statistical values, so the resulting segmentation covers the capabilities of both approaches. Accuracy depends on the granularity of the linguistic rules supplied and on the amount of data given to the system to learn the statistics.
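
The abstract does not disclose concrete rules or a specific statistical model, so the following is only a minimal sketch of how a hybrid rule-plus-statistics decision of this kind might look, not the patented method. It assumes that a space token is the ambiguous cue, that rules are checked before the statistic, and that the statistic is a smoothed relative frequency over the tokens on either side of the cue. The rule sets, the placeholder tokens, the `<sp>` marker, the 0.5 threshold, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical rule sets with placeholder tokens; the abstract lists no concrete rules.
NEVER_BREAK_BEFORE = {"and", "but"}   # a cue followed by one of these is not a boundary
ALWAYS_BREAK_AFTER = {"krub", "ka"}   # a cue preceded by one of these is a boundary

SPACE = "<sp>"  # the ambiguous cue: a space that may or may not end a sentence


def train(examples):
    """Count how often a (prev, next) context around a cue was a real boundary.

    examples: iterable of (prev_token, next_token, is_boundary) triples.
    """
    seen, breaks = Counter(), Counter()
    for prev, nxt, boundary in examples:
        seen[(prev, nxt)] += 1
        if boundary:
            breaks[(prev, nxt)] += 1
    return seen, breaks


def boundary_probability(model, prev, nxt):
    """Add-one smoothed estimate; an unseen context yields exactly 0.5."""
    seen, breaks = model
    return (breaks[(prev, nxt)] + 1) / (seen[(prev, nxt)] + 2)


def is_boundary(model, prev, nxt, threshold=0.5):
    """Rules decide first; the statistic decides the remaining cases."""
    if nxt in NEVER_BREAK_BEFORE:
        return False
    if prev in ALWAYS_BREAK_AFTER:
        return True
    return boundary_probability(model, prev, nxt) > threshold


def segment(tokens, model):
    """Split a token list into sentences at cues judged to be boundaries."""
    sentences, current = [], []
    for i, tok in enumerate(tokens):
        if tok != SPACE:
            current.append(tok)
            continue
        prev = tokens[i - 1] if i > 0 else ""
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if current and is_boundary(model, prev, nxt):
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences
```

In this sketch, adding finer-grained entries to the rule sets or supplying more labelled triples to `train` changes the decisions, which mirrors the abstract's remark that accuracy depends on rule granularity and on the amount of training data.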

Claims

1. I hereby reserve the rights to the product design, which includes the shape and appearance of the container, as shown in the product design image which has been submitted herewith.