TWI834293B - Natural language processing method and its system and application - Google Patents

Publication number
TWI834293B
Authority
TW
Taiwan
Prior art keywords
symbol
natural language
language
signifier
pinyin
Prior art date
Application number
TW111135010A
Other languages
Chinese (zh)
Other versions
TW202414270A (en
Inventor
陳森淼
Original Assignee
陳森淼
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 陳森淼
Priority to TW111135010A (TWI834293B)
Priority to CN202311098858.1A (CN117709350A)
Application granted
Publication of TWI834293B
Publication of TW202414270A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/205 - Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A natural language processing method, comprising: comparing a natural language diachronically and/or synchronically to grasp the cognitive framework of that natural language, where the levels of the cognitive framework are roughly semantics > grammar > speech sounds > written language; revising the symbol signifier of the natural language, where the symbol signifier should comprise a phonetic signifier and a written-language signifier, and dividing the two such that symbol signified = phonetic signifier > written-language signifier; and formalizing the phonetic structure of the natural language through operations comprising custom symbol selection, a specific arrangement-and-combination design, and the matching of symbols and/or symbol groups to phonemes, so that each sound segment corresponds to a unique symbol and/or symbol group, the sounds of the symbol signifier conform to the phonetic sequence, and the pinyin (phonetic spelling) structure is a language form with segment-wise symbol construction that neither shifts nor repeats. The elements of the symbol set can also form an independent, self-consistent, and complete axiomatic state.

Description

Natural language processing method and its system and application

The present invention relates to a natural language processing method and its system and application, and in particular to a technical method for the formal axiomatic systematization of natural language symbols, together with the resulting artificial system and the application of its results.

Natural language processing (NLP) is a subfield of linguistics and artificial intelligence that studies how to process and use natural language. It involves many aspects and steps, essentially cognition, understanding, and generation. Natural language cognition and understanding let a computer turn input language into meaningful symbols and relations, which are then processed according to the purpose at hand to produce various applications.

In the prior art, natural language processing adopted machine learning algorithms in the late 1980s. Rather than spelling out every rule to the computer in a programming language, as was done in the 1950s, an algorithmic model is built so that the computer learns to find the specific patterns and trends contained in its training data. There are also corpora built on other, non-axiomatized symbol forms that look for rules by means of vectors; however, vector computation consumes a large amount of system resources and is not very effective, and the homogeneous translation that the present invention ultimately achieves cannot be accomplished by the ordinary set algorithms of a non-axiomatic system.

Take Chinese as an example, without limitation. The “phonetic structure” of “contemporary synchronic Chinese speech” is the “Zhuyin phonetic sequence”; the term refers to the phonetic structure itself, independent of differences in the symbols or script used. “Symbols” are then used to represent the sounds, forming a “pinyin (phonetic spelling) system”. Again taking Chinese as an example, the “written-language signified” adopted by today's Zhuyin symbols is outdated, lacks international generality, and is not made of the traditional symbol elements of computer science (the 26 Latin letters and the digits 0123456789), whereas Hanyu Pinyin departs from its own phonetic structure because of “symbol deformation” and/or “sound approximation”.

The natural language processing industry currently has no practice of axiomatizing the pinyin structure of language symbols. Under such a practice, however, the symbol elements could also be used to build a character corpus or to perform word-vector analysis.

The present invention provides a natural language processing method and its system and application that realize the formal axiomatic systematization of the pinyin structure of language symbols, comprising: comparing one or more natural languages diachronically and/or synchronically to grasp the natural-language cognitive framework that humans possess. In human “language cognition” there is a relative cognitive hierarchy between the “speech level” (sounds) and the “written level” (words): because humans interpret “written language” through “contemporary synchronic speech”, the speech level lies closer to the core of language cognition and constitutes the human “cognitive framework” for natural language. The cognitive hierarchy of this framework is found to be semantics (core) > grammar > speech sounds > written language.

Saussurean semiotics holds that a linguistic sign has both a signifier and a signified. From the standpoint of linguistics, Saussure pointed out that a “sign” consists of two parts: the “signifier”, which is a “sound-image”, and the “signified”, which is the “concept” linked to that sound-image. Every sign must have both; neither can be missing. The present invention integrates the knowledge of the language cognitive framework and remedies the shortcomings of Saussurean semiotics: the symbol signifier of a natural language should comprise a phonetic signifier and a written-language signifier, and the invention divides the two such that signified meaning of a natural language symbol = phonetic signifier > written-language signifier. The phonetic signifier is linear and auditory in nature; it unfolds only in time and has characteristics borrowed from time: (a) it represents a span, and (b) that span is measurable in a single dimension only: it is a line. This linear property of the phonetic signifier is the phonetic sequence. By observing the pinyin structure of a natural language's phonetic sequence and generalizing its natural-law characteristics, one may hypothesize the following axiom of the phonetic sequence: segment-wise symbol construction, non-shifting and non-repeating. “Segment-wise symbol construction”: for example, Mandarin Chinese has 37 phonemes (21 initials, 3 medials, and 13 finals) and 4 tones, arranged in order as four segments (initial, medial, final, and tone), and every Chinese character is pronounced by combining the segments in order from the initial segment to the tone segment. “Non-shifting and non-repeating”: for example, the four segments of Mandarin appear or are absent according to the needs of each individual character, but the order of the segments is fixed, and within the reading of any one character no phoneme or tone of the same segment occurs more than once.
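The two properties described above (fixed segment order, and at most one item per segment) can be sketched as a small validity check. This is only an illustrative sketch, not the patent's implementation: it assumes the 37 phonemes are the standard Zhuyin symbols and that the four tones are written as the digits 1 to 4; the names `SEGMENTS`, `segment_index`, and `is_well_formed` are hypothetical.

```python
# Assumption: the 21 initials, 3 medials and 13 finals are the Zhuyin symbols,
# and tones are encoded as the digits 1-4 (an illustrative choice).
INITIALS = set("ㄅㄆㄇㄈㄉㄊㄋㄌㄍㄎㄏㄐㄑㄒㄓㄔㄕㄖㄗㄘㄙ")  # 21 initials
MEDIALS = set("ㄧㄨㄩ")                                    # 3 medials
FINALS = set("ㄚㄛㄜㄝㄞㄟㄠㄡㄢㄣㄤㄥㄦ")                  # 13 finals
TONES = set("1234")                                        # 4 tones

# Segments in their fixed order: initial, medial, final, tone.
SEGMENTS = [INITIALS, MEDIALS, FINALS, TONES]


def segment_index(symbol):
    """Return the position (0-3) of the segment a symbol belongs to."""
    for index, segment in enumerate(SEGMENTS):
        if symbol in segment:
            return index
    raise ValueError(f"unknown symbol: {symbol!r}")


def is_well_formed(syllable):
    """Check fixed segment order (non-shifting) and no reuse of a segment (non-repeating)."""
    last = -1
    for symbol in syllable:
        index = segment_index(symbol)
        if index <= last:
            # Same segment again (repetition) or an earlier segment (shift).
            return False
        last = index
    return last >= 0  # an empty string is not a syllable
```

For instance, `is_well_formed("ㄓㄨㄥ1")` holds, while a reading that places the medial before the initial or uses two initials is rejected.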

Because the human “cognitive framework” for natural language resides at the “speech level”, the relation between the “signified meaning” and the “phonetic signifier” of a natural language is “random” and cannot be fully controlled by human design. The relation between the “phonetic signifier” and the “written-language signified”, however, is “arbitrary”. The present invention exploits this arbitrariness to convert the symbol elements of the written-language signifier and thereby realize the formal axiomatic systematization of the pinyin structure of language symbols, through operations comprising: formalization; the right to select the symbol elements of a custom language symbol set; the right to design the arrangement and combination of those symbol elements; display rules for symbol elements and/or symbol-element groups that conform to the natural-law characteristics of the phonetic sequence; and keeping each phoneme mapped to one symbol element and/or symbol-element group, which yields an artificial system that is recognizable and/or discriminable. As a result, the arrangement-and-combination state of the symbol elements and/or symbol-element groups of the written-language signifier of this natural language symbol set is forced to be unique, closed, and independent, and conforms to the natural-law characteristics of the phonetic sequence; its pinyin structure is a form of linguistic expression with segment-wise symbol construction, non-shifting and non-repeating. This is the axiomatization. The relation between the custom language symbol set of the invention and all the symbol elements and/or symbol-element groups within it is verified to be self-consistent, independent, and complete. Hence, in the convergent direction, “segment-wise symbol construction, non-shifting and non-repeating” is the axiom of the language symbol set; and because the whole of the language is built by repeatedly arranging and combining the symbol elements and/or symbol-element groups of that set, in the divergent direction it is the axiom of this natural language. This shows the hypothesized natural-language axiom to be true. The symbol elements and/or symbol-element groups of multiple natural language symbol sets can further be converted to be consistent, so that the sets become equal and their elements form a homogeneous common set, achieving multilingual conversion and translation.
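The requirement that each phoneme map to exactly one symbol element or symbol-element group can be illustrated with a simple one-to-one check and a round trip. The mapping below is a hypothetical placeholder (the text above does not disclose the actual symbol table), and the restriction to lowercase Latin letters and digits is an assumption based on the symbol elements mentioned earlier; `SYMBOLS`, `check_one_to_one`, `encode`, and `decode` are illustrative names.

```python
import string

# Assumption: symbol elements are drawn from the Latin letters and the digits.
ALLOWED = set(string.ascii_lowercase + string.digits)

# Hypothetical placeholder mapping, NOT the table used by the patent.
SYMBOLS = {"ㄓ": "zh", "ㄨ": "u", "ㄥ": "eng", "ㄇ": "m", "ㄚ": "a"}


def check_one_to_one(mapping):
    """True if every phoneme has its own non-empty symbol group over ALLOWED."""
    groups = list(mapping.values())
    if len(set(groups)) != len(groups):
        return False  # two phonemes share a symbol group
    return all(group and set(group) <= ALLOWED for group in groups)


def encode(phonemes, tone, mapping):
    """Map phonemes to symbol groups and close the syllable with its tone."""
    return [mapping[p] for p in phonemes] + [str(tone)]


def decode(groups, mapping):
    """Invert encode(); this is well defined only when the mapping is one-to-one."""
    inverse = {group: phoneme for phoneme, group in mapping.items()}
    return [inverse[g] for g in groups[:-1]], int(groups[-1])
```

Keeping the symbol groups as list items rather than one concatenated string sidesteps the separate question of how adjacent groups would be delimited in running text, which the sketch does not address.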

The natural language processing method of the invention serves as the hub linking the human side and the computer side. On the human side, the language symbols approximate pronunciation through sound approximation that is free of symbol-deformation rules, which meets the needs of human reading comprehension. On the computer side, the formal axiomatic systematization of the pinyin structure makes the language symbol set present a formal axiomatic system of segment-wise symbol construction, non-shifting and non-repeating, which makes human natural language easier for artificial-intelligence machines to understand. A system, in general, is a group of related individuals that operates according to certain rules and can accomplish work that its individual elements cannot accomplish alone. The system of the invention is an artificial system generated from the natural-law characteristics of the phonetic sequence. The applications of the invention include applications of the natural language processing method, of its system, and of the results of the invention, including applications of its embodiments across the broad field of artificial-intelligence technology.

In this embodiment, revising the symbol signifier of the natural language includes dividing it into the phonetic signifier and the written-language signifier, and adding custom symbol-set elements for the written-language signifier.

In this embodiment, the phonetic signifier is linear and auditory in nature; it unfolds only in time and has characteristics borrowed from time: (a) it represents a span, and (b) that span is measurable in a single dimension only: it is a line. Integrating this linear property with the phonetic sequence yields a phonetic-sequence pinyin structure, so that the phonetic sequence of the natural language is characterized by segment-wise symbol construction, non-shifting and non-repeating.

In this embodiment, the method further comprises: making all symbol elements and/or element groups of the symbol set contained in this language form self-consistent (compatible); taking this language form as the axiom of the symbol set, enforcing the independence of each symbol element and/or element group, and requiring the last segment always to carry a symbol element and/or element group so that the whole construction is closed, making it independent, unique, and closed; and making this language form conform to segment-wise symbol construction, non-shifting and non-repeating, one feature of which is that the pinyin structure displays phonological relationships.

In this embodiment, the method further comprises: satisfying this language form with one symbol or one symbol group per segment.

In this embodiment, the method further comprises: taking this language form as the axiom of the symbol set, enforcing the independence of each symbol element, and requiring the last segment always to carry an element so that the whole construction is closed, thereby achieving self-consistency and completeness.
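One practical consequence of requiring the last segment to carry an element is that a stream of symbols can be split into syllables without any extra delimiter. The following is a minimal sketch under stated assumptions: the last segment is taken to be the tone segment and tones are encoded as the digits 1 to 4; `split_syllables` is a hypothetical helper, not part of the patent.

```python
# Assumption: the closing (last) segment is the tone, written as a digit 1-4.
TONES = set("1234")


def split_syllables(stream):
    """Split a flat symbol stream into syllables, each closed by its tone symbol."""
    syllables, current = [], []
    for symbol in stream:
        if symbol in TONES and not current:
            raise ValueError("tone symbol with no phoneme before it")
        current.append(symbol)
        if symbol in TONES:
            # The tone closes the construction, so the syllable ends here.
            syllables.append("".join(current))
            current = []
    if current:
        raise ValueError("stream ends with an unclosed syllable")
    return syllables
```

Under these assumptions, `split_syllables("ㄓㄨㄥ1ㄨㄣ2")` yields the two closed syllables `"ㄓㄨㄥ1"` and `"ㄨㄣ2"`, while a stream whose final syllable lacks a tone is rejected.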

The present invention also provides a natural language system that uses the above natural language processing method, comprising: a set produced by the formal axiomatic systematization of the pinyin structure of language symbols, and/or an artificial system devised according to the natural laws of the language's phonetic sequence.

In this embodiment, the “form” (形) aspect of formalizing the symbol pinyin structure includes the right to select symbol elements, where the selection includes symbol elements used in computer science; the “pattern” (式) aspect includes: symbol elements and/or symbol-element groups; conformity to phonemes; and the right to design the arrangement and combination of recognizable symbol groups.

The results of the above natural language processing method can also be applied to: computer-readable recording media, including the transcription of program code and/or conversion between audio and symbolic text; computer program products, including speech recognition, voice control, translation, audio-to-text, and/or text-to-audio; and/or natural language processing in the field of artificial intelligence, including audio integrated-circuit design and/or artificial-intelligence speech technology for natural language understanding.

In one embodiment, the formal axiomatic systematization of the pinyin structure of natural language symbols comprises a systematic set having the features of language-symbol formalization, of axiomatization, and of the systematization of the formal axiomatization, where the symbol elements and/or element groups further include equivalent conversion into symbol elements or audio form.

In one embodiment, the symbol elements and/or element groups further include: translating the audio symbols, text, and/or program code of different languages by way of consistent symbols; and homogeneous translation, which includes using deep learning to strengthen the processing of the syntagmatic relations (rapports syntagmatiques) and/or associative relations (rapports associatifs) of natural language in audio symbols, text, and/or program code.

In this embodiment, the formal axiomatic systematization of the pinyin structure of natural language symbols comprises the technical method, the artificial system, and the application of results.

According to a preferred embodiment of the invention, a natural language processing method is provided, comprising: comparing a natural language diachronically and/or synchronically to grasp its cognitive framework, where for this natural language semantics > grammar > speech sounds > written language; revising the symbol signified of the natural language, where the symbol signifier comprises a phonetic signifier and a written-language signifier, and dividing the two such that symbol signified = phonetic signifier > written-language signifier; applying to the phonetic structure of the natural language operations comprising symbol selection, the right to design a specific arrangement, and conformity to phonemes, so that the speech of the natural language conforms to the phonetic sequence and its pinyin structure is segment-wise symbol construction, non-shifting and non-repeating; and carrying out the formal axiomatic systematization of the pinyin structure of the natural language symbols into a language form using one symbol or one symbol group per segment.

In this embodiment, the method further comprises: taking this language form as the axiom of the symbol set, enforcing the independence of each element, and requiring the last segment always to carry an element so that the whole construction is closed, thereby achieving self-consistency and completeness.

In this embodiment, the symbol selection includes symbol elements used in computer science.

According to another preferred embodiment of the invention, a natural language processing method is provided that realizes the formal axiomatic systematization of the pinyin structure of natural language symbols, comprising: comparing one or more natural languages diachronically and/or synchronically to grasp the natural-language cognitive framework that humans possess, where the core order of that framework is semantics > grammar > speech sounds > written language; revising the symbol signified of the natural language, where the symbol signifier comprises a phonetic signifier and a written-language signifier, and dividing the two such that symbol signified = phonetic signifier > written-language signifier; following the rule that the pinyin structure of the language's phonetic sequence is segment-wise symbol construction, non-shifting and non-repeating, completing operations comprising formalized symbol selection, the right to design a specific arrangement, and conformity to phonemes; and carrying out the formal axiomatic systematization of the pinyin structure of the natural language symbols into a language form using one symbol or one symbol group per segment.

In this embodiment, revising the symbol signified of the natural language includes adding new symbol elements for the written-language signifier.

In this embodiment, the phonetic signifier is linear and auditory in nature; it unfolds only in time and has characteristics borrowed from time: (a) it represents a span, and (b) that span is measurable in a single dimension only: it is a line. Integrating this linear property with the phonetic sequence yields a phonetic-sequence pinyin structure, so that the phonetic sequence of the natural language is characterized by segment-wise symbol construction, non-shifting and non-repeating.

In this embodiment, the “form” (形) aspect of formalizing the symbol pinyin structure includes the right to select symbol elements, where the selection includes symbol elements used in computer science; the “pattern” (式) aspect includes: symbol elements and/or symbol-element groups; conformity to phonemes; and the right to design the arrangement and combination of recognizable symbol groups.

In this embodiment, the method further comprises: making all symbol elements and/or element groups of the symbol set contained in this language form self-consistent (compatible); taking this language form as the axiom of the symbol set, enforcing the independence of each symbol element and/or element group, and requiring the last segment always to carry a symbol element and/or element group so that the whole construction is closed, making it independent, unique, and closed; and making this language form conform to segment-wise symbol construction, non-shifting and non-repeating, with its characteristic pinyin structure displaying phonological relationships.

In this embodiment, the formal axiomatic systematization of the pinyin structure of natural language symbols comprises the artificial system, the technical method, and the application of results.

In this embodiment, the formal axiomatic systematization of the pinyin structure of natural language symbols comprises language-symbol formalization, axiomatization, and the systematization of the formal axiomatization, where the symbol elements and/or element groups further include equivalent conversion into symbol elements or audio form.

In this embodiment, the symbol elements and/or element groups further include: translating the audio symbols, text, and/or program code of different languages by way of consistent symbols; and homogeneous translation, which includes using deep learning to strengthen the processing of the syntagmatic relations (rapports syntagmatiques) and/or associative relations (rapports associatifs) of natural language in audio symbols, text, and/or program code.

In this embodiment, the symbol elements and/or symbol-element groups of multiple natural language symbol sets can be converted to be consistent, so that the sets become equal and their elements form a homogeneous common set, achieving multilingual conversion and translation.

The present invention also provides a natural language system that uses the above natural language processing method, comprising: a set produced by the formal axiomatic systematization of the pinyin structure of language symbols, and/or an artificial system devised according to the natural laws of the language's phonetic sequence.

The applications of the invention include applications of the natural language processing method, of its system, and of the results of the invention, including applications of its embodiments across the broad field of artificial-intelligence technology.

The results of the above natural language processing method can also be applied to: computer-readable recording media, including the transcription of program code and/or conversion between audio and symbolic text; computer program products, including speech recognition, voice control, translation, audio-to-text, and/or text-to-audio; and/or natural language processing in the field of artificial intelligence, including audio integrated-circuit design and/or artificial-intelligence speech technology for natural language understanding.

Please refer to FIG. 1, a schematic diagram of the relationship between a preferred natural language processing method of the invention and its system and applications. The human side 102 uses natural language 104, and the natural language processing method 106 of the invention serves as the hub linking the human side 102 and the computer side 114 (including artificial intelligence 112). On the human side 102, the language symbols approximate pronunciation through sound approximation that is free of symbol-deformation rules, which meets the needs of human reading comprehension. On the computer side 114, the formal axiomatic systematization of the pinyin structure makes the language symbol set present a formal axiomatic system of segment-wise symbol construction, non-shifting and non-repeating, which makes human natural language easier for the artificial intelligence 112 machines on the computer side 114 to understand. The system 108 of the invention is an artificial system generated from the natural-law characteristics of the phonetic sequence, comprising: the set produced by the formal axiomatic systematization of the pinyin structure of language symbols under the natural language processing method 106, and/or an artificial system devised according to the natural laws of the language's phonetic sequence. The applications 110 of the invention include applications of the natural language processing method 106, of its system 108, and of the results of the invention, including applications of its embodiments across the broad field of artificial-intelligence technology.

Please refer to FIG. 1A, a schematic flowchart of a preferred natural language processing method of the invention (realizing the formal axiomatic systematization of the pinyin structure of language symbols). In step 160, one or more natural languages are compared diachronically and/or synchronically to grasp the natural-language cognitive framework that humans possess. In human “language cognition” there is a relative cognitive hierarchy between the “speech level” (sounds) and the “written level” (words): because humans interpret “written language” through “contemporary synchronic speech”, the speech level lies closer to the core of language cognition and constitutes the human “cognitive framework” for natural language. The cognitive hierarchy of this framework is found to be semantics (core) > grammar > speech sounds > written language.

Saussurean semiotics holds that a linguistic sign has both a signifier and a signified. From a linguistic standpoint, Saussure pointed out that a “sign” consists of two parts: the “signifier”, which is the “sound-image”, and the “signified”, which is the “concept” linked to that sound-image. Every sign must have both. In step 170, the present invention integrates the knowledge of the language cognitive framework and amends a shortcoming of Saussurean semiotics: the signifier of a natural-language sign should comprise a phonetic signifier and a written signifier. The invention separates the two so that, for a natural-language sign, signified = phonetic signifier > written signifier. The phonetic signifier is linear and auditory; it unfolds only in time and has characteristics borrowed from time: (a) it represents a length, and (b) that length can be measured in only one dimension: it is a line. This linearity of the phonetic signifier is the phonetic sequence. Observing the pinyin structure of natural-language phonetic sequences and generalizing their natural-law characteristics, the axiom of phonetic sequence can be hypothesized as: segments build the symbol, with no shifting and no repetition, as shown in step 180. “Segments build the symbol”: Mandarin Chinese, for example, has 37 phonemes (21 initials, 3 medials, and 13 finals) and 4 tones, arranged in order as four segments (initial, medial, final, tone), and every Chinese character is pronounced by combining segments in order from the initial segment to the tone segment. “No shifting, no repetition”: the four segments appear or are absent as each character requires, but their order is fixed, and within the reading of any one character no phoneme or tone of the same segment occurs twice.
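The hypothesized axiom can be read as an ordering constraint on segment labels. The following is a minimal sketch under that reading, not the patented method itself; the label names and the function `is_well_formed` are illustrative assumptions that do not appear in the source.

```python
# Canonical segment order for Mandarin as described above.
# The label strings are illustrative names, not terms from the source.
SEGMENT_ORDER = ("initial", "medial", "final", "tone")


def is_well_formed(segments):
    """Return True if the labels follow the canonical order with no repeats.

    Segments may be absent, but the ones present must appear in
    strictly increasing canonical position.
    """
    positions = []
    for label in segments:
        if label not in SEGMENT_ORDER:
            return False  # unknown segment kind
        positions.append(SEGMENT_ORDER.index(label))
    # Strictly increasing positions rule out both shifting and repetition.
    return all(a < b for a, b in zip(positions, positions[1:]))


print(is_well_formed(["initial", "final", "tone"]))           # True
print(is_well_formed(["medial", "initial", "tone"]))          # False: shifted
print(is_well_formed(["initial", "final", "final", "tone"]))  # False: repeated
```

A single strict-ordering test covers both halves of the axiom, since a repeated segment and a shifted segment both break strict monotonicity.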

Because the human “cognitive framework” for natural language lies at the “speech level”, the relation between the “signified meaning” and the “phonetic signifier” of a natural language is “random” and cannot be fully controlled by human design. The relation between the “phonetic signifier” and the “written signified”, however, is “arbitrary”. In step 180, the present invention exploits this arbitrariness to convert the symbol elements of the written signifier and so carry out the formal axiomatization of the pinyin structure of linguistic symbols. Formalization comprises: the right to choose the symbol elements of a custom linguistic symbol set; the right to design the arrangement and combination of those symbol elements; and display rules for symbol elements and/or symbol element groups that conform to the natural-law characteristics of phonetic sequence. In step 190, axiomatization keeps each phoneme mapped to one symbol element and/or symbol element group, producing an artificial system that is recognizable and/or discriminable. The arrangement of the written-signifier symbol elements and/or element groups of the natural-language symbol set is thereby forced to be unique, closed, and independent, and to conform to the natural-law characteristics of phonetic sequence, so that its pinyin structure is a form of linguistic expression in which segments build the symbol with no shifting and no repetition. The invention then verifies that the relations between the custom linguistic symbol set and all of its symbol elements and/or element groups are consistent, independent, and complete. Taken convergently, “segments build the symbol, no shifting, no repetition” is therefore the axiom of the linguistic symbol set. Because the language as a whole is built by repeatedly arranging and combining the symbol elements and/or element groups of that set, taken divergently the same statement is the axiom of the natural language. This is offered as proof that the hypothesized natural-language axiom holds. The symbol elements and/or element groups of several natural-language symbol sets can also be converted to a common form, so that the sets become equal and their elements homogeneous and shared, which enables multilingual conversion and translation.
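The requirement in step 190 that each phoneme keep its own symbol element or element group can be checked mechanically. The sketch below shows one possible check, assuming the mapping is held in a dictionary; the function names and the two toy mappings are hypothetical and are not taken from the source.

```python
from collections import defaultdict


def find_collisions(phoneme_to_symbol):
    """Return {symbol: [phonemes]} for every symbol shared by several phonemes."""
    by_symbol = defaultdict(list)
    for phoneme, symbol in phoneme_to_symbol.items():
        by_symbol[symbol].append(phoneme)
    return {s: ps for s, ps in by_symbol.items() if len(ps) > 1}


def is_one_to_one(phoneme_to_symbol):
    """True when no two phonemes share a symbol."""
    return not find_collisions(phoneme_to_symbol)


# Hypothetical toy mappings, for illustration only.
unique_mapping = {"ㄅ": "b", "ㄆ": "p", "ㄇ": "m"}
clashing_mapping = {"ㄅ": "b", "ㄆ": "b"}

print(is_one_to_one(unique_mapping))      # True
print(find_collisions(clashing_mapping))  # {'b': ['ㄅ', 'ㄆ']}
```

This covers only the uniqueness part of step 190; consistency, independence, and completeness in the axiomatic sense are separate properties that this check does not establish.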

The natural language processing method of the present invention serves as the hub linking the human side and the computer side. On the human side, linguistic symbols are spelled by sound imitation, approximating pronunciation with no symbol-deformation rules, which meets the needs of human reading comprehension. On the computer side, the formal axiomatization of the pinyin structure of linguistic symbols makes the symbol set exhibit segment-built symbols with no shifting and no repetition, and this formal axiomatic system makes it easier for artificial-intelligence machines to understand human natural language. A system, in general, is a group of related individuals that operates according to certain rules and can accomplish work that its individual components cannot accomplish alone. The system of the present invention is an artificial system generated from the natural-law characteristics of phonetic sequence. The applications of the present invention include applications of the natural language processing method, applications of its system, and applications of the results of the invention, including applications of its embodiments across artificial-intelligence technologies.

Please refer to FIG. 2, a schematic diagram of the development of conventional research on the human cognitive framework for natural language, comprising: definition of language 202; academic integration 204 (comparison of modern linguistics and Chinese linguistics); timeline 206 (the timeline set against the evolution of Chinese); cognitive stage order 208 (layers of language cognition); cognitive framework 210; and Chinese phonetic sequence 212. Each is described below.

Definition of “language”: a language is a system of communication organized from sounds, words, and grammar; alternatively, it is the communication system used by a particular group of people (“Language” in British English: “A system of communication consisting of sounds, words, and grammar, or the system of communication used by people in a particular country or type of work.”).

Please refer to FIG. 2A, a schematic diagram comparing “modern linguistics” with “Chinese linguistics”. In Chinese linguistics, some scholars use the phrase “language and script” (語言與文字) to contrast “character sound” with “character form”. The “language” in that phrase traces back to Chinese historical phonology (聲韻學), the diachronic study of character sounds (Sounds); the “script” traces back to the study of Chinese characters (漢字學), the diachronic study of character forms (Words). The phrase is ill-suited to comparison with modern linguistics, however. As the quotation above shows, “language” (Language) in modern linguistics already covers two distinct levels, “speech” (Sounds) and “writing” (Words). To integrate and compare the “language cognition” of modern linguistics and Chinese linguistics, “language and script” should therefore be amended to “speech and writing” (語音與語文), which avoids conceptual confusion in the discussion.

With the concepts clarified as above, the notions of language discussed in “modern linguistics” and “Chinese linguistics” can be integrated and connected. Doing so also connects the “timeline” from “traditional” to “modern” China, allowing a clearer view of how Chinese has evolved over the “diachrony” of the Chinese nation. Please refer to FIG. 2B, a schematic diagram setting the “timeline” against the “evolution of Chinese”. The timeline (diachrony/history) runs from antiquity (Shang, Zhou, Qin, Han) through the medieval period (Wei, Jin, Sui, Tang), the late imperial period (Song, Yuan, Ming, Qing), and the modern period (the Chinese nation) to the contemporary period (synchrony). Against each period the diagram sets the evolution of speech (phonology) and of writing (Chinese characters/script): antiquity (Shijing phonology; oracle-bone, bronze, and seal scripts), medieval (Guangyun phonology; clerical and cursive scripts), late imperial (Zhongyuan phonology; regular and running scripts), modern (modern phonology; traditional and simplified characters), and contemporary (synchronic speech; synchronic writing).

Returning now to our own point in time, the contemporary period, and re-examining the relation between Chinese “speech” and “writing” from a “synchronic” viewpoint (see FIG. 2C, a concentric target diagram of the layers of language cognition), an important finding emerges: human “language cognition” ranks the “speech level” (Sounds) before the “written level” (Words). Both “speech” and “writing” evolve continuously over time, yet those of us living today, whether tracing the origins of speech through Chinese historical phonology or tracing the origins of writing through the study of Chinese characters, cannot step outside “contemporary synchronic speech” when interpreting them. The “speech level” (Sounds) is therefore closer to the core (Core) of language cognition and is, in a relative sense, the “cognitive framework”. A cognitive framework is understood here as the mental schema with which people interpret the external real world and which serves as the basis for understanding, identifying, and defining experience.

More specifically, a natural language is compared diachronically and/or synchronically to grasp the cognitive framework of that language, in which semantics (core) > grammar > speech > writing.

For example, people of the Zhou dynasty may have spoken a dialect of the Shijing phonological system and written in bronze inscriptions (金文). When we study the language of the Zhou, however, we cannot read bronze inscriptions directly in Shijing-era pronunciation as they did, because our “language cognition” relies on “contemporary synchronic speech” to construct meaning. This is why, whatever dynasty a modern drama is set in, its characters still speak in “contemporary synchronic speech”. Our own lives bear this out: children babble and learn to speak first, and only later learn to read and write. Moreover, in eras when education was not widespread, some people never learned to read and were “illiterate”, yet they could still listen and speak. This shows that, in the order of human “language cognition”, “speech” precedes “writing”. In other words, the existence of spoken language is the best evidence that “speech” precedes “writing” in “language cognition”.

In view of this, please refer to FIG. 2D, a schematic diagram of a cognitive framework. Speech has a phonetic sequence, and from the concepts and experience above an axiom can be hypothesized. The cognitive framework of the natural law of phonetic sequence is understood here as the mental schema with which people interpret the external real world and which serves as the basis for understanding, identifying, and defining experience. FIG. 2E is a schematic diagram of Chinese phonetic sequence.

Please refer to FIG. 3, a schematic diagram of the inventive step of a preferred amendment to Saussurean semiotics. Semiotics arose at the end of the 19th century as a discipline for studying language, but the semiotics of language is now mostly referred to as linguistics, and “semiotics” today usually means structuralist semiotics. In Saussurean semiotics a single sign (Sign) divides into a signifier (Signifier) and a signified (Signified), as shown in conventional semiotics 302. The signifier is the sound-image of the sign; the signified is its meaning or concept. The whole formed by the two parts is called the sign. The relation between signifier and signified exhibits randomness (arbitrariness): there is no necessary connection between them. For example, the pronunciation and spelling of the English word “tree” refer, by convention, to the concept of “a leafy plant with a woody trunk and branches”. The linguistic sign is a two-sided psychological entity of concept and sound-image. Saussure called the combination of concept and sound-image the sign, the concept the “signified” (signifié), and the sound-image the “signifier” (signifiant).

Combining the “cognitive framework” of “language cognition” above with Saussure's account of the “sign”: because “speech” precedes “writing” in “language cognition”, the “signifier” can be further divided into a “phonetic signifier” and a “written signifier”, as shown in amended semiotics 304, and the “phonetic signifier” is the main body of the “sound-image”. Because natural language has a spoken form and phonetic cognition precedes written cognition, the essential function of the “written signifier” is to summon the “phonetic signifier”. The human “cognitive framework” for natural language lies at the “speech level”, so the relation between the “phonetic signifier” and the “signified meaning” is “random” and cannot be fully controlled by human design. The relation between the “phonetic signifier” and the “written signified” is “arbitrary”, however: because spoken language is already intelligible, writing can be programmed by “summoning speech”, yielding a “systematized” “synchronic writing”. One example is the “one sound, one symbol” mapping used in cryptography: each phoneme maps to one symbol, and the phonetic structure of the “phonetic signified” maps to the written symbols of the “written signified”.

Next, the signifier of the natural-language sign is amended. The signifier comprises a phonetic signifier and a written signifier, and the two are separated so that signified = phonetic signifier > written signifier. As noted, Saussure pointed out from a linguistic standpoint that a “sign” consists of two parts: the “signifier”, which is the “sound-image”, and the “signified”, which is the “concept” linked to that sound-image. Every sign must have both.

The written signifier is then modified. Because the human “cognitive framework” for natural language lies at the “speech level”, the relation between the “signified meaning” and the “phonetic signifier” is “random” and cannot be fully controlled by human design, whereas the relation between the “phonetic signifier” and the “written signified” is “arbitrary”. The present invention exploits this arbitrariness to convert the symbol elements of the written signifier, carrying out the formal axiomatization of the pinyin structure of linguistic symbols and formalizing the phonetic structure of the natural language, so that its speech conforms to the phonetic sequence and its pinyin structure is a linguistic form in which segments build the symbol with no shifting and no repetition. This is done through symbol selection, the right to design specific arrangements, and operations that conform to the phonemes. Symbol selection includes the symbol elements used in computer science. In one embodiment, for compatibility with contemporary standard keyboards and computer input methods, the 26 basic Latin (English) letters are used to write the 37 phonemes of Mandarin, and the tones are written 1, 2, 3, and 4. For the phonetic signifier, the spelling must follow the four-layer construction with no shifting and no repetition. For the written signifier, it must follow “one sound (phoneme), one symbol”: letters are chosen first for sound imitation, with Mandarin sound values taking priority over English ones; where no sound association is available, the letter is treated as a pure symbol that summons the sound through psychological cognition to meet the spelling need; and permutations and combinations from mathematics keep each symbol unique within the system. This embodiment illustrates a preferred way of practicing the present invention and is not intended to limit it.

Please refer to FIG. 4, a schematic diagram of the preferred linguistic axiom “segments build the symbol, no shifting, no repetition”. Saussure held that the phonetic signifier is linear and auditory, unfolds only in time, and has characteristics borrowed from time: (a) it represents a length, and (b) that length can be measured in only one dimension: it is a line. Accordingly, this linearity is combined with phonetic sequence (natural-law phenomenon of pinyin structure 402) to propose a phonetic-sequence pinyin structure in which the phonetic sequence of the natural language is characterized by segment-built symbols with no shifting and no repetition, as shown in artificial processing of pinyin structure 404. Because a language is a large set of symbol sets, “segments build the symbol, no shifting, no repetition” is, taken convergently, the axiom of the symbol set and, taken divergently, an axiom of linguistics.

The inventor emphasizes that what is meant is the formalization of phonetic structure, which is not limited to differences in the written symbols used. The “form” (形) in formalization is the right to choose symbol elements, where symbol selection includes the symbol elements used in computer science. The “pattern” (式) in formalization is the right to design recognizable arrangements and combinations of symbol elements and/or symbol element groups that conform to the phonemes; its characteristic is that symbols undergo no deformation. In a preferred embodiment, the “symbol” in “segments build the symbol” is the shape of the sound and is the smallest semantic unit of the natural language. Apart from cause-and-effect dependencies, the order of the above steps is not limited. Please refer to Table 1, a comparison table for a preferred Chinese pinyin embodiment of the present invention.

[Table 1] Comparison table for a preferred Chinese pinyin embodiment of the present invention

Category   Hanyu Pinyin   Chinese example (Zhuyin)   Pinyin of the present invention
Initial    b              玻 ㄅ                       b
Initial    p              坡 ㄆ                       p
Initial    m              摸 ㄇ                       m
Initial    f              佛 ㄈ                       f
Initial    d              得 ㄉ                       d
Initial    t              特 ㄊ                       t
Initial    n              訥 ㄋ                       n
Initial    l              勒 ㄌ                       l
Initial    g              哥 ㄍ                       g
Initial    k              科 ㄎ                       k
Initial    h              喝 ㄏ                       h
Initial    j              基 ㄐ                       j
Initial    q              欺 ㄑ                       q
Initial    x              希 ㄒ                       c
Initial    zh             知 ㄓ                       zv
Initial    ch             蚩 ㄔ                       xv
Initial    sh             詩 ㄕ                       sv
Initial    r              日 ㄖ                       v
Initial    z              資 ㄗ                       z
Initial    c              雌 ㄘ                       x
Initial    s              思 ㄙ                       s
Medial     i              衣 ㄧ                       i
Medial     u              烏 ㄨ                       w
Medial     ü              迂 ㄩ                       y
Final      a              啊 ㄚ                       r
Final      o              喔 ㄛ                       o
Final      e              鵝 ㄜ                       u
Final      ê              誒 ㄝ                       e
Final      ai             哀 ㄞ                       ai
Final      ei             欸 ㄟ                       ae
Final      ao             熬 ㄠ                       au
Final      ou             歐 ㄡ                       ao
Final      an             安 ㄢ                       am
Final      en             恩 ㄣ                       an
Final      ang            昂 ㄤ                       arm
Final      eng            亨 ㄥ                       arn
Final      er             兒 ㄦ                       u

In Table 1: among the initials, v in first position is ㄖ (r), and v in second position marks a retroflex initial (zv, xv, sv); the medials are i, w, and y; among the finals, a followed by one letter forms the two-letter finals and ar followed by one letter forms the three-letter finals; the tones are written 1, 2, 3, and 4.
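The spelling rules of Table 1 can be exercised with a small encoder. This is a minimal sketch, assuming a syllable is supplied as separate Zhuyin components plus a tone number; the dictionaries copy Table 1 as it reads in this text (including `u` for both ㄜ and ㄦ), while the names `INITIALS`, `MEDIALS`, `FINALS`, and `encode` are illustrative and not from the source.

```python
# Zhuyin-to-symbol tables copied from Table 1 as it reads in this text.
# The dictionary and function names are illustrative assumptions.
INITIALS = {
    "ㄅ": "b", "ㄆ": "p", "ㄇ": "m", "ㄈ": "f", "ㄉ": "d", "ㄊ": "t",
    "ㄋ": "n", "ㄌ": "l", "ㄍ": "g", "ㄎ": "k", "ㄏ": "h", "ㄐ": "j",
    "ㄑ": "q", "ㄒ": "c", "ㄓ": "zv", "ㄔ": "xv", "ㄕ": "sv", "ㄖ": "v",
    "ㄗ": "z", "ㄘ": "x", "ㄙ": "s",
}
MEDIALS = {"ㄧ": "i", "ㄨ": "w", "ㄩ": "y"}
FINALS = {
    "ㄚ": "r", "ㄛ": "o", "ㄜ": "u", "ㄝ": "e", "ㄞ": "ai", "ㄟ": "ae",
    "ㄠ": "au", "ㄡ": "ao", "ㄢ": "am", "ㄣ": "an", "ㄤ": "arm",
    "ㄥ": "arn", "ㄦ": "u",
}


def encode(initial="", medial="", final="", tone=1):
    """Spell one syllable as initial + medial + final + tone digit."""
    if tone not in (1, 2, 3, 4):
        raise ValueError("tone must be 1, 2, 3, or 4")
    if not (initial or medial or final):
        raise ValueError("a syllable needs at least one phoneme")
    spelled = ""
    if initial:
        spelled += INITIALS[initial]
    if medial:
        spelled += MEDIALS[medial]
    if final:
        spelled += FINALS[final]
    return spelled + str(tone)


print(encode("ㄓ", "ㄨ", "ㄥ", 1))    # 中 -> zvwarn1
print(encode("ㄏ", final="ㄠ", tone=3))  # 好 -> hau3
print(encode("ㄇ", final="ㄚ", tone=1))  # 媽 -> mr1
```

Because the function always concatenates in the order initial, medial, final, tone, its output follows the fixed segment order by construction.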

The pinyin symbol law of the present invention is: four-layer construction, no shifting, no repetition. The phonetic structure comprises 37 phonemes (21 initials, 3 medials, and 13 finals) and 4 tones, arranged in order as four layers: the initial layer, the medial layer, the final layer, and the tone layer. Every Chinese character is pronounced by combining layers in order from the initial layer to the tone layer. The four layers appear or are absent as each character requires, but their order is fixed, and within the reading of any one character no phoneme or tone of the same layer occurs twice.

The formal axiomatization of the present invention further comprises: taking the above linguistic form as the axiom of the symbol set; enforcing the independence of every element; and requiring that the last segment always contain an element, so that the symbol as a whole is closed, which achieves consistency and completeness. Each segment is filled by one symbol or one symbol group, which satisfies the linguistic form. As noted, Saussure held that the phonetic signifier is linear and auditory, unfolds only in time, and has characteristics borrowed from time: (a) it represents a length, and (b) that length can be measured in only one dimension: it is a line. Speech therefore has a phonetic sequence, and from the concepts and experience above the axiom is hypothesized: the linguistic form of the present invention is one in which segments build the symbol with no shifting and no repetition. For Chinese, taken as an example and not as a limitation, this is a four-segment symbol with no shifting and no repetition. Because a language is a large set of symbol sets, the statement is, taken convergently, the axiom of the symbol set and, taken divergently, an axiom of linguistics. Apart from cause-and-effect dependencies, the order of the above steps is not limited.
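The four-segment form with a mandatory closing segment can be expressed as a single pattern over the spellings of Table 1. The following is a sketch only, assuming the closing segment for Chinese is the tone digit; the regular expression and the `parse` function are illustrative and are not given in the source.

```python
import re

# One optional group per segment, in fixed order, then a mandatory tone digit.
# Symbol inventories are taken from Table 1; the pattern itself is an assumption.
_SYLLABLE = re.compile(
    r"(?P<initial>zv|xv|sv|[bpmfdtnlgkhjqcvzxs])?"
    r"(?P<medial>[iwy])?"
    r"(?P<final>arm|arn|ai|ae|au|ao|am|an|[roue])?"
    r"(?P<tone>[1-4])"
)


def parse(syllable):
    """Split a spelled syllable into its four segments, or return None."""
    match = _SYLLABLE.fullmatch(syllable)
    if match is None:
        return None  # wrong order, repeated segment, or missing tone
    parts = match.groupdict()
    if not (parts["initial"] or parts["medial"] or parts["final"]):
        return None  # a bare tone digit is not a syllable
    return parts


print(parse("zvwarn1"))
print(parse("hau"))    # None: the closing tone segment is missing
print(parse("wzv1"))   # None: medial placed before initial
```

Each segment appears at most once in the pattern and only in its fixed position, so shifted or repeated segments fail to match, and the required tone digit plays the role of the closing element.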

From another angle, a mathematical model of an axiomatic system is a well-defined set that assigns meaning to the undefined terms appearing in the system, in a manner consistent with the relations defined in the system. The existence of a concrete model proves the consistency (compatibility) of the system. Models can also be used to show the independence of an axiom within a system: by constructing a valid model of the subsystem that omits a particular axiom, one shows that the omitted axiom is independent if its truth cannot be derived from the subsystem. Two models are called isomorphic if a one-to-one correspondence can be established between their elements in a way that preserves the relations among them. An axiomatic system in which every model is isomorphic to every other is called categorical, and categoricity guarantees the completeness of the system. Seen this way, the axiomatization of the pinyin structure in the present invention means: making all symbol elements and/or element groups of the symbol set contained in the linguistic form consistent (compatible); taking the linguistic form as the axiom of the symbol set, enforcing the independence of every symbol element and/or element group, and requiring that the last segment always contain a symbol element and/or element group so that the symbol as a whole is closed, independent, and unique; and making the linguistic form conform to “segments build the symbol, no shifting, no repetition”, so that its characteristic pinyin structure displays the phonological relations. A system (English: system; German: System; French: système; Spanish: sistema), in general, is a group of related individuals that operates according to certain rules and can accomplish work that its individual components cannot accomplish alone. Systems fall into two broad classes, natural systems and artificial systems. The formal axiomatization of the pinyin structure of linguistic symbols in the present invention is the technical method of an artificial system and the application of its results.

In the present invention, the formal axiomatic systematization of natural language serves as the link between the human side and the computer side. The human side meets human reading-comprehension needs through approximate Chinese and English pronunciations (sound imitation). The computer side preserves the form of the language through one symbol or one symbol group per segment (segment-based symbol construction with no shifting and no repetition) and arranges specific permutations and combinations so that a machine can readily interpret the form (the axiom of permutation and combination).

The inventor notes here that, following the quotation from "Modern Philology" (現代語文學), "language is a communication system organized from speech, written language, and grammar": if the "cognitive framework" of language cognition resides in "speech", then grammar is not merely a set of "rules" for the "written language" but should also refer to the "laws" that exist within the "structure of speech". "Laws" differ from rules: they are not made by people but are inductive statements about natural phenomena. In addition, a rigorous and sound axiomatic system must satisfy three basic requirements: consistency (compatibility), independence, and completeness. This falls within the scope of the prior art, so the inventor does not elaborate on it beyond the relevant parts.

Next, one or more natural languages are compared diachronically and/or synchronically to grasp the cognitive framework that humans have for natural language, in which the core order is semantics > grammar > speech > written language. For the diachronic and/or synchronic comparison and the core order of the cognitive framework, refer to the description of Figures 2A to 2C above; it is not repeated here.

Next, the signified of the natural-language sign is revised. The signifier of the natural-language sign comprises a phonetic signifier and a written signifier; the two are distinguished so that signified = phonetic signifier > written signifier. Revising the signified of the natural-language sign includes adding new written-signifier symbol elements.

Next, according to the rule that the pinyin structure of the language's sound sequence is segment-based symbol construction with no shifting and no repetition, the following are completed: selection of formal symbols; the right to design specific arrangements and groupings; and operations that conform to phonemes. The phonetic signifier is linear and auditory in nature; it unfolds only in time and has characteristics borrowed from time: (a) it embodies a length, and (b) this length can be measured in only one dimension: it is a line. By integrating this linearity with the sound sequence, a sound-sequence pinyin structure is proposed, so that the sound sequence of the natural language is characterized by segment-based symbol construction with no shifting and no repetition. Moreover, because language is a large set of symbol sets, segment-based symbol construction with no shifting and no repetition is, in the convergent sense, an axiom of the symbol set and, in the divergent sense, an axiom of linguistics.

Next, the pinyin structure of the natural-language symbols is formalized into a language form using one symbol or one symbol group per segment. The "form" aspect (形) of this formalization comprises the right to select symbol elements, where the selection includes symbol elements used in computer science. The "pattern" aspect (式) comprises: symbol elements and/or symbol element groups; conformity with phonemes; and the right to design the permutations and combinations of recognizable symbol groups.
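As a rough illustration of "one symbol or one symbol group per segment", the sketch below splits a syllable written in the pinyin of the present invention back into its symbol groups by greedy longest match. The symbol inventory is copied from Table 6 of this document; the function name `split_symbol_groups` and the greedy strategy are assumptions made for illustration, not part of the disclosed method.

```python
# Symbol groups listed in Table 6 (initials, medials, finals).
INITIALS = ["b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
            "j", "q", "c", "zv", "xv", "sv", "v", "z", "x", "s"]
MEDIALS = ["i", "w", "y"]
FINALS = ["r", "o", "u", "e", "ai", "ae", "au", "ao",
          "am", "an", "arm", "arn"]

# Try longer groups first so that, for example, "zv" is not split as "z" + "v".
SYMBOL_GROUPS = sorted(INITIALS + MEDIALS + FINALS, key=len, reverse=True)


def split_symbol_groups(syllable):
    """Split a toneless syllable into Table 6 symbol groups (illustrative)."""
    groups = []
    pos = 0
    while pos < len(syllable):
        for group in SYMBOL_GROUPS:
            if syllable.startswith(group, pos):
                groups.append(group)
                pos += len(group)
                break
        else:
            # No listed symbol group starts at this position.
            raise ValueError(f"cannot segment {syllable!r} at position {pos}")
    return groups


print(split_symbol_groups("svwae"))  # syllable for 水 in Table 2, tone removed
print(split_symbol_groups("farm"))   # syllable for 方 in Table 5, tone removed
```

Each returned group corresponds to one sound segment, so the length of the list equals the number of segments in the syllable.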

In the above steps, the pinyin structure of the present invention is axiomatized. A mathematical model of an axiomatic system is a well-defined set that assigns meaning to the undefined terms appearing in the system, in a manner consistent with the relations defined in the system. The existence of a concrete model proves the consistency (compatibility) of the system. Models can also be used to show the independence of an axiom within a system: by constructing a valid model of the subsystem that omits a particular axiom, we show that the omitted axiom is independent if its correctness cannot be derived from the subsystem. Two models are said to be isomorphic if a one-to-one correspondence can be established between their elements in a way that preserves the relations among them. An axiomatic system in which every model is isomorphic to every other is called categorical, and categoricity guarantees the completeness of the system. From this perspective, all symbol elements and/or element groups of the symbol set contained in this language form are made consistent (compatible). The language form is taken as the axiom of the symbol set: the independence of each symbol element and/or element group is enforced, and the final segment is required always to carry a symbol element and/or element group that closes the overall construction, so that the construction is independent, unique, and closed. The language form is made to conform to segment-based symbol construction with no shifting and no repetition, and its characteristic pinyin structure exhibits phonological relations. The term "system" (English: system; German: System; French: système; Spanish: sistema) refers generally to a group of related individuals that operates according to certain rules and can accomplish work that the individual elements cannot accomplish alone. Systems fall into two broad categories: natural systems and man-made systems. The formal axiomatic systematization of the pinyin structure of language symbols in the present invention is a man-made system, comprising technical methods and the application of their results.
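One possible reading of the independence and uniqueness requirements is that no two sound segments may share a symbol group and no segment may be left without one. The sketch below checks a segment-to-symbol table for those two properties. The function name `find_symbol_conflicts` and the sample entries (three initials taken from Table 6) are illustrative assumptions; the patent does not specify such a check.

```python
def find_symbol_conflicts(table):
    """Return a list of problems in a {segment: symbol_group} table.

    A problem is reported when a segment has no symbol group, or when a
    symbol group is already assigned to a different segment.
    """
    problems = []
    owner = {}  # symbol group -> first segment that uses it
    for segment, symbol in table.items():
        if not symbol:
            problems.append(f"segment {segment!r} has no symbol group")
        elif symbol in owner:
            problems.append(
                f"symbol group {symbol!r} is shared by "
                f"{owner[symbol]!r} and {segment!r}"
            )
        else:
            owner[symbol] = segment
    return problems


# Three initials as listed in Table 6: each segment has its own symbol group.
sample = {"ㄓ": "zv", "ㄗ": "z", "ㄖ": "v"}
print(find_symbol_conflicts(sample))
```

An empty result means that, within the table supplied, every segment has exactly one symbol group and no symbol group is reused.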

Please refer to Figure 5, a set-theoretic schematic diagram of a preferred formal axiomatic systematization of the pinyin structure of language symbols according to the present invention. The formal axiomatic systematization of the pinyin structure of natural-language symbols comprises language symbol formalization 502, axiomatization 504, and systematization of the formal axiomatization 506, wherein the symbol elements and/or element groups further include equivalently converted symbol elements or audio forms.

The symbol elements and/or element groups of the present invention further include: audio symbols, text, and/or program code of different languages translated through consistent symbols; and homogeneous translation, which includes audio symbols, text, and/or program code in which the syntagmatic relations (rapports syntagmatiques) and/or associative relations (rapports associatifs) of natural language are processed with reinforcement from deep learning.

Under this law, the pinyin symbols of the present invention are unique: each symbol combination can be regarded as one of a kind. The language system as a whole is like a large jigsaw puzzle, and every small piece that makes it up must be specific and unique. Because the pinyin of the present invention has no symbol deformation, it represents a breakthrough in language information processing and allows speech sounds to be converted into digital elements.

Please refer to Figure 6, a relationship diagram of a preferred embodiment of the present invention. The natural language processing 610 method, system, and results of the present invention serve as the link between the human side 612 and the computer side 614. On the human side 612, the human-side result 622 of natural language processing meets human reading-comprehension needs through sound imitation in which language symbols approximate pronunciation and follow a rule of no symbol deformation. On the computer side 614, the computer-side result 624 of natural language processing uses the formal axiomatic systematization of the pinyin structure of language symbols so that the language symbol set exhibits segment-based symbol construction with no shifting and no repetition; this formal axiomatic system makes it easier for an artificial-intelligence machine on the computer side 614 to understand human natural language. In this embodiment, the symbol elements and/or symbol element groups of multiple natural-language symbol sets can be converted to a consistent form, so that the sets are made equal and the elements of their symbol elements and/or symbol element groups are gathered homogeneously and stored in a single-/multi-language shared symbol pool 620. Thus, when different natural languages are processed by the natural language processing 610 of the present invention, the corresponding symbol elements and/or symbol element groups can be retrieved from the single-/multi-language shared symbol pool 620 to achieve multi-language conversion and translation.

The phonological structure is illustrated below:

The inventor illustrates this with an embodiment; see Table 2, a comparison table for one embodiment of the present invention. The rhyme is ㄚ: Hanyu Pinyin transcribes it as a, and the pinyin of the present invention transcribes it as r. In addition, in the phrase "夕陽西下" Hanyu Pinyin has a deformation problem (the medial i changes form in the character "陽"), whereas in the pinyin of the present invention the i remains plainly visible.

[Table 2] Comparison table for one embodiment of the present invention
Tian Jing Sha: Qiu Si (天淨沙 秋思; lyrics: Ma Zhiyuan 馬致遠)
Text: 枯藤老樹昏鴉 ‖ 小橋流水人家 ‖ 古道西風瘦馬 ‖ 夕陽西下 ‖ 斷腸人在天涯
Zhuyin: ㄎㄨ ㄊㄥˊ ㄌㄠˇ ㄕㄨˋ ㄏㄨㄣ ㄧㄚ ‖ ㄒㄧㄠˇ ㄑㄧㄠˊ ㄌㄧㄡˊ ㄕㄨㄟˇ ㄖㄣˊ ㄐㄧㄚ ‖ ㄍㄨˇ ㄉㄠˋ ㄒㄧ ㄈㄥ ㄕㄡˋ ㄇㄚˇ ‖ ㄒㄧˋ ㄧㄤˊ ㄒㄧ ㄒㄧㄚˋ ‖ ㄉㄨㄢˋ ㄔㄤˊ ㄖㄣˊ ㄗㄞˋ ㄊㄧㄢ ㄧㄚˊ
Hanyu Pinyin: ku1 teng2 lao3 shu4 hun1 ya1 ‖ xiao3 qiao2 liu2 shui3 ren2 jia1 ‖ gu3 dao4 xi1 feng1 shou4 ma3 ‖ xi4 yang2 xi1 xia1 ‖ duan4 chang2 ren2 zai4 tian1 ya2
Pinyin of the present invention: kw1 tarn2 lau3 svw4 hwan1 ir1 ‖ ciau3 qiau2 liao2 svwae3 van2 jir1 ‖ gw3 dau4 ci1 farn1 svao4 mr3 ‖ ci4 iarn2 ci1 cir1 ‖ dwam4 xvarm2 van2 zai4 tiam1 ir2

The inventor illustrates this with another embodiment; see Table 3, a comparison table for another embodiment of the present invention. The rhyme is ㄝ: Hanyu Pinyin transcribes it as ê, which is inconvenient to type on a keyboard, and the pinyin of the present invention transcribes it as e.

[Table 3] Comparison table for another embodiment of the present invention
Qian Gu (千古), excerpt (lyrics: Xu Song 許嵩)
Text: 夏蟬冬雪 ‖ 不過輪迴一瞥 ‖ 悟道修煉 ‖ 不問一生緣劫
Zhuyin: ㄒㄧㄚˋ ㄔㄢˊ ㄉㄨㄥ ㄒㄩㄝˇ ‖ ㄅㄨˋ ㄍㄨㄛˋ ㄌㄨㄣˊ ㄏㄨㄟˊ ㄧˋ ㄆㄧㄝˇ ‖ ㄨˋ ㄉㄠˋ ㄒㄧㄡ ㄌㄧㄢˋ ‖ ㄅㄨˊ ㄨㄣˋ ㄧˋ ㄕㄥ ㄩㄢˊ ㄐㄧㄝˊ
Hanyu Pinyin: xia4 chan2 dong1 xuê3 ‖ bu2 guo4 lun2 hui2 yi4 piê3 ‖ wu4 dao4 xiu1 lan4 ‖ bu2 wen4 yi4 sheng1 yuan2 jiê2
Pinyin of the present invention: cir4 xvam2 dwarn1 cye3 ‖ bw2 gwo4 lwan2 hwae2 i4 pie3 ‖ w4 dau4 ciao1 liam4 ‖ bw2 wan4 i4 svarn1 yam2 jie2

The inventor illustrates this with yet another embodiment; see Table 4, a comparison table for yet another embodiment of the present invention. The rhymes are ㄟ and ㄝ: the four characters "醉", "月", "悔", and "碑" rhyme. In Hanyu Pinyin the rhyme endings look mixed, whereas in the pinyin of the present invention they are comparatively clear. In addition, "醉" and "微" in the first passage have a similar sound relationship that cannot be seen from the Hanyu Pinyin zui4 and wei2 but is immediately apparent from the present invention's zwae4 and wae2.

[Table 4] Comparison table for yet another embodiment of the present invention
Fa Ru Xue (髮如雪), excerpt (lyrics: Fang Wenshan 方文山)
Text: 紅塵醉 ‖ 微醺的歲月 ‖ 我用無悔 ‖ 刻永世愛你的碑
Zhuyin: ㄏㄨㄥˊ ㄔㄣˊ ㄗㄨㄟˋ ‖ ㄨㄟˊ ㄒㄩㄣ ㄉㄜ ㄙㄨㄟˋ ㄩㄝˋ ‖ ㄨㄛˇ ㄩㄥˋ ㄨˊ ㄏㄨㄟˇ ‖ ㄎㄜˋ ㄩㄥˇ ㄕˋ ㄞˋ ㄋㄧˇ ㄉㄜ ㄅㄟ
Hanyu Pinyin: hong2 chen2 zui4 ‖ wei2 xun1 de1 sui4 yue4 ‖ wo3 yong4 wu2 hui3 ‖ ke4 yong3 shi4 ai4 ni3 de1 bei1
Pinyin of the present invention: hwarn2 xvan2 zwae4 ‖ wae2 cyan1 du1 swae4 ye4 ‖ wo3 yarn4 w2 hwae3 ‖ ku4 yarn3 sv4 ai4 ni3 du1 bae1
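The comparison described for Table 4 can be made concrete with a small string check. The sketch below computes the longest ending shared by a group of toneless syllables and applies it to the Table 4 examples in both transcriptions. Using a shared written suffix as a stand-in for "visible rhyme" is a simplifying assumption for illustration, and the function name `common_suffix` is hypothetical.

```python
def common_suffix(words):
    """Return the longest string that every word in `words` ends with."""
    if not words:
        return ""
    shortest = min(words, key=len)
    # Try the longest candidate tail first, then shrink one letter at a time.
    for size in range(len(shortest), 0, -1):
        tail = shortest[-size:]
        if all(word.endswith(tail) for word in words):
            return tail
    return ""


# "醉" and "微" from Table 4, tone numbers removed.
print(common_suffix(["zui", "wei"]))    # Hanyu Pinyin
print(common_suffix(["zwae", "wae"]))   # pinyin of the present invention

# The four rhyming characters "醉", "月", "悔", "碑" from Table 4.
print(common_suffix(["zui", "yue", "hui", "bei"]))   # Hanyu Pinyin
print(common_suffix(["zwae", "ye", "hwae", "bae"]))  # pinyin of the present invention
```

For these inputs the Hanyu Pinyin spellings share at most a single letter, while the spellings of the present invention share "wae" for the pair and "e" for the four rhyming characters, which matches the observation made in the text.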

The inventor illustrates this with a further embodiment; see Table 5, a comparison table for a further embodiment of the present invention. The rhymes are ㄠ and ㄜ. In Hanyu Pinyin the finals of "的", "好", and "掉" are "e", "ao", and "ao", so the rhyme relationship is not visible; in the pinyin of the present invention the finals are "u", "au", and "au", and the relationship becomes visible.

[Table 5] Comparison table for a further embodiment of the present invention
Ou Ran (偶然), excerpt (lyrics: Xu Zhimo 徐志摩)
Text: 你有你的 ‖ 我有我的 ‖ 方向 ‖ 你記得也好 ‖ 最好你忘掉 ‖ 在這交會時互放的光芒
Zhuyin: ㄋㄧˇ ㄧㄡˇ ㄋㄧˇ ㄉㄜ ‖ ㄨㄛˇ ㄧㄡˇ ㄨㄛˇ ㄉㄜ ‖ ㄈㄤ ㄒㄧㄤˋ ‖ ㄋㄧˇ ㄐㄧˋ ㄉㄜˊ ㄧㄝˇ ㄏㄠˇ ‖ ㄗㄨㄟˋ ㄏㄠˇ ㄋㄧˇ ㄨㄤˋ ㄉㄧㄠˋ ‖ ㄗㄞˋ ㄓㄜˋ ㄐㄧㄠ ㄏㄨㄟˋ ㄕˊ ㄏㄨˋ ㄈㄤˋ ㄉㄜ ㄍㄨㄤ ㄌㄧㄤˋ
Hanyu Pinyin: ni3 you3 ni3 de1 ‖ wo3 you3 wo3 de1 ‖ fang1 xiang4 ‖ ni3 ji4 de2 ye3 hao3 ‖ zui4 hao3 ni3 wang4 diao4 ‖ zai4 zhe4 jiao1 hui4 shi2 hu4 fang4 de1 guang1 liang4
Pinyin of the present invention: ni3 iao3 ni3 du1 ‖ wo3 iao3 wo3 du1 ‖ farm1 ciarm4 ‖ ni3 ji4 du2 ie3 hau3 ‖ ni3 zwae4 hau3 warm4 diau4 ‖ zai4 zvu4 jiau1 hwae4 sv2 hw4 farm4 du1 gwarm1 liarm4

Please refer to Table 6, a comparison table for another preferred Chinese pinyin embodiment of the present invention. Compared with Table 1, in addition to the pinyin symbols for the initials (ㄅ to ㄙ), medials (ㄧ, ㄨ, ㄩ), and finals (ㄚ to ㄥ), it also lists the tones (tones 1 to 4) and classifies the medials (齊齒呼, 合口呼, 撮唇呼) and the finals (monophthong finals, diphthong finals, nasal finals). In short, Table 1 and Table 6 are both systems and results produced with the natural language processing method of the present invention and share its properties and characteristics (see the description above, not repeated here). Besides the examples in Tables 2 to 5, they can also be applied to: computer-readable recording media, including transcription of program code and/or conversion between audio and symbolic text; computer program products, including speech recognition, voice control, translation, speech-to-text, and/or text-to-speech; and/or natural language processing in the field of artificial intelligence, including audio integrated-circuit design and/or AI speech technology for natural language understanding, thereby addressing the shortcomings and inconveniences of current products. The present invention is not limited in this respect. The applications of the present invention also include applications of the natural language processing method, of its system, and of its results, including applications of the embodiments in the broad field of artificial-intelligence technology.

[Table 6] Comparison table for a preferred Chinese pinyin embodiment of the present invention

Initials (symbol, example character, Zhuyin):
b 玻 ㄅ | p 坡 ㄆ | m 摸 ㄇ | f 佛 ㄈ
d 得 ㄉ | t 特 ㄊ | n 訥 ㄋ | l 勒 ㄌ
g 哥 ㄍ | k 科 ㄎ | h 喝 ㄏ
j 基 ㄐ | q 欺 ㄑ | c 希 ㄒ
zv 知 ㄓ | xv 蚩 ㄔ | sv 詩 ㄕ | v 日 ㄖ
z 資 ㄗ | x 雌 ㄘ | s 思 ㄙ

Medials and finals (columns: 開口呼, no medial | 齊齒呼, i 衣 ㄧ | 合口呼, w 烏 ㄨ | 撮唇呼, y 迂 ㄩ):
Monophthong finals
r: r 啊 ㄚ | ir 呀 ㄧㄚ | wr 蛙 ㄨㄚ | (none)
o: o 喔 ㄛ | (none) | wo 窩 ㄨㄛ | (none)
u: u 鵝 ㄜ | (none) | (none) | (none)
e: e 誒 ㄝ | ie 耶 ㄧㄝ | (none) | ye 約 ㄩㄝ
Diphthong finals
ai: ai 哀 ㄞ | (none) | wai 歪 ㄨㄞ | (none)
ae: ae 欸 ㄟ | (none) | wae 威 ㄨㄟ | (none)
au: au 熬 ㄠ | iau 腰 ㄧㄠ | (none) | (none)
ao: ao 歐 ㄡ | iao 懮 ㄧㄡ | (none) | (none)
Nasal finals
am: am 安 ㄢ | iam 烟 ㄧㄢ | wam 彎 ㄨㄢ | yam 冤 ㄩㄢ
an: an 恩 ㄣ | ian 因 ㄧㄣ | wan 溫 ㄨㄣ | yan 暈 ㄩㄣ
arm: arm 昂 ㄤ | iarm 央 ㄧㄤ | warm 汪 ㄨㄤ | (none)
arn: arn 亨 ㄥ | iarn 英 ㄧㄥ | warn 翁 ㄨㄥ | yarn 雍 ㄩㄥ

Tones: tones 1, 2, 3, 4
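Because every compound entry in Table 6 is the concatenation of its parts (for example ㄧㄚ gives i + r = ir), the table can be applied mechanically. The following is a minimal sketch, not the patented implementation: it converts one Zhuyin syllable into the symbols of Table 6 plus a tone number. The function name `zhuyin_to_symbols` is hypothetical, and treating an unmarked syllable as tone 1 is an assumption that follows the way Tables 2 to 5 are written.

```python
# Zhuyin letter -> symbol group, as listed in Table 6.
INITIALS = {
    "ㄅ": "b", "ㄆ": "p", "ㄇ": "m", "ㄈ": "f",
    "ㄉ": "d", "ㄊ": "t", "ㄋ": "n", "ㄌ": "l",
    "ㄍ": "g", "ㄎ": "k", "ㄏ": "h",
    "ㄐ": "j", "ㄑ": "q", "ㄒ": "c",
    "ㄓ": "zv", "ㄔ": "xv", "ㄕ": "sv", "ㄖ": "v",
    "ㄗ": "z", "ㄘ": "x", "ㄙ": "s",
}
MEDIALS = {"ㄧ": "i", "ㄨ": "w", "ㄩ": "y"}
FINALS = {
    "ㄚ": "r", "ㄛ": "o", "ㄜ": "u", "ㄝ": "e",
    "ㄞ": "ai", "ㄟ": "ae", "ㄠ": "au", "ㄡ": "ao",
    "ㄢ": "am", "ㄣ": "an", "ㄤ": "arm", "ㄥ": "arn",
}
SYMBOLS = {**INITIALS, **MEDIALS, **FINALS}

# Zhuyin tone marks; an unmarked syllable is treated as tone 1 (assumption).
TONE_MARKS = {"ˊ": 2, "ˇ": 3, "ˋ": 4}


def zhuyin_to_symbols(syllable):
    """Convert one Zhuyin syllable to Table 6 symbols plus a tone digit."""
    tone = 1
    parts = []
    for char in syllable:
        if char in TONE_MARKS:
            tone = TONE_MARKS[char]
        elif char in SYMBOLS:
            parts.append(SYMBOLS[char])
        else:
            raise ValueError(f"character {char!r} is not listed in Table 6")
    return "".join(parts) + str(tone)


# Syllables taken from Tables 2, 4, and 5.
for zhuyin in ["ㄎㄨ", "ㄕㄨㄟˇ", "ㄗㄨㄟˋ", "ㄈㄤ"]:
    print(zhuyin, "->", zhuyin_to_symbols(zhuyin))
```

A plain per-letter lookup is enough here because Table 6 assigns one symbol or symbol group to each Zhuyin letter; no context-dependent respelling is needed.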

Although the present invention has been disclosed through the preferred embodiments above, they are not intended to limit the present invention. Anyone skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.

102: human side
104: natural language
106: formal axiomatic systematization of the pinyin structure of language symbols
108: system
110: application
112: artificial intelligence
114: computer side
160 to 190: steps
202: definition of language
204: academic integration
206: timeline
208: order of cognitive stages
210: cognitive framework
212: Chinese speech sound sequence
302: conventional semiotics part
304: revised semiotics part
402: natural-law phenomena of pinyin structure
404: man-made processing of pinyin structure
502: scope of formalization
504: scope of axiomatization
506: scope of formal axiomatic systematization
610: natural language processing
612: human side
614: computer side
620: single-/multi-language shared symbol pool
622: human-side result of natural language processing
624: computer-side result of natural language processing

Figure 1 is a schematic diagram of the relationship between a preferred natural language processing method of the present invention and its system and applications;
Figure 1A is a schematic flowchart of a preferred natural language processing method of the present invention;
Figure 2 is a schematic diagram of the development of conventional research on the human cognitive framework for natural language;
Figure 2A is a schematic diagram comparing the treatment of modern linguistics and Chinese linguistics;
Figure 2B is a schematic diagram relating the timeline to the evolution of Chinese;
Figure 2C is a concentric-target diagram of the layers of language cognition;
Figure 2D is a schematic diagram of a cognitive framework;
Figure 2E is a schematic diagram of the Chinese speech sound sequence;
Figure 3 is a schematic diagram of the inventive steps of a preferred revision of Saussurean semiotics according to the present invention;
Figure 4 is a schematic diagram of a preferred linguistic axiom of the present invention, segment-based symbol construction with no shifting and no repetition;
Figure 5 is a set-theoretic schematic diagram of a preferred formal axiomatic systematization of the pinyin structure of language symbols according to the present invention; and
Figure 6 is a relationship diagram of a preferred embodiment of the present invention.

160 to 190: steps

Claims (11)

1. A natural language processing method that implements formal axiomatic systematization of the pinyin structure of natural-language symbols, comprising: comparing one or more natural languages diachronically and/or synchronically to grasp the natural-language cognitive framework that humans possess, wherein the order of cognitive levels in that framework is semantics (core) > grammar > speech > written language; revising the signified of the natural-language sign, wherein the signifier of the natural-language sign comprises a phonetic signifier and a written signifier, and the phonetic signifier and the written signifier are distinguished so that signified = phonetic signifier > written signifier; according to the rule that the pinyin structure of the language's sound sequence is segment-based symbol construction with no shifting and no repetition, completing: formalization; the right to select the symbol elements of a custom language symbol set; the right to design the permutations and combinations of the symbol elements of the custom language symbol set; and display rules for the symbol elements and/or symbol element groups that conform to the natural-law characteristics of the speech sound sequence; and formally and axiomatically systematizing the pinyin structure of the natural-language symbols into a language form using one symbol or one symbol group per segment; wherein the phonetic signifier is linear and auditory in nature, unfolds only in time, and has characteristics borrowed from time: (a) it embodies a length, and (b) this length can be measured in only one dimension: it is a line; whereby its linearity and sound sequence are integrated and a sound-sequence pinyin structure is proposed, so that the sound sequence of the natural language is characterized by segment-based symbol construction with no shifting and no repetition.

2. The natural language processing method of claim 1, wherein revising the signified of the natural-language sign includes adding new written-signifier custom-set symbol elements.

3. The natural language processing method of claim 1, wherein the "form" aspect (形) of formalizing the symbol pinyin structure comprises the right to select symbol elements, the selection including symbol elements used in computer science; and the "pattern" aspect (式) of formalizing the symbol pinyin structure comprises: symbol elements and/or symbol element groups; conformity with phonemes; and the right to design the permutations and combinations of recognizable symbol groups.

4. The natural language processing method of claim 1, further comprising: making all symbol elements and/or element groups of the symbol set contained in the language form consistent (compatible); taking the language form as the axiom of the symbol set, enforcing the independence of each symbol element and/or element group, and requiring that the final segment always carry a symbol element and/or element group that closes the overall construction, so that the construction is independent, unique, and closed; and making the language form conform to segment-based symbol construction with no shifting and no repetition, one characteristic of which is that the pinyin structure exhibits phonological relations.

5. The natural language processing method of claim 1, wherein the formal axiomatic systematization of the pinyin structure of natural-language symbols comprises man-made-system technical methods and the application of their results.

6. The natural language processing method of claim 1, wherein the formal axiomatic systematization of the pinyin structure of natural-language symbols comprises language symbol formalization, axiomatization, and systematization of the formal axiomatization, wherein the symbol elements and/or element groups further include equivalently converted symbol elements or audio forms.

7. The natural language processing method of claim 1, wherein the symbol elements and/or element groups further include: audio symbols, text, and/or program code of different languages translated through consistent symbols; and homogeneous translation, which includes audio symbols, text, and/or program code in which the syntagmatic relations (rapports syntagmatiques) and/or associative relations (rapports associatifs) of natural language are processed with reinforcement from deep learning.

8. The natural language processing method of claim 1, further comprising converting the symbol elements and/or symbol element groups of multiple natural-language symbol sets to a consistent form, so that the sets are made equal and the elements of their symbol elements and/or symbol element groups are gathered homogeneously, to achieve multi-language conversion and translation.

9. A natural language system using the natural language processing method of claim 1, comprising: a set produced by the formal axiomatic systematization of the pinyin structure of language symbols and/or a man-made system invented according to the natural laws of the language's sound sequence.

10. An application using the natural language processing method of claim 1, comprising applications of the processing method, of its system, and of its results, and further comprising applications of the embodiments of the present invention in the broad field of artificial-intelligence technology.

11. An application of the results of the natural language processing method of claim 1 to: computer-readable recording media, including transcription of program code and/or conversion between audio and symbolic text; computer program products, including speech recognition, voice control, translation, speech-to-text, and/or text-to-speech; and/or natural language processing in the field of artificial intelligence, including audio integrated-circuit design and/or AI speech technology for natural language understanding, including audio chips, software/hardware/firmware design, quantum sequencing, design of various electronic and mechanical materials, factor engineering sequences, and factor-method information processing.
TW111135010A 2022-09-15 2022-09-15 Natural language processing method and its system and application TWI834293B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
TW111135010A TWI834293B (en) 2022-09-15 2022-09-15 Natural language processing method and its system and application
CN202311098858.1A CN117709350A (en) 2022-09-15 2023-08-29 Natural language processing method, system and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW111135010A TWI834293B (en) 2022-09-15 2022-09-15 Natural language processing method and its system and application

Publications (2)

Publication Number Publication Date
TWI834293B (en) 2024-03-01
TW202414270A (en) 2024-04-01

Family

ID=90153976

Family Applications (1)

Application Number Priority Date Filing Date Title
TW111135010A TWI834293B (en) 2022-09-15 2022-09-15 Natural language processing method and its system and application

Country Status (2)

Country Link
CN (1) CN117709350A (en)
TW (1) TWI834293B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112749531A (en) * 2021-01-13 2021-05-04 北京声智科技有限公司 Text processing method and device, computer equipment and computer readable storage medium
TW202121230A (en) * 2019-11-20 2021-06-01 中央研究院 Natural language processing method and computing apparatus thereof
CN113626563A (en) * 2021-08-30 2021-11-09 京东方科技集团股份有限公司 Method and electronic equipment for training natural language processing model and natural language processing
US20210397791A1 (en) * 2020-06-19 2021-12-23 Beijing Baidu Netcom Science And Technology Co., Ltd. Language model training method, apparatus, electronic device and readable storage medium

Also Published As

Publication number Publication date
CN117709350A (en) 2024-03-15
TW202414270A (en) 2024-04-01

Similar Documents

Publication Publication Date Title
US11488577B2 (en) Training method and apparatus for a speech synthesis model, and storage medium
CN113811946A (en) End-to-end automatic speech recognition of digital sequences
Phan Lacquered words: the evolution of Vietnamese under Sinitic influences from the 1st century BCE through the 17th century CE
Huang et al. Pretraining techniques for sequence-to-sequence voice conversion
WO2005121993A1 (en) Application system of multidimentional chinese learning
Sanaullah et al. A real-time automatic translation of text to sign language.
Staal Artificial languages across sciences and civilizations
TWI834293B (en) Natural language processing method and its system and application
CN117852528A (en) Error correction method and system of large language model fusing rich semantic information
Zhao et al. An online database of phonological representations for Mandarin Chinese
CN101639735B (en) Method for inputting Chinese character with sound and tone definition by using simplified pinyin
Gu The" Zhouyi"(Book of Changes) as an Open Classic: A Semiotic Analysis of Its System of Representation
Pubadi et al. A focus on codemixing and codeswitching in Tamil speech to text
Zhang et al. Natural Language Processing and Chinese Computing: 7th CCF International Conference, NLPCC 2018, Hohhot, China, August 26–30, 2018, Proceedings, Part I
CN107391464A (en) New standard Chinese information ASCII gathers code
CN107315725A (en) Standard Chinese information ASCII gathers code
Park et al. Hangeul (Korean alphabet) Notation Method based on Hunminjeongeum, for the Pronunciations of [sh],[zh], and [ch] in Chinese Language
CN106951402A (en) New standard Chinese information ASCII systems code
Keszthelyi Virtual assistants: 21st century Towers of Babel?
US11893349B2 (en) Systems and methods for generating locale-specific phonetic spelling variations
Lin Lost and Found: Issues of Translating Japanophone Taiwanese Literature
Xu An Analysis of Duan Yucai’s Theory of Shengyi tongyuan in his Annotations to the Shuowen jiezi
TW201913303A (en) Input device and method for setting key corresponding to phonetic input thereof
CN105389017A (en) Tonetic Chinese phonetic four-tone inputting and writing printing method
Werner Indigenous Language Revitalization using Virtual Reality