CN110457551B - Method for constructing semantic recursion representation system of natural language - Google Patents

Info

Publication number
CN110457551B
Authority
CN
China
Prior art keywords
semantic
basic
objects
semantic object
definition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910750547.6A
Other languages
Chinese (zh)
Other versions
CN110457551A (en
Inventor
梁冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201910750547.6A priority Critical patent/CN110457551B/en
Publication of CN110457551A publication Critical patent/CN110457551A/en
Application granted granted Critical
Publication of CN110457551B publication Critical patent/CN110457551B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification

Abstract

The embodiments of the present disclosure disclose a method for constructing a semantic recursive representation system of a natural language. One embodiment of the method comprises: selecting a word dictionary of a predetermined natural language; generating a basic semantic item table; generating a compound semantic item table; selecting a suitable semantic classification system from the additional classifications of the semantic items; executing the creation method for basic semantic objects; executing the object structure processing method for basic semantic objects; executing the semantic definition processing method for basic semantic objects; executing the genus object processing method for basic semantic objects; executing the additional classification processing method for basic semantic objects; executing the creation method for compound semantic objects; and executing the additional classification processing method for compound semantic objects. The semantic objects constructed by this embodiment are unique, which solves the problems of semantic representation and glyph ambiguity, and they are capable of semantic self-representation and self-interpretation.

Description

Method for constructing semantic recursion representation system of natural language
Technical Field
The embodiments of the present disclosure relate to the technical field of natural language processing, and in particular to a method for constructing a semantic recursive representation system of a natural language.
Background
Natural language processing (NLP) is the processing, by computer, of information such as the form, sound and meaning of natural language. Because natural language semantics are ambiguous, context-dependent, sensitive to the setting in which they are expressed, and backed by a very broad body of semantic knowledge, semantic representation and processing have always been difficult problems in the field of natural language processing.
The Chinese Information Processing Development Report (2016) (Chinese Information Processing Society of China) states on page 4, line 30: "What representation semantics should take has always plagued researchers."
The Knowledge Graph Development Report (2018) (Language and Knowledge Computing Committee of the Chinese Information Processing Society of China) states on page 2, line 8, that the Semantic Web, frame languages and production rules "lack a rigorous semantic theoretical model and formalized semantic definitions."
A glyph is a word of a natural language as written or printed on paper, or as displayed graphically on a computer screen. It is common in natural languages for one glyph to have several meanings, several pronunciations and several parts of speech.
The general flow of natural language processing includes word segmentation, lexical analysis, syntactic analysis, semantic analysis and other steps. Each step involves semantic processing, and if each step operates on glyphs, it runs into the problems of glyph ambiguity and polysemy.
Thus, however a semantic processing model is constructed, if its processing elements are glyphs, the model suffers from glyph ambiguity when expressing and processing semantics.
Disclosure of Invention
Some embodiments of the present disclosure provide a method for constructing a semantic recursive representation system of a natural language, including: step 1, selecting a word dictionary of a predetermined natural language, wherein the word dictionary comprises semantic items, and each semantic item comprises a glyph, a pronunciation and a meaning; step 2, generating a basic semantic item table according to the definition of the basic semantic object; step 3, generating a compound semantic item table according to the definition of the compound semantic object; step 4, according to the selected object-oriented language and the formal definition of the semantic object, selecting a suitable semantic classification system from the additional classifications of the semantic items, and writing the software code of the semantic object classes of the semantic recursive representation system of the natural language; step 5, executing the creation method for basic semantic objects; step 6, executing the object structure processing method for basic semantic objects; step 7, executing the semantic definition processing method for basic semantic objects; step 8, executing the genus object processing method for basic semantic objects; step 9, executing the additional classification processing method for basic semantic objects; step 10, executing the creation method for compound semantic objects; and step 11, executing the additional classification processing method for compound semantic objects.
With the construction method provided by some embodiments of the present disclosure, the semantic objects constructed through the above steps are unique, which solves both the problem of representing natural language semantics in a computer and the problem of glyph ambiguity. Because a semantic object is defined recursively with glyph, pronunciation and meaning as a unified whole, it is capable of semantic self-representation and self-interpretation.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 illustrates an architecture diagram of a semantic recursive representation system of natural language according to some embodiments of the present disclosure;
FIG. 2 is a structural graphical representation of a semantic object according to some embodiments of the present disclosure;
FIG. 3 is a class framework diagram of a general semantic system according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of a method of constructing a semantic recursive representation system of a natural language according to an embodiment of the present disclosure;
FIG. 5 is a flow chart of the method for creating basic semantic objects according to an embodiment of the present disclosure;
FIG. 6 is a flow chart of the object structure processing method for basic semantic objects;
FIG. 7 is a flow chart of the semantic definition processing method for basic semantic objects;
FIG. 8 is a flow chart of the genus object processing method for basic semantic objects;
FIG. 9 is a flow chart of the additional classification processing method for basic semantic objects;
FIG. 10 is a flow chart of the method for creating compound semantic objects;
FIG. 11 is a flow chart of the additional classification processing method for compound semantic objects;
FIG. 12 is a diagram of a semantic classification framework according to an embodiment of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an architecture diagram of a semantic recursive representation system of natural language according to some embodiments of the present disclosure.
As shown in FIG. 1, the semantic recursive representation system 100 may include a basic semantic object module 101 and a compound semantic object module 102. Basic semantic objects and compound semantic objects are recursively defined in the semantic recursive representation system and are unique. The natural language may be any natural language in the world that has a writing system.
To solve the problems of glyph ambiguity and semantic representation in natural language processing, the present disclosure provides a semantic recursive representation system, and a method for constructing it, that unifies the glyph, pronunciation and meaning of natural language words. The result of representing each meaning of a word in the computer by this method is called a semantic object. Semantic objects can be compared for equality, which guarantees that each semantic object is unique. A semantic object system for a natural language is built by constructing the unique semantic objects of that language's vocabulary.
Natural language writing is a natural product of human understanding of the world, and many written natural languages exist today, such as Chinese, English, German and Japanese. Inside a computer, written text is represented by characters. Computers support many character sets; a common one is the Unicode character set (which corresponds to the international standard ISO/IEC 10646), and it includes the characters of many scripts as well as phonetic symbols. Characters are the most basic objects of computer processing.
Glyphs are the visible graphical form of natural language text, and a computer can render the characters of natural language text graphically. A glyph is the external representation of characters, and characters are the internal representation of a glyph.
A pronunciation is represented externally by phonetic symbols. The computer can render the pronunciation of natural language text graphically.
Words here means the vocabulary of a natural language, including the words of alphabetic writing systems and the single characters and character groups of Chinese.
A word of an alphabetic writing system can be represented in the computer by a character string. The word units of Chinese comprise single Chinese characters and character groups made up of Chinese characters; a single Chinese character is one character in the computer, and a character group is a character string. A glyph can therefore be represented inside the computer by a character string.
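To make the relationship between glyphs, characters and character strings concrete, the following is a minimal Python sketch. The example strings and the helper name `code_points` are illustrative assumptions and do not come from the disclosure.

```python
# Illustrative sketch: in Python 3 every str is a sequence of Unicode
# code points, so a glyph can be held as a character string.

def code_points(text: str) -> list:
    """Return the Unicode code point of each character in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


english_word = "sofa"    # a word of an alphabetic script: a string of 4 characters
chinese_char = "语"      # a single Chinese character: a string of length 1
chinese_group = "语义"   # a Chinese character group: a string of length 2

for glyph in (english_word, chinese_char, chinese_group):
    print(glyph, len(glyph), code_points(glyph))
```

The same string type therefore covers alphabetic words, single Chinese characters and character groups.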
Words in natural language are generally ambiguous: a word often has several meanings, each of which is a semantic item. A word therefore has several semantic items in a word dictionary, and each semantic item is an itemized description of one conceptual meaning of the word. The dictionary in turn explains and defines its semantic items using words from the dictionary itself.
A word sense is the collection of a word's semantic items; in a specific context it is one particular semantic item of the word. A semantic item may be referred to as a semantic object.
Sentences, articles and books can be regarded as sets of natural language glyphs.
Semantics is the sum of all semantic objects and the relationships between them; because those relationships are themselves described using semantic objects, semantic object relationships are also semantic objects.
The semantics of any natural language can be divided into two parts: the semantics of the basic vocabulary of the language, called basic semantics, and the semantics formed by combining basic semantics, called compound semantics.
Basic semantics are the most basic and stable common-sense semantics in a natural language, and are general.
Compound semantics are basic semantics organized by lexical and syntactic means. Because the combined meanings constrain one another, compound semantics are specific.
The number of basic semantics is limited. They can be regarded as the semantic items in a word dictionary, although a word dictionary also includes some common compound semantics in the form of idioms, short phrases and the like. The number of compound semantics, by contrast, is unlimited; examples are everyday conversations, published books and the text content of websites.
The boundary between basic semantics and compound semantics is not strict.
A class is the cognitive result of abstracting things in some way. Objects are the elements of a class, and all objects in a class share the same characteristics.
Human cognition and thinking arise from classification, and thinking is expressed and recorded mainly in natural language words.
The meaning of a word arises from a classification of knowledge about the real world. The meaning of a word is both a semantic object of one semantic class and itself another, more abstract, semantic class.
A formal, recursive definition of semantic objects is given below, based on these semantic features, on the relationship between classes and objects in natural language, and on the division into basic and compound semantics.
In some embodiments, the symbols used in the formal definition of semantic objects are described first.
A pair of symbols "( )" indicates that the objects inside form a set, that is, the objects are unordered. Multiple objects are separated by spaces. (X) denotes the set consisting of the elements of the set X.
A pair of symbols "< >" indicates that the objects inside form an ordered set, that is, the objects are ordered, and they are separated by commas ",".
A pair of symbols "[ ]" denotes one object; [XXX] denotes the object XXX.
A pair of symbols "{ }" denotes a class; {XXX class} denotes the class XXX.
The symbol "∈" means "belongs to"; x ∈ X indicates that the object x belongs to the set X.
The symbol "::=" means "is defined as".
The symbol "|" denotes the "or" operation on sets.
The symbol "∪" denotes the union of sets.
Formal definition of natural language semantics:
Suppose that:
J denotes the set of all basic semantic objects already defined;
D denotes the set of all compound semantic objects already defined;
Z denotes the set of all glyphs, and z denotes one glyph;
P denotes the set of phonetic symbol strings of all pronunciations, and p denotes the phonetic symbol string of one word.
{semantic object class} ::= {basic semantic object class} ∪ {compound semantic object class}
Note: the semantic object class is composed of the basic semantic object class and the compound semantic object class.
{basic semantic object class} ::= {([object structure] [semantic definition] [genus object] (X))}
where
[object structure] ::= [<c1, c2, …, cn>] | [glyph-pronunciation structure], c1, c2, …, cn ∈ J;
[glyph-pronunciation structure] ::= [<z, p>], z ∈ Z, p ∈ P;
[semantic definition] ::= [<j1, j2, …, jn>], j1, j2, …, jn ∈ J;
[genus object] ::= <a>, a ∈ J;
(X) ::= (x1 x2 … xn), x1, x2, …, xn ∈ J.
[object structure], [semantic definition] and [genus object] are the three basic attribute objects and cannot be null; (X) is optional. These four objects are also called the attributes of the semantic object.
On the basis of the definition of the basic semantic object class, the formal definition of the compound semantic object class is given:
{compound semantic object class} ::= {([compound object structure] (Y))}
where
[compound object structure] ::= [<d1, d2, …, dn>], d1, d2, …, dn ∈ J ∪ D;
(Y) ::= (y1 y2 … yn), y1, y2, …, yn ∈ J ∪ D.
The [compound object structure] cannot be empty; (Y) is optional.
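The formal definition above can be mirrored by a small set of data types. The following Python sketch is one possible reading, not the disclosure's implementation: the class and field names are hypothetical, plain strings are allowed as stand-ins for semantic objects that have not been defined yet (an assumption borrowed from the two-process construction described later), and the pronunciation respelling is made up.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class GlyphPronunciation:
    """[<z, p>]: a glyph z paired with one of its pronunciations p."""
    glyph: str
    pronunciation: str


@dataclass(frozen=True)
class BasicSemanticObject:
    """([object structure] [semantic definition] [genus object] (X))."""
    # Elements are basic semantic objects, glyph-pronunciation pairs, or
    # (assumption) placeholder strings for objects not defined yet.
    object_structure: Tuple[Union["BasicSemanticObject", GlyphPronunciation, str], ...]
    semantic_definition: Tuple[Union["BasicSemanticObject", str], ...]
    genus: Union["BasicSemanticObject", str]
    extra_classes: FrozenSet[Union["BasicSemanticObject", str]] = frozenset()  # (X)

    def __post_init__(self) -> None:
        # The three basic attributes cannot be null; (X) may stay empty.
        if not (self.object_structure and self.semantic_definition and self.genus):
            raise ValueError("object structure, semantic definition and genus are required")


@dataclass(frozen=True)
class CompoundSemanticObject:
    """([compound object structure] (Y))."""
    compound_structure: Tuple[Union[BasicSemanticObject, "CompoundSemanticObject", str], ...]
    extra_classes: FrozenSet[Union[BasicSemanticObject, "CompoundSemanticObject", str]] = frozenset()  # (Y)

    def __post_init__(self) -> None:
        if not self.compound_structure:
            raise ValueError("compound object structure cannot be empty")


coffee = BasicSemanticObject(
    object_structure=(GlyphPronunciation("coffee", "KOF-ee"),),  # made-up respelling
    semantic_definition=("drink", "made", "from", "roasted", "beans"),
    genus="drink",
)
cup_of_coffee = CompoundSemanticObject(("a", "cup", "of", coffee))
```

Frozen dataclasses are used so that objects are immutable and hashable, which makes them usable as set members in (X) and (Y).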
Description of the attributes of the basic semantic object:
The [object structure] is an ordered set consisting of basic semantic objects or [glyph-pronunciation structures]. It describes the constituent structure of the semantic object.
The [glyph-pronunciation structure] is an ordered pair of a glyph and a pronunciation of that glyph, called a glyph-pronunciation pair for short.
For example:
Glyphs are written in a pair of single quotation marks ' '; semantic objects are written in a pair of square brackets '[ ]'.
'semantics' is the glyph representation of the English semantic object [semantics]; the phonetic symbol string shown in Figure BDA0002167033500000061 (in which the symbol shown in Figure BDA0002167033500000062 is the stress mark) is the character representation of the pronunciation of [semantics].
<'semantics', (the phonetic symbol string shown in Figure BDA0002167033500000063)> is an English glyph-pronunciation pair.
'沙发' is the glyph representation of the Chinese semantic object [sofa]; 'shāfā' is the character representation of the pronunciation of [sofa] in Chinese. (Note that the Chinese word for sofa is a transliteration whose two characters form a single indivisible word.)
<'沙发', 'shāfā'> is a Chinese glyph-pronunciation pair.
The Chinese word '语义' (semantics) consists of two semantic objects, [语] (language) and [义] (meaning).
The object structure of [语义] is [<[语], [义]>], where
the object structure of [语] is [<'语', 'yǔ'>];
the object structure of [义] is [<'义', 'yì'>].
Many words in English and German are likewise composed of simpler words.
Such as: free is called as beam + room
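The recursive nature of the object structure can be illustrated by walking a structure down to its glyph-pronunciation pairs. The Python sketch below is a minimal illustration under stated assumptions: the class names and the `surface` helper are hypothetical, and the glyphs '语' and '义' and the pinyin 'yì' are assumed to be the original Chinese text of the [semantics] example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class GlyphPronunciation:
    glyph: str
    pronunciation: str


@dataclass(frozen=True)
class SemanticObject:
    # Ordered set of sub-objects or glyph-pronunciation pairs.
    object_structure: Tuple[Union["SemanticObject", GlyphPronunciation], ...]


def surface(obj: SemanticObject) -> Tuple[str, str]:
    """Recursively rebuild the glyph and pronunciation of a semantic object."""
    glyphs, sounds = [], []
    for part in obj.object_structure:
        if isinstance(part, GlyphPronunciation):
            glyphs.append(part.glyph)
            sounds.append(part.pronunciation)
        else:
            glyph, sound = surface(part)  # descend into the sub-object
            glyphs.append(glyph)
            sounds.append(sound)
    return "".join(glyphs), " ".join(sounds)


yu = SemanticObject((GlyphPronunciation("语", "yǔ"),))
yi = SemanticObject((GlyphPronunciation("义", "yì"),))
semantics = SemanticObject((yu, yi))

glyph, pronunciation = surface(semantics)
print(glyph, pronunciation)
```

The recursion bottoms out at glyph-pronunciation pairs, which is why they serve as the initial structure of every object structure.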
The [semantic definition] is an ordered set of basic semantic objects. It describes the definition structure of the semantic object and expresses the definitional relationship between this semantic object and other semantic objects.
For example:
The semantic definition of [semantics] is: <[language], [of], [meaning]>.
The [genus object] is a basic semantic object that is the direct superordinate concept of the semantic object. It specifies the meaning classification of the semantic object and establishes the superordinate-subordinate conceptual relationship between two semantic objects. It thus describes the meaning classification relationships between semantic objects; the meaning classification of a concept is an inherent attribute of semantics.
For example:
The genus object of [semantics] is [concept]; 'semantics' is a word with a single meaning. The glyph 'apple', however, is polysemous. If one meaning of 'apple' is 'the fruit of the apple tree', then the genus object of that [apple] is [fruit]. Example sentence: 'This apple is very sweet.'
If one meaning of 'apple' is 'Apple, the American company', then the genus object of that [apple] is [manufacturer]. Example sentence: 'Samsung was recently sued by Apple over patents.'
If one meaning of 'apple' is 'a mobile phone made by Apple', then the genus object of that [apple] is [mobile phone]. Example sentence: 'The Apple gets hot when playing games for a long time.'
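The 'apple' example shows one glyph mapping to several semantic objects, each with its own genus object. A minimal Python sketch of that one-to-many relationship follows; the `Sense` type, the `genus_objects` helper and the wording of the definitions are hypothetical and only paraphrase the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Tuple


@dataclass(frozen=True)
class Sense:
    glyph: str
    definition: Tuple[str, ...]
    genus: str


SENSES = [
    Sense("apple", ("fruit", "of", "apple tree"), "fruit"),
    Sense("apple", ("American", "company"), "manufacturer"),
    Sense("apple", ("mobile phone", "made by", "Apple"), "mobile phone"),
]

# Index the senses by glyph: the glyph alone is ambiguous, the sense is not.
by_glyph: DefaultDict[str, List[Sense]] = defaultdict(list)
for sense in SENSES:
    by_glyph[sense.glyph].append(sense)


def genus_objects(glyph: str) -> List[str]:
    """Return the genus object of every sense that shares this glyph."""
    return [s.genus for s in by_glyph.get(glyph, [])]


print(genus_objects("apple"))
```

A model that processes only the glyph 'apple' cannot tell these three entries apart, whereas each `Sense` is a distinct value.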
(X) is a set of basic semantic objects that describes the additional classifications of the basic semantic object.
The genus object is the inherent classification of a semantic object, but a semantic object also needs additional classifications, such as part-of-speech classification, sentiment classification, corpus classification, and semantic classifications for other purposes. The additional classifications are a generalization of these other classifications.
For example:
The semantic definition of [secure] is: <[stable], [ground], [life]>. If the additional classifications are a part-of-speech classification and a corpus classification, then the additional classification of [secure] is (X) = ([verb] [written language]).
It follows from the definition of the {basic semantic object class} that a basic word yields as many basic semantic objects as it has semantic items: two semantic objects are different whenever their basic attributes [object structure] and [semantic definition] are not both identical. Semantic classifications are themselves expressed using semantic objects.
Description of the attributes of the compound semantic object:
The [compound object structure] is an ordered set composed of already defined basic semantic objects or compound semantic objects.
For example, 'Chinese People's Liberation Army' can be defined as a compound semantic object [Chinese People's Liberation Army], whose compound object structure is: <[China], [people], [liberation army]>.
As another example, [a cup of coffee] is a compound semantic object in English.
(Y) is a set of basic semantic objects or compound semantic objects that describes the various additional classifications of the compound semantic object.
As the definition of semantic objects shows, given the words of a natural language, the glyph characters and pronunciation characters of those words can be used to construct the semantic object system of that language.
As a technical effect, the definition of the basic semantic object makes it possible to judge whether two semantic objects are the same: two basic semantic objects are identical if and only if their [object structure] and [semantic definition] are both identical.
A compound semantic object is composed of semantic objects, so to judge whether two compound semantic objects are the same it is only necessary to judge whether all the semantic objects in their [compound object structure] are the same.
Being able to judge whether semantic objects are the same guarantees that each semantic object is represented uniquely in the computer, and representing semantics by semantic objects solves the ambiguity problem that arises when semantics are represented by glyphs.
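The identity rule for basic semantic objects can be sketched in Python by excluding the genus object and the additional classifications from comparison. The class names, the `Registry` helper and the sample data below are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BasicSemanticObject:
    object_structure: tuple
    semantic_definition: tuple
    # compare=False keeps these two attributes out of __eq__ and __hash__,
    # so identity depends only on structure and definition.
    genus: str = field(compare=False)
    extra_classes: frozenset = field(default=frozenset(), compare=False)


class Registry:
    """Keeps exactly one stored instance per distinct semantic object."""

    def __init__(self) -> None:
        self._objects = {}

    def intern(self, obj: BasicSemanticObject) -> BasicSemanticObject:
        # setdefault returns the already stored equal object, if there is one.
        return self._objects.setdefault(obj, obj)

    def __len__(self) -> int:
        return len(self._objects)


pair = ("apple", "AP-ul")  # made-up pronunciation respelling
fruit = BasicSemanticObject((pair,), ("fruit", "of", "apple tree"), genus="fruit")
same = BasicSemanticObject((pair,), ("fruit", "of", "apple tree"), genus="food")
company = BasicSemanticObject((pair,), ("American", "company"), genus="manufacturer")

registry = Registry()
stored = [registry.intern(obj) for obj in (fruit, same, company)]
```

Here `fruit` and `same` collapse into one stored object because their structure and definition match, while `company` shares the glyph but remains a separate object.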
The (X) and (Y) attributes of a semantic object are extensible and hold the various additional semantic classifications; they are an important technical means for constructing semantic models and are highly general.
The components defined for a semantic object can be drawn as a structure diagram that shows the compositional relationships between the attributes of the semantic object.
With continued reference to FIG. 2, FIG. 2 is a structural graphical representation of a semantic object according to some embodiments of the present disclosure.
Following the formal recursive definition of the semantic object, the recursive components are treated as constituent modules, the structural relationships of the defined modules and the interactions between them are described, and an abstract structural device for semantic objects is thereby provided.
It should be noted that in all embodiments of the present disclosure, each module, for example, each module in fig. 2, may be implemented as a module of a chip. Thus, interaction between modules may be performed by data communication between chip modules.
As shown, A → B represents a compositional relationship, i.e., A is an element in B.
Each block represents a building block.
1. The basic semantic object module 201: a set comprising the object structure module 202, the semantic definition module 203, the genus object module 204 and the basic semantic object classification (X) module 205. The object structure module 202, the semantic definition module 203, the genus object module 204 and the basic semantic object classification module 205 are in turn composed of basic semantic objects, which embodies the structural recursion.
2. The glyph-pronunciation pair module 206: given a set of glyphs, each glyph is combined with each of its pronunciations to form a glyph-pronunciation pair. The glyph-pronunciation pair module 206 is the collection of glyph-pronunciation pairs of a natural language. It is the starting point of semantic object definition: the construction of semantic objects begins from glyph-pronunciation pairs.
3. The object structure module 202: describes the constituent structure of the basic semantic object and is an ordered set. Its elements are either basic semantic objects or glyph-pronunciation pairs. The glyph-pronunciation pair is the initial structure of the object structure.
4. The semantic definition module 203: describes the definition structure of the basic semantic object and is an ordered set of basic semantic objects.
5. The genus object module 204: describes the superordinate concept of the basic semantic object being defined and is itself a basic semantic object.
6. The basic semantic object classification module 205: a set of basic semantic objects used for the additional classification of the basic semantic object.
7. The compound semantic object module 207: a set comprising the compound object structure module 208 and the compound semantic object classification module 209.
8. The compound object structure module 208: an ordered set of basic semantic objects or compound semantic objects that describes the constituent structure of the compound semantic object being defined.
9. The compound semantic object classification module 209: a set of basic semantic objects or compound semantic objects that describes the additional classifications of the compound semantic object being defined.
As the structure of the semantic objects shows, the semantic object system is composed of basic semantic objects and compound semantic objects.
Basic semantic objects take glyph-pronunciation pairs as their initial elements. Through mutual reference and interaction between the modules, the structural relationships, definitional relationships and genus classification relationships between semantic objects are established, forming a semantic system that consists of basic semantic objects and compound semantic objects.
Next, the hierarchical classification structure of semantic objects is explained.
The formal definition of semantic objects and the structure diagram of semantic objects are the conceptual basis for constructing a semantic object system.
However, they describe only the definition of a single semantic object and the relationships between semantic objects.
The semantics of natural language embody general knowledge, and they also lend themselves well to knowledge classification. A semantic object system must not only realize the internal structure of semantics but also establish the knowledge classification hierarchy of semantic objects and implement the processing functions of each semantic object class. The classification hierarchy of semantic objects is the basis for implementing a semantic object software system.
A semantic classification system model is therefore provided. It generates new levels of the semantic object classification hierarchy by abstracting the structural or behavioral features of semantic objects as a whole, and it studies the interactions between semantic objects on the basis of that hierarchy.
An object-oriented programming language and environment is a software development environment that supports classification hierarchies, structural recursion and behavioral recursion, and it provides an implementation environment for the semantic object system.
According to object-oriented thinking, any object that can answer for its glyph, its pronunciation and its meaning can be regarded as a semantic object. A glyph-pronunciation pair can, first, answer for its own glyph and pronunciation attributes; second, all the semantic objects corresponding to the pair can be found by querying the basic semantic object module, and those semantic objects are the meanings of the pair. A glyph-pronunciation pair is therefore also a semantic object.
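The idea that anything able to answer for its glyph, pronunciation and meaning counts as a semantic object corresponds to structural typing. The Python sketch below expresses it with `typing.Protocol`; the protocol name, the method names and the `BASIC_OBJECTS` lookup table (a stand-in for querying the basic semantic object module) are hypothetical.

```python
from typing import Dict, List, Protocol, Tuple, runtime_checkable


@runtime_checkable
class SemanticObjectLike(Protocol):
    """Anything that can answer glyph, pronunciation and meaning."""

    def glyph(self) -> str: ...
    def pronunciation(self) -> str: ...
    def meanings(self) -> List[str]: ...


# Hypothetical stand-in for the basic semantic object module:
# (glyph, pronunciation) -> labels of the semantic objects built on that pair.
BASIC_OBJECTS: Dict[Tuple[str, str], List[str]] = {
    ("apple", "AP-ul"): ["[apple: fruit]", "[apple: manufacturer]", "[apple: mobile phone]"],
}


class GlyphPronunciationPair:
    def __init__(self, glyph: str, pronunciation: str) -> None:
        self._glyph = glyph
        self._pronunciation = pronunciation

    def glyph(self) -> str:
        return self._glyph

    def pronunciation(self) -> str:
        return self._pronunciation

    def meanings(self) -> List[str]:
        # The meanings of a pair are the semantic objects defined on it.
        return list(BASIC_OBJECTS.get((self._glyph, self._pronunciation), []))


pair = GlyphPronunciationPair("apple", "AP-ul")
print(isinstance(pair, SemanticObjectLike), pair.meanings())
```

The pair class does not inherit from the protocol; it qualifies only because it provides the three methods, which matches the reasoning in the paragraph above.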
The semantic object system is a tree structure composed of the semantic classes of a classification hierarchy. A semantic object class carries various constraint rules among its semantic objects, which can be implemented in program code.
The (X) in the basic semantic object definition and the (Y) in the compound semantic object definition each gather the various classifications of a semantic object into one set.
A classification may be chosen from (X) to generate a subclass of the basic semantic object class.
A classification may be chosen from (Y) to generate a subclass of the compound semantic object class.
With further reference to FIG. 3, FIG. 3 is a class framework diagram of a general semantic system according to an embodiment of the present disclosure.
In FIG. 3, a connecting line that ends in a diamond indicates that the object at the other end of the line is an attribute object of the class the diamond points to. A line that loops back to its own class indicates that the attributes of the class's objects reference other objects of the same class; this is how the recursive definition is depicted.
A connecting line that ends in a triangle indicates that the class at the other end of the line is a subclass of the class the triangle points to.
Semantic object class: an abstract class. This is a basic object-oriented term for a class that only specifies the functions to be implemented.
Glyph-pronunciation class: a class for storing and manipulating <glyph, pronunciation> objects.
Basic semantic object class: the superclass that describes basic semantic objects.
Basic semantic subclasses 1 to n: the set of new basic semantic subclasses generated by a new partitioning.
Compound semantic object class: the superclass that describes compound semantic objects.
Compound semantic subclasses 1 to n: the set of new compound semantic subclasses generated by a new partitioning.
If the number of semantic objects that would fall into a subclass is used as the criterion for creating it, the suggestion is that a subclass may be split off when that number exceeds 100.
Next, a construction method of a semantic recursive representation system of a natural language according to some embodiments of the present disclosure will be explained.
A semantic recursive representation system of a natural language is constructed with glyph-pronunciation pairs and character strings as initial elements. The construction of semantic objects comprises two processes.
The two processes for creating basic semantic objects are:
The first process creates the initial semantic objects. Because no semantic objects exist at the time of initial creation, a newly generated semantic object must use character strings as temporary stand-ins for the semantic objects in its object structure, semantic definition, genus object and additional classification.
The second process replaces the temporary character strings used at initial creation with semantic objects. Several candidate semantic objects may exist for one character string, so manual judgment is needed to find the proper semantic object to replace it.
Replacing the temporary character strings with the generated semantic objects is also the process of establishing the composition relationship, the definition relationship, the upper and lower concept relationship and the other additional classification relationships between one semantic object and the others. The temporary character strings are replaced step by step until a semantic system consisting only of semantic objects is finally constructed.
For compound semantic objects, the first process creates the initial compound semantic objects. Because the basic semantic objects have already been created, they can be used directly to build the compound object structure. If an additional classification of a compound semantic object is itself a compound semantic object, a temporary character string is needed as a stand-in. The second process replaces those temporary character strings with the corresponding compound semantic objects after all compound semantic objects have been defined.
With further reference to fig. 4, fig. 4 is a flow chart of a method of constructing a semantic recursive representation system of a natural language according to an embodiment of the present disclosure.
Step 401, selecting a word dictionary of a natural language of a predetermined kind as a reference, wherein the word dictionary comprises meaning items, and each meaning item comprises a glyph, a pronunciation and a meaning. Word dictionaries generally also give the part of speech of each meaning item, and some also give a corpus classification and the like.
Optionally, the word dictionary further comprises at least one of: part of speech of the semantic item, and corpus classification of the semantic item.
Step 402, writing or generating a basic semantic item table according to the definition of the basic semantic object.
In some embodiments of the present disclosure, the basic meaning item table is written or generated in the format 'meaning item name, object structure, semantic definition, genus object, additional classification (X)'.
For alphabetic writing such as English, the meaning item name is a word in the dictionary.
For Chinese, the meaning item name can be a single character, a lianmian (indivisible multi-character) word or a fixed character group.
A direct upper concept of the meaning item is given as the genus object according to common conceptual knowledge; if the vocabulary used for the genus object is not in the meaning item table, it needs to be added to the basic meaning item table.
An additional classification of the meaning item is given; if the vocabulary used for the additional classification is not in the meaning item table, it needs to be added to the basic meaning item table.
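As an illustration only, one row of the basic meaning item table could be held as a small record, with a helper that lists genus or classification vocabulary that has no row of its own yet. The field names, the helper name `missing_vocabulary` and the sample rows are hypothetical; the text above only fixes the order of the columns.

```python
from dataclasses import dataclass, field


@dataclass
class BasicMeaningItem:
    """One row: name, object structure, semantic definition, genus, (X)."""
    name: str
    object_structure: list
    semantic_definition: list
    genus: str
    additional: list = field(default_factory=list)


def missing_vocabulary(table):
    """Return genus/classification words that are not yet rows of the table."""
    known = {item.name for item in table}
    missing = []
    for item in table:
        for word in [item.genus, *item.additional]:
            if word not in known and word not in missing:
                missing.append(word)
    return missing


# Hypothetical sample rows, not taken from any real dictionary.
table = [
    BasicMeaningItem("apple", ["apple"], ["round", "fruit"], "fruit", ["noun"]),
    BasicMeaningItem("fruit", ["fruit"], ["edible", "plant", "part"], "food"),
]
to_add = missing_vocabulary(table)
print(to_add)
```

In this sketch the words reported by `missing_vocabulary` would be appended to the table as new rows before any objects are created.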
Step 403, writing or generating a compound semantic item table according to the definition of the compound semantic object;
In some embodiments of the present disclosure, the compound meaning item table is written or generated in the format 'meaning item name, compound object structure, additional classification (Y)'.
The meaning item name is a commonly used English phrase, or a Chinese phrase, idiom, adage or the like, that is recorded in the word dictionary. An additional classification of the meaning item is given; if the vocabulary used for the additional classification is not in the meaning item table, it needs to be added to the basic meaning item table or the compound meaning item table.
Step 404, selecting a suitable semantic classification system according to the selected object-oriented language, the formal definition of the semantic object, and the additional classification of the semantic item.
In some embodiments of the present disclosure, once a suitable semantic classification system has been selected from the additional classifications of the meaning items, the software code for the semantic object classes of the semantic recursive representation system can be written according to the selected object-oriented language and the formal definition of semantic objects.
In some embodiments of the present disclosure, the taxonomy may be a multi-layered tree structure. Basic class framework code and basic method implementations are written according to the above requirements.
The attribute structures [ object structure ], [ semantic definition ], [ genus object ] and [ additional classification ] are defined in the superclass of the basic semantic object; the attribute structures [ compound object structure ] and [ additional classification ] are defined in the superclass of the compound semantic object.
The methods comprise at least: a method for generating a new instance object; methods for assigning and reading the basic attributes [ object structure ], [ semantic definition ], [ genus object ] and [ additional classification ] of the basic semantic object; a method for judging whether two semantic objects are the same; methods for recursively querying the ordered sets [ object structure ] and [ semantic definition ], which give the semantic object its self-definition and self-interpretation capability; and methods for assigning and reading the basic attributes [ compound object structure ] and [ additional classification ] of the compound semantic object.
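A minimal sketch of such a class framework, assuming Python as the object-oriented language, might look as follows. All class, attribute and method names are illustrative choices rather than names fixed by the disclosure, and the sample word is a hypothetical example.

```python
from abc import ABC, abstractmethod


class SemanticObject(ABC):
    """Abstract superclass: declares what every semantic object must answer."""

    @abstractmethod
    def structure(self):
        ...


class GlyphPronunciation(SemanticObject):
    """A <glyph, pronunciation> pair, the end point of recursive queries."""

    def __init__(self, glyph, pronunciation):
        self.glyph, self.pronunciation = glyph, pronunciation

    def structure(self):
        return [self]


class BasicSemanticObject(SemanticObject):
    def __init__(self, object_structure, semantic_definition, genus=None, additional=None):
        self.object_structure = list(object_structure)        # ordered set
        self.semantic_definition = list(semantic_definition)  # ordered set
        self.genus = genus if genus is not None else self     # self-genus ends the chain
        self.additional = set(additional or ())

    def structure(self):
        return self.object_structure

    def flatten_structure(self):
        """Recursively expand [object structure] down to glyph-pronunciation pairs."""
        pairs = []
        for part in self.object_structure:
            if isinstance(part, GlyphPronunciation):
                pairs.append(part)
            else:
                pairs.extend(part.flatten_structure())
        return pairs


class CompoundSemanticObject(SemanticObject):
    def __init__(self, compound_structure, additional=None):
        self.compound_structure = list(compound_structure)    # ordered basic objects
        self.additional = set(additional or ())

    def structure(self):
        return self.compound_structure


# Hypothetical example: a two-character word built from two single characters.
shou = BasicSemanticObject([GlyphPronunciation("手", "shǒu")], [])
ji = BasicSemanticObject([GlyphPronunciation("机", "jī")], [])
shouji = BasicSemanticObject([shou, ji], [])
flat = [p.glyph for p in shouji.flatten_structure()]
print(flat)
```

The recursive query stops at `GlyphPronunciation` instances, which play the role of the initial elements in this sketch.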
Step 405, the basic semantic object creation method is executed.
Step 406, the object structure processing method of the basic semantic object is executed.
Step 407, the semantic definition processing method of the basic semantic object is executed.
Step 408, the genus object processing method of the basic semantic object is executed.
Step 409, the additional classification processing method of the basic semantic object is executed.
Step 410, the compound semantic object creation method is executed.
Step 411, the additional classification processing method of the compound semantic object is executed.
This completes the construction of the whole semantic object system. The semantic object system is an object system that runs in memory and can be written to a file or a database for storage by using object persistence technology.
In some optional implementations of some embodiments, with further reference to fig. 5, fig. 5 is a flow chart of a method of creating basic semantic objects.
The method for creating the basic semantic object comprises the following steps:
Step 501, a meaning item is taken from the basic meaning item table.
Step 502, according to the content of the meaning item, a proper subclass is selected from the classification system of the basic semantic object class, and an instance object of that class is generated.
Step 503, the attributes [ object structure ], [ semantic definition ], [ genus object ] and [ additional classification ] of the instance object are initialized, and a semantic object is generated for the first time. Since the semantic objects these attributes require have not yet been generated, the attribute information must for now be stored in [ object structure ], [ semantic definition ] and [ genus object ] in the form of character strings.
Step 504, initially define [ object structure ] attributes.
For alphabetic writing, the glyph-pronunciation pair of the word is used directly. If the word is formed by combining two words, the ordered set of the character strings of the two words is used for temporary storage.
For Chinese single characters and lianmian words, the glyph-pronunciation pair is used directly as the object structure. Chinese character groups are temporarily stored as an ordered set of the character strings of their single characters.
Step 505, initially generate an ordered set of [ semantic definitions ].
The elements of the set are character strings. For languages written with spaces between words, such as English and German, the word strings of the meaning item's definition are stored in the ordered set as temporary storage. This step thus preliminarily generates the semantic definition attribute as an ordered set of character strings.
For languages written without spaces between words, such as Chinese or Japanese, the semantic definition must first be segmented into words manually. The character strings of the single characters or character groups are then stored in the ordered set as temporary storage.
Step 506, [ genus object ] is preliminarily defined by temporarily storing the character string of the word, phrase or character group of the direct upper concept.
Step 507, generating a set of [ additional classifications ] preliminarily.
The additional classification is temporarily stored as a set of character strings representing classified words, phrases, and character groups.
Step 508, steps 501 to 507 are repeated until every meaning item in the basic meaning item table has been turned into an initial basic semantic object.
At this point, initial semantic objects have been generated for all items in the basic meaning item table, but [ object structure ], [ semantic definition ], [ genus object ] and (X) in the basic semantic objects are still in character string form and need to be replaced by the corresponding semantic objects.
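The first process (steps 501 to 508) can be sketched as below, under the assumption of a space-delimited script so that the definition can be split automatically. The row layout, the class name and the function name are hypothetical, and the selection of a subclass in step 502 is omitted for brevity.

```python
class BasicSemanticObject:
    """Initial object: every attribute still holds temporary strings."""

    def __init__(self, name):
        self.name = name
        self.object_structure = []      # ordered set
        self.semantic_definition = []   # ordered set
        self.genus = None
        self.additional = set()


def create_initial_objects(item_table):
    """One initial object per meaning item; strings act as placeholders."""
    objects = []
    for name, structure, definition, genus, additional in item_table:  # step 501
        obj = BasicSemanticObject(name)                # steps 502-503
        obj.object_structure = list(structure)         # step 504
        obj.semantic_definition = definition.split()   # step 505
        obj.genus = genus                              # step 506
        obj.additional = set(additional)               # step 507
        objects.append(obj)
    return objects                                     # step 508: table exhausted


# Hypothetical rows: (name, object structure, definition, genus, additional)
rows = [
    ("fruit", ["fruit"], "edible plant part", "food", ["noun"]),
    ("apple", ["apple"], "round fruit", "fruit", ["noun"]),
]
initial = create_initial_objects(rows)
print(initial[1].semantic_definition, initial[1].genus)
```

For Chinese or Japanese the `split()` call would be replaced by the manually segmented list of strings described above.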
In some optional implementations of some embodiments, further referring to fig. 6, fig. 6 is a flow diagram of a method of performing object structure processing of basic semantic objects.
The object structure processing method of the basic semantic object comprises the following steps:
Step 601, in the initial basic semantic object system, a semantic object is queried.
Step 602, the object structure of the semantic object is extracted.
Step 603, it is determined whether the object structure contains a temporary character string.
Step 604, if the object structure contains a temporary character string, the initial semantic object system is queried to find the corresponding semantic object, and the temporary character string is replaced with that semantic object.
Step 605, steps 601 to 604 are repeated until the temporary character strings in all object structures in the initial basic semantic object system have been replaced.
At this point, the temporary character strings in [ object structure ] in the initial semantic object system have been replaced with semantic objects.
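The replacement loop of steps 601 to 605 can be sketched as follows. The sketch assumes that glyph-pronunciation pairs are stored as tuples and temporary placeholders as plain strings, so the two can be told apart by type; this encoding and all names are illustrative only.

```python
class SemanticObject:
    def __init__(self, name, object_structure):
        self.name = name
        # Tuples are glyph-pronunciation pairs; plain strings are placeholders.
        self.object_structure = list(object_structure)


def resolve_object_structures(objects):
    """Swap temporary strings for the objects they name; report leftovers."""
    by_name = {obj.name: obj for obj in objects}
    unresolved = []
    for obj in objects:                                   # step 601
        for i, part in enumerate(obj.object_structure):   # step 602
            if isinstance(part, str):                     # step 603
                target = by_name.get(part)
                if target is None:
                    unresolved.append((obj.name, part))
                else:
                    obj.object_structure[i] = target      # step 604
    return unresolved                                     # step 605: all visited


black = SemanticObject("black", [("black", "blæk")])
board = SemanticObject("board", [("board", "bɔːd")])
blackboard = SemanticObject("blackboard", ["black", "board"])
left_over = resolve_object_structures([black, board, blackboard])
print([part.name for part in blackboard.object_structure], left_over)
```

Returning the unresolved placeholders makes it easy to see which vocabulary is still missing from the meaning item table.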
In some optional implementations of some embodiments, with further reference to fig. 7, fig. 7 is a flow diagram of a method of performing semantic definition processing of basic semantic objects.
The semantic definition processing method of the basic semantic object comprises the following steps:
Step 701, in the initial basic semantic object system, a semantic object is queried.
Step 702, the semantic definition of the semantic object is extracted.
Step 703, it is determined whether the semantic definition contains a temporary character string.
Step 704, if the semantic definition contains a temporary character string, the initial semantic object system is queried to find the corresponding semantic object, and the temporary character string is replaced with that semantic object.
Step 705, steps 701 to 704 are repeated until the temporary character strings in all semantic definitions in the initial basic semantic object system have been replaced.
At this point, the temporary character strings in [ object structure ] and [ semantic definition ] in the initial semantic object system have been replaced with semantic objects.
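Because one glyph may correspond to several semantic objects, the replacement in step 704 can require the manual judgment mentioned earlier. The sketch below models that judgment as a caller-supplied `choose` callback; the callback, the `gloss` hint and the sample words are assumptions made for illustration.

```python
from collections import defaultdict


class SemanticObject:
    def __init__(self, glyph, gloss, semantic_definition=()):
        self.glyph = glyph
        self.gloss = gloss  # hypothetical human-readable hint for the reviewer
        self.semantic_definition = list(semantic_definition)


def resolve_definitions(objects, choose):
    """Replace definition strings; ask `choose` when several objects match."""
    candidates = defaultdict(list)
    for obj in objects:
        candidates[obj.glyph].append(obj)
    for obj in objects:                                       # step 701
        for i, word in enumerate(obj.semantic_definition):    # step 702
            if isinstance(word, str) and word in candidates:  # step 703
                options = candidates[word]
                picked = options[0] if len(options) == 1 else choose(obj, word, options)
                obj.semantic_definition[i] = picked           # step 704


bank_river = SemanticObject("bank", "side of a river")
bank_money = SemanticObject("bank", "financial institution")
loan = SemanticObject("loan", "money lent", ["money", "from", "bank"])


def pick_financial(owner, word, options):
    # Stand-in for a human decision: prefer the gloss mentioning "financial".
    return next(o for o in options if "financial" in o.gloss)


resolve_definitions([bank_river, bank_money, loan], pick_financial)
print(loan.semantic_definition[2].gloss)
```

Words with no matching object, such as "money" here, stay as strings, which signals that the meaning item table still lacks an entry for them.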
In some optional implementations of some embodiments, with further reference to fig. 8, fig. 8 is a flow diagram of a method of performing genus object processing of basic semantic objects.
The genus object processing method of the basic semantic object comprises the following steps:
Step 801, in the initial basic semantic object system, a semantic object is queried.
Step 802, the genus object of the semantic object is extracted.
Step 803, it is determined whether the genus object is a temporary character string.
Step 804, if the genus object is a temporary character string, the basic semantic object system is queried to find a proper basic semantic object, which replaces the temporary character string.
Step 805, steps 801 to 804 are repeated until the temporary character strings in all genus objects in the initial basic semantic object system have been replaced.
In some optional implementations of some embodiments, with further reference to fig. 9, fig. 9 is a flow diagram of a method of performing additional classification processing of basic semantic objects.
The additional classification processing method of the basic semantic object comprises the following steps:
Step 901, in the initial basic semantic object system, a semantic object is queried.
Step 902, the additional classification of the semantic object is extracted.
Step 903, it is determined whether the additional classification contains a temporary character string.
Step 904, if the additional classification contains a temporary character string, the semantic object system is queried to find the corresponding basic semantic object, which replaces the temporary character string.
Step 905, steps 901 to 904 are repeated until the temporary character strings in all additional classifications in the initial basic semantic object system have been replaced.
In some optional implementations of some embodiments, further referring to fig. 10, fig. 10 is a flow chart of a method of creating compound semantic objects.
The method for creating the compound semantic object comprises the following steps:
Step 1001, a compound meaning item is taken from the compound meaning item table.
Step 1002, according to the content of the meaning item, a proper subclass is selected from the classification system of the compound semantic object class, and an instance object is generated.
Step 1003, the attributes [ compound object structure ] and [ additional classification ] of the instance object are initialized.
Step 1004, the [ compound object structure ] object is generated. Since the basic semantic objects are already defined, it is only necessary to select the basic semantic objects that form the compound semantics and store them, in order, in the ordered set of [ compound object structure ].
Step 1005, a temporary set of character strings is generated for [ additional classification ].
Step 1006, steps 1001 to 1005 are repeated until every meaning item in the compound meaning item table has been turned into a compound semantic object, which yields the initial compound semantic object system.
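Steps 1001 to 1006 might be sketched as follows. The row layout and all names are hypothetical, and the subclass selection of step 1002 is reduced to a single class for brevity.

```python
class BasicSemanticObject:
    def __init__(self, name):
        self.name = name


class CompoundSemanticObject:
    def __init__(self, name, compound_structure, additional):
        self.name = name
        self.compound_structure = compound_structure  # ordered basic objects
        self.additional = additional                  # still temporary strings


def create_compound_objects(compound_table, basic_by_name):
    """Build compound objects directly from already-created basic objects."""
    compounds = []
    for name, part_names, additional in compound_table:         # step 1001
        parts = [basic_by_name[p] for p in part_names]          # step 1004
        obj = CompoundSemanticObject(name, parts, set(additional))  # 1002, 1003, 1005
        compounds.append(obj)
    return compounds                                            # step 1006


basic = {n: BasicSemanticObject(n) for n in ("apple", "mobile phone")}
table = [("apple mobile phone", ["apple", "mobile phone"], ["noun phrase"])]
result = create_compound_objects(table, basic)
print([p.name for p in result[0].compound_structure])
```

A missing part name raises `KeyError` in this sketch, which reflects the requirement that every component already exists as a basic semantic object.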
In some optional implementations of some embodiments, further referring to fig. 11, fig. 11 is a flow diagram of a method of performing additional classification processing of compound semantic objects.
The additional classification processing method of the compound semantic object comprises the following steps:
Step 1101, in the initial compound semantic object system, a compound semantic object is queried.
Step 1102, the additional classification of the compound semantic object is extracted.
Step 1103, it is determined whether the additional classification contains a temporary character string.
Step 1104, if the additional classification contains a temporary character string, the semantic object system is queried to find the corresponding basic semantic object or compound semantic object, which replaces the temporary character string.
Step 1105, steps 1101 to 1104 are repeated until the temporary character strings in the additional classifications of all compound semantic objects have been replaced.
In some optional implementations of some embodiments, the method for constructing a semantic recursive representation system further comprises: storing the constructed semantic object system; and querying and presenting the basic semantic objects, the compound semantic objects and the relationships among the objects in the constructed semantic object system. The composition structure relationships of basic semantic objects can be rendered as a multi-level graphical representation that shows the composition relationships between semantics. The semantic definition relationships of basic semantic objects can be rendered as a multi-level graphical representation that shows the definition relationships between semantics. The upper and lower concept relationships formed by the genus objects of basic semantic objects can be rendered as a multi-level graphical representation that shows the upper and lower concept relationships of semantics.
The semantic recursive representation system and the construction method of the semantic recursive representation system according to some embodiments of the present disclosure can partially or wholly achieve one or more of the following effects.
The first technical effect is that semantic objects defined recursively and uniformly over form and sound solve the problem of representing semantics inside a computer and the problem of glyph ambiguity.
The system can judge whether two semantic objects are the same: two semantic objects are the same if and only if their object structures are the same and their semantic definitions are the same. Only with this ability can each semantic object be guaranteed a unique representation in the computer.
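The sameness rule can be expressed as an equality method. The sketch below assumes that both attributes are stored as tuples and that the definitions being compared contain no cycles; a cyclic definition would need an identity-based or memoized comparison instead.

```python
class SemanticObject:
    def __init__(self, object_structure, semantic_definition):
        self.object_structure = tuple(object_structure)
        self.semantic_definition = tuple(semantic_definition)

    def __eq__(self, other):
        if not isinstance(other, SemanticObject):
            return NotImplemented
        # Same if and only if both structure and definition are the same.
        return (self.object_structure == other.object_structure
                and self.semantic_definition == other.semantic_definition)

    def __hash__(self):
        return hash((self.object_structure, self.semantic_definition))


pair = ("bank", "bæŋk")  # one glyph-pronunciation pair shared by two meanings
river_bank = SemanticObject([pair], ["side", "of", "river"])
money_bank = SemanticObject([pair], ["financial", "institution"])
duplicate = SemanticObject([pair], ["financial", "institution"])
unique = {river_bank, money_bank, duplicate}
print(river_bank == money_bank, money_bank == duplicate, len(unique))
```

Defining `__hash__` alongside `__eq__` lets a set or dictionary keep exactly one representative per semantic object, which is one way to enforce unique representation.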
Once the semantic object system is complete, it can be used to convert linear surface-level glyph sentences into deep non-linear semantic objects for sentence semantic analysis and semantic understanding. Semantic analysis performed on semantic objects is free of the ambiguity and multiple word classes of glyphs.
The unique representation of semantics is very fundamental and essential for natural language understanding.
The second technical effect is that the unified recursive definition over the form and sound of semantic objects gives semantics the ability of self-expression and self-interpretation.
The object structure is defined in a nested manner and is composed of other, more basic semantic objects; for example, a Chinese word is composed of single Chinese characters. A semantic object can keep querying the component objects that form its object structure, and each component object can in turn be queried for its object structure, semantic definition and genus object. The recursive query of the object structure ends at glyph-pronunciation pairs.
The semantic definition is defined recursively by means of other semantic objects. First, this guarantees the accuracy of the semantic definition. Second, the defining objects can be queried recursively, and they in turn can be queried for their own object structure, semantic definition and genus object. The end points of the recursive query are glyph-pronunciation pairs.
The genus object is itself a semantic object and is the upper concept of the semantic object. Traversing genus objects upward yields a single chain of upper and lower concepts, and traversing downward yields a tree of upper and lower concepts. Concept classification by upper and lower concepts is hierarchical; in the recursive structure, the recursion is terminated by setting the [ genus object ] of a semantic object to that object itself, so that a tree structure is formed. For example, the [ genus object ] of [ concept ] is still [ concept ].
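The upward and downward traversals over genus objects can be sketched as below. The function names and the sample concepts are hypothetical; the root object is its own genus, as in the [ concept ] example.

```python
class SemanticObject:
    def __init__(self, name, genus=None):
        self.name = name
        # A root such as [concept] is its own genus, which ends the recursion.
        self.genus = genus if genus is not None else self


def genus_chain(obj):
    """Upward traversal: a single chain ending at the self-referencing root."""
    chain = [obj]
    while obj.genus is not obj:
        obj = obj.genus
        chain.append(obj)
    return chain


def lower_concepts(obj, all_objects):
    """One level of the downward traversal: the direct lower concepts."""
    return [o for o in all_objects if o.genus is obj and o is not obj]


concept = SemanticObject("concept")
fruit = SemanticObject("fruit", concept)
apple = SemanticObject("apple", fruit)
pear = SemanticObject("pear", fruit)
everything = [concept, fruit, apple, pear]
up = [o.name for o in genus_chain(apple)]
down = [o.name for o in lower_concepts(fruit, everything)]
print(up, down)
```

Applying `lower_concepts` recursively from the root would enumerate the whole tree of upper and lower concepts.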
If a Chinese character semantic electronic dictionary is built on the Chinese character semantic object system, semantic queries are more accurate, convenient and quick than in existing character-based electronic dictionaries, owing to the recursive definition of semantic objects. Semantic electronic dictionaries of other natural languages built with semantic object technology have the same effect.
The third technical effect is that lexical rules can be fully applied to semantic objects.
Lexical rules are an abstraction of the syntactic function of vocabulary; they are rules summarized by disregarding differences in meaning. Because glyphs are ambiguous and can belong to multiple word classes, lexical rules cannot be fully applied to glyphs. Because a semantic object is unique and has a determined part of speech, lexical rules can be fully applied to semantic objects.
The fourth technical effect is that word segmentation granularity can be controlled, because segmentation is based on semantic objects.
The general process of natural language processing includes word segmentation, lexical analysis, syntactic analysis, semantic analysis and other processing processes.
According to GBT 13715-; the [ apple mobile phone ] needs to be divided into [ apple ] and [ mobile phone ]. Such decomposition does not help syntactic component analysis. With semantic object technology, [ Chinese cigarette ], [ apple mobile phone ] and [ beautiful Chinese panda ] are all compound semantic objects that correspond directly to sentence component classes, so they need not be decomposed into basic semantic objects during syntactic analysis.
The modern Chinese word segmentation standard for information processing prescribes a fixed segmentation, whereas word segmentation based on semantic objects can adjust the segmentation granularity according to the purpose of the segmentation.
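Adjustable granularity can be sketched as a segmenter that either keeps compound semantic objects whole or expands them into their basic semantic objects. The `fine` flag and the class names are assumptions made for illustration, and the sentence is assumed to have been mapped to semantic objects already.

```python
class Basic:
    def __init__(self, name):
        self.name = name


class Compound:
    def __init__(self, name, parts):
        self.name = name
        self.parts = parts  # ordered Basic or Compound objects


def segment(units, fine=False):
    """Return segment names; with fine=True compounds are expanded recursively."""
    out = []
    for unit in units:
        if fine and isinstance(unit, Compound):
            out.extend(segment(unit.parts, fine=True))
        else:
            out.append(unit.name)
    return out


apple, phone = Basic("apple"), Basic("mobile phone")
apple_phone = Compound("apple mobile phone", [apple, phone])
sentence = [Basic("I"), Basic("bought"), apple_phone]
coarse = segment(sentence)
detailed = segment(sentence, fine=True)
print(coarse)
print(detailed)
```

The coarse result suits syntactic analysis, where the compound maps directly to a sentence component, while the fine result matches a fixed word-level segmentation.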
Semantic objects are implemented in an object-oriented development environment. Speech is converted into text by speech-to-text technology, and the text is analyzed with the present method to obtain the relationships among semantic objects. Combined with the control system of an intelligent device such as an unmanned aerial vehicle or a robot, sentences of natural language can be converted into object-oriented object messages and sent, so that man-machine dialogue can be conveniently realized.
Comparison of semantic object models with other semantic models
Currently popular semantic models include semantic networks, semantic frames, production rules and knowledge graphs. Semantic networks focus on representing relationships between concepts, and knowledge graphs focus on representing relationships between entities and attributes.
Semantics differs greatly from concepts and knowledge. Semantics is the basis of concepts, and concepts are the basis of knowledge; concepts and knowledge belong to high-level logical semantics, while semantics also includes non-logical metaphorical semantics, such as 'the crowd boils up' and 'half of the man is the woman'. Furthermore, semantics such as metaphors and puns cannot be expressed and processed by conceptual logic.
Unlike other semantic models, semantic object technology implements a low-level representation of semantics, rather than a high-level representation of semantics, such as concepts, entities, attributes, and the like.
Some embodiments of the present disclosure use the modern Chinese dictionaries "New Chinese Dictionary" or "Modern Chinese Dictionary" as the reference for basic semantic content, and use modern Chinese morphology and syntax as the semantic classification framework, to implement a Chinese character semantic object system.
The present disclosure requires of a person of ordinary skill familiarity with object-oriented design ideas, a certain ability in object-oriented software design, basic knowledge of Chinese, and mastery of an object-oriented programming language. Common object-oriented language development environments include Smalltalk, Java, Python, C++ and the like.
An ordinary software developer with these abilities can develop the Chinese character semantic object system from the definition of semantic objects, the classification hierarchy of semantic objects and the given semantic object construction method. A developer can also devise a classification system of their own on the basis of the embodiments and implement their own semantic object system.
With further reference to fig. 12, fig. 12 shows a classification framework according to an embodiment of the disclosure, and the specific implementation can further partition sub-classes.
The basic framework classes of the Chinese character semantic object system are given below, where indentation represents the parent-child relationship between classes.
{ Chinese character semantic object class } "abstract class of all Chinese character semantic classes"
{ font-pronunciation structure class } "indicates the class of font-pronunciation pairs, the object structure of the original fundamental semantics"
{ lexical class } "superclass of lexical components, object of subclass is Chinese character basic semantic object"
{ Compound semantic object class } "super class of Compound semantic objects"
{ sentence component class } "super class of syntactic components. Structure specification for compound semantic objects "
A description of the basic attributes and instance methods of each class is given below.
{ Chinese character semantic object class }:: ()
{ Chinese character semantic object class } is the superclass of all semantic objects. It is an abstract class that has no attribute definitions and only declares abstract methods. It defines the methods common to all semantic objects, including at least method interfaces for answering a semantic object's structure, semantic definition, part of speech and attributes, and a method interface for judging whether two semantic objects are equal.
{ font-pronunciation structure class }:: ([ < glyph, pronunciation > ])
The glyph can be a single glyph or a string of several glyphs, such as 'Marx'.
The ordered set of < glyph, pronunciation > has two forms of expression, and the correspondence between glyphs and pronunciations is the same in both.
[ < single glyph 1, pronunciation 1 > < single glyph 2, pronunciation 2 > … < single glyph n, pronunciation n > ] is equivalent to [ < single glyph 1, single glyph 2 … single glyph n > < pronunciation 1, pronunciation 2 … pronunciation n > ].
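The equivalence of the two forms can be checked with a pair of conversion helpers. The function names are hypothetical, and the three-glyph Chinese rendering of 'Marx' with its pinyin is supplied here as an assumed example rather than quoted from the disclosure.

```python
def to_parallel(pairs):
    """[<g1, p1> ... <gn, pn>]  ->  (<g1 ... gn>, <p1 ... pn>)"""
    glyphs = tuple(g for g, _ in pairs)
    sounds = tuple(p for _, p in pairs)
    return glyphs, sounds


def to_pairs(glyphs, sounds):
    """(<g1 ... gn>, <p1 ... pn>)  ->  [<g1, p1> ... <gn, pn>]"""
    if len(glyphs) != len(sounds):
        raise ValueError("each glyph needs exactly one pronunciation")
    return list(zip(glyphs, sounds))


# Assumed example: a multi-glyph string with one pronunciation per glyph.
marx = [("马", "mǎ"), ("克", "kè"), ("思", "sī")]
glyphs, sounds = to_parallel(marx)
round_trip = to_pairs(glyphs, sounds)
print("".join(glyphs), " ".join(sounds), round_trip == marx)
```

Because the conversion is lossless in both directions, an implementation is free to store whichever form is more convenient.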
Since { font-pronunciation structure class } is a subclass of { Chinese character semantic object class }, it needs to implement the basic methods for answering semantics. To answer the semantics of a single glyph, all semantic objects corresponding to that glyph are found in the subclasses of { lexical class }.
To answer the pronunciation of a single glyph, all pronunciations corresponding to that glyph are found in the subclasses of { lexical class }. Answering the semantics and pronunciation of a multi-glyph string uses a similar query.
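One possible way to answer these two queries is an index from glyph to the basic semantic objects that use it. The `GlyphIndex` class, its method names and the `gloss` attribute are hypothetical; the sample entries are a small hand-written illustration for one polyphonic character.

```python
from collections import defaultdict


class LexicalObject:
    def __init__(self, glyph, pronunciation, gloss):
        self.glyph = glyph
        self.pronunciation = pronunciation
        self.gloss = gloss  # hypothetical short description of the meaning


class GlyphIndex:
    """Hypothetical lookup over all basic semantic objects, keyed by glyph."""

    def __init__(self, objects):
        self._by_glyph = defaultdict(list)
        for obj in objects:
            self._by_glyph[obj.glyph].append(obj)

    def semantics_of(self, glyph):
        """All semantic objects that correspond to the glyph."""
        return list(self._by_glyph.get(glyph, []))

    def pronunciations_of(self, glyph):
        """All distinct pronunciations of the glyph, in first-seen order."""
        seen = []
        for obj in self._by_glyph.get(glyph, []):
            if obj.pronunciation not in seen:
                seen.append(obj.pronunciation)
        return seen


index = GlyphIndex([
    LexicalObject("行", "xíng", "to walk"),
    LexicalObject("行", "háng", "row, line"),
    LexicalObject("行", "xíng", "acceptable"),
])
meanings = [o.gloss for o in index.semantics_of("行")]
readings = index.pronunciations_of("行")
print(meanings, readings)
```

A multi-glyph string could be handled the same way by keying the index on the whole string.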
{ lexical class }:: ([ object structure ] [ semantic definition ] [ genus object ] (X))
{ lexical class } is the abstract class of all basic semantic objects, which represent the semantics of the basic words of the Chinese language. It needs to implement the abstract interface methods of its parent class { Chinese character semantic object class }, and can also implement instance methods for the sentence components an object can serve as, its word-combining ability, semantic constraint rules and the like.
{ lexical class } is the part-of-speech division of Chinese semantics and supports converting semantic objects with a part of speech into semantic objects with a syntactic role; for example, noun semantic objects can serve as subjects and objects, and verb semantic objects can serve as predicates. Phrase collocation, part-of-speech change and semantic constraint functions can also be implemented.
The lexical class may implement several types of instance methods:
The first type covers attribute assignment and reading for semantic objects, together with methods for recursively querying the ordered sets [ object structure ] and [ semantic definition ]. These give the semantic object its self-definition and self-interpretation capability.
The second type concerns the word-forming ability of semantic objects and implements the combination of semantic objects of the various word classes according to phrase combination rules.
The third type concerns the syntactic function of semantic objects and implements the syntactic components that semantic objects of each part of speech may serve as, in order to support structural analysis of sentences, in particular analysis performed on semantic objects.
The fourth type is rules for combining lexical semantics, based on the semantic definition, genus object, part of speech and the like of semantic objects; these rules can be implemented as instance methods. Semantic combination in Chinese follows certain rules, which are reflected in constraint relations between parts of speech and between semantic concepts, and these constraint relations can also be implemented as instance methods.
The fifth type judges whether the semantics of an unrecorded word is reasonable, according to the semantic definitions and the existing word-forming rules. A reasonable unrecorded word may be used to generate a new semantic object.
The subclasses of { lexical class } are an important part of the technical scheme. Because object-oriented programming means writing program code for classes, new subclasses can be generated by subdividing the class framework. A subclass inherits all attributes and methods of its superclass, may implement a method with the same name as one in the superclass, and may also define new methods of its own.
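The three subclass behaviours named above (inheriting, overriding a same-name method, adding a new method) can be illustrated with noun and verb subclasses that record which sentence components they may serve as. All names are hypothetical and the role lists are deliberately minimal.

```python
class LexicalObject:
    """Superclass: shared attribute and a default syntactic-role query."""
    sentence_components = ()

    def __init__(self, name):
        self.name = name

    def can_serve_as(self, component):
        return component in self.sentence_components

    def describe(self):
        return f"{self.name}: {', '.join(self.sentence_components) or 'none'}"


class Noun(LexicalObject):
    # Inherits every method; only the class-level data differs.
    sentence_components = ("subject", "object")


class Verb(LexicalObject):
    sentence_components = ("predicate",)

    def describe(self):  # same-name method that overrides the superclass
        return "verb " + super().describe()

    def takes_object(self, other):  # new method defined only on this subclass
        return other.can_serve_as("object")


apple, eat = Noun("apple"), Verb("eat")
print(apple.describe())
print(eat.describe())
print(eat.takes_object(apple))
```

Further subdivision, for example into the noun and verb subclasses of a part-of-speech standard, would follow the same pattern.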
To explore semantics in depth, (X) in the semantic object definition is used to extend the classification of the underlying semantic object.
For example, (X) can be defined as ([ register classification ] [ semantic type ] [ affective classification ] [ term classification ]), where [ register classification ] describes the "colloquial, written, slang" classification of semantic objects; [ semantic type ] describes the "literal, foundational, extended, metaphorical, fictitious" classification of semantic objects; [ affective classification ] describes the "elegant, commendable, elegant, neutral, derogative, vulgar, down-stream" classification of semantic objects; and [ term classification ] describes the "worship, courtesy, graceful and courtesy" classification of semantic objects. These are all fine-grained classifications of semantics.
The lexical method is an abstraction of the vocabulary in the syntactic function, and is a rule summarized by neglecting semantic meaning difference. The application of lexical rules to glyphs is inadequate due to the ambiguity, multiword nature of glyphs. Due to the uniqueness and the determined part-of-speech of the semantic object, the application of lexical rules on the semantic object is sufficient.
The { lexical class } subclass classification is based on national standards ([ information processing Standard for modern Chinese part-of-speech tags ] "GB/T20532-2006) (of course, other classification schemes can be used).
The main hierarchy of {lexical class} is:
{lexical class}
  {content word class} "semantic objects that can independently serve as components of a syntactic structure"
    {noun class} "semantic objects denoting the names of persons or things"
      {general noun} "denotes the name of a person or thing"
      {abstract noun} "nouns denoting abstract concepts"
      {proper noun} "denotes the name of a specific person or thing"
      {time noun} "nouns denoting dates and times"
      {place noun} "nouns denoting places, place names, etc."
      {orientation noun} "nouns denoting direction and position"
    {verb class} "semantic objects denoting actions, behaviors, activities, existence, changes, and the like"
      {general verb} "verbs denoting actions or behaviors, with the grammatical features of verbs"
      {mental verb} "verbs denoting a person's mental activity"
      {judgment verb} "the only such verb is 'is', expressing judgment, existence, and the like"
      {modal verb} "verbs expressing possibility, necessity, willingness, and the like"
      {directional verb} "verbs indicating the direction of an action or behavior"
      {causative verb} "verbs expressing a command or request"
    {adjective class} "semantic objects denoting properties, states, and the like"
      {shape adjective} "adjectives indicating the shape of things"
      {property adjective} "adjectives denoting the properties of things"
      {distinguishing word} "indicates the features and classification of things; can only modify nouns as an attributive"
      {state adjective} "adjectives denoting the state of things"
    {numeral class} "semantic objects denoting number and order"
      {cardinal numeral} "cardinal numerals may be used to express multiples, fractions, decimals, and approximate numbers"
      {ordinal numeral} "denotes order"
    {quantifier class} "semantic objects denoting units of persons, things, or actions"
      {object quantifier} "quantifiers denoting a unit of measure of a person or thing"
      {action quantifier} "usually placed after a verb to denote the number of times an action occurs"
      {time quantifier} "quantifiers denoting time"
      {compound quantifier} "two or more quantifiers combined to denote a compound unit"
    {pronoun class} "semantic objects that substitute for or refer to others"
      {personal pronoun} "stands in for the name of a person or thing"
      {interrogative pronoun} "pronouns expressing a question or a rhetorical question"
      {demonstrative pronoun} "words that refer to or distinguish persons and situations"
    {adverb class} "semantic objects that modify verbs and adjectives, expressing range, degree, etc."
      {degree adverb} "adverbs expressing degree, grade, etc."
      {manner adverb} "adverbs expressing manner or circumstance"
      {frequency adverb} "adverbs expressing frequency"
      {time adverb} "adverbs expressing time"
      {range adverb} "adverbs expressing range and limitation"
      {negative adverb} "adverbs expressing negation"
      {associative adverb} "adverbs that play a connecting role in a phrase or sentence"
      {modal adverb} "adverbs expressing mood, such as question, conjecture, transition, emphasis, etc."
  {function word class}
    {preposition class}
      {agent-patient preposition} "prepositions introducing the agent or the patient of an action"
      {manner preposition} "prepositions introducing the manner, method, tool, etc. of an action"
      {time preposition} "prepositions introducing the time at which an action occurs"
      {location preposition} "prepositions introducing the location, direction, start point, or end point of an action"
      {object preposition} "prepositions introducing the objects or scope associated with an action"
      {cause preposition} "prepositions introducing the cause of an action"
      {purpose preposition} "prepositions introducing the purpose and result of an action"
    {conjunction class} "semantic objects for connecting two semantic objects"
    {auxiliary word class}
      {structural auxiliary} "denotes the structural relationship between an attached component and the head word"
      {aspect auxiliary} "denotes the state of progress of an action"
      {comparative auxiliary} "attached after nominal or predicative words to express comparison"
      {plural auxiliary} "auxiliaries expressing plural or approximate number"
    {modal particle class}
The sentence component class is an abstract class of sentence components and is a subclass of Chinese semantic objects. It describes the composition of a sentence and the role of each component. The structure of a compound semantic object is described using sentence components to express its combinatorial relations.
{sentence component class}
  {subject class} "the object that the sentence states or describes; who or what the sentence is about"
  {predicate class} "states something about the subject; answers questions such as 'how is the subject' or 'what is the subject'"
  {object class} "language unit denoting what the predicate verb acts on"
  {attributive class} "language units placed before subjects and objects to restrict them"
  {adverbial class} "language units placed before verbal and adjectival predicates to restrict them"
  {complement class} "language units that supplement the predicate, answering questions of time, location, result, and the like"
{ composite semantic object class }:: ([ composite object structure ] ([ additional classification ]))
The compound semantic object class is the superclass of the two phrase structure classes; compound semantic objects are constrained by syntactic rules and by semantics.
{compound semantic object class}
  {phrase structure class} "compound semantic objects without function words"
  {phrase structure class} "compound semantic objects containing prepositions, auxiliary words, conjunctions, etc."
The subclass framework of {phrase structure class} is:
{phrase structure class}
  {subject-predicate structure class}
  {modifier-head structure class}
    {attributive-head structure class}
    {adverbial-head structure class}
  {verb-object structure class}
  {verb-complement structure class}
  {coordinate structure class}
Word groups and phrases are units of language combined at the lexical, syntactic, semantic, and pragmatic levels. The two are distinguished structurally by whether or not the structure contains function words.
A composite semantic object is a semantic object larger than a basic semantic object. It can correspond to the syntactic units of subject, predicate, object, adverbial, and complement, and is therefore often used in the analysis of syntactic components.
In the following, the noun class and the verb class are taken as examples to describe the constraint rules between semantics, which can be implemented at the corresponding class level by programming methods. More constraint rules can be obtained by consulting grammar books.
{noun class} comprises semantic objects that denote the names of persons, things, times, places, and so on. A noun can generally be modified by a numeral-quantifier phrase and can follow a preposition to form a prepositional structure. A noun is not modified by adverbs, except in certain fixed formats in which an adverb may precede a noun.
Nouns mainly serve as subjects, objects, and attributives, but can also serve as predicates and adverbials. This is the basis on which the subclasses can be further refined.
{verb class} comprises semantic objects that denote actions, behaviors, activities, existence, changes, and so on. The main grammatical features of a verb are that it can be modified by [not], is generally not modified by degree adverbs, can be followed by aspect auxiliaries, has a reduplicated form, and can form affirmative-negative questions.
Verbs that take an object are called transitive verbs, and verbs that cannot take an object are called intransitive verbs. Most verbs that denote actions can be reduplicated: a monosyllabic verb reduplicates as AA and a disyllabic verb as ABAB.
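The reduplication rule for action verbs could be written as a method at the verb-class level. The following is a minimal sketch, assuming that one Chinese character corresponds to one syllable and that the caller has already established that the verb denotes an action; the function name `reduplicate` and the example verbs in the usage note are illustrative only.

```python
from __future__ import annotations


def reduplicate(verb: str) -> tuple[str, str]:
    """Return (pattern, reduplicated form) for an action verb.

    Assumes one character per syllable: a one-character verb A
    becomes AA, and a two-character verb AB becomes ABAB.
    """
    if len(verb) == 1:
        return "AA", verb + verb
    if len(verb) == 2:
        return "ABAB", verb + verb
    # Longer verbs are outside the rule stated in the text.
    raise ValueError(f"no reduplication pattern for {verb!r}")
```

For instance, `reduplicate("看")` yields the AA form and `reduplicate("研究")` yields the ABAB form. Verbs that cannot be reduplicated would have to be filtered out by a semantic check before this function is called.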
The role of semantic objects in the semantic analysis of Chinese is illustrated by the following examples.
1. Adding new semantic objects to make up for gaps in dictionary sense items
For example, on page 1507 of the Modern Chinese Dictionary (seventh edition), 'research' has two senses, 'research 1: to consider or discuss' and 'research 2: to explore the truth, nature, regularity, etc. of things', both of which are verbs.
From the perspective of semantic objects, 'research' is a procedural activity, and 'research' may also refer to the activity of researching. Its superordinate concept is [activity].
The dictionary gives two senses for 'activity': as a verb, to act for a certain purpose; as a noun, an action taken for a certain purpose.
Since the genus object of [research] is [activity], which has the characteristics of both a verb and a noun, [research] should also have noun characteristics and may be a noun.
For example, in 'this research is important' and 'detailed research is the basis for success', 'research' serves as the subject and should therefore be a noun.
Under the Chinese semantic object method, the semantics of 'research' requires four semantic objects to be defined, two nouns and two verbs. Alternatively, the genus object chain of 'research' can be queried, and the noun part of speech obtained from the part of speech of its superordinate genus object.
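The second option, querying the genus object chain, might look like the sketch below. The `SemanticObject` dataclass, its field names, and the choice to collect parts of speech from the whole chain rather than only the direct superordinate object are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SemanticObject:
    name: str
    parts_of_speech: set[str]
    genus: SemanticObject | None = None  # direct superordinate concept


def genus_chain(obj: SemanticObject) -> list[SemanticObject]:
    """Walk upward from obj and return its superordinate objects in order."""
    chain = []
    current = obj.genus
    while current is not None:
        chain.append(current)
        current = current.genus
    return chain


def inferred_parts_of_speech(obj: SemanticObject) -> set[str]:
    """Own parts of speech plus those found on the genus chain."""
    result = set(obj.parts_of_speech)
    for ancestor in genus_chain(obj):
        result |= ancestor.parts_of_speech
    return result
```

With `activity = SemanticObject("activity", {"verb", "noun"})` and `research = SemanticObject("research", {"verb"}, genus=activity)`, calling `inferred_parts_of_speech(research)` returns both parts of speech, so the noun reading is recovered without adding separate noun objects. A real system would probably restrict how far up the chain such inheritance is allowed to reach.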
As another example, the dictionary senses of 'apple' cover only the fruit and the tree: 'apple 1: the fruit of the apple tree' and 'apple 2: a deciduous tree with oval leaves and white flowers tinged with red; its fruit is round and tastes sweet or slightly sour'. In reality, 'apple' can also refer to the Apple company or an Apple mobile phone. The dictionary has no sense for the mobile phone or the manufacturer, so these senses need to be added.
2. Application of whole-part constraint relations
A knowledge system is based on classification. Parent classes and subclasses share the same structure, which is reflected in whole-part constraint relations and is inheritable.
Flowers, birds, fish, and insects, each as a class of things, all have their own components, which form whole-part constraint relations.
Flower: flower buds, petals; roots, stems, leaves;
bird: head, feathers, wings, claws, etc.;
fish: head, tail, gill, fin, scale, swim bladder;
these can form 'fish head, fish tail, fish gill, fish fin, fish scale, and swim bladder';
cooking treats the shape of food materials in terms of chunks, segments, slices, shreds, and balls, and fish is a food material; thus 'fish chunks, fish segments, fish slices, fish shreds, and fish balls' can be formed.
Combining the cooking methods of frying, sautéing, stewing, and boiling with the food-material shapes of segments, slices, shreds, and balls yields names of dishes in which fish chunks, fish segments, fish slices, or fish shreds are cooked by one of these methods, as well as fish-ball soup, whole-fish soup, and the like.
The same applies to chicken, duck, and eel. Writing a corresponding method to process similar rules is easy to implement in a system of semantic objects.
A constraint relation is also an embodiment of knowledge, and is a rule that can be applied in word formation and word segmentation.
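As a rough illustration of how such whole-part rules could drive word formation, the sketch below generates candidate compounds from a whole and its parts, and from a food material and the shapes used in cooking. The tables `PARTS`, `FOOD_SHAPES`, and `FOOD_MATERIALS` and both function names are hypothetical, and English glosses stand in for the Chinese words.

```python
# Whole -> parts, taken from the examples in the text (English glosses).
PARTS = {
    "fish": ["head", "tail", "gill", "fin", "scale"],
    "bird": ["head", "feather", "wing", "claw"],
}

# Shapes into which food materials are cut during cooking.
FOOD_SHAPES = ["chunk", "segment", "slice", "shred", "ball"]

# Wholes that are also food materials (an assumed table).
FOOD_MATERIALS = {"fish", "chicken", "duck", "eel"}


def part_compounds(whole: str) -> list[str]:
    """Form 'whole part' compounds allowed by the whole-part relation."""
    return [f"{whole} {part}" for part in PARTS.get(whole, [])]


def shape_compounds(material: str) -> list[str]:
    """Form 'material shape' compounds, only for food materials."""
    if material not in FOOD_MATERIALS:
        return []
    return [f"{material} {shape}" for shape in FOOD_SHAPES]
```

Because the rule is attached to the category (food material) rather than to the individual word, adding "chicken" or "eel" to the table is enough for the same compounds to be generated for them, which mirrors the inheritability described above.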
3. Role of semantic relationships
For example, the dictionary explains [drive] as "to steer (a car, boat, airplane, tractor, etc.) so that it travels".
The semantic definition of [drive] should be <[maneuver], [vehicle], [travel]>, where [vehicle] is the genus concept of semantic objects such as cars, ships, and airplanes. When a new kind of vehicle appears, the semantics of [drive] still applies.
Lexical rules require only that a verb can be followed by an object to form a verb-object structure; they place no constraint on conceptual meaning.
For example, 'drive a chair' conforms to the rules of the verb-object structure, but the genus concept of [chair] is [furniture], and the genus concept chain of [chair] contains no [vehicle], so 'drive a chair' has no logical meaning. Such logical problems can be judged using the constraint relations of semantic objects.
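The 'drive a chair' check can be sketched as a walk along the genus chain. The `GENUS` table and the function names below are assumptions for illustration; only the car/ship/airplane-to-vehicle and chair-to-furniture relations come from the text.

```python
from __future__ import annotations

# Object -> direct genus concept (a tiny assumed table).
GENUS = {
    "car": "vehicle",
    "ship": "vehicle",
    "airplane": "vehicle",
    "chair": "furniture",
}


def has_genus(name: str, target: str) -> bool:
    """True if target appears on the genus chain of name (or is name)."""
    seen = set()
    current: str | None = name
    while current is not None and current not in seen:
        if current == target:
            return True
        seen.add(current)  # guard against accidental cycles in the table
        current = GENUS.get(current)
    return False


def is_meaningful_drive_object(noun: str) -> bool:
    """The object of [drive] must be a kind of [vehicle]."""
    return has_genus(noun, "vehicle")
```

Registering a new kind of vehicle, for example `GENUS["drone"] = "vehicle"`, makes `is_meaningful_drive_object("drone")` succeed without any change to the rule for [drive].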
As another example, [hope], as a verb semantic object, requires its object to be an expression that denotes an action in order to complete the semantic expression for the listener. The genus object of this semantic object is [desire requirement]. The semantic requirement can be implemented on [desire requirement] as a semantic object, so that all semantic objects that are subordinate concepts of this genus object share the requirement.
As the above examples show, semantic processing of Chinese semantic objects can be handled at the level of subclasses of semantic objects, or directly for an individual semantic object, which gives great flexibility.
Semantic objects are implemented in an object-oriented development environment, and are no different from other objects in a software system, such as the smart-device objects, drone objects, and robot objects of a smart-device control system, a drone control system, or a robot control system.
In the software classes of a smart-device control system, the identifier, control parameters, and control methods of each device are implemented. By using the name identifier of a smart-device object as a semantic object, the semantic object system is combined with the smart-device control system. Voice instructions issued by an operator are converted into text using speech-to-text technology. The method and system described here are then used to analyze the relations between devices and control parameters in the instruction text, yielding the relations among the semantic objects in the text. These semantic object relations are converted into messages that invoke the device's control methods with their parameters, so that human-machine dialogue can be conveniently realized.
For example, in the object-oriented class that controls a logistics drone, the drone's identifier is WW111; it has the parameters time, flight altitude, direction, speed, and address coordinates; it has control methods such as flying, delivering, picking up, taking off, and landing; methods are executed by sending messages, which must be accompanied by the corresponding parameters.
In the noun class of the Chinese semantic object system, a drone subclass is created, the drone is given the Chinese name 'Logistics No. 1', and 'Logistics No. 1' is associated with the identifier 'WW111' in the drone control system.
Existing speech-to-text technology can handle Mandarin and various dialects, converting an operator's speech into Chinese text through training, machine learning, and similar techniques.
Suppose the operator has just received a pickup order and finds that 'Logistics No. 1' can make the pickup. The operator issues a voice command which, after conversion to text, may take one of the following forms:
'Logistics No. 1, after item No. 4 is sent, please fetch the item at address A and return.' or
'After item No. 4 is sent, go to address A immediately to fetch the item; Logistics No. 1 can then return.'
The same content can be expressed in many ways in language, but the relations among the semantic objects remain the same.
Based on language understanding of the semantic objects, the text can be preliminarily decomposed into semantic objects:
[Logistics No. 1], [after item No. 4 is sent], [go to address A to fetch the item], [return].
The semantic object [after item No. 4 is sent] is decomposed into: [send], [item No. 4]. The semantic object [go to address A to fetch the item] is decomposed into: [to], [address], [A], [fetch].
The semantic object system can convert the relations among the semantic objects into a message sequence for the logistics drone to execute.
WW111 send item No. 4; "[send], [item No. 4]"
WW111 flightTo address A; "[to], [address], [A]"
WW111 getArticle; "[fetch]"
WW111 goBack. "[return]"
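The conversion from decomposed semantic objects to a message sequence could be sketched as follows. This is an assumed design rather than the disclosed implementation: the `Message` dataclass, the two lookup tables, and the normalized method spellings `getArticle` and `goBack` are illustrative, and the decomposition into (action, argument) steps is taken as already done.

```python
from __future__ import annotations

from dataclasses import dataclass

# Chinese-name semantic object -> device identifier in the control system.
NAME_TO_DEVICE_ID = {"Logistics No. 1": "WW111"}

# Action semantic object -> control method name (assumed spellings).
ACTION_TO_METHOD = {
    "send": "send",
    "to": "flightTo",
    "fetch": "getArticle",
    "return": "goBack",
}


@dataclass(frozen=True)
class Message:
    device_id: str
    method: str
    argument: str | None = None


def to_messages(device_name: str, steps: list[tuple[str, str | None]]) -> list[Message]:
    """Map (action, argument) pairs to messages, preserving their order."""
    device_id = NAME_TO_DEVICE_ID[device_name]
    return [
        Message(device_id, ACTION_TO_METHOD[action], argument)
        for action, argument in steps
    ]
```

Both phrasings of the operator's instruction decompose to the same ordered steps, `[("send", "item No. 4"), ("to", "address A"), ("fetch", None), ("return", None)]`, and therefore to the same four messages for WW111.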
The foregoing description covers only preferred embodiments of the disclosure and illustrates the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments formed by any combination of the above-mentioned features or their equivalents without departing from the inventive concept defined above, for example, technical solutions formed by mutually replacing the above features with (but not limited to) technical features of similar function disclosed in the embodiments of the present disclosure.

Claims (11)

1. A method for constructing a semantic recursive representation system of a natural language, comprising the following steps:
step 1, selecting a dictionary of a preset natural language, wherein the dictionary comprises sense items, and each sense item comprises a character form, a pronunciation, and a meaning;
step 2, generating a basic sense item table according to the definition of the basic semantic object;
step 3, generating a composite sense item table according to the definition of the composite semantic object;
step 4, according to the selected object-oriented language and the formal definition of the semantic object, selecting a suitable semantic classification system from the additional classifications of the sense items, and writing the software code of the semantic object classes of the semantic recursive representation system of the natural language;
step 5, executing a method for creating basic semantic objects, which comprises: step 501, taking a sense item from the basic sense item table; step 502, according to the content of the sense item, selecting a suitable subclass from the classification system of the basic semantic object class and generating an instance object of that class; step 503, initializing the three basic attributes of the instance object (object structure, semantic definition, and genus object) and its additional classification, thereby generating a semantic object for the first time; step 504, preliminarily defining the object structure attribute; step 505, preliminarily generating the ordered set of the semantic definition; step 506, preliminarily defining the genus object, temporarily storing the character string of the word, phrase, or character group of the direct superordinate concept; step 507, preliminarily generating the set of additional classifications; step 508, repeating steps 501 to 507 until every sense item in the basic sense item table has been generated into an initial basic semantic object;
step 6, executing an object structure processing method of the basic semantic object;
step 7, executing a semantic definition processing method of the basic semantic object;
step 8, executing a genus object processing method of the basic semantic object;
step 9, executing an additional classification processing method of the basic semantic object;
step 10, executing a method for creating composite semantic objects, which comprises: step 1001, taking a sense item from the composite sense item table; step 1002, according to the content of the sense item, selecting a suitable subclass from the classification system of the composite semantic object and generating an instance object; step 1003, initializing the composite object structure and additional classification attributes of the instance object; step 1004, generating the composite object structure object; step 1005, generating the set of temporary character strings of the additional classification; step 1006, repeating steps 1001 to 1005 until every sense item in the composite sense item table has been generated into a composite semantic object, thereby generating an initial composite semantic object system;
step 11, executing an additional classification processing method of the composite semantic object;
wherein the basic semantic objects and the composite semantic objects are constructed recursively in the semantic recursive representation system and are unique, and the natural language is any natural language in the world that has a writing system.
2. The construction method according to claim 1, wherein executing the method for creating basic semantic objects comprises:
step 51, taking a sense item from the basic sense item table;
step 52, according to the content of the sense item, selecting a suitable subclass from the classification system of the basic semantic object class and generating an instance object of that class;
step 53, initializing the object structure, semantic definition, genus object, and additional classification of the instance object, thereby generating a semantic object for the first time, wherein the object structure, semantic definition, and genus object are the three basic attributes, and the attribute information of the semantic object is temporarily stored as character strings in the object structure, semantic definition, genus object, and additional classification;
step 54, preliminarily defining the object structure attribute;
step 55, preliminarily generating the semantic definition attribute, wherein the semantic definition attribute is an ordered set of character strings;
step 56, preliminarily defining the genus object, temporarily storing the character string of the word, phrase, or character group of the direct superordinate concept;
step 57, preliminarily generating the set of additional classifications, wherein the additional classifications are temporarily stored as a set of character strings of words, phrases, and character groups;
and step 58, repeating steps 51 to 57 until every sense item in the sense item table has been generated into a temporary basic semantic object, forming an initial basic semantic object system.
3. The construction method according to claim 1, wherein executing the object structure processing method of the basic semantic object comprises:
step 61, querying a semantic object in the initial basic semantic object system;
step 62, extracting the object structure of the semantic object;
step 63, judging whether the object structure contains a temporary character string;
step 64, if the object structure contains a temporary character string, querying the initial basic semantic object system, finding the corresponding basic semantic object, and replacing the temporary character string with it;
and step 65, repeating steps 61 to 64 until the temporary character strings in all object structures in the initial basic semantic object system have been replaced.
4. The construction method according to claim 1, wherein executing the semantic definition processing method of the basic semantic object comprises:
step 71, querying a basic semantic object in the initial basic semantic object system;
step 72, extracting the semantic definition of the basic semantic object;
step 73, judging whether the semantic definition contains a temporary character string;
step 74, if the semantic definition contains a temporary character string, querying the initial basic semantic object system, finding the corresponding basic semantic object, and replacing the temporary character string with it;
and step 75, repeating steps 71 to 74 until the temporary character strings in all semantic definitions in the initial basic semantic object system have been replaced.
5. The construction method according to claim 1, wherein executing the genus object processing method of the basic semantic object comprises:
step 81, querying a semantic object in the initial basic semantic object system;
step 82, extracting the genus object of the semantic object;
step 83, judging whether the genus object is a temporary character string;
step 84, if the genus object is a temporary character string, querying the basic semantic object system, finding a suitable basic semantic object, and replacing the temporary character string with it;
and step 85, repeating steps 81 to 84 until the temporary character strings in all genus objects in the initial basic semantic object system have been replaced.
6. The construction method according to claim 1, wherein executing the additional classification processing method of the basic semantic object comprises:
step 91, querying a semantic object in the initial basic semantic object system;
step 92, extracting the additional classifications of the semantic object;
step 93, judging whether the additional classifications contain a temporary character string;
step 94, if the additional classifications contain a temporary character string, querying the initial basic semantic object system, finding the corresponding basic semantic object, and replacing the temporary character string with it;
and step 95, repeating steps 91 to 94 until the temporary character strings in all additional classifications in the initial basic semantic object system have been replaced.
7. The construction method according to claim 1, wherein executing the method for creating composite semantic objects comprises:
step 101, taking a sense item from the composite sense item table;
step 102, according to the content of the sense item, selecting a suitable subclass from the classification system of the composite semantic object and generating an instance object of that class;
step 103, initializing the composite object structure and additional classification attributes of the instance object;
step 104, generating the composite object structure object, wherein, because the basic semantic objects have already been defined, it is only necessary to select the basic semantic objects that form the composite semantics and store them in order in the ordered set of the composite object structure;
step 105, generating the set of temporary character strings of the additional classification;
and step 106, repeating steps 101 to 105 until every sense item in the composite sense item table has been generated into a composite semantic object, forming an initial composite semantic object system.
8. The construction method according to claim 1, wherein executing the additional classification processing method of the composite semantic object comprises:
step 111, querying a composite semantic object in the initial composite semantic object system;
step 112, extracting the additional classifications of the composite semantic object;
step 113, judging whether the additional classifications contain a temporary character string;
step 114, if the additional classifications contain a temporary character string, querying the basic semantic object system and the initial composite semantic object system, finding the corresponding basic semantic object or composite semantic object, and replacing the temporary character string with it;
and step 115, repeating steps 111 to 114 until the temporary character strings in all additional classifications in the initial composite semantic object system have been replaced.
9. The construction method according to claim 1, wherein the method further comprises:
storing the constructed semantic object system;
and querying the basic semantic objects, the composite semantic objects, and the relations among all objects in the constructed semantic object system, and presenting graphically the object structure relations, semantic definition relations, and genus superordinate-subordinate relations of the semantic objects.
10. The construction method according to claim 1, wherein two constructed basic semantic objects are the same basic semantic object if they have the same object structure and the same semantic definition, so as to ensure the uniqueness of semantic objects.
11. The construction method according to claim 1, wherein the method further comprises: establishing the superordinate-subordinate relations of semantic objects in the conceptual sense by using the genus objects of the basic semantic objects.
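As an informal illustration of the two-pass construction recited in the claims (first create objects whose attributes hold temporary character strings, then replace each string with the object it names), the following sketch may help. It is not the claimed implementation: the class, the function names, the use of the object name as the lookup key, and the reduction to only the genus object and semantic definition attributes are all simplifying assumptions.

```python
class BasicSemanticObject:
    def __init__(self, name, genus, definition):
        self.name = name
        self.genus = genus                   # temporary string, object, or None
        self.definition = list(definition)   # ordered; strings until resolved


def create_initial_system(sense_items):
    """Pass 1: one object per sense item, attributes kept as strings."""
    system = {}
    for name, genus, definition in sense_items:
        system[name] = BasicSemanticObject(name, genus, definition)
    return system


def resolve_placeholders(system):
    """Pass 2: replace temporary strings that name a known object."""
    for obj in system.values():
        if isinstance(obj.genus, str) and obj.genus in system:
            obj.genus = system[obj.genus]
        obj.definition = [
            system[term] if isinstance(term, str) and term in system else term
            for term in obj.definition
        ]
    return system
```

Splitting the work into two passes lets an object refer to another that appears later in the sense item table. In this sketch a string with no matching object is simply left in place; how such leftovers should be handled is not settled here.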
CN201910750547.6A 2019-08-14 2019-08-14 Method for constructing semantic recursion representation system of natural language Active CN110457551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910750547.6A CN110457551B (en) 2019-08-14 2019-08-14 Method for constructing semantic recursion representation system of natural language


Publications (2)

Publication Number Publication Date
CN110457551A CN110457551A (en) 2019-11-15
CN110457551B true CN110457551B (en) 2021-04-23

Family

ID=68486731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910750547.6A Active CN110457551B (en) 2019-08-14 2019-08-14 Method for constructing semantic recursion representation system of natural language

Country Status (1)

Country Link
CN (1) CN110457551B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1417707A (en) * 2002-12-02 2003-05-14 刘莎 Natural language semantic information united-coding method
CN101895517B (en) * 2009-05-19 2013-05-15 北京启明星辰信息技术股份有限公司 Method and device for extracting script semantics
CN102479191B (en) * 2010-11-22 2014-03-26 阿里巴巴集团控股有限公司 Method and device for providing multi-granularity word segmentation result
US8473503B2 (en) * 2011-07-13 2013-06-25 Linkedin Corporation Method and system for semantic search against a document collection
US9916357B2 (en) * 2014-06-27 2018-03-13 Microsoft Technology Licensing, Llc Rule-based joining of foreign to primary key
CN104484411B (en) * 2014-12-16 2017-12-22 中国科学院自动化研究所 A kind of construction method of the semantic knowledge-base based on dictionary

Also Published As

Publication number Publication date
CN110457551A (en) 2019-11-15

Similar Documents

Publication Publication Date Title
US8478581B2 (en) Interlingua, interlingua engine, and interlingua machine translation system
CN101937430B (en) Method for extracting event sentence pattern from Chinese sentence
RU2509350C2 (en) Method for semantic processing of natural language using graphic intermediary language
WO2014160309A1 (en) Method and apparatus for human-machine interaction
EP1866810A1 (en) Method for transforming language into a visual form
KR20110009205A (en) Systems and methods for natural language communication with a computer
JP2006164293A (en) Automatic natural language translation
CN112825111A (en) Natural language processing method and computing device thereof
CN109783819A (en) A kind of generation method and system of regular expression
Stede Lexical semantics and knowledge representation in multilingual text generation
CN110457551B (en) Method for constructing semantic recursion representation system of natural language
Franconi et al. Quelo natural language interface: Generating queries and answer descriptions
CN110489752B (en) Semantic recursion representation system of natural language
CN104281695B (en) The semantic information abstracting method and its system of natural language based on combinatorial theory
Hicks et al. Content analysis
Pavlic et al. Adjective representation with the method Nodes of Knowledge
CN112115722A (en) Human brain-simulated Chinese analysis method and intelligent interaction system
KR102363131B1 (en) Multi-dimensional knowledge searching method and system for expert systems
Magnini Use of a lexical knowledge base for information access systems
Attia Implications of the agreement features in machine translation
Dannélls Multilingual text generation from structured formal representations
Batarfi et al. Building an Arabic semantic lexicon for Hajj
Silaban et al. Simalungun Batak Language Causative Construction
Shao et al. The Construction of the Semantic Collocation Database of Verb-Complement Structure in Modern Chinese based on a Large-scale Chinese Chunkbank
Narayan et al. Pre-Neural Approaches

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant