Summary of the invention
In fact, every computer program written by humans can be regarded as a simple semantic system of the computer. Its semantic model is embodied in the rules implicit in the program code, its semantic instances are stored in program variables or database records, and its semantic application engine is the program's own functions together with the functions it can call. It should be emphasized that, for a computer program, the semantic model, the semantic instances, and the semantic application engine are all entered by the programmer and fixed in the computer; the computer merely responds mechanically and passively according to the prescribed formula. The greatest difference between an artificial intelligence system and a computer program is that the computer's semantics and semantic application engines may all be created and refined by the computer itself. Most traditional artificial intelligence research, however, has continued to rely on programmer-written code: it does not distinguish the creation of semantics from the creation of semantic application engines, and writes both, tightly coupled, into a single program. As a result, earlier artificial intelligence systems have been complex and poorly generalizable.
The first innovation claimed by this patent application is to decompose the semantic system in artificial intelligence research into a semantic creation system and a semantic application engine creation system, just as humans can learn knowledge and learn skills separately. Breakthroughs can then be pursued in each part independently. The semantic engineering system of the present invention mainly accomplishes the part in which the computer creates semantics by itself; the part that creates semantic application engines is provided as an open development environment for third-party programmers. How to let the computer itself create semantic application engines will be disclosed in a separate patent application.
When building the semantic model of the computer semantic engineering system, I tried many mathematical modeling methods used in the industry, including ontology-based OWL modeling, but none was satisfactory. The reason is that no single mathematical model achieves the completeness of a Turing machine, that is, none lets a computer simulate arbitrarily complex phenomena in the world. Knowledge models based on mathematical algorithms are applicable only within narrow, specific domains and fail when confronted with the complex social phenomena of big data. After repeated attempts, I found that only human natural language can describe arbitrarily complex social phenomena. Every nation and ethnic group in the world, however undeveloped and even if it has no knowledge of mathematics at all, has a complete language system, and its people never feel unable to communicate because of defects in their language. More importantly, the rules of language, namely grammar, are far easier to master than mathematical algorithms.
The second important innovation claimed by this patent application is to replace mathematical algorithm models with natural-language descriptive models when building the semantic model of the semantic engineering system, thereby strengthening the generality of the system. The semantic structure fully absorbs ideas from the ancient Chinese philosophical classic "The Book of Changes", including movement and stillness, yin and yang, number, temporal order, spatial order, the law of cycles, and the law of holography. At the same time, natural-language sentences are adopted as semantic expressions, and natural-language grammar is adopted as the description rule of the semantic model (its role is similar to that of SQL for a database).
"Semantics" for a computer is a broad concept and is not necessarily the semantics of human language. A semantic engineering system can construct specialized computer "semantics" for any particular knowledge domain and develop specialized semantic application engines for it. Many current artificial intelligence systems take this approach; its drawback is a lack of generality. To strengthen the generality of the system and its ability to handle complex big data, the present invention adopts the descriptive model of natural language as the computer's semantic model. The computer semantic engineering system of the present invention is therefore particularly suited to research on Chinese natural language understanding (CNLU). To this end, the third innovation claimed by this patent application is to let the computer simulate the human cognitive process and use the "semantics" that the computer itself creates to approximate the semantics that humans create in daily life, thereby achieving "semantic understanding" of Chinese natural language.
Much current research on computer understanding of Chinese natural language has abandoned the original approach of imitating human thinking and has instead adopted large-scale corpus statistics. Such work can only count as "Chinese information processing", that is, extracting some useful information from real Chinese text, and falls far short of semantic understanding. The semantic engineering system of the present invention, by contrast, is expected to approach the original goal of research on Chinese natural language semantic understanding.
Summary of the invention
The invention provides a computer semantic engineering system, intended mainly for research on Chinese natural language semantic understanding, which can also serve as a basic technical solution for all kinds of computer intelligence applications.
The computer semantic engineering system of the present invention is characterized in that it comprises a system in which a computer, based on externally input information, applies engineering methods to create, accumulate, manage, and self-improve computer semantics. Here "semantics" does not refer to the semantics of human language in the everyday sense, but to semantics that the computer itself defines and that the machine can understand. The scope of the computer's understanding comprises the scope of its perception plus the set of functions of all its semantic application engines.
This semantic engineering system is further characterized in that, in order to make the "semantics" produced by the computer as close as possible to the "semantics" of human natural language, the system is implemented by having the computer's cognitive structure simulate the human cognitive structure. That is, a natural-language model replaces a mathematical model in constructing the computer's semantic model; the description language of the computer semantic model is human natural language, and the rules of that description language are the grammar of human natural language. Wherein,
the "cognitive structure" comprises a process of mapping input information to computer semantics, and a process of producing a behavioral response according to this mapping relationship and a series of understanding rules.
The principle of this semantic engineering system and the overall framework of its subsystems are shown in Figure 1. The system comprises a dynamic semantic dictionary, a semantic mapping engine, a digital brain (CyberBrain), a semantic model library and its modeling tools, a semantic learning engine, a semantic application engine development environment, and rule maintenance tools. Wherein,
Preferably, the system comprises: a dynamic semantic dictionary, semantic models, a digital brain (CyberBrain), a semantic mapping engine, a semantic learning engine, and a semantic application engine development environment.
The dynamic semantic dictionary is the basis of semantic recognition. Its main difference from a traditional electronic dictionary is this: in a traditional dictionary, the annotation of each word is usually a passage of text entered manually by language experts and is fixed; in the dynamic semantic dictionary, the annotation of each word changes dynamically as the semantic model library and the digital brain are continually enriched and improved.
The semantic model simulates the human knowledge structure. It absorbs the modeling ideas of structuring, digitization, and the combination of movement and stillness from the ancient Chinese philosophical classic "The Book of Changes" to construct a unique computer semantic structure. Its main difference from other semantic models in the industry is this: traditional semantic models are all mathematical algorithm models of some kind, including the currently popular ontology model; the semantic model of the semantic engineering system is a language-like descriptive model, for which mathematics serves as a reasoning tool.
The digital brain (CyberBrain) stores and manages semantic instances. It differs from knowledge bases and databases in this respect: the storage structure of a traditional knowledge base or database is fixed, and storing merely increases or decreases the amount of data or knowledge it holds; the storage structure of the digital brain is dynamically optimized according to the needs of storing semantic instances and the feedback of the semantic learning engine. It is therefore not a "store" but a "brain".
The semantic mapping engine performs semantic recognition and mapping. It is the basic module of the semantic engineering system, and it determines whether Chinese natural-language text can be successfully converted into semantic instances in the digital brain of the semantic engineering system.
The semantic learning engine performs semantic engineering: from massive fragmentary semantic instances it refines complete semantic instances and creates new semantic models, continually enriching the semantic model library and improving the storage structure of the digital brain. It is the key module that makes the semantic engineering system practical.
The semantic application engine development environment realizes the value of all the semantics that the computer creates. Its main difference from similar modules in other artificial intelligence systems is this: other artificial intelligence systems usually provide only one specific semantic application engine for one knowledge model, performing one particular kind of behavior; the semantic engineering system can design different semantic application engines for semantic groups of different granularity within one set of semantic models, and the smallest granularity can be a single semantic unit. The semantic engineering system therefore designs the creation of semantic application engines as an open system, namely the semantic application engine development environment, in which third parties use the digital brain to develop various semantic application engines, maximizing the value of the system's semantics.
Fig. 1 is a schematic framework diagram of the computer semantic engineering system (see the accompanying drawings).
"Semantics" for a computer is a broad concept and is not necessarily the semantics of human language. However, enabling the computer to approach an understanding of human-language semantics is the highest goal of the present invention. The implementation logic of the computer semantic engineering system is therefore described below, taking Chinese natural language semantic understanding as the example.
1) The big-data objects on which the semantic engineering system of the present invention operates are the massive real Chinese texts on the Internet. The system takes an article as a semantic group; each sentence in the article is a complete semantic unit that represents a fragment of some semantic model. An article involves at least one semantic model and usually many. The system processes one sentence at a time, that is, the uninterrupted character string between sentence punctuation marks. This string first enters the semantic mapping engine. The engine calls the dynamic semantic dictionary to segment the input string into words; then, using each word's annotation in the dictionary (that is, the word's usage in the various known semantic models), it disambiguates the segmented words, tags them with semantic features, and identifies all semantic expressions in the sentence.
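The first stage of step 1 can be sketched as follows. This is only an illustration under stated assumptions: the toy dictionary, the function names, and the use of forward maximum matching for word segmentation are all hypothetical choices made here; the text does not specify which segmentation algorithm the semantic mapping engine uses.

```python
import re

# Hypothetical toy dictionary: word -> usages of the word in known semantic models.
TOY_DICTIONARY = {
    "华为": ["enterprise: name"],
    "发布": ["enterprise: product launch"],
    "新": ["attribute: novelty"],
    "手机": ["product: category"],
}

SENTENCE_END = "。！？!?；;"


def split_sentences(article):
    """Return the uninterrupted strings between sentence punctuation marks."""
    parts = re.split("[" + re.escape(SENTENCE_END) + "]", article)
    return [p.strip() for p in parts if p.strip()]


def segment(sentence, dictionary, max_len=4):
    """Forward maximum matching (an assumed technique): at each position take
    the longest dictionary word, falling back to a single character."""
    words = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def annotate(words, dictionary):
    """Attach each word's dictionary annotation (empty if the word is unknown)."""
    return [(word, dictionary.get(word, [])) for word in words]


if __name__ == "__main__":
    for sentence in split_sentences("华为发布新手机。"):
        print(annotate(segment(sentence, TOY_DICTIONARY), TOY_DICTIONARY))
```

Disambiguation and the identification of semantic expressions would build on the annotations returned by `annotate`; they are omitted here because the text does not describe their rules.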
2) Behind each semantic expression lie one or more semantic models. From the identified semantic expressions, the semantic mapping engine infers which semantic models, or instances of semantic models, to call. If these have already been defined in the semantic model library or instantiated in the digital brain, the engine retrieves them directly. If some semantic model does not yet exist in the system, the engine obtains semantic model information from the semantic expression (since natural language is the description language of semantic models) and creates a new semantic model according to the rules for creating semantic models from semantic expressions.
For example, suppose a sentence mentions an enterprise called "Huawei". From the dynamic semantic dictionary, the semantic mapping engine knows that Huawei is an enterprise. It first checks whether the digital brain already holds a semantic instance of the enterprise Huawei and, if so, retrieves it. If not, it fetches the general semantic model of an enterprise from the semantic model library. If even the enterprise semantic model does not exist, it creates an enterprise semantic model (fragment) from the semantic expressions of the sentence.
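The three-level fallback in this example can be sketched as below. The dictionary-based data structures and the name `resolve` are assumptions made for illustration; the actual rules for creating a model from a semantic expression are not given in the text, so the sketch simply takes the slots mentioned in the sentence as the new model fragment.

```python
def resolve(entity, category, brain, model_library, sentence_facts):
    """Return (structure, source) using the fallback order described above.

    brain:          hypothetical dict, entity name -> semantic instance
    model_library:  hypothetical dict, category -> list of slot names
    sentence_facts: hypothetical dict, slot name -> value found in the sentence
    """
    # 1. An existing semantic instance in the digital brain is retrieved as-is.
    if entity in brain:
        return brain[entity], "instance"
    # 2. Otherwise the general model of the category is fetched as an empty skeleton.
    if category in model_library:
        return {slot: None for slot in model_library[category]}, "model"
    # 3. Otherwise a fragmentary model is created from the sentence itself.
    model_library[category] = list(sentence_facts)
    return {slot: None for slot in sentence_facts}, "new_model"


if __name__ == "__main__":
    brain, library = {}, {}
    print(resolve("Huawei", "enterprise", brain, library, {"product_mix": "..."}))
    print(library)  # the enterprise model fragment now exists in the library
```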
3) A retrieved semantic model is a mere skeleton, whereas the sentence that the semantic mapping engine is processing contains specific information about the real world. The engine fills the real information from the sentence into the semantic model, turning the model into a semantic instance based on it; this completes one semantic mapping. If what was retrieved is an existing semantic instance in the digital brain, the engine adds the new information contained in the sentence to that instance. The output of the semantic mapping engine is a semantic instance stored in the digital brain.
For example, a general enterprise model comprises organizational structure, business model, product mix, operating status, and so on. This model applies to many enterprises. Once Huawei's information is filled in, the model becomes a concrete semantic instance describing the enterprise Huawei. A single sentence usually mentions only one aspect of Huawei (hence the term "fragment"). The first time the semantic mapping engine encounters a sentence about Huawei, it instantiates an incomplete Huawei enterprise semantic model and stores it in the digital brain. Each time it later encounters a sentence about Huawei, it retrieves the Huawei semantic instance from the digital brain, adds the new information, forms a larger semantic instance (fragment) about Huawei, and stores it in the digital brain again.
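The instantiate-then-accumulate behavior described in this example might look like the following minimal sketch. The slot names follow the enterprise model listed above; the function name, the dictionary representation, and the placeholder values are hypothetical.

```python
# Slots of the general enterprise model named in the text.
ENTERPRISE_MODEL = (
    "organizational_structure",
    "business_model",
    "product_mix",
    "operating_status",
)


def map_sentence(brain, entity, model, facts):
    """Fill the facts from one sentence into the entity's semantic instance.

    On first mention the model skeleton is instantiated with empty slots;
    on later mentions the stored fragment is retrieved and extended.
    """
    instance = brain.get(entity)
    if instance is None:
        instance = {slot: None for slot in model}
    instance.update(facts)
    brain[entity] = instance  # store the (larger) fragment back
    return instance


if __name__ == "__main__":
    brain = {}
    # Placeholder values stand in for information extracted from real sentences.
    map_sentence(brain, "Huawei", ENTERPRISE_MODEL, {"product_mix": "<from sentence 1>"})
    map_sentence(brain, "Huawei", ENTERPRISE_MODEL, {"business_model": "<from sentence 2>"})
    print(brain["Huawei"])
```

Slots that no sentence has filled yet remain `None`, which is one simple way to represent that the instance is still a fragment.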
4) The digital brain is the module dedicated to storing, retrieving, and managing semantic instances. Its input is the semantic instance fragments output by the semantic mapping engine; its output, which is also the semantic engineering system's output to the semantic application engines, consists mainly of more complete semantic instances. Its most distinctive feature is that, as semantic instances are enriched and in particular as descriptive dimensions are added, its storage structure adjusts accordingly to maintain optimal storage performance. Conventional database systems do not have this capability.
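The text does not say how the storage structure adapts, so the following is only one possible reading, offered as an assumption: a store that builds an index for a descriptive dimension once enough instances carry that dimension. The class name `AdaptiveStore`, the threshold, and the indexing strategy are all hypothetical.

```python
from collections import defaultdict


class AdaptiveStore:
    """Toy instance store whose structure changes as new dimensions appear."""

    def __init__(self, index_threshold=2):
        self.instances = {}                      # name -> {dimension: value}
        self.indexes = {}                        # dimension -> {value: set of names}
        self.dimension_counts = defaultdict(int)
        self.index_threshold = index_threshold

    def put(self, name, fragment):
        """Merge a fragment into the named instance and adapt the indexes."""
        instance = self.instances.setdefault(name, {})
        for dim, value in fragment.items():
            if dim not in instance:
                self.dimension_counts[dim] += 1
            old = instance.get(dim)
            instance[dim] = value
            if dim in self.indexes:
                # Keep an existing index in step with the new value.
                if old is not None:
                    self.indexes[dim][old].discard(name)
                self.indexes[dim][value].add(name)
            elif self.dimension_counts[dim] >= self.index_threshold:
                # The dimension has become common: restructure by adding an index.
                self._build_index(dim)

    def _build_index(self, dim):
        index = defaultdict(set)
        for name, instance in self.instances.items():
            if dim in instance:
                index[instance[dim]].add(name)
        self.indexes[dim] = index

    def find(self, dim, value):
        """Use the index when one exists, otherwise scan all instances."""
        if dim in self.indexes:
            return set(self.indexes[dim].get(value, set()))
        return {n for n, inst in self.instances.items() if inst.get(dim) == value}
```

Values are assumed to be hashable. A real digital brain would presumably also react to feedback from the semantic learning engine, which this sketch does not model.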
5) The semantic learning engine is the functional module that performs semantic engineering. Its input is the semantic instance fragments in the digital brain; its output is more complete semantic instances and newly extracted semantic models. The engine's library of learning rules simulates most human learning methods, for example: analysis, induction, synthesis, abstraction, inheritance, association, comparison, and judgment. Following these rules, the semantic learning engine repeatedly reorganizes the semantic instance fragments in the digital brain, gradually restores complete semantic instances, and stores them back in the digital brain. At the same time, it extracts the common and regular parts and outputs them to the semantic model library as suggestions for improving semantic models.
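Two of the learning methods named in step 5, synthesis (merging fragments) and induction (extracting what instances have in common), can be illustrated with the sketch below. The function names, the data layout, and the support threshold are assumptions; the actual learning rules are held in a rule base that the text does not detail.

```python
from collections import Counter


def merge_fragments(fragments):
    """Synthesis: combine (entity, facts) fragments into one instance per entity."""
    merged = {}
    for entity, facts in fragments:
        merged.setdefault(entity, {}).update(facts)
    return merged


def induce_model(instances, min_support=0.6):
    """Induction: propose as a model the slots shared by enough instances.

    min_support is a hypothetical threshold: the fraction of instances
    that must contain a slot for it to enter the suggested model.
    """
    counts = Counter(slot for inst in instances.values() for slot in inst)
    total = len(instances)
    return sorted(slot for slot, n in counts.items() if n / total >= min_support)


if __name__ == "__main__":
    fragments = [
        ("A", {"products": 1}), ("A", {"founder": 2}),
        ("B", {"products": 3}), ("B", {"slogan": 4}),
        ("C", {"products": 5, "founder": 6}),
    ]
    instances = merge_fragments(fragments)
    print(induce_model(instances))  # suggestion for the semantic model library
```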
6) The semantic model library is the module that creates and maintains the knowledge framework of the semantic engineering system, and is the soul of the whole system. It absorbs the design concepts of structuring, digitization, and the combination of movement and stillness from "The Book of Changes", uses language-like models instead of mathematical models to build semantic models, and uses natural language as the descriptive tool for semantic models. The initial "seed" semantic models in the library are entered manually and are subsequently enriched and grown with the cooperation of the semantic learning engine.
7) The dynamic semantic dictionary is in effect another form of the semantic model library. Each of its words is defined in the semantic model library, and the annotation of each word is that word's usage in the relevant semantic models. As the semantic model library is continually enriched and improved, the dynamic semantic dictionary is dynamically updated, enriched, and improved with it. The role of the dynamic semantic dictionary is to give the semantic mapping engine a basis for character and word lookup, to provide grounds for disambiguating semantic expressions, and to give the engine clues as to which group of semantic models or semantic instances to call.
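Because the dictionary is described as another view of the semantic model library, one way to picture it is as a structure regenerated from the library whenever the library changes. The sketch below assumes a hypothetical library layout (model name mapped to the words it defines and their usages); the real representation is not specified in the text.

```python
def build_dictionary(model_library):
    """Derive word annotations from the model library.

    model_library: hypothetical dict, model name -> {word: usage in that model}
    Returns a dict, word -> list of "model: usage" annotations.
    """
    dictionary = {}
    for model_name, words in model_library.items():
        for word, usage in words.items():
            dictionary.setdefault(word, []).append(f"{model_name}: {usage}")
    return dictionary


if __name__ == "__main__":
    library = {"fruit": {"apple": "a kind of fruit"}}
    print(build_dictionary(library))
    # After the library grows, regenerating the dictionary updates the annotation.
    library["enterprise"] = {"apple": "name of a company"}
    print(build_dictionary(library))
```

A word with several annotations, such as "apple" after the second call, is exactly the case in which the semantic mapping engine needs the annotations as grounds for disambiguation.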
At this point, semantic mapping engine → digital brain → semantic learning engine → semantic model library → dynamic semantic dictionary forms a complete closed loop of the semantic engineering self-learning process.
It is worth noting that, to improve the extensibility of the system, each engine is equipped with a rule base that permits manual intervention. That is, people control the function and performance of each engine by changing its control rules.
8) The complete semantic instances stored in the digital brain are exported to semantic application engines as the output of the semantic engineering system. To suit different application domains, in which different semantic instances are used to develop different application engines, the semantic engineering system of the present invention includes a semantic application engine development environment, which supports semantic application engines in making smooth use of the semantic instance resources of the digital brain.
Specific semantic application engines based on the digital brain will be disclosed in other patent applications.
Embodiment
The manner and order of implementing the invention are further described below.
1) Complete the development of the semantic model library, and manually enter the "seed" semantic models into it;
2) Complete the development of the dynamic semantic dictionary, and dynamically generate the "seed" semantic dictionary from the "seed" semantic models in the semantic model library;
3) To increase generality, the content of the two steps above can be planned with reference to the "Modern Chinese Dictionary"; that is, according to the words it includes, plan which "seed" semantic models are needed, which semantic units the dynamically generated dynamic semantic dictionary should contain, and how much vocabulary it should cover.
4) Complete the development of the digital brain.
5) Complete the development of the semantic mapping engine, and select real Chinese texts as a sample library for debugging it. The debugging process mainly enriches and improves the semantic mapping rule base.
6) Complete the development and debugging of the semantic learning engine. The debugging process likewise enriches and improves the semantic learning rule base.
7) Run the semantic engineering system of the present invention in a real Internet environment, so that the digital brain accumulates enough reasonably complete semantic instances and, at the same time, the semantic model library and the dynamic semantic dictionary "grow" to a practical scale.
8) Complete the construction of the semantic application engine development environment, and use the digital brain to develop several semantic application engines, for example: an automatic Internet information classification system, an intelligent public opinion monitoring system, an intelligent supply-and-demand matching system, and a news topic deduction system.