CN109710914A - Semantic training system and its method based on business model - Google Patents

Semantic training system and its method based on business model

Info

Publication number
CN109710914A
Authority
CN
China
Prior art keywords
semantic
corpus
business model
training
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711011579.1A
Other languages
Chinese (zh)
Inventor
饶竹一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201711011579.1A
Publication of CN109710914A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)

Abstract

The present invention discloses a semantic training system based on a business model, and a corresponding method. The system includes: a recording module, for receiving the entry operation of a business model and obtaining business model data, the business model data including objects, attributes, or methods; an analysis module, connected to the recording module, for analyzing the objects, attributes, or methods to obtain user language knowledge, semantic structures, or semantic events, constructing semantic patterns accordingly, and generating a basic corpus; a training use-case generation module, connected to the analysis module, for generating training case data from the basic corpus; and a training execution module, connected to the analysis module and the training use-case generation module, for generating an SPD file, a TCD file, and mapping relationship files from the semantic patterns and the training case data. The invention addresses the problem that semantic training tools are difficult for application developers to use and master, and lowers the threshold for application-domain developers to access semantic analysis capability.

Description

Semantic training system and its method based on business model
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a semantic training system based on a business model and a method thereof.
Background art
As artificial intelligence permeates every field, the world is moving from the Internet era into the era of artificial intelligence. Natural language processing, an important branch of artificial intelligence, plays an important role in advancing its development. Natural language processing (NLP) refers to the ability of a machine to understand and interpret human writing and speech. The goal of NLP is to make computers as intelligent as humans in understanding language; the ultimate goal is to bridge the gap between human communication (natural language) and computer understanding (machine language). NLP analysis techniques fall roughly into three levels: lexical analysis, syntactic analysis, and semantic analysis. The final purpose of semantic analysis is to understand the true meaning a sentence expresses. Semantic analysis requires a semantic training tool: the user language knowledge, semantic structures, and semantic patterns produced by semantic training, together with lexical and syntactic analysis, realize the semantic analysis capability.
At present, semantic training relies on lexical and syntactic analysis, together with semantic role labeling of the user's corpus, to produce forms that an analysis engine can understand, such as user language knowledge, semantic structures, and semantic patterns. No semantic training process currently on the market adequately helps application-domain developers who lack an NLP background to carry out semantic development. Examples include iFlytek's AIUI development platform, Baidu's AI development platform, and Microsoft's LUIS. The semantic training aids these platforms provide cannot be used directly by application-domain developers, and this asymmetry of knowledge hinders the adoption of artificial intelligence. Using these platforms also demands a heavy workload and incurs high management and maintenance costs. Moreover, application developers do not fully understand the concepts and procedures involved in training, so although they carry out semantic training, they have no clear picture of its state.
In short, existing semantic training tools are difficult to use. An application developer without NLP knowledge cannot master the semantic training process, which raises the threshold for adopting NLP analysis technology. In addition, the volume of a user's semantic training is large, so training is hard to manage and costly to maintain.
Summary of the invention
The main object of the present invention is to provide a semantic training system based on a business model and a method thereof, aimed at solving the problems that semantic training is difficult to manage and costly to maintain.
To achieve the above object, the present invention proposes a semantic training system based on a business model, comprising:
A recording module, for receiving the entry operation of a business model and obtaining business model data, the business model data including objects, attributes, or methods;
An analysis module, connected to the recording module, for analyzing the objects, attributes, or methods to obtain user language knowledge, semantic structures, or semantic events, constructing semantic patterns accordingly, and generating a basic corpus;
A training use-case generation module, connected to the analysis module, for generating training case data from the basic corpus;
A training execution module, connected to the analysis module and the training use-case generation module, for generating an SPD file, a TCD file, and mapping relationship files from the semantic patterns and the training case data.
In one possible design, the recording module includes an object entry unit, an attribute entry unit, a method entry unit, and a data storage unit, wherein:
The object entry unit is used to receive the entry operation for objects in the business model; an object includes abstract words, hyponyms, and instance words;
The attribute entry unit is used to receive the entry operation for attributes of the object; an attribute includes an attribute word, an attribute value, and an attribute type;
The method entry unit is used to receive the entry operation for methods of the object; a method includes a method name, input parameters, and output parameters;
The data storage unit is used to store the business model in the database.
In one possible design, the analysis module includes an object analysis unit, an attribute analysis unit, a method analysis unit, an ontology expansion unit, a pattern construction unit, a basic corpus generation unit, and a data storage unit, wherein:
The object analysis unit is used to extract terms from the object, obtain user language knowledge, and associate the object with the user language knowledge;
The attribute analysis unit is used to analyze the attribute and obtain the semantic structure of the attribute;
The method analysis unit is used to analyze the method and obtain the corresponding semantic event;
The ontology expansion unit is used to expand the user language knowledge and adjust its structure;
The pattern construction unit is used to construct semantic patterns from the predicate semantic relations and phrase semantic relations in the business model, combined with the user language knowledge and the semantic structures;
The basic corpus generation unit is used to analyze the semantic patterns and replace each semantic role in a semantic pattern with concrete words to generate the basic corpus;
The data storage unit is used to store the mapping between the user language knowledge and the business model, the user language knowledge, the semantic structures, the semantic patterns, and the basic corpus in the database server.
In one possible design, the training use-case generation module includes a corpus acquisition unit, an equivalent expansion and merging unit, an automatic annotation unit, and a data storage unit, wherein:
The corpus acquisition unit is used to obtain the basic corpus generated by the analysis module and to obtain an expanded corpus in a preset manner;
The equivalent expansion and merging unit is used to perform equivalent expansion and merging on the common language knowledge and the corpus obtained by the corpus acquisition unit, obtaining the standard form of the corpus and its associated equivalent expressions;
The automatic annotation unit is used to automatically annotate the standard form of the corpus produced by the equivalent expansion and merging unit, obtaining training case data;
The data storage unit is used to save the training case data in the database.
In one possible design, the training execution module includes an SPD generation unit, a TCD generation unit, a basic semantic mapping unit, a complex semantic mapping unit, and a data storage unit, wherein:
The SPD generation unit is used to associate the semantic events with the semantic patterns and generate the SPD file;
The TCD generation unit is used to associate the training case data with the semantic patterns and generate the TCD file;
The basic semantic mapping unit is used to analyze the simple statements in the training case data, map each to a basic semantic pattern, and generate a simple mapping relationship file;
The complex semantic mapping unit is used to analyze the complex statements in the training case data, map each to a set of basic semantic patterns, and generate a complex mapping relationship file;
The data storage unit uploads the SPD file, the TCD file, the simple mapping relationship file, and the complex mapping relationship file to the file server.
The present invention also provides a semantic training method based on a business model, comprising:
Receiving the entry operation of a business model and obtaining business model data, the business model data including objects, attributes, or methods;
Analyzing the objects, attributes, or methods to obtain user language knowledge, semantic structures, or semantic events, constructing semantic patterns accordingly, and generating a basic corpus;
Generating training case data from the basic corpus;
Generating an SPD file, a TCD file, and mapping relationship files from the semantic patterns and the training case data.
In one possible design, receiving the entry operation of the business model includes:
Receiving the entry operation for objects in the business model; an object includes abstract words, hyponyms, and instance words;
Receiving the entry operation for attributes of the object; an attribute includes an attribute word, an attribute value, and an attribute type;
Receiving the entry operation for methods of the object; a method includes a method name, input parameters, and output parameters;
Storing the business model in the database.
In one possible design, analyzing the objects, attributes, or methods to obtain user language knowledge, semantic structures, or semantic events, constructing semantic patterns accordingly, and generating a basic corpus includes:
Extracting terms from the object to obtain user language knowledge, and associating the object with the user language knowledge;
Expanding the user language knowledge and adjusting its structure;
Analyzing the attribute to obtain the semantic structure of the attribute;
Analyzing the method to obtain the corresponding semantic event;
Constructing semantic patterns from the predicate semantic relations and phrase semantic relations in the business model, combined with the user language knowledge and the semantic structures;
Analyzing the semantic patterns and replacing each semantic role in a semantic pattern with concrete words to generate the basic corpus;
Storing the mapping between the user language knowledge and the business model, the user language knowledge, the semantic structures, the semantic patterns, and the basic corpus in the database server.
In one possible design, generating training case data from the basic corpus includes:
Obtaining the basic corpus generated by the analysis module, and obtaining an expanded corpus in a preset manner;
Performing equivalent expansion and merging on the common language knowledge and the corpus obtained by the corpus acquisition unit, obtaining the standard form of the corpus and its associated equivalent expressions;
Automatically annotating the standard form of the corpus to obtain training case data;
Saving the training case data in the database.
In one possible design, generating the SPD file, the TCD file, and the mapping relationship files from the semantic patterns and the training case data includes:
Associating the semantic events with the semantic patterns to generate the SPD file;
Associating the training case data with the semantic patterns to generate the TCD file;
Analyzing the simple statements in the training case data, mapping each to a basic semantic pattern, and generating a simple mapping relationship file;
Analyzing the complex statements in the training case data, mapping each to a set of basic semantic patterns, and generating a complex mapping relationship file;
Uploading the SPD file, the TCD file, the simple mapping relationship file, and the complex mapping relationship file to the file server.
The semantic training system and method based on a business model proposed by the present invention solve the problem that semantic training tools are difficult for application developers to use and master. They lower the threshold for application-domain developers to access semantic analysis capability: an application developer needs only domain knowledge to complete training, with no additional grammatical or semantic knowledge required. They solve the problem that semantic training is difficult to manage and costly to maintain, particularly where grammatical and semantic knowledge is involved. They also solve the problem that semantic training is complex, that training cases are hard to define completely and systematically, and that a low degree of automation leads to a heavy workload.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the semantic training system based on a business model according to an embodiment of the present invention;
Fig. 2 is a structural schematic diagram of the recording module according to an embodiment of the present invention;
Fig. 3 is a structural schematic diagram of the analysis module according to an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of the training use-case generation module according to an embodiment of the present invention;
Fig. 5 is a structural schematic diagram of the training execution module according to an embodiment of the present invention;
Fig. 6 is a flow diagram of the semantic training method based on a business model according to an embodiment of the present invention;
Fig. 7 is a flow diagram of business model entry according to an embodiment of the present invention;
Fig. 8 is a flow diagram of business model analysis according to an embodiment of the present invention;
Fig. 9 is a flow diagram of training use-case generation according to an embodiment of the present invention;
Fig. 10 is a flow diagram of training execution according to an embodiment of the present invention;
The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
Embodiments of the present invention will now be described with reference to the drawings. In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the explanation of the invention and have no specific meaning in themselves. Therefore, "module" and "component" may be used interchangeably.
As shown in Fig. 1, the present invention provides a semantic training system based on a business model, including a recording module 10, an analysis module 20, a training use-case generation module 30, and a training execution module 40. Wherein:
The recording module 10 is used to receive the entry operation of a business model and obtain business model data, the business model data including objects, attributes, or methods. The recording module 10 thus helps application developers enter the business model of their own domain.
The analysis module 20 is connected to the recording module 10 and is used to analyze the objects, attributes, or methods to obtain user language knowledge, semantic structures, or semantic events, construct semantic patterns accordingly, and generate a basic corpus; here, corpus refers to typical expressions in the user's application domain. In a specific implementation, the analysis module 20 combines the business model data with common language knowledge data and analyzes the objects, attributes, and methods of the business model to obtain user language knowledge such as the ontology dictionary, as well as the mapping between the business model and the user language knowledge. Finally, the analysis module stores these data in the database.
The training use-case generation module 30 is connected to the analysis module 20 and is used to generate training case data from the basic corpus. It can obtain a sufficient corpus through user import, crawling of open corpora, and generation based on attribute patterns, then perform equivalent expansion and merging on the corpus and annotate it automatically to produce training data. The user can adjust the training cases manually; after the user confirms them, the training case data is stored in the database. A training case is an annotated piece of corpus.
The training execution module 40 is connected to the analysis module 20 and the training use-case generation module 30 and is used to generate an SPD file, a TCD file, and mapping relationship files from the semantic patterns and the training case data. More specifically, the training execution module 40 executes semantic training: based on the user language knowledge and the training case data, it trains with a training engine to generate the SPD file and the TCD file, and it also generates the basic-semantics and complex-semantics mapping rule files. The analysis engine loads this language knowledge and these rule files to provide semantic analysis for the user's application domain. Here, the training engine trains the corpus into language knowledge; the analysis engine analyzes the user's expressions, generates semantic representations using the language knowledge, and converts them into a form the user can consume.
On the basis of the embodiment corresponding to Fig. 1, as shown in Fig. 2, in another embodiment of the semantic training system based on a business model provided by the present invention, the recording module 10 includes an object entry unit 11, an attribute entry unit 12, a method entry unit 13, and a data storage unit 14. In this embodiment, the recording module 10 may use guided entry, in which objects and methods must be entered and attribute entry is optional. The entry order runs from objects to attributes and then to methods. Wherein:
The object entry unit 11 is used to receive the entry operation for objects in the business model; an object includes abstract words, hyponyms, and instance words. An abstract word denotes a class of things, such as "song"; a hyponym is a further classification of the abstract word whose extension is narrower than that of the abstract word, such as "pop song"; an instance word is a concrete instance of an abstract word or hyponym, such as "Ten Years". The object entry unit must enter all objects involved in the business model, including the three kinds of words above and the hypernym-hyponym relations among them. For example, in the business model "query stocks", "stock" is the object. "Stock" is an abstract word; it may also have hyponyms, such as "high-performing stocks", and instance words, such as "Shanghai manization".
The attribute entry unit 12 is used to receive the entry operation for attributes of the object; an attribute includes an attribute word, an attribute value, and an attribute type. Attribute entry depends on an object, so the object it depends on must be selected when an attribute is entered. For example, in the business model "stocks whose shareholder is Ma Yun", "stock" is the object, as noted above, and "shareholder is Ma Yun" is the attribute. Here "shareholder" is the attribute word and "Ma Yun" is the attribute value. The attribute value type is detected automatically and then handed to the user for confirmation; in this example, the attribute value type is person name or shareholder name.
The method entry unit 13 is used to receive the entry operation for methods of the object; a method includes a method name, input parameters, and output parameters. A method can be understood as a functional operation on an object. Method entry depends on objects and attributes: when defining a method, besides specifying the method name, the input and output parameters must also be specified. Input parameters are selected from objects and attributes, and output parameters are selected from objects; input parameters are optional, while the output parameter is required. In general, the method name is user-defined, whereas the input and output parameters are selected from the objects or attributes of the business model. For example, in the business model "query the stocks whose shareholder is Ma Yun", "query" is the method. The input parameters of the method are added based on objects or attributes, and the output parameter is added based on objects; the output parameter represents the user's intent. So in this example, "shareholder is Ma Yun" is an attribute serving as the input parameter, and "stock" is an object serving as the output parameter.
The data storage unit 14 is used to store the business model in the database server 50; the database server 50 uses a MySQL database.
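The entry rules described above (attributes depend on an object, a method requires an output chosen from objects, and inputs are optional) can be sketched as a small data model. This is a minimal illustration only: the class names, field names, and validation messages are hypothetical and do not come from the patent, and persistence to MySQL is omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BusinessObject:
    abstract_word: str                                   # a class of things, e.g. "stock"
    hyponyms: List[str] = field(default_factory=list)    # narrower categories
    instances: List[str] = field(default_factory=list)   # concrete examples


@dataclass
class Attribute:
    owner: str        # the object this attribute depends on
    word: str         # attribute word, e.g. "shareholder"
    value: str        # attribute value, e.g. "Ma Yun"
    value_type: str   # detected automatically, then confirmed by the user


@dataclass
class Method:
    name: str                                            # user-defined method name
    output: str                                          # required, chosen from objects
    inputs: List[str] = field(default_factory=list)      # optional, from objects or attributes


@dataclass
class BusinessModel:
    objects: List[BusinessObject] = field(default_factory=list)
    attributes: List[Attribute] = field(default_factory=list)
    methods: List[Method] = field(default_factory=list)

    def _object_names(self):
        return {o.abstract_word for o in self.objects}

    def add_object(self, obj: BusinessObject) -> None:
        self.objects.append(obj)

    def add_attribute(self, attr: Attribute) -> None:
        # Attribute entry depends on an object that was entered earlier.
        if attr.owner not in self._object_names():
            raise ValueError(f"unknown object: {attr.owner}")
        self.attributes.append(attr)

    def add_method(self, method: Method) -> None:
        # The output must be an object; inputs may be objects or attribute words.
        selectable = self._object_names() | {a.word for a in self.attributes}
        if method.output not in self._object_names():
            raise ValueError(f"output must be an object: {method.output}")
        for name in method.inputs:
            if name not in selectable:
                raise ValueError(f"unknown input parameter: {name}")
        self.methods.append(method)


# Guided entry order: objects first, then attributes, then methods.
model = BusinessModel()
model.add_object(BusinessObject("stock", hyponyms=["high-performing stock"]))
model.add_attribute(Attribute("stock", "shareholder", "Ma Yun", "person name"))
model.add_method(Method("query", output="stock", inputs=["shareholder"]))
```

Validating at entry time mirrors the guided order in the text: an attribute cannot be entered before its object, and a method cannot reference parameters that were never entered.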
On the basis of the embodiment corresponding to Fig. 1, as shown in Fig. 3, in another embodiment of the semantic training system based on a business model provided by the present invention, the analysis module 20 includes an object analysis unit 21, an attribute analysis unit 22, a method analysis unit 23, an ontology expansion unit 24, a pattern construction unit 25, a basic corpus generation unit 26, and a data storage unit 27. In a specific implementation, the work of each unit is triggered by a user click. Wherein:
The object analysis unit 21 is used to extract terms from the object, obtain user language knowledge, and associate the object with the user language knowledge. By extracting terms from the objects entered through the recording module 10, the object analysis unit 21 obtains the user language knowledge, namely the ontology dictionary and the structure of the ontology dictionary. It also associates the objects of the business model with user language knowledge such as the ontology dictionary.
The ontology dictionary includes ontologies and named entities. For example, in "people whose birthday is October 1990", "birthday is October 1990" is an attribute, "birthday" is the attribute word, and "October 1990" is the attribute value. Named entity recognition identifies the value as a date entity, which is the attribute value type, so the semantic structure of the attribute can be expressed as "birthday is date".
The attribute analysis unit 22 is used to analyze the attribute and obtain its semantic structure. By applying term extraction, named entity recognition, and syntactic structure analysis to the attribute content entered through the recording module 10, the attribute analysis unit 22 obtains the ontology dictionary entries associated with the attribute and the semantic structure of the attribute. The semantic structure of an attribute takes the form of a combination of attribute words, attribute relation words or attribute expressions, and ontology dictionary entries. In addition, the expressions of the attribute are expanded on the basis of common language knowledge, yielding more semantic structures for the attribute.
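To make the "birthday is date" example concrete, the following sketch derives an attribute's semantic structure by replacing a recognized value with its entity type. It is an illustration under stated assumptions: the two regular expressions are toy stand-ins for a real named entity recognizer, and the function names and the "@type" notation for the slot are hypothetical choices, not the patent's implementation.

```python
import re
from typing import Optional

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Toy recognizers (assumption); a real system would call a trained NER component.
ENTITY_PATTERNS = {
    "date": re.compile(rf"(?:{MONTHS}) \d{{4}}"),
    "quantity": re.compile(r"\d+(?:\.\d+)?"),
}


def detect_value_type(value: str) -> Optional[str]:
    """Return the entity type of an attribute value, or None if unrecognized."""
    for entity_type, pattern in ENTITY_PATTERNS.items():
        if pattern.fullmatch(value):
            return entity_type
    return None


def attribute_structure(attribute_word: str, relation_word: str, value: str) -> str:
    """Combine attribute word, relation word, and the value's entity type."""
    entity_type = detect_value_type(value)
    # Unrecognized values are kept literally so the user can confirm a type later.
    slot = f"@{entity_type}" if entity_type else value
    return f"{attribute_word} {relation_word} {slot}"


structure = attribute_structure("birthday", "is", "October 1990")
```

Falling back to the literal value when no type is detected leaves room for the user confirmation step that the text describes for attribute value types.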
The method analysis unit 23 is used to analyze the method and obtain the corresponding semantic event. The method analysis unit 23 analyzes the methods entered through the recording module 10 and generates the corresponding semantic event structure. A semantic event is the representation of a method and is the API-style output of the analysis engine. A semantic event and a method are the same thing expressed at the semantic level and at the application-domain level respectively; when the analysis engine parses out a semantic event, the concrete output format can be associated with the method.
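One way to picture an API-style event structure for a method is a small JSON-serializable record. The source does not specify the event format, so the keys `event`, `output`, and `inputs` and the function name below are purely hypothetical; the sketch only reflects the stated rule that the output parameter is required and inputs are optional.

```python
import json
from typing import Dict, Optional


def build_semantic_event(method_name: str, output: str,
                         inputs: Optional[Dict[str, str]] = None) -> dict:
    """Build an event record for a method (hypothetical layout)."""
    if not output:
        # The output parameter carries the user's intent, so it must be present.
        raise ValueError("a method needs an output parameter")
    return {"event": method_name, "output": output, "inputs": dict(inputs or {})}


# "query the stocks whose shareholder is Ma Yun"
event = build_semantic_event("query", "stock", {"shareholder": "Ma Yun"})
payload = json.dumps(event, ensure_ascii=False)
```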
Because the ontologies obtained from the business model are limited, the ontology expansion unit 24 uses common language knowledge as a basis to expand the ontology dictionary already obtained; the structure of the ontology dictionary can also be modified and extended. That is, in this embodiment, the ontology expansion unit 24 is used to expand the user language knowledge and adjust its structure, for example by modifying hypernym-hyponym relations.
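The expansion and adjustment described here can be sketched as operations on a hypernym tree. The class below is a hypothetical illustration: the ontology dictionary is modeled as a word-to-hypernym map, common language knowledge as a hypernym-to-hyponyms map, and none of these names or structures are taken from the patent.

```python
from typing import Dict, List, Optional


class OntologyDictionary:
    """Words arranged by hypernym-hyponym relation (word -> hypernym)."""

    def __init__(self) -> None:
        self.parent: Dict[str, Optional[str]] = {}

    def add(self, word: str, hypernym: Optional[str] = None) -> None:
        if hypernym is not None and hypernym not in self.parent:
            raise KeyError(f"unknown hypernym: {hypernym}")
        # Existing entries keep their current position in the structure.
        self.parent.setdefault(word, hypernym)

    def expand(self, common_knowledge: Dict[str, List[str]]) -> None:
        """Add hyponyms from common language knowledge for words already known."""
        for hypernym, hyponyms in common_knowledge.items():
            if hypernym not in self.parent:
                continue  # outside the user's domain, so ignored
            for word in hyponyms:
                self.add(word, hypernym)

    def ancestors(self, word: str) -> List[str]:
        chain = []
        current = self.parent[word]
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain

    def move(self, word: str, new_hypernym: str) -> None:
        """Adjust the structure by re-attaching a word under another hypernym."""
        if word == new_hypernym or word in self.ancestors(new_hypernym):
            raise ValueError("move would create a cycle")
        self.parent[word] = new_hypernym


ontology = OntologyDictionary()
ontology.add("song")
ontology.add("pop song", "song")
ontology.expand({"song": ["folk song", "ballad"], "film": ["comedy"]})
ontology.move("ballad", "pop song")
```

Rejecting moves that would create a cycle keeps the hypernym-hyponym relation a tree, which later steps can rely on when they look up all words under a category.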
The pattern construction unit 25 is used to construct semantic patterns from the predicate semantic relations and phrase semantic relations in the business model, combined with the user language knowledge and the semantic structures. More specifically, the pattern construction unit 25 takes the predicate semantic relations and phrase semantic relations contained in the business model and combines them with the user language knowledge already obtained, that is, the ontology dictionary and the semantic structures of attributes, to construct preliminary semantic patterns. A semantic pattern is a combination of a predicate (optional), attribute semantic structures, and ontology dictionary entries, where the semantic structure of an attribute is itself a composite structure. Semantic patterns are the cornerstone for generating the basic corpus and are also used by the analysis engine for semantic analysis. For example, a semantic pattern may take the form "the @person whose birthday is @date" or "book an @air ticket from @city to @city priced within @quantity", where "@" marks an ontology dictionary entry.
The basic corpus generation unit 26 is used to analyze the semantic patterns on the basis of user language knowledge and common language knowledge and to replace each semantic role in a semantic pattern with concrete words to generate the basic corpus. This corpus consists of basic expressions in the user's application domain. In the training use-case generation module, the basic corpus serves as the input from which training case data is obtained.
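Replacing each semantic role in a pattern with concrete words can be sketched as slot filling over the Cartesian product of candidate words. In this minimal sketch the "@role" slot syntax is restricted to single-word role names, and the lexicon contents and function name are hypothetical; the patent does not state how candidate words are selected or combined.

```python
import itertools
import re
from typing import Dict, List

SLOT = re.compile(r"@(\w+)")  # a slot is "@" followed by a single-word role name


def generate_basic_corpus(pattern: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Fill every slot of a semantic pattern with each candidate word."""
    roles = SLOT.findall(pattern)
    candidates = [lexicon[role] for role in roles]
    corpus = []
    for combination in itertools.product(*candidates):
        words = iter(combination)
        # Slots are replaced left to right, one word per slot.
        corpus.append(SLOT.sub(lambda match: next(words), pattern))
    return corpus


lexicon = {"person": ["singer", "student"], "date": ["October 1990", "May 2001"]}
corpus = generate_basic_corpus("the @person whose birthday is @date", lexicon)
```

The number of generated sentences is the product of the candidate counts per slot, so a practical system would likely cap or sample the combinations; that policy is left out here.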
The data storage unit 27 is used to store the mapping between the user language knowledge and the business model, the user language knowledge, the semantic structures, the semantic patterns, and the basic corpus in the database server. More specifically, the data storage unit stores the results of the analysis above in the database server: the mapping between the ontology dictionary and the business model, the ontology dictionary and its structure, the attribute semantic structures, the expanded attribute semantic structures, the semantic events, the semantic patterns, and the basic corpus. The database server is a MySQL server.
On the basis of the embodiment corresponding to Fig. 1, as shown in Fig. 4, in another embodiment of the semantic training system based on a business model provided by the present invention, the training use-case generation module 30 includes a corpus acquisition unit 31, an equivalent expansion and merging unit 32, an automatic annotation unit 33, and a data storage unit 34. In a specific implementation, the training use-case generation module can be triggered by a click.
The corpus acquisition unit 31 is used to obtain the basic corpus generated by the analysis module 20 and to obtain an expanded corpus in a preset manner. The preset manner is, for example, importing a user corpus or crawling open corpora; an expanded corpus gives the training engine more knowledge to train on. That is, the corpus acquisition unit 31 can read the basic corpus generated by the analysis module, the user can import an existing corpus into storage, and the user can also choose to expand the corpus from open corpora. This corpus becomes the raw material for training cases.
The equivalent expansion and merging unit 32 is used to perform equivalent expansion and merging on the common language knowledge and the corpus obtained by the corpus acquisition unit, obtaining the standard form of the corpus and its associated equivalent expressions. This narrows the range of corpus that needs annotation while expanding the equivalent expressions of the corpus and associating them with the standard form. Equivalent expansion and merging works on the basis of the common language knowledge base: it expands the corpus and merges equivalent corpus entries, producing standard-form corpus entries and their associated equivalent entries, which reduces the workload and improves the efficiency of semantic training.
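The merging half of this step can be illustrated by normalizing each sentence to a standard form and grouping the variants under it. The sketch assumes that the common language knowledge is available as a word-level synonym table mapping each variant to a canonical word; that representation, the sample data, and the function name are hypothetical simplifications.

```python
from typing import Dict, List


def merge_equivalents(corpus: List[str],
                      synonyms: Dict[str, str]) -> Dict[str, List[str]]:
    """Group sentences by standard form; values are the equivalent expressions."""
    groups: Dict[str, List[str]] = {}
    for sentence in corpus:
        # Replace each word by its canonical synonym to get the standard form.
        standard = " ".join(synonyms.get(word, word) for word in sentence.split())
        variants = groups.setdefault(standard, [])
        if sentence != standard and sentence not in variants:
            variants.append(sentence)
    return groups


synonyms = {"query": "check", "find": "check", "shares": "stock"}
corpus = ["check stock", "query shares", "find stock", "check stock"]
merged = merge_equivalents(corpus, synonyms)
```

Only the standard forms (the dictionary keys) then need annotation, which is how merging reduces the annotation workload described above.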
The automatic annotation unit 33 is used to automatically annotate the standard form of the corpus produced by the equivalent expansion and merging unit, obtaining training case data. Automatic annotation works by analyzing the corpus and annotating it with the ontology dictionary, attribute semantic structures, and semantic patterns obtained by the analysis module; the annotated result is the training case. The annotation results can also be returned to the user for confirmation and modified if necessary; after the user confirms them, the training case data is saved in the database.
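Dictionary-driven annotation of this kind can be sketched as greedy longest-match labeling over tokens. This is a simplified stand-in, not the patent's annotator: the lexicon maps phrases to hypothetical role labels, unmatched tokens receive the placeholder label "O", and whitespace tokenization is assumed.

```python
from typing import Dict, List, Tuple


def annotate(sentence: str, lexicon: Dict[str, str],
             max_len: int = 3) -> List[Tuple[str, str]]:
    """Label phrases found in the lexicon, preferring the longest match."""
    tokens = sentence.split()
    labeled: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                labeled.append((phrase, lexicon[phrase]))
                i += n
                break
        else:
            # No lexicon entry starts here; mark the token as unlabeled.
            labeled.append((tokens[i], "O"))
            i += 1
    return labeled


lexicon = {"check": "predicate", "stock": "object",
           "shareholder": "attribute_word", "Ma Yun": "person"}
case = annotate("check the stock whose shareholder is Ma Yun", lexicon)
```

The labeled pairs are the kind of result a user could review and correct before the training case is stored.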
The data storage unit 34 is used to persist the training case data in the database; the database server may use a MySQL service. The data storage unit saves the generated corpus, its equivalent corpus, and the training case data in the MySQL server. The more corpus the training engine obtains, the more semantic knowledge and semantic rules it can provide for analysis, giving the analysis engine higher recall and precision.
On the basis of the embodiment corresponding to Fig. 1, as shown in Fig. 5, in another embodiment of the semantic training system based on a business model provided by the present invention, the training execution module 40 includes an SPD generation unit 41, a TCD generation unit 42, a basic semantic mapping unit 43, a complex semantic mapping unit 44 and a data storage unit 45; the training execution module 40 can be triggered by a click. Wherein:
The SPD generation unit further refines and supplements the semantic pattern obtained by the analysis module and associates the Context event with the semantic pattern, generating an SPD file;
The TCD generation unit organizes the training use-cases obtained by the training use-case generation module 30 together with their equivalent expressions, associates them with the semantic pattern, and generates a TCD file;
The basic semantic mapping unit is used to analyze the simple sentences in the training case data, map them to basic semantic patterns, and generate a simple mapping relations file;
The complex semantic mapping unit is used to analyze the complex sentences in the training case data, map them to sets of basic semantic patterns, and generate a complex mapping relations file;
The data storage unit uploads the SPD file, TCD file, simple mapping relations file and complex mapping relations file to the file server 60.
As shown in Fig. 6, an embodiment of the present invention provides a semantic training method based on a business model, comprising:
101, start.
102, receive the entry operation of a business model and obtain business model data, the business model data including objects, attributes or methods;
103, analyze and process the objects, attributes or methods to obtain user language knowledge, semantic structures or Context events, construct semantic patterns accordingly, and generate a basic corpus.
The corpus refers to typical expressions of the user's application field. In a specific implementation, the business model data can be combined with common language knowledge data to analyze and process the objects, attributes and methods of the business model, obtaining user language knowledge such as the ontology dictionary together with the mapping relations between the business model and the user language knowledge; finally, the analysis module stores these data in the database.
104, generate training case data according to the basic corpus.
In a specific implementation, sufficient corpus can be obtained by means such as user import, crawling of open corpora, and generation based on attribute patterns; the corpus then undergoes equivalence expansion and merging and automatic annotation to generate training data. The user can manually adjust the training use-cases, and the training case data is stored in the database after the user confirms it. A training use-case is a corpus entry that has been annotated.
105, generate an SPD file, a TCD file and mapping relations files according to the semantic patterns and the training case data.
More specifically, semantic training is executed: based on the user language knowledge and the training case data, the training engine performs training and generates the SPD file and the TCD file, and at the same time generates the basic semantic and complex semantic mapping rule files. The analysis engine loads this language knowledge and these rule files to provide the semantic analysis function for the user's application field. The training engine is used to train corpus into language knowledge. The analysis engine is used to analyze user expressions, generate semantic representations in combination with the language knowledge, and convert them into a form the user can use.
106, terminate.
On the basis of the embodiment corresponding to Fig. 6, an embodiment of the present invention provides a semantic training method based on a business model in which, as shown in Fig. 7, step 102 specifically includes:
301, start.
302, receive the entry operation of an object in the business model;
The business model 71 includes objects, attributes and methods. An object includes abstract words, hyponyms and instance words. An abstract word refers to a class of things, such as "song"; a hyponym is a further classification of an abstract word whose extension is narrower than that of the abstract word, such as "pop song"; an instance word is a specific instance of an abstract word or hyponym, such as "10 years". In the business model "look up stocks", "stock" is the object. "Stock" is an abstract word; it can contain hyponyms, such as "good performance stocks", and can also contain instance words, such as "Shanghai manization".
303, receive the entry operation of an attribute of the object.
An attribute includes an attribute word, an attribute value and an attribute type. The entry of an attribute depends on an object, so the object it depends on must be selected when the attribute is entered. For example, in the business model "stocks whose shareholder is Ma Yun", "stock" is the object, as stated above, and "shareholder is Ma Yun" is the attribute. Here "shareholder" is the attribute word and "Ma Yun" is the attribute value; the attribute value type is detected automatically and then confirmed by the user. In this example, the attribute value type is person name or shareholder name.
304, receive the entry operation of a method of the object.
A method includes a method name, input parameters and output parameters; a method can be understood as a functional operation on an object. The entry of a method depends on objects and attributes: when defining a method, in addition to specifying the method name, the input parameters and output parameters must also be specified. Input parameters are selected from objects and attributes, and output parameters are selected from objects; input parameters are optional, while output parameters are mandatory. In general, the method name is user-defined, while the input and output parameters are selected from the objects or attributes of the business model. For example, in the business model "look up stocks whose shareholder is Ma Yun", "look up" is the method. The input parameter definitions of a method are added on the basis of objects or attributes, the output parameter definitions are added on the basis of objects, and the output parameter represents the user's intention. So in the above example, the attribute "shareholder is Ma Yun" serves as the input parameter and the object "stock" serves as the output parameter.
305, judge whether to continue the entry operation; if so, return to step 302; if not, proceed to step 306.
306, store the business model data 72 in the database server; the database server can use a MySQL database.
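The object, attribute and method described in steps 302 to 304 can be pictured as three small record types. The following is a minimal sketch only; the class and field names (ModelObject, Attribute, Method and so on) are illustrative assumptions and are not defined by the patent, which does not specify a storage schema.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ModelObject:
    """An object of the business model (hypothetical structure)."""
    abstract_word: str                                   # a class of things, e.g. "stock"
    hyponyms: List[str] = field(default_factory=list)    # narrower classes
    instances: List[str] = field(default_factory=list)   # concrete instance words


@dataclass
class Attribute:
    """An attribute always depends on an object."""
    owner: ModelObject
    word: str          # attribute word, e.g. "shareholder"
    value: str         # attribute value, e.g. "Ma Yun"
    value_type: str    # detected automatically, then confirmed by the user


@dataclass
class Method:
    """A functional operation on an object."""
    name: str                                            # user-defined method name
    output: ModelObject                                  # mandatory, chosen from objects
    inputs: List[Union[ModelObject, Attribute]] = field(default_factory=list)  # optional


# The example from the text: "look up stocks whose shareholder is Ma Yun"
stock = ModelObject("stock", hyponyms=["good performance stocks"])
shareholder = Attribute(owner=stock, word="shareholder",
                        value="Ma Yun", value_type="person name")
look_up = Method(name="look up", output=stock, inputs=[shareholder])
```

Making the output parameter a required field and the input list optional mirrors the rule that output parameters are mandatory while input parameters are not.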
On the basis of the embodiment corresponding to Fig. 6, as shown in Fig. 8, in another embodiment of the semantic training method based on a business model provided by the present invention, step 103 specifically includes:
401, start.
402, analyze the object.
Terms are extracted from the objects in the business model data 72, and common language knowledge can be used during extraction; this yields user language knowledge in the form of the ontology dictionary and the ontology dictionary structure, and the objects are associated with the user language knowledge.
The ontology dictionary includes ontologies and named entities. For example, in "people whose birthday is October 1990", "birthday is October 1990" is an attribute, "birthday" is the attribute word and "October 1990" is the attribute value. Named entity recognition shows that the value is a date entity, which is the attribute value type, so the attribute semantic structure can be expressed as "birthday is a date".
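The birthday example can be illustrated with a toy value-type detector. This is a sketch under stated assumptions: a real system would use a named entity recognizer, whereas here a single regular expression stands in for it, and the function names and the "@type" notation for the resulting structure are hypothetical.

```python
import re

# Toy stand-in for named entity recognition: recognizes "Month YYYY" as a date.
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
DATE_PATTERN = re.compile(rf"^({MONTHS}) \d{{4}}$")


def detect_value_type(value: str) -> str:
    """Guess the type of an attribute value (hypothetical helper)."""
    if DATE_PATTERN.match(value):
        return "date"
    if value.isdigit():
        return "quantity"
    return "text"


def attribute_structure(attribute_word: str, value: str) -> str:
    """Abstract a concrete attribute into its semantic structure."""
    return f"{attribute_word} is @{detect_value_type(value)}"


structure = attribute_structure("birthday", "October 1990")
print(structure)  # birthday is @date
```

In the workflow described above, the detected type would then be shown to the user for confirmation rather than accepted blindly.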
403, analyze the attribute to obtain the semantic structure of the attribute.
Term extraction, named entity recognition and syntactic structure analysis are performed on the entered attribute content to obtain the ontology dictionary associated with the attribute and the semantic structure of the attribute. The semantic structure of an attribute takes the form of a combination of attribute words, attribute relation words or attribute expressions with ontology dictionary entries. In addition, the expressions of the attribute are expanded on the basis of common language knowledge to obtain more attribute semantic structures.
404, analyze the method to obtain the corresponding Context event.
A Context event is the API-style representation of a method and is the output of the analysis engine. A Context event and a method are the same expression at the application-domain level and the semantic-domain level; when the analysis engine resolves a Context event, the specific output format can be associated with the method.
405, expand the obtained ontology dictionary.
Since the ontology obtained from the business model is limited, it can be expanded on the basis of common language knowledge, and the structure of the ontology dictionary can also be modified and extended, for example by modifying hyponymy relations.
406, construct the semantic pattern.
The semantic pattern is constructed from the predicate semantic relations and phrase semantic relations in the business model, in combination with the user language knowledge and the semantic structure. More specifically, a preliminary semantic pattern is constructed from the predicate semantic relations and phrase semantic relations contained in the business model, combined with the obtained user language knowledge, namely the ontology dictionary and the semantic structure of the attributes. A semantic pattern takes the form of a combination of a predicate (optional), an attribute semantic structure and an ontology dictionary entry, where the attribute semantic structure is a composite structure. The semantic pattern is the foundation for generating the basic corpus and is also used by the analysis engine for semantic analysis. For example, a semantic pattern can take the form "@person whose birthday is @date" or "book an @air-ticket from @city to @city priced within @quantity", where "@" marks an ontology dictionary entry.
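The composition of a semantic pattern from an optional predicate, an ontology dictionary entry and one or more attribute semantic structures can be sketched as simple string assembly. The function name, the "whose ... and ..." joining rule and the English word order are assumptions made for illustration; the patent does not prescribe a concrete pattern syntax beyond the "@" marker.

```python
from typing import List, Optional


def build_semantic_pattern(ontology_role: str,
                           attribute_structures: List[str],
                           predicate: Optional[str] = None) -> str:
    """Combine predicate (optional), ontology entry and attribute structures."""
    parts = []
    if predicate:
        parts.append(predicate)
    parts.append(f"@{ontology_role}")
    if attribute_structures:
        # Attribute structures are composite; join several with "and".
        parts.append("whose " + " and ".join(attribute_structures))
    return " ".join(parts)


print(build_semantic_pattern("person", ["birthday is @date"]))
# @person whose birthday is @date
print(build_semantic_pattern("stock", ["shareholder is @person"], predicate="look up"))
# look up @stock whose shareholder is @person
```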
407, generate the basic corpus.
Based on the user language knowledge and common language knowledge, the semantic pattern is analyzed and each semantic role in the semantic pattern is replaced with a specific word to generate the basic corpus; this corpus is the basic expression of the user's application field. The training use-case generation module takes the basic corpus as input to obtain training case data.
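Replacing each semantic role in a pattern with specific words amounts to expanding the pattern over a lexicon. The sketch below assumes roles are written as "@role" and that the lexicon is a plain dictionary from role name to candidate words; both conventions and the function name are illustrative, not taken from the patent.

```python
import re
from itertools import product
from typing import Dict, List


def generate_basic_corpus(pattern: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Fill every @role slot with each combination of candidate words."""
    roles = re.findall(r"@(\w+)", pattern)        # roles in order of appearance
    choices = [lexicon[role] for role in roles]   # candidate words per slot
    corpus = []
    for combination in product(*choices):
        words = iter(combination)
        # Substitute the slots left to right with the words of this combination.
        corpus.append(re.sub(r"@\w+", lambda _match: next(words), pattern))
    return corpus


lexicon = {"person": ["people", "employees"], "date": ["October 1990"]}
basic_corpus = generate_basic_corpus("@person whose birthday is @date", lexicon)
print(basic_corpus)
# ['people whose birthday is October 1990', 'employees whose birthday is October 1990']
```

The size of the generated corpus is the product of the candidate counts per slot, so in practice the lexicon per role would likely be sampled rather than fully enumerated.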
408, store the data.
The mapping relations between the user language knowledge and the business model, the user language knowledge 74, the semantic structures, the semantic patterns and the basic corpus are stored in the database server. More specifically, the data storage unit stores the results of the above analysis process in the database server: the mapping relations between the ontology dictionary and the business model, the ontology dictionary and its structure, the attribute semantic structures, the expanded attribute semantic structures, the Context events, the semantic patterns and the basic corpus. The database server uses a MySQL server; that is, the data storage unit saves the acquired user language knowledge in the MySQL database.
On the basis of the embodiment corresponding to Fig. 6, as shown in Fig. 9, in another embodiment of the semantic training method based on a business model provided by the present invention, step 104 specifically includes:
501, start.
502, obtain the corpus.
In a specific implementation, the generated basic corpus 71 can be read, the user can import and store an existing user corpus 82, and the user can also choose to expand the corpus from an open corpus 83; these corpora are the raw material for the training use-cases.
503, expand the corpus by a predetermined manner;
The predetermined manner is, for example, importing a user corpus or crawling an open corpus; expanding the corpus gives the training engine more trainable knowledge.
504, perform equivalence expansion and merging.
The obtained corpus undergoes equivalent expansion and merging to obtain the standard form of the corpus and its associated equivalent expressions. This reduces the range of corpus that needs to be annotated while expanding the equivalent expressions of the corpus and associating them with one another. Equivalence expansion and merging works on the basis of the common language knowledge base: it expands the corpus and merges equivalent corpus entries, generating standard-form corpus and associated equivalent corpus, which reduces workload and improves semantic training efficiency.
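The merging half of this step can be sketched as normalizing each sentence to a standard form and grouping the sentences that share it. The synonym table below is a hypothetical stand-in for the common language knowledge base, and word-by-word substitution is a deliberate simplification of whatever equivalence rules a real system would apply.

```python
from typing import Dict, List

# Hypothetical fragment of common language knowledge: variant -> standard word.
SYNONYMS = {"query": "look up", "find": "look up", "shares": "stock"}


def standard_form(sentence: str) -> str:
    """Rewrite a sentence into its standard form, word by word."""
    return " ".join(SYNONYMS.get(word, word) for word in sentence.lower().split())


def merge_equivalents(corpus: List[str]) -> Dict[str, List[str]]:
    """Map each standard-form sentence to its associated equivalent expressions."""
    groups: Dict[str, List[str]] = {}
    for sentence in corpus:
        standard = standard_form(sentence)
        equivalents = groups.setdefault(standard, [])
        if sentence != standard and sentence not in equivalents:
            equivalents.append(sentence)
    return groups


corpus = ["look up stock", "find stock", "query shares", "look up bonds"]
merged = merge_equivalents(corpus)
print(merged)
# {'look up stock': ['find stock', 'query shares'], 'look up bonds': []}
```

Here four sentences collapse into two standard forms, so only two entries need annotation, which is the workload reduction the text describes.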
505, perform automatic annotation.
Once the user language knowledge has been obtained, the standard form of the corpus is annotated automatically to obtain training case data. Automatic annotation works by analyzing the corpus and annotating it in combination with the ontology dictionary, attribute semantic structure and semantic pattern; the annotated result is the training use-case. In addition, the training case data is saved to the database after the user confirms it.
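Dictionary-driven annotation can be sketched as a longest-match lookup of ontology dictionary entries in a sentence. This is only an illustration: the role labels, the flat dictionary and the substring matching are assumptions, and a real annotator would also use the attribute semantic structures and semantic patterns mentioned above.

```python
from typing import Dict, List, Tuple


def auto_label(sentence: str, ontology: Dict[str, str]) -> List[Tuple[str, str]]:
    """Tag dictionary terms found in the sentence, preferring longer terms."""
    taken = [False] * len(sentence)   # characters already covered by a label
    found = []
    for term in sorted(ontology, key=len, reverse=True):
        start = sentence.find(term)
        if start == -1:
            continue
        end = start + len(term)
        if any(taken[start:end]):     # skip terms overlapping an earlier match
            continue
        for i in range(start, end):
            taken[i] = True
        found.append((start, term, ontology[term]))
    # Return labels in sentence order.
    return [(term, role) for _start, term, role in sorted(found)]


ontology = {
    "stock": "object",
    "shareholder": "attribute word",
    "Ma Yun": "person name",
    "look up": "method",
}
labels = auto_label("look up the stock whose shareholder is Ma Yun", ontology)
print(labels)
# [('look up', 'method'), ('stock', 'object'), ('shareholder', 'attribute word'), ('Ma Yun', 'person name')]
```

The resulting label list is what would be presented to the user for confirmation or correction before being saved as a training use-case.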
506, adjust the training use-cases.
The annotation result can be returned to the user for confirmation, and adjustments such as modifying the annotation result can be made if necessary.
507, store the data.
The training case data is persisted by saving it to the database; the database server may use a MySQL service. More specifically, the generated corpus, its equivalent corpus and the training case data are stored in the database, that is, in the MySQL server. The more corpus the training engine obtains, the more semantic knowledge and semantic rules it can provide for analysis, so that the analysis engine achieves higher recall and accuracy.
On the basis of the embodiment corresponding to Fig. 6, as shown in Fig. 10, in another embodiment of the semantic training method based on a business model provided by the present invention, step 105 specifically includes:
601, start.
602, obtain the data.
The data mainly include the user language knowledge 74 and the training case data 75.
603, generate the SPD file.
The semantic pattern is further refined and supplemented, and the Context event is associated with the semantic pattern to generate an SPD file;
604, generate the TCD file.
The training use-cases and their equivalent expressions are organized and associated with the semantic pattern to generate a TCD file;
605, perform basic semantic mapping.
The simple sentences in the training case data are analyzed and mapped to basic semantic patterns to generate a simple mapping relations file;
606, perform complex semantic mapping.
The complex sentences in the training case data are analyzed and mapped to sets of basic semantic patterns to generate a complex mapping relations file;
607, store the data.
The SPD file, TCD file, simple mapping relations file and complex mapping relations file are uploaded to the file server as the training result files 76.
608, terminate.
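The difference between basic and complex semantic mapping in steps 605 and 606 can be sketched as follows: a simple sentence maps to one basic semantic pattern, while a complex sentence maps to a sequence of them. Everything concrete here is an assumption for illustration, including the pattern names, the regular expressions and the " and then " connective used to split clauses; the patent does not disclose the format of the mapping relations files.

```python
import re
from typing import Dict, List, Optional, Tuple

Mapping = Tuple[str, Dict[str, str]]

# Hypothetical basic semantic patterns, expressed as regular expressions.
BASIC_PATTERNS = {
    "query_by_shareholder": re.compile(
        r"^look up stocks whose shareholder is (?P<person>.+)$"),
    "query_by_price": re.compile(
        r"^look up stocks priced under (?P<quantity>\d+)$"),
}


def map_simple(sentence: str) -> Optional[Mapping]:
    """Map a simple sentence to one basic semantic pattern, if any matches."""
    for name, pattern in BASIC_PATTERNS.items():
        match = pattern.match(sentence)
        if match:
            return name, match.groupdict()
    return None


def map_complex(sentence: str) -> List[Optional[Mapping]]:
    """Split a complex sentence into clauses and map each one separately."""
    clauses = [clause.strip() for clause in sentence.split(" and then ")]
    return [map_simple(clause) for clause in clauses]


result = map_complex("look up stocks whose shareholder is Ma Yun "
                     "and then look up stocks priced under 50")
print(result)
# [('query_by_shareholder', {'person': 'Ma Yun'}), ('query_by_price', {'quantity': '50'})]
```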
With the semantic training system and method based on a business model provided by the present invention, semantic training is unavoidable when a user needs to adopt semantic analysis technology. The recording module guides the user through the entry of the business model of the user's application field, so that the user is not left without a starting point. The analysis module obtains the content of the business model entered by the user, namely the user language knowledge, and generates the basic corpus, so that the user can collect corpus from an existing basis; this guides the user through a systematic method for defining the training use-cases. The training use-case generation module automatically expands the user's corpus, then performs equivalence merging and finally automatic annotation, which greatly reduces the user's workload and makes management and maintenance easier. The training execution module completes the generation of the rule files and implements basic semantic mapping and complex semantic mapping, allowing the analysis engine to better understand user expressions.
With the semantic training system and method based on a business model provided by the present invention, the recording module only requires the user to enter and modify the business model, so application developers can easily learn how to operate semantic training, and users can manage and maintain semantic training more conveniently. The analysis of objects, attributes and methods in the analysis module hides from the user the process of generating grammatical and semantic knowledge from domain knowledge, so the user needs only domain knowledge and no additional grammatical or semantic knowledge, which lowers the threshold for application developers to access semantic analysis capability. The training use-case generation module generates training case data automatically, which significantly reduces the user's workload. The training execution module can produce a method based on complex nested sentences that generates a flow of basic semantic mappings for understanding the user's complex sentences, that is, sentences containing multiple basic semantic mappings, which improves the semantic analysis capability of the analysis engine.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The above are only preferred embodiments of the present invention and do not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of protection of the present invention.

Claims (10)

1. A semantic training system based on a business model, characterized by comprising:
a recording module for receiving the entry operation of a business model and obtaining business model data, the business model data including objects, attributes or methods;
an analysis module, connected to the recording module, for analyzing and processing the objects, attributes or methods to obtain user language knowledge, semantic structures or Context events, constructing semantic patterns accordingly, and generating a basic corpus;
a training use-case generation module, connected to the analysis module, for generating training case data according to the basic corpus;
a training execution module, connected to the analysis module and the training use-case generation module, for generating an SPD file, a TCD file and mapping relations files according to the semantic patterns and the training case data.
2. The semantic training system based on a business model according to claim 1, characterized in that the recording module includes an object entry unit, an attribute entry unit, a method entry unit and a data storage unit; wherein:
the object entry unit is used to receive the entry operation of an object in the business model; the object includes abstract words, hyponyms and instance words;
the attribute entry unit is used to receive the entry operation of an attribute of the object; the attribute includes an attribute word, an attribute value and an attribute type;
the method entry unit is used to receive the entry operation of a method of the object; the method includes a method name, input parameters and output parameters;
the data storage unit is used to store the business model in the database.
3. The semantic training system based on a business model according to claim 1, characterized in that the analysis module includes an object analysis unit, an attribute analysis unit, a method analysis unit, an ontology expansion unit, a pattern construction unit, a basic corpus generation unit and a data storage unit; wherein:
the object analysis unit is used to extract terms from the object, obtain user language knowledge, and associate the object with the user language knowledge;
the attribute analysis unit is used to analyze the attribute and obtain the semantic structure of the attribute;
the method analysis unit is used to analyze the method and obtain the corresponding Context event;
the ontology expansion unit is used to expand the user language knowledge and adjust the structure of the user language knowledge;
the pattern construction unit is used to construct the semantic pattern from the predicate semantic relations and phrase semantic relations in the business model, in combination with the user language knowledge and the semantic structure;
the basic corpus generation unit is used to analyze the semantic pattern and replace each semantic role in the semantic pattern with a specific word to generate the basic corpus;
the data storage unit is used to store the mapping relations between the user language knowledge and the business model, the user language knowledge, the semantic structure, the semantic pattern and the basic corpus in the database server.
4. The semantic training system based on a business model according to claim 1, characterized in that the training use-case generation module includes a corpus acquiring unit, an equivalence expansion and merging unit, an automatic annotation unit and a data storage unit; wherein:
the corpus acquiring unit is used to obtain the basic corpus generated by the analysis module and to obtain an expanded corpus by a predetermined manner;
the equivalence expansion and merging unit is used to perform equivalent expansion and merging on the corpus obtained by the corpus acquiring unit in combination with common language knowledge, obtaining the standard form of the corpus and its associated equivalent expressions;
the automatic annotation unit is used to automatically annotate the standard form of the corpus generated by the equivalence expansion and merging unit, obtaining training case data;
the data storage unit is used to save the training case data to the database.
5. The semantic training system based on a business model according to claim 1, characterized in that the training execution module includes an SPD generation unit, a TCD generation unit, a basic semantic mapping unit, a complex semantic mapping unit and a data storage unit; wherein:
the SPD generation unit is used to associate the Context event with the semantic pattern and generate an SPD file;
the TCD generation unit is used to associate the training case data with the semantic pattern and generate a TCD file;
the basic semantic mapping unit is used to analyze the simple sentences in the training case data, map them to basic semantic patterns, and generate a simple mapping relations file;
the complex semantic mapping unit is used to analyze the complex sentences in the training case data, map them to sets of basic semantic patterns, and generate a complex mapping relations file;
the data storage unit uploads the SPD file, TCD file, simple mapping relations file and complex mapping relations file to the file server.
6. A semantic training method based on a business model, characterized by comprising:
receiving the entry operation of a business model and obtaining business model data, the business model data including objects, attributes or methods;
analyzing and processing the objects, attributes or methods to obtain user language knowledge, semantic structures or Context events, constructing semantic patterns accordingly, and generating a basic corpus;
generating training case data according to the basic corpus;
generating an SPD file, a TCD file and mapping relations files according to the semantic patterns and the training case data.
7. The semantic training method based on a business model according to claim 6, characterized in that receiving the entry operation of a business model comprises:
receiving the entry operation of an object in the business model; the object includes abstract words, hyponyms and instance words;
receiving the entry operation of an attribute of the object; the attribute includes an attribute word, an attribute value and an attribute type;
receiving the entry operation of a method of the object; the method includes a method name, input parameters and output parameters;
storing the business model in the database.
8. The semantic training method based on a business model according to claim 6, characterized in that analyzing and processing the objects, attributes or methods to obtain user language knowledge, semantic structures or Context events, constructing semantic patterns accordingly, and generating a basic corpus comprises:
extracting terms from the object, obtaining user language knowledge, and associating the object with the user language knowledge;
expanding the user language knowledge and adjusting the structure of the user language knowledge;
analyzing the attribute to obtain the semantic structure of the attribute;
analyzing the method to obtain the corresponding Context event;
constructing the semantic pattern from the predicate semantic relations and phrase semantic relations in the business model, in combination with the user language knowledge and the semantic structure;
analyzing the semantic pattern and replacing each semantic role in the semantic pattern with a specific word to generate the basic corpus;
storing the mapping relations between the user language knowledge and the business model, the user language knowledge, the semantic structure, the semantic pattern and the basic corpus in the database server.
9. The semantic training method based on a business model according to claim 1, characterized in that generating training case data according to the basic corpus comprises:
obtaining the basic corpus generated by the analysis module, and obtaining an expanded corpus by a predetermined manner;
performing equivalent expansion and merging on the corpus obtained by the corpus acquiring unit in combination with common language knowledge, obtaining the standard form of the corpus and its associated equivalent expressions;
automatically annotating the standard form of the corpus to obtain training case data;
saving the training case data to the database.
10. The semantic training method based on a business model according to claim 1, characterized in that generating an SPD file, a TCD file and mapping relations files according to the semantic patterns and the training case data comprises:
associating the Context event with the semantic pattern to generate an SPD file;
associating the training case data with the semantic pattern to generate a TCD file;
analyzing the simple sentences in the training case data and mapping them to basic semantic patterns to generate a simple mapping relations file;
analyzing the complex sentences in the training case data and mapping them to sets of basic semantic patterns to generate a complex mapping relations file;
uploading the SPD file, TCD file, simple mapping relations file and complex mapping relations file to the file server.
CN201711011579.1A 2017-10-26 2017-10-26 Semantic training system and its method based on business model Pending CN109710914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711011579.1A CN109710914A (en) 2017-10-26 2017-10-26 Semantic training system and its method based on business model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711011579.1A CN109710914A (en) 2017-10-26 2017-10-26 Semantic training system and its method based on business model

Publications (1)

Publication Number Publication Date
CN109710914A true CN109710914A (en) 2019-05-03

Family

ID=66253275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711011579.1A Pending CN109710914A (en) 2017-10-26 2017-10-26 Semantic training system and its method based on business model

Country Status (1)

Country Link
CN (1) CN109710914A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101446942A (en) * 2008-12-10 2009-06-03 苏州大学 Semantic character labeling method of natural language sentence
US20140181154A1 (en) * 2012-12-26 2014-06-26 James Michael Amulu Generating information models in an in-memory database system
CN106777275A (en) * 2016-12-29 2017-05-31 北京理工大学 Entity attribute and property value extracting method based on many granularity semantic chunks
CN107038229A (en) * 2017-04-07 2017-08-11 云南大学 A kind of use-case extracting method based on natural semantic analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
饶竹一 等: "基于知识图谱的智能客服系统研究", 《电力信息通信》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750540A (en) * 2019-10-18 2020-02-04 中国人民解放军军事科学院军事医学研究院 Method for constructing medical service knowledge base, method and system for obtaining medical service semantic model and medium
CN116302294A (en) * 2023-05-18 2023-06-23 安元科技股份有限公司 Method and system for automatically identifying component attribute through interface
CN116302294B (en) * 2023-05-18 2023-09-01 安元科技股份有限公司 Method and system for automatically identifying component attribute through interface

Similar Documents

Publication Publication Date Title
Ferrández et al. The QALL-ME Framework: A specifiable-domain multilingual Question Answering architecture
RU2509350C2 (en) Method for semantic processing of natural language using graphic intermediary language
WO2009135051A2 (en) Systems and methods for natural language communication with a computer
Zou et al. Understanding: how to resolve ambiguity
Jaafar et al. A survey and comparative study of Arabic NLP architectures
CN109710914A (en) Semantic training system and its method based on business model
Bender Language CoLLAGE: Grammatical Description with the LinGO Grammar Matrix.
Franconi et al. Quelo natural language interface: Generating queries and answer descriptions
Duan Overview of the NLPCC 2019 shared task: open domain semantic parsing
Damljanovic Natural language interfaces to conceptual models
Zhang et al. A novel slot-gated model combined with a key verb context feature for task request understanding by service robots
Song et al. Large pretrained models on multimodal sentiment analysis
Hromei et al. Embedding contextual information in seq2seq models for grounded semantic role labeling
Frank et al. Building literary corpora for computational literary analysis-a prototype to bridge the gap between CL and DH
Xu et al. Nanjing Yunjin intelligent question-answering system based on knowledge graphs and retrieval augmented generation technology
Shahzadi et al. UMagic! THE UML modeler for text documents
Radovanovic Neural Machine Translation from Natural Language into SQL with state-of-the-art Deep Learning methods
Yin Fuzzy information recognition and translation processing in English interpretation based on a generalized maximum likelihood ratio algorithm
Isard The methodius corpus of rhetorical discourse structures and generated texts
Manzella et al. Semantic Search Engine for Data Management and Sustainable Development: Marine Planning Service Platform
Shao et al. Automatic Question Generation for Language Learning Task Based on the Grid-Based Language Structure Parsing Framework
Dannélls Discourse generation from formal specifications using the Grammatical Framework, GF
Cheng et al. The semantic tagging model of chinese question sentence chunk based on description logics
Sonnadara et al. A Natural Language Understanding Sequential Model for Generating Queries with Multiple SQL Commands
Wang et al. Extraction and Application of Verb Event Structure Based on Grammatical Knowledge-Base of Contemporary Chinese (GKB)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 20231017)