CN109657247A - Method and device for implementing a custom grammar for machine learning
- Publication number: CN109657247A
- Application number: CN201811566818.4A
- Authority: CN (China)
- Prior art keywords: grammar, execution plan, custom, machine learning, analysis
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The present invention provides a method and device for implementing a custom grammar for machine learning. The method comprises: performing lexical analysis and syntax analysis on the custom grammar and converting it into an abstract syntax tree; performing semantic analysis based on the abstract syntax tree to construct a logical execution plan for the grammar; constructing a distributed physical execution plan based on the logical execution plan and with reference to the distribution of the data; and, based on the distributed physical execution plan, invoking the relevant machine learning library through a reflection mechanism and training and testing the model by distributed in-memory computing. The invention can lower the barrier to using machine learning and reduce coding effort and the user's development cost.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a method and device for implementing a custom grammar for machine learning.
Background art
Machine learning is a branch of artificial intelligence. Over the past 30-plus years it has developed into a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, computational complexity theory, and other subjects. A machine learning algorithm automatically analyzes data to obtain patterns and uses those patterns to predict unknown data. Machine learning is widely used in data mining, computer vision, natural language processing, biometric recognition, search engines, medical diagnosis, credit card fraud detection, and other fields.
Implementing common machine learning algorithms requires learning a specific programming language and a specific compiler and writing complex code. This places high demands on researchers' coding ability and requires considerable time to learn the relevant computer knowledge.
Summary of the invention
The method and device for implementing a custom grammar for machine learning provided by the present invention can lower the barrier to using machine learning and reduce coding effort and the user's development cost.
In a first aspect, the present invention provides a method for implementing a custom grammar for machine learning, comprising:
performing lexical analysis and syntax analysis on the custom grammar, and converting it into an abstract syntax tree;
performing semantic analysis based on the abstract syntax tree, and constructing a logical execution plan for the grammar;
constructing a distributed physical execution plan based on the logical execution plan and with reference to the distribution of the data;
based on the distributed physical execution plan, invoking the relevant machine learning library through a reflection mechanism, and training and testing the model by distributed in-memory computing.
Optionally, performing semantic analysis based on the abstract syntax tree and constructing the logical execution plan for the grammar comprises: analyzing the abstract syntax tree and, according to custom reflection rules, using the reflection facility of the Java Virtual Machine to construct the logical execution plan for the grammar.
Optionally, the lexical analysis is: converting a character sequence into a token sequence.
Optionally, the syntax analysis is: analyzing an input text composed of a word sequence according to a given formal grammar and determining its syntactic structure.
In a second aspect, the present invention provides a device for implementing a custom grammar for machine learning, comprising:
a converting unit, configured to perform lexical analysis and syntax analysis on the custom grammar and convert it into an abstract syntax tree;
a first construction unit, configured to perform semantic analysis based on the abstract syntax tree and construct a logical execution plan for the grammar;
a second construction unit, configured to construct a distributed physical execution plan based on the logical execution plan and with reference to the distribution of the data;
a computing unit, configured to invoke, based on the distributed physical execution plan, the relevant machine learning library through a reflection mechanism, and to train and test the model by distributed in-memory computing.
Optionally, the first construction unit is configured to analyze the abstract syntax tree and, according to custom reflection rules, use the reflection facility of the Java Virtual Machine to construct the logical execution plan for the grammar.
Optionally, the lexical analysis is: converting a character sequence into a token sequence.
Optionally, the syntax analysis is: analyzing an input text composed of a word sequence according to a given formal grammar and determining its syntactic structure.
The method and device for implementing a custom grammar for machine learning provided by the embodiments of the present invention define a new grammar that covers commonly used machine learning algorithms. By entering only a few statements, the user can build, train, and analyze the results of most machine learning algorithms. This lowers the barrier to using machine learning and reduces coding effort as well as researchers' learning and development costs.
Brief description of the drawings
Fig. 1 is a flowchart of the method for implementing a custom grammar for machine learning provided by an embodiment of the present invention;
Fig. 2 is an execution block diagram of the method for implementing a custom grammar for machine learning provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the device for implementing a custom grammar for machine learning provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a method for implementing a custom grammar for machine learning. As shown in Fig. 1, the method comprises:
S11: performing lexical analysis and syntax analysis on the custom grammar, and converting it into an abstract syntax tree.
S12: performing semantic analysis based on the abstract syntax tree, and constructing a logical execution plan for the grammar.
S13: constructing a distributed physical execution plan based on the logical execution plan and with reference to the distribution of the data.
S14: based on the distributed physical execution plan, invoking the relevant machine learning library through a reflection mechanism, and training and testing the model by distributed in-memory computing.
The reflection mechanism may be, but is not limited to, the Java reflection mechanism.
The machine learning library may be, but is not limited to, a Spark machine learning library.
The method for implementing a custom grammar for machine learning provided by the embodiment of the present invention defines a new grammar that covers commonly used machine learning algorithms. By entering only a few statements, the user can build, train, and analyze the results of most machine learning algorithms. This lowers the barrier to using machine learning and reduces coding effort as well as researchers' learning and development costs.
The method for implementing a custom grammar for machine learning according to the embodiment of the present invention is described in detail below.
As shown in Fig. 2, this scheme performs lexical analysis and syntax analysis on the custom grammar and converts it into an abstract syntax tree. It then performs semantic analysis based on the abstract syntax tree to construct the logical plan for the grammar, constructs a distributed physical execution plan with reference to the distribution of the data, invokes the Spark machine learning library through Java reflection, and trains and tests the model by distributed in-memory computing.
Lexical analysis, in computer science, is the process of converting a character sequence into a sequence of tokens.
Syntax analysis is the process of analyzing an input text composed of a word sequence (for example, a sequence of English words) according to a given formal grammar and determining its syntactic structure.
The abstract syntax tree is a tree structure commonly used in syntax analysis to store the result of the analysis.
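The two analysis steps and the resulting tree can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statement form `TRAIN <algorithm> ON <dataset> WITH name=value, ...` is hypothetical and is not the grammar of this scheme, and a hand-written regular-expression lexer with a small hand-written parser is used in place of a parser generator.

```python
import re
from dataclasses import dataclass, field

# Hypothetical statement form (an assumption, not taken from the patent):
#   TRAIN <algorithm> ON <dataset> [WITH name=value, ...]
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:TRAIN|ON|WITH)\b"),
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("EQUALS", r"="),
    ("COMMA", r","),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Lexical analysis: turn a character sequence into (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


@dataclass
class TrainNode:
    """Abstract syntax tree node for one TRAIN statement."""
    algorithm: str
    dataset: str
    params: dict = field(default_factory=dict)


def parse(tokens):
    """Syntax analysis: check the tokens against the grammar and build the tree."""
    pos = 0

    def expect(kind, value=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError(f"expected {value or kind}, found end of input")
        tok_kind, tok_value = tokens[pos]
        if tok_kind != kind or (value is not None and tok_value != value):
            raise SyntaxError(f"expected {value or kind}, found {tok_value!r}")
        pos += 1
        return tok_value

    expect("KEYWORD", "TRAIN")
    algorithm = expect("IDENT")
    expect("KEYWORD", "ON")
    dataset = expect("IDENT")
    params = {}
    if pos < len(tokens):  # optional WITH clause; only integer values in this sketch
        expect("KEYWORD", "WITH")
        while True:
            name = expect("IDENT")
            expect("EQUALS")
            params[name] = int(expect("NUMBER"))
            if pos == len(tokens):
                break
            expect("COMMA")
    return TrainNode(algorithm, dataset, params)


ast = parse(tokenize("TRAIN PCA ON iris WITH k=2"))
print(ast)
```

A statement that does not match the grammar, such as `TRAIN ON iris`, is rejected with a `SyntaxError` during syntax analysis rather than producing a partial tree.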
Using the custom grammar is relatively simple. The user first specifies the data set to operate on and then specifies the corresponding machine learning algorithm to train on the data. The original data can also be split into a test set and a training set, so that the model is trained on the training set and its performance is evaluated on the test set.
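The workflow described above (choose a data set, split it, train on the training set, evaluate on the test set) can be illustrated with a minimal plain-Python sketch. The synthetic data, the toy threshold "algorithm", and all function names are assumptions chosen for illustration; they stand in for whatever data set and algorithm the user would name in the custom grammar.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (training set, test set)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train_rows):
    """Toy 'algorithm': place a threshold midway between the two class means."""
    means = []
    for label in (0, 1):
        values = [x for x, y in train_rows if y == label]
        means.append(sum(values) / len(values))
    return (means[0] + means[1]) / 2


def accuracy(threshold, test_rows):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum((x > threshold) == bool(y) for x, y in test_rows)
    return correct / len(test_rows)


# Step 1: the data set to operate on (synthetic, with well-separated classes).
data = [(x, 0) for x in range(10)] + [(x, 1) for x in range(20, 30)]
# Step 2: split the original data into a training set and a test set.
train_rows, test_rows = train_test_split(data)
# Step 3: train on the training set, then evaluate on the test set.
model = fit_threshold(train_rows)
score = accuracy(model, test_rows)
print(f"threshold={model:.2f} accuracy={score:.2f}")
```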
This scheme performs lexical analysis and syntax analysis with ANTLR 4. The custom syntax structure is as follows:
Performing semantic analysis based on the abstract syntax tree and constructing the logical execution plan for the grammar is specifically: analyzing the abstract syntax tree and, according to custom reflection rules, using the reflection facility of the JVM (Java Virtual Machine) to construct the logical execution plan for the grammar. The reflection rules are as follows: a configuration file with the following structure maps a node in the custom syntax tree to a function in the machine learning library (such as Spark MLlib):
-
  func.name: PCA
  func.path: "org.apache.spark.ml.feature.PCA"
  func.args:
    -
      arg.spark.funcName: setInputCol
      arg.ausname: inputCol
      arg.nullable: false
      arg.type: "java.lang.String"
    -
      arg.spark.funcName: setOutputCol
      arg.ausname: outputCol
      arg.nullable: false
      arg.type: "java.lang.String"
    -
      arg.spark.funcName: setK
      arg.ausname: k
      arg.nullable: false
      arg.type: int
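How a rule of this shape could drive reflection is sketched below. The sketch is an assumption-laden illustration, not the scheme's implementation: Python's `importlib` and `getattr` stand in for JVM reflection, and the rule points at the standard-library class `logging.Handler` (chosen only because it ships with Python and has setter-style methods) instead of `org.apache.spark.ml.feature.PCA`, so that the example runs without Spark. The key names are copied from the configuration above; `instantiate` and `TYPES` are hypothetical helpers.

```python
import importlib

# A rule in the same shape as the configuration above. The target class is a
# standard-library stand-in (an assumption for illustration), not a Spark class.
RULE = {
    "func.name": "Handler",
    "func.path": "logging.Handler",
    "func.args": [
        {"arg.spark.funcName": "setLevel", "arg.ausname": "level",
         "arg.nullable": False, "arg.type": "int"},
        {"arg.spark.funcName": "set_name", "arg.ausname": "name",
         "arg.nullable": True, "arg.type": "str"},
    ],
}

# Maps the type names used in the rule to converters for user-supplied text.
TYPES = {"int": int, "str": str}


def instantiate(rule, params):
    """Resolve the class named by func.path and apply each parameter via its setter."""
    module_name, _, class_name = rule["func.path"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    obj = cls()
    for arg in rule["func.args"]:
        name = arg["arg.ausname"]
        if name not in params:
            if not arg["arg.nullable"]:
                raise ValueError(f"missing required parameter {name!r}")
            continue
        value = TYPES[arg["arg.type"]](params[name])
        # Look the setter up by name at run time, then call it.
        getattr(obj, arg["arg.spark.funcName"])(value)
    return obj


handler = instantiate(RULE, {"level": "10", "name": "demo"})
print(handler.level, handler.get_name())
```

Keeping the mapping in data rather than code means that, under this reading, supporting another algorithm would only require another rule entry, not a change to the interpreter.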
Based on the logical execution plan and with reference to the distribution of the data, this scheme constructs a distributed physical execution plan, ensuring that the computation is carried out efficiently in memory.
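The text does not specify how the physical plan is derived from the logical plan. One possible reading, sketched below purely as an assumption, is that each logical operator becomes one task per data partition and that each task is placed on the node that already holds its partition. The names `Task`, `build_physical_plan`, the operator names, and the node names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One unit of work in the physical plan: an operator applied to one partition."""
    stage: int
    operator: str
    partition: int
    node: str


def build_physical_plan(logical_plan, partition_locations):
    """Expand each logical operator into one task per partition, placed with its data."""
    tasks = []
    for stage, operator in enumerate(logical_plan):
        for partition, node in sorted(partition_locations.items()):
            tasks.append(Task(stage, operator, partition, node))
    return tasks


# Hypothetical logical plan and data distribution (partition id -> node).
logical_plan = ["load", "fit_pca"]
partition_locations = {0: "node-a", 1: "node-b", 2: "node-a"}

plan = build_physical_plan(logical_plan, partition_locations)
for task in plan:
    print(task)
```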
Compared with scikit-learn, the advantage of this scheme is clear: it eliminates a large amount of coding and, building on the underlying distributed in-memory computing platform, supports training and testing on massive data. Compared with Spark MLlib, the custom grammar is more concise and does not require the user to manage the distributed computation, which reduces the user's development cost.
An embodiment of the present invention also provides a device for implementing a custom grammar for machine learning. As shown in Fig. 3, the device comprises:
a converting unit 11, configured to perform lexical analysis and syntax analysis on the custom grammar and convert it into an abstract syntax tree;
a first construction unit 12, configured to perform semantic analysis based on the abstract syntax tree and construct a logical execution plan for the grammar;
a second construction unit 13, configured to construct a distributed physical execution plan based on the logical execution plan and with reference to the distribution of the data;
a computing unit 14, configured to invoke, based on the distributed physical execution plan, the relevant machine learning library through a reflection mechanism, and to train and test the model by distributed in-memory computing.
Optionally, the first construction unit 12 is configured to analyze the abstract syntax tree and, according to custom reflection rules, use the reflection facility of the Java Virtual Machine to construct the logical execution plan for the grammar.
Optionally, the lexical analysis is: converting a character sequence into a token sequence.
Optionally, the syntax analysis is: analyzing an input text composed of a word sequence according to a given formal grammar and determining its syntactic structure.
The device for implementing a custom grammar for machine learning provided by the embodiment of the present invention defines a new grammar that covers commonly used machine learning algorithms. By entering only a few statements, the user can build, train, and analyze the results of most machine learning algorithms. This lowers the barrier to using machine learning and reduces coding effort as well as researchers' learning and development costs.
The device of this embodiment can be used to carry out the technical solution of the above method embodiment. Its implementation principle and technical effect are similar and are not repeated here.
A person of ordinary skill in the art will understand that all or part of the processes in the above method embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (8)
1. A method for implementing a custom grammar for machine learning, characterized by comprising:
performing lexical analysis and syntax analysis on the custom grammar, and converting it into an abstract syntax tree;
performing semantic analysis based on the abstract syntax tree, and constructing a logical execution plan for the grammar;
constructing a distributed physical execution plan based on the logical execution plan and with reference to the distribution of the data;
based on the distributed physical execution plan, invoking the relevant machine learning library through a reflection mechanism, and training and testing the model by distributed in-memory computing.
2. The method according to claim 1, characterized in that performing semantic analysis based on the abstract syntax tree and constructing the logical execution plan for the grammar comprises: analyzing the abstract syntax tree and, according to custom reflection rules, using the reflection facility of the Java Virtual Machine to construct the logical execution plan for the grammar.
3. The method according to claim 1 or 2, characterized in that the lexical analysis is: converting a character sequence into a token sequence.
4. The method according to claim 1 or 2, characterized in that the syntax analysis is: analyzing an input text composed of a word sequence according to a given formal grammar and determining its syntactic structure.
5. A device for implementing a custom grammar for machine learning, characterized by comprising:
a converting unit, configured to perform lexical analysis and syntax analysis on the custom grammar and convert it into an abstract syntax tree;
a first construction unit, configured to perform semantic analysis based on the abstract syntax tree and construct a logical execution plan for the grammar;
a second construction unit, configured to construct a distributed physical execution plan based on the logical execution plan and with reference to the distribution of the data;
a computing unit, configured to invoke, based on the distributed physical execution plan, the relevant machine learning library through a reflection mechanism, and to train and test the model by distributed in-memory computing.
6. The device according to claim 5, characterized in that the first construction unit is configured to analyze the abstract syntax tree and, according to custom reflection rules, use the reflection facility of the Java Virtual Machine to construct the logical execution plan for the grammar.
7. The device according to claim 5 or 6, characterized in that the lexical analysis is: converting a character sequence into a token sequence.
8. The device according to claim 5 or 6, characterized in that the syntax analysis is: analyzing an input text composed of a word sequence according to a given formal grammar and determining its syntactic structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811566818.4A CN109657247B (en) | 2018-12-19 | 2018-12-19 | Method and device for realizing self-defined grammar of machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109657247A true CN109657247A (en) | 2019-04-19 |
CN109657247B CN109657247B (en) | 2023-05-23 |
Family
ID=66115308
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811566818.4A Active CN109657247B (en) | 2018-12-19 | 2018-12-19 | Method and device for realizing self-defined grammar of machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109657247B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112001500A (en) * | 2020-08-13 | 2020-11-27 | 星环信息科技(上海)有限公司 | Model training method, device and storage medium based on longitudinal federated learning system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090144229A1 (en) * | 2007-11-30 | 2009-06-04 | Microsoft Corporation | Static query optimization for linq |
US20090328012A1 (en) * | 2008-06-27 | 2009-12-31 | Microsoft Corporation | Compiler in a managed application context |
US20100241828A1 (en) * | 2009-03-18 | 2010-09-23 | Microsoft Corporation | General Distributed Reduction For Data Parallel Computing |
US20120226639A1 (en) * | 2011-03-01 | 2012-09-06 | International Business Machines Corporation | Systems and Methods for Processing Machine Learning Algorithms in a MapReduce Environment |
US20150378696A1 (en) * | 2014-06-27 | 2015-12-31 | International Business Machines Corporation | Hybrid parallelization strategies for machine learning programs on top of mapreduce |
US20170177312A1 (en) * | 2015-12-18 | 2017-06-22 | International Business Machines Corporation | Dynamic recompilation techniques for machine learning programs |
CN106970819A (en) * | 2017-03-28 | 2017-07-21 | 清华大学 | A kind of c program code specification check device based on the regular description languages of PRDL |
Non-Patent Citations (4)
Title |
---|
K R NEERAJ ET AL: "A domain specific language for business transaction processing", 《 2017 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, INFORMATICS, COMMUNICATION AND ENERGY SYSTEMS》 * |
MIKE INNES ET AL: "On Machine Learning and Programming Languages", 《SYSML》 * |
ZOLTAN A. KOCSIS ET AL: "Automatic Improvement of Apache Spark Queries using Semantics-preserving Program Reduction", 《 PROCEEDINGS OF THE 2016 ON GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION》 * |
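梁国蓉 (Liang Guorong): "Design and Implementation of a Dataflow-Based Big Data Query Engine System" (一个基于Dataflow的大数据Query Engine系统的设计与实现), 《China Master's Theses Full-text Database, Information Science and Technology》 * |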
Also Published As
Publication number | Publication date |
---|---|
CN109657247B (en) | 2023-05-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||