Summary of the Invention
The invention provides a natural language processing method and a natural language processing platform. They address a problem in existing natural language processing methods: natural language processing algorithm applications are difficult to use and to extend, which restricts how the methods can be used and makes them poorly applicable.
In a first aspect, the invention provides a natural language processing method. The method includes: encapsulating each basic algorithm related to natural language processing independently, generating mutually independent target components; determining, according to user settings, the needed target components and the running order of the needed target components; loading each needed target component into a preset core framework according to the running order, generating a complete natural language processing algorithm application; and receiving data to be processed, processing the data to be processed through the natural language processing algorithm application, and outputting a processing result.
Further, the natural language processing method also includes: when an instruction from the user to delete or add a needed target component is received, re-determining, according to the instruction, the needed target components and the corresponding running order; and reloading the re-determined target components into the preset core framework according to the corresponding running order, generating a new natural language processing algorithm application.
Further, the natural language processing method also includes: determining, from the natural language processing algorithm applications generated over multiple runs, the target components that have been combined a number of times greater than or equal to a preset number of times, and generating a fixed algorithm combination, so that the corresponding fixed algorithm combination can be loaded directly when a natural language processing algorithm application is subsequently generated.
Further, the process of processing the data to be processed through the natural language processing algorithm application and outputting the processing result includes: inputting the data to be processed into the initial target component for processing; each of the other target components corresponding to the natural language processing algorithm application, following the corresponding running order and according to its own context information, obtains intermediate data of the corresponding type and processes it, until the last target component outputs the processing result. The context information corresponding to a target component indicates the type of intermediate data that the target component is to obtain and the type of the data after processing. The intermediate data to be obtained is data generated by a target component that runs earlier. The initial target component is the target component that runs first among all target components corresponding to the natural language processing algorithm application.
Further, the natural language processing method also includes: when an instruction from the user to create a new target component is received, completing the encapsulation of the new target component according to the preset core framework's definition of a target component and the basic algorithm logic corresponding to the target component.
In a second aspect, the invention also provides a natural language processing platform. The platform includes an algorithm development layer and a business algorithm layer. The algorithm development layer is used to: encapsulate each basic algorithm related to natural language processing independently, generating mutually independent target components; and determine, according to user settings, the needed target components and the running order of the needed target components. The business algorithm layer is used to: load each needed target component into a preset core framework according to the running order, generating a complete natural language processing algorithm application; and receive data to be processed, process the data to be processed through the natural language processing algorithm application, and output a processing result.
Further, the algorithm development layer is also used to: when an instruction from the user to delete or add a needed target component is received, re-determine, according to the instruction, the needed target components and the corresponding running order. The business algorithm layer is also used to: reload the re-determined target components into the preset core framework according to the corresponding running order, generating a new natural language processing algorithm application.
Further, the algorithm development layer is also used to: determine, from the natural language processing algorithm applications generated over multiple runs, the target components that have been combined a number of times greater than or equal to a preset number of times, and generate a fixed algorithm combination, so that the corresponding fixed algorithm combination can be loaded directly when a natural language processing algorithm application is subsequently generated.
Further, where the business algorithm layer processes the data to be processed through the natural language processing algorithm application and outputs the processing result, the business algorithm layer is specifically used to: input the data to be processed into the initial target component for processing; each of the other target components corresponding to the natural language processing algorithm application, following the corresponding running order and according to its own context information, obtains intermediate data of the corresponding type and processes it, until the last target component outputs the processing result. The context information corresponding to a target component indicates the type of intermediate data that the target component is to obtain and the type of the data after processing. The intermediate data to be obtained is data generated by a target component that runs earlier. The initial target component is the target component that runs first among all target components corresponding to the natural language processing algorithm application.
Further, the algorithm development layer is also used to: when an instruction from the user to create a new target component is received, complete the encapsulation of the new target component according to the preset core framework's definition of a target component and the basic algorithm logic corresponding to the target component.
The technical solutions provided by the embodiments of the invention can have the following beneficial effects. The invention provides a natural language processing method and a natural language processing platform. In the method, each basic algorithm related to natural language processing is independently encapsulated as a target component, and the general algorithm content of a natural language processing algorithm application, other than the target components, is set up as the preset core framework. When a complete natural language processing algorithm application is subsequently generated, the needed target components can be chosen freely according to the user's settings and loaded into the preset core framework in their running order to produce the complete application, without spending a large amount of manpower on writing code again. In addition, the preset core framework provides a specific data input port and fixes conventions for each target component's interface, implementation method, execution process, initialization, and naming, so the use and the extension of the algorithms are not restricted and applicability is better. Furthermore, in the method, the needed target components and the corresponding running order can be re-determined according to an add or delete instruction from the user, and the natural language processing algorithm application can be regenerated by reloading. It follows that, in the natural language processing method provided by the embodiments of the invention, target components can be freely combined, added, or deleted according to user settings, which further reduces the difficulty of developing and using natural language processing algorithm applications and improves the applicability of the natural language processing method.
Detailed Description of the Embodiments
Embodiments of the invention are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a natural language processing method provided by an embodiment of the invention. As shown in Fig. 1, the natural language processing method includes:
Step 101: encapsulate each basic algorithm related to natural language processing independently, generating mutually independent target components.
In general, a complete natural language processing algorithm application contains multiple basic algorithms, each with an independent data processing function, for example a word segmentation algorithm, a sentence splitting algorithm, a summary extraction algorithm, a rule matching algorithm, and a stop word filtering algorithm. To make natural language processing algorithm applications easier to develop and use, in the embodiments of the invention the logic code corresponding to each basic algorithm related to natural language processing is stored separately in its own unit module, and the unit module is configured according to the preset core framework's definition of a target component, generating an independent target component. The preset core framework is the framework part of a complete natural language processing algorithm application: it contains everything that a complete natural language processing algorithm application needs apart from the target components, and it applies to the development and use of all natural language processing algorithm applications. The preset core framework's definition of a target component includes conventions for each target component's interface, implementation method, execution process, initialization, and naming.
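As an illustration only, the conventions that the preset core framework imposes on a target component could be expressed as an abstract base class. The names `TargetComponent`, `requires`, `produces`, `process`, and `WhitespaceSegmenter` are assumptions made for this sketch and do not appear in the source; splitting on whitespace merely stands in for a real word segmentation algorithm.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class TargetComponent(ABC):
    """Hypothetical contract a core framework could impose on components."""

    name: str = "unnamed"        # naming convention
    requires: str = "raw_text"   # type of data the component consumes
    produces: str = "raw_text"   # type of data the component emits

    def __init__(self, **config: Any) -> None:
        # Initialization convention: settings arrive as keyword arguments.
        self.config: Dict[str, Any] = dict(config)

    @abstractmethod
    def process(self, data: Any) -> Any:
        """Execution convention: one input value in, one output value out."""


class WhitespaceSegmenter(TargetComponent):
    """Toy word segmentation component (whitespace split is a stand-in)."""

    name = "segmenter"
    requires = "raw_text"
    produces = "tokens"

    def process(self, data: str) -> list:
        return data.split()


segmenter = WhitespaceSegmenter()
tokens = segmenter.process("natural language processing")
```

Because every component exposes the same `process` entry point and declares its input and output types, a framework written against this contract would not need to know which basic algorithm a component wraps.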
Different natural language processing algorithm applications can contain different basic algorithms; that is, the word segmentation, sentence splitting, summary extraction, rule matching, and stop word filtering algorithms contained in different applications can differ. Consequently, among the basic algorithms related to natural language processing there are multiple basic algorithms with the same data processing function: for example, there are multiple word segmentation algorithms, multiple sentence splitting algorithms, multiple summary extraction algorithms, multiple rule matching algorithms, and multiple stop word filtering algorithms. Likewise, among the target components generated by encapsulating basic algorithms, there are multiple target components generated from basic algorithms with the same data processing function; for example, each of the word segmentation, sentence splitting, summary extraction, rule matching, and stop word filtering algorithms yields multiple encapsulated target components.
In addition, when an instruction from the user to create a new target component is received, the encapsulation of the new target component can be completed according to the preset core framework's definition of a target component and the basic algorithm logic corresponding to the target component. In a specific implementation, a vacant unit module can be set up in advance for each basic algorithm related to natural language processing, according to the preset core framework's definition of a target component; a vacant unit module does not yet store any logic code of a basic algorithm. When the instruction to create a new target component is received, the basic algorithm corresponding to the target component that the user wants to create is determined first. The vacant unit module corresponding to that basic algorithm is then copied, and the logic code entered by the user is written into the copy according to the basic algorithm logic of that basic algorithm, generating the new target component. In this way new target components can be developed and added quickly, making the development of natural language processing algorithm applications simpler and more efficient.
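The vacant unit module mechanism can be sketched as follows. This is a minimal illustration under stated assumptions: `UnitModule`, `VACANT_TEMPLATES`, and `create_component` are hypothetical names, and the user's "logic code" is modelled as a plain Python callable.

```python
import copy
from typing import Any, Callable, Optional


class UnitModule:
    """Hypothetical unit module; `logic` stays None while the module is vacant."""

    def __init__(self, algorithm_kind: str, produces: str) -> None:
        self.algorithm_kind = algorithm_kind
        self.produces = produces
        self.logic: Optional[Callable[[Any], Any]] = None

    def process(self, data: Any) -> Any:
        if self.logic is None:
            raise RuntimeError("vacant unit module: no logic has been written")
        return self.logic(data)


# One vacant template per kind of basic algorithm, prepared in advance.
VACANT_TEMPLATES = {
    "sentence_splitting": UnitModule("sentence_splitting", produces="sentences"),
    "stop_word_filtering": UnitModule("stop_word_filtering", produces="tokens"),
}


def create_component(algorithm_kind: str, logic: Callable[[Any], Any]) -> UnitModule:
    """Copy the vacant template for this kind and write the user's logic into it."""
    component = copy.copy(VACANT_TEMPLATES[algorithm_kind])
    component.logic = logic
    return component


splitter = create_component(
    "sentence_splitting",
    lambda text: [s.strip() for s in text.split(".") if s.strip()],
)
sentences = splitter.process("First sentence. Second sentence.")
```

Copying the template rather than filling it in place keeps the vacant module available for the next component of the same kind.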
After target components are generated, they can also be stored, so that the stored target components can be called or copied directly when a complete natural language processing algorithm application is subsequently generated.
Step 102: according to user settings, determine the needed target components and the running order of the needed target components.
In a specific application, the user chooses the needed target components from the generated target components, according to the basic algorithms contained in the natural language processing algorithm application that the user needs and the running order of those basic algorithms. From the user's choices and the order in which they are made, the needed target components and their running order can be determined.
For example, suppose the natural language processing algorithm application needed by the user contains a word segmentation algorithm, a sentence splitting algorithm, a summary extraction algorithm, a rule matching algorithm, and a stop word filtering algorithm, and the user chooses the target component corresponding to each basic algorithm in that order: word segmentation, then sentence splitting, then summary extraction, then rule matching, then stop word filtering. From the user's choices it can then be determined that the needed target components are the target components corresponding to the word segmentation, sentence splitting, summary extraction, rule matching, and stop word filtering algorithms, and that their running order is the target component corresponding to the word segmentation algorithm first, followed in turn by the target components corresponding to the sentence splitting, summary extraction, rule matching, and stop word filtering algorithms.
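The determination in step 102 can be sketched as a lookup in a store of previously encapsulated components, where the order of the user's choices becomes the running order. `COMPONENT_STORE` and `resolve_selection` are hypothetical names, and each stored component is reduced to a small dictionary for brevity.

```python
import copy

# Hypothetical store of previously encapsulated components, keyed by name.
COMPONENT_STORE = {
    "segmentation": {"kind": "segmentation"},
    "sentence_splitting": {"kind": "sentence_splitting"},
    "summary_extraction": {"kind": "summary_extraction"},
    "rule_matching": {"kind": "rule_matching"},
    "stop_word_filtering": {"kind": "stop_word_filtering"},
}


def resolve_selection(selected_names):
    """Return copies of the chosen components, in the order the user chose them."""
    unknown = [n for n in selected_names if n not in COMPONENT_STORE]
    if unknown:
        raise KeyError(f"no stored component for: {unknown}")
    # Copy, so the stored originals stay reusable for later applications.
    return [copy.deepcopy(COMPONENT_STORE[n]) for n in selected_names]


user_selection = [
    "segmentation",
    "sentence_splitting",
    "summary_extraction",
    "rule_matching",
    "stop_word_filtering",
]
needed = resolve_selection(user_selection)
run_order = [component["kind"] for component in needed]
```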
Further, so that the target components generated in step 101 can be reused and the development efficiency of natural language processing algorithm applications is improved, the needed target components can be determined, according to the user's settings, by copying them from the target components generated in step 101. Then, when a new natural language processing algorithm application is developed later, target components do not have to be generated again, and development efficiency is higher.
Step 103: load each needed target component into the preset core framework according to the running order, generating a complete natural language processing algorithm application.
Before the needed target components are loaded into the preset core framework, the stored preset core framework is first retrieved or copied. The preset core framework is provided with a target component load port; loading the needed target components through this port into the preset core framework, in their running order, generates the complete natural language processing algorithm application.
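A minimal sketch of a core framework with a load port and a data input port is shown below. The class `CoreFramework` and its `load` and `run` methods are assumptions made for illustration, and the three loaded callables are simple stand-ins for real target components.

```python
class CoreFramework:
    """Hypothetical preset core framework with a component load port."""

    def __init__(self):
        self._components = []

    def load(self, component):
        """Load port: components run in the order in which they are loaded."""
        if not callable(component):
            raise TypeError("a component must be callable")
        self._components.append(component)
        return self  # allow chained loading

    def run(self, data):
        """Data input port: pass the data through every loaded component in order."""
        for component in self._components:
            data = component(data)
        return data


STOP_WORDS = {"the", "a", "of"}

app = (
    CoreFramework()
    .load(str.lower)   # normalization stand-in
    .load(str.split)   # word segmentation stand-in
    .load(lambda tokens: [t for t in tokens if t not in STOP_WORDS])  # stop word filter
)
result = app.run("The structure of a sentence")
```

In this sketch the loaded framework instance plays the role of the complete application: nothing besides the components has to be written to obtain a working pipeline.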
Step 104: receive data to be processed, process the data to be processed through the natural language processing algorithm application, and output a processing result.
The preset core framework is provided with a data input port. In a specific implementation, the data to be processed is input through the data input port of the preset core framework into the initial target component for processing; each of the other target components corresponding to the natural language processing algorithm application, following the corresponding running order and according to its own context information, obtains intermediate data of the corresponding type and processes it, until the last target component outputs the processing result. The context information corresponding to a target component indicates the type of intermediate data that the target component is to obtain and the type of the data after processing. The intermediate data to be obtained is data generated by a target component that runs earlier. The initial target component is the target component that runs first among all target components corresponding to the natural language processing algorithm application.
The data to be processed is corpus data, including a corpus (a set of multiple individual texts) and/or an individual text. The intermediate data generated by different earlier-running target components differs in type, and each item of intermediate data carries a label that includes the type information of that intermediate data. For example, the intermediate data generated by the target component corresponding to the word segmentation algorithm carries a word segmentation label, which includes the type information of the intermediate data generated by that component. The label corresponding to a later-running target component includes the type information corresponding to the earlier-running target components.
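One way to picture context information and labeled intermediate data is a store keyed by data type, from which each component fetches the type it declares. This is a simplified sketch: `Component`, `run_pipeline`, and the `needs`/`gives` keys are hypothetical, and the label is reduced to a dictionary key rather than a richer tag structure.

```python
class Component:
    """Hypothetical component with context information attached."""

    def __init__(self, name, needs, gives, func):
        self.name = name
        # Context information: type to obtain and type after processing.
        self.context = {"needs": needs, "gives": gives}
        self.func = func


def run_pipeline(components, raw_text):
    # Intermediate data is kept under a label that carries its type.
    store = {"raw_text": raw_text}
    output = raw_text
    for comp in components:
        needed_type = comp.context["needs"]
        if needed_type not in store:
            raise ValueError(
                f"{comp.name} needs '{needed_type}', which no earlier component produced"
            )
        output = comp.func(store[needed_type])
        store[comp.context["gives"]] = output
    return output, store


pipeline = [
    Component("sentence_splitter", "raw_text", "sentences",
              lambda text: [s.strip() for s in text.split(".") if s.strip()]),
    Component("segmenter", "sentences", "tokens",
              lambda sents: [s.split() for s in sents]),
    Component("sentence_counter", "sentences", "sentence_count", len),
]
final, store = run_pipeline(pipeline, "Cats purr. Dogs bark.")
```

Note that the last component reads the `sentences` data produced two steps earlier, not the output of its immediate predecessor; selecting input by declared type is what makes that possible.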
In some other optional embodiments, the natural language processing method also includes: when an instruction from the user to delete or add a needed target component is received, re-determining, according to the instruction, the needed target components and the corresponding running order. The re-determined target components include target components generated in step 101 and/or target components newly generated, in response to the user's instruction, according to the preset core framework's definition of a target component and the basic algorithm logic corresponding to the target component. If the re-determined target components include target components generated in step 101, those are obtained by copying from the target components generated in step 101. After the preset core framework is retrieved or copied again, the re-determined target components are reloaded into it according to the corresponding running order, generating a new natural language processing algorithm application. It follows that, with the method provided by the embodiments of the invention, target components can be freely combined according to user settings, and target components can be freely added to or deleted from a combination, making the development and use of natural language processing algorithm applications simpler and more efficient.
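Re-determining the running order after an add or delete instruction can be sketched as a pure function over the current order. The instruction format (a dictionary with `op`, `name`, and an optional `position`) and the function name `rebuild` are assumptions for this example.

```python
def rebuild(current_order, instruction):
    """Return a new running order after applying one add or delete instruction."""
    order = list(current_order)  # never mutate the existing application's order
    if instruction["op"] == "delete":
        order.remove(instruction["name"])
    elif instruction["op"] == "add":
        # Without an explicit position, the new component runs last.
        order.insert(instruction.get("position", len(order)), instruction["name"])
    else:
        raise ValueError(f"unknown operation: {instruction['op']}")
    return order


original = ["segmentation", "rule_matching", "stop_word_filtering"]
without_rules = rebuild(original, {"op": "delete", "name": "rule_matching"})
with_summary = rebuild(
    without_rules, {"op": "add", "name": "summary_extraction", "position": 1}
)
```

The returned order would then be loaded into a fresh copy of the core framework, so the previously generated application remains usable.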
Further, the natural language processing method also includes: determining, from the natural language processing algorithm applications generated over multiple runs, the target components that have been combined a number of times greater than or equal to a preset number of times, and generating a fixed algorithm combination, so that the corresponding fixed algorithm combination can be loaded directly when a natural language processing algorithm application is subsequently generated. That is, in the course of generating natural language processing algorithm applications multiple times, if certain target components are loaded into the preset core framework in the same running order, and the number of times they have been loaded in that way is greater than or equal to the preset number of times, these target components are chained together in that running order to generate a fixed algorithm combination. When a natural language processing algorithm application is subsequently generated, the generated fixed algorithm combination can be loaded directly into the preset core framework, which further improves the efficiency of developing and using natural language processing algorithm applications.
For example, suppose that, in the course of generating natural language processing algorithm applications multiple times, the target components corresponding to the word segmentation, sentence splitting, summary extraction, rule matching, and stop word filtering algorithms have been loaded into the preset core framework in that order (word segmentation first, stop word filtering last) a number of times greater than or equal to the preset number of times. These five target components are then chained together in the same order to generate one fixed algorithm combination.
It should be noted that the invention does not limit the specific value of the preset number of times, which can be set according to actual needs.
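Counting how often an ordered combination is loaded, and promoting it to a fixed algorithm combination once a threshold is reached, can be sketched as follows. The threshold value 3 and the names `PRESET_TIMES`, `usage_counts`, `fixed_combinations`, and `record_application` are assumptions; the sketch also simplifies by counting only whole running orders rather than sub-sequences shared between applications.

```python
from collections import Counter

PRESET_TIMES = 3  # assumed threshold; the text leaves the value open

usage_counts = Counter()
fixed_combinations = set()


def record_application(run_order):
    """Count an ordered combination and promote it once it is used often enough."""
    key = tuple(run_order)  # order matters, so the key is an ordered tuple
    usage_counts[key] += 1
    if usage_counts[key] >= PRESET_TIMES:
        fixed_combinations.add(key)


for _ in range(3):
    record_application(["segmentation", "sentence_splitting", "stop_word_filtering"])
record_application(["sentence_splitting", "segmentation"])
```

Using a tuple as the key means the same components in a different order count as a different combination, which matches the requirement that the running order be identical.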
In the natural language processing method provided by the embodiments of the invention, each basic algorithm related to natural language processing is independently encapsulated as a target component, and the general algorithm content of a natural language processing algorithm application, other than the target components, is set up as the preset core framework. When a complete natural language processing algorithm application is subsequently generated, the needed target components can be chosen freely according to the user's settings and loaded into the preset core framework in their running order to produce the complete application, without spending a large amount of manpower on writing code again. In addition, the preset core framework provides a specific data input port and fixes conventions for each target component's interface, implementation method, execution process, initialization, and naming, so the use and the extension of the algorithms are not restricted and applicability is better. Furthermore, in the method, the needed target components and the corresponding running order can be re-determined according to an add or delete instruction from the user, and the natural language processing algorithm application can be regenerated by reloading. It follows that target components can be freely combined, added, or deleted according to user settings, which further reduces the difficulty of developing and using natural language processing algorithm applications and improves the applicability of the natural language processing method.
Corresponding to the natural language processing method above, an embodiment of the invention also discloses a natural language processing platform.
Referring to Fig. 2, Fig. 2 is a structural block diagram of a natural language processing platform provided by an embodiment of the invention. As shown in Fig. 2, the natural language processing platform 200 includes: an algorithm development layer 201, a business algorithm layer 202, a core framework layer 203, and an algorithm computation layer 204.
The algorithm development layer 201 is used to: encapsulate each basic algorithm related to natural language processing independently, generating mutually independent target components; and determine, according to user settings, the needed target components and the running order of the needed target components.
The business algorithm layer 202 is used to: load each needed target component into the preset core framework according to the running order, generating a complete natural language processing algorithm application; and receive data to be processed, process the data to be processed through the natural language processing algorithm application, and output a processing result.
The core framework layer 203 is used to store the preset core framework. The preset core framework is the framework part of a complete natural language processing algorithm application: it contains everything that a complete natural language processing algorithm application needs apart from the target components, and it applies to the development and use of all natural language processing algorithm applications.
The algorithm computation layer 204 is used to store the low-level computations involved in the basic algorithms related to natural language processing, such as matrix computations. When a basic algorithm is encapsulated as a target component, these low-level computations can be copied and used directly, with no need to spend manpower writing the code again, which improves the development efficiency of natural language processing algorithm applications.
Further, where the business algorithm layer 202 processes the data to be processed through the natural language processing algorithm application and outputs the processing result, the business algorithm layer 202 is specifically used to: input the data to be processed into the initial target component for processing; each of the other target components corresponding to the natural language processing algorithm application, following the corresponding running order and according to its own context information, obtains intermediate data of the corresponding type and processes it, until the last target component outputs the processing result. The context information corresponding to a target component indicates the type of intermediate data that the target component is to obtain and the type of the data after processing. The intermediate data to be obtained is data generated by a target component that runs earlier. The initial target component is the target component that runs first among all target components corresponding to the natural language processing algorithm application.
Further, the algorithm development layer 201 is also used to store the target components, so that the stored target components can be called or copied directly from the algorithm development layer 201 when a complete natural language processing algorithm application is generated.
In some other optional embodiments, the algorithm development layer 201 is also used to: when an instruction from the user to delete or add a needed target component is received, re-determine, according to the instruction, the needed target components and the corresponding running order. The business algorithm layer 202 is also used to: reload the re-determined target components into the preset core framework according to the corresponding running order, generating a new natural language processing algorithm application.
Further, the algorithm development layer 201 is also used to: determine, from the natural language processing algorithm applications generated over multiple runs, the target components that have been combined a number of times greater than or equal to a preset number of times, and generate a fixed algorithm combination, so that the corresponding fixed algorithm combination can be loaded directly when a natural language processing algorithm application is subsequently generated.
In some other optional embodiments, the algorithm development layer 201 is also used to: when an instruction from the user to create a new target component is received, complete the encapsulation of the new target component according to the preset core framework's definition of a target component and the basic algorithm logic corresponding to the target component. The preset core framework's definition of a target component includes conventions for each target component's interface, implementation method, execution process, initialization, and naming.
The natural language processing platform provided by the embodiments of the invention can implement each step of the natural language processing method above and achieve the same technical effects. When the platform is used to process corpus data, a natural language processing algorithm application for processing that corpus data can be generated simply and quickly, making the whole processing procedure faster and more efficient. Moreover, when the platform is used to perform natural language processing on corpus data, the data input ports of all natural language processing algorithm applications are consistent, so there is no restriction on use and applicability is better.
In a specific implementation, the invention also provides a computer storage medium. The computer storage medium can store a program that, when executed, can perform some or all of the steps in each embodiment of the natural language processing method provided by the invention. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
Those skilled in the art can clearly understand that the techniques in the embodiments of the invention can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions in the embodiments of the invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, or the like) to perform the methods described in each embodiment of the invention or in certain parts of the embodiments.
For parts that are the same or similar across the embodiments in this specification, reference can be made between them. In particular, because the natural language processing platform embodiment is substantially similar to the method embodiment, its description is relatively brief; for the relevant parts, refer to the description of the method embodiment.
The embodiments of the invention described above do not limit the scope of protection of the invention.