CN202563501U - Corpus annotating system based on BP neural network - Google Patents

Corpus annotating system based on BP neural network

Publication number
CN202563501U
Authority
CN
China
Prior art keywords
corpus
neural network
system based
processing
annotating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2012200600774U
Other languages
Chinese (zh)
Inventor
孙赢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Vocational University
Original Assignee
Suzhou Vocational University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Vocational University
Priority to CN2012200600774U
Application granted
Publication of CN202563501U
Anticipated expiration
Expired - Fee Related

Abstract

The utility model discloses a corpus annotation system based on a BP (back propagation) neural network, comprising: a corpus memory; a buffer memory for corpus to be annotated; a corpus annotation result comparator; and a BP neural network processing unit comprising at least two classification processors, wherein the BP neural network processing unit is connected simultaneously to the corpus memory, the buffer memory for corpus to be annotated, and the corpus annotation result comparator. Because the BP neural network processing unit comprises at least two classification processors, the corpus to be annotated is annotated and stored in the corpus memory only when the annotation results that the at least two classification processors produce for it satisfy a set coefficient. The corpus annotation system based on the BP neural network therefore improves the accuracy of corpus annotation.

Description

A corpus annotation system based on a BP neural network
Technical field
The utility model relates to a corpus annotation system based on a BP neural network.
Background art
Corpus annotation is a main component of shallow analysis. It can be applied in fields such as information retrieval, machine translation, topic content analysis, and text processing, and its accuracy directly affects the correctness of text analysis and text processing.
Among existing corpus annotation methods, some apply machine learning algorithms to annotate English corpora. The advantage of such an algorithm is that it can pick out the relevant features from a large number of features. However, because a large number of features are used, search efficiency is very low, and the use of lexical features also causes data sparsity.
Other methods annotate English corpora with an error-driven hidden Markov model and achieve fairly good recognition accuracy. However, this approach uses a large number of features, including words. Although an error-driven strategy is used to select some relevant features, memory usage remains high and data sparsity also occurs, so a back-off method is needed to smooth the data.
The self-learning property of a BP neural network can improve the efficiency of corpus annotation, but the accuracy of a corpus annotation system based on a single BP neural network still needs further improvement.
Summary of the utility model
The utility model provides a corpus annotation system based on a BP neural network. In this system, the BP neural network processing unit includes at least two classification processors. Only when the annotation results that the at least two classification processors produce for the corpus to be annotated satisfy a set coefficient is the corpus annotated and stored in the corpus memory. The system thereby improves the accuracy of corpus annotation.
The technical solution provided by the utility model is as follows:
A corpus annotation system based on a BP neural network, comprising:
a corpus memory;
a buffer memory for corpus to be annotated;
a corpus annotation result comparator; and
a BP neural network processing unit that includes at least two classification processors, the BP neural network processing unit being connected simultaneously to the corpus memory, the buffer memory for corpus to be annotated, and the corpus annotation result comparator.
Preferably, in the corpus annotation system based on a BP neural network, the number of classification processors is three.
Preferably, the corpus annotation system based on a BP neural network further comprises:
an input device connected to the BP neural network processing unit, the input device comprising a keyboard and a speech recognizer.
Preferably, the corpus annotation system based on a BP neural network further comprises:
an output device connected to the BP neural network processing unit, the output device comprising a display.
In the corpus annotation system based on a BP neural network described in the utility model, the BP neural network processing unit includes at least two classification processors. Only when the annotation results that the at least two classification processors produce for the corpus to be annotated satisfy a set coefficient is the corpus annotated and stored in the corpus memory. The system thereby improves the accuracy of corpus annotation.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the corpus annotation system based on a BP neural network described in the utility model.
Detailed description of the embodiments
The utility model is described in further detail below with reference to the accompanying drawing, so that those skilled in the art can implement it by referring to the description.
As shown in Fig. 1, the utility model provides a corpus annotation system based on a BP neural network, comprising: a corpus memory; a buffer memory for corpus to be annotated; a corpus annotation result comparator; and a BP neural network processing unit that includes at least two classification processors, the BP neural network processing unit being connected simultaneously to the corpus memory, the buffer memory for corpus to be annotated, and the corpus annotation result comparator.
The components of the corpus annotation system based on a BP neural network described in the utility model, namely the corpus memory, the buffer memory for corpus to be annotated, the corpus annotation result comparator, the BP neural network processing unit, the input device, and the output device, are all hardware.
At least two classification processors are provided in the BP neural network processing unit. During annotation, each classification processor annotates the corpus to be annotated and stores its annotation result in the buffer memory for corpus to be annotated. When all classification processors have finished annotating, the BP neural network processing unit extracts all annotation results from the buffer memory and inputs them to the corpus annotation result comparator, which compares them. A comparison coefficient is set in the corpus annotation result comparator according to the number of classification processors and is used to judge whether annotation succeeded. With two classification processors, the comparison coefficient should be 1, meaning the annotation results of the two classification processors are identical. With three classification processors, the comparison coefficient is 2/3, meaning the annotation results of two of the three classification processors are identical.
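The comparison step can be illustrated with a minimal Python sketch. The utility model describes hardware, so this is only a software analogy under stated assumptions: each classification processor is assumed to emit one label per item, and annotation is assumed to succeed when the share of processors agreeing on the most common label is at least the comparison coefficient. The names `required_coefficient` and `compare` are hypothetical and do not come from the source.

```python
from collections import Counter
from fractions import Fraction


def required_coefficient(n_processors: int) -> Fraction:
    """Comparison coefficient for the two cases the description specifies."""
    if n_processors == 2:
        return Fraction(1)      # both results must be identical
    if n_processors == 3:
        return Fraction(2, 3)   # two of the three results must be identical
    # The description gives no rule for other counts, so none is invented here.
    raise ValueError("coefficient is only specified for 2 or 3 processors")


def compare(labels, coefficient):
    """Return (success, agreed_label) for one item's annotation results.

    Assumption: success means the agreement ratio of the most common
    label is greater than or equal to the coefficient.
    """
    if not labels:
        raise ValueError("no annotation results to compare")
    label, votes = Counter(labels).most_common(1)[0]
    if Fraction(votes, len(labels)) >= coefficient:
        return True, label
    return False, None
```

For example, `compare(["NOUN", "NOUN", "VERB"], required_coefficient(3))` succeeds with the label `"NOUN"`, while `compare(["NOUN", "VERB"], required_coefficient(2))` fails because the two results differ. Exact fractions are used instead of floats so that the 2/3 threshold is compared without rounding error.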
After annotation succeeds, the corpus to be annotated is converted into annotated corpus and stored in the corpus memory.
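The overall flow (annotate, buffer the results, compare, then store on success) can be sketched in Python as follows. This is a rough software analogy of the hardware described, not the utility model's implementation. The class `AnnotationSystem`, its fields, and the three toy rule-based functions are all hypothetical; the toy functions merely stand in for trained BP neural network classification processors, which the source does not specify in detail.

```python
from collections import Counter
from dataclasses import dataclass, field
from fractions import Fraction
from typing import Callable, Dict, List, Optional

Classifier = Callable[[str], str]  # stand-in for one classification processor


@dataclass
class AnnotationSystem:
    classifiers: List[Classifier]
    coefficient: Fraction                                         # comparison coefficient
    buffer: List[str] = field(default_factory=list)              # pending annotation results
    corpus_memory: Dict[str, str] = field(default_factory=dict)  # item -> accepted label

    def annotate(self, item: str) -> Optional[str]:
        """Annotate one item; store and return the label only on success."""
        self.buffer.clear()
        for classify in self.classifiers:
            self.buffer.append(classify(item))   # each result goes to the buffer
        results = list(self.buffer)              # extract all buffered results
        self.buffer.clear()
        label, votes = Counter(results).most_common(1)[0]
        # Assumption: success when the agreement ratio reaches the coefficient.
        if Fraction(votes, len(results)) >= self.coefficient:
            self.corpus_memory[item] = label     # item becomes annotated corpus
            return label
        return None                              # nothing is stored on failure


# Toy classifiers (hypothetical rules, not neural networks).
def by_suffix(word: str) -> str:
    return "VERB" if word.endswith("ing") else "NOUN"


def by_length(word: str) -> str:
    return "VERB" if len(word) > 6 else "NOUN"


def always_noun(word: str) -> str:
    return "NOUN"
```

With three classifiers and a coefficient of 2/3, `AnnotationSystem([by_suffix, by_length, always_noun], Fraction(2, 3)).annotate("running")` stores and returns `"VERB"`, since two of the three results agree. With only `by_suffix` and `always_noun` and a coefficient of 1, the same word is rejected and the corpus memory stays empty.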
In the corpus annotation system based on a BP neural network, the number of classification processors is three.
The corpus annotation system based on a BP neural network further comprises an input device connected to the BP neural network processing unit, the input device comprising a keyboard and a speech recognizer.
The input device is used to input the corpus to be annotated.
The corpus annotation system based on a BP neural network further comprises an output device connected to the BP neural network processing unit, the output device comprising a display.
The output device is used to output the annotated corpus.
Although embodiments of the utility model are disclosed above, the utility model is not limited to the uses listed in the description and the embodiments. It can be applied to any field suited to it, and those skilled in the art can readily make further modifications. Therefore, without departing from the general concept defined by the claims and their equivalents, the utility model is not limited to the specific details or to the illustrations shown and described here.

Claims (4)

1. A corpus annotation system based on a BP neural network, characterized in that it comprises:
a corpus memory;
a buffer memory for corpus to be annotated;
a corpus annotation result comparator; and
a BP neural network processing unit that includes at least two classification processors, the BP neural network processing unit being connected simultaneously to the corpus memory, the buffer memory for corpus to be annotated, and the corpus annotation result comparator.
2. The corpus annotation system based on a BP neural network according to claim 1, characterized in that the number of classification processors is three.
3. The corpus annotation system based on a BP neural network according to claim 1, characterized in that it further comprises:
an input device connected to the BP neural network processing unit, the input device comprising a keyboard and a speech recognizer.
4. The corpus annotation system based on a BP neural network according to claim 1, characterized in that it further comprises:
an output device connected to the BP neural network processing unit, the output device comprising a display.
CN2012200600774U 2012-02-23 2012-02-23 Corpus annotating system based on BP neural network Expired - Fee Related CN202563501U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012200600774U CN202563501U (en) 2012-02-23 2012-02-23 Corpus annotating system based on BP neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012200600774U CN202563501U (en) 2012-02-23 2012-02-23 Corpus annotating system based on BP neural network

Publications (1)

Publication Number Publication Date
CN202563501U (en) 2012-11-28

Family

ID=47213133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012200600774U Expired - Fee Related CN202563501U (en) 2012-02-23 2012-02-23 Corpus annotating system based on BP neural network

Country Status (1)

Country Link
CN (1) CN202563501U (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530282A (en) * 2013-10-23 2014-01-22 北京紫冬锐意语音科技有限公司 Corpus tagging method and equipment
CN105374350A (en) * 2015-09-29 2016-03-02 百度在线网络技术(北京)有限公司 Speech marking method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530282A (en) * 2013-10-23 2014-01-22 北京紫冬锐意语音科技有限公司 Corpus tagging method and equipment
CN103530282B (en) * 2013-10-23 2016-07-13 北京紫冬锐意语音科技有限公司 Corpus labeling method and equipment
CN105374350A (en) * 2015-09-29 2016-03-02 百度在线网络技术(北京)有限公司 Speech marking method and device

Similar Documents

Publication Publication Date Title
US10740561B1 (en) Identifying entities in electronic medical records
CN111079406B (en) Natural language processing model training method, task execution method, equipment and system
US9390086B2 (en) Classification system with methodology for efficient verification
CN109033374B (en) Knowledge graph retrieval method based on Bayesian classifier
WO2021121198A1 (en) Semantic similarity-based entity relation extraction method and apparatus, device and medium
CN109325226B (en) Deep learning network-based term extraction method and device and storage medium
WO2014117553A1 (en) Method and system of adding punctuation and establishing language model
CN111222330B (en) Chinese event detection method and system
CN102959538B (en) Index to document
WO2022222300A1 (en) Open relationship extraction method and apparatus, electronic device, and storage medium
CN107357765B (en) Word document flaking method and device
CN109783801B (en) Electronic device, multi-label classification method and storage medium
CN111143571B (en) Entity labeling model training method, entity labeling method and device
CN109271624B (en) Target word determination method, device and storage medium
CN104573030A (en) Textual emotion prediction method and device
CN113987125A (en) Text structured information extraction method based on neural network and related equipment thereof
CN111291168A (en) Book retrieval method and device and readable storage medium
CN113821593A (en) Corpus processing method, related device and equipment
CN112365993A (en) Classification method and system for few-sample public health question
CN115374786A (en) Entity and relationship combined extraction method and device, storage medium and terminal
CN202563501U (en) Corpus annotating system based on BP neural network
CN114090792A (en) Document relation extraction method based on comparison learning and related equipment thereof
WO2020082613A1 (en) Method and device for extraction of core viewpoint from securities research report using deep learning model
CN104484323A (en) Translation processing method based on document segment
CN111368532B (en) Topic word embedding disambiguation method and system based on LDA

Legal Events

Date Code Title Description
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121128

Termination date: 20140223