CN110532391A - Method and device for text part-of-speech tagging - Google Patents
- Publication number: CN110532391A
- Application number: CN201910817945.5A
- Authority: CN (China)
- Prior art keywords: speech, word, user, chosen, sentence
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method and device for text part-of-speech tagging. The method includes determining the part of speech set by a user, obtaining a first-type word that the user selects from a sentence, dividing the sentence into multiple segments according to the selected first-type word and storing the segments, and tagging the selected first-type word with the part of speech set by the user and displaying the tag. Because the words the user selects from the sentence are tagged with the part of speech the user has set, words of the same part of speech can be tagged quickly in a single pass, which effectively improves tagging efficiency. Dividing the sentence into multiple segments according to the first-type word and storing them preserves the order of the segments within the sentence, and displaying the part of speech of the selected first-type word is intuitive and makes tagging errors easy to spot.
Description
Technical field
Embodiments of the present invention relate to the field of machine learning technology, and in particular to a method and device for text part-of-speech tagging.
Background
Machine learning (ML) is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other subjects. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, and how they can reorganize existing knowledge structures so as to continuously improve their own performance.
When a machine is trained, improving the accuracy of language processing generally requires people to tag important text with parts of speech manually. Conventional tools simply present a sentence and have the annotator type in the relevant words and tag them. This is inefficient, and the tagged words are unordered: if a word occurs twice in a row within a sentence, each time with a different part of speech, the two occurrences cannot be told apart.
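The ordering problem can be illustrated with a small sketch. The source does not say how conventional tools store their tags; the word-keyed dictionary below is an assumption chosen only to show why unordered storage loses information, and the English phrase "book a book" is a made-up example of one word carrying two parts of speech.

```python
# Hypothetical illustration: the storage layouts and the example phrase are assumptions.

# Unordered storage keyed by the word itself: the second tag replaces the first.
labels_by_word = {}
for word, pos in [("book", "verb"), ("book", "noun")]:
    labels_by_word[word] = pos

# Ordered segments: each occurrence keeps its own position and its own tag.
ordered_segments = [("book", "verb"), (" a ", "unclassified"), ("book", "noun")]
tags_for_book = [pos for text, pos in ordered_segments if text == "book"]

print(labels_by_word)   # only one tag survives for "book"
print(tags_for_book)    # both tags survive, in sentence order
```

Storing the sentence as an ordered list of segments, as the method described in this document does, keeps the two occurrences distinct.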
Summary of the invention
Embodiments of the present invention provide a method and device for text part-of-speech tagging, so as to improve the efficiency of part-of-speech tagging.
In a first aspect, an embodiment of the present invention provides a method for text part-of-speech tagging, comprising:
determining the part of speech set by a user;
obtaining a first-type word selected by the user from a sentence;
dividing the sentence into multiple segments according to the selected first-type word and storing the segments, and tagging the selected first-type word with the part of speech set by the user and displaying the tag.
In the above technical solution, the first-type word that the user selects from the sentence is tagged with the part of speech set by the user, so words of the same part of speech can be tagged quickly in one pass, which effectively improves tagging efficiency. Dividing the sentence into multiple segments according to the first-type word and storing them preserves the order of the segments within the sentence, and displaying the part of speech of the selected first-type word is intuitive and makes tagging errors easy to spot.
Optionally, after tagging the selected first-type word with the part of speech set by the user and displaying the tag, the method further comprises:
obtaining a part of speech modified by the user and a second-type word selected by the user;
dividing the segment containing the second-type word into multiple segments according to the second-type word and storing them, and tagging the second-type word with the part of speech modified by the user.
In the above technical solution, obtaining the part of speech modified by the user and tagging the second-type word with it allows the current part of speech to be switched quickly, so that words of different parts of speech can be tagged.
Optionally, dividing the sentence into multiple segments according to the selected first-type word and storing the segments, and tagging the selected first-type word with the part of speech set by the user and displaying the tag, comprises:
using the selected first-type word as the dividing line, dividing the sentence into multiple segments and storing them in order;
tagging the selected first-type word with the part of speech set by the user, and displaying the tagged part of speech in the sentence.
In the above technical solution, using the first-type word as the dividing line and storing the resulting segments in order keeps each segment in its original position in the sentence, which improves the accuracy of part-of-speech tagging.
Optionally, after tagging the selected first-type word with the part of speech set by the user and displaying the tag, the method further comprises:
setting the words tagged with the part of speech set by the user to the same background color;
wherein words of different parts of speech have different background colors.
In the above technical solution, a background color can also be set after tagging, so that words of different parts of speech can be distinguished.
Optionally, the method further comprises:
obtaining a tagged word clicked by the user;
changing the part of speech of the tagged word to unclassified, and determining whether the part of speech of a word adjacent to the newly unclassified word is unclassified; if so, merging the newly unclassified word with the adjacent unclassified word and storing the result.
In the above technical solution, a word whose tag has been deleted is merged with adjacent unclassified words, which preserves the ordering.
Optionally, the part of speech includes, but is not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier or stop word;
wherein the part of speech is not displayed for an unclassified word.
In a second aspect, an embodiment of the present invention provides a device for text part-of-speech tagging, comprising:
a determination unit, configured to determine the part of speech set by a user;
an acquiring unit, configured to obtain a first-type word selected by the user from a sentence;
a processing unit, configured to divide the sentence into multiple segments according to the selected first-type word and store the segments, and to tag the selected first-type word with the part of speech set by the user and display the tag.
Optionally, the processing unit is further configured to:
after tagging the selected first-type word with the part of speech set by the user and displaying the tag, control the acquiring unit to obtain a part of speech modified by the user and a second-type word selected by the user;
divide the segment containing the second-type word into multiple segments according to the second-type word and store them, and tag the second-type word with the part of speech modified by the user.
Optionally, the processing unit is specifically configured to:
use the selected first-type word as the dividing line, divide the sentence into multiple segments and store them in order;
tag the selected first-type word with the part of speech set by the user, and display the tagged part of speech in the sentence.
Optionally, the processing unit is further configured to:
after tagging the selected first-type word with the part of speech set by the user and displaying the tag, set the words tagged with the part of speech set by the user to the same background color;
wherein words of different parts of speech have different background colors.
Optionally, the processing unit is further configured to:
control the acquiring unit to obtain a tagged word clicked by the user;
change the part of speech of the tagged word to unclassified, and determine whether the part of speech of a word adjacent to the newly unclassified word is unclassified; if so, merge the newly unclassified word with the adjacent unclassified word and store the result.
Optionally, the part of speech includes, but is not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier or stop word;
wherein the part of speech is not displayed for an unclassified word.
In a third aspect, an embodiment of the present invention further provides a computing device, comprising:
a memory, configured to store program instructions;
a processor, configured to call the program instructions stored in the memory and to execute the above method for text part-of-speech tagging according to the obtained program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable non-volatile storage medium comprising computer-readable instructions which, when read and executed by a computer, cause the computer to execute the above method for text part-of-speech tagging.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a system architecture provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a method for text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a device for text part-of-speech tagging provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
Fig. 1 shows, by way of example, a system architecture to which embodiments of the present invention apply. The system architecture may be a server 100 that includes a processor 110, a communication interface 120 and a memory 130.
The communication interface 120 is used to communicate with a terminal device, sending and receiving the information transmitted by the terminal device.
The processor 110 is the control center of the server 100. It connects the various parts of the server 100 through various interfaces and lines, and performs the functions of the server 100 and processes data by running or executing the software programs and/or modules stored in the memory 130 and calling the data stored in the memory 130. Optionally, the processor 110 may include one or more processing units.
The memory 130 can be used to store software programs and modules; the processor 110 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 130. The memory 130 may mainly include a program storage area and a data storage area, where the program storage area can store an operating system, the application programs required by at least one function, and the like, and the data storage area can store data created during business processing. In addition, the memory 130 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
It should be noted that the structure shown in Fig. 1 is only an example, and the embodiments of the present invention are not limited to it.
Based on the above description, Fig. 2 shows, by way of example, the flow of a method for text part-of-speech tagging provided by an embodiment of the present invention. The flow may be executed by a device for text part-of-speech tagging, which may be located in the server 100 shown in Fig. 1 or may be the server 100 itself.
As shown in Fig. 2, the flow specifically includes:
Step 201: determine the part of speech set by the user.
Before tagging words in a sentence, the user first sets the part of speech currently being tagged. In the embodiments of the present invention, the part of speech may include, but is not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier or stop word, and parts of speech can be added or removed as needed in a specific application. The part of speech is not displayed for unclassified words, and every word in a sentence that has not yet been tagged is unclassified. As shown in Fig. 3, the parts of speech the user can set include verb, noun, adjective and stop word. After the file has been loaded, the part of speech currently set is verb.
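Step 201 can be sketched as a small piece of state that holds the part of speech currently being tagged. This is a minimal sketch, not the patented implementation: the names `PartOfSpeech`, `AnnotationState` and `display_label` are hypothetical, and the default of verb simply mirrors the Fig. 3 description.

```python
from enum import Enum


class PartOfSpeech(Enum):
    """Parts of speech named in the text; the set can be extended or reduced."""
    UNCLASSIFIED = "unclassified"
    VERB = "verb"
    NOUN = "noun"
    PRONOUN = "pronoun"
    ADJECTIVE = "adjective"
    NUMERAL = "numeral"
    QUANTIFIER = "quantifier"
    STOP_WORD = "stop word"


class AnnotationState:
    """Holds the part of speech the user has currently set (hypothetical helper)."""

    def __init__(self, default=PartOfSpeech.VERB):
        # Assumption: verb is the setting after loading, as in the Fig. 3 description.
        self.current = default

    def set_current(self, pos):
        self.current = pos


def display_label(pos):
    """Return the label to show next to a word, or None for unclassified words."""
    return None if pos is PartOfSpeech.UNCLASSIFIED else pos.value


state = AnnotationState()
state.set_current(PartOfSpeech.ADJECTIVE)
print(state.current.value, display_label(PartOfSpeech.UNCLASSIFIED))
```

Returning None for unclassified words reflects the rule that the part of speech is not displayed for them.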
Step 202: obtain the first-type word selected by the user from the sentence.
To tag a word, the user must first select it, typically by dragging the mouse over it. In a specific implementation, the first-type word selected by the user from the sentence can be obtained through a click() function.
Step 203: divide the sentence into multiple segments according to the selected first-type word and store the segments, and tag the selected first-type word with the part of speech set by the user and display the tag.
After the first-type word selected by the user has been obtained, the sentence can be divided into multiple segments, using the selected first-type word as the dividing line, and the segments stored in order. The selected first-type word is then tagged with the part of speech set by the user, and the tagged part of speech is displayed in the sentence. As shown in Fig. 4, the part of speech currently set by the user is verb, which can be saved in a variable Type; for example, the current value of Type is "verb". When a sentence is first loaded, the whole sentence is a single record whose type is "unclassified"; the initial unclassified sentence can be stored as shown in Table 1. The first-type word selected by the user is "apply". The record containing "apply" is located first, here the record with serial number 1. Using "apply" as the dividing line, the text content of serial number 1 in Table 1 is divided into three sections (an empty left or right side produces no section): "once can", "apply", and "for multiple VPS how exactly is this done". The sections are stored in table form, each with its own serial number, as shown in Table 2.
Table 1
Serial number | Text content | Part of speech |
1 | once can apply for multiple VPS how exactly is this done | unclassified |
Table 2
Serial number | Text content | Part of speech |
1 | once can | unclassified |
2 | apply | verb |
3 | for multiple VPS how exactly is this done | unclassified |
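The split described in step 203 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `split_record`, the use of character offsets to describe the selection, and the tuple layout `(text, part_of_speech)` are all hypothetical, and the sentence is a shortened English rendering of the example sentence.

```python
UNCLASSIFIED = "unclassified"


def split_record(text, start, end, pos):
    """Split one unclassified record around the selection text[start:end].

    Returns up to three (text, part_of_speech) records in sentence order.
    An empty left or right side produces no record.
    """
    if not (0 <= start < end <= len(text)):
        raise ValueError("selection must be a non-empty range inside the text")
    pieces = [
        (text[:start], UNCLASSIFIED),  # everything left of the selection
        (text[start:end], pos),        # the selected word, tagged
        (text[end:], UNCLASSIFIED),    # everything right of the selection
    ]
    return [(piece, tag) for piece, tag in pieces if piece]


sentence = "once can apply for multiple VPS"
start = sentence.index("apply")
records = split_record(sentence, start, start + len("apply"), "verb")
for serial, (piece, tag) in enumerate(records, start=1):
    print(serial, repr(piece), tag)
```

Serial numbers are derived from list positions rather than stored, so they stay consecutive after every split. The surrounding spaces are kept inside the unclassified records so that concatenating the records reproduces the sentence exactly.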
To make it easier to tag words of different parts of speech, a part of speech modified by the user and a second-type word selected by the user can also be obtained. The segment containing the second-type word is then divided into multiple segments according to the second-type word and stored, and the second-type word is tagged with the part of speech modified by the user.
For example, as shown in Fig. 5, the part of speech modified by the user is adjective, so the value of the Type variable is changed to "adjective". The second-type word selected by the user is "can". From Table 2 it can be determined that "can" lies in the segment with serial number 1. Using "can" as the dividing line, "once can" is divided into "once" and "can", which are stored in the order in which they appear in the original sentence, and "can" is tagged with its part of speech, as shown in Table 3.
Table 3
Serial number | Text content | Part of speech |
1 | once | unclassified |
2 | can | adjective |
3 | apply | verb |
4 | for multiple VPS how exactly is this done | unclassified |
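Tagging a second word means first locating the stored record that contains the selection and then splitting only that record. The sketch below assumes the selection arrives as character offsets into the whole sentence; the function name `annotate` and the record layout are hypothetical.

```python
UNCLASSIFIED = "unclassified"


def annotate(records, start, end, pos):
    """Tag sentence[start:end] with pos by splitting the record that contains it.

    records is an ordered list of (text, part_of_speech) tuples whose texts
    concatenate to the sentence. A new list is returned; the input is unchanged.
    """
    if start >= end:
        raise ValueError("selection must not be empty")
    offset = 0  # position in the sentence where the current record begins
    for index, (text, current) in enumerate(records):
        if offset <= start and end <= offset + len(text):
            if current != UNCLASSIFIED:
                raise ValueError("only unclassified text can be selected")
            lo, hi = start - offset, end - offset
            pieces = [(text[:lo], UNCLASSIFIED), (text[lo:hi], pos), (text[hi:], UNCLASSIFIED)]
            kept = [(piece, tag) for piece, tag in pieces if piece]
            return records[:index] + kept + records[index + 1:]
        offset += len(text)
    raise ValueError("selection must lie inside a single record")


records = [("once can ", UNCLASSIFIED), ("apply", "verb"), (" for multiple VPS", UNCLASSIFIED)]
sentence = "".join(text for text, _ in records)
start = sentence.index("can")
updated = annotate(records, start, start + len("can"), "adjective")
print(updated)
```

Because this English rendering contains spaces, a record holding only the space between "can" and "apply" remains; text written without spaces between words would not produce such a record. Working from offsets rather than searching for the word text also keeps repeated words distinct.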
Further, the tag of a word that has already been tagged can be deleted. Specifically, the tagged word clicked by the user is obtained, its part of speech is changed to unclassified, and it is merged with its adjacent words or segments for storage.
It should be noted that only unclassified words in a sentence can be selected; tagged words cannot. Therefore, to delete the tag of a tagged word, the user only needs to click anywhere on that word for it to be obtained.
For example, to cancel the tag of "can", the user clicks anywhere on "can", and the tagged word clicked by the user, "can", is obtained. From Table 3, "can" is found at serial number 2, and its part of speech is changed to unclassified. Its two adjacent records (1 and 3) are then found; any adjacent record whose part of speech is also unclassified can be merged with it. Table 3 shows that the word with serial number 1 is unclassified, so the two are merged, and the list shown in Table 2 is finally obtained.
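Deleting a tag and merging with unclassified neighbours can be sketched as below. The function name `unmark` and the record layout are hypothetical; both neighbours are checked, following the statement that the two adjacent records are looked up.

```python
UNCLASSIFIED = "unclassified"


def unmark(records, index):
    """Remove the tag of records[index] and merge it with unclassified neighbours.

    records is an ordered list of (text, part_of_speech) tuples.
    A new list is returned; the input is unchanged.
    """
    text, pos = records[index]
    if pos == UNCLASSIFIED:
        return list(records)  # nothing to delete
    first, last, merged = index, index, text
    # Merge with the record on the left if it is unclassified.
    if index > 0 and records[index - 1][1] == UNCLASSIFIED:
        first = index - 1
        merged = records[first][0] + merged
    # Merge with the record on the right if it is unclassified.
    if index + 1 < len(records) and records[index + 1][1] == UNCLASSIFIED:
        last = index + 1
        merged = merged + records[last][0]
    return records[:first] + [(merged, UNCLASSIFIED)] + records[last + 1:]


tagged = [
    ("once ", UNCLASSIFIED),
    ("can", "adjective"),
    (" ", UNCLASSIFIED),
    ("apply", "verb"),
    (" for multiple VPS", UNCLASSIFIED),
]
restored = unmark(tagged, 1)  # the user clicked "can"
print(restored)
```

Merging immediately after each deletion means two unclassified records never sit next to each other, so the stored list always matches what a fresh split would have produced.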
Optionally, after each word has been tagged, the words tagged with the part of speech set by the user can also be given the same background color, with words of different parts of speech having different background colors. As Fig. 6 shows, each part of speech has its own background color.
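One way to realise the background colors is to render the stored records as HTML, one colored span per tagged word. This is only a sketch: the source does not name a rendering technology, and the color values, the `BACKGROUND` mapping and the `render` function are assumptions.

```python
from html import escape

# Hypothetical palette: one distinct background color per part of speech.
BACKGROUND = {
    "verb": "#ffd54f",
    "noun": "#81d4fa",
    "adjective": "#a5d6a7",
    "stop word": "#e0e0e0",
}


def render(records):
    """Render (text, part_of_speech) records as an HTML fragment.

    Unclassified text, which has no entry in BACKGROUND, is emitted without
    a span, so no part of speech is shown for it.
    """
    parts = []
    for text, pos in records:
        color = BACKGROUND.get(pos)
        if color is None:
            parts.append(escape(text))
        else:
            parts.append(
                f'<span style="background-color:{color}" title="{escape(pos)}">{escape(text)}</span>'
            )
    return "".join(parts)


fragment = render([("once ", "unclassified"), ("can", "adjective"), ("apply", "verb")])
print(fragment)
```

Keeping the palette in a single mapping makes it easy to check that no two parts of speech share a color.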
Once every word in a sentence that can be tagged has been tagged, the user can click the submit button and move on to tagging the next sentence.
The above embodiment shows that the part of speech set by the user is determined, the first-type word selected by the user from the sentence is obtained, the sentence is divided into multiple segments according to the selected first-type word and stored, and the selected first-type word is tagged with the part of speech set by the user and the tag is displayed. Tagging the first-type word that the user selects from the sentence with the part of speech set by the user effectively improves the efficiency of part-of-speech tagging, and dividing the sentence into multiple segments according to the first-type word and storing them preserves the order of the data.
Based on the same technical concept, Fig. 7 shows, by way of example, the structure of a device for text part-of-speech tagging provided by an embodiment of the present invention. The device can execute the flow of text part-of-speech tagging, and may be located in the server 100 shown in Fig. 1 or may be the server 100 itself.
As shown in Fig. 7, the device specifically includes:
a determination unit 701, configured to determine the part of speech set by a user;
an acquiring unit 702, configured to obtain a first-type word selected by the user from a sentence;
a processing unit 703, configured to divide the sentence into multiple segments according to the selected first-type word and store the segments, and to tag the selected first-type word with the part of speech set by the user and display the tag.
Optionally, the processing unit 703 is further configured to:
after tagging the selected first-type word with the part of speech set by the user and displaying the tag, control the acquiring unit 702 to obtain a part of speech modified by the user and a second-type word selected by the user;
divide the segment containing the second-type word into multiple segments according to the second-type word and store them, and tag the second-type word with the part of speech modified by the user.
Optionally, the processing unit 703 is specifically configured to:
use the selected first-type word as the dividing line, divide the sentence into multiple segments and store them in order;
tag the selected first-type word with the part of speech set by the user, and display the tagged part of speech in the sentence.
Optionally, the processing unit 703 is further configured to:
after tagging the selected first-type word with the part of speech set by the user and displaying the tag, set the words tagged with the part of speech set by the user to the same background color;
wherein words of different parts of speech have different background colors.
Optionally, the processing unit 703 is further configured to:
control the acquiring unit 702 to obtain a tagged word clicked by the user;
change the part of speech of the tagged word to unclassified, and merge it with its adjacent words or segments for storage.
Optionally, the part of speech includes, but is not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier or stop word;
wherein the part of speech is not displayed for an unclassified word.
Based on the same technical concept, an embodiment of the present invention further provides a computing device, comprising:
a memory, configured to store program instructions;
a processor, configured to call the program instructions stored in the memory and to execute the above method for text part-of-speech tagging according to the obtained program.
Based on the same technical concept, an embodiment of the present invention further provides a computer-readable non-volatile storage medium comprising computer-readable instructions which, when read and executed by a computer, cause the computer to execute the above method for text part-of-speech tagging.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the present invention have been described, a person skilled in the art can make additional changes and modifications to these embodiments once the basic inventive concept is known. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Evidently, a person skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.
Claims (10)
1. A method for text part-of-speech tagging, characterized by comprising:
determining the part of speech set by a user;
obtaining a first-type word selected by the user from a sentence;
dividing the sentence into multiple segments according to the selected first-type word and storing the segments, and tagging the selected first-type word with the part of speech set by the user and displaying the tag.
2. The method according to claim 1, characterized in that, after tagging the selected first-type word with the part of speech set by the user and displaying the tag, the method further comprises:
obtaining a part of speech modified by the user and a second-type word selected by the user;
dividing the segment containing the second-type word into multiple segments according to the second-type word and storing them, and tagging the second-type word with the part of speech modified by the user.
3. The method according to claim 1, characterized in that dividing the sentence into multiple segments according to the selected first-type word and storing the segments, and tagging the selected first-type word with the part of speech set by the user and displaying the tag, comprises:
using the selected first-type word as the dividing line, dividing the sentence into multiple segments and storing them in order;
tagging the selected first-type word with the part of speech set by the user, and displaying the tagged part of speech in the sentence.
4. The method according to claim 1, characterized in that, after tagging the selected first-type word with the part of speech set by the user and displaying the tag, the method further comprises:
setting the words tagged with the part of speech set by the user to the same background color;
wherein words of different parts of speech have different background colors.
5. The method according to claim 4, characterized in that the method further comprises:
obtaining a tagged word clicked by the user;
changing the part of speech of the tagged word to unclassified, and determining whether the part of speech of a word adjacent to the newly unclassified word is unclassified; if so, merging the newly unclassified word with the adjacent unclassified word and storing the result.
6. The method according to any one of claims 1 to 5, characterized in that the part of speech includes unclassified, verb, noun, pronoun, adjective, numeral, quantifier or stop word;
wherein the part of speech is not displayed for an unclassified word.
7. a kind of device of text part-of-speech tagging characterized by comprising
Determination unit, for determining the part of speech of user setting;
Acquiring unit, the first kind word chosen from sentence for obtaining user;
The sentence is divided into multiple paragraphs for the first kind word chosen according to described in and stored by processing unit, and by institute
The part-of-speech tagging for stating the first kind word chosen is the part of speech of the user setting and is shown.
8. The device of claim 7, characterized in that the processing unit is further configured to:
After the part of speech of the chosen words is tagged as the part of speech set by the user and displayed, control the acquisition unit to obtain the part of speech modified by the user and the second-type words chosen by the user;
Divide the paragraph containing the second-type words into multiple paragraphs according to the second-type words and store them, and tag the part of speech of the second-type words as the part of speech modified by the user.
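The second pass only re-splits the stored paragraph that contains the newly chosen word. The Python sketch below illustrates this under stated assumptions: the `(text, part_of_speech)` tuple layout, the `resplit` name, and the rule that leftover text keeps the paragraph's previous label are not taken from the source.

```python
def resplit(segments, seg_index, start, end, pos):
    """Split one stored paragraph around a newly chosen word and tag that word.

    `segments` is a list of (text, part_of_speech) tuples in sentence order;
    [start, end) is the chosen span inside segments[seg_index].
    """
    text, previous_pos = segments[seg_index]
    if not (0 <= start < end <= len(text)):
        raise ValueError("chosen span must lie inside the paragraph")
    pieces = [
        (text[:start], previous_pos),  # leftover text keeps its old label (assumption)
        (text[start:end], pos),        # second-type word gets the modified part of speech
        (text[end:], previous_pos),
    ]
    pieces = [(t, p) for t, p in pieces if t]  # drop empty leftovers
    # Only the affected paragraph is replaced; all others stay as stored.
    return segments[:seg_index] + pieces + segments[seg_index + 1:]


stored = [("I like ", "unclassified"), ("red", "adjective"), (" apples", "unclassified")]
updated = resplit(stored, 2, 1, 7, "noun")
```

In this usage the user is assumed to have changed the active part of speech to "noun" and then selected "apples" inside the third stored paragraph, so only that paragraph is divided.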
9. A computing device, characterized by comprising:
A memory, configured to store program instructions;
A processor, configured to call the program instructions stored in the memory and to execute, according to the obtained program, the method of any one of claims 1 to 6.
10. A computer-readable non-volatile storage medium, characterized by comprising computer-readable instructions which, when read and executed by a computer, cause the computer to execute the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910817945.5A CN110532391B (en) | 2019-08-30 | 2019-08-30 | Text part-of-speech tagging method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910817945.5A CN110532391B (en) | 2019-08-30 | 2019-08-30 | Text part-of-speech tagging method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110532391A (en) | 2019-12-03 |
CN110532391B (en) | 2022-07-05 |
Family
ID=68665827
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910817945.5A Active CN110532391B (en) | 2019-08-30 | 2019-08-30 | Text part-of-speech tagging method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110532391B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113283232A (en) * | 2021-05-31 | 2021-08-20 | 支付宝(杭州)信息技术有限公司 | Method and device for automatically analyzing private information in text |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101539907A (en) * | 2008-03-19 | 2009-09-23 | 日电(中国)有限公司 | Part-of-speech tagging model training device and part-of-speech tagging system and method thereof |
US20100138217A1 (en) * | 2008-11-28 | 2010-06-03 | Institute For Information Industry | Method for constructing chinese dictionary and apparatus and storage media using the same |
JP2010250814A (en) * | 2009-04-14 | 2010-11-04 | Nec (China) Co Ltd | Part-of-speech tagging system, training device and method of part-of-speech tagging model |
CN103473220A (en) * | 2013-09-13 | 2013-12-25 | 华中师范大学 | Subtitle-file-based documentary content automatic segmentation and subhead automatic generation method |
CN108170674A (en) * | 2017-12-27 | 2018-06-15 | 东软集团股份有限公司 | Part-of-speech tagging method and apparatus, program product and storage medium |
CN108197101A (en) * | 2017-12-19 | 2018-06-22 | 浪潮软件股份有限公司 | A kind of corpus labeling method and device |
CN108256029A (en) * | 2018-01-11 | 2018-07-06 | 北京神州泰岳软件股份有限公司 | Statistical classification model training apparatus and training method |
CN108874937A (en) * | 2018-05-31 | 2018-11-23 | 南通大学 | A kind of sensibility classification method combined based on part of speech with feature selecting |
CN109271626A (en) * | 2018-08-31 | 2019-01-25 | 北京工业大学 | Text semantic analysis method |
CN109558580A (en) * | 2017-09-26 | 2019-04-02 | 北京国双科技有限公司 | A kind of text analyzing method and device |
CN109922155A (en) * | 2019-03-18 | 2019-06-21 | 众安信息技术服务有限公司 | The method and device of intelligent agent is realized in block chain network |
CN110110327A (en) * | 2019-04-26 | 2019-08-09 | 网宿科技股份有限公司 | A kind of text marking method and apparatus based on confrontation study |
Non-Patent Citations (1)
Title |
---|
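吴潇等 (Wu Xiao et al.): "基于购物领域词典扩建的评论情感研究" (Research on review sentiment based on expansion of a shopping-domain dictionary), 《计算机技术与发展》 (Computer Technology and Development) * |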
Also Published As
Publication number | Publication date |
---|---|
CN110532391B (en) | 2022-07-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108170792B (en) | Question and answer guiding method and device based on artificial intelligence and computer equipment | |
US11194958B2 (en) | Fact replacement and style consistency tool | |
JP2022013586A (en) | Method of generating conference minutes, apparatus, electronic device, and computer-readable storage medium | |
US20120290509A1 (en) | Training Statistical Dialog Managers in Spoken Dialog Systems With Web Data | |
WO2004061593A2 (en) | Automated essay scoring | |
CN108846138B (en) | Question classification model construction method, device and medium fusing answer information | |
CN108509556A (en) | Data migration method and device, server, storage medium | |
US11507743B2 (en) | System and method for automatic key phrase extraction rule generation | |
CN110348020A (en) | A kind of English- word spelling error correction method, device, equipment and readable storage medium storing program for executing | |
CN108710695A (en) | Mind map generation method based on e-book and electronic equipment | |
EP2707807A2 (en) | Training statistical dialog managers in spoken dialog systems with web data | |
CN110222194A (en) | Data drawing list generation method and relevant apparatus based on natural language processing | |
CN115048435B (en) | Intelligent database storage method and system | |
CN108829651A (en) | A kind of method, apparatus of document treatment, terminal device and storage medium | |
US10204080B2 (en) | Rich formatting for a data label associated with a data point | |
KR102444362B1 (en) | Method, system and non-transitory computer-readable recording medium for supporting writing assessment | |
US8214736B2 (en) | Method and system of identifying textual passages that affect document length | |
CN110516164A (en) | A kind of information recommendation method, device, equipment and storage medium | |
CN106970758A (en) | Electronic document operation processing method, device and electronic equipment | |
CN110532391A (en) | A kind of method and device of text part-of-speech tagging | |
CN106997340A (en) | The generation of dictionary and the Document Classification Method and device using dictionary | |
CN102707938A (en) | Table-form software specification manufacturing and supporting method and device | |
CN108228779A (en) | A kind of result prediction method based on Learning Community's dialogue stream | |
US20130318104A1 (en) | Method and system for analyzing data in artifacts and creating a modifiable data network | |
CN116226681A (en) | Text similarity judging method and device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||