CN110532391A - Method and device for text part-of-speech tagging - Google Patents

Method and device for text part-of-speech tagging

Info

Publication number
CN110532391A
Authority
CN
China
Prior art keywords
speech
word
user
chosen
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910817945.5A
Other languages
Chinese (zh)
Other versions
CN110532391B (en)
Inventor
李金锋
杨绳春
洪文龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wangsu Science and Technology Co Ltd
Original Assignee
Wangsu Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wangsu Science and Technology Co Ltd
Priority to CN201910817945.5A (patent CN110532391B)
Publication of CN110532391A
Application granted
Publication of CN110532391B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method and device for text part-of-speech tagging. The method includes determining the part of speech set by the user, obtaining the first-type word that the user selects from a sentence, dividing the sentence into multiple segments according to the selected first-type word and storing them, and tagging the selected first-type word with the part of speech set by the user and displaying it. Because the words the user selects are tagged with the part of speech set beforehand, words that share a part of speech can be tagged quickly in one pass, which improves tagging efficiency. Dividing the sentence into segments at the first-type word preserves the order of the segments within the sentence, and displaying the part of speech of the selected word is intuitive and makes tagging errors easy to spot.

Description

Method and device for text part-of-speech tagging
Technical field
The embodiments of the present invention relate to the field of machine learning, and in particular to a method and device for text part-of-speech tagging.
Background
Machine learning (ML) is a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other subjects. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, and how they can reorganize existing knowledge structures to keep improving their own performance.
When a machine is trained, improving the accuracy of language processing generally requires people to tag the parts of speech of important text by hand. Conventional tools simply present a sentence and have the annotator type in the relevant words and label them. This is inefficient, and the labeled words are unordered: if a word occurs twice in succession in a sentence with different parts of speech, the two occurrences cannot be told apart.
Summary of the invention
The embodiments of the present invention provide a method and device for text part-of-speech tagging, so as to improve the efficiency of part-of-speech tagging.
In a first aspect, an embodiment of the present invention provides a method for text part-of-speech tagging, comprising:
Determine the part of speech set by the user;
Obtain the first-type word selected by the user from a sentence;
Divide the sentence into multiple segments according to the selected first-type word and store them, and tag the selected first-type word with the part of speech set by the user and display it.
In the above technical solution, the first-type word that the user selects from the sentence is tagged with the part of speech set by the user, so words that share a part of speech can be tagged quickly in one pass, which improves the efficiency of part-of-speech tagging. Dividing the sentence into multiple segments at the first-type word and storing them preserves the order of the segments within the sentence, and displaying the part of speech of the selected first-type word is intuitive and makes tagging errors easy to spot.
Optionally, after the selected first-type word is tagged with the part of speech set by the user and displayed, the method further comprises:
Obtain the part of speech as modified by the user and the second-type word selected by the user;
Divide the segment containing the second-type word into multiple segments according to the second-type word and store them, and tag the second-type word with the part of speech as modified by the user.
In the above technical solution, obtaining the modified part of speech and tagging the second-type word with it allows the set part of speech to be switched quickly, so that words of different parts of speech can be tagged.
Optionally, dividing the sentence into multiple segments according to the selected first-type word and storing them, and tagging the selected first-type word with the part of speech set by the user and displaying it, comprises:
Using the selected first-type word as the dividing line, divide the sentence into multiple segments and store them in order;
Tag the selected first-type word with the part of speech set by the user, and display the tagged part of speech in the sentence.
In the above technical solution, using the first-type word as the dividing line and storing the resulting segments in order keeps each segment in its position within the sentence, which improves the accuracy of part-of-speech tagging.
Optionally, after the selected first-type word is tagged with the part of speech set by the user and displayed, the method further comprises:
Set the words tagged with the part of speech set by the user to the same background color;
Wherein words of different parts of speech have different background colors.
In the above technical solution, a background color can also be set after tagging, so that words of different parts of speech can be told apart.
Optionally, the method further comprises:
Obtain the tagged word that the user clicks;
Change the part of speech of the tagged word to unclassified, and determine whether the words adjacent to it are also unclassified; if so, merge the word with the adjacent unclassified words and store the result.
In the above technical solution, a word whose tag has been removed is merged with the adjacent unclassified words, which preserves the ordering.
Optionally, the parts of speech include, but are not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier and stop word;
Wherein a word whose part of speech is unclassified is displayed without a part of speech.
In a second aspect, an embodiment of the present invention provides a device for text part-of-speech tagging, comprising:
A determination unit, configured to determine the part of speech set by the user;
An acquiring unit, configured to obtain the first-type word selected by the user from a sentence;
A processing unit, configured to divide the sentence into multiple segments according to the selected first-type word and store them, and to tag the selected first-type word with the part of speech set by the user and display it.
Optionally, the processing unit is further configured to:
After the selected first-type word is tagged with the part of speech set by the user and displayed, control the acquiring unit to obtain the part of speech as modified by the user and the second-type word selected by the user;
Divide the segment containing the second-type word into multiple segments according to the second-type word and store them, and tag the second-type word with the part of speech as modified by the user.
Optionally, the processing unit is specifically configured to:
Using the selected first-type word as the dividing line, divide the sentence into multiple segments and store them in order;
Tag the selected first-type word with the part of speech set by the user, and display the tagged part of speech in the sentence.
Optionally, the processing unit is further configured to:
After the selected first-type word is tagged with the part of speech set by the user and displayed, set the words tagged with the part of speech set by the user to the same background color;
Wherein words of different parts of speech have different background colors.
Optionally, the processing unit is further configured to:
Control the acquiring unit to obtain the tagged word that the user clicks;
Change the part of speech of the tagged word to unclassified, and determine whether the words adjacent to it are also unclassified; if so, merge the word with the adjacent unclassified words and store the result.
Optionally, the parts of speech include, but are not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier and stop word;
Wherein a word whose part of speech is unclassified is displayed without a part of speech.
In a third aspect, an embodiment of the present invention further provides a computing device, comprising:
A memory, configured to store program instructions;
A processor, configured to call the program instructions stored in the memory and to execute the above method for text part-of-speech tagging according to the obtained program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable non-volatile storage medium comprising computer-readable instructions which, when read and executed by a computer, cause the computer to execute the above method for text part-of-speech tagging.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a system architecture provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of a method for text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of text part-of-speech tagging provided by an embodiment of the present invention;
Fig. 7 is a structural schematic diagram of a device for text part-of-speech tagging provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 shows an example of a system architecture to which the embodiments of the present invention apply. The system architecture may be a server 100 comprising a processor 110, a communication interface 120 and a memory 130.
The communication interface 120 is used to communicate with terminal devices, sending and receiving the information they transmit.
The processor 110 is the control center of the server 100. It connects the parts of the server 100 through various interfaces and lines, and performs the functions of the server 100 and processes data by running or executing the software programs and/or modules stored in the memory 130 and calling the data stored in the memory 130. Optionally, the processor 110 may include one or more processing units.
The memory 130 can be used to store software programs and modules; the processor 110 performs various functional applications and data processing by running the software programs and modules stored in the memory 130. The memory 130 may mainly include a program storage area and a data storage area: the program storage area can store the operating system, the application program required by at least one function, and so on; the data storage area can store data created during business processing. In addition, the memory 130 may include high-speed random access memory, and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
It should be noted that the structure shown in Fig. 1 is only an example and does not limit the embodiments of the present invention.
Based on the above description, Fig. 2 shows an example flow of a method for text part-of-speech tagging provided by an embodiment of the present invention. The flow may be executed by a device for text part-of-speech tagging, which may be located in the server 100 shown in Fig. 1 or may be the server 100 itself.
As shown in Fig. 2, the flow specifically comprises:
Step 201: determine the part of speech set by the user.
Before tagging the words in a sentence, the user first sets the part of speech currently being tagged. In the embodiments of the present invention, the parts of speech may include, but are not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier and stop word; in a specific application, parts of speech can be added or removed as needed. A word whose part of speech is unclassified is displayed without a part of speech, and every word in a sentence that has not yet been tagged is unclassified. As shown in Fig. 3, the parts of speech that the user can set include verb, noun, adjective and stop word. After the file is loaded, the part of speech currently set by the user is verb.
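The tag set and the display rule from step 201 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `PartOfSpeech` and `render`, the bracketed label format and the shortened example sentence are all assumptions made for the sketch.

```python
from enum import Enum


class PartOfSpeech(Enum):
    """Tag set listed in the description; the member names are illustrative."""
    UNCLASSIFIED = "unclassified"
    VERB = "verb"
    NOUN = "noun"
    PRONOUN = "pronoun"
    ADJECTIVE = "adjective"
    NUMERAL = "numeral"
    QUANTIFIER = "quantifier"
    STOP_WORD = "stop word"


def render(segments):
    """Join (text, part of speech) pairs for display, labelling only tagged ones."""
    parts = []
    for text, pos in segments:
        if pos is PartOfSpeech.UNCLASSIFIED:
            parts.append(text)  # unclassified text is shown without a label
        else:
            parts.append(f"{text}[{pos.value}]")  # bracket format is an assumption
    return " ".join(parts)


# A freshly loaded sentence is a single unclassified segment.
fresh = [("once can apply multiple VPS", PartOfSpeech.UNCLASSIFIED)]
tagged = [
    ("once can", PartOfSpeech.UNCLASSIFIED),
    ("apply", PartOfSpeech.VERB),
    ("multiple VPS", PartOfSpeech.UNCLASSIFIED),
]
print(render(fresh))
print(render(tagged))
```

Keeping "unclassified" as an ordinary member of the tag set means untagged text and tagged words can share one record format, which is what the tables later in the description rely on.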
Step 202: obtain the first-type word selected by the user from the sentence.
When the user needs to tag a word, the word must first be selected, generally by dragging the mouse across it. In a specific implementation, obtaining the first-type word selected by the user from the sentence can be realized through a click() function.
Step 203: divide the sentence into multiple segments according to the selected first-type word and store them, and tag the selected first-type word with the part of speech set by the user and display it.
After the first-type word selected by the user is obtained, the sentence is divided into multiple segments with the selected first-type word as the dividing line and the segments are stored in order; the selected first-type word is then tagged with the part of speech set by the user, and the tagged part of speech is displayed in the sentence. As shown in Fig. 4, the part of speech currently set by the user is verb, which can be saved in a variable Type, so the current value of Type is "verb". When a sentence is first loaded, the whole sentence is stored as a single record of type "unclassified"; the initial storage of the unclassified sentence can be as shown in Table 1. The first-type word selected by the user is "apply". The record containing "apply" is found first, here the record with serial number 1. With "apply" as the dividing line, the text content of record 1 in Table 1 is divided into three segments (if the left or right side is empty, no segment is created for it): "once can", "apply", and "multiple VPS how specifically do this". The segments are stored in table form and each is given a serial number, as shown in Table 2.
Table 1
| Serial number | Text content | Part of speech |
| --- | --- | --- |
| 1 | once can apply multiple VPS how specifically do this | Unclassified |
Table 2
| Serial number | Text content | Part of speech |
| --- | --- | --- |
| 1 | once can | Unclassified |
| 2 | apply | Verb |
| 3 | multiple VPS how specifically do this | Unclassified |
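The three-way split behind Table 2 can be sketched as below. The source does not say how the selection is represented, so this sketch assumes it arrives as character offsets `start` and `end`; `split_at_selection` is a hypothetical name, and stripping whitespace is only needed because the example uses a space-separated English stand-in for the original sentence.

```python
def split_at_selection(text, start, end):
    """Split text into left / selected / right pieces, dropping empty sides."""
    if not 0 <= start < end <= len(text):
        raise ValueError("selection must be a non-empty range inside the text")
    pieces = [text[:start].strip(), text[start:end], text[end:].strip()]
    # An empty left or right side produces no segment, as the description requires.
    return [piece for piece in pieces if piece]


sentence = "once can apply multiple VPS how specifically do this"
start = sentence.index("apply")
segments = split_at_selection(sentence, start, start + len("apply"))
print(segments)
```

Working from offsets rather than from the word's text is one way to keep two occurrences of the same word distinguishable, which is the ambiguity the background section attributes to conventional tools.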
To tag words of different parts of speech, the part of speech as modified by the user and the second-type word selected by the user can also be obtained. The segment containing the second-type word is then divided into multiple segments according to the second-type word and stored, and the second-type word is tagged with the part of speech as modified by the user.
For example, as shown in Fig. 5, the part of speech modified by the user is adjective, so the value of the Type variable is changed to adjective. The second-type word selected by the user is "can". Table 2 shows that "can" lies in the segment with serial number 1, so with "can" as the dividing line, "once can" is divided into "once" and "can". The two are stored in the order in which they occur in the original sentence, and "can" is tagged with the part of speech, as shown in Table 3.
Table 3
| Serial number | Text content | Part of speech |
| --- | --- | --- |
| 1 | once | Unclassified |
| 2 | can | Adjective |
| 3 | apply | Verb |
| 4 | multiple VPS how specifically do this | Unclassified |
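The progression from Table 1 through Table 3 can be reproduced with an ordered list of records. This is a hedged sketch: `annotate` is a hypothetical helper, the serial number is taken to be a record's position in the list, and the word is located by a plain substring search in the first unclassified record that contains it, which is an assumption rather than something the source specifies.

```python
UNCLASSIFIED = "unclassified"


def annotate(records, word, current_type):
    """Tag the first unclassified occurrence of word; return a new record list."""
    for i, (text, pos) in enumerate(records):
        if pos != UNCLASSIFIED or word not in text:
            continue  # tagged records cannot be selected again
        left, _, right = text.partition(word)
        pieces = [
            (left.strip(), UNCLASSIFIED),
            (word, current_type),
            (right.strip(), UNCLASSIFIED),
        ]
        pieces = [piece for piece in pieces if piece[0]]  # skip empty sides
        # Splicing in place keeps every segment in its original order.
        return records[:i] + pieces + records[i + 1:]
    raise ValueError(f"{word!r} not found in any unclassified record")


records = [("once can apply multiple VPS how specifically do this", UNCLASSIFIED)]
records = annotate(records, "apply", "verb")     # Table 1 -> Table 2
records = annotate(records, "can", "adjective")  # Table 2 -> Table 3
for number, (text, pos) in enumerate(records, start=1):
    print(number, text, pos)
```

Because the list is only ever spliced, never re-sorted, the serial numbers fall out of the list order and never need to be stored separately.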
Further, the tag of an already tagged word can also be removed. Specifically, the tagged word that the user clicks is obtained, its part of speech is changed to unclassified, and it is merged with its adjacent words or segments for storage.
It should be noted that within a sentence only words whose part of speech is unclassified can be selected; words that have already been tagged cannot. Therefore, to remove the tag of a tagged word, the user only needs to click anywhere on that word for it to be obtained.
Take cancelling the tag of "can" as an example. The user clicks anywhere on "can", and the tagged word clicked by the user, "can", is obtained. Its position is found in Table 3 as serial number 2, and its part of speech is changed to unclassified. Its two adjacent records (1 and 3) are then examined, and any adjacent record with the same part of speech is merged with it. Table 3 shows that the word with serial number 1 is unclassified, so the two are merged, which finally yields the list shown in Table 2.
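The tag removal and merge just described can be sketched as follows. `clear_tag` is a hypothetical name, records are (text, part of speech) pairs in sentence order, and joining merged text with a space is an assumption that suits the English stand-in sentence; text without word spacing would be concatenated directly.

```python
UNCLASSIFIED = "unclassified"


def clear_tag(records, index):
    """Reset records[index] to unclassified and merge unclassified neighbours."""
    text, _ = records[index]
    first, last = index, index
    # Absorb the record on the left if it is unclassified.
    if index > 0 and records[index - 1][1] == UNCLASSIFIED:
        first = index - 1
        text = f"{records[first][0]} {text}"
    # Absorb the record on the right if it is unclassified.
    if index + 1 < len(records) and records[index + 1][1] == UNCLASSIFIED:
        last = index + 1
        text = f"{text} {records[last][0]}"
    return records[:first] + [(text, UNCLASSIFIED)] + records[last + 1:]


table3 = [
    ("once", UNCLASSIFIED),
    ("can", "adjective"),
    ("apply", "verb"),
    ("multiple VPS how specifically do this", UNCLASSIFIED),
]
table2 = clear_tag(table3, 1)  # the user clicks "can"
print(table2)
```

Merging keeps the invariant that no two unclassified records sit next to each other, so a later selection can still span the rejoined text.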
Optionally, after each word is tagged, the words tagged with the part of speech set by the user can also be given the same background color, with words of different parts of speech given different background colors, as Fig. 6 shows.
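One way to realize the per-part-of-speech background color is a lookup table applied at render time. This is an illustrative sketch only: the palette values, the HTML `span` output and the name `to_html` are assumptions, since the source requires only that different parts of speech use different colors.

```python
from html import escape

# Hypothetical palette; any mapping with distinct colours would satisfy the text.
BACKGROUND = {
    "verb": "#ffd6d6",
    "noun": "#d6e4ff",
    "adjective": "#d9f7d9",
    "stop word": "#eeeeee",
}


def to_html(records):
    """Render (text, part of speech) records, highlighting tagged words."""
    spans = []
    for text, pos in records:
        colour = BACKGROUND.get(pos)
        if colour is None:
            spans.append(escape(text))  # unclassified text gets no highlight
        else:
            spans.append(
                f'<span style="background:{colour}" title="{escape(pos)}">'
                f"{escape(text)}</span>"
            )
    return " ".join(spans)


print(to_html([("once", "unclassified"), ("can", "adjective"), ("apply", "verb")]))
```

In this sketch a part of speech missing from the palette is rendered without a highlight, so adding a tag type also means adding a color for it.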
Once all the words in a sentence that can be tagged have been tagged, the user can click the submit button and move on to the next sentence.
The above embodiment shows that the part of speech set by the user is determined, the first-type word selected by the user from the sentence is obtained, the sentence is divided into multiple segments according to the selected first-type word and stored, and the selected first-type word is tagged with the part of speech set by the user and displayed. Tagging the words the user selects with the part of speech set by the user improves the efficiency of part-of-speech tagging, and dividing the sentence into multiple segments at the first-type word and storing them preserves the ordering of the data.
Based on the same technical idea, Fig. 7 shows an example structure of a device for text part-of-speech tagging provided by an embodiment of the present invention. The device can execute the flow of text part-of-speech tagging, and may be located in the server 100 shown in Fig. 1 or may be the server 100 itself.
As shown in Fig. 7, the device specifically comprises:
A determination unit 701, configured to determine the part of speech set by the user;
An acquiring unit 702, configured to obtain the first-type word selected by the user from a sentence;
A processing unit 703, configured to divide the sentence into multiple segments according to the selected first-type word and store them, and to tag the selected first-type word with the part of speech set by the user and display it.
Optionally, the processing unit 703 is further configured to:
After the selected first-type word is tagged with the part of speech set by the user and displayed, control the acquiring unit 702 to obtain the part of speech as modified by the user and the second-type word selected by the user;
Divide the segment containing the second-type word into multiple segments according to the second-type word and store them, and tag the second-type word with the part of speech as modified by the user.
Optionally, the processing unit 703 is specifically configured to:
Using the selected first-type word as the dividing line, divide the sentence into multiple segments and store them in order;
Tag the selected first-type word with the part of speech set by the user, and display the tagged part of speech in the sentence.
Optionally, the processing unit 703 is further configured to:
After the selected first-type word is tagged with the part of speech set by the user and displayed, set the words tagged with the part of speech set by the user to the same background color;
Wherein words of different parts of speech have different background colors.
Optionally, the processing unit 703 is further configured to:
Control the acquiring unit 702 to obtain the tagged word that the user clicks;
Change the part of speech of the tagged word to unclassified, and merge it with its adjacent words or segments for storage.
Optionally, the parts of speech include, but are not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier and stop word;
Wherein a word whose part of speech is unclassified is displayed without a part of speech.
Based on the same technical idea, an embodiment of the present invention further provides a computing device, comprising:
A memory, configured to store program instructions;
A processor, configured to call the program instructions stored in the memory and to execute the above method for text part-of-speech tagging according to the obtained program.
Based on the same technical idea, an embodiment of the present invention further provides a computer-readable non-volatile storage medium comprising computer-readable instructions which, when read and executed by a computer, cause the computer to execute the above method for text part-of-speech tagging.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, a person skilled in the art can make additional changes and modifications to these embodiments once the basic inventive concept is known. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, a person skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (10)

1. a kind of method of text part-of-speech tagging characterized by comprising
Determine the part of speech of user setting;
Obtain the first kind word that user chooses from sentence;
The sentence is divided into multiple paragraphs and stored by the first kind word chosen according to described in, and by the first kind chosen The part-of-speech tagging of word is the part of speech of the user setting and is shown.
2. the method as described in claim 1, which is characterized in that be described by the part-of-speech tagging of the first kind word chosen The part of speech of user setting and after being shown, further includes:
The second class word that the part of speech and user for obtaining user's modification are chosen;
The paragraph where the second class word is divided into multiple paragraphs according to the second class word to store, and by described second The part-of-speech tagging of class word is the part of speech of user modification.
3. the method as described in claim 1, which is characterized in that the first kind word chosen described in the foundation divides the sentence It is stored for multiple paragraphs, and part of speech and progress by the part-of-speech tagging of the first kind word chosen for the user setting Display, comprising:
Using the first kind word chosen as cut-off rule, the sentence is divided into multiple paragraphs and is ranked up storage;
By the part-of-speech tagging for the first kind word chosen it is the part of speech of the user setting, and the part of speech of mark is shown in institute's predicate In sentence.
4. the method as described in claim 1, which is characterized in that be described by the part-of-speech tagging of the first kind word chosen The part of speech of user setting and after being shown, further includes:
Identical background colour is set by the word for being labeled as the part of speech of the user setting;
Wherein, the corresponding background colour of the word of different parts of speech is different.
5. method as claimed in claim 4, which is characterized in that the method also includes:
Obtain the word for having marked part of speech that user clicks;
The part of speech of the word for having marked part of speech is revised as unfiled, determines that part of speech is revised as the adjacent word of non-classified word Part of speech whether be unfiled, if so, it is non-classified that the part of speech, which is revised as non-classified word and adjacent part of speech, Word merges storage.
6. The method of any one of claims 1 to 5, characterized in that the part of speech comprises unclassified, verb, noun, pronoun, adjective, numeral, quantifier, or stop word;
Wherein the part of speech is not displayed for unclassified words.
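The categories in claim 6 can be modelled as a small enumeration. The member names below are illustrative English renderings of the eight categories, and `display_label` is a hypothetical helper showing the rule that unclassified words carry no visible label.

```python
from enum import Enum
from typing import Optional


class PartOfSpeech(Enum):
    """Illustrative English names for the eight categories in claim 6."""
    UNCLASSIFIED = "unclassified"
    VERB = "verb"
    NOUN = "noun"
    PRONOUN = "pronoun"
    ADJECTIVE = "adjective"
    NUMERAL = "numeral"
    QUANTIFIER = "quantifier"
    STOP_WORD = "stop word"


def display_label(pos: PartOfSpeech) -> Optional[str]:
    """Return the label to show next to a word, or None for unclassified words."""
    return None if pos is PartOfSpeech.UNCLASSIFIED else pos.value
```

For example, `display_label(PartOfSpeech.VERB)` yields a label, while `display_label(PartOfSpeech.UNCLASSIFIED)` yields `None`, so nothing is drawn.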
7. A device for text part-of-speech tagging, characterized by comprising:
A determination unit, configured to determine the part of speech set by a user;
An acquisition unit, configured to obtain a first-class word selected by the user from a sentence;
A processing unit, configured to divide the sentence into multiple paragraphs according to the selected first-class word and store them, and to tag the selected first-class word with the part of speech set by the user and display it.
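The three units of claim 7 could be sketched as three small classes. All class and method names here are hypothetical, display is omitted, and locating the selection with `str.find` is a simplification: a real interface would presumably report the exact offsets of the user's selection.

```python
from typing import List, Tuple

Segment = Tuple[str, str]  # (text, part-of-speech label)
UNCLASSIFIED = "unclassified"  # hypothetical label for untagged text


class DeterminationUnit:
    """Holds the part of speech the user has currently set."""
    def __init__(self) -> None:
        self._pos = UNCLASSIFIED

    def set_pos(self, pos: str) -> None:
        self._pos = pos

    def current_pos(self) -> str:
        return self._pos


class AcquisitionUnit:
    """Turns the user's selected word into a span within the sentence."""
    def locate(self, sentence: str, word: str) -> Tuple[int, int]:
        start = sentence.find(word) if word else -1  # simplification: first match
        if start < 0:
            raise ValueError("selected word not found in sentence")
        return start, start + len(word)


class ProcessingUnit:
    """Splits the sentence around the selection, tags it, and stores the result."""
    def __init__(self) -> None:
        self.stored: List[Segment] = []

    def process(self, sentence: str, span: Tuple[int, int], pos: str) -> List[Segment]:
        start, end = span
        pieces = [(sentence[:start], UNCLASSIFIED),
                  (sentence[start:end], pos),
                  (sentence[end:], UNCLASSIFIED)]
        self.stored = [piece for piece in pieces if piece[0]]
        return self.stored


determination, acquisition, processing = DeterminationUnit(), AcquisitionUnit(), ProcessingUnit()
determination.set_pos("verb")
span = acquisition.locate("I like apples", "like")
stored = processing.process("I like apples", span, determination.current_pos())
```

Keeping the units separate mirrors the claim's structure; in practice they could just as well be functions in a single module.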
8. The device of claim 7, characterized in that the processing unit is further configured to:
After tagging the selected word with the part of speech set by the user and displaying it, control the acquisition unit to obtain a part of speech modified by the user and a second-class word selected by the user;
Divide the paragraph containing the second-class word into multiple paragraphs according to the second-class word and store them, and tag the second-class word with the part of speech modified by the user.
9. A computing device, characterized by comprising:
A memory, configured to store program instructions;
A processor, configured to call the program instructions stored in the memory and to execute, according to the obtained program, the method of any one of claims 1 to 6.
10. A computer-readable non-volatile storage medium, characterized by comprising computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method of any one of claims 1 to 6.
CN201910817945.5A 2019-08-30 2019-08-30 Text part-of-speech tagging method and device Active CN110532391B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910817945.5A CN110532391B (en) 2019-08-30 2019-08-30 Text part-of-speech tagging method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910817945.5A CN110532391B (en) 2019-08-30 2019-08-30 Text part-of-speech tagging method and device

Publications (2)

Publication Number Publication Date
CN110532391A (en) 2019-12-03
CN110532391B (en) 2022-07-05

Family

ID=68665827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910817945.5A Active CN110532391B (en) 2019-08-30 2019-08-30 Text part-of-speech tagging method and device

Country Status (1)

Country Link
CN (1) CN110532391B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283232A (en) * 2021-05-31 2021-08-20 支付宝(杭州)信息技术有限公司 Method and device for automatically analyzing private information in text

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101539907A (en) * 2008-03-19 2009-09-23 日电(中国)有限公司 Part-of-speech tagging model training device and part-of-speech tagging system and method thereof
US20100138217A1 (en) * 2008-11-28 2010-06-03 Institute For Information Industry Method for constructing chinese dictionary and apparatus and storage media using the same
JP2010250814A (en) * 2009-04-14 2010-11-04 Nec (China) Co Ltd Part-of-speech tagging system, training device and method of part-of-speech tagging model
CN103473220A (en) * 2013-09-13 2013-12-25 华中师范大学 Subtitle-file-based documentary content automatic segmentation and subhead automatic generation method
CN108170674A (en) * 2017-12-27 2018-06-15 东软集团股份有限公司 Part-of-speech tagging method and apparatus, program product and storage medium
CN108197101A (en) * 2017-12-19 2018-06-22 浪潮软件股份有限公司 A kind of corpus labeling method and device
CN108256029A (en) * 2018-01-11 2018-07-06 北京神州泰岳软件股份有限公司 Statistical classification model training apparatus and training method
CN108874937A (en) * 2018-05-31 2018-11-23 南通大学 A kind of sensibility classification method combined based on part of speech with feature selecting
CN109271626A (en) * 2018-08-31 2019-01-25 北京工业大学 Text semantic analysis method
CN109558580A (en) * 2017-09-26 2019-04-02 北京国双科技有限公司 A kind of text analyzing method and device
CN109922155A (en) * 2019-03-18 2019-06-21 众安信息技术服务有限公司 The method and device of intelligent agent is realized in block chain network
CN110110327A (en) * 2019-04-26 2019-08-09 网宿科技股份有限公司 A kind of text marking method and apparatus based on confrontation study

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴潇等: "基于购物领域词典扩建的评论情感研究", 《计算机技术与发展》 *

Also Published As

Publication number Publication date
CN110532391B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
CN108170792B (en) Question and answer guiding method and device based on artificial intelligence and computer equipment
US11194958B2 (en) Fact replacement and style consistency tool
JP2022013586A (en) Method of generating conference minutes, apparatus, electronic device, and computer-readable storage medium
US20120290509A1 (en) Training Statistical Dialog Managers in Spoken Dialog Systems With Web Data
WO2004061593A2 (en) Automated essay scoring
CN108846138B (en) Question classification model construction method, device and medium fusing answer information
CN108509556A (en) Data migration method and device, server, storage medium
US11507743B2 (en) System and method for automatic key phrase extraction rule generation
CN110348020A (en) A kind of English word spelling error correction method, device, equipment and readable storage medium
CN108710695A (en) Mind map generation method based on e-book and electronic equipment
EP2707807A2 (en) Training statistical dialog managers in spoken dialog systems with web data
CN110222194A (en) Data drawing list generation method and relevant apparatus based on natural language processing
CN115048435B (en) Intelligent database storage method and system
CN108829651A (en) A kind of method, apparatus of document treatment, terminal device and storage medium
US10204080B2 (en) Rich formatting for a data label associated with a data point
KR102444362B1 (en) Method, system and non-transitory computer-readable recording medium for supporting writing assessment
US8214736B2 (en) Method and system of identifying textual passages that affect document length
CN110516164A (en) A kind of information recommendation method, device, equipment and storage medium
CN106970758A (en) Electronic document operation processing method, device and electronic equipment
CN110532391A (en) A kind of method and device of text part-of-speech tagging
CN106997340A (en) The generation of dictionary and the Document Classification Method and device using dictionary
CN102707938A (en) Table-form software specification manufacturing and supporting method and device
CN108228779A (en) A kind of result prediction method based on Learning Community's dialogue stream
US20130318104A1 (en) Method and system for analyzing data in artifacts and creating a modifiable data network
CN116226681A (en) Text similarity judging method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant