CN110532391B - Text part-of-speech tagging method and device - Google Patents

Info

Publication number
CN110532391B
Authority
CN
China
Prior art keywords
speech
user
word
words
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910817945.5A
Other languages
Chinese (zh)
Other versions
CN110532391A (en
Inventor
李金锋
杨绳春
洪文龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wangsu Science and Technology Co Ltd
Original Assignee
Wangsu Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wangsu Science and Technology Co Ltd
Priority to CN201910817945.5A
Publication of CN110532391A
Application granted
Publication of CN110532391B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method and a device for text part-of-speech tagging. The method comprises: determining a part of speech set by a user; acquiring a first class of words selected by the user from a sentence; dividing the sentence into a plurality of language segments according to the selected first class of words and storing the segments; and tagging the selected first class of words with the part of speech set by the user and displaying that part of speech. Because the words the user selects from the sentence are tagged with the part of speech the user has set, words sharing the same part of speech can be tagged quickly in one pass, which effectively improves tagging efficiency. Dividing the sentence into language segments according to the first class of words for storage preserves the order of the segments within the sentence, and displaying the part of speech of the selected words makes the tagging visible, so that tagging errors are easy to find.

Description

Text part-of-speech tagging method and device
Technical Field
The embodiment of the invention relates to the technical field of machine learning, in particular to a method and a device for text part-of-speech tagging.
Background
Machine Learning (ML) is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how a computer can simulate or realize human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance.
When a machine is trained, manual assistance in part-of-speech tagging of important texts is often needed to improve the accuracy of language processing. In traditional tools, a sentence is simply presented, and the annotator manually types in the relevant words and tags them. This is inefficient, the tagged words are unordered, and if a word appears twice in succession in a sentence with different parts of speech, the two occurrences cannot be distinguished.
Disclosure of Invention
The embodiment of the invention provides a method and a device for text part-of-speech tagging, which are used for improving the efficiency of part-of-speech tagging.
In a first aspect, an embodiment of the present invention provides a method for text part-of-speech tagging, including:
determining the part of speech set by a user;
acquiring a first class of words selected by the user from a sentence;
and dividing the sentence into a plurality of language segments for storage according to the selected first class of words, marking the part of speech of the selected first class of words as the part of speech set by the user, and displaying the part of speech.
In this technical scheme, the first class of words selected by the user from the sentence is tagged with the part of speech set by the user, so words sharing the same part of speech can be tagged quickly in one pass, which effectively improves tagging efficiency. The sentence is divided into a plurality of language segments according to the first class of words for storage, which preserves the order of the segments within the sentence, and the part of speech of the selected first class of words is displayed, which makes the tagging visible and tagging errors easy to find.
Optionally, after the part of speech of the selected first type of word is marked as the part of speech set by the user and displayed, the method further includes:
acquiring the part of speech modified by the user and a second class of words selected by the user;
and dividing the language segment where the second type word is located into a plurality of language segments according to the second type word for storage, and marking the part of speech of the second type word as the part of speech modified by the user.
According to the technical scheme, the part of speech modified by the user is obtained, and part of speech tagging is performed on the second type of words, so that the part of speech set can be rapidly changed, and the purpose of tagging words with different parts of speech is achieved.
Optionally, the dividing the sentence into a plurality of language segments according to the selected first type of word for storage, and labeling and displaying the part of speech of the selected first type of word as the part of speech set by the user, includes:
dividing the sentence into a plurality of language segments for sequencing and storing by taking the selected first class of words as a dividing line;
and marking the part of speech of the selected first class of words as the part of speech set by the user, and displaying the marked part of speech in the sentence.
In this technical scheme, the first class of words is used as the dividing line and the sentence is divided into a plurality of language segments that are sorted and stored, so that the segments keep their order within the sentence and the accuracy of part-of-speech tagging is improved.
Optionally, after the part of speech of the selected first type word is labeled as the part of speech set by the user and displayed, the method further includes:
setting words marked as the part of speech set by the user as the same background color;
wherein, the background colors corresponding to the words with different parts of speech are different.
In the technical scheme, the background color can be set after the part of speech is labeled so as to distinguish words with different parts of speech.
Optionally, the method further includes:
acquiring words with marked parts of speech clicked by a user;
and changing the part of speech of the clicked labeled word to unclassified, determining whether a word adjacent to it is also unclassified, and if so, merging the word with the adjacent unclassified word and storing the result.
In this technical scheme, a word whose part-of-speech label has been removed is merged with adjacent unclassified words, so that order is preserved.
Optionally, the part of speech includes, but is not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier, or stop word;
wherein no part of speech is displayed for words whose part of speech is unclassified.
In a second aspect, an embodiment of the present invention provides an apparatus for text part-of-speech tagging, including:
a determining unit configured to determine a part of speech set by a user;
an obtaining unit configured to obtain a first class of words selected by the user from a sentence;
and a processing unit configured to divide the sentence into a plurality of language segments for storage according to the selected first class of words, mark the part of speech of the selected first class of words as the part of speech set by the user, and display the part of speech.
Optionally, the processing unit is further configured to:
after the part of speech of the selected first class of words is marked as the part of speech set by the user and displayed, controlling the acquisition unit to acquire the part of speech modified by the user and a second class of words selected by the user;
and dividing the language segment where the second type word is located into a plurality of language segments according to the second type word for storage, and marking the part of speech of the second type word as the part of speech modified by the user.
Optionally, the processing unit is specifically configured to:
dividing the sentence into a plurality of language segments for sequencing and storing by taking the selected first class of words as a dividing line;
and marking the part of speech of the selected first class of words as the part of speech set by the user, and displaying the marked part of speech in the sentence.
Optionally, the processing unit is further configured to:
after the part of speech of the selected first class of words is marked as the part of speech set by the user and displayed, setting the words marked as the part of speech set by the user as the same background color;
wherein, the background colors corresponding to the words with different parts of speech are different.
Optionally, the processing unit is further configured to:
controlling the acquisition unit to acquire words with marked parts of speech clicked by a user;
and changing the part of speech of the clicked labeled word to unclassified, determining whether a word adjacent to it is also unclassified, and if so, merging the word with the adjacent unclassified word and storing the result.
Optionally, the part of speech includes, but is not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier, or stop word;
wherein no part of speech is displayed for words whose part of speech is unclassified.
In a third aspect, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instruction stored in the memory and executing the text part-of-speech tagging method according to the obtained program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable non-volatile storage medium, which includes computer-readable instructions, and when the computer-readable instructions are read and executed by a computer, the computer is caused to perform the above method for text part-of-speech tagging.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a method for part-of-speech tagging of a text according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of part-of-speech tagging of a text according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of part-of-speech tagging of a text according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of part-of-speech tagging of a text according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of part-of-speech tagging of a text according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a device for text part-of-speech tagging according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 illustrates an exemplary system architecture, which may be a server 100, including a processor 110, a communication interface 120, and a memory 130, to which embodiments of the present invention are applicable.
The communication interface 120 is used for communicating with a terminal device, and transceiving information transmitted by the terminal device to implement communication.
The processor 110 is the control center of the server 100. It connects the various parts of the entire server 100 using various interfaces and lines, and performs the functions of the server 100 and processes data by running or executing software programs and/or modules stored in the memory 130 and calling data stored in the memory 130. Optionally, the processor 110 may include one or more processing units.
The memory 130 may be used to store software programs and modules, and the processor 110 executes various functional applications and data processing by operating the software programs and modules stored in the memory 130. The memory 130 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to a business process, etc. Further, the memory 130 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
It should be noted that the structure shown in fig. 1 is only an example, and the embodiment of the present invention is not limited thereto.
Based on the above description, fig. 2 exemplarily shows a flow of a method for text part-of-speech tagging provided by the embodiment of the present invention, where the flow may be performed by a device for text part-of-speech tagging, and the device may be located in the server 100 shown in fig. 1, or may be the server 100.
As shown in fig. 2, the process specifically includes:
Step 201, determining the part of speech set by the user.
Before labeling words in a sentence, the user needs to set the part of speech to be applied. In the embodiment of the present invention, the parts of speech may include, but are not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier, or stop word, and parts of speech may be added or removed according to actual needs in a specific application. No part of speech is displayed for words whose part of speech is unclassified, and all words in a newly loaded sentence are initially unclassified. As shown in FIG. 3, the parts of speech that the user can set include verb, noun, adjective and stop word. After the file is loaded, the part of speech currently set by the user is verb.
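As a minimal sketch of the setting described in step 201, the selectable parts of speech can be modeled as an enumeration in which unclassified words show no label. The names `PartOfSpeech`, `display_label` and `current_type` are hypothetical and chosen for illustration only; the embodiment does not prescribe a data type.

```python
from enum import Enum


class PartOfSpeech(Enum):
    """Parts of speech listed in the embodiment; the set may be extended."""
    UNCLASSIFIED = "unclassified"
    VERB = "verb"
    NOUN = "noun"
    PRONOUN = "pronoun"
    ADJECTIVE = "adjective"
    NUMERAL = "numeral"
    QUANTIFIER = "quantifier"
    STOP_WORD = "stop word"


def display_label(pos: PartOfSpeech) -> str:
    """Return the label shown for a word; unclassified words show none."""
    return "" if pos is PartOfSpeech.UNCLASSIFIED else pos.value


# Hypothetical session state: the part of speech currently set by the user.
# In the FIG. 3 example it is "verb" right after the file is loaded.
current_type = PartOfSpeech.VERB
```

Keeping "unclassified" as an ordinary member of the enumeration lets every stored segment carry a part of speech, so the display rule is the only place that treats it specially.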
Step 202, acquiring a first word selected by a user from a sentence.
When a user needs to label a word, the word must first be selected, typically by dragging the mouse. In a specific implementation, the first class of words selected by the user from the sentence can be captured by a click () function.
Step 203, dividing the sentence into a plurality of language segments for storage according to the selected first class of words, and marking the part of speech of the selected first class of words as the part of speech set by the user and displaying the part of speech.
After the first class of words selected by the user is obtained, the selected words are used as a dividing line: the sentence is divided into a plurality of language segments, which are sorted and stored; the part of speech of the selected words is then marked as the part of speech set by the user, and the marked part of speech is displayed in the sentence. As shown in FIG. 4, the part of speech currently set by the user is verb, which can be saved in a variable Type, so that the current Type value is "verb". Immediately after a whole sentence is loaded, there is a single record whose part of speech is "unclassified"; the initial unclassified sentence can be stored as shown in Table 1. The first class of words selected by the user is "apply for". The record containing "apply for", which has serial number 1, is found first, and "apply for" is then used as a dividing line to divide the text content of serial number 1 in Table 1 into three segments (a segment is not created if the left or right side is empty): "at one time can or not", "apply for", and "multiple VPSs? What is specifically done?" (the segments are given in the word order of the original Chinese sentence). Each segment is assigned a serial number, as shown in Table 2.
TABLE 1
Serial number | Text content | Part of speech
1 | Can multiple VPSs be applied for at one time? What is specifically done? | Unclassified
TABLE 2
Serial number | Text content | Part of speech
1 | At one time can or not | Unclassified
2 | Apply for | Verb
3 | Multiple VPSs? What is specifically done? | Unclassified
In order to label words with different parts of speech better, the parts of speech modified by the user and a second class of words selected by the user can be obtained, then, the word segment where the second class of words is located is divided into a plurality of word segments according to the second class of words for storage, and the parts of speech of the second class of words are labeled as the parts of speech modified by the user.
For example, as shown in FIG. 5, the user changes the part of speech to adjective, so the value of the Type variable becomes "adjective". The second class of words selected by the user is "can or not". From Table 2, the language segment containing "can or not" is determined to be the one with serial number 1. With "can or not" as the dividing line, that segment is divided into "at one time" and "can or not", which are sorted and stored in the order in which they appear in the original sentence, and the part of speech of "can or not" is labeled as adjective, as shown in Table 3.
TABLE 3
Serial number | Text content | Part of speech
1 | At one time | Unclassified
2 | Can or not | Adjective
3 | Apply for | Verb
4 | Multiple VPSs? What is specifically done? | Unclassified
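The two splits walked through in Tables 1 to 3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: `Segment`, `label_selection` and `select` are hypothetical names, the selection is given as character offsets inside one segment, the sample sentence is an English gloss in the word order of the example, and surrounding whitespace is trimmed because the gloss separates words with spaces.

```python
from dataclasses import dataclass

UNCLASSIFIED = "unclassified"


@dataclass
class Segment:
    text: str
    pos: str = UNCLASSIFIED


def label_selection(segments, index, start, end, pos):
    """Split segments[index] around text[start:end] and label the middle part.

    Returns a new ordered list; a serial number is a list position plus one.
    Empty (or whitespace-only) left and right remainders are not stored.
    """
    target = segments[index]
    if target.pos != UNCLASSIFIED:
        raise ValueError("only unclassified text can be selected")
    if not 0 <= start < end <= len(target.text):
        raise ValueError("selection must lie inside the segment")
    left = target.text[:start].strip()
    word = target.text[start:end].strip()
    right = target.text[end:].strip()
    if not word:
        raise ValueError("selection is empty")
    pieces = []
    if left:
        pieces.append(Segment(left))
    pieces.append(Segment(word, pos))
    if right:
        pieces.append(Segment(right))
    return segments[:index] + pieces + segments[index + 1:]


def select(segments, index, word):
    """Stand-in for a mouse selection: locate `word` by text search."""
    start = segments[index].text.index(word)
    return start, start + len(word)


# Table 1: one unclassified record holding the whole sentence (glossed).
table = [Segment("At one time can or not apply for multiple VPSs? "
                 "What is specifically done?")]

current_type = "verb"  # the Type variable of the description
table = label_selection(table, 0, *select(table, 0, "apply for"), current_type)
# table now holds the three segments of Table 2

current_type = "adjective"  # the user changes the part of speech
table = label_selection(table, 0, *select(table, 0, "can or not"), current_type)
for serial, seg in enumerate(table, start=1):
    print(serial, seg.text, seg.pos)
```

Because the selection is addressed by segment and offset rather than by the word's text, two occurrences of the same word in a sentence can carry different parts of speech, and the list order always matches the order of the sentence.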
Further, the label of a word whose part of speech has been marked can be removed. Specifically, the labeled word clicked by the user is obtained, its part of speech is changed to unclassified, and it is merged with adjacent unclassified words or segments and stored.
It should be noted that, in a sentence, only text whose part of speech is unclassified can be selected, and words that already carry a part-of-speech label cannot be selected. Therefore, to remove the label of a labeled word, the user only needs to click anywhere within that word, and the whole word is obtained.
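One way to realize this rule is to map a click or selection position back to the stored segment that covers it. The sketch below is an assumption for illustration only: `segment_at` and `can_select` are hypothetical helpers, segments are plain (text, part of speech) pairs, and the glossed texts keep their trailing spaces so that they concatenate exactly to the displayed sentence.

```python
def segment_at(segments, offset):
    """Return the index of the segment covering character `offset`.

    `segments` is an ordered list of (text, part_of_speech) pairs whose texts
    concatenate to the sentence; `offset` is a position in that sentence.
    """
    position = 0
    for index, (text, _pos) in enumerate(segments):
        if position <= offset < position + len(text):
            return index
        position += len(text)
    raise IndexError("offset is outside the sentence")


def can_select(segments, start, end):
    """Allow a selection only inside a single unclassified segment."""
    first = segment_at(segments, start)
    last = segment_at(segments, end - 1)
    return first == last and segments[first][1] == "unclassified"


# Glossed example state corresponding to Table 3 (trailing spaces kept).
segments = [
    ("At one time ", "unclassified"),
    ("can or not ", "adjective"),
    ("apply for ", "verb"),
    ("multiple VPSs?", "unclassified"),
]

clicked = segment_at(segments, 15)  # a click anywhere inside "can or not"
print(segments[clicked])
```

A click on any character of a labeled word therefore resolves to the whole word, while a drag that crosses a segment boundary or starts inside a labeled word is rejected.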
For example, to cancel the label of "can or not", the user clicks anywhere in the region where "can or not" is located, and the clicked labeled word "can or not" is obtained. Its part of speech is changed to unclassified, and the adjacent segments are then checked: if an adjacent segment is also unclassified, the two can be merged. Table 3 shows that the segment with serial number 1 is unclassified, so the two are merged, and the list shown in Table 2 is finally obtained.
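The label removal and merge can be sketched as below, under the assumption that segments are kept in an ordered list. `Segment` and `remove_label` are hypothetical names, and the `joiner` argument is an addition for the English glosses used here; text written without spaces between words would be concatenated directly.

```python
from dataclasses import dataclass

UNCLASSIFIED = "unclassified"


@dataclass
class Segment:
    text: str
    pos: str = UNCLASSIFIED


def remove_label(segments, index, joiner=""):
    """Make segments[index] unclassified and merge it with unclassified neighbours.

    `joiner` is placed between merged texts: "" suits text without spaces
    between words, " " suits the English glosses used in this example.
    """
    if segments[index].pos == UNCLASSIFIED:
        raise ValueError("segment carries no label to remove")
    start = end = index
    if index > 0 and segments[index - 1].pos == UNCLASSIFIED:
        start = index - 1
    if index + 1 < len(segments) and segments[index + 1].pos == UNCLASSIFIED:
        end = index + 1
    merged = Segment(joiner.join(s.text for s in segments[start:end + 1]))
    return segments[:start] + [merged] + segments[end + 1:]


# Glossed state of Table 3.
table3 = [
    Segment("At one time"),
    Segment("can or not", "adjective"),
    Segment("apply for", "verb"),
    Segment("multiple VPSs? What is specifically done?"),
]

# Removing the label of "can or not" merges it with segment 1 (Table 2 again).
table2 = remove_label(table3, 1, joiner=" ")
for serial, seg in enumerate(table2, start=1):
    print(serial, seg.text, seg.pos)
```

Checking both neighbours means that removing the last remaining label collapses the list back to a single unclassified record, as in Table 1.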
Optionally, after each word is labeled, words labeled as parts of speech set by the user may also be set to have the same background color, where the background colors corresponding to words of different parts of speech are different, specifically as shown in fig. 6, and it can be seen from fig. 6 that the background colors corresponding to different parts of speech are different.
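A minimal sketch of this display rule is shown below. The HTML output, the `render` function and the color values are assumptions for illustration; the embodiment only requires that words with the same part of speech share a background color and that different parts of speech use different colors.

```python
import html

# Hypothetical palette: one distinct background color per part of speech.
BACKGROUND = {
    "verb": "#ffd54f",
    "noun": "#81d4fa",
    "adjective": "#a5d6a7",
    "stop word": "#ef9a9a",
}


def render(segments):
    """Render (text, part_of_speech) pairs as HTML; unclassified text is plain."""
    parts = []
    for text, pos in segments:
        escaped = html.escape(text)
        if pos == "unclassified":
            parts.append(escaped)
        else:
            parts.append(
                f'<span style="background:{BACKGROUND[pos]}" title="{pos}">{escaped}</span>'
            )
    return " ".join(parts)


print(render([("At one time", "unclassified"), ("apply for", "verb")]))
```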
When all words which can be labeled in one sentence are labeled, the submit button can be clicked to label the next sentence.
The embodiment shows that the part of speech set by the user is determined, the first class of words selected by the user from the sentence is obtained, the sentence is divided into a plurality of language sections according to the selected first class of words for storage, and the part of speech of the selected first class of words is marked as the part of speech set by the user and displayed. The part-of-speech tagging is carried out on the first class of words selected from the sentences by the user according to the part-of-speech set by the user, the part-of-speech tagging efficiency can be effectively improved, the sentences are divided into a plurality of language sections according to the first class of words to be stored, and the orderliness of data can be kept.
Based on the same technical concept, fig. 7 exemplarily shows a structure of an apparatus for text part-of-speech tagging provided by an embodiment of the present invention, which may execute a process of text part-of-speech tagging, and the apparatus may be located in the server 100 shown in fig. 1, or the server 100.
As shown in fig. 7, the apparatus specifically includes:
a determining unit 701 configured to determine a part of speech set by a user;
an obtaining unit 702, configured to obtain a first type of word selected from a sentence by a user;
the processing unit 703 is configured to divide the sentence into a plurality of language segments according to the selected first-class word, store the language segments, label the part of speech of the selected first-class word as the part of speech set by the user, and display the part of speech.
Optionally, the processing unit 703 is further configured to:
after the part of speech of the selected first class of words is marked as the part of speech set by the user and displayed, controlling the obtaining unit 702 to obtain the part of speech modified by the user and a second class of words selected by the user;
and dividing the language segment where the second type word is located into a plurality of language segments according to the second type word for storage, and marking the part of speech of the second type word as the part of speech modified by the user.
Optionally, the processing unit 703 is specifically configured to:
dividing the sentence into a plurality of language segments for sequencing and storing by taking the selected first class of words as a dividing line;
and marking the part of speech of the selected first class of words as the part of speech set by the user, and displaying the marked part of speech in the sentence.
Optionally, the processing unit 703 is further configured to:
after the part of speech of the selected first class of words is marked as the part of speech set by the user and displayed, setting the words marked as the part of speech set by the user as the same background color;
wherein, the background colors corresponding to the words with different parts of speech are different.
Optionally, the processing unit 703 is further configured to:
controlling the obtaining unit 702 to obtain a labeled word clicked by the user;
and changing the part of speech of the clicked labeled word to unclassified, and merging and storing it with adjacent unclassified words or segments.
Optionally, the part of speech includes, but is not limited to, unclassified, verb, noun, pronoun, adjective, numeral, quantifier, or stop word;
wherein no part of speech is displayed for words whose part of speech is unclassified.
Based on the same technical concept, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instruction stored in the memory and executing the text part-of-speech tagging method according to the obtained program.
Based on the same technical concept, the embodiment of the invention also provides a computer-readable non-volatile storage medium, which comprises computer-readable instructions, and when the computer-readable instructions are read and executed by a computer, the computer is enabled to execute the method for text part-of-speech tagging.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A method for text part-of-speech tagging, comprising:
determining the part of speech set by a user;
acquiring a first class of words selected by the user from a sentence;
dividing the sentence into a plurality of language segments for storage according to the selected first class of words, marking the part of speech of the selected first class of words as the part of speech set by the user and displaying the part of speech;
after the part of speech of the selected first kind of word is marked as the part of speech set by the user and displayed, the method further comprises the following steps:
acquiring the part of speech modified by the user and a second class of words selected by the user; wherein the part of speech modified by the user is the part of speech obtained after the user changes the previously set part of speech;
and dividing the language segment where the second type word is located into a plurality of language segments according to the second type word for storage, and marking the part of speech of the second type word as the part of speech modified by the user.
2. The method of claim 1, wherein the dividing the sentence into a plurality of language segments for storage according to the selected first type of word, and labeling and displaying the part of speech of the selected first type of word as the part of speech set by the user comprises:
dividing the sentence into a plurality of language segments by taking the selected first class of words as a segmentation line, and sequencing and storing the language segments;
and marking the part of speech of the selected first class of words as the part of speech set by the user, and displaying the marked part of speech in the sentence.
3. The method of claim 1, wherein after the part of speech of the selected first class of words is marked as the part of speech set by the user and displayed, the method further comprises:
setting the words marked with the part of speech set by the user to the same background color;
wherein words with different parts of speech have different background colors.
4. The method of claim 3, wherein the method further comprises:
acquiring a word with a marked part of speech that is clicked by the user; and
changing the part of speech of the clicked word to unclassified, determining whether a word adjacent to the clicked word is also unclassified, and if so, merging the clicked word with the adjacent unclassified word and storing them together.
5. The method of any one of claims 1 to 4, wherein the part of speech comprises unclassified, verb, noun, pronoun, adjective, numeral, quantifier, or stop word;
wherein no part of speech is displayed for words whose part of speech is unclassified.
6. An apparatus for part-of-speech tagging of text, comprising:
a determining unit configured to determine a part of speech set by a user;
an obtaining unit configured to obtain a first class of words selected by the user from a sentence;
a processing unit configured to divide the sentence into a plurality of language segments for storage according to the selected first class of words, mark the part of speech of the selected first class of words as the part of speech set by the user, and display the marked part of speech;
wherein the processing unit is further configured to:
after the part of speech of the selected first class of words is marked as the part of speech set by the user and displayed, control the obtaining unit to obtain a part of speech modified by the user and a second class of words selected by the user, wherein the part of speech modified by the user is obtained by the user changing the previously set part of speech; and
divide the language segment in which the second class of words is located into a plurality of language segments according to the second class of words for storage, and mark the part of speech of the second class of words as the part of speech modified by the user.
7. A computing device, comprising:
a memory for storing program instructions;
a processor configured to call the program instructions stored in the memory and to execute the method of any one of claims 1 to 5 in accordance with the obtained program instructions.
8. A computer-readable non-transitory storage medium including computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method of any one of claims 1 to 5.
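
For illustration only, and not as part of the claims, the segment splitting recited in claims 1 and 2 and the merging recited in claim 4 might be sketched in Python as follows. The names `Segment`, `tag`, `untag`, `render`, and `background_color`, the color values, and the English example sentence are all assumptions introduced for this sketch; the claims do not prescribe any particular data structure or implementation.

```python
from dataclasses import dataclass

UNCLASSIFIED = "unclassified"

# Hypothetical palette: one background color per part of speech (claim 3).
COLORS = {"noun": "#ffd54f", "verb": "#81d4fa"}


@dataclass
class Segment:
    """One stored language segment and its part-of-speech label."""
    text: str
    pos: str = UNCLASSIFIED


def tag(segments, word, pos):
    """Split each unclassified segment around `word` and label the word as `pos`."""
    if not word:
        raise ValueError("word must be non-empty")
    result = []
    for seg in segments:
        # Already-labeled segments and segments without the word stay as they are.
        if seg.pos != UNCLASSIFIED or word not in seg.text:
            result.append(seg)
            continue
        pieces = seg.text.split(word)
        for i, piece in enumerate(pieces):
            if piece:  # skip empty pieces at the edges
                result.append(Segment(piece))
            if i < len(pieces) - 1:  # the selected word sits between pieces
                result.append(Segment(word, pos))
    return result


def untag(segments, index):
    """Reset one segment to unclassified and merge it with unclassified neighbors."""
    merged = [Segment(s.text, s.pos) for s in segments]
    merged[index].pos = UNCLASSIFIED
    # Merge the right neighbor first so that `index` remains valid.
    if index + 1 < len(merged) and merged[index + 1].pos == UNCLASSIFIED:
        merged[index].text += merged.pop(index + 1).text
    if index > 0 and merged[index - 1].pos == UNCLASSIFIED:
        merged[index - 1].text += merged.pop(index).text
    return merged


def background_color(segment):
    """Return the color for a labeled segment, or None for unclassified text."""
    return COLORS.get(segment.pos)


def render(segments):
    """Show labels inline; unclassified text is shown without a label (claim 5)."""
    return "".join(
        s.text if s.pos == UNCLASSIFIED else f"[{s.text}/{s.pos}]"
        for s in segments
    )


# Usage: set "noun" and select "cat", then change to "verb" and select "sat".
segments = tag([Segment("the cat sat on the mat")], "cat", "noun")
segments = tag(segments, "sat", "verb")
print(render(segments))
# Clicking the labeled word "cat" resets it and merges it with its neighbors.
print(render(untag(segments, 1)))
```

Keeping the segments in an ordered list means the original sentence can always be recovered by concatenating the segment texts, which is one plausible reading of storing the language segments in order.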
CN201910817945.5A 2019-08-30 2019-08-30 Text part-of-speech tagging method and device Active CN110532391B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910817945.5A CN110532391B (en) 2019-08-30 2019-08-30 Text part-of-speech tagging method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910817945.5A CN110532391B (en) 2019-08-30 2019-08-30 Text part-of-speech tagging method and device

Publications (2)

Publication Number Publication Date
CN110532391A CN110532391A (en) 2019-12-03
CN110532391B CN110532391B (en) 2022-07-05

Family

ID=68665827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910817945.5A Active CN110532391B (en) 2019-08-30 2019-08-30 Text part-of-speech tagging method and device

Country Status (1)

Country Link
CN (1) CN110532391B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283232A (en) * 2021-05-31 2021-08-20 支付宝(杭州)信息技术有限公司 Method and device for automatically analyzing private information in text

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101539907B (en) * 2008-03-19 2013-01-23 日电(中国)有限公司 Part-of-speech tagging model training device and part-of-speech tagging system and method thereof
TWI403911B (en) * 2008-11-28 2013-08-01 Inst Information Industry Chinese dictionary constructing apparatus and methods, and storage media
CN101866337B (en) * 2009-04-14 2014-07-02 日电(中国)有限公司 Part-of-speech tagging system, and device and method thereof for training part-of-speech tagging model
CN103473220B (en) * 2013-09-13 2016-05-18 华中师范大学 The automatic merogenesis of documentary film content based on subtitle file and the automatic generation method of subhead thereof
CN109558580B (en) * 2017-09-26 2023-01-17 北京国双科技有限公司 Text analysis method and device
CN108197101B (en) * 2017-12-19 2021-09-14 浪潮软件股份有限公司 Corpus labeling method and apparatus
CN108170674A (en) * 2017-12-27 2018-06-15 东软集团股份有限公司 Part-of-speech tagging method and apparatus, program product and storage medium
CN108256029B (en) * 2018-01-11 2021-05-28 鼎富智能科技有限公司 Statistical classification model training device and training method
CN108874937B (en) * 2018-05-31 2022-05-20 南通大学 Emotion classification method based on part of speech combination and feature selection
CN109271626B (en) * 2018-08-31 2023-09-26 北京工业大学 Text semantic analysis method
CN109922155B (en) * 2019-03-18 2022-03-04 众安信息技术服务有限公司 Method and device for realizing intelligent agent in block chain network
CN110110327B (en) * 2019-04-26 2021-06-22 网宿科技股份有限公司 Text labeling method and equipment based on counterstudy

Also Published As

Publication number Publication date
CN110532391A (en) 2019-12-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant