CN110110328A - Text handling method and device - Google Patents
- Publication number
- CN110110328A (application CN201910346113.XA)
- Authority
- CN
- China
- Prior art keywords
- word
- text
- word frequency
- short
- destination document
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a text processing method and device. The method includes: obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document; counting the word frequency of each word in the target document and the total word frequency of all words in the target document; and calculating the word weight of each word from the word frequency and the total word frequency. The application addresses the technical problem that short texts are processed poorly, and allows the key words in short texts to be identified more reliably. The application is also suited to natural-language text-processing scenarios.
Description
Technical field
This application relates to the field of text processing, and in particular to a text processing method and device.
Background
In natural language processing, short texts are characterized by short sentences and a small vocabulary.
The inventors found that short texts are processed poorly. In particular, the key words in a short text cannot be identified.
No effective solution has yet been proposed for the poor processing of short texts in the related art.
Summary of the invention
The main purpose of this application is to provide a text processing method and device that solve the problem of poor short-text processing.
To achieve this purpose, according to one aspect of the application, a text processing method is provided.
The text processing method according to the application includes: obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document; counting the word frequency of each word in the target document and the total word frequency of all words in the target document; and calculating the word weight of each word from the word frequency and the total word frequency.
Further, the method is used to handle the weights of words that occur frequently in short texts but carry no meaning.
Further, obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document includes: obtaining a short-text corpus, arranging the short texts one per line, and treating all the short texts as one target document.
Further, counting the word frequency of each word in the target document and the total word frequency of all words in the target document includes: counting the word frequency WF of each word in the target document, and counting the total word frequency DF of all words in the target document. Calculating the word weight of each word from the word frequency and the total word frequency includes: calculating the word weight WW = ln(DF/WF).
Further, the frequent but meaningless words handled in short texts include one or more of the following: modal particles, auxiliary words, and pronouns.
To achieve the above purpose, according to another aspect of the application, a text processing device is provided.
The text processing device according to the application includes: an acquisition module, for obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document; a statistics module, for counting the word frequency of each word in the target document and the total word frequency of all words in the target document; and a computation module, for calculating the word weight of each word from the word frequency and the total word frequency.
Further, the device is used to handle the weights of words that occur frequently in short texts but carry no meaning.
Further, the acquisition module is used to obtain a short-text corpus, arrange the short texts one per line, and treat all the short texts as one target document.
Further, the statistics module is used to count the word frequency WF of each word in the target document and to count the total word frequency DF of all words in the target document. Calculating the word weight of each word from the word frequency and the total word frequency includes: calculating the word weight WW = ln(DF/WF).
Further, the frequent but meaningless words handled in short texts include one or more of the following: modal particles, auxiliary words, and pronouns.
In the text processing method and device of the embodiments of this application, a short-text corpus is obtained, each short text is arranged according to a preset format, and all the short texts are treated as one target document. By counting the word frequency of each word in the target document and the total word frequency of all words in the target document, the word weight of each word is calculated from the word frequency and the total word frequency. This achieves the technical effect of better identifying the key words in short texts, and thereby solves the technical problem of poor short-text processing.
Brief description of the drawings
The accompanying drawings, which form a part of this application, provide a further understanding of the application and make its other features, objects, and advantages more apparent. The drawings of the illustrative embodiments and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flow diagram of the text processing method according to one embodiment of the application;
Fig. 2 is a flow diagram of the text processing method according to another embodiment of the application;
Fig. 3 is a structural diagram of the text processing device according to an embodiment of the application.
Detailed description
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of this application without creative effort fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and not to describe a particular order or sequence. Data so used are interchangeable where appropriate, so that the embodiments of the application described herein can be practiced. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
In this application, the orientations or positional relationships indicated by terms such as "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "transverse", and "longitudinal" are based on the orientations or positional relationships shown in the drawings. These terms are used mainly to better describe the application and its embodiments; they are not intended to require that the indicated device, element, or component have a particular orientation or be constructed and operated in a particular orientation.
Moreover, besides indicating an orientation or positional relationship, some of these terms may carry other meanings; for example, the term "upper" may in some cases indicate a dependency or connection relationship. A person of ordinary skill in the art can understand the specific meaning of these terms in this application according to the circumstances.
In addition, the terms "install", "arrange", "provide with", "connect", "connected", and "socket" should be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediary, or an internal connection between two devices, elements, or components. A person of ordinary skill in the art can understand the specific meaning of these terms in this application according to the circumstances.
It should be noted that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
As shown in Fig. 1, the method includes the following steps S102 to S106:
Step S102: obtain a short-text corpus, arrange each short text according to a preset format, and treat all the short texts as one target document.
The short-text corpus is acquired as the text input. It can be collected in advance.
"According to a preset format" means that each short text in the corpus is arranged according to the configured format.
At the same time, all the short texts are treated as one target document.
It should be noted that when all the short texts are treated as one target document, the short texts are not processed individually; instead, they are processed together as a single text.
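Step S102 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `build_target_document` is hypothetical, and the one-short-text-per-line layout is the format used in the preferred embodiment rather than the only possible preset format.

```python
def build_target_document(short_texts):
    """Merge short texts into one target document, one short text per line."""
    lines = []
    for text in short_texts:
        # Collapse internal line breaks and repeated whitespace so that
        # each short text occupies exactly one line of the document.
        line = " ".join(text.split())
        if line:  # skip empty entries
            lines.append(line)
    return "\n".join(lines)


# Illustrative corpus (made up for this sketch).
corpus = ["the weather is nice", "the  train is\nlate", ""]
target_document = build_target_document(corpus)
print(target_document)
```

From this point on the corpus is handled as a single text, so no per-document bookkeeping is needed in the later steps.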
Step S104: count the word frequency of each word in the target document and the total word frequency of all words in the target document.
Both the word frequency of each word and the total word frequency of all words are counted over the target document.
It should be noted that the embodiments of this application do not restrict the specific method used to count the word frequency of each word in the target document, as long as it meets the word-frequency counting requirement.
Likewise, the embodiments do not restrict the specific method used to count the total word frequency of all words in the target document, as long as it meets the total-count requirement.
Step S106: calculate the word weight of each word from the word frequency and the total word frequency.
The word weight of each word is calculated from its word frequency and the total word frequency. The resulting weight is used as the weight of the keyword in the short texts.
It can be seen from the above description that the application achieves the following technical effect:
In the embodiments of this application, a short-text corpus is obtained, each short text is arranged according to a preset format, and all the short texts are treated as one target document. By counting the word frequency of each word in the target document and the total word frequency of all words in the target document, the word weight of each word is calculated from the word frequency and the total word frequency. This achieves the technical effect of better identifying the key words in short texts, and thereby solves the technical problem of poor short-text processing.
According to an embodiment of this application, preferably, the method is used to handle the weights of words that occur frequently in short texts but carry no meaning. The embodiments do not use the concept of a document count; taking the natural logarithm of the ratio of the total word frequency to the word frequency effectively solves the problem of frequent but meaningless words receiving relatively high weights.
According to an embodiment of this application, preferably, obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document includes: obtaining a short-text corpus, arranging the short texts one per line, and treating all the short texts as one target document. Specifically, the acquired short-text corpus is merged into one document with one short text per line. Word segmentation is performed afterwards.
According to an embodiment of this application, preferably, counting the word frequency of each word in the target document and the total word frequency of all words in the target document includes:
Step S202: count the word frequency WF of each word in the target document;
Step S204: count the total word frequency DF of all words in the target document;
Step S206: calculating the word weight of each word from the word frequency and the total word frequency includes calculating the word weight WW = ln(DF/WF).
Specifically, the word frequency WF of each word and the total word frequency DF of all words in the document are counted, and the word weight is WW = ln(DF/WF). That is, the word weight is calculated by taking the natural logarithm of the ratio of the total word frequency to the word frequency.
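The formula WW = ln(DF/WF) can be written as a small Python function. This is a minimal sketch: the name `word_weights`, the dictionary input, and the input checks are assumptions added for illustration, and the counts are made-up sample values.

```python
import math


def word_weights(wf, df):
    """Compute WW = ln(DF / WF) for every word in the wf mapping."""
    if df <= 0:
        raise ValueError("DF must be positive")
    weights = {}
    for word, count in wf.items():
        # A word cannot occur zero times or more often than all words together.
        if count <= 0 or count > df:
            raise ValueError(f"invalid count for {word!r}: {count}")
        weights[word] = math.log(df / count)
    return weights


# Sample counts: WF per word; DF is their sum (8).
wf = {"the": 2, "is": 2, "weather": 1, "nice": 1, "train": 1, "late": 1}
df = sum(wf.values())
weights = word_weights(wf, df)
print(weights["the"], weights["nice"])
```

Because WF never exceeds DF, every weight is non-negative, and a more frequent word always receives a smaller weight than a rarer one.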
According to an embodiment of this application, preferably, the frequent but meaningless words handled in short texts include one or more of the following: modal particles, auxiliary words, and pronouns.
Specifically, because short texts have short sentences and a small vocabulary, a word may occur only once in a given sentence, so traditional word statistics designed for long texts struggle to find which word or words are the key words. This application builds on the idea of TF-IDF and adapts the algorithm so that it suits word-weight processing for short texts. In the embodiments, all the collected short texts are regarded as one document, which discards the document-count concept of conventional TF-IDF and removes the TF computation, so the amount of calculation is smaller. In TF-IDF, IDF is the inverse document frequency, a value computed from the total number of documents and the number of documents in which a given word appears. This method has no concept of a document count; it takes the natural logarithm of the ratio of the total word frequency to the word frequency. For example, when only one word, " ", appears in the document, its word frequency WF equals the total word frequency DF of all words, so WW = ln(DF/WF) = ln 1 = 0.
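The single-word edge case, and the way a frequent filler word ends up with the lowest weight, can be checked with a short Python sketch. The helper `weights_for`, the filler word "um", and the sample texts are all hypothetical; whitespace splitting again stands in for word segmentation.

```python
import math
from collections import Counter


def weights_for(short_texts):
    """Treat all short texts as one document and return WW = ln(DF / WF)."""
    tokens = [tok for text in short_texts for tok in text.split()]
    wf = Counter(tokens)
    df = len(tokens)  # total word frequency of all words
    return {word: math.log(df / count) for word, count in wf.items()}


# Edge case from the text: only one distinct word occurs, so WF == DF.
single = weights_for(["um", "um um"])
print(single)  # {'um': 0.0}

# A filler word repeated across short texts gets the smallest weight.
mixed = weights_for(["um book flight", "um cancel order", "um um refund"])
lowest = min(mixed, key=mixed.get)
print(lowest)  # um
```

In the second corpus "um" accounts for 4 of the 9 tokens, so its weight is ln(9/4), while every word that occurs once receives ln 9.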
It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
According to an embodiment of this application, a text processing device for implementing the above method is also provided. As shown in Fig. 3, the device includes: an acquisition module 10, for obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document; a statistics module 20, for counting the word frequency of each word in the target document and the total word frequency of all words in the target document; and a computation module 30, for calculating the word weight of each word from the word frequency and the total word frequency.
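One way to picture the three-module structure is as three small Python classes chained by a wrapper. The class and method names are hypothetical and chosen only to mirror modules 10, 20, and 30; whitespace splitting is an assumed stand-in for word segmentation, and nothing here is prescribed by the application itself.

```python
import math
from collections import Counter


class AcquisitionModule:
    """Module 10: arrange short texts one per line as one target document."""

    def run(self, short_texts):
        return "\n".join(" ".join(t.split()) for t in short_texts if t.strip())


class StatisticsModule:
    """Module 20: count WF for each word and DF over all words."""

    def run(self, target_document):
        wf = Counter(target_document.split())
        return wf, sum(wf.values())


class ComputationModule:
    """Module 30: compute WW = ln(DF / WF) for each word."""

    def run(self, wf, df):
        return {word: math.log(df / count) for word, count in wf.items()}


class TextProcessingDevice:
    """Chains the three modules in the order shown in Fig. 3."""

    def __init__(self):
        self.acquisition = AcquisitionModule()
        self.statistics = StatisticsModule()
        self.computation = ComputationModule()

    def process(self, short_texts):
        document = self.acquisition.run(short_texts)
        wf, df = self.statistics.run(document)
        return self.computation.run(wf, df)


device = TextProcessingDevice()
result = device.process(["ok thanks", "ok bye"])
print(result)
```

Keeping the modules separate means any one of them, for example the tokenization inside the statistics step, can be replaced without touching the others.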
In the acquisition module 10 of the embodiment, the short-text corpus is acquired as the text input. It can be collected in advance.
"According to a preset format" means that each short text in the corpus is arranged according to the configured format.
At the same time, all the short texts are treated as one target document.
It should be noted that when all the short texts are treated as one target document, the short texts are not processed individually; instead, they are processed together as a single text.
In the statistics module 20 of the embodiment, the word frequency of each word and the total word frequency of all words are counted over the target document.
It should be noted that the embodiments of this application do not restrict the specific method used to count the word frequency of each word in the target document, as long as it meets the word-frequency counting requirement.
Likewise, the embodiments do not restrict the specific method used to count the total word frequency of all words in the target document, as long as it meets the total-count requirement.
In the computation module 30 of the embodiment, the word weight of each word is calculated from its word frequency and the total word frequency. The resulting weight is used as the weight of the keyword in the short texts.
According to an embodiment of this application, preferably, the text processing device is used to handle the weights of words that occur frequently in short texts but carry no meaning. The embodiments do not use the concept of a document count; taking the natural logarithm of the ratio of the total word frequency to the word frequency effectively solves the problem of frequent but meaningless words receiving relatively high weights.
According to an embodiment of this application, preferably, the acquisition module 10 is used to obtain a short-text corpus, arrange the short texts one per line, and treat all the short texts as one target document. Specifically, the acquired short-text corpus is merged into one document with one short text per line. Word segmentation is performed afterwards.
According to an embodiment of this application, preferably, the statistics module is used to:
count the word frequency WF of each word in the target document;
count the total word frequency DF of all words in the target document.
Calculating the word weight of each word from the word frequency and the total word frequency includes:
calculating the word weight WW = ln(DF/WF).
Specifically, the word frequency WF of each word and the total word frequency DF of all words in the document are counted, and the word weight is WW = ln(DF/WF). That is, the word weight is calculated by taking the natural logarithm of the ratio of the total word frequency to the word frequency.
According to an embodiment of this application, preferably, the frequent but meaningless words that the text processing device handles in short texts include one or more of the following: modal particles, auxiliary words, and pronouns. Specifically, because short texts have short sentences and a small vocabulary, a word may occur only once in a given sentence, so traditional word statistics designed for long texts struggle to find which word or words are the key words. This application builds on the idea of TF-IDF and adapts the algorithm so that it suits word-weight processing for short texts. In the embodiments, all the collected short texts are regarded as one document, which discards the document-count concept of conventional TF-IDF and removes the TF computation, so the amount of calculation is smaller. In TF-IDF, IDF is the inverse document frequency, a value computed from the total number of documents and the number of documents in which a given word appears. This method has no concept of a document count; it takes the natural logarithm of the ratio of the total word frequency to the word frequency. For example, when only one word, " ", appears in the document, its word frequency WF equals the total word frequency DF of all words, so WW = ln(DF/WF) = ln 1 = 0.
Obviously, those skilled in the art should understand that the modules or steps of this application described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be fabricated as a separate integrated-circuit module, or several of the modules or steps can be fabricated as a single integrated-circuit module. The application is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of this application and are not intended to limit it. For those skilled in the art, various modifications and changes may be made to this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.
Claims (10)
1. A text processing method, characterized by comprising:
obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document;
counting the word frequency of each word in the target document and the total word frequency of all words in the target document; and
calculating the word weight of each word from the word frequency and the total word frequency.
2. The text processing method according to claim 1, characterized in that it is used to handle the weights of words that occur frequently in short texts but carry no meaning.
3. The text processing method according to claim 1, characterized in that obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document comprises:
obtaining a short-text corpus, arranging the short texts one per line, and treating all the short texts as one target document.
4. The text processing method according to claim 1, characterized in that counting the word frequency of each word in the target document and the total word frequency of all words in the target document comprises:
counting the word frequency WF of each word in the target document;
counting the total word frequency DF of all words in the target document;
and calculating the word weight of each word from the word frequency and the total word frequency comprises:
calculating the word weight WW = ln(DF/WF).
5. The text processing method according to claim 1, characterized in that the frequent but meaningless words handled in short texts include one or more of the following: modal particles, auxiliary words, and pronouns.
6. A text processing device, characterized by comprising:
an acquisition module, for obtaining a short-text corpus, arranging each short text according to a preset format, and treating all the short texts as one target document;
a statistics module, for counting the word frequency of each word in the target document and the total word frequency of all words in the target document; and
a computation module, for calculating the word weight of each word from the word frequency and the total word frequency.
7. The text processing device according to claim 6, characterized in that it is used to handle the weights of words that occur frequently in short texts but carry no meaning.
8. The text processing device according to claim 6, characterized in that the acquisition module is used to obtain a short-text corpus, arrange the short texts one per line, and treat all the short texts as one target document.
9. The text processing device according to claim 6, characterized in that the statistics module is used to:
count the word frequency WF of each word in the target document;
count the total word frequency DF of all words in the target document;
and calculating the word weight of each word from the word frequency and the total word frequency comprises:
calculating the word weight WW = ln(DF/WF).
10. The text processing device according to claim 6, characterized in that the frequent but meaningless words handled in short texts include one or more of the following: modal particles, auxiliary words, and pronouns.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910346113.XA CN110110328B (en) | 2019-04-26 | 2019-04-26 | Text processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910346113.XA CN110110328B (en) | 2019-04-26 | 2019-04-26 | Text processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110110328A | 2019-08-09 |
CN110110328B | 2023-09-01 |
Family
ID=67487015
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201910346113.XA | Active | CN110110328B (en) | 2019-04-26 | 2019-04-26 | Text processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110110328B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101251841A (en) * | 2007-05-17 | 2008-08-27 | 华东师范大学 | Method for establishing and searching feature matrix of Web document based on semantics |
CN104750844A (en) * | 2015-04-09 | 2015-07-01 | 中南大学 | Method and device for generating text characteristic vectors based on TF-IGM, method and device for classifying texts |
CN106503153A (en) * | 2016-10-21 | 2017-03-15 | 江苏理工学院 | A kind of computer version taxonomic hierarchies, system and its file classification method |
CN106919554A (en) * | 2016-10-27 | 2017-07-04 | 阿里巴巴集团控股有限公司 | The recognition methods of invalid word and device in document |
CN106570112A (en) * | 2016-11-01 | 2017-04-19 | 四川用联信息技术有限公司 | Improved ant colony algorithm-based text clustering realization method |
CN108491429A (en) * | 2018-02-09 | 2018-09-04 | 湖北工业大学 | A kind of feature selection approach based on document frequency and word frequency statistics between class in class |
CN108536868A (en) * | 2018-04-24 | 2018-09-14 | 北京慧闻科技发展有限公司 | The data processing method of short text data and application on social networks |
CN109492110A (en) * | 2018-11-28 | 2019-03-19 | 南京中孚信息技术有限公司 | Document Classification Method and device |
Also Published As
Publication number | Publication date |
---|---|
CN110110328B (en) | 2023-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101479040B1 (en) | Method, apparatus, and computer storage medium for automatically adding tags to document | |
US20090125371A1 (en) | Domain-Specific Sentiment Classification | |
CN104102681B (en) | Microblog key event acquiring method and device | |
US20140013221A1 (en) | Method and device for filtering harmful information | |
CN103823796A (en) | System and method for translation | |
CN104504046A (en) | Patent retrieval system and patent retrieval method | |
CN103678714B (en) | Construction method and device for entity knowledge base | |
Mao et al. | Parameterization of the level-resolved radiative recombination rate coefficients for the SPEX code | |
US9870433B2 (en) | Data processing method and system of establishing input recommendation | |
CN105512104A (en) | Dictionary dimension reducing method and device and information classifying method and device | |
CN106126495A (en) | A kind of based on large-scale corpus prompter method and apparatus | |
CN110110328A (en) | Text handling method and device | |
Khalil et al. | Which configuration works best? an experimental study on supervised Arabic twitter sentiment analysis | |
Karan et al. | Evaluation of Classification Algorithms and Features for Collocation Extraction in Croatian. | |
CN102915312A (en) | Method and system for issuing information on websites | |
CN107391730A (en) | A kind of SQL statement processing method and processing device | |
CN102291440A (en) | Method and device for optimizing rule in cloud environment | |
Lemnitzer et al. | Combining a rule-based approach and machine learning in a good-example extraction task for the purpose of lexicographic work on contemporary standard German | |
JP5798086B2 (en) | Device, method and program for extracting pairs of place names and words from a document | |
Volk | How bad is the problem of PP-attachment? A comparison of English, German and Swedish | |
Nabil et al. | New approaches for extracting arabic keyphrases | |
CN105512339A (en) | File searcher and searching method | |
CN112560448A (en) | New word extraction method and device | |
CN103902673B (en) | Anti-spam filtering rule upgrade method and device | |
Kim et al. | Analysis and mitigation on switching transients of medium-voltage low-harmonic filter banks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||