CN111125160A - Data preprocessing method, system and terminal based on trademark approximate analysis - Google Patents


Info

Publication number
CN111125160A
Authority
CN
China
Prior art keywords
judging whether
analysis
word
desensitization
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911370644.9A
Other languages
Chinese (zh)
Inventor
朱峰
彭丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Knowledge Gain And Loss Network Technology Co ltd
Original Assignee
Guangdong Knowledge Gain And Loss Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Knowledge Gain And Loss Network Technology Co ltd filed Critical Guangdong Knowledge Gain And Loss Network Technology Co ltd
Priority to CN201911370644.9A priority Critical patent/CN111125160A/en
Publication of CN111125160A publication Critical patent/CN111125160A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to the technical field of data analysis, and in particular to a data preprocessing method, system and terminal based on trademark approximate analysis. The data preprocessing method comprises the following steps: acquiring an input keyword; performing character type recognition on the keyword; judging whether the keyword is a multi-type character combination; judging whether the desensitization recognition result completely hits a sensitive word; judging whether at least one word in the generated set is contained in the A-type sensitive words; judging whether at least one word in the set is contained in the B-type sensitive words; and thereby judging whether the keyword has significance. The method simplifies the user's analysis steps for combined trademarks and improves analysis efficiency.

Description

Data preprocessing method, system and terminal based on trademark approximate analysis
Technical Field
The invention relates to the technical field of data analysis, in particular to a data preprocessing method, a data preprocessing system and a data preprocessing terminal based on trademark approximate analysis.
Background
In recent years, with the rapid development of the world economy and society, the value of trademarks has increased dramatically and the number of registered trademarks has continued to grow. To register a trademark or maintain trademark rights, a trademark owner usually searches, personally or through an agency, the registered trademarks that the trademark office publishes periodically, so as to find approximate trademarks in time. Manual searching covers only a narrow range, so the search results are not comprehensive, and those skilled in the art have therefore continued to develop methods for analyzing trademark approximation.
In this context, how to make the search results more accurate is a problem to be solved. To solve this problem, the invention provides a data preprocessing method, a data preprocessing system and a data preprocessing terminal based on trademark approximate analysis.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data preprocessing method, a data preprocessing system and a data preprocessing terminal based on trademark approximate analysis that simplify the user's analysis steps for combined trademarks and improve analysis efficiency.
In order to solve the technical problems, the technical scheme provided by the invention is as follows:
a data preprocessing method based on trademark approximate analysis comprises the following steps:
acquiring an input keyword;
performing character type recognition on the keywords;
judging whether the keyword is a multi-type character combination; if so, judging whether the keyword contains numbers; if not, performing desensitization recognition on the Chinese, English and numeric content respectively;
judging whether the desensitization recognition result completely hits a sensitive word; if so, popping up a prompt that the word lacks significance; if not, performing word splitting on the analysis object to generate a set;
judging whether at least one word in the set is contained in the A-type sensitive words; if so, popping up a prompt that significance is affected;
if not, judging whether at least one word in the set is contained in the B-type sensitive words; if so, popping up a prompt that the word lacks significance; if not, entering the registration analysis logic.
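The decision flow above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the names `Outcome`, `split_words` and `preprocess`, the example word lists, and the choice to treat a "complete hit" as exact membership in the combined sensitive-word list are all hypothetical.

```python
from enum import Enum
from typing import List, Set


class Outcome(Enum):
    NO_SIGNIFICANCE = "word lacks significance"
    AFFECTS_SIGNIFICANCE = "significance is affected"
    REGISTRATION_ANALYSIS = "enter registration analysis logic"


# Hypothetical word lists; a real system would load them from a maintained corpus.
CLASS_A = {"kitten", "puppy"}   # A-type: words that affect registration
CLASS_B = {"ticket"}            # B-type: words that cannot be registered


def split_words(text: str) -> Set[str]:
    """Assumed splitting rule: every contiguous substring of the text."""
    return {text[i:j] for i in range(len(text)) for j in range(i + 1, len(text) + 1)}


def preprocess(parts: List[str]) -> Outcome:
    """Apply the complete-hit check, then the A-type and B-type checks."""
    sensitive = CLASS_A | CLASS_B
    # Complete hit (assumed meaning): every extracted part is itself a sensitive word.
    if parts and all(part in sensitive for part in parts):
        return Outcome.NO_SIGNIFICANCE
    # Otherwise split every part and pool the resulting combinations into one set.
    candidates = set().union(*(split_words(part) for part in parts))
    if candidates & CLASS_A:
        return Outcome.AFFECTS_SIGNIFICANCE
    if candidates & CLASS_B:
        return Outcome.NO_SIGNIFICANCE
    return Outcome.REGISTRATION_ANALYSIS
```

Under these assumptions, `preprocess(["bigpuppy"])` yields the "significance is affected" outcome because the split set contains the A-type word "puppy", while a keyword with no sensitive substrings falls through to the registration analysis branch.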
Preferably, judging whether the keyword contains numbers specifically comprises:
judging whether numbers are contained;
if so, extracting Arabic numerals, English numerals and Chinese numerals;
judging whether any characters remain after extraction;
if not, performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively. By analyzing the keyword, extracting its Arabic numerals, English characters and Chinese characters, and recognizing each separately, the method can greatly improve the accuracy of trademark analysis.
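The numeral extraction and format unification described above might look like the following sketch. It assumes, for illustration only, that "unified initialization" means converting every numeral to Arabic digits and that only single-digit English and Chinese numerals are handled; the function name `extract_numbers` and both mapping tables are hypothetical.

```python
import re

# Assumed digit-level mappings; multi-digit words such as "twenty" or "十" are out of scope.
ENGLISH_DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
                  "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
CHINESE_DIGITS = {"零": "0", "一": "1", "二": "2", "三": "3", "四": "4",
                  "五": "5", "六": "6", "七": "7", "八": "8", "九": "9"}

# One pattern covering Arabic digit runs, English digit words and Chinese digit characters.
_NUMERAL = re.compile(
    "[0-9]+|" + "|".join(ENGLISH_DIGITS) + "|[" + "".join(CHINESE_DIGITS) + "]",
    re.IGNORECASE,
)


def extract_numbers(keyword: str):
    """Return (digits in unified Arabic form, characters remaining after extraction)."""
    digits = []

    def collect(match):
        token = match.group(0)
        lowered = token.lower()
        if lowered in ENGLISH_DIGITS:
            digits.append(ENGLISH_DIGITS[lowered])
        elif token in CHINESE_DIGITS:
            digits.append(CHINESE_DIGITS[token])
        else:
            digits.append(token)  # already Arabic digits
        return ""  # drop the numeral from the remaining text

    remaining = _NUMERAL.sub(collect, keyword)
    return "".join(digits), remaining
```

An empty second element corresponds to the "no characters remain" branch. Note that this naive pattern also matches digit words embedded in longer words (for example "one" inside "phone"), so a production system would need word-boundary handling.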
Preferably, when judging whether any characters remain after extraction:
if so, extracting the Chinese characters and the English characters;
performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively.
Preferably, when judging whether numbers are contained, if not:
extracting the Chinese characters and the English characters;
performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively.
Further preferably, judging whether the desensitization recognition result completely hits a sensitive word specifically comprises:
establishing a sensitive word corpus;
acquiring the extracted Chinese, English and numeric content;
matching the acquired Chinese, English and numeric content against the sensitive word corpus to obtain matched words;
and performing similarity analysis between the matched words and the extracted Chinese, English and numeric content.
Further preferably, the similarity is determined according to the following formula:
similarity Y([formula image 001], [formula image 002][formula image 003]) = α ∗ [formula image 004]
where α > 0 is an adjustable parameter, [formula image 001] is the acquired Chinese, English or numeric word, [formula image 005] is the similar word matched from the sensitive word corpus, and [formula image 006] denotes the respective levels of [formula image 007].
In judging similarity, the similarity Y algorithm defines the hierarchical relationship of each character by its position and compares the positions of identical characters in the two words; words whose identical characters occupy very close positions are judged to be similar. This method of judging similarity can match similar trademarks in a database and improves matching accuracy.
Preferably, performing word splitting on the analysis object to generate a set specifically comprises: separating the characters of the analysis object, and combining the separated characters one by one in ascending order to generate a set of combined characters.
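One possible reading of this splitting step is sketched below. The text does not define "ascending order" precisely, so the sketch assumes it means contiguous character combinations ordered from shortest to longest; the function name `split_into_combinations` is hypothetical.

```python
from typing import List


def split_into_combinations(text: str) -> List[str]:
    """Return contiguous character combinations, shortest first, without duplicates."""
    chars = list(text)  # step 1: separate the characters
    seen = set()
    combos = []
    # step 2: combine neighbouring characters, increasing the length one by one
    for length in range(1, len(chars) + 1):
        for start in range(len(chars) - length + 1):
            combo = "".join(chars[start:start + length])
            if combo not in seen:
                seen.add(combo)
                combos.append(combo)
    return combos


print(split_into_combinations("abc"))  # ['a', 'b', 'c', 'ab', 'bc', 'abc']
```

The number of combinations grows quadratically with keyword length, which is acceptable for short trademark keywords but would need a cap for long inputs.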
Preferably, the A-type sensitive words are words that affect registration.
Preferably, the B-type sensitive words are words that cannot be registered.
A data pre-processing system based on trademark approximation analysis, comprising:
a keyword acquisition module: the keyword acquisition module is used for acquiring input keywords;
a character type identification module: the character type identification module is used for identifying the character type of the keyword;
a character type judging module: the character type judging module is used for judging whether the keyword is a multi-type character combination; if so, judging whether the keyword contains numbers; if not, performing desensitization recognition on the Chinese, English and numeric content respectively;
a desensitization recognition module: the desensitization recognition module is used for judging whether the desensitization recognition result completely hits a sensitive word; if so, popping up a prompt that the word lacks significance; if not, performing word splitting on the analysis object to generate a set;
a desensitization judging module: the desensitization judging module is used for judging whether at least one word in the set is contained in the A-type sensitive words, and if so, popping up a prompt that significance is affected; if not, judging whether at least one word in the set is contained in the B-type sensitive words; if so, popping up a prompt that the word lacks significance; if not, entering the registration analysis logic.
A computer readable storage medium having stored thereon computer program instructions adapted to be loaded by a processor and to execute a method of data pre-processing based on trademark approximation analysis.
A mobile terminal comprises a processor and a memory, wherein the processor is used for executing a program stored in the memory so as to realize a data preprocessing method based on trademark approximate analysis.
Compared with the prior art, the invention has the following beneficial effects: preprocessing the data of the trademark to be analyzed simplifies the user's analysis steps for combined trademarks and improves analysis efficiency. After desensitization recognition is performed on the analysis object, the user receives more accurate feedback, which reduces the chance that a trademark is rejected after being submitted for registration because it contains sensitive words, and improves the accuracy of the subsequent approximation analysis. Specifically, the method is not limited to analyzing English, Chinese and Arabic-numeral trademarks individually; it can also extract the three components of a combined trademark and analyze them one by one, which greatly improves intelligence and judgment accuracy.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a schematic flow chart of a data preprocessing method based on trademark approximation analysis according to the present invention;
FIG. 2 is a block diagram of a data preprocessing system based on trademark approximation analysis according to the present invention.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic drawings and illustrate only the basic flow diagram of the invention, and therefore they show only the flow associated with the invention.
Example 1
As shown in fig. 1, the present invention is a data preprocessing method based on trademark approximation analysis, and the method specifically comprises:
acquiring an input keyword;
performing character type recognition on the keywords;
judging whether the keyword is a multi-type character combination; if so, judging whether the keyword contains numbers; if not, performing desensitization recognition on the Chinese, English and numeric content respectively;
judging whether the desensitization recognition result completely hits a sensitive word; if so, popping up a prompt that the word lacks significance; if not, performing word splitting on the analysis object to generate a set;
judging whether at least one word in the set is contained in the A-type sensitive words; if so, popping up a prompt that significance is affected;
if not, judging whether at least one word in the set is contained in the B-type sensitive words; if so, popping up a prompt that the word lacks significance; if not, entering the registration analysis logic.
Performing word splitting on the analysis object to generate a set specifically comprises: separating the characters of the analysis object, and combining the separated characters one by one in ascending order to generate a set of combined characters.
The A-type sensitive words are words affecting registration. The B-type sensitive words are words which cannot be registered.
For example, A-type sensitive words such as "kitten" and "puppy" are very likely to have been registered already in various classes; because such words are so widely relevant, they may be difficult to register. As a B-type example, "ticket" is a word that cannot be registered: it is a recognized mark, and a recognized mark enjoys cross-class protection, so when an applicant tries to register it again the system reminds the applicant that the word lacks significance.
Example 2
Whether the keyword is a multi-type character combination is judged as follows:
Step 1, judging whether numbers are contained, specifically:
judging whether numbers are contained;
if so, extracting Arabic numerals, English numerals and Chinese numerals;
judging whether any characters remain after extraction;
if not, performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively.
Step 2, judging whether any characters remain after extraction;
if so, extracting the Chinese characters and the English characters;
performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively.
The specific steps for judging whether the desensitization recognition result completely hits a sensitive word are as follows:
establishing a sensitive word corpus;
acquiring the extracted Chinese, English and numeric content;
matching the acquired Chinese, English and numeric content against the sensitive word corpus to obtain matched words;
and performing similarity analysis between the matched words and the extracted Chinese, English and numeric content.
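The matching steps above can be illustrated with the short sketch below. It does not use the similarity formula of this publication; as a stand-in it scores pairs with `difflib.SequenceMatcher` from the Python standard library. The corpus contents, the 0.8 threshold and the name `match_parts` are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# Hypothetical corpus; in practice this would be a maintained sensitive-word list.
SENSITIVE_CORPUS = ["kitten", "puppy", "ticket"]


def match_parts(parts: Dict[str, str], threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """Match each extracted part against the corpus and score every candidate pair."""
    hits = []
    for kind, text in parts.items():
        if not text:
            continue  # nothing of this character type was extracted
        for word in SENSITIVE_CORPUS:
            # Stand-in similarity measure (ratio of matching characters), not formula Y.
            score = SequenceMatcher(None, text.lower(), word).ratio()
            if score >= threshold:
                hits.append((kind, word, score))
    return hits
```

A score of 1.0 for some corpus word corresponds to an exact match for that part; lower scores above the threshold flag near misses such as misspellings.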
Further preferably, the similarity is determined according to the following formula:
similarity Y([formula image 001], [formula image 002][formula image 003]) = α ∗ [formula image 004]
where α > 0 is an adjustable parameter, [formula image 001] is the acquired Chinese, English or numeric word, [formula image 005] is the similar word matched from the sensitive word corpus, and [formula image 006] denotes the respective levels of [formula image 007].
For example, trademark registration rules provide that certain words may not be registered as trademarks, such as: words identical or similar to the names, flags or emblems of intergovernmental international organizations, except where the organization consents or the public is not easily misled; names of administrative divisions at or above the county level, or foreign place names known to the public; and words that are harmful to socialist morals and customs or have other adverse effects.
In a trademark application, suppose the applicant applies for "love house and house" as the trademark. This wording evolved from the idiom "love and wu" and easily misleads the public into taking "love house and house" as the written form of the idiom. When the user enters the word, the system acquires it and performs similarity matching in the database. The system first judges the positions of identical characters: [formula image 008] defines a position for the two words, and the maximum distance among the positions, [formula image 009], is selected. The larger the distance, the lower the similarity; the smaller the distance, the higher the similarity. In this case "love house and house" is highly similar to "love and wu" and therefore cannot be used. Here [formula image 006] is defined as the respective levels of [formula image 007], that is, the positions of the characters within the word.
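The position-based comparison described above can be sketched as follows. The formula itself survives only as image references in this text, so the expression used here, α / (1 + d) with d the largest positional offset of any shared character, is purely an assumed stand-in that reproduces the stated behaviour (larger distance, lower similarity). The name `position_similarity` is hypothetical.

```python
def position_similarity(word: str, candidate: str, alpha: float = 1.0) -> float:
    """Assumed stand-in: alpha / (1 + largest positional offset of shared characters)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    shared = set(word) & set(candidate)
    if not shared:
        return 0.0  # no character in common, so nothing to compare by position
    # For each shared character, compare its first position in the two words
    # and keep the largest offset.
    max_distance = max(abs(word.index(ch) - candidate.index(ch)) for ch in shared)
    return alpha / (1 + max_distance)


print(position_similarity("abcd", "abcd"))  # 1.0
print(position_similarity("abcd", "bcda"))  # 0.25
```

Because only shared characters are compared, two words that differ in a single character while keeping all others in place still score the maximum value, which matches the idiom example where a one-character variant is treated as highly similar.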
For example, suppose a keyword such as "color123" is obtained. The character types of the keyword are identified first. In this example the keyword contains Chinese, English and numbers, so it is split into its Chinese, English and numeric parts, and desensitization recognition is performed on each part separately;
judging whether the desensitization recognition result completely hits a sensitive word; if so, popping up a prompt that the word lacks significance; if not, performing word splitting on the analysis object to generate a set;
judging whether at least one word in the set is contained in the A-type sensitive words; if so, popping up a prompt that significance is affected;
if not, judging whether at least one word in the set is contained in the B-type sensitive words; if so, popping up a prompt that the word lacks significance; if not, entering the registration analysis logic.
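The character type recognition and splitting used in the keyword example above can be sketched with regular expressions. The character classes are assumptions (CJK Unified Ideographs for Chinese, ASCII letters for English, ASCII digits for numbers), and the names `split_by_type` and `is_multi_type` are hypothetical.

```python
import re
from typing import Dict

# Assumed character classes for the three types named in the text.
PATTERNS = {
    "chinese": re.compile(r"[\u4e00-\u9fff]+"),
    "english": re.compile(r"[A-Za-z]+"),
    "number": re.compile(r"[0-9]+"),
}


def split_by_type(keyword: str) -> Dict[str, str]:
    """Group the keyword's characters by type, keeping their order within each type."""
    return {name: "".join(pattern.findall(keyword)) for name, pattern in PATTERNS.items()}


def is_multi_type(keyword: str) -> bool:
    """True when the keyword mixes at least two of the three character types."""
    return sum(1 for part in split_by_type(keyword).values() if part) >= 2


print(split_by_type("color123"))  # {'chinese': '', 'english': 'color', 'number': '123'}
```

Each non-empty part would then be passed to desensitization recognition separately, as the example describes.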
Example 3
As shown in FIG. 2, the present invention provides a data preprocessing system based on trademark approximation analysis:
keyword acquisition module 1: the keyword acquisition module is used for acquiring input keywords;
character type recognition module 2: the character type identification module is used for identifying the character type of the keyword;
character type judging module 3: the character type judging module is used for judging whether the keyword is a multi-type character combination; if so, judging whether the keyword contains numbers; if not, performing desensitization recognition on the Chinese, English and numeric content respectively;
desensitization recognition module 4: the desensitization recognition module is used for judging whether the desensitization recognition result completely hits a sensitive word; if so, popping up a prompt that the word lacks significance; if not, performing word splitting on the analysis object to generate a set;
desensitization judging module 5: the desensitization judging module is used for judging whether at least one word in the set is contained in the A-type sensitive words, and if so, popping up a prompt that significance is affected; if not, judging whether at least one word in the set is contained in the B-type sensitive words; if so, popping up a prompt that the word lacks significance; if not, entering the registration analysis logic.
In the character type judging module 3, whether the keyword is a multi-type character combination is judged as follows:
judging whether numbers are contained; if so, extracting Arabic numerals, English numerals and Chinese numerals, and judging whether any characters remain after extraction; if not, performing unified initialization of the number format; and performing desensitization recognition on the Chinese, English and numeric content respectively.
When judging whether any characters remain after extraction, if so, the Chinese characters and the English characters are extracted; unified initialization of the number format is performed; and desensitization recognition is performed on the Chinese, English and numeric content respectively.
The specific steps for judging whether the desensitization recognition result completely hits a sensitive word are as follows:
establishing a sensitive word corpus;
acquiring the extracted Chinese, English and numeric content;
matching the acquired Chinese, English and numeric content against the sensitive word corpus to obtain matched words;
and performing similarity analysis between the matched words and the extracted Chinese, English and numeric content.
Further preferably, the similarity is determined according to the following formula:
similarity Y([formula image 001], [formula image 002][formula image 003]) = α ∗ [formula image 004]
where α > 0 is an adjustable parameter, [formula image 001] is the acquired Chinese, English or numeric word, [formula image 005] is the similar word matched from the sensitive word corpus, and [formula image 006] denotes the respective levels of [formula image 007].
A computer readable storage medium having stored thereon computer program instructions adapted to be loaded by a processor and to execute a method of data pre-processing based on trademark approximation analysis.
A mobile terminal comprises a processor and a memory, wherein the processor is used for executing a program stored in the memory so as to realize a data preprocessing method based on trademark approximate analysis.
The above detailed description is specific to possible embodiments of the present invention. The embodiments are not intended to limit the scope of the invention, and all equivalent implementations or modifications that do not depart from the scope of the invention are intended to fall within the scope of the claims.

Claims (10)

1. A data preprocessing method based on trademark approximate analysis is characterized by comprising the following steps:
acquiring an input keyword;
performing character type recognition on the keywords;
judging whether the keyword is a multi-type character combination; if so, judging whether the keyword contains numbers; if not, performing desensitization recognition on the Chinese, English and numeric content respectively;
judging whether the desensitization recognition result completely hits a sensitive word; if so, popping up a prompt that the word lacks significance; if not, performing word splitting on the analysis object to generate a set;
judging whether at least one word in the set is contained in the A-type sensitive words; if so, popping up a prompt that significance is affected;
if not, judging whether at least one word in the set is contained in the B-type sensitive words; if so, popping up a prompt that the word lacks significance; if not, entering the registration analysis logic.
2. The method of claim 1, wherein judging whether the keyword contains numbers specifically comprises:
judging whether numbers are contained;
if so, extracting Arabic numerals, English numerals and Chinese numerals;
judging whether any characters remain after extraction;
if not, performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively.
3. The data preprocessing method based on trademark approximate analysis according to claim 2, wherein, when judging whether any characters remain after extraction:
if so, extracting the Chinese characters and the English characters;
performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively.
4. The method of claim 2, wherein, when judging whether numbers are contained, if not:
extracting the Chinese characters and the English characters;
performing unified initialization of the number format;
performing desensitization recognition on the Chinese, English and numeric content respectively.
5. The data preprocessing method based on trademark approximate analysis according to claim 1, wherein performing word splitting on the analysis object to generate a set specifically comprises: separating the characters of the analysis object, and combining the separated characters one by one in ascending order to generate a set of combined characters.
6. The trademark approximation analysis-based data preprocessing method as claimed in claim 1, wherein the sensitive words in class A are words affecting registration.
7. The trademark approximate analysis-based data preprocessing method as claimed in claim 1, wherein the B-type sensitive words are unregisterable words.
8. A data preprocessing system based on trademark approximation analysis, comprising:
a keyword acquisition module: the keyword acquisition module is used for acquiring input keywords;
a character type identification module: the character type identification module is used for identifying the character type of the keyword;
a character type judging module: the character type judging module is used for judging whether the keyword is a multi-type character combination; if so, judging whether the keyword contains numbers; if not, performing desensitization recognition on the Chinese, English and numeric content respectively;
a desensitization recognition module: the desensitization recognition module is used for judging whether the desensitization recognition result completely hits a sensitive word; if so, popping up a prompt that the word lacks significance; if not, performing word splitting on the analysis object to generate a set;
a desensitization judging module: the desensitization judging module is used for judging whether at least one word in the set is contained in the A-type sensitive words, and if so, popping up a prompt that significance is affected; if not, judging whether at least one word in the set is contained in the B-type sensitive words; if so, popping up a prompt that the word lacks significance; if not, entering the registration analysis logic.
9. A computer-readable storage medium, characterized in that it stores computer program instructions adapted to be loaded by a processor and to execute the method of any of claims 1 to 7.
10. A mobile terminal comprising a processor and a memory, the processor being configured to execute a program stored in the memory to implement the method of any one of claims 1 to 7.
CN201911370644.9A 2019-12-26 2019-12-26 Data preprocessing method, system and terminal based on trademark approximate analysis Pending CN111125160A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911370644.9A CN111125160A (en) 2019-12-26 2019-12-26 Data preprocessing method, system and terminal based on trademark approximate analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911370644.9A CN111125160A (en) 2019-12-26 2019-12-26 Data preprocessing method, system and terminal based on trademark approximate analysis

Publications (1)

Publication Number Publication Date
CN111125160A true CN111125160A (en) 2020-05-08

Family

ID=70503482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911370644.9A Pending CN111125160A (en) 2019-12-26 2019-12-26 Data preprocessing method, system and terminal based on trademark approximate analysis

Country Status (1)

Country Link
CN (1) CN111125160A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035726A (en) * 2020-11-02 2020-12-04 北京梦知网科技有限公司 Trademark registration method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301244A (en) * 2016-12-30 2017-10-27 徐庆 Method, device, system and the trade mark memory of a kind of trade mark point card processing
CN108985584A (en) * 2018-06-27 2018-12-11 广州朝舜网络科技有限公司 A kind of trade mark intelligent analysis method, device, terminal and storage medium
CN109388965A (en) * 2018-09-10 2019-02-26 全球能源互联网研究院有限公司 A kind of desensitization method and system of blended data
CN110059159A (en) * 2019-04-15 2019-07-26 重庆天蓬网络有限公司 A kind of similar mark real-time monitoring system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035726A (en) * 2020-11-02 2020-12-04 北京梦知网科技有限公司 Trademark registration method and device
CN112035726B (en) * 2020-11-02 2021-03-12 北京梦知网科技有限公司 Trademark registration method and device

Similar Documents

Publication Publication Date Title
CN107045496B (en) Error correction method and error correction device for text after voice recognition
JP6402653B2 (en) Object recognition device, object recognition method, and program
EP3819785A1 (en) Feature word determining method, apparatus, and server
US20230015054A1 (en) Text classification method, electronic device and computer-readable storage medium
CN109783631B (en) Community question-answer data verification method and device, computer equipment and storage medium
CN111797239B (en) Application program classification method and device and terminal equipment
CN110222045A (en) A kind of data sheet acquisition methods, device and computer equipment, storage medium
CN108090216B (en) Label prediction method, device and storage medium
CN113312461A (en) Intelligent question-answering method, device, equipment and medium based on natural language processing
WO2010088052A1 (en) Methods and systems for matching records and normalizing names
CN111859968A (en) Text structuring method, text structuring device and terminal equipment
US12051256B2 (en) Entry detection and recognition for custom forms
CN112380848B (en) Text generation method, device, equipment and storage medium
CN110737770B (en) Text data sensitivity identification method and device, electronic equipment and storage medium
CN108170708B (en) Vehicle entity identification method, electronic equipment, storage medium and system
CN109660621A (en) Content pushing method and service equipment
CN111125160A (en) Data preprocessing method, system and terminal based on trademark approximate analysis
CN113836297B (en) Training method and device for text emotion analysis model
CN114842982A (en) Knowledge expression method, device and system for medical information system
CN110533035B (en) Student homework page number identification method based on text matching
CN114090748A (en) Question and answer result display method, device, equipment and storage medium
US11449794B1 (en) Automatic charset and language detection with machine learning
CN113627186A (en) Entity relation detection method based on artificial intelligence and related equipment
JP2022091608A (en) Information processing device and information processing program
CN113111147A (en) Text type identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination