CN112257425A - Power data analysis method and system based on data classification model - Google Patents

Power data analysis method and system based on data classification model

Info

Publication number
CN112257425A
Authority
CN
China
Prior art keywords
power
data
document
target sentence
data analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011051534.9A
Other languages
Chinese (zh)
Inventor
董阳
张倩宜
郑阳
张驰
赵迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Tianjin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-09-29
Filing date
2020-09-29
Publication date
2021-01-22
Application filed by State Grid Corp of China SGCC and State Grid Tianjin Electric Power Co Ltd
Priority to CN202011051534.9A
Publication of CN112257425A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a power data analysis method based on a data classification model, which comprises the following steps: S1, establishing a root database; S2, preprocessing a power document to obtain a target sentence of the power document, wherein the target sentence is a sentence that requires word segmentation; S3, identifying the target sentence, calling the root database to match the target sentence, judging whether the target sentence contains keywords from an ambiguous word bank, generating a feature recognition result, and obtaining a multi-level label; S4, performing word segmentation according to the feature recognition result to form characters, and converting the characters into a feature vector matrix; and S5, inputting the feature vector matrix into a text classifier and outputting a grading result of the power document. The power document is segmented in accordance with the relevant laws and regulations of the power system, the corresponding target sentences are extracted, the root database is matched against the target sentences to generate recognition results, and the grading result of the power document is output, which greatly improves the efficiency and speed of power data grading.

Description

Power data analysis method and system based on data classification model
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a power data analysis method and system based on a data classification model.
Background
China currently implements a graded information security protection system, and its protection principle of 'dividing zones and prioritizing key points' is an effective means of addressing current information security problems. To advance the digital transformation of power companies and the centralized, unified management of power data, the problem of classifying power data must be solved urgently. For security classification in particular, a power company needs to determine which data can be shared and opened unconditionally, and which data should be shared and opened only conditionally, or not at all, because of core business secrets or relevant laws and regulations, so that data authorization and open sharing can be carried out for different application scenarios.
At present, data management in the data sharing and exchange processes of power companies is disorderly: the same or similar protection measures are applied to different data, and the protection granularity is coarse. This creates serious hidden dangers for the security of data sharing and exchange; if sensitive data is left unprotected, the interests of the enterprise and even national security can be seriously affected. Fine-grained data protection is therefore an important part of information security.
Natural language processing, an important branch of artificial intelligence, is increasingly used in scenarios such as machine translation and intelligent question answering, and plays an increasingly important role. Word segmentation is the most basic step in natural language processing; text can be analyzed and recognized well only after it has been segmented accurately.
At present, grading is mainly performed manually, relying on the background knowledge of professionals and on relevant reference regulations. Manual grading depends on the ability of the staff and suffers from a heavy workload, low efficiency, and a high error rate. The common mechanical word segmentation method is based on string matching; it is simple and efficient and works for simple language, but it handles complex ambiguous sentences poorly and cannot process ambiguities or new words. Word segmentation based on machine learning improves segmentation precision by building a statistical model and can learn new words, but it is more complex, requires training on a huge corpus at high cost, does not recognize dictionary words well, and its classification accuracy still needs improvement.
Therefore, to solve the above technical problems, feature words must be extracted not only from the description of the data being classified but also from the related legal provisions, and the weight of these feature words must be increased appropriately. A data analysis method capable of classifying power data based on a data classification model therefore needs to be developed.
Disclosure of Invention
The invention aims to provide a power data analysis method based on a data classification model, which can perform word segmentation on text data, extract feature words accurately, and analyze power data accurately.
Another object of the present invention is to provide a power data analysis system based on a data classification model.
The technical solution of the invention is as follows:
A power data analysis method based on a data classification model comprises the following steps:
S1, establishing a root database;
S2, importing a power document and preprocessing it to obtain a target sentence of the power document, wherein the target sentence is a sentence that requires word segmentation;
S3, identifying the target sentence, calling the root database to match the target sentence, and judging whether the target sentence contains keywords from an ambiguous word bank; if so, generating a feature recognition result according to a word segmentation rule and obtaining a multi-level label;
S4, performing word segmentation according to the feature recognition result to form characters, and converting the characters into a feature vector matrix through a TF-IDF algorithm;
S5, inputting the feature vector matrix into a text classifier, generating a data classification model, and outputting a grading result of the power document.
In the above technical solution, establishing the root database in S1 includes:
S10, manually collecting a large amount of text data as a corpus in accordance with the relevant laws and regulations of the power system to form initial training samples;
S11, importing the training samples into a training model to gradually form a root classification model;
S12, once classification has taken shape, further training the root classification model through simulated real-world classification exercises, adding decision data and improving the model's ability to handle anomalies;
S13, after manual decision-making and relearning, feeding the result data back into the root classification model as training samples to retrain it;
S14, collecting the result data and establishing the root database.
In the above technical solution, in S10, the large amount of text data used as the corpus is obtained from the relevant laws and regulations of the power system, and a preset value N is used to remove homogeneous data from the corpus.
In the above technical solution, in S2, preprocessing the power document includes removing sensitive words, garbled characters, and punctuation marks, so as to remove redundant parts of the power document and thereby further filter it.
In the above technical solution, matching the target sentence in S3 includes fuzzy matching and regular-expression matching.
In the above technical solution, the ambiguous word bank in S3 includes a preset set of keywords with ambiguous properties.
In the above technical solution, the word segmentation in S4 fully segments the sentences in the power document, reads the characters in each line of the power document by establishing a TF-IDF structure, calculates the frequency of occurrence of each character, and establishes the feature vector matrix.
In the above technical solution, in S5, the feature vector matrix is converted into one input vector of the text classifier, the multi-level label is converted into another input vector of the text classifier, the data classification model is generated by invoking a text classifier training algorithm, and the grading result of the power document is output.
A power data analysis system based on a data classification model, comprising:
a preprocessing module, configured to receive the power document and obtain a target sentence of the power document;
a word segmentation module, configured to generate a feature recognition result by matching the root database against the target sentence and to obtain a multi-level label;
a character segmentation module, configured to perform character segmentation according to the feature recognition result to form characters and to generate a feature vector matrix;
and an output module, configured to generate a grading result of the power document after the feature vector matrix is input into the text classifier.
Further, the word segmentation module further comprises:
a judging module, configured to judge, according to the keywords in the ambiguous word bank, whether the target sentence contains keywords with ambiguous properties;
and a recognition module, configured to perform feature recognition on the target sentence once a keyword with ambiguous properties has been found in it, so as to generate the feature recognition result.
The advantages and positive effects of the invention are as follows:
1. The power document is segmented in accordance with the relevant laws and regulations of the power system, the corresponding target sentences are extracted, the root database is matched against the target sentences to generate recognition results, and the grading result of the power document is output, which greatly improves the efficiency, speed, and accuracy of power data grading.
2. With data value at its core, a data analysis system is built from the combined perspective of security management and data management, so that power data is analyzed comprehensively, objectively, and accurately.
3. Data security management within the system is strengthened and its management policies and granularity are refined, which better meets the data security management requirements of the big data era and safeguards the dynamic service data of a big data platform.
Detailed Description
The present invention will be described in further detail with reference to specific examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the scope of the invention in any way.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
Example 1
The power data analysis method based on a data classification model of the invention comprises the following steps:
S1, establishing a root database;
S2, importing a power document and preprocessing it to obtain a target sentence of the power document, wherein the target sentence is a sentence that requires word segmentation;
S3, identifying the target sentence, calling the root database to match the target sentence, and judging whether the target sentence contains keywords from an ambiguous word bank; if so, generating a feature recognition result according to a word segmentation rule and obtaining a multi-level label;
S4, performing word segmentation according to the feature recognition result to form characters, and converting the characters into a feature vector matrix through a TF-IDF algorithm;
S5, inputting the feature vector matrix into a text classifier, generating a data classification model, and outputting a grading result of the power document.
Further, establishing the root database in S1 includes:
S10, manually collecting a large amount of text data as a corpus in accordance with the relevant laws and regulations of the power system to form initial training samples;
S11, importing the training samples into a training model to gradually form a root classification model;
S12, once classification has taken shape, further training the root classification model through simulated real-world classification exercises, adding decision data and improving the model's ability to handle anomalies;
S13, after manual decision-making and relearning, feeding the result data back into the root classification model as training samples to retrain it;
S14, collecting the result data and establishing the root database.
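As a rough illustration of how steps S10 to S14 might fit together, the following Python sketch runs one round of training, simulated classification, manual review, and retraining. It is not taken from the patent: the frequency-count "model", the function names, the `human_review` callback, and the sample terms and labels are all hypothetical stand-ins, since the text does not specify the training model.

```python
from collections import Counter, defaultdict


def train_root_model(samples):
    """S11: toy 'root classification model' that counts labels seen per term."""
    model = defaultdict(Counter)
    for term, label in samples:
        model[term][label] += 1
    return model


def predict(model, term):
    """Return the most frequent label for a term, or None if the term is unseen."""
    if term not in model:
        return None
    return model[term].most_common(1)[0][0]


def build_root_database(initial_samples, simulation_terms, human_review):
    samples = list(initial_samples)          # S10: initial training samples
    model = train_root_model(samples)        # S11: form the root model
    for term in simulation_terms:            # S12: simulated classification
        guess = predict(model, term)
        decided = human_review(term, guess)  # S13: manual decision
        samples.append((term, decided))      # S13: decision becomes a sample
    model = train_root_model(samples)        # S13: retrain on enlarged set
    # S14: collect the results into a term -> label root database
    return {term: predict(model, term) for term in model}


# Hypothetical reviewer: accept the model's guess, default unseen terms to "public".
reviewer = lambda term, guess: guess or "public"
root_db = build_root_database(
    [("dispatch plan", "internal"), ("customer id", "restricted")],
    ["tariff table", "customer id"],
    reviewer,
)
```

In practice the reviewer would be a person correcting the model's guesses, and the loop would likely be repeated over several rounds.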
Further, in S10, the large amount of text data used as the corpus is obtained from the relevant laws and regulations of the power system, and a preset value N is used to remove homogeneous data from the corpus.
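The patent does not say what the preset value N measures. One possible reading, assumed here purely for illustration, is that N is the length of the character n-grams used to detect near-duplicate ("homogeneous") passages. The sketch below keeps a passage only if its n-gram set is sufficiently different from every passage already kept; the function names and the Jaccard threshold of 0.8 are hypothetical.

```python
def char_ngrams(text, n):
    """Set of character n-grams of a passage (the whole passage if shorter than n)."""
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def remove_homogeneous(corpus, n=3, threshold=0.8):
    """Drop passages whose n-gram overlap with an earlier kept passage reaches the threshold."""
    kept, kept_grams = [], []
    for text in corpus:
        grams = char_ngrams(text, n)
        if all(jaccard(grams, seen) < threshold for seen in kept_grams):
            kept.append(text)
            kept_grams.append(grams)
    return kept


corpus = [
    "power grid dispatch data shall be protected",
    "power grid dispatch data shall be protected.",   # near-duplicate of the first
    "customer billing records are confidential",
]
deduplicated = remove_homogeneous(corpus, n=3)
```

The pairwise comparison is quadratic in the corpus size, which is acceptable for a sketch but would need an index such as MinHash for a large corpus.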
Further, in S2, preprocessing the power document includes removing sensitive words, garbled characters, and punctuation marks, so as to remove redundant parts of the power document and thereby further filter it.
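A minimal sketch of this preprocessing, under stated assumptions: sentences are split on common sentence-ending marks, "garbled characters" are approximated by the Unicode replacement character and control characters, and the sensitive-word list is a hypothetical placeholder. None of these details are specified in the patent.

```python
import re
import unicodedata

# Hypothetical sensitive-word list; the patent does not enumerate one.
SENSITIVE_WORDS = ("password",)


def _clean_char(ch):
    """Map garbled characters, control characters, punctuation, and symbols to a space."""
    if ch == "\ufffd" or unicodedata.category(ch)[0] in "CPS":
        return " "
    return ch


def preprocess(document, sensitive_words=SENSITIVE_WORDS):
    """Return the cleaned target sentences of a power document."""
    target_sentences = []
    # Split on Western and Chinese sentence-ending marks and on line breaks.
    for raw in re.split(r"[.!?。！？\n]+", document):
        for word in sensitive_words:
            raw = raw.replace(word, " ")
        cleaned = "".join(_clean_char(ch) for ch in raw)
        cleaned = " ".join(cleaned.split())  # collapse runs of whitespace
        if cleaned:
            target_sentences.append(cleaned)
    return target_sentences


sentences = preprocess(
    "Dispatch logs are internal\ufffd\ufffd; the password is stored offline. "
    "Customer data: restricted!"
)
```

Sentence splitting happens before punctuation is stripped so that sentence boundaries are not lost.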
Further, matching the target sentence in S3 includes fuzzy matching and regular-expression matching.
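The two matching modes could be combined roughly as follows. This is a sketch rather than the patented procedure: the root entries, the regular expression, the labels, and the 0.85 similarity threshold are hypothetical, and fuzzy matching is approximated with the standard-library `difflib.SequenceMatcher` over a sliding window.

```python
import re
from difflib import SequenceMatcher

# Hypothetical root database entries and regular expressions, each mapped to a label.
ROOT_DATABASE = {
    "dispatch plan": "internal",
    "customer account": "restricted",
}
PATTERNS = {
    r"\b\d{3}\s?kV\b": "equipment-parameter",
}


def fuzzy_match(sentence, root, threshold=0.85):
    """True if some window of the sentence is close enough to the root entry."""
    text = sentence.lower()
    width = len(root)
    starts = range(max(len(text) - width + 1, 1))
    best = max(SequenceMatcher(None, text[i:i + width], root).ratio() for i in starts)
    return best >= threshold


def match_sentence(sentence):
    """Collect labels from fuzzy root matches first, then from regex matches."""
    labels = [label for root, label in ROOT_DATABASE.items() if fuzzy_match(sentence, root)]
    labels += [label for pattern, label in PATTERNS.items() if re.search(pattern, sentence)]
    return labels


exact = match_sentence("The dispatch plans for the 220 kV substation")
typo = match_sentence("the dispach plan was revised")  # misspelling still matches
```

Fuzzy matching tolerates spelling variants of root entries, while regular expressions capture structured values such as voltage ratings that cannot be enumerated in a dictionary.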
Further, the ambiguous word bank in S3 includes a preset set of keywords with ambiguous properties.
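To make the role of the ambiguous word bank concrete, the sketch below checks whether a target sentence contains a keyword from the bank and, if so, uses a context cue to pick a reading and attach a multi-level label. The bank contents, the cue-based rule, and the tuple form of the label are assumptions made for illustration; the patent does not define the word segmentation rule or the label structure.

```python
# Hypothetical ambiguous word bank:
# keyword -> list of (context cue, resolved term, multi-level label).
AMBIGUOUS_WORD_BANK = {
    "load": [
        ("transformer", "electrical load", ("operations", "grid", "internal")),
        ("server", "computing load", ("it", "infrastructure", "public")),
    ],
}


def recognize(sentence):
    """Return feature recognition results for ambiguous keywords found in the sentence."""
    tokens = sentence.lower().split()
    results = []
    for keyword, rules in AMBIGUOUS_WORD_BANK.items():
        if keyword not in tokens:        # judging step: is the keyword present?
            continue
        for cue, term, labels in rules:  # recognition step: first matching cue wins
            if cue in tokens:
                results.append({"keyword": keyword, "term": term, "labels": labels})
                break
    return results


hits = recognize("Peak load on the transformer exceeded limits")
misses = recognize("No ambiguous terms here")
```

Whitespace tokenization is used only because the example is in English; Chinese text would need a different tokenization step.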
Further, the word segmentation method in S4 fully segments the sentences in the power document, reads the characters in each line of the power document by establishing a TF-IDF structure, calculates the frequency of occurrence of each character, and establishes a feature vector matrix.
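The character-level TF-IDF step can be sketched in plain Python as follows. Each line of the document is treated as one "document" for the purpose of the IDF term, and a common smoothed IDF variant, ln((1 + n) / (1 + df)) + 1, is assumed; the patent names the TF-IDF algorithm but does not give the exact weighting formula, and the function name is hypothetical.

```python
import math
from collections import Counter


def tfidf_matrix(lines):
    """Return (vocabulary, matrix) with one TF-IDF row per line and one column per character."""
    # Character counts per line, ignoring whitespace.
    docs = [Counter(ch for ch in line if not ch.isspace()) for line in lines]
    vocab = sorted(set().union(*docs))
    n = len(docs)
    # Document frequency: number of lines in which each character occurs.
    df = Counter(ch for doc in docs for ch in doc)
    idf = {ch: math.log((1 + n) / (1 + df[ch])) + 1 for ch in vocab}
    matrix = []
    for doc in docs:
        total = sum(doc.values()) or 1  # avoid division by zero on empty lines
        matrix.append([doc[ch] / total * idf[ch] for ch in vocab])
    return vocab, matrix


# Two short lines: "grid dispatch" and "grid customer".
vocab, matrix = tfidf_matrix(["电网调度", "电网客户"])
```

Characters shared by every line, such as 电 and 网 here, receive the minimum IDF of 1, while characters unique to one line are weighted more heavily.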
Further, in S5, the feature vector matrix is converted into one input vector of the text classifier, the multi-level label is converted into another input vector of the text classifier, the data classification model is generated by invoking a text classifier training algorithm, and the grading result of the power document is output.
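The patent does not name a particular text classifier. As one simple possibility, assumed here only for illustration, the sketch below trains a nearest-centroid classifier from a feature matrix and a parallel list of multi-level labels, then grades a new feature vector by cosine similarity. The feature values and label tuples are made up.

```python
import math
from collections import defaultdict


def train_classifier(feature_matrix, labels):
    """Return one centroid (mean feature vector) per label."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(feature_matrix, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + x for s, x in zip(sums[label], row)]
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def grade(model, row):
    """Return the label whose centroid is most similar to the feature vector."""
    return max(model, key=lambda label: cosine(model[label], row))


# Hypothetical training data: feature rows and their multi-level labels.
X = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.0], [0.0, 1.0, 0.3], [0.1, 0.8, 0.0]]
y = [("grid", "restricted"), ("grid", "restricted"), ("service", "public"), ("service", "public")]
model = train_classifier(X, y)
result = grade(model, [0.8, 0.1, 0.1])
```

Any supervised text classifier that accepts a feature matrix and labels could be substituted for the centroid model.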
Example 2
On the basis of Example 1, the power data analysis system based on a data classification model of the present invention includes:
a preprocessing module, configured to receive the power document and obtain a target sentence of the power document;
a word segmentation module, configured to generate a feature recognition result by matching the root database against the target sentence and to obtain a multi-level label;
a character segmentation module, configured to perform character segmentation according to the feature recognition result to form characters and to generate a feature vector matrix;
and an output module, configured to generate a grading result of the power document after the feature vector matrix is input into the text classifier.
Further, the word segmentation module further comprises:
a judging module, configured to judge, according to the keywords in the ambiguous word bank, whether the target sentence contains keywords with ambiguous properties;
and a recognition module, configured to perform feature recognition on the target sentence once a keyword with ambiguous properties has been found in it, so as to generate the feature recognition result.
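One way to picture how the four modules hand data to one another is a small container that chains them. The class name, the callable signatures, and the trivial stand-in modules below are hypothetical; they only show the data flow and do not implement the patented processing.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PowerDataAnalysisSystem:
    preprocess: Callable[[str], List[str]]          # preprocessing module
    segment_words: Callable[[List[str]], List[dict]]  # word segmentation module
    vectorize: Callable[[List[dict]], List[float]]  # character segmentation module
    classify: Callable[[List[float]], str]          # output module

    def run(self, document: str) -> str:
        """Pass a power document through the four modules in order."""
        sentences = self.preprocess(document)
        features = self.segment_words(sentences)
        vector = self.vectorize(features)
        return self.classify(vector)


# Trivial stand-in modules, for demonstration only.
system = PowerDataAnalysisSystem(
    preprocess=lambda doc: [s.strip() for s in doc.split(".") if s.strip()],
    segment_words=lambda sents: [{"sentence": s, "hit": "customer" in s.lower()} for s in sents],
    vectorize=lambda feats: [sum(f["hit"] for f in feats) / max(len(feats), 1)],
    classify=lambda vec: "restricted" if vec[0] > 0 else "public",
)
grade_a = system.run("Customer accounts are listed. Weather was mild.")
grade_b = system.run("Weather was mild.")
```

Keeping each module behind a plain callable makes it possible to replace one stage, for example the classifier, without touching the others.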
Example 3
On the basis of Example 1, the computer device of the present invention includes a processor, a non-volatile storage medium, a memory, and a network interface connected through a system bus. The non-volatile storage medium of the computer device stores an operating system, a database, and computer-readable instructions. The memory of the computer device may store computer-readable instructions that, when executed by the processor, cause the processor to perform the power data analysis method based on the data classification model of Example 1.
The network interface of the computer device is used for communication connection with the terminal.
The invention has been described in an illustrative manner, and it is to be understood that any simple variations, modifications or other equivalent changes which can be made by one skilled in the art without departing from the spirit of the invention fall within the scope of the invention.

Claims (10)

1. A power data analysis method based on a data classification model, characterized by comprising the following steps:
S1, establishing a root database;
S2, importing a power document and preprocessing it to obtain a target sentence of the power document, wherein the target sentence is a sentence that requires word segmentation;
S3, identifying the target sentence, calling the root database to match the target sentence, and judging whether the target sentence contains keywords from an ambiguous word bank; if so, generating a feature recognition result according to a word segmentation rule and obtaining a multi-level label;
S4, performing word segmentation according to the feature recognition result to form characters, and converting the characters into a feature vector matrix through a TF-IDF algorithm;
S5, inputting the feature vector matrix into a text classifier, generating a data classification model, and outputting a grading result of the power document.
2. The power data analysis method according to claim 1, wherein establishing the root database in S1 includes:
S10, manually collecting a large amount of text data as a corpus in accordance with the relevant laws and regulations of the power system to form initial training samples;
S11, importing the training samples into a training model to gradually form a root classification model;
S12, once classification has taken shape, further training the root classification model through simulated real-world classification exercises, adding decision data and improving the model's ability to handle anomalies;
S13, after manual decision-making and relearning, feeding the result data back into the root classification model as training samples to retrain it;
S14, collecting the result data and establishing the root database.
3. The power data analysis method according to claim 2, characterized in that: in S10, the large amount of text data used as the corpus is obtained from the relevant laws and regulations of the power system, and a preset value N is used to remove homogeneous data from the corpus.
4. The power data analysis method according to claim 3, characterized in that: in S2, preprocessing the power document includes removing sensitive words, garbled characters, and punctuation marks, so as to remove redundant parts of the power document and thereby further filter it.
5. The power data analysis method according to claim 4, characterized in that: matching the target sentence in S3 includes fuzzy matching and regular-expression matching.
6. The power data analysis method according to claim 5, characterized in that: the ambiguous word bank in S3 includes a preset set of keywords with ambiguous properties.
7. The power data analysis method according to claim 6, characterized in that: the word segmentation in S4 fully segments the sentences in the power document, reads the characters in each line of the power document by establishing a TF-IDF structure, calculates the frequency of occurrence of each character, and establishes the feature vector matrix.
8. The power data analysis method according to claim 7, characterized in that: in S5, the feature vector matrix is converted into one input vector of the text classifier, the multi-level label is converted into another input vector of the text classifier, the data classification model is generated by invoking a text classifier training algorithm, and the grading result of the power document is output.
9. A power data analysis system based on a data classification model, comprising:
a preprocessing module, configured to receive the power document and obtain a target sentence of the power document;
a word segmentation module, configured to generate a feature recognition result by matching the root database against the target sentence and to obtain a multi-level label;
a character segmentation module, configured to perform character segmentation according to the feature recognition result to form characters and to generate a feature vector matrix;
and an output module, configured to generate a grading result of the power document after the feature vector matrix is input into the text classifier.
10. The power data analysis system according to claim 9, wherein the word segmentation module further comprises:
a judging module, configured to judge, according to the keywords in the ambiguous word bank, whether the target sentence contains keywords with ambiguous properties;
and a recognition module, configured to perform feature recognition on the target sentence once a keyword with ambiguous properties has been found in it, so as to generate the feature recognition result.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011051534.9A CN112257425A (en) | 2020-09-29 | 2020-09-29 | Power data analysis method and system based on data classification model

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202011051534.9A CN112257425A (en) | 2020-09-29 | 2020-09-29 | Power data analysis method and system based on data classification model

Publications (1)

Publication Number | Publication Date
CN112257425A (en) | 2021-01-22

Family

ID=74233311

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011051534.9A (Pending) CN112257425A (en) | Power data analysis method and system based on data classification model | 2020-09-29 | 2020-09-29

Country Status (1)

Country | Link
CN (1) | CN112257425A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304373A (en) * 2017-10-13 2018-07-20 腾讯科技(深圳)有限公司 Construction method, device, storage medium and the electronic device of semantic dictionary
CN108304375A (en) * 2017-11-13 2018-07-20 广州腾讯科技有限公司 A kind of information identifying method and its equipment, storage medium, terminal
WO2020077895A1 (en) * 2018-10-16 2020-04-23 深圳壹账通智能科技有限公司 Signing intention determining method and apparatus, computer device, and storage medium
CN110413998A (en) * 2019-07-16 2019-11-05 深圳供电局有限公司 A kind of adaptive Chinese word cutting method and its system, medium towards power industry
CN111309904A (en) * 2020-01-20 2020-06-19 上海市大数据中心 Public data classification method based on generalized characteristic word stock

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113657505A (en) * | 2021-08-18 | 2021-11-16 | 国网四川省电力公司自贡供电公司 | Data processing system and method of power monitoring platform
CN113657505B (en) * | 2021-08-18 | 2024-05-10 | 国网四川省电力公司自贡供电公司 | Data processing system and method of power monitoring platform
CN114218318A (en) * | 2022-02-21 | 2022-03-22 | 国网山东省电力公司乳山市供电公司 | Data processing system and method for electric power big data
CN114218318B (en) * | 2022-02-21 | 2022-05-17 | 国网山东省电力公司乳山市供电公司 | Data processing system and method for electric power big data
CN114936543A (en) * | 2022-06-07 | 2022-08-23 | 中国银行股份有限公司 | Batch data item label dropping method and device

Similar Documents

Publication Publication Date Title
CN112257425A (en) Power data analysis method and system based on data classification model
CN106776538A (en) The information extracting method of enterprise's noncanonical format document
CN111274814B (en) Novel semi-supervised text entity information extraction method
CN111124487B (en) Code clone detection method and device and electronic equipment
KR20220091676A (en) Apparatus and Method for Building Unstructured Cyber Threat Information Big-data, Method for Analyzing Unstructured Cyber Threat Information
CN112989831B (en) Entity extraction method applied to network security field
CN112560486A (en) Power entity identification method based on multilayer neural network, storage medium and equipment
CN110750978A (en) Emotional tendency analysis method and device, electronic equipment and storage medium
CN113486664A (en) Text data visualization analysis method, device, equipment and storage medium
CN109446299A (en) The method and system of searching email content based on event recognition
CN115687621A (en) Short text label labeling method and device
CN111178080A (en) Named entity identification method and system based on structured information
CN110110087A (en) A kind of Feature Engineering method for Law Text classification based on two classifiers
CN114239579A (en) Electric power searchable document extraction method and device based on regular expression and CRF model
CN113868422A (en) Multi-label inspection work order problem traceability identification method and device
CN113705192A (en) Text processing method, device and storage medium
CN109753798A (en) A kind of Webshell detection model based on random forest and FastText
Hadi Classification of Arabic social media data
CN116226371A (en) Digital economic patent classification method
CN115329380A (en) Database table classification and classification method, device, equipment and storage medium
CN115482075A (en) Financial data anomaly analysis method and device, electronic equipment and storage medium
CN115618085A (en) Interface data exposure detection method based on dynamic label
CN112488593B (en) Auxiliary bid evaluation system and method for bidding
CN114969334A (en) Abnormal log detection method and device, electronic equipment and readable storage medium
CN114610882A (en) Abnormal equipment code detection method and system based on electric power short text classification

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210122