CN113946654A - Intelligent monitoring word setting system and method - Google Patents

Publication number
CN113946654A
Authority
CN
China
Prior art keywords
monitoring
word
negative
words
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111212204.8A
Other languages
Chinese (zh)
Inventor
朱旭琪
张勤勉
周权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Qingbo Big Data Technology Co ltd
Original Assignee
Anhui Qingbo Big Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Qingbo Big Data Technology Co ltd filed Critical Anhui Qingbo Big Data Technology Co ltd
Priority to CN202111212204.8A priority Critical patent/CN113946654A/en
Publication of CN113946654A publication Critical patent/CN113946654A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a system and a method for intelligently setting monitoring words. The system comprises an article acquisition module, an article processing module, an intelligent matching module and a monitoring word generation module. The article acquisition module collects articles from article publishing sites for news, forums, microblogs and electronic newspapers and sends the articles to a database. The article processing module analyzes the topics of the articles, extracts words from the topics as a matched content set, and randomly takes 300 items from the content set for the user to screen; the user judges the directly matched articles and marks those that are unreasonable or do not match the service content, and the resulting negative cases are sent to the intelligent matching module. With this method and device, the user does not need to give an explicit and exact logical combination scheme and only needs to give intuitive feedback on whether specific content matches, based on the intelligent matching result, so that the algorithm can use the feedback to optimize an implicit yet accurate monitoring phrase and can continue to optimize it later.

Description

Intelligent monitoring word setting system and method
Technical Field
The invention belongs to the field of acquisition and analysis, and particularly relates to an intelligent monitoring word setting system.
Background
In public opinion applications, setting up a monitoring scheme has a high technical threshold: the logic is complex, and the requirements on personnel skills and experience are high. The present method and device analyze the user's own service based on content, so that a user can use them conveniently without understanding complex logic. We therefore improve on this and propose a system for intelligently setting monitoring words.
Disclosure of Invention
The invention aims to overcome the problems in the prior art and provides an intelligent monitoring word setting system. The user does not need to give an explicit and exact logical combination scheme and only needs to give intuitive feedback on whether specific content matches, based on the intelligent matching result. The algorithm uses this feedback to optimize an implicit yet accurate monitoring phrase, and the optimization can continue afterwards.
In order to achieve the technical purpose and achieve the technical effect, the invention is realized by the following technical scheme:
an intelligent monitoring word setting system comprises an article acquisition module, an article processing module, an intelligent matching module and a monitoring word generation module;
the article acquisition module is used for collecting articles from article publishing sites for news, forums, microblogs and electronic newspapers and sending the articles to a database;
the article processing module is used for analyzing the topics of the articles, extracting words from the articles as a matched content set, and randomly selecting 300 articles for the user to screen; the user judges the directly matched articles and marks those that are unreasonable or do not match the service content, and the resulting negative cases are sent to the intelligent matching module;
the intelligent matching module comprises a positive content submodule and a negative content submodule; the positive content submodule performs an optimization calculation on the initial detection group according to the negative case marks to form an iterated service monitoring phrase, and loops through the algorithm until the proportion of positive content is larger than a preset value; the negative content submodule performs an optimization calculation on the initial detection group according to the negative case marks to form an iterated service monitoring phrase, and loops through the algorithm until the proportion of negative content is smaller than the preset value; the phrases that meet the conditions are sent to the monitoring word generation module;
the monitoring word generation module sorts the monitoring phrases and displays the feature words according to the level set by the user.
Furthermore, the database contains at least a preset number of positive cases and negative cases, and the user's monitoring words are compared with the positive cases and the negative cases to obtain monitoring words that can be used for screening.
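The screening step of the article processing module can be illustrated with a minimal Python sketch. The names `Article`, `sample_for_screening` and `split_cases` are hypothetical, and treating every sampled article that the user did not reject as a positive case is an assumption; the text only states that rejected articles become negative cases.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    article_id: int
    text: str


def sample_for_screening(matched, sample_size=300, seed=None):
    """Randomly pick up to sample_size matched articles for the user to review."""
    rng = random.Random(seed)
    if len(matched) <= sample_size:
        return list(matched)
    return rng.sample(list(matched), sample_size)


def split_cases(sample, rejected_ids):
    """Articles the user rejected become negative cases.

    Assumption: the remaining sampled articles are treated as positive cases.
    """
    positives = [a for a in sample if a.article_id not in rejected_ids]
    negatives = [a for a in sample if a.article_id in rejected_ids]
    return positives, negatives
```

The fixed sample size of 300 comes from the description; the optional `seed` is only there to make a run repeatable.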
The method for intelligently setting the monitoring words comprises the following steps:
A. analyzing data in the database and generating positive cases and negative cases according to the user's monitoring words; the module labels the positive cases to form a positive label set and labels the negative cases to form a negative label set;
B. establishing a common word set of the positive label set and the negative label set, selecting a word from the common word set, deleting the selected word from the positive label set, and collecting data under the same conditions;
C. calculating from the collected data: if the value is larger than a preset value, deleting the common word from the positive label set, collecting data under the same conditions, and judging whether the preset condition is met; if the value is smaller than the preset value, deleting the word from the common word set and judging whether the preset condition is met;
D. if the preset condition is not met, repeating step C until the preset value is met;
F. deleting one class of words from the positive label word set and calculating the negative label value; if the negative label value is larger than a preset value, obtaining a detection phrase set and sending it to the monitoring word generation module;
G. the monitoring word generation module sorts the monitoring phrases and displays the feature words according to the level set by the user.
The intelligent matching module calculation method comprises the following steps:
S1, analyzing the data in the database and generating positive cases and negative cases according to the user's monitoring words;
S2, the module labels the positive cases to form a positive label set P and labels the negative cases to form a negative label set N;
S3, performing word segmentation on the positive label set P to establish a segmented word set, and merging the segmented word set with the original service feature word set to establish a set A;
S4, performing word segmentation on the set N to establish a negative word set B;
S5, establishing the common word set C of A and B;
S6, selecting a word W in the set C, deleting W from the set A, and collecting data under the same conditions to obtain a set T;
S7, calculating the content of N in T; if the content is larger than a preset value, deleting W from the set A to obtain a new set A, deleting W from the set C to obtain a new set C, and judging whether the set C is an empty set; if the content is smaller than the preset value, deleting W from the set C to obtain a new set C, and judging whether the set C is an empty set;
S8, if C is not an empty set, repeating steps S6 and S7 until the preset value is met;
and S9, deleting any unmarked element a from the set A and recording the result as the set A2, collecting data T with A2 as the collection words and conditions, calculating the content of the original P elements in T, and judging whether it is larger than a preset value; if the preset value is met, continuing to judge whether any unmarked element remains in the set A; if n is larger than 0, the judgment is finished, and if n is smaller than 0, returning to the beginning of S9.
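The loop in steps S5 to S8 can be sketched as follows. This is one literal reading of the text rather than a confirmed implementation. It assumes that T is collected with the set A after W has been removed, that "the content of N in T" means the fraction of T belonging to N, and that data collection is available as a hypothetical `collect` callback mapping a word set to a set of article ids. The order in which W is picked is not specified in the text, so the sketch simply takes the alphabetically smallest word.

```python
def negative_share(collected_ids, negative_ids):
    """Fraction of the collected articles that are labelled negative."""
    if not collected_ids:
        return 0.0
    return len(collected_ids & negative_ids) / len(collected_ids)


def prune_shared_words(words_a, words_b, collect, negative_ids, threshold):
    """Walk through the common word set C and update A as in S6 and S7."""
    set_a = set(words_a)
    set_c = set_a & set(words_b)           # S5: common word set C
    while set_c:                           # S8: repeat until C is empty
        word = min(set_c)                  # S6: pick a word W (order is an assumption)
        trial = set_a - {word}             # S6: A with W deleted
        collected = collect(trial)         # S6: collect T under the same conditions
        if negative_share(collected, negative_ids) > threshold:
            set_a = trial                  # S7: W is deleted from A
        set_c.discard(word)                # S7: W is deleted from C in both branches
    return set_a
```

Because W leaves C on every pass, the loop ends after at most as many collection runs as there are common words.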
The invention has the beneficial effects that: according to the system and the method for intelligently setting the monitoring words, a user does not need to give an explicit and accurate logic combination scheme, and only needs to carry out intuitive feedback on whether specific contents are matched or not according to an intelligent matching result, so that an algorithm can obtain feedback to intelligently optimize an implicit and accurate monitoring word group, and continuous optimization can be carried out later.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of the algorithm structure of the intelligent matching module of the present invention;
FIG. 3 is a schematic diagram of the algorithm structure of the intelligent matching module of the present invention;
FIG. 4 is a schematic diagram of the algorithm structure of the intelligent matching module of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "opening," "upper," "lower," "thickness," "top," "middle," "length," "inner," "peripheral," and the like are used in an orientation or positional relationship that is merely for convenience in describing and simplifying the description, and do not indicate or imply that the referenced component or element must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be considered as limiting the present invention.
As shown in FIG. 1, an intelligent monitoring word setting system comprises an article acquisition module, an article processing module, an intelligent matching module and a monitoring word generation module. The article acquisition module collects articles from article publishing sites for news, forums, microblogs and electronic newspapers and sends the articles to a database. The article processing module analyzes the topics of the articles, extracts words from the articles as a matched content set, and randomly selects 300 items from the content set for the user to screen; the user judges the directly matched articles and marks those that are unreasonable or do not match the service content, and the resulting negative cases are sent to the intelligent matching module. The intelligent matching module comprises a positive content submodule and a negative content submodule. The positive content submodule performs an optimization calculation on the initial detection group according to the negative case marks to form an iterated service monitoring phrase, and loops through the algorithm until the proportion of positive content is larger than a preset value. The negative content submodule performs an optimization calculation on the initial detection group according to the negative case marks to form an iterated service monitoring phrase, and loops through the algorithm until the proportion of negative content is smaller than the preset value. The phrases that meet the conditions are sent to the monitoring word generation module, which sorts the monitoring phrases and displays the feature words according to the level set by the user. The database contains at least a preset number of positive cases and negative cases, and the user's monitoring words are compared with the positive cases and the negative cases to obtain monitoring words that can be used for screening.
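The two stopping conditions of the intelligent matching module can be written down as a small check. The sketch below is illustrative only: it assumes that the "content ratio" is the fraction of collected articles carrying a given label, and the function names and the two threshold parameters are hypothetical, since the text only speaks of "a preset value".

```python
def content_ratio(collected_ids, labelled_ids):
    """Fraction of the collected articles that carry the given label."""
    collected = set(collected_ids)
    if not collected:
        return 0.0
    return len(collected & set(labelled_ids)) / len(collected)


def meets_targets(collected_ids, positive_ids, negative_ids,
                  min_positive, max_negative):
    """True when positive content exceeds its preset value and
    negative content stays below its preset value."""
    positive = content_ratio(collected_ids, positive_ids)
    negative = content_ratio(collected_ids, negative_ids)
    return positive > min_positive and negative < max_negative
```

An optimization loop would call `meets_targets` after each iteration of the monitoring phrase and stop once it returns `True`.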
The method for intelligently setting the monitoring words comprises the following steps:
A. analyzing data in the database and generating positive cases and negative cases according to the user's monitoring words; the module labels the positive cases to form a positive label set and labels the negative cases to form a negative label set;
B. establishing a common word set of the positive label set and the negative label set, selecting a word from the common word set, deleting the selected word from the positive label set, and collecting data under the same conditions;
C. calculating from the collected data: if the value is larger than a preset value, deleting the common word from the positive label set, collecting data under the same conditions, and judging whether the preset condition is met; if the value is smaller than the preset value, deleting the word from the common word set and judging whether the preset condition is met;
D. if the preset condition is not met, repeating step C until the preset value is met;
F. deleting one class of words from the positive label word set and calculating the negative label value; if the negative label value is larger than a preset value, obtaining a detection phrase set and sending it to the monitoring word generation module;
G. the monitoring word generation module sorts the monitoring phrases and displays the feature words according to the level set by the user.
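The construction of the common word set in step B can be sketched in a few lines. The whitespace tokenizer below is only a stand-in for real word segmentation, which the text does not specify, and the optional `feature_words` argument is a hypothetical parameter for the original service feature words that the matching calculation merges into the positive word set.

```python
def tokenize(text):
    """Split text into lowercase words (a stand-in for real word segmentation)."""
    return set(text.lower().split())


def word_set(texts):
    """Union of the words found in a collection of labelled texts."""
    words = set()
    for text in texts:
        words |= tokenize(text)
    return words


def build_word_sets(positive_texts, negative_texts, feature_words=()):
    """Return the positive word set, the negative word set and their common words."""
    positive_words = word_set(positive_texts) | set(feature_words)
    negative_words = word_set(negative_texts)
    return positive_words, negative_words, positive_words & negative_words
```

For example, `build_word_sets(["bank loan rates"], ["river bank flood"])` yields a common word set containing only `"bank"`.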
As shown in fig. 2, 3 and 4, the intelligent matching module calculation method includes the following steps:
S1, analyzing the data in the database and generating positive cases and negative cases according to the user's monitoring words;
S2, the module labels the positive cases to form a positive label set P and labels the negative cases to form a negative label set N;
S3, performing word segmentation on the positive label set P to establish a segmented word set, and merging the segmented word set with the original service feature word set to establish a set A;
S4, performing word segmentation on the set N to establish a negative word set B;
S5, establishing the common word set C of A and B;
S6, selecting a word W in the set C, deleting W from the set A, and collecting data under the same conditions to obtain a set T;
S7, calculating the content of N in T; if the content is larger than a preset value, deleting W from the set A to obtain a new set A, deleting W from the set C to obtain a new set C, and judging whether the set C is an empty set; if the content is smaller than the preset value, deleting W from the set C to obtain a new set C, and judging whether the set C is an empty set;
S8, if C is not an empty set, repeating steps S6 and S7 until the preset value is met;
and S9, deleting any unmarked element a from the set A and recording the result as the set A2, collecting data T with A2 as the collection words and conditions, calculating the content of the original P elements in T, and judging whether it is larger than a preset value; if the preset value is met, continuing to judge whether any unmarked element remains in the set A; if n is larger than 0, the judgment is finished, and if n is smaller than 0, returning to the beginning of S9.
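Step S9 can be read as a pass that tries to drop each word of A in turn. The sketch below rests on several assumptions that the text leaves open: "the content of the original P elements in T" is taken as the fraction of T belonging to P, a word stays deleted when that fraction remains above the preset value and is restored otherwise, and visiting a word once counts as marking it. The `collect` callback is hypothetical, and the counter n from the text is not modelled because its definition is not given.

```python
def positive_share(collected_ids, positive_ids):
    """Fraction of the collected articles that are labelled positive."""
    if not collected_ids:
        return 0.0
    return len(collected_ids & positive_ids) / len(collected_ids)


def drop_redundant_words(words_a, collect, positive_ids, threshold):
    """Try deleting each word of A once and keep the deletion when the
    positive share of the collected data stays above the threshold."""
    kept = set(words_a)
    for word in sorted(words_a):           # each word is visited (marked) once
        trial = kept - {word}              # the set A2
        collected = collect(trial)         # data T collected with A2
        if positive_share(collected, positive_ids) > threshold:
            kept = trial                   # assumption: the deletion is kept
    return kept
```

Sorting the words only makes the pass deterministic; the text allows any unmarked element to be chosen.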
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed.

Claims (3)

1. An intelligent monitoring word setting system is characterized by comprising an article acquisition module, an article processing module, an intelligent matching module and a monitoring word generation module;
the article acquisition module is used for collecting articles from article publishing sites for news, forums, microblogs and electronic newspapers and sending the articles to a database;
the article processing module is used for analyzing the topics of the articles, extracting words from the articles as a matched content set, and randomly selecting 300 articles for the user to screen; the user judges the directly matched articles and marks those that are unreasonable or do not match the service content, and the resulting negative cases are sent to the intelligent matching module;
the intelligent matching module comprises a positive content submodule and a negative content submodule; the positive content submodule performs an optimization calculation on the initial detection group according to the negative case marks to form an iterated service monitoring phrase, and loops through the algorithm until the proportion of positive content is larger than a preset value; the negative content submodule performs an optimization calculation on the initial detection group according to the negative case marks to form an iterated service monitoring phrase, and loops through the algorithm until the proportion of negative content is smaller than the preset value; the phrases that meet the conditions are sent to the monitoring word generation module;
the monitoring word generation module sorts the monitoring phrases and displays the feature words according to the level set by the user.
2. The system for intelligently setting monitoring words as claimed in claim 1, wherein the database at least comprises a preset number of positive cases and negative cases, and the monitoring words of the user are compared with the positive cases and the negative cases to obtain the monitoring words which can be screened.
3. The method for intelligently setting monitoring words according to claim 1, characterized in that the method for intelligently setting monitoring words comprises the following steps:
A. analyzing data in the database and generating positive cases and negative cases according to the user's monitoring words; the module labels the positive cases to form a positive label set and labels the negative cases to form a negative label set;
B. establishing a common word set of the positive label set and the negative label set, selecting a word from the common word set, deleting the selected word from the positive label set, and collecting data under the same conditions;
C. calculating from the collected data: if the value is larger than a preset value, deleting the common word from the positive label set, collecting data under the same conditions, and judging whether the preset condition is met; if the value is smaller than the preset value, deleting the word from the common word set and judging whether the preset condition is met;
D. if the preset condition is not met, repeating step C until the preset value is met;
F. deleting one class of words from the positive label word set and calculating the negative label value; if the negative label value is larger than a preset value, obtaining a detection phrase set and sending it to the monitoring word generation module;
G. the monitoring word generation module sorts the monitoring phrases and displays the feature words according to the level set by the user.
CN202111212204.8A 2021-10-18 2021-10-18 Intelligent monitoring word setting system and method Pending CN113946654A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111212204.8A CN113946654A (en) 2021-10-18 2021-10-18 Intelligent monitoring word setting system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111212204.8A CN113946654A (en) 2021-10-18 2021-10-18 Intelligent monitoring word setting system and method

Publications (1)

Publication Number Publication Date
CN113946654A true CN113946654A (en) 2022-01-18

Family

ID=79331451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111212204.8A Pending CN113946654A (en) 2021-10-18 2021-10-18 Intelligent monitoring word setting system and method

Country Status (1)

Country Link
CN (1) CN113946654A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019454A (en) * 2022-07-14 2022-09-06 麒芯微(上海)微电子有限公司 Digital RMB double-off-line receiving and paying system and method
CN116701561A (en) * 2023-06-09 2023-09-05 读书郎教育科技有限公司 Learning resource collection method matched with dictionary pen and system thereof

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019454A (en) * 2022-07-14 2022-09-06 麒芯微(上海)微电子有限公司 Digital RMB double-off-line receiving and paying system and method
CN116701561A (en) * 2023-06-09 2023-09-05 读书郎教育科技有限公司 Learning resource collection method matched with dictionary pen and system thereof
CN116701561B (en) * 2023-06-09 2024-04-26 读书郎教育科技有限公司 Learning resource collection method matched with dictionary pen and system thereof

Similar Documents

Publication Publication Date Title
CN109165294B (en) Short text classification method based on Bayesian classification
CN113946654A (en) Intelligent monitoring word setting system and method
CN106250513B (en) Event modeling-based event personalized classification method and system
KR101999152B1 (en) English text formatting method based on convolution network
CN112329836A (en) Text classification method, device, server and storage medium based on deep learning
Antonacopoulos et al. ICDAR2005 page segmentation competition
CN106339495A (en) Topic detection method and system based on hierarchical incremental clustering
CN106708940A (en) Method and device used for processing pictures
CN110807086A (en) Text data labeling method and device, storage medium and electronic equipment
Woitaszek et al. Identifying junk electronic mail in Microsoft outlook with a support vector machine
CN111339286A (en) Method for researching research condition of exploration institution based on topic visualization
CN111814486A (en) Enterprise client tag generation method, system and device based on semantic analysis
CN112579730A (en) High-expansibility multi-label text classification method and device
CN112487306A (en) Automatic event marking and classifying method based on knowledge graph
CN112884009A (en) Classification model training method and system
CN107368610A (en) Big text CRF and rule classification method and system based on full text
CN108073567A (en) A kind of Feature Words extraction process method, system and server
CN107992508A (en) A kind of Chinese email signature extracting method and system based on machine learning
CN108549708B (en) Image-text matching method and system
CN112784040B (en) Vertical industry text classification method based on corpus
CN112214575A (en) User activity field classification method for different social media platforms
CN108572961A (en) A kind of the vectorization method and device of text
CN108563765B (en) Intelligent image-text matching method and system
CN111046163A (en) Unread message processing method and device, storage medium and equipment
CN110674269A (en) Cable information management and control method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination