CN109492216A - Method, device and computer-readable storage medium for automatic identification and approval of water posts - Google Patents

Method, device and computer-readable storage medium for automatic identification and approval of water posts

Info

Publication number
CN109492216A
Authority
CN
China
Prior art keywords
water post
post
title
words
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201811095297.9A
Other languages
Chinese (zh)
Inventor
杨将
祁家庆
喻红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201811095297.9A priority Critical patent/CN109492216A/en
Publication of CN109492216A publication Critical patent/CN109492216A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/103Workflow collaboration or project management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to big data technology and discloses a method for automatically identifying and approving water posts, comprising: receiving a post currently submitted for publication on a predetermined website; matching the title of the received post against a keyword list and filtering out the keywords that appear in the title; calculating the ratio of the character count of the filtered keywords to the character count of the title; if the calculated ratio is greater than or equal to a designated value, judging the post to be a water post and refusing to publish it on the website; if the calculated ratio is less than the designated value, judging that the post is not a water post and publishing it on the website. The invention also proposes a device for automatically identifying and approving water posts and a computer-readable storage medium. The invention realizes the automatic identification and approval of water posts.

Description

Method, device and computer-readable storage medium for automatic identification and approval of water posts
Technical field
The present invention relates to the field of big data technology, and in particular to a method, a device and a computer-readable storage medium for automatic identification and approval of water posts.
Background art
With the development of the Internet, forums have sprung up like mushrooms after rain and developed rapidly. Forums cover almost every aspect of people's lives, and almost everyone can find a topical forum that interests them or that they want to learn from. Websites of all kinds, whether comprehensive portals or specialized functional sites, are also keen to open their own forums in order to promote communication among users, increase interaction and enrich site content.
Users can share personal views, publish material, hold discussions, post announcements and so on in a forum. Information published and replied to in a forum is usually referred to as a "post", a "reply", a "follow-up post" and the like.
Because posting, replying and following up in a forum are all free actions of the user, users sometimes publish "water posts". "Water post" is a general term for posts in a forum or BBS that are unimportant to the topic and carry no meaningful content.
For a professional technical forum or a work forum, when users publish water posts that have nothing to do with technology or work, such as posts that merely vent emotions, the overall impression of the forum and the accuracy of its search are affected.
Intelligent detection and approval of water posts is therefore important.
Summary of the invention
The present invention provides a method, a device and a computer-readable storage medium for automatic identification and approval of water posts, the main purpose of which is to intelligently detect and approve water posts submitted in a forum without relying on manual screening.
To achieve the above object, the method for automatic identification and approval of water posts provided by the present invention comprises:
receiving a post currently submitted for publication on a predetermined website;
matching the title of the received post against a keyword list and filtering out the keywords that appear in the title of the post;
calculating the ratio of the character count of the filtered keywords to the character count of the title;
if the calculated ratio is greater than or equal to a designated value, judging the post to be a water post and refusing to publish the post on the website;
if the calculated ratio is less than the designated value, judging that the post is not a water post and publishing the post on the website.
Optionally, the method further comprises:
obtaining all posts in the website;
selecting the titles of the posts marked as water posts, and performing word segmentation on the titles of the water posts to obtain feature words;
filtering out high-frequency words according to the frequency with which the feature words occur, and recording the high-frequency words in a high-frequency word list;
selecting keywords from the high-frequency words of the high-frequency word list according to preset rules, and recording the selected keywords in the keyword list.
Optionally, performing word segmentation on the titles of the water posts to obtain feature words means segmenting the received text according to a pre-stored dictionary using the longest-word-first principle.
Optionally, the method further comprises:
filtering the feature words, the filtering using one or both of the following modes:
mode one: filtering by part of speech, retaining nouns, verbs and adjectives;
mode two: filtering by frequency, retaining the feature words whose frequency is greater than a frequency threshold, where the frequency refers to the rate or number of times a feature word occurs in titles.
Optionally, the preset rule is to match the high-frequency words in the high-frequency word list against a specific dictionary and take the matching words as the keywords.
In addition, to achieve the above object, the present invention also provides a device for automatic identification and approval of water posts. The device comprises a memory and a processor; the memory stores an automatic water post identification and approval program that can run on the processor, and the program, when executed by the processor, realizes the following steps:
receiving a post currently submitted for publication on a predetermined website;
matching the title of the received post against a keyword list and filtering out the keywords that appear in the title of the post;
calculating the ratio of the character count of the filtered keywords to the character count of the title;
if the calculated ratio is greater than or equal to a designated value, judging the post to be a water post and refusing to publish the post on the website;
if the calculated ratio is less than the designated value, judging that the post is not a water post and publishing the post on the website.
Optionally, the automatic water post identification and approval program, when executed by the processor, also realizes the following steps:
obtaining all posts in the website;
selecting the titles of the posts marked as water posts, and performing word segmentation on the titles of the water posts to obtain feature words;
filtering out high-frequency words according to the frequency with which the feature words occur, and recording the high-frequency words in a high-frequency word list;
selecting keywords from the high-frequency words of the high-frequency word list according to preset rules, and recording the selected keywords in the keyword list.
Optionally, performing word segmentation on the titles of the water posts to obtain feature words means segmenting the received text according to a pre-stored dictionary using the longest-word-first principle.
Optionally, the automatic water post identification and approval program, when executed by the processor, also realizes the following steps:
filtering the feature words, the filtering using either or both of the following modes:
mode one: filtering by part of speech, retaining nouns, verbs and adjectives;
mode two: filtering by frequency, retaining the feature words whose frequency is greater than a frequency threshold, where the frequency refers to the rate or number of times a feature word occurs in titles.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium on which an automatic water post identification and approval program is stored. The program can be executed by one or more processors to realize the steps of the method for automatic identification and approval of water posts described above.
With the method, device and computer-readable storage medium proposed by the present invention, a post that a user has submitted for publication in a forum is identified automatically when it is received: a pre-established and continually refined keyword list is compared with the title of the post to determine the proportion of the title accounted for by keywords in the list, and on that basis the post is judged to be or not to be a water post. A post judged to be a water post is not published in the forum. The scheme of the present invention can therefore intelligently identify and approve water posts submitted in a forum without relying on manual screening.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method for automatic identification and approval of water posts provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the internal structure of the device for automatic identification and approval of water posts provided by an embodiment of the present invention;
Fig. 3 is a schematic module diagram of the automatic water post identification and approval program in the device provided by an embodiment of the present invention.
The realization of the object, the functional features and the advantages of the present invention are further described below with reference to the accompanying drawings and the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
The present invention provides a method for automatic identification and approval of water posts. Fig. 1 is a schematic flowchart of the method provided by an embodiment of the present invention. The method may be executed by a device, and the device may be realized in software and/or hardware.
In this embodiment, the method for automatic identification and approval of water posts comprises:
S1. Obtain all posts in the website.
The website of the present invention includes various forums. A forum is a group organization within online commercial services. A forum may operate a library, a chat room and so on, allowing people to exchange information in real time.
A forum is generally created by a webmaster (founder), who appoints management personnel at various levels to manage it, including forum administrators (Administrator), super moderators (Super Moderator, sometimes called "chief moderator") and moderators (Moderator, commonly nicknamed "spotted pig" or "mottled bamboo", puns on the Chinese word for moderator). A super moderator holds the second-highest permission, below only the webmaster (although the webmaster is also a super moderator, super administrator and administrator). In general a super moderator can manage all boards of the forum, whereas an ordinary moderator can manage only specific boards.
In this embodiment, the forum is a topical forum, for example a military forum, a computer enthusiast forum, an animation forum or a technology forum.
In other embodiments, the forum may also be a work forum within an enterprise, used by employees of the enterprise to publish work-related information among themselves.
A preferred embodiment of the present invention may obtain all posts from the data server of the forum. The data server stores all posts published in the forum as well as posts that users submitted but that were not published in the forum because they did not pass review.
S2. Select the titles of the posts marked as water posts.
A water post is a post published in a forum that is unimportant to the forum's topic and carries no meaningful content. For example, posts about personal mood published in a technology forum can be called water posts.
A post may have been marked as a water post by the forum's management personnel through manual review, or it may have been marked after automatic analysis by an automatic water post identification and approval program.
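Steps S1 and S2 can be pictured with a small data model. The following is only an illustrative sketch: the `Post` record, its field names and the `water_post_titles` helper are hypothetical and do not come from the patent, which does not specify how posts or their marks are stored.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical record for one post held on the forum's data server."""
    title: str
    published: bool      # False for posts that did not pass review
    is_water_post: bool  # mark set by manual review or by the program


def water_post_titles(posts):
    """Return the titles of all posts marked as water posts (step S2)."""
    return [p.title for p in posts if p.is_water_post]


posts = [
    Post("How to tune a garbage collector", published=True, is_water_post=False),
    Post("Feeling bad today", published=False, is_water_post=True),
]
titles = water_post_titles(posts)
```

Both published and unpublished posts are kept in the input, mirroring the statement that the data server stores posts that failed review as well.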
S3. Perform word segmentation on the titles of the water posts to obtain feature words.
A preferred embodiment of the present invention segments the received text according to a pre-stored dictionary using the longest-word-first principle. (The longest-word-first principle means: for a phrase T1 to be segmented, start from its first character A and find in the pre-stored dictionary the longest word X1 beginning with A; remove X1 from T1 to leave T2; then apply the same segmentation principle to T2. The result of the segmentation is "X1/X2/...".)
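The longest-word-first principle corresponds to what is commonly called forward maximum matching. Below is a minimal sketch of that technique, not the patent's actual implementation: the function name, the fallback to single characters when no dictionary word matches, and the toy English dictionary are all assumptions made for illustration.

```python
def segment_longest_first(text, dictionary):
    """Forward maximum matching: at each position take the longest dictionary word.

    Assumption: a character that starts no dictionary word becomes a
    one-character token, so the loop always advances.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


general = {"water", "post", "filter"}
specific = general | {"waterpost"}  # a domain dictionary adds a longer entry

with_general = segment_longest_first("waterpostfilter", general)
with_specific = segment_longest_first("waterpostfilter", specific)
```

With the general dictionary the text splits into `water`, `post`, `filter`; once the longer entry `waterpost` is available it is preferred, which is the same effect a product dictionary has on a product name.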
The pre-stored dictionary may include a general dictionary and specific dictionaries, such as a financial dictionary and a product dictionary. For example, the general dictionary yields feature words such as "Ping An", "honor", "macro", "life", "have", "what" and "feature", whereas the product dictionary specific to the Ping An company yields the single token "Honor Macro Life" (a product name, rendered literally). Therefore, after the preferred embodiment segments the text "What features does Ping An's Honor Macro Life have", the feature words obtained are "Ping An", "'s", "Honor Macro Life", "have", "what" and "feature".
Further, in a preferred embodiment of the present invention, the feature words obtained may also be filtered. Specifically, the filtering uses either or both of the following modes. Mode one: filter by part of speech, retaining nouns, verbs and adjectives. Mode two: filter by frequency, retaining the feature words whose frequency is greater than a frequency threshold, where the frequency refers to the rate or number of times a feature word occurs in titles.
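The two filtering modes might be sketched as follows. The patent does not name a part-of-speech tagger, so this sketch assumes the feature words arrive already tagged as `(word, tag)` pairs; the tag names `"n"`, `"v"`, `"a"` and both function names are hypothetical.

```python
from collections import Counter

# Assumed tag names for noun, verb and adjective; a real tagger may differ.
KEPT_TAGS = {"n", "v", "a"}


def filter_by_part_of_speech(tagged_words, kept_tags=KEPT_TAGS):
    """Mode one: keep only nouns, verbs and adjectives."""
    return [word for word, tag in tagged_words if tag in kept_tags]


def filter_by_frequency(words, frequency_threshold):
    """Mode two: keep words that occur more often than the threshold."""
    counts = Counter(words)
    return [word for word in words if counts[word] > frequency_threshold]


tagged = [("mood", "n"), ("the", "u"), ("bad", "a"), ("improve", "v")]
kept_by_pos = filter_by_part_of_speech(tagged)
kept_by_freq = filter_by_frequency(["mood", "today", "mood"], 1)
```

The two functions can be chained when both modes are used, since each takes and returns plain word lists once the tags have been dropped.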
S4. Filter out high-frequency words according to the frequency with which the feature words occur, and record the high-frequency words in a high-frequency word list.
A preferred embodiment of the present invention calculates the rate or number of times each feature word occurs, takes the feature words whose rate or count is greater than a preset value as high-frequency words, and records them in the high-frequency word list.
For example, consider two posts marked as water posts. Segmenting the title of one post yields the feature words "today", "mood" and "bad"; segmenting the title of the other yields "how", "mood" and "improve". The feature word "mood" then occurs twice, and if the preset value is 2, "mood" is a high-frequency word.
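Counting feature words across titles is a natural fit for `collections.Counter`. The sketch below reproduces the two-title example. Note one assumption: the text says "greater than" the preset value, yet the example treats a count of 2 against a preset value of 2 as high-frequency, so this sketch follows the example and uses an inclusive comparison. The function name is hypothetical.

```python
from collections import Counter


def high_frequency_words(segmented_titles, preset_value):
    """Return feature words whose total count reaches the preset value.

    Assumption: the comparison is inclusive (>=), matching the worked
    example rather than the "greater than" wording.
    """
    counts = Counter(word for title in segmented_titles for word in title)
    return sorted(word for word, count in counts.items() if count >= preset_value)


segmented_titles = [
    ["today", "mood", "bad"],
    ["how", "mood", "improve"],
]
frequent = high_frequency_words(segmented_titles, preset_value=2)
```

Here `frequent` contains only `mood`, the one feature word shared by both titles.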
S5. Select keywords from the high-frequency words of the high-frequency word list according to preset rules, and record the selected keywords in a keyword list.
In a preferred embodiment of the present invention, the preset rule may be manual screening, or it may be to match the high-frequency words in the high-frequency word list against a specific dictionary and take the matching words as keywords.
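The dictionary-matching variant of the preset rule amounts to keeping the high-frequency words that also appear in the specific dictionary. A minimal sketch, with a hypothetical function name and made-up dictionary contents:

```python
def select_keywords(high_frequency_words, specific_dictionary):
    """Keep the high-frequency words that also appear in the specific dictionary."""
    specific = set(specific_dictionary)  # set lookup keeps matching cheap
    return [word for word in high_frequency_words if word in specific]


# Illustrative data only; the patent does not list dictionary contents.
keyword_list = select_keywords(["mood", "today", "bad"], {"mood", "bad", "sad"})
```

The order of the high-frequency word list is preserved, so `keyword_list` is `mood` followed by `bad`.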
S6. Receive a post currently submitted for publication on the predetermined website.
The act of a user entering a forum as a member or a guest and submitting text, pictures, or content that includes music or video in that forum is called posting.
When a user submits a new post, this embodiment obtains the currently submitted post.
S7. Match the title of the received post against the keyword list and filter out the keywords that appear in the title of the post.
A preferred embodiment of the present invention matches the keywords recorded in the keyword list against the title of the received post, finds the matching keywords, and calculates the ratio of the character count of the matching keywords to the character count of the title of the post.
For example, if the title of the currently submitted post is "My mood today is too bad", matching it against the keywords in the keyword list yields the matching keywords "mood" and "too bad".
The matching keywords total 4 characters and the title of the currently submitted post has 9 characters (counted in the original Chinese), so the matching keywords account for 4/9 ≈ 44% of the title.
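The ratio calculation can be sketched as below. This is an illustration under stated assumptions rather than the patent's implementation: each matching keyword is counted once regardless of how often it occurs, overlapping keywords are not deduplicated, and the placeholder title `xxMMxxBBx` merely stands in for a 9-character title containing two 2-character keywords, as in the 4/9 example. The function name is hypothetical.

```python
def keyword_ratio(title, keyword_list):
    """Return the matching keywords and their share of the title's characters.

    Assumptions: substring matching, each keyword counted once, and an
    empty title yields a ratio of 0.0.
    """
    matched = [k for k in keyword_list if k and k in title]
    if not title:
        return matched, 0.0
    matched_chars = sum(len(k) for k in matched)
    return matched, matched_chars / len(title)


# Placeholder strings: a 9-character title with two 2-character keywords.
matched, ratio = keyword_ratio("xxMMxxBBx", ["MM", "BB", "ZZ"])
```

For this input `matched` holds `MM` and `BB`, and `ratio` is 4/9, about 44%.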
S8. Calculate the ratio of the character count of the filtered keywords to the character count of the title, and judge whether the ratio reaches the designated value.
In a preferred embodiment of the present invention, the designated value may be, for example, 30%.
If the ratio of the character count of the filtered keywords to the character count of the title of the post reaches the designated value, S9 below is executed.
If the ratio of the character count of the filtered keywords to the character count of the title of the post does not reach the designated value, S10 below is executed.
S9. Judge the post to be a water post and mark it as such, then refuse to publish the post on the website and notify the submitter of the post and/or an administrator.
In a preferred embodiment of the present invention, information such as the title of the post, the submission time, the submitter and the reason for refusing publication may all be sent to the submitter of the post and/or the website administrator.
S10. Judge that the post is not a water post and publish the post on the website.
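The branch in S8 through S10 reduces to one threshold comparison. The following sketch assumes the 30% example value as the default and uses hypothetical names (`Decision`, `review`); notification of the submitter or administrator is left out because the patent does not describe how it is delivered.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical result record for one submitted post."""
    is_water_post: bool
    publish: bool
    ratio: float


def review(title, keyword_list, designated_value=0.30):
    """Apply S8-S10: reject when the keyword share reaches the designated value."""
    matched_chars = sum(len(k) for k in keyword_list if k and k in title)
    ratio = matched_chars / len(title) if title else 0.0
    is_water = ratio >= designated_value  # "greater than or equal to"
    return Decision(is_water_post=is_water, publish=not is_water, ratio=ratio)


rejected = review("xxMMxxBBx", ["MM", "BB"])  # 4/9 reaches 30%
accepted = review("xxMMxxxxx", ["MM", "BB"])  # 2/9 stays below 30%
```

Keeping the ratio in the result makes it possible to include it in the reason for refusal sent to the submitter, should an implementation choose to do so.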
The invention also provides a device for automatic identification and approval of water posts. Fig. 2 is a schematic diagram of the internal structure of the device provided by an embodiment of the present invention.
In this embodiment, the device 1 for automatic identification and approval of water posts may be a PC (Personal Computer) or a terminal device such as a smartphone, tablet computer or portable computer. The device 1 includes at least a memory 11, a processor 12, a communication bus 13 and a network interface 14.
The memory 11 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disc and the like. In some embodiments the memory 11 may be an internal storage unit of the device 1, such as a hard disk of the device 1. In other embodiments the memory 11 may be an external storage device of the device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the device 1. The memory 11 may be used not only to store the application software installed on the device 1 and various kinds of data, such as the code of the automatic water post identification and approval program 01, but also to temporarily store data that has been output or is to be output.
In some embodiments the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor or other data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the automatic water post identification and approval program 01.
The communication bus 13 is used to realize connection and communication between these components.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is generally used to establish a communication connection between the device 1 and other electronic equipment.
Optionally, the device 1 may also include a user interface, which may include a display and an input unit such as a keyboard; the optional user interface may also include a standard wired interface and a wireless interface. Optionally, in some embodiments the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also appropriately be called a display screen or display unit, is used to show the information processed in the device 1 and to present a visual user interface.
Fig. 2 shows only the device 1 with the components 11-14 and the automatic water post identification and approval program 01. Those skilled in the art will understand that the structure shown in Fig. 2 does not limit the device 1, which may include fewer or more components than illustrated, combine certain components, or use a different arrangement of components.
In the embodiment of the device 1 shown in Fig. 2, the memory 11 stores the automatic water post identification and approval program 01; when the processor 12 executes the program 01 stored in the memory 11, the following steps are realized:
Step 1: Obtain all posts in the website.
The website of the present invention includes various forums. A forum is a group organization within online commercial services. A forum may operate a library, a chat room and so on, allowing people to exchange information in real time.
A forum is generally created by a webmaster (founder), who appoints management personnel at various levels to manage it, including forum administrators (Administrator), super moderators (Super Moderator, sometimes called "chief moderator") and moderators (Moderator, commonly nicknamed "spotted pig" or "mottled bamboo", puns on the Chinese word for moderator). A super moderator holds the second-highest permission, below only the webmaster (although the webmaster is also a super moderator, super administrator and administrator). In general a super moderator can manage all boards of the forum, whereas an ordinary moderator can manage only specific boards.
In this embodiment, the forum is a topical forum, for example a military forum, a computer enthusiast forum, an animation forum or a technology forum.
In other embodiments, the forum may also be a work forum within an enterprise, used by employees of the enterprise to publish work-related information among themselves.
A preferred embodiment of the present invention may obtain all posts from the data server of the forum. The data server stores all posts published in the forum as well as posts that users submitted but that were not published in the forum because they did not pass review.
Step 2: Select the titles of the posts marked as water posts.
A water post is a post published in a forum that is unimportant to the forum's topic and carries no meaningful content. For example, posts about personal mood published in a technology forum can be called water posts.
A post may have been marked as a water post by the forum's management personnel through manual review, or it may have been marked after automatic analysis by an automatic water post identification and approval program.
Step 3: Perform word segmentation on the titles of the water posts to obtain feature words.
A preferred embodiment of the present invention segments the received text according to a pre-stored dictionary using the longest-word-first principle. (The longest-word-first principle means: for a phrase T1 to be segmented, start from its first character A and find in the pre-stored dictionary the longest word X1 beginning with A; remove X1 from T1 to leave T2; then apply the same segmentation principle to T2. The result of the segmentation is "X1/X2/...".)
The pre-stored dictionary may include a general dictionary and specific dictionaries, such as a financial dictionary and a product dictionary. For example, the general dictionary yields feature words such as "Ping An", "honor", "macro", "life", "have", "what" and "feature", whereas the product dictionary specific to the Ping An company yields the single token "Honor Macro Life" (a product name, rendered literally). Therefore, after the preferred embodiment segments the text "What features does Ping An's Honor Macro Life have", the feature words obtained are "Ping An", "'s", "Honor Macro Life", "have", "what" and "feature".
Further, in a preferred embodiment of the present invention, the feature words obtained may also be filtered. Specifically, the filtering uses either or both of the following modes. Mode one: filter by part of speech, retaining nouns, verbs and adjectives. Mode two: filter by frequency, retaining the feature words whose frequency is greater than a frequency threshold, where the frequency refers to the rate or number of times a feature word occurs in titles.
Step 4: Filter out high-frequency words according to the frequency with which the feature words occur, and record the high-frequency words in a high-frequency word list.
A preferred embodiment of the present invention calculates the rate or number of times each feature word occurs, takes the feature words whose rate or count is greater than a preset value as high-frequency words, and records them in the high-frequency word list.
For example, consider two posts marked as water posts. Segmenting the title of one post yields the feature words "today", "mood" and "bad"; segmenting the title of the other yields "how", "mood" and "improve". The feature word "mood" then occurs twice, and if the preset value is 2, "mood" is a high-frequency word.
Step 5: Select keywords from the high-frequency words of the high-frequency word list according to preset rules, and record the selected keywords in a keyword list.
In a preferred embodiment of the present invention, the preset rule may be manual screening, or it may be to match the high-frequency words in the high-frequency word list against a specific dictionary and take the matching words as keywords.
Step 6: Receive a currently submitted post that is to be published to a predetermined website.
The act of a user entering a forum as a member or a visitor and submitting text, pictures, or content including music or video to the forum is called posting.
When a user submits a new post, the embodiment obtains the currently submitted post.
Step 7: Match the title of the received post against the keyword list, and screen out the keywords that appear in the title of the post.
The preferred embodiment of the present invention matches the keywords recorded in the keyword list against the title of the received post, finds the keywords that match, and calculates the ratio of the number of characters in the matched keywords to the number of characters in the title of the post.
For example, if the title of the currently submitted post is "my mood of today is too bad", matching against the keywords in the keyword list yields the matched keywords "mood" and "too bad".
The matched keywords contain 4 characters and the title of the currently submitted post contains 9 characters (counted in the original Chinese), so the ratio of the matched keywords to the title of the currently submitted post is 4/9 ≈ 44%.
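The character-count ratio can be checked with a small sketch. The Chinese title and keywords are an assumed back-translation chosen to match the stated counts (4 and 9 characters); substring matching, and counting each matched keyword once, are also assumptions.

```python
def match_keywords(title, keyword_list):
    """Return the keywords that occur in the title (substring match, assumed)."""
    return [keyword for keyword in keyword_list if keyword in title]


def keyword_ratio(title, keyword_list):
    """Characters in matched keywords divided by characters in the title."""
    if not title:
        return 0.0
    matched = match_keywords(title, keyword_list)
    return sum(len(keyword) for keyword in matched) / len(title)


# Assumed reconstruction of the example: a 9-character title and
# two 2-character keywords that appear in it.
title = "今天的我心情太差了"
keyword_list = ["心情", "太差", "天气"]

matched = match_keywords(title, keyword_list)
ratio = keyword_ratio(title, keyword_list)
print(matched, f"{ratio:.0%}")
```

Note that overlapping keywords would each be counted in full under this sketch, so a real keyword list may need deduplication before the ratio is meaningful.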
Step 8: Calculate the ratio of the number of characters in the screened-out keywords to the number of characters in the title, and judge whether the ratio reaches a designated value.
In the preferred embodiment of the present invention, the designated value may be, for example, 30%.
When the ratio of the number of characters in the screened-out keywords to the number of characters in the title of the post reaches the designated value, step 9 below is executed.
When the ratio of the number of characters in the screened-out keywords to the number of characters in the title of the post does not reach the designated value, step 10 below is executed.
Step 9: Judge the post to be a water note and mark it as such, refuse to publish the post on the website, and notify the submitter of the post and/or the administrator.
In the preferred embodiment of the present invention, information such as the title of the post, the submission time, the submitter and the reason for refusing publication may all be sent to the submitter of the post and/or the website administrator.
Step 10: Judge that the post is not a water note, and publish the post on the website.
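Steps 8 to 10 can be sketched as a single decision function. The `Post` class, the field names of the returned dictionary and the wording of the reason are hypothetical; only the 30% example threshold and the kinds of information in the notice come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    title: str
    submitter: str
    submitted_at: datetime


def review(post, keyword_list, designated_value=0.30):
    """Decide whether to publish a post; all field names are hypothetical."""
    matched = [k for k in keyword_list if k in post.title]
    ratio = sum(len(k) for k in matched) / len(post.title) if post.title else 0.0
    if ratio >= designated_value:
        # Step 9: mark as water note, refuse publication, prepare a notice.
        notice = {
            "title": post.title,
            "submitted_at": post.submitted_at.isoformat(),
            "submitter": post.submitter,
            "reason": f"keyword ratio {ratio:.0%} reached {designated_value:.0%}",
        }
        return {"is_water_note": True, "publish": False, "notice": notice}
    # Step 10: not a water note, publish.
    return {"is_water_note": False, "publish": True, "notice": None}


keywords = ["心情", "太差"]  # assumed keyword list
when = datetime(2018, 9, 19, 12, 0)
rejected = review(Post("今天的我心情太差了", "user_a", when), keywords)
accepted = review(Post("保险产品条款咨询", "user_b", when), keywords)
print(rejected["publish"], accepted["publish"])
```

Returning the notice as data, rather than sending it inside the function, keeps the decision separate from whatever delivery channel the website uses.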
Optionally, in other embodiments, the automatic water note identification and approval program may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present invention. A module in the sense of the present invention is a series of computer program instruction segments capable of performing a specific function, and is used to describe the execution of the automatic water note identification and approval program in the automatic water note identification and approval device.
For example, Fig. 3 is a schematic diagram of the program modules of the automatic water note identification and approval program in one embodiment of the automatic water note identification and approval device of the present invention. In this embodiment, the program may be divided into a water note standard formulation module 10, a water note identification module 20 and a water note approval module 30. Illustratively:
The water note standard formulation module 10 is configured to: obtain all posts on the website; select the titles of the posts marked as water notes and perform word segmentation on those titles to obtain feature words; select high-frequency words according to the frequency with which the feature words occur and record them in a high-frequency word list; and select keywords from the high-frequency words in the high-frequency word list according to a preset rule and record the selected keywords in a keyword list.
The water note identification module 20 is configured to: receive a currently submitted post that is to be published to a predetermined website; match the title of the received post against a keyword list and screen out the keywords that appear in the title of the post; calculate the ratio of the number of characters in the screened-out keywords to the number of characters in the title; if the calculated ratio is greater than or equal to a designated value, judge the post to be a water note and refuse to publish the post on the website; and if the calculated ratio is less than the designated value, judge that the post is not a water note and publish the post on the website.
The water note approval module 30 is configured to: refuse to publish on the website the posts marked as water notes and notify the submitter of the post and/or the administrator, and publish on the website the posts judged not to be water notes.
The functions or operation steps implemented when program modules such as the water note standard formulation module 10, the water note identification module 20 and the water note approval module 30 are executed are substantially the same as those of the above embodiments and are not repeated here.
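One possible way to lay out the three modules in code is sketched below. The class names, method names and data shapes are hypothetical, whitespace splitting stands in for the dictionary-based segmentation, and the publish/refuse handling is placed entirely in the approval class for simplicity.

```python
from collections import Counter


class StandardFormulationModule:
    """Module 10: builds the keyword list from posts marked as water notes."""

    def __init__(self, segment, preset_value, specific_dictionary):
        self.segment = segment
        self.preset_value = preset_value
        self.specific_dictionary = specific_dictionary

    def build_keyword_list(self, posts):
        titles = [p["title"] for p in posts if p["is_water_note"]]
        counts = Counter(w for t in titles for w in self.segment(t))
        high_freq = [w for w, c in counts.items() if c >= self.preset_value]
        return [w for w in high_freq if w in self.specific_dictionary]


class IdentificationModule:
    """Module 20: judges a submitted title against the keyword list."""

    def __init__(self, keyword_list, designated_value):
        self.keyword_list = keyword_list
        self.designated_value = designated_value

    def is_water_note(self, title):
        matched = [k for k in self.keyword_list if k in title]
        ratio = sum(len(k) for k in matched) / len(title) if title else 0.0
        return ratio >= self.designated_value


class ApprovalModule:
    """Module 30: publishes accepted posts and records refusal notices."""

    def __init__(self):
        self.published = []
        self.notices = []

    def handle(self, post, is_water_note):
        if is_water_note:
            self.notices.append((post["submitter"], post["title"]))
        else:
            self.published.append(post["title"])


# Illustrative history and submissions; str.split is a stand-in segmenter.
history = [
    {"title": "mood bad today", "is_water_note": True},
    {"title": "how mood improve", "is_water_note": True},
    {"title": "release notes v2", "is_water_note": False},
]
formulation = StandardFormulationModule(str.split, 2, {"mood", "bad"})
keywords = formulation.build_keyword_list(history)

identification = IdentificationModule(keywords, designated_value=0.30)
approval = ApprovalModule()
for post in [{"title": "mood mood", "submitter": "user_a"},
             {"title": "quarterly maintenance schedule", "submitter": "user_b"}]:
    approval.handle(post, identification.is_water_note(post["title"]))
print(keywords, approval.published, approval.notices)
```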
In addition, an embodiment of the present invention further provides a computer-readable storage medium on which an automatic water note identification and approval program is stored. The program can be executed by one or more processors to implement the following operations:
obtaining all posts on a website, selecting the titles of the posts marked as water notes, performing word segmentation on the titles of the water notes to obtain feature words, selecting high-frequency words according to the frequency with which the feature words occur and recording the high-frequency words in a high-frequency word list, and selecting keywords from the high-frequency words in the high-frequency word list according to a preset rule and recording them in a keyword list;
receiving a currently submitted post, matching the title of the currently submitted post against the keywords, screening out the keywords that appear in the title of the currently submitted post, and determining the ratio of the screened-out keywords to the title of the currently submitted post; when the ratio reaches a designated value, marking the currently submitted post as a water note, or, when the ratio does not reach the designated value, judging that the currently submitted post is not a water note; and
refusing to publish on the website the posts marked as water notes and notifying the submitter of the post and/or the administrator, and publishing on the website the posts judged not to be water notes.
The specific embodiments of the computer-readable storage medium of the present invention are essentially the same as the embodiments of the automatic water note identification and approval device and method described above, and are not repeated here.
It should be noted that the serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments. The terms "include", "comprise" and any other variants thereof herein are intended to cover non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, device, article or method that includes that element.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A method for automatic identification and approval of water notes, characterized in that the method comprises:
receiving a currently submitted post that is to be published to a predetermined website;
matching the title of the received post against a keyword list, and screening out the keywords that appear in the title of the post;
calculating the ratio of the number of characters in the screened-out keywords to the number of characters in the title;
if the calculated ratio is greater than or equal to a designated value, judging the post to be a water note and refusing to publish the post on the website;
if the calculated ratio is less than the designated value, judging that the post is not a water note and publishing the post on the website.
2. The method for automatic identification and approval of water notes according to claim 1, characterized in that the method further comprises:
obtaining all posts on the website;
selecting the titles of the posts marked as water notes, and performing word segmentation on the titles of the water notes to obtain feature words;
selecting high-frequency words according to the frequency with which the feature words occur, and recording the high-frequency words in a high-frequency word list;
selecting keywords from the high-frequency words in the high-frequency word list according to a preset rule, and recording the selected keywords in the keyword list.
3. The method for automatic identification and approval of water notes according to claim 2, characterized in that performing word segmentation on the titles of the water notes to obtain feature words comprises segmenting the received keyword according to a prestored dictionary using the longest-word-first principle.
4. The method for automatic identification and approval of water notes according to claim 2, characterized in that the method further comprises:
filtering the feature words, the filtering using one or both of the following modes:
mode one: filtering by part of speech, retaining nouns, verbs and adjectives;
mode two: filtering by frequency, retaining the feature words whose frequency is greater than a frequency threshold, where the frequency is the frequency or number of times a feature word occurs in the titles.
5. The method for automatic identification and approval of water notes according to claim 2, characterized in that the preset rule is to match the high-frequency words in the high-frequency word list against a specific dictionary and take the words that match as the keywords.
6. A device for automatic identification and approval of water notes, characterized in that the device comprises a memory and a processor, the memory storing an automatic water note identification and approval program that can run on the processor, and the program, when executed by the processor, implements the following steps:
receiving a currently submitted post that is to be published to a predetermined website;
matching the title of the received post against a keyword list, and screening out the keywords that appear in the title of the post;
calculating the ratio of the number of characters in the screened-out keywords to the number of characters in the title;
if the calculated ratio is greater than or equal to a designated value, judging the post to be a water note and refusing to publish the post on the website;
if the calculated ratio is less than the designated value, judging that the post is not a water note and publishing the post on the website.
7. The device for automatic identification and approval of water notes according to claim 6, characterized in that the automatic water note identification and approval program, when executed by the processor, further implements the following steps:
obtaining all posts on the website;
selecting the titles of the posts marked as water notes, and performing word segmentation on the titles of the water notes to obtain feature words;
selecting high-frequency words according to the frequency with which the feature words occur, and recording the high-frequency words in a high-frequency word list;
selecting keywords from the high-frequency words in the high-frequency word list according to a preset rule, and recording the selected keywords in the keyword list.
8. The device for automatic identification and approval of water notes according to claim 7, characterized in that performing word segmentation on the titles of the water notes to obtain feature words comprises segmenting the received keyword according to a prestored dictionary using the longest-word-first principle.
9. The device for automatic identification and approval of water notes according to claim 7, characterized in that the automatic water note identification and approval program, when executed by the processor, further implements the following steps:
filtering the feature words, the filtering using either or both of the following modes:
mode one: filtering by part of speech, retaining nouns, verbs and adjectives;
mode two: filtering by frequency, retaining the feature words whose frequency is greater than a frequency threshold, where the frequency is the frequency or number of times a feature word occurs in the titles.
10. A computer-readable storage medium, characterized in that an automatic water note identification and approval program is stored on the computer-readable storage medium, and the program can be executed by one or more processors to implement the steps of the method for automatic identification and approval of water notes according to any one of claims 1 to 5.
CN201811095297.9A 2018-09-19 2018-09-19 Water note identifies automatically and the measures and procedures for the examination and approval, device and computer readable storage medium Withdrawn CN109492216A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811095297.9A CN109492216A (en) 2018-09-19 2018-09-19 Water note identifies automatically and the measures and procedures for the examination and approval, device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811095297.9A CN109492216A (en) 2018-09-19 2018-09-19 Water note identifies automatically and the measures and procedures for the examination and approval, device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN109492216A true CN109492216A (en) 2019-03-19

Family

ID=65690535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811095297.9A Withdrawn CN109492216A (en) 2018-09-19 2018-09-19 Water note identifies automatically and the measures and procedures for the examination and approval, device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109492216A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163558A (en) * 2019-04-16 2019-08-23 平安科技(深圳)有限公司 The examination and approval document measures and procedures for the examination and approval, device, computer equipment and storage medium
CN110163558B (en) * 2019-04-16 2024-05-07 平安科技(深圳)有限公司 Approval method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20190319