CN113011165B - Method, device, equipment and medium for identifying blocked keywords - Google Patents
- Publication number
- CN113011165B (CN202110296033.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
Abstract
The embodiments of this specification disclose a method, a device, equipment and a medium for identifying blocked keywords. The scheme comprises: acquiring the number of times a target keyword is used; acquiring the number of reflows of the target keyword; determining a reflow rate of the target keyword based on the number of reflows and the number of times used; judging whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result; and, when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate, determining the target keyword to be a blocked keyword.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a medium for identifying blocked keywords.
Background
In the prior art, when a user sends information to other users through a client, the shared information sometimes cannot be delivered to the other users, or the other users cannot use it. In other words, the shared information is blocked, which inconveniences users.
Therefore, how to quickly and accurately discover content in shared information that may be blocked is a technical problem to be solved.
Disclosure of Invention
The embodiment of the specification provides a method, a device, equipment and a medium for identifying blocked keywords, which are used for discovering the blocked keywords possibly existing in shared information and improving the usability of the shared information.
In order to solve the above technical problems, the embodiments of the present specification are implemented as follows:
The method for identifying blocked keywords provided by the embodiments of this specification comprises:
acquiring the number of times a target keyword is used, where the number of times used is obtained by counting the pieces of sharing information that are sent to a first terminal set and contain the target keyword;
acquiring the number of reflows of the target keyword, where the number of reflows is obtained by counting the access requests initiated by a second terminal set based on sharing information containing the target keyword;
determining a reflow rate of the target keyword based on the number of reflows and the number of times used;
judging whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result;
when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate, determining the target keyword to be a blocked keyword.
The device for identifying blocked keywords provided by the embodiments of this specification comprises:
a first data acquisition module, used for acquiring the number of times a target keyword is used, where the number of times used is obtained by counting the pieces of sharing information that are sent to a first terminal set and contain the target keyword;
a second data acquisition module, used for acquiring the number of reflows of the target keyword, where the number of reflows is obtained by counting the access requests initiated by a second terminal set based on sharing information containing the target keyword;
a reflow rate calculation module, used for determining a reflow rate of the target keyword based on the number of reflows and the number of times used;
a judging module, used for judging whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result;
a result determining module, used for determining the target keyword to be a blocked keyword when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate.
The equipment for identifying blocked keywords provided by the embodiments of this specification comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions enable the at least one processor to perform:
acquiring the number of times a target keyword is used, where the number of times used is obtained by counting the pieces of sharing information that are sent to a first terminal set and contain the target keyword;
acquiring the number of reflows of the target keyword, where the number of reflows is obtained by counting the access requests initiated by a second terminal set based on sharing information containing the target keyword;
determining a reflow rate of the target keyword based on the number of reflows and the number of times used;
judging whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result;
when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate, determining the target keyword to be a blocked keyword.
The embodiments of this specification provide a computer-readable medium storing computer-readable instructions that are executable by a processor to implement the method for identifying blocked keywords.
One embodiment of this specification achieves the following advantageous effects: blocked keywords are determined from the reflow rate of the target keywords, so blocked keywords in sharing information can be identified effectively. The identified blocked keywords can then be avoided when sharing information is generated, so that the sharing information is shared successfully, its availability is improved, and resource waste is reduced.
Drawings
To illustrate the embodiments of this specification or the technical solutions in the prior art more clearly, the drawings needed to describe them are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of the overall architecture of a method for identifying blocked keywords in an actual application scenario according to an embodiment of this specification;
Fig. 2 is a flowchart of a method for identifying blocked keywords according to an embodiment of this specification;
Fig. 3 is a swimlane diagram of a method for identifying blocked keywords according to an embodiment of this specification;
Fig. 4 is a schematic structural diagram of a device for identifying blocked keywords according to an embodiment of this specification;
Fig. 5 is a schematic structural diagram of equipment for identifying blocked keywords according to an embodiment of this specification.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of one or more embodiments of the present specification more clear, the technical solutions of one or more embodiments of the present specification will be clearly and completely described below in connection with specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without undue burden, are intended to be within the scope of one or more embodiments herein.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
In order to solve the drawbacks of the prior art, the present solution provides the following embodiments:
Fig. 1 is a schematic diagram of the overall architecture of a method for identifying blocked keywords in an actual application scenario according to an embodiment of this specification. As shown in Fig. 1, the scheme mainly involves a server 1, a first terminal 2 and a second terminal 3. The server 1 may count the sharing information sent to the first terminal 2 and the keywords contained in it, and so determine the number of times each keyword is used. The first terminal 2 may send the sharing information to the second terminal 3, and the second terminal 3 may initiate an access request based on the received sharing information. The server 1 may count the access requests initiated by the second terminal 3 to determine the number of reflows of the keyword, and then determine the reflow rate of the keyword from the number of reflows and the number of times used. A low reflow rate indicates that sharing information containing the keyword cannot be used normally by the second terminal 3 (for example, it cannot be displayed or cannot be copied on the second terminal), so the keyword is a blocked keyword. Once a blocked keyword has been determined, it can be avoided when sharing information is generated later, so that the sharing information reaches the second terminal and can be used there normally. This improves the availability of the sharing information and reduces the resources wasted on generating unusable sharing information.
Next, a method for identifying blocked keywords provided for the embodiments of the specification will be specifically described with reference to the accompanying drawings:
Fig. 2 is a flowchart of a method for identifying blocked keywords according to an embodiment of this specification. From a program perspective, the flow may be executed by a program installed on an application server or by an application client.
As shown in fig. 2, the process may include the steps of:
Step 202: acquiring the number of times the target keyword is used; the number of times used is obtained by counting the pieces of sharing information that are sent to the first terminal set and contain the target keyword.
In this embodiment, the first terminal set may include at least one terminal, and a terminal may be a mobile phone, a computer, a smart watch, or a similar device. The server may record the sharing information generated for the first terminal and count how many pieces were generated. Each piece of sharing information may include at least one keyword, and a keyword may be a single character, a word, or a phrase.
In practical applications, a keyword that needs to be identified may be determined as a target keyword; the target keywords may be all or some of the keywords included in the sharing information. The number of times the target keyword is used can be understood as the number of times the generated sharing information contains the target keyword.
Step 204: acquiring the number of reflows of the target keyword; the number of reflows is obtained by counting the access requests initiated by the second terminal set based on sharing information containing the target keyword.
The second terminal set may include at least one second terminal, which is a terminal that initiates an access request based on the sharing information. In practical applications, the first terminal may send the generated sharing information to the second terminal, and the second terminal initiates the access request based on the received sharing information. As another implementation, the server that generates the sharing information may send it directly to the second terminal, which may likewise initiate the access request based on it. The specific way in which the sharing information is transmitted is not limited here.
In the embodiments of this specification, the server may count the access requests initiated by the second terminal based on sharing information containing the target keyword, and from this determine the number of reflows of the target keyword. If the second terminal can initiate an access request based on a piece of sharing information, the keywords in that sharing information are usable and not blocked. The number of reflows of the target keyword is the number of times the target keyword is contained in sharing information from which an access request was initiated.
Step 206: determining the reflow rate of the target keyword based on the number of reflows and the number of times used.
Step 208: judging whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result.
Step 210: when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate, determining the target keyword to be a blocked keyword.
In the embodiments of this specification, a reflow rate less than or equal to the preset reflow rate indicates that the second terminals initiated few access requests based on sharing information containing the target keyword. At least some of the second terminals in the second terminal set that received the sharing information could not initiate an access request based on it; the target keyword impairs the use of the sharing information, and users of those second terminals cannot obtain the shared content from it.
It should be understood that, in the methods of one or more embodiments of this specification, the order of some steps may be interchanged as needed, and some steps may be omitted or deleted.
In the embodiments of this specification, blocked keywords are determined from the reflow rate of the target keywords, so blocked keywords in sharing information can be identified effectively. The identified blocked keywords can then be avoided when sharing information is generated, so that the sharing information is shared successfully, its availability is improved, and resource waste is reduced.
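Steps 206 to 210 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `KeywordStats`, `reflow_rate` and `is_blocked`, and the threshold value used in the usage note, are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass
class KeywordStats:
    """Counts for one target keyword (hypothetical structure)."""
    used: int      # pieces of sharing information sent that contain the keyword
    reflowed: int  # access requests initiated from such sharing information


def reflow_rate(stats: KeywordStats) -> float:
    """Step 206: ratio of the number of reflows to the number of times used."""
    if stats.used == 0:
        # Assumption: a keyword that was never used has no defined rate.
        raise ValueError("keyword has never been used")
    return stats.reflowed / stats.used


def is_blocked(stats: KeywordStats, preset_rate: float) -> bool:
    """Steps 208 and 210: blocked when the rate is at or below the preset rate."""
    return reflow_rate(stats) <= preset_rate
```

With an illustrative preset rate of 0.05, a keyword used 100 times with 2 reflows would be flagged, while one used 100 times with 40 reflows would not. How a keyword with zero uses should be treated is not stated in the specification; raising an error is only one possible choice.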
The examples of the present specification also provide some specific embodiments of the method based on the method of fig. 2, which is described below.
Optionally, before the number of times the target keyword is used is acquired in step 202, the method may further include:
acquiring a sharing request sent by a first terminal in the first terminal set, where the sharing request is a request for sharing a page to be shared;
generating the sharing information based on the link information of the page to be shared.
Acquiring the number of times the target keyword is used may then specifically include:
determining the number of times the target keyword is used based on the sharing information.
In practical applications, the sharing information may be information, generated in response to the sharing request, for sharing the page to be shared; it may be generated based on the link information of that page.
The server may also send the generated sharing information to the first terminal, so that the first terminal sends it on to the second terminal. For example, when a user browsing a page in an application or on the web wants to share that page with friends, the user can click a "Share" button on the page. The terminal generates a sharing request from this operation and sends it to the server, and the server generates sharing information according to the request. The user can then share the sharing information with friends as a short message in instant messaging, or publish it to a preset location where friends can see it.
In the embodiments of this specification, the server may record the generated sharing information, determine the keywords it contains, and count the number of times each keyword is used. The target keywords are the keywords to be checked for blocking. They may be at least some of the keywords in the sharing information, which makes the identification more targeted, or all of the keywords contained in the sharing information.
In practical applications, a user of the second terminal who receives the sharing information generally accesses the page to be shared within a preset time period. To improve the efficiency of keyword identification, before the number of reflows of the target keyword is acquired in step 204, the method may further include:
acquiring the access request initiated, within a preset time period, by a second terminal in the second terminal set based on the sharing information, where the preset time period is a period of preset duration starting at the moment the sharing information is generated.
Acquiring the number of reflows of the target keyword may then specifically include:
determining, based on the access request, the sharing information corresponding to the access request;
determining the number of reflows of the target keyword based on the sharing information corresponding to the access request.
In practical applications, the preset duration may be set according to actual requirements and is not specifically limited here. For example, the preset time period may be the 10 minutes after the sharing information is generated: if the server receives an access request generated from the sharing information within those 10 minutes, it may determine that the sharing information has reflowed and increment the number of reflows of the keywords contained in it.
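The time-window check described above might look like the following sketch. The function name `within_preset_period` is hypothetical, the 10-minute value is the example from the text, and treating the end of the window as inclusive is an assumption that the specification does not settle.

```python
from datetime import datetime, timedelta

# Example duration from the text; the real value is set by actual requirements.
PRESET_DURATION = timedelta(minutes=10)


def within_preset_period(generated_at: datetime,
                         accessed_at: datetime,
                         duration: timedelta = PRESET_DURATION) -> bool:
    """Return True if the access request falls inside the preset time period.

    The period starts when the sharing information is generated and lasts
    for `duration`. Both boundaries are treated as inclusive (assumption).
    """
    return generated_at <= accessed_at <= generated_at + duration
```

Only access requests for which this check succeeds would be counted toward the number of reflows; later requests are ignored in this sketch.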
In practical applications, the number of times used and the number of reflows of the target keyword may be counted piece by piece as sharing information is generated and access requests are received, or counted over the sharing information generated and the access requests received within a designated time period.
When the number of times used and the number of reflows of the target keyword are counted for the first time, determining the number of times the target keyword is used based on the sharing information, as in the above steps, may specifically include:
determining, based on the sharing information, the number of times the target keyword is contained in the sharing information;
taking the number of times the target keyword is contained in the sharing information as the number of times the target keyword is used.
Determining the number of reflows of the target keyword based on the sharing information corresponding to the access request may specifically include:
determining, based on the sharing information corresponding to the access request, the number of times the target keyword is contained in that sharing information;
taking the number of times the target keyword is contained in the sharing information corresponding to the access request as the number of reflows of the target keyword.
Assuming that the server receives 3 sharing requests of the first terminal, respectively generating 1 piece of sharing information for each sharing request, and generating 3 pieces of sharing information in total, wherein the sharing information 1 comprises a target keyword a, a target keyword b and a target keyword c; the sharing information 2 comprises a target keyword a, a target keyword b and a target keyword d; the sharing information 3 includes a target keyword a, a target keyword c, and a target keyword e. According to the generated sharing information, it can be determined that the number of times of use of the target keyword a is 3, the number of times of use of the target keyword b is 2, the number of times of use of the target keyword c is 2, the number of times of use of the target keyword d is 1, and the number of times of use of the target keyword e is 1.
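The usage counts in this example can be reproduced with a short sketch. The dictionary layout and the name `count_used` are illustrative assumptions; the keyword sets are the ones given in the example.

```python
from collections import Counter

# Sharing information 1 to 3 and the target keywords each one contains.
shared_messages = {
    1: ["a", "b", "c"],
    2: ["a", "b", "d"],
    3: ["a", "c", "e"],
}


def count_used(messages):
    """Count, per keyword, how many pieces of sharing information contain it."""
    used = Counter()
    for keywords in messages.values():
        used.update(keywords)
    return used


used = count_used(shared_messages)
# a is used 3 times, b and c twice each, d and e once each.
```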
Assume that, within 10 minutes after generating the sharing information, the server receives one access request based on sharing information 1 and one based on sharing information 2, but none based on sharing information 3. The target keywords included in sharing information 1 and sharing information 2 have then reflowed. From the sharing information corresponding to the received access requests, it can be determined that the number of reflows is 2 for target keyword a, 2 for target keyword b, 1 for target keyword c, 1 for target keyword d, and 0 for target keyword e.
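The reflow counts follow from the same data. In the sketch below, the list `accessed_ids` and the function `count_reflowed` are hypothetical names, and exactly one access request per accessed piece of sharing information is assumed, which matches the counts stated in the example.

```python
from collections import Counter

shared_messages = {
    1: ["a", "b", "c"],
    2: ["a", "b", "d"],
    3: ["a", "c", "e"],
}

# Access requests received within the preset period, by sharing information id.
accessed_ids = [1, 2]


def count_reflowed(messages, accessed):
    """Count, per keyword, the access requests whose sharing information contains it."""
    # Start every known keyword at 0 so keywords that never reflow are reported.
    reflowed = Counter({kw: 0 for kws in messages.values() for kw in kws})
    for message_id in accessed:
        reflowed.update(messages[message_id])
    return reflowed


reflowed = count_reflowed(shared_messages, accessed_ids)
# a: 2, b: 2, c: 1, d: 1, e: 0
```

Dividing these by the usage counts from the example gives reflow rates of 2/3 for a, 1 for b, 1/2 for c, 1 for d, and 0 for e, so keyword e would be the first candidate for a blocked keyword.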
According to the embodiments of this specification, the number of times used and the number of reflows of the target keyword can be counted piece by piece on a stream computing platform as sharing information is generated and access requests are received. Because both are obtained in real time, blocked keywords can be found promptly.
In the embodiments of this specification, the counts may also incorporate historical statistics. In that case, determining the number of times the target keyword is used based on the sharing information may specifically include:
determining, based on the sharing information, the current number of uses of the target keyword contained in the sharing information;
adding the current number of uses to the historical number of uses of the target keyword to obtain the number of times the target keyword is used, where the historical number of uses is the total number of times the target keyword was contained in the historical sharing information generated before the sharing information was generated.
Determining the number of reflows of the target keyword based on the sharing information corresponding to the access request may specifically include:
determining, based on the sharing information corresponding to the access request, the current number of reflows of the target keyword contained in that sharing information;
adding the current number of reflows to the historical number of reflows of the target keyword to obtain the number of reflows of the target keyword, where the historical number of reflows is the total number of times the target keyword was contained in the historical sharing information corresponding to the historical access requests acquired before the access request was acquired.
Continuing the above example, assume that before the sharing information is generated, the historical number of uses in the previously generated historical sharing information is 20 for target keyword a, 15 for target keyword b, 10 for target keyword c, 12 for target keyword d, and 8 for target keyword e. After sharing information 1, which includes target keywords a, b and c, is generated, the number of uses is updated to 21 for target keyword a, 16 for target keyword b, and 11 for target keyword c. The numbers of uses of target keywords d and e are updated in the same way, and the updated values are the numbers of times the target keywords are used. Similarly, after the access requests based on sharing information 1 and sharing information 2 are received, the number of reflows of each target keyword is updated from the previously counted value. For instance, if the previously counted historical number of reflows of target keyword a is 15, it is updated to 16 after the access request based on sharing information 1 is received.
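The incremental update with historical counts can be sketched as follows. The helper name `add_current` is hypothetical; the historical values are the ones assumed in the example above.

```python
from collections import Counter


def add_current(history: Counter, keywords) -> Counter:
    """Return history plus one count for each keyword in a new piece of sharing information."""
    updated = history.copy()  # leave the historical totals untouched
    updated.update(keywords)
    return updated


# Historical numbers of uses before sharing information 1 is generated.
historical_used = Counter({"a": 20, "b": 15, "c": 10, "d": 12, "e": 8})

# Sharing information 1 contains target keywords a, b and c.
used = add_current(historical_used, ["a", "b", "c"])
# a: 20 + 1 = 21, b: 15 + 1 = 16, c: 10 + 1 = 11; d and e are unchanged.
```

The same helper could be applied to reflow counts: starting from a historical value of 15 for keyword a, an access request based on sharing information 1 would bring it to 16.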
In the embodiments of this specification, determining the reflow rate of the target keyword based on the number of reflows and the number of times used in step 206 may specifically include:
calculating the ratio of the number of reflows to the number of times used;
determining the reflow rate of the target keyword based on the ratio.
In practical applications, the reflow rate of the target keyword may be the ratio of the number of reflows of the target keyword to the number of times it is used, or that ratio expressed as a percentage; the specific form of expression is not limited in the embodiments of this specification.
In the embodiments of this specification, one piece of sharing information may be shared among multiple users, so there may be multiple access requests for a single piece of sharing information. The number of reflows of the target keyword may therefore exceed the number of times it is used, and the reflow rate of the target keyword may be greater than or equal to 1. For example, the user of the first terminal may share one piece of sharing information with the users of several second terminals, each of whom may send an access request based on it.
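Written as a formula, with notation introduced here only for illustration (the specification itself does not use symbols): let $N_{\text{used}}(k)$ be the number of times keyword $k$ is used, $N_{\text{reflow}}(k)$ its number of reflows, and $r_0$ the preset reflow rate.

```latex
r(k) = \frac{N_{\text{reflow}}(k)}{N_{\text{used}}(k)}, \qquad N_{\text{used}}(k) > 0,
\qquad\text{and } k \text{ is determined to be blocked when } r(k) \le r_0 .
```

If, for instance, one piece of sharing information containing $k$ is generated once and three second terminals each initiate an access request from it, then $N_{\text{used}}(k) = 1$, $N_{\text{reflow}}(k) = 3$ and $r(k) = 3$, which illustrates why the rate is not bounded above by 1.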
In the embodiments of this specification, the sharing information may include information describing the content of the page to be shared, so that a user who receives it can learn the main service content of that page. The sharing information may include text information and code information. The text information describes the content of the page to be shared; the code information is associated with the link of the page to be shared, so that the page can be determined from it.
The text information may include characters, such as Chinese characters (simplified or traditional), foreign-language text, and "Martian script". The code information may include at least one of digits, letters, and other characters.
The access request in the embodiments of this specification may include the code information of the sharing information corresponding to the access request. The server may present the page to be shared to the second terminal according to the code information, so that the user of the second terminal can obtain the information in the shared page.
After the sharing information is generated, a correspondence between the code information in the sharing information and the link information of the page to be shared can be established. When presenting the page to be shared to the user, the server then does not need to parse all of the sharing information corresponding to the access request; the code information alone is enough to determine the page to be shared, which improves the efficiency of processing access requests.
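The correspondence between code information and link information amounts to a lookup table. The sketch below is illustrative only: the in-memory dictionary, the function names `register_share` and `resolve`, the use of a random token as the code, and the example URL are all assumptions, since the specification does not say how codes are generated or stored.

```python
import secrets
from typing import Dict, Optional

# Correspondence between code information and the link of the page to be shared.
code_to_link: Dict[str, str] = {}


def register_share(link: str) -> str:
    """Create code information for a page link and record the correspondence."""
    code = secrets.token_urlsafe(6)  # assumption: a short random token serves as the code
    code_to_link[code] = link
    return code


def resolve(code: str) -> Optional[str]:
    """Look up the page to be shared from the code carried by an access request."""
    return code_to_link.get(code)
```

A call such as `resolve(register_share("https://example.com/page/123"))` returns the registered link, and an unknown code returns `None`, so the server never needs to parse the text part of the sharing information to serve the page.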
The method for identifying blocked keywords provided in the embodiment of the present specification may further include:
establishing, for each piece of sharing information, a first correspondence between the text information and the code information;
the determining, based on the access request, of the sharing information corresponding to the access request may specifically include:
determining the code information contained in the access request;
determining the text information corresponding to the code information according to the first correspondence;
the determining of the reflow count of the target keyword based on the sharing information corresponding to the access request specifically includes:
determining, from the text information corresponding to the code information, the number of times the target keyword appears in the text information;
and determining the reflow count of the target keyword according to the number of times the target keyword appears in the text information.
In the embodiment of the specification, after the sharing information is generated, a correspondence between the text information and the code information in the sharing information can be established. When an access request containing the code information is received, the text information corresponding to the code information can be looked up, the keywords in that text information can be determined, and the number of times the target keyword appears in the text information can be counted, from which the reflow count of the target keyword can be determined.
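The lookup described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the in-memory dictionary, the function names, and the sample code "a1B2" are all assumptions made for the example.

```python
from collections import Counter

# Assumed in-memory store for the first correspondence:
# code information -> text information.
first_correspondence = {}
reflow_counts = Counter()


def register_sharing_info(code, text):
    """Record the first correspondence for one piece of sharing information."""
    first_correspondence[code] = text


def handle_access_request(code, target_keywords):
    """Look up the text by its code and add keyword occurrences to the reflow counts."""
    text = first_correspondence.get(code)
    if text is None:
        return  # unknown code: no sharing information to attribute the request to
    for keyword in target_keywords:
        reflow_counts[keyword] += text.count(keyword)


# Hypothetical sharing information and one access request that carries its code.
register_sharing_info("a1B2", "limited offer: copy this coupon to open the offer")
handle_access_request("a1B2", ["offer", "coupon"])
```

Here each access request adds the number of occurrences of a keyword in the matched text, so "offer" gains 2 and "coupon" gains 1 from the single request.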
As another implementation manner, in the embodiment of the present specification, a correspondence between the target keyword and the code information may be established in advance, and the reflow count of the target keyword may be determined according to that correspondence. Specifically, the sharing information may include text information and code information; the text information may include the target keyword; and the access request may include the code information of the sharing information corresponding to the access request.
The method may further comprise:
establishing, for each piece of sharing information, a second correspondence between the target keyword and the code information;
the determining of the sharing information corresponding to the access request based on the access request specifically includes:
determining the code information contained in the access request;
the determining of the reflow count of the target keyword based on the sharing information corresponding to the access request specifically includes:
determining the target keyword corresponding to the code information according to the second correspondence;
determining, from the number of access requests acquired, the number of occurrences of the code information corresponding to the target keyword;
and determining the reflow count of the target keyword based on the number of occurrences of the code information corresponding to the target keyword.
In the embodiment of the specification, after the sharing information is generated, a correspondence between the keywords contained in the sharing information and the code information in the sharing information can be established. When an access request is received, the target keyword corresponding to the code information contained in the access request can be determined, and the reflow count of the target keyword can then be tallied.
In practical application, when one piece of sharing information contains the same target keyword two or more times, marks can be added to the occurrences according to their word order in the sharing information, and a correspondence between the marked target keywords and the code information can be established. The reflow count of the target keyword can then be determined according to that correspondence.
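A rough sketch of the second correspondence, including the word-order marks for repeated keywords, might look like this. The "keyword#n" mark format, the function names, and the sample data are assumptions for illustration; the source does not specify how marks are encoded.

```python
from collections import Counter

# Assumed store for the second correspondence:
# code information -> marked target keywords, in word order.
second_correspondence = {}
reflow_counts = Counter()


def mark_keywords(keywords_in_order):
    """Tag each occurrence by word order, e.g. 'gift#1', 'sale#1', 'gift#2'."""
    seen = Counter()
    marked = []
    for keyword in keywords_in_order:
        seen[keyword] += 1
        marked.append(f"{keyword}#{seen[keyword]}")
    return marked


def register_sharing_info(code, keywords_in_order):
    """Establish the second correspondence for one piece of sharing information."""
    second_correspondence[code] = mark_keywords(keywords_in_order)


def handle_access_request(code):
    """Each access request counts once per marked occurrence of a keyword."""
    for marked in second_correspondence.get(code, []):
        keyword = marked.rsplit("#", 1)[0]
        reflow_counts[keyword] += 1


register_sharing_info("x9", ["gift", "sale", "gift"])
handle_access_request("x9")
handle_access_request("x9")
```

Unlike the first approach, the text information never has to be re-scanned when a request arrives; the keywords are resolved directly from the code.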
In practical applications, after user A shares the sharing information with user B, the sharing information may not be displayed in the terminal interface of user B, or user B may be unable to copy the sharing information in the terminal interface, so that user B cannot obtain the shared content corresponding to the sharing information. The blocked keywords in the embodiments of the present specification may include keywords for which the copy operation or the display operation is prohibited.
In order to facilitate statistics and the generation of sharing information, the method for identifying blocked keywords provided in the embodiments of the present disclosure may further include:
determining a keyword lexicon, the keyword lexicon comprising the target keywords;
the generating of the sharing information may specifically include:
acquiring at least one target keyword from the keyword lexicon;
and generating the sharing information based on the target keyword.
In practical application, after the target keyword is determined to be a blocked keyword in step 210, the method may further include:
deleting or marking, in the keyword lexicon, the keywords determined to be blocked.
In the embodiment of the specification, the blocked keywords can be deleted from or marked in the keyword lexicon, so that they are not used when sharing information is generated later.
In this embodiment of the present disclosure, the sharing information may be generated using only keywords that are not marked, which may include:
acquiring at least one keyword from the keyword lexicon;
judging whether the keyword is a marked keyword;
if the keyword is an unmarked keyword, determining the keyword as the target keyword;
and generating the sharing information based on the target keyword.
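The marking and filtering steps above could be sketched as below. The class name, the message template, and the sample keywords are hypothetical; the source only states that blocked keywords are deleted or marked and that unmarked keywords are used for generation.

```python
import random


class KeywordLexicon:
    """Keyword lexicon in which blocked keywords are marked rather than used."""

    def __init__(self, keywords):
        # keyword -> True if it has been marked as blocked
        self._marked = {keyword: False for keyword in keywords}

    def mark_blocked(self, keyword):
        if keyword in self._marked:
            self._marked[keyword] = True

    def delete(self, keyword):
        self._marked.pop(keyword, None)

    def unmarked(self):
        return [kw for kw, marked in self._marked.items() if not marked]


def generate_sharing_info(lexicon, code):
    """Build sharing information from an unmarked keyword (assumed template)."""
    candidates = lexicon.unmarked()
    if not candidates:
        raise ValueError("no usable keyword left in the lexicon")
    target_keyword = random.choice(candidates)
    return f"Copy this {target_keyword} and open the app: {code}"


lexicon = KeywordLexicon(["coupon", "gift"])
lexicon.mark_blocked("coupon")  # e.g. after "coupon" was identified as blocked
info = generate_sharing_info(lexicon, "a1B2")
```

Marking instead of deleting keeps a record of which keywords were blocked, which is useful if blocked keywords are later analyzed as a group.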
In the embodiment of the specification, a keyword determined to be blocked can be replaced by a replacement word, and the sharing information can be generated using the replacement word. The method for identifying blocked keywords provided in the embodiment of the present specification may include:
determining a replacement word lexicon;
after the target keyword is determined to be a blocked keyword, the method further comprises:
determining, in the replacement word lexicon, a replacement word corresponding to the target keyword;
and generating new sharing information based on the replacement word.
The replacement word lexicon contains replacement words corresponding to the target keywords. A replacement word may be a synonym or homonym of the target keyword, a Martian-script form containing components or radicals of the target keyword, or a spliced word composed of letters, pinyin, English, Korean, Japanese and the like.
In practical application, when a replacement word in the replacement word lexicon is itself determined to be a blocked keyword, it can be deleted from the replacement word lexicon. When the reflow rate of a replacement word is high, the corresponding keyword in the keyword lexicon can be replaced by the replacement word, so that reflowable sharing information can later be generated from the keywords in the keyword lexicon, improving the usability of the sharing information.
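One way to picture the replacement word lexicon is the sketch below. The sample words, the function names, and the 0.5 threshold used to decide that a reflow rate is "high" are assumptions; the source does not give a concrete threshold.

```python
from typing import Optional

# Assumed replacement word lexicon: target keyword -> candidate replacement words.
replacement_lexicon = {"coupon": ["voucher", "c0upon", "quan"]}


def pick_replacement(target, blocked) -> Optional[str]:
    """Return the first replacement word that is not itself known to be blocked."""
    for word in replacement_lexicon.get(target, []):
        if word not in blocked:
            return word
    return None


def drop_blocked_replacement(target, word):
    """Delete a replacement word that has been determined to be blocked."""
    words = replacement_lexicon.get(target, [])
    if word in words:
        words.remove(word)


def promote_if_high(keyword_lexicon, target, replacement, rate, high_rate=0.5):
    """Swap the keyword for its replacement when the replacement reflows well.

    high_rate is an illustrative threshold, not a value from the source.
    """
    if rate >= high_rate and target in keyword_lexicon:
        keyword_lexicon[keyword_lexicon.index(target)] = replacement


keyword_lexicon = ["coupon", "gift"]
chosen = pick_replacement("coupon", blocked={"voucher"})
promote_if_high(keyword_lexicon, "coupon", chosen, rate=0.8)
drop_blocked_replacement("coupon", "voucher")
```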
In the embodiment of the specification, whether the target keyword is a blocked keyword can be determined from the generated sharing information and the access requests based on that sharing information, both of which are produced as users use the server. The keywords can therefore be identified using data generated during users' normal use of the server, without adding extra hardware equipment, which reduces cost.
To further confirm that the target keyword is a blocked keyword, after the target keyword is determined to be a blocked keyword in step 210 in the embodiment of the present specification, the method may further include:
sending the sharing information containing the target keyword to a real-machine test terminal;
and if the sharing information is not displayed in the real-machine test terminal, confirming the target keyword as a blocked keyword.
The real-machine test terminal may be a terminal carrying a target application program, for example, a terminal running a certain instant messaging application program. In practical application, the real-machine test terminal can be operated manually, or by a software program that simulates human operation. When the sharing information cannot be displayed in the real-machine test terminal, it can be determined that the sharing information contains a blocked keyword.
For example, the real-machine test terminal may comprise a first test terminal and a second test terminal. A user of the target application program in the first test terminal publishes the sharing information to a sharing interface of the target application program, such as a friend circle or a microblog. If the sharing information cannot be displayed in the sharing interface of the target application program on the second test terminal, it can be determined that the sharing information is blocked, and the target keyword can be confirmed as a blocked keyword.
As another implementation manner, after the target keyword is determined to be a blocked keyword in step 210 in the embodiment of the present specification, the method may further include:
sending the sharing information containing the target keyword to a real-machine test terminal;
acquiring display page information generated by the real-machine test terminal in response to a selection operation on the sharing information;
if the display page information does not contain an option for executing a preset operation on the sharing information, confirming the target keyword as a blocked keyword; the preset operation includes a copy operation.
For example, suppose the first test terminal sends the sharing information to the second test terminal, and the second test terminal can receive and display the sharing information but cannot copy it: when the sharing information is long-pressed, the displayed editing options do not include a copy option. The user then cannot copy the sharing information and cannot open the page to be shared from the copied sharing information. In that case it can be determined that the sharing information is blocked, and the target keyword can be confirmed as a blocked keyword.
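The two real-machine checks can be summarized in a small decision function. This is only a sketch: how the test terminal reports what it displayed is not described in the source, so the `displayed` flag and the list of menu options are hypothetical observations supplied by whoever (or whatever script) operates the terminal.

```python
def confirm_blocked(displayed, menu_options, preset_operation="copy"):
    """Confirm a suspected blocked keyword from real-machine observations.

    displayed: whether the sharing information appeared on the second test terminal.
    menu_options: options shown after selecting (e.g. long-pressing) the message.
    """
    if not displayed:
        return True  # display operation is prohibited
    return preset_operation not in menu_options  # copy operation is prohibited


# Hypothetical observations from three test runs.
not_shown = confirm_blocked(displayed=False, menu_options=[])
no_copy = confirm_blocked(displayed=True, menu_options=["forward", "delete"])
normal = confirm_blocked(displayed=True, menu_options=["copy", "forward"])
```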
In order to identify blocked keywords more comprehensively, in the embodiment of the specification a machine learning model may be used to analyze the features of keywords already determined to be blocked, predict the probability that each keyword in the keyword lexicon is a blocked keyword, and determine keywords whose probability is greater than or equal to a set probability as blocked keywords. The machine learning model may likewise be used to predict, from the features of the blocked keywords, the probability that keywords whose reflow rate is greater than the preset reflow rate are blocked keywords, and to determine keywords whose probability is greater than or equal to a preset probability as blocked keywords.
In order to describe the method for identifying blocked keywords provided in the embodiments of the present disclosure more clearly, fig. 3 is a swim-lane diagram of the method. As shown in fig. 3, the method may include a data statistics stage and a judgment stage, and specifically may include:
Step 302: the first terminal receives a sharing operation of the user and sends a sharing request to the server, requesting sharing information for the page to be shared.
For example, when a user of the first terminal sees a commodity, a news item, or a promotional activity and wants to share the service in the page with friends, the user can click a "share" button in the page to send the server a sharing request for that service. The server can generate sharing information for the service in the page based on the request; the sharing information can take the form of a password, text, and the like.
Step 304: the server receives the sharing request of the first terminal and generates sharing information based on the link information of the page to be shared.
In practical application, the first terminal may send the sharing information to the friend's second terminal in an instant messaging chat. The first terminal may also publish the sharing information to a sharing space that the friend can access, and the user of the second terminal may obtain the sharing information from that sharing space.
Step 306: the number of target keywords contained in the generated sharing information is determined, giving the number of times the target keyword has been used.
In practical application, the number of times the target keyword has been used can be counted over the pieces of sharing information generated in a specified time period, for example the sharing information generated in the last 7 days. The count can also be maintained as sharing information is generated, with a stream computing platform counting each piece of sharing information one by one.
Step 308: the second terminal receives the sharing information and initiates an access request based on the sharing information.
For example, the first terminal sends the sharing information to the second terminal through an instant messaging tool; the user of the second terminal copies the sharing information, enters the copied sharing information in the corresponding application program, and thereby initiates an access request for the page to be shared.
Step 310: the server obtains the reflow count of the target keyword according to the received access requests.
The server can determine, from an access request, the previously generated sharing information to which the request corresponds, then determine the number of target keywords contained in that sharing information and update the reflow count of the target keyword. When the server receives the access request, it can be determined that the corresponding sharing information is reflowable, and the reflow count of the target keyword contained in the sharing information is increased accordingly.
In practical application, a reflow time can be set. If an access request corresponding to the sharing information is received within a preset time period after the sharing information is generated, the sharing information can be determined to be reflowable. If no access request corresponding to the sharing information is received within that period, the sharing information is determined to be blocked and non-reflowable, and the reflow count of the target keyword contained in the sharing information is not increased. The reflow count of the target keyword can also be tallied in real time on a stream computing platform.
Step 312: the reflow rate of the target keyword is determined based on the reflow count and the number of times used. Specifically, the reflow rate may be the ratio of the reflow count to the number of times used.
Step 314: whether the reflow rate is less than or equal to a preset reflow rate is judged, giving a judgment result.
Step 316: when the judgment result shows that the reflow rate is less than or equal to the preset reflow rate, the target keyword is determined to be a blocked keyword.
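Steps 312 to 316 reduce to a ratio and a threshold comparison, sketched below. The function names and the sample numbers (1000 uses, a preset reflow rate of 0.01) are illustrative assumptions, not values from the source.

```python
def reflow_rate(reflow_count, used_count):
    """Step 312: ratio of the reflow count to the number of times used."""
    if used_count <= 0:
        raise ValueError("the keyword has not been used, so no rate is defined")
    return reflow_count / used_count


def is_blocked(reflow_count, used_count, preset_rate):
    """Steps 314 and 316: blocked when the rate is at or below the preset rate."""
    return reflow_rate(reflow_count, used_count) <= preset_rate


# Hypothetical counts: both keywords were used 1000 times.
rarely_returns = is_blocked(reflow_count=3, used_count=1000, preset_rate=0.01)
often_returns = is_blocked(reflow_count=250, used_count=1000, preset_rate=0.01)
```

With these numbers the first keyword has a reflow rate of 0.003, which is at or below 0.01, so it is treated as blocked; the second has a rate of 0.25 and is not.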
Based on the same idea, the embodiment of the specification also provides a device corresponding to the method. Fig. 4 is a schematic structural diagram of an apparatus for identifying blocked keywords according to an embodiment of the present disclosure. As shown in fig. 4, the apparatus may include:
a first data obtaining module 402, configured to obtain the number of times the target keyword has been used; the number of times used is obtained by counting the pieces of sharing information that are sent to the first terminal set and contain the target keyword;
a second data obtaining module 404, configured to obtain the reflow count of the target keyword; the reflow count is obtained by counting the acquired access requests initiated by the second terminal set based on the sharing information containing the target keyword;
a reflow rate calculation module 406, configured to determine the reflow rate of the target keyword based on the reflow count and the number of times used;
a judging module 408, configured to judge whether the reflow rate is less than or equal to a preset reflow rate, so as to obtain a judgment result;
and a result determining module 410, configured to determine the target keyword as a blocked keyword when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate.
Based on the device of fig. 4, the embodiments of the present specification also provide some specific implementations of the device, which are described below.
Optionally, the device for identifying blocked keywords provided in the embodiment of the present specification may further include:
an information generation module, configured to acquire a sharing request sent by a first terminal in the first terminal set, the sharing request being a request for sharing the page to be shared;
and to generate the sharing information based on the link information of the page to be shared;
the first data acquisition module is specifically configured to:
determine the number of times the target keyword has been used based on the sharing information.
Optionally, the apparatus may further include:
a request acquisition module, configured to acquire, within a preset time period, the access request initiated by a second terminal in the second terminal set based on the sharing information; the preset time period is a period of preset duration that starts at the moment the sharing information is generated;
the second data acquisition module may be specifically configured to:
determine the sharing information corresponding to the access request based on the access request;
and determine the reflow count of the target keyword based on the sharing information corresponding to the access request.
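The preset time period used by the request acquisition module can be sketched as a simple window check. The 24-hour duration and the function name are assumptions for illustration; the source only says the period has a preset duration and starts when the sharing information is generated.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed preset duration; the source does not give a concrete value.
REFLOW_WINDOW = timedelta(hours=24)


def is_reflowable(generated_at: datetime,
                  accessed_at: Optional[datetime],
                  window: timedelta = REFLOW_WINDOW) -> bool:
    """True if an access request arrived within the window after generation."""
    if accessed_at is None:
        return False  # no access request was ever received
    return generated_at <= accessed_at <= generated_at + window


generated = datetime(2021, 3, 19, 9, 0)
within = is_reflowable(generated, datetime(2021, 3, 19, 18, 30))
too_late = is_reflowable(generated, datetime(2021, 3, 21, 9, 0))
never = is_reflowable(generated, None)
```

Only requests for which the check returns True would add to the reflow count of the keywords in that piece of sharing information.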
Optionally, the first data acquisition module may be further specifically configured to:
determine, based on the sharing information, the current use count of the target keyword contained in the sharing information;
add the current use count to the historical use count of the target keyword to obtain the number of times the target keyword has been used; the historical use count of the target keyword is the total number of times the target keyword was contained in the historical sharing information generated before the sharing information was generated;
the second data acquisition module may be further configured to:
determine, based on the sharing information corresponding to the access request, the current reflow count of the target keyword contained in that sharing information;
add the current reflow count to the historical reflow count of the target keyword to obtain the reflow count of the target keyword; the historical reflow count of the target keyword is the total number of times the target keyword was contained in the historical sharing information corresponding to the historical access requests acquired before the access request was acquired.
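The accumulation of current and historical counts can be illustrated as follows. The dataclass, its method names, and the starting totals are hypothetical; the sketch only shows that each new piece of sharing information or access request adds its own keyword occurrences to a running total.

```python
from dataclasses import dataclass


@dataclass
class KeywordCounters:
    """Running totals for one target keyword (historical counts so far)."""
    used: int = 0
    reflows: int = 0

    def add_sharing_info(self, text, keyword):
        """Current use count plus historical use count."""
        self.used += text.count(keyword)
        return self.used

    def add_access_request(self, text, keyword):
        """Current reflow count plus historical reflow count."""
        self.reflows += text.count(keyword)
        return self.reflows


# Assumed history: the keyword was used 10 times and reflowed 4 times so far.
counters = KeywordCounters(used=10, reflows=4)
text = "gift for you, gift inside"
used_total = counters.add_sharing_info(text, "gift")
reflow_total = counters.add_access_request(text, "gift")
```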
Based on the same idea, the embodiment of the specification also provides equipment corresponding to the method. Fig. 5 is a schematic structural diagram of an apparatus for identifying blocked keywords according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus 500 may include:
at least one processor 510; and
a memory 530 communicatively coupled to the at least one processor; wherein
the memory 530 stores instructions 520 executable by the at least one processor 510, the instructions being executed by the at least one processor 510 to enable the at least one processor 510 to:
obtain the number of times the target keyword has been used; the number of times used is obtained by counting the pieces of sharing information that are sent to the first terminal set and contain the target keyword;
obtain the reflow count of the target keyword; the reflow count is obtained by counting the acquired access requests initiated by the second terminal set based on the sharing information containing the target keyword;
determine the reflow rate of the target keyword based on the reflow count and the number of times used;
judge whether the reflow rate is less than or equal to a preset reflow rate, to obtain a judgment result;
and, when the judgment result shows that the reflow rate is less than or equal to the preset reflow rate, determine the target keyword as a blocked keyword.
Based on the same idea, the embodiment of the specification also provides a computer readable medium corresponding to the method. The computer readable medium has stored thereon computer readable instructions executable by a processor to perform the method of identifying blocked keywords described above.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus shown in fig. 5 is substantially similar to the method embodiment, so its description is relatively brief; for relevant details, refer to the corresponding parts of the method embodiment.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer programs the device to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by briefly logic-programming the method flow in one of the hardware description languages above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, application specific integrated circuits (Application Specific Integrated Circuit, ASIC), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. The means for performing the various functions may even be regarded as both software modules implementing the method and structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in the same piece or pieces of software and/or hardware when implementing the present application.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, or improvement that comes within the spirit and principles of the application is to be included in the scope of the claims of the present application.
Claims (21)
1. A method of identifying blocked keywords, comprising:
acquiring the number of times a target keyword has been used; the number of times used is obtained by counting the pieces of sharing information that are sent to a first terminal set and contain the target keyword; the target keyword is a keyword in text information that is included in the sharing information and describes a page to be shared;
obtaining the number of reflows of the target keyword; the number of reflows is obtained by counting the acquired access requests initiated by a second terminal set based on the sharing information containing the target keyword;
determining a reflow rate of the target keyword based on the number of reflows and the number of times used;
judging whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result;
and when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate, determining the target keyword to be a blocked keyword.
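By way of non-limiting illustration, the steps of claim 1 might be sketched as follows. The function names, the parameter names, and the example threshold of 0.01 are assumptions introduced here and do not appear in the claims; the sketch takes the reflow rate to be the plain ratio described in claim 6, and its handling of a zero used count is likewise an assumption.

```python
def reflow_rate(reflow_count: int, used_count: int) -> float:
    """Ratio of access requests received to pieces of sharing information sent."""
    if used_count <= 0:
        # The claims do not say what happens here; raising is an assumption.
        raise ValueError("used_count must be positive")
    return reflow_count / used_count


def is_blocked_keyword(reflow_count: int, used_count: int,
                       preset_rate: float = 0.01) -> bool:
    """Return True when the reflow rate is <= the preset reflow rate."""
    return reflow_rate(reflow_count, used_count) <= preset_rate
```

Under these assumptions, a keyword sent in 1000 pieces of sharing information that produced no access requests would be flagged, while one that produced 50 access requests would not.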
2. The method of claim 1, further comprising, prior to the acquiring the number of times the target keyword has been used:
acquiring a sharing request sent by a first terminal in the first terminal set; the sharing request is a request for sharing the page to be shared;
generating the sharing information based on the link information of the page to be shared;
the acquiring the number of times the target keyword has been used specifically includes:
determining the number of times the target keyword has been used based on the sharing information.
3. The method of claim 2, further comprising, prior to the obtaining the number of reflows of the target keyword:
acquiring the access request initiated by a second terminal in the second terminal set based on the sharing information within a preset time period; the preset time period is a time period of preset duration that starts at the moment the sharing information is generated;
the obtaining the number of reflows of the target keyword specifically includes:
determining the sharing information corresponding to the access request based on the access request;
and determining the number of reflows of the target keyword based on the sharing information corresponding to the access request.
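The preset time period of claim 3 can be illustrated with a short, hedged sketch. The `AccessRequest` type, its fields, and the choice to treat both ends of the window as inclusive are assumptions made for illustration only; the claim states only that the period starts when the sharing information is generated and has a preset duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessRequest:
    code: str              # hypothetical identifier linking back to the sharing information
    received_at: datetime  # time at which the access request was acquired


def requests_in_window(requests, generated_at, duration):
    """Keep only requests received within `duration` after `generated_at`."""
    end = generated_at + duration
    return [r for r in requests if generated_at <= r.received_at <= end]
```

Only the requests returned by `requests_in_window` would then be counted toward the number of reflows.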
4. The method of claim 3, wherein the determining, based on the sharing information, the number of times the target keyword has been used specifically includes:
determining, based on the sharing information, the number of times the target keyword is contained in the sharing information;
determining that number as the number of times the target keyword has been used;
the determining the number of reflows of the target keyword based on the sharing information corresponding to the access request specifically includes:
determining, based on the sharing information corresponding to the access request, the number of times the target keyword is contained in that sharing information;
and determining that number as the number of reflows of the target keyword.
5. The method of claim 3, wherein the determining, based on the sharing information, the number of times the target keyword has been used specifically includes:
determining, based on the sharing information, the current number of uses of the target keyword contained in the sharing information;
adding the current number of uses to the historical number of uses of the target keyword to obtain the number of times the target keyword has been used; the historical number of uses is the total number of times the target keyword was contained in historical sharing information generated before the sharing information was generated;
the determining the number of reflows of the target keyword based on the sharing information corresponding to the access request specifically includes:
determining, based on the sharing information corresponding to the access request, the current number of reflows of the target keyword contained in that sharing information;
adding the current number of reflows to the historical number of reflows of the target keyword to obtain the number of reflows of the target keyword; the historical number of reflows is the total number of times the target keyword was contained in the historical sharing information corresponding to historical access requests acquired before the access request was acquired.
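The accumulation described in claim 5 amounts to adding per-keyword current counts to per-keyword historical counts. A minimal sketch follows, assuming the counts are held in `collections.Counter` objects keyed by keyword; the function name `accumulate` and the sample keywords are hypothetical.

```python
from collections import Counter


def accumulate(history: Counter, current: Counter) -> Counter:
    """Return historical counts plus current counts, leaving the inputs unchanged."""
    total = Counter(history)
    total.update(current)  # adds counts keyword by keyword
    return total
```

The same helper could serve for both the number of uses and the number of reflows, since each is a per-keyword running total.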
6. The method according to claim 1, wherein the determining the reflow rate of the target keyword based on the number of reflows and the number of times used specifically includes:
calculating the ratio of the number of reflows to the number of times used;
and determining the reflow rate of the target keyword based on the ratio.
7. The method of claim 3, wherein the sharing information comprises text information and code information; the access request comprises the code information in the sharing information corresponding to the access request;
the method further comprises:
establishing, for each piece of sharing information, a first correspondence between the text information and the code information;
the determining the sharing information corresponding to the access request based on the access request specifically includes:
determining the code information contained in the access request based on the access request;
determining the text information corresponding to the code information according to the first correspondence;
the determining the number of reflows of the target keyword based on the sharing information corresponding to the access request specifically includes:
determining, based on the text information corresponding to the code information, the number of times the target keyword is contained in the text information;
and determining the number of reflows of the target keyword according to the number of times the target keyword is contained in the text information.
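The first correspondence of claim 7 can be pictured as a lookup table from code information to text information. The sketch below is illustrative only: the dictionary `code_to_text`, the function name, and the decision to count a keyword once per access request (even if it appears several times in the text) are assumptions not fixed by the claim.

```python
from collections import Counter


def reflow_counts_from_text(access_codes, code_to_text, keywords):
    """Count, per keyword, the access requests whose sharing text contains it."""
    counts = Counter()
    for code in access_codes:
        text = code_to_text.get(code)
        if text is None:
            continue  # code not issued by this system; skipped by assumption
        for keyword in keywords:
            if keyword in text:
                counts[keyword] += 1
    return counts
```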
8. The method of claim 3, wherein the sharing information comprises text information and code information; the text information comprises the target keyword; the access request comprises the code information in the sharing information corresponding to the access request;
the method further comprises:
establishing, for each piece of sharing information, a second correspondence between the target keyword and the code information;
the determining the sharing information corresponding to the access request based on the access request specifically includes:
determining the code information contained in the access request based on the access request;
the determining the number of reflows of the target keyword based on the sharing information corresponding to the access request specifically includes:
determining the target keyword corresponding to the code information according to the second correspondence;
determining, according to the number of access requests acquired, the number of occurrences of the code information corresponding to the target keyword;
and determining the number of reflows of the target keyword based on the number of occurrences of the code information corresponding to the target keyword.
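Claim 8 differs from claim 7 in mapping code information directly to the target keyword, so the text need not be re-scanned. A hedged sketch, assuming one keyword per code and a hypothetical dictionary `code_to_keyword` standing in for the second correspondence:

```python
from collections import Counter


def reflow_counts_from_codes(access_codes, code_to_keyword):
    """Tally access requests per code, then fold the tallies onto keywords."""
    code_hits = Counter(access_codes)
    counts = Counter()
    for code, hits in code_hits.items():
        keyword = code_to_keyword.get(code)
        if keyword is not None:  # unknown codes are ignored by assumption
            counts[keyword] += hits
    return counts
```

Several codes may map to the same keyword, in which case their tallies are summed.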
9. The method of claim 1, wherein the blocked keywords comprise keywords on which copy operations or display operations are prohibited.
10. The method of claim 2, the method further comprising:
determining a keyword lexicon; the keyword lexicon comprises the target keyword;
the generating the sharing information specifically includes:
acquiring at least one target keyword from the keyword lexicon;
and generating the sharing information based on the target keyword.
11. The method of claim 10, further comprising, after the determining the target keyword as a blocked keyword:
deleting or marking, in the keyword lexicon, the keyword determined to be blocked.
12. The method of claim 11, wherein the generating the sharing information specifically includes:
acquiring at least one keyword from the keyword lexicon;
judging whether the keyword is a marked keyword;
if the keyword is an unmarked keyword, determining the keyword as the target keyword;
and generating the sharing information based on the target keyword.
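As a non-limiting illustration of claim 12, the selection of an unmarked keyword might look like the sketch below. The function names, the use of a set for the marked keywords, the random choice among candidates, and the text layout "keyword followed by link" are all assumptions; the claim requires only that marked keywords are not used as target keywords.

```python
import random


def pick_target_keyword(lexicon, marked, rng=random):
    """Return a keyword from `lexicon` that is not in `marked`, or None."""
    candidates = [kw for kw in lexicon if kw not in marked]
    if not candidates:
        return None
    return rng.choice(candidates)


def generate_sharing_text(keyword, link):
    """Build sharing text from a keyword and a link (layout is hypothetical)."""
    return f"{keyword} {link}"
```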
13. The method of claim 1, the method further comprising:
determining a replacement word lexicon;
after the target keyword is determined to be a blocked keyword, the method further comprises:
determining, in the replacement word lexicon, a replacement word corresponding to the target keyword;
and generating new sharing information based on the replacement word.
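The replacement step of claim 13 can be sketched as a dictionary lookup followed by substitution. The dictionary `replacements`, the function name, and the decision to return `None` when no replacement word exists are assumptions for illustration; the claim does not specify how the new sharing information is assembled.

```python
def regenerate_sharing_text(text, blocked_keyword, replacements):
    """Swap a blocked keyword for its replacement word, if one is defined."""
    replacement = replacements.get(blocked_keyword)
    if replacement is None:
        return None  # no replacement word available in the lexicon
    return text.replace(blocked_keyword, replacement)
```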
14. The method of claim 1, further comprising, after the determining the target keyword as a blocked keyword:
sending the sharing information containing the target keyword to a real-device test terminal;
and if the sharing information is not displayed on the real-device test terminal, determining the target keyword to be a keyword confirmed to be blocked.
15. The method of claim 1, wherein the determining the target keyword as a blocked keyword further comprises:
sending the sharing information containing the target keyword to a real-device test terminal;
acquiring display page information generated by the real-device test terminal based on a selection operation on the sharing information;
if the display page information does not contain information enabling a preset operation to be performed on the sharing information, determining the target keyword to be a keyword confirmed to be blocked; the preset operation includes a copy operation.
16. An apparatus for identifying blocked keywords, comprising:
a first data acquisition module, configured to acquire the number of times a target keyword has been used; the number of times used is obtained by counting the pieces of sharing information that are sent to a first terminal set and contain the target keyword; the target keyword is a keyword in text information that is included in the sharing information and describes a page to be shared;
a second data acquisition module, configured to obtain the number of reflows of the target keyword; the number of reflows is obtained by counting the acquired access requests initiated by a second terminal set based on the sharing information containing the target keyword;
a reflow rate calculation module, configured to determine a reflow rate of the target keyword based on the number of reflows and the number of times used;
a judging module, configured to judge whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result;
and a result determining module, configured to determine the target keyword to be a blocked keyword when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate.
17. The apparatus of claim 16, the apparatus further comprising:
an information generation module, configured to acquire a sharing request sent by a first terminal in the first terminal set, the sharing request being a request for sharing the page to be shared;
and to generate the sharing information based on the link information of the page to be shared;
the first data acquisition module is specifically configured to:
determine the number of times the target keyword has been used based on the sharing information.
18. The apparatus of claim 16, the apparatus further comprising:
a request acquisition module, configured to acquire the access request initiated by a second terminal in the second terminal set based on the sharing information within a preset time period; the preset time period is a time period of preset duration that starts at the moment the sharing information is generated;
the second data acquisition module is specifically configured to:
determine the sharing information corresponding to the access request based on the access request;
and determine the number of reflows of the target keyword based on the sharing information corresponding to the access request.
19. The apparatus of claim 18, wherein the first data acquisition module is specifically further configured to:
determine, based on the sharing information, the current number of uses of the target keyword contained in the sharing information;
add the current number of uses to the historical number of uses of the target keyword to obtain the number of times the target keyword has been used; the historical number of uses is the total number of times the target keyword was contained in historical sharing information generated before the sharing information was generated;
the second data acquisition module is specifically further configured to:
determine, based on the sharing information corresponding to the access request, the current number of reflows of the target keyword contained in that sharing information;
add the current number of reflows to the historical number of reflows of the target keyword to obtain the number of reflows of the target keyword; the historical number of reflows is the total number of times the target keyword was contained in the historical sharing information corresponding to historical access requests acquired before the access request was acquired.
20. An apparatus for identifying blocked keywords, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
acquire the number of times a target keyword has been used; the number of times used is obtained by counting the pieces of sharing information that are sent to a first terminal set and contain the target keyword; the target keyword is a keyword in text information that is included in the sharing information and describes a page to be shared;
obtain the number of reflows of the target keyword; the number of reflows is obtained by counting the acquired access requests initiated by a second terminal set based on the sharing information containing the target keyword;
determine a reflow rate of the target keyword based on the number of reflows and the number of times used;
judge whether the reflow rate is less than or equal to a preset reflow rate to obtain a judgment result;
and when the judgment result indicates that the reflow rate is less than or equal to the preset reflow rate, determine the target keyword to be a blocked keyword.
21. A computer readable medium having stored thereon computer readable instructions executable by a processor to implement the method of identifying blocked keywords of any one of claims 1 to 15.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110296033.5A CN113011165B (en) | 2021-03-19 | 2021-03-19 | Method, device, equipment and medium for identifying blocked keywords |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113011165A CN113011165A (en) | 2021-06-22 |
CN113011165B CN113011165B (en) | 2024-06-07 |
Family
ID=76403229
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110296033.5A Active CN113011165B (en) | 2021-03-19 | 2021-03-19 | Method, device, equipment and medium for identifying blocked keywords |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113011165B (en) |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103593444A (en) * | 2013-11-15 | 2014-02-19 | 北京国双科技有限公司 | Network keyword recognition processing method and device |
KR101453790B1 (en) * | 2013-08-16 | 2014-10-23 | 김수현 | Optimizing system for frequency of advertisement exposure using advertisement efficiency media |
CN104346337A (en) * | 2013-07-24 | 2015-02-11 | 腾讯科技(深圳)有限公司 | Method and device for intercepting junk information |
CN104462242A (en) * | 2014-11-18 | 2015-03-25 | 北京国双科技有限公司 | Webpage reflow quantity counting method and device |
CN105574203A (en) * | 2016-01-07 | 2016-05-11 | 沈文策 | Information storage method and device |
CN106528716A (en) * | 2016-10-26 | 2017-03-22 | 腾讯音乐娱乐(深圳)有限公司 | Multimedia search content recommendation method and apparatus |
CN106611007A (en) * | 2015-10-26 | 2017-05-03 | 北京国双科技有限公司 | Detection method and device of reprinted reflow data |
CN108471376A (en) * | 2017-02-23 | 2018-08-31 | 腾讯科技(深圳)有限公司 | Data processing method, apparatus and system |
CN109118243A (en) * | 2017-06-26 | 2019-01-01 | 阿里巴巴集团控股有限公司 | A kind of product is shared, useful evaluation identifies, method for pushing and server |
CN109218411A (en) * | 2018-08-22 | 2019-01-15 | 中国平安人寿保险股份有限公司 | Data processing method and device, computer readable storage medium, electronic equipment |
CN110011896A (en) * | 2018-11-06 | 2019-07-12 | 阿里巴巴集团控股有限公司 | A kind of data processing method and device, a kind of calculating equipment and storage medium |
CN110113315A (en) * | 2019-04-12 | 2019-08-09 | 平安科技(深圳)有限公司 | A kind of processing method and equipment of business datum |
CN110347900A (en) * | 2019-07-10 | 2019-10-18 | 腾讯科技(深圳)有限公司 | A kind of importance calculation method of keyword, device, server and medium |
CN110808899A (en) * | 2019-10-12 | 2020-02-18 | 北京达佳互联信息技术有限公司 | Content sharing method, device, client, server and system |
CN111756644A (en) * | 2020-06-30 | 2020-10-09 | 深圳壹账通智能科技有限公司 | Hot spot current limiting method, system, equipment and storage medium |
CN111767259A (en) * | 2020-06-29 | 2020-10-13 | 北京字节跳动网络技术有限公司 | Content sharing method and device, readable medium and electronic equipment |
CN112417248A (en) * | 2020-11-24 | 2021-02-26 | 百度在线网络技术(北京)有限公司 | Recommendation method, device, model, equipment and storage medium for addressing keywords |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7281042B2 (en) * | 2003-08-15 | 2007-10-09 | Oversee.Net | Internet domain keyword optimization |
US8458179B2 (en) * | 2007-11-29 | 2013-06-04 | Palo Alto Research Center Incorporated | Augmenting privacy policies with inference detection |
Non-Patent Citations (1)
Title |
---|
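Research on online public opinion based on the LDA topic model; Li Zhenpeng; Huang Shuai; Journal of Systems Science and Mathematical Sciences (No. 03); full text * |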
Also Published As
Publication number | Publication date |
---|---|
CN113011165A (en) | 2021-06-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 2023-01-16. Address after: 200120 Floor 15, No. 447, Nanquan North Road, China (Shanghai) Pilot Free Trade Zone, Pudong New Area, Shanghai. Applicant after: Alipay.com Co.,Ltd. Address before: 310000 801-11 section B, 8th floor, 556 Xixi Road, Xihu District, Hangzhou City, Zhejiang Province. Applicant before: Alipay (Hangzhou) Information Technology Co.,Ltd. |
GR01 | Patent grant | |