Keyword identification method and system based on user browsing and searching behaviors
Technical Field
The invention relates to the technical field of data identification, in particular to a keyword identification method and system based on user browsing and searching behaviors.
Background
With economic development, living standards are rising day by day. To satisfy their needs for knowledge and intellectual enrichment, people browse or search for the information they want through a web browser.
However, malicious users spread harmful information over the network for improper purposes, and users who come into contact with that information propagate it further, which seriously affects the security and health of the network. At present, network security and health are maintained mainly through review and reporting, which performs poorly, and no related technology is available that actively searches and investigates through keyword identification in order to maintain browser-based network security and health.
Disclosure of Invention
Technical problem to be solved
In view of the deficiencies of the prior art, the invention provides a keyword identification method and system based on user browsing and searching behaviors, which solve the technical problems set out in the Background.
Technical solution
To achieve the above purpose, the invention is realized through the following technical solution:
In a first aspect, a keyword recognition system based on user browsing and searching behaviors comprises:
a control terminal, which is the main control end of the system and is used for issuing execution commands;
an authentication module, used for authenticating the identity information with which the user side logs in to the system;
an analysis module, used for analyzing the operation behavior characteristics exhibited after the user side logs in;
a reading module, used for reading the browsing data content corresponding to the user operation behavior characteristics;
a calculation module, used for identifying the words in the browsing data content read by the reading module and, with reference to the identified words, distinguishing and counting the near-synonyms of those words;
a selection module, used for selecting, as keyword candidates, the words on which the calculation module has performed near-synonym discrimination and counting;
and an output module, used for outputting the keywords corresponding to the user currently served by the system and sending them to the cloud storage unit.
Further, the authentication module is provided with a lower-level sub-module, including:
a cloud storage unit, used for storing data information;
the cloud storage unit runs in synchronization with the authentication module, stores data information, and names each data information storage folder by user-side ID. When the authentication module starts to authenticate user identity information, the ID of the user being authenticated is synchronously extracted and looked up in the cloud storage unit; for a user ID that is found, the corresponding data information storage folder is opened, and for a user ID that is not found, a data information storage folder is created.
Furthermore, the user operation behavior characteristics analyzed in the analysis module comprise browsing behavior characteristics and post-search browsing behavior characteristics.
Furthermore, the calculation module is provided with a sub-module, including:
an identification unit, which runs synchronously in real time while the calculation module operates and is used for identifying the near-synonym items of words;
wherein the near-synonym items of a word include items identical to the word itself.
Furthermore, a sub-module is arranged between the analysis module and the calculation module, including:
an association unit, used for analyzing the current operation behavior characteristic state of the user side and identifying basic keywords with reference to a set period;
the set period is configured at the system end. Within the set period, the analysis module is controlled to run in real time and analyze the operation behavior characteristics of the user side. When the analysis finds that both operation behavior characteristics occur on the user side within the set period, the search term in the post-search browsing behavior characteristic is taken as a basic keyword and synchronously sent to the identification unit, which obtains its near-synonym items. These near-synonym items are matched against the browsing data content corresponding to the browsing behavior characteristic, and each near-synonym item that has a match is judged to be a keyword candidate and sent to the selection module.
Furthermore, the keyword candidates selected by the selection module are the top items, in a user-defined number, with the highest occurrence frequency after processing by the calculation module. The keyword candidates obtained in the selection module and the keyword candidates obtained by the association unit are compared in the selection module, and the words that appear in both groups of candidates are judged to be keywords and sent to the output module;
the output content of the output module comprises the keywords and the basic keywords.
Furthermore, sub-modules are disposed between the output module and the cloud storage unit, including:
a corroboration unit, used for configuring network links corresponding to the keywords and feeding them back to the user side;
a marking unit, used for marking keywords;
after a network link fed back by the corroboration unit is clicked and opened by the user side, the system is synchronously controlled to run and obtain keywords, and the obtained keywords are compared with the keywords output by the output module. Those keywords output by the output module that have a matching item in the comparison are marked by the marking unit and then sent to the cloud storage unit.
Furthermore, when the system serves the same user side repeatedly, after the user side logs in to the system again through the authentication module, the path weight for obtaining keywords in system operation is calculated with reference to the keywords obtained for that user in the system's previous service run, using the following formula:
$$ z = \sigma\left( \sum_i w_i x_i + b_z \right) $$
in the formula: $x_i$ is the number of keywords obtained by the system in operation;
$w_i$ is the rate at which keyword candidates are determined to be keywords;
$b_z$ is the source offset of the keyword candidate items;
$\sigma$ is an activation function;
$z$ is the weight value.
Furthermore, the control terminal is electrically connected through a medium to the authentication module; the authentication module is electrically connected through a medium to the cloud storage unit; the authentication module is electrically connected through a medium to the analysis module, the reading module and the calculation module; the analysis module is electrically connected through a medium to the calculation module; the calculation module is electrically connected through a medium to the identification unit; the calculation module is electrically connected through a medium to the selection module and the output module; the selection module is electrically connected through a medium to the association unit; and the output module is electrically connected to the cloud storage unit through the corroboration unit and the marking unit.
In a second aspect, a keyword recognition method based on user browsing and searching behaviors includes the following steps:
Step 1: setting up a user identity authentication channel; after user identity authentication is passed, the user performs browsing and searching operations through a browser, and user operation behavior characteristic data is captured in real time;
Step 2: designing high-frequency word judgment logic, and obtaining the words that appear at high frequency in the browsing and searching behaviors with reference to the user operation behavior characteristic data;
Step 3: analyzing the synonyms and near-synonyms of the high-frequency words, and searching the user operation behavior characteristic data for items identical to those synonyms and near-synonyms;
Step 4: receiving the identical items together with the words entered by the user when performing searches in the browser, and packaging them in the same folder for storage;
Step 5: constructing a data storage cloud space, editing the folder obtained in Step 4 according to the user ID, and sending it to the data storage cloud space for storage;
Step 6: designing a data alternation cycle for the data storage cloud space; after each alternation cycle elapses, feedback is sent to a management terminal, and the folder is either deleted by the management terminal or reviewed and then deleted by a management terminal user.
Advantageous effects
Compared with the known public technology, the technical scheme provided by the invention has the following beneficial effects:
1. The invention provides a keyword recognition system based on user browsing and searching behaviors. The system stores the keywords it recognizes for each corresponding user. During keyword recognition, it recognizes and captures keywords separately according to the different kinds of active operation the user performs in the browser, and it then provides related network links based on the captured keywords. This not only gives the user link indexes relevant to their needs but also allows the keywords of higher weight among the recognized keywords to be further evaluated through those network links.
2. During use of the system, keyword candidates are obtained and then further compared, which effectively improves the accuracy of the keywords the system obtains in operation, ensures that those keywords better match the user's operation behavior characteristics in the browser, and makes it convenient for the browser management end to manage browser users by means of the identified keywords.
3. The invention provides a keyword identification method based on user browsing and searching behaviors. While assisting the operation of the system, the execution of its steps also manages the keywords synchronously to a certain degree. The storage logic and the alternation cycle set for the keyword data can make the system run more stably: setting an alternation cycle for the storage space that holds the keywords prevents the storage space from overflowing, prevents data storage and retrieval from slowing down because of a large volume of stored data, and keeps the storage space from accumulating expired, useless data.
Drawings
In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram of a keyword recognition system based on user browsing and searching behaviors;
FIG. 2 is a schematic flow chart of a keyword recognition method based on user browsing and searching behaviors;
the reference numerals in the drawings denote: 1. a control terminal; 2. an authentication module; 21. a cloud storage unit; 3. an analysis module; 31. an association unit; 4. a reading module; 5. a calculation module; 51. an identification unit; 6. a selection module; 7. an output module; 71. a corroboration unit; 72. a marking unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present invention will be further described with reference to the following examples.
Example 1
The keyword recognition system based on user browsing and searching behaviors of this embodiment, as shown in fig. 1, includes:
a control terminal 1, which is the main control end of the system and is used for issuing execution commands;
an authentication module 2, used for authenticating the identity information with which the user side logs in to the system;
an analysis module 3, used for analyzing the operation behavior characteristics exhibited after the user side logs in;
a reading module 4, used for reading the browsing data content corresponding to the user operation behavior characteristics;
a calculation module 5, used for identifying the words in the browsing data content read by the reading module 4 and, with reference to the identified words, distinguishing and counting the near-synonyms of those words;
a selection module 6, used for selecting, as keyword candidates, the words on which the calculation module 5 has performed near-synonym discrimination and counting;
and an output module 7, used for outputting the keywords corresponding to the user currently served by the system to the cloud storage unit 21.
In this embodiment, the control terminal 1 controls the operation of the system. The user side logs in to the system after its identity information is authenticated by the authentication module 2. The analysis module 3 then runs and, while the user browses or searches with a browser, analyzes the operation behavior characteristics exhibited after login. This triggers the reading module 4 to read the browsing data content corresponding to the user operation behavior characteristics. The calculation module 5 identifies the words in the browsing data content read by the reading module 4 and, with reference to the identified words, distinguishes and counts their near-synonyms. The selection module 6 selects the words so processed by the calculation module 5 as keyword candidates. Finally, the output module 7 outputs the keywords corresponding to the user currently served by the system and sends them to the cloud storage unit 21, where they are stored.
As shown in fig. 1, the authentication module 2 is provided with a lower-level sub-module, including:
a cloud storage unit 21, used for storing data information;
the cloud storage unit 21 runs in synchronization with the authentication module 2, stores data information, and names each data information storage folder by user-side ID. When the authentication module 2 starts to authenticate user identity information, the ID of the user being authenticated is synchronously extracted and looked up in the cloud storage unit 21; for a user ID that is found, the corresponding data information storage folder is opened, and for a user ID that is not found, a data information storage folder is created.
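By way of illustration only, the open-or-create behavior of the storage folder can be sketched as follows. The sketch assumes that the cloud storage unit is represented by a local directory tree and that the user ID is already a safe folder name; the function name `open_or_create_user_folder` and the parameter `storage_root` are hypothetical and do not appear in the embodiment.

```python
import tempfile
from pathlib import Path


def open_or_create_user_folder(storage_root: Path, user_id: str) -> tuple[Path, bool]:
    """Return (folder, created) for the storage folder named after user_id.

    Assumption: a local directory stands in for the cloud storage unit.
    """
    folder = storage_root / user_id
    # The folder is "found" when it already exists under the storage root.
    created = not folder.is_dir()
    # With exist_ok=True, mkdir does nothing when the folder already exists.
    folder.mkdir(parents=True, exist_ok=True)
    return folder, created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # First login: no folder is found, so one is created.
        print(open_or_create_user_folder(root, "user-001"))
        # Later login: the existing folder is opened.
        print(open_or_create_user_folder(root, "user-001"))
```

A deployed system would presumably validate the user ID before using it as a folder name; that step is omitted here.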
Example 2
At the level of specific implementation, and on the basis of embodiment 1, this embodiment further describes, with reference to fig. 1, the keyword recognition system based on user browsing and searching behaviors of embodiment 1:
as shown in fig. 1, the user operation behavior characteristics analyzed in the analysis module 3 include browsing behavior characteristics and post-search browsing behavior characteristics.
As shown in fig. 1, the calculation module 5 is provided with a sub-module, including:
an identification unit 51, which runs synchronously in real time while the calculation module 5 operates and is used for identifying the near-synonym items of words;
wherein the near-synonym items of a word include items identical to the word itself.
With this arrangement, the system can identify the near-synonyms of words during operation and thereby obtain more keyword candidates for processing.
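The near-synonym identification and counting described above can be illustrated by the following minimal sketch. It assumes a small hand-written lookup table, `NEAR_SYNONYMS`, that maps each word to a canonical form; the embodiment does not specify how near-synonyms are obtained, so the table, its entries and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical near-synonym table: each word maps to a canonical form.
NEAR_SYNONYMS = {
    "automobile": "car",
    "vehicle": "car",
    "car": "car",
    "cost": "price",
    "price": "price",
}


def near_synonym_items(word: str) -> set[str]:
    """Return all words sharing the canonical form of `word`, including `word` itself."""
    canonical = NEAR_SYNONYMS.get(word, word)
    items = {w for w, c in NEAR_SYNONYMS.items() if c == canonical}
    items.add(word)  # the near-synonym items include the identical word
    return items


def count_by_meaning(words: list[str]) -> Counter:
    """Count occurrences after folding near-synonyms onto their canonical form."""
    return Counter(NEAR_SYNONYMS.get(w, w) for w in words)


if __name__ == "__main__":
    print(near_synonym_items("vehicle"))
    print(count_by_meaning(["car", "vehicle", "price", "weather"]))
```

Folding near-synonyms onto one form before counting means that a topic expressed with varied wording is counted as a single candidate rather than several rare ones.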
As shown in fig. 1, a sub-module is disposed between the analysis module 3 and the calculation module 5, including:
an association unit 31, used for analyzing the current operation behavior characteristic state of the user side and identifying basic keywords with reference to a set period;
the set period is configured at the system end. Within the set period, the analysis module 3 is controlled to run in real time and analyze the operation behavior characteristics of the user side. When the analysis finds that both operation behavior characteristics occur on the user side within the set period, the search term in the post-search browsing behavior characteristic is taken as a basic keyword and synchronously sent to the identification unit 51, which obtains its near-synonym items. These near-synonym items are matched against the browsing data content corresponding to the browsing behavior characteristic, and each near-synonym item that has a match is judged to be a keyword candidate and sent to the selection module 6.
With this arrangement, when a user both browses directly and browses after searching in the browser, the system can use the search terms from the user's search-and-browse behavior to guide keyword identification and extraction in the browsed content, so that keyword identification and extraction are more efficient and accurate.
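A minimal sketch of the matching performed around the association unit is given below. It assumes that the caller has already restricted the inputs to events inside the set period and that near-synonyms are supplied as a plain dictionary; the function name `association_candidates` and all parameter names are hypothetical.

```python
def association_candidates(
    search_terms: list[str],
    browsed_words: list[str],
    near_synonyms: dict[str, set[str]],
) -> set[str]:
    """Return near-synonym items of the search terms that also occur in browsed content.

    Assumption: both lists contain only events from the set period.
    """
    # Both behaviors (searching and browsing) must occur in the period.
    if not search_terms or not browsed_words:
        return set()
    browsed = set(browsed_words)
    candidates: set[str] = set()
    for term in search_terms:  # each search term acts as a basic keyword
        # The near-synonym items include the basic keyword itself.
        items = near_synonyms.get(term, set()) | {term}
        # Keep only the items that have a match in the browsed content.
        candidates |= items & browsed
    return candidates


if __name__ == "__main__":
    synonyms = {"laptop": {"notebook", "ultrabook"}}
    browsed = ["notebook", "price", "laptop", "review"]
    print(association_candidates(["laptop"], browsed, synonyms))
```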
As shown in fig. 1, the keyword candidates selected by the selection module 6 are the top items, in a user-defined number, with the highest occurrence frequency after processing by the calculation module 5. The keyword candidates obtained in the selection module 6 and the keyword candidates obtained by the association unit 31 are compared in the selection module 6, and the words that appear in both groups of candidates are judged to be keywords and sent to the output module 7;
the output content of the output module 7 includes the keywords and the basic keywords.
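The comparison of the two candidate groups can be illustrated as follows. The sketch assumes that word frequencies are already available as a `Counter` and that the association unit's candidates arrive as a set; the function name `select_keywords` and the parameter `top_n` (standing for the user-defined number) are hypothetical.

```python
from collections import Counter


def select_keywords(word_counts: Counter, top_n: int, association: set[str]) -> set[str]:
    """Intersect the top_n most frequent words with the association unit's candidates."""
    # Frequency-based candidates: the top_n items by occurrence count.
    frequency_candidates = {word for word, _ in word_counts.most_common(top_n)}
    # Words present in both candidate groups are judged to be keywords.
    return frequency_candidates & association


if __name__ == "__main__":
    counts = Counter({"laptop": 5, "price": 3, "review": 2, "notebook": 1})
    print(select_keywords(counts, 2, {"laptop", "notebook"}))
```

Requiring agreement between the two groups trades recall for precision: a word must be both frequent in the browsed content and related to something the user actually searched for.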
As shown in fig. 1, sub-modules are disposed between the output module 7 and the cloud storage unit 21, including:
a corroboration unit 71, used for configuring network links corresponding to the keywords and feeding them back to the user side;
a marking unit 72, used for marking keywords;
after a network link fed back by the corroboration unit 71 is clicked and opened by the user side, the system is synchronously controlled to run and obtain keywords, and the obtained keywords are compared with the keywords output by the output module 7. Those keywords output by the output module 7 that have a matching item in the comparison are marked by the marking unit 72 and then sent to the cloud storage unit 21.
Through these sub-modules, the weight of a keyword can be further judged after the keyword has been identified and obtained. By offering network links that prompt the user to click, the system obtains keyword weights while also providing convenience to the user, so that the system side can better manage the browser and the related network through the obtained keywords and their weights.
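As a simplified illustration of the marking step, the sketch below flags each output keyword that reappears among the keywords obtained after the user opens a fed-back link. How links are configured and how clicks are detected is not modeled; the function name `mark_keywords` and its parameters are hypothetical.

```python
def mark_keywords(output_keywords: list[str], keywords_after_click: set[str]) -> dict[str, bool]:
    """Map each output keyword to True when it matches a keyword obtained after the click."""
    return {kw: kw in keywords_after_click for kw in output_keywords}


if __name__ == "__main__":
    marked = mark_keywords(["laptop", "price"], {"laptop", "warranty"})
    print(marked)
```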
As shown in fig. 1, when the system serves the same user side repeatedly, after the user side logs in to the system again through the authentication module 2, the path weight for obtaining keywords in system operation is calculated with reference to the keywords obtained for that user in the system's previous service run, using the following formula:
$$ z = \sigma\left( \sum_i w_i x_i + b_z \right) $$
in the formula: $x_i$ is the number of keywords obtained by the system in operation;
$w_i$ is the rate at which keyword candidates are determined to be keywords;
$b_z$ is the source offset of the keyword candidate items;
$\sigma$ is an activation function;
$z$ is the weight value.
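For illustration, the weight formula can be evaluated as in the following sketch. The embodiment does not name a specific activation function, so the logistic sigmoid used here is an assumption, as are the function names `sigmoid` and `path_weight` and the sample values.

```python
import math
from typing import Callable, Sequence


def sigmoid(t: float) -> float:
    """Logistic sigmoid; an assumed choice, since the activation is not specified."""
    return 1.0 / (1.0 + math.exp(-t))


def path_weight(
    x: Sequence[float],
    w: Sequence[float],
    b_z: float,
    activation: Callable[[float], float] = sigmoid,
) -> float:
    """Compute z = activation(sum_i w_i * x_i + b_z)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    weighted_sum = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return activation(weighted_sum + b_z)


if __name__ == "__main__":
    # Hypothetical values: two keyword counts, their rates, and an offset.
    print(path_weight([2, 4], [0.5, 0.25], -2.0))
```

With a sigmoid activation the weight value always lies strictly between 0 and 1, which makes weights from different runs directly comparable.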
As shown in fig. 1, the control terminal 1 is electrically connected through a medium to the authentication module 2; the authentication module 2 is electrically connected through a medium to the cloud storage unit 21 and to the analysis module 3; the analysis module 3 is electrically connected through a medium to the calculation module 5; the association unit 31 is electrically connected through a medium to the analysis module 3 and the calculation module 5; the calculation module 5 is electrically connected through a medium to the identification unit 51, the selection module 6 and the output module 7; the selection module 6 is electrically connected through a medium to the association unit 31; and the output module 7 is electrically connected to the cloud storage unit 21 through the corroboration unit 71 and the marking unit 72.
Example 3
At the level of specific implementation, and on the basis of embodiment 1, this embodiment further describes, with reference to fig. 2, the keyword recognition system based on user browsing and searching behaviors of embodiment 1:
a keyword identification method based on user browsing and searching behaviors comprises the following steps:
Step 1: setting up a user identity authentication channel; after user identity authentication is passed, the user performs browsing and searching operations through a browser, and user operation behavior characteristic data is captured in real time;
Step 2: designing high-frequency word judgment logic, and obtaining the words that appear at high frequency in the browsing and searching behaviors with reference to the user operation behavior characteristic data;
Step 3: analyzing the synonyms and near-synonyms of the high-frequency words, and searching the user operation behavior characteristic data for items identical to those synonyms and near-synonyms;
Step 4: receiving the identical items together with the words entered by the user when performing searches in the browser, and packaging them in the same folder for storage;
Step 5: constructing a data storage cloud space, editing the folder obtained in Step 4 according to the user ID, and sending it to the data storage cloud space for storage;
Step 6: designing a data alternation cycle for the data storage cloud space; after each alternation cycle elapses, feedback is sent to a management terminal, and the folder is either deleted by the management terminal or reviewed and then deleted by a management terminal user.
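The alternation cycle of Step 6 can be illustrated by the following sketch, which lists the folders whose storage time has reached the cycle length so that they can be reported to the management terminal. It assumes that each folder's storage time is tracked per user ID; the function name `folders_due_for_review` and its parameters are hypothetical, and the deletion itself is left to the management terminal as in the method.

```python
from datetime import datetime, timedelta


def folders_due_for_review(
    stored_at: dict[str, datetime],
    cycle: timedelta,
    now: datetime,
) -> list[str]:
    """Return the user IDs whose folders have been stored for at least one full cycle."""
    if cycle <= timedelta(0):
        raise ValueError("cycle must be positive")
    # A folder is due when its age has reached the alternation cycle length.
    return sorted(uid for uid, t in stored_at.items() if now - t >= cycle)


if __name__ == "__main__":
    stored = {
        "user-001": datetime(2024, 1, 1),
        "user-002": datetime(2024, 1, 25),
    }
    # Hypothetical 7-day cycle evaluated on 31 January 2024.
    print(folders_due_for_review(stored, timedelta(days=7), datetime(2024, 1, 31)))
```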
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.