Keyword recognition method and system based on user browsing and searching behaviors
Technical Field
The invention relates to the technical field of data recognition, and in particular to a keyword recognition method and system based on user browsing and searching behaviors.
Background
With economic development, living standards have steadily improved, and people increasingly go online to browse or search in a browser for the information they need, both to acquire knowledge and to meet their cultural needs.
However, some users with malicious intent spread harmful information over the network, and that information then propagates from user to user, seriously affecting the safety and health of the network. At present such behavior is countered mainly through detection and user reporting, which works poorly, and no related browser-based technology exists that maintains the safety and health of the network by actively searching and investigating through keyword recognition.
Disclosure of Invention
Technical problem to be solved
Aiming at the defects existing in the prior art, the invention provides a keyword recognition method and a keyword recognition system based on user browsing and searching behaviors, and solves the technical problems in the background art.
Technical solution
In order to achieve the above purpose, the invention is realized by the following technical scheme:
In a first aspect, a keyword recognition system based on user browsing and searching behaviors includes:
the control terminal, which is the main control end of the system and is used for issuing execution commands;
the authentication module, which is used for authenticating the identity information of the user side so that the user side can log in to the system;
the analysis module, which is used for analyzing the operation behavior features executed after the user side logs in;
the reading module, which is used for reading the browsing data content corresponding to the operation behavior features of the user;
the computing module, which is used for identifying the words in the browsing data content read by the reading module and, with reference to the identified words, grouping near-synonymous words and counting them;
the selection module, which is used for selecting, as keyword candidates, the words that the computing module has grouped by near-synonymy and counted;
and the output module, which is used for outputting the keywords corresponding to the user currently served by the system and sending them to the cloud storage unit.
Further, a subordinate sub-module is provided under the authentication module, including:
the cloud storage unit, which is used for storing data information;
the cloud storage unit runs synchronously with the authentication module. In operation it stores data information in storage folders named by user-side ID. When the authentication module begins authenticating user identity information, the ID of the user being authenticated is extracted synchronously and looked up in the cloud storage unit; for an ID that is found, the corresponding storage folder is opened, and for an ID that is not found, a new storage folder is created.
Further, the user operation behavior features analyzed by the analysis module include browsing behavior features and post-search browsing behavior features.
Further, a sub-module is provided under the computing module, including:
the identification unit, which runs synchronously in real time while the computing module operates and is used for identifying the near-synonyms of words;
wherein the near-synonyms of a word include its exact synonyms.
Further, a sub-module is provided between the analysis module and the computing module, including:
the association unit, which is used for analyzing the current state of the operation behavior features of the user side and obtaining basic keywords with reference to a set period;
wherein the set period is set independently by the system end. Within the set period the analysis module is controlled to run in real time and analyze the operation behavior features of the user side. When the analysis finds that both kinds of operation behavior features appear within the set period, the search words associated with the post-search browsing behavior features are taken as basic keywords and sent synchronously to the identification unit. The near-synonyms obtained by the identification unit are matched against the browsing data content corresponding to the browsing behavior features, and each near-synonym that finds a match is judged to be a keyword candidate and sent to the selection module.
Further, the keyword candidates selected by the selection module are the top-ranked words by occurrence frequency, up to a user-defined number, after processing by the computing module. The keyword candidates obtained by the selection module and the keyword candidates obtained by the association unit are compared in the selection module, and the words common to both groups of candidates are judged to be keywords and sent to the output module;
wherein the content output by the output module includes the keywords and the basic keywords.
Further, sub-modules are provided between the output module and the cloud storage unit, including:
the evidence unit, which is used for configuring network links corresponding to the keywords and feeding them back to the user side;
the marking unit, which is used for marking keywords;
wherein, after a network link fed back by the evidence unit is clicked and opened by the user side, the system is synchronously controlled to run and acquire keywords. The acquired keywords are compared with the keywords output by the output module, and the keywords in the output module that have a match are marked by the marking unit and then sent to the cloud storage unit.
Further, after the system has run repeatedly and a served user side logs in to the system again through the authentication module, the weight of the system's keyword acquisition path is calculated with reference to the keywords obtained for that user in the system's previous service run, and the calculation formula is as follows:
wherein: $x_i$ is the number of keywords obtained by the system operation;
$w_i$ is the proportion of keyword candidates judged to be keywords;
$b_z$ is the source offset of the keyword candidates;
$\sigma$ is the activation function;
$z$ is the weight value.
Further, the control terminal is electrically connected to the authentication module through a medium; the authentication module is electrically connected to the cloud storage unit through a medium; the authentication module is electrically connected to the analysis module, the reading module and the computing module through a medium; the analysis module is electrically connected to the association unit through a medium; the computing module is electrically connected to the identification unit through a medium; the computing module is electrically connected to the selection module and the output module through a medium; the selection module is electrically connected to the association unit through a medium; and the output module is electrically connected to the cloud storage unit through the evidence unit and the marking unit.
In a second aspect, a keyword recognition method based on user browsing and searching behaviors includes the following steps:
Step1: setting up a user identity authentication channel; after the user passes identity authentication, the user browses and searches through the browser, and user operation behavior feature data is captured in real time;
Step2: designing high-frequency word judgment logic and, with reference to the user operation behavior feature data, acquiring the words that appear with high frequency in the browsing and searching behaviors;
Step3: analyzing the synonyms and near-synonyms of the high-frequency words, and searching the user operation behavior feature data for those synonyms and near-synonyms;
Step4: receiving the matched items and packing them, together with the words entered by the user when performing search behaviors in the browser, into the same folder for storage;
Step5: constructing a data storage cloud space, editing the folder obtained in Step4 according to the user ID, and then sending it to the data storage cloud space for storage;
Step6: designing a data exchange period for the data storage cloud space; after each exchange period, feedback is sent to the management end, and the folders are either deleted by the management end or reviewed and then deleted by a management-end user.
Advantageous effects
Compared with the known public technology, the technical scheme provided by the invention has the following beneficial effects:
The invention provides a keyword recognition system based on user browsing and searching behaviors. Through this system, the recognized keywords can be stored for the corresponding user. During keyword recognition, keywords can be recognized and captured separately according to the different operation behaviors the user actively performs in the browser, and related network links can then be provided on the basis of the captured keywords. This not only gives the user an index of relevant links but also allows the keywords that carry greater weight among the recognized keywords to be evaluated further through those links.
In use, the system obtains keyword candidates and then compares them, which effectively improves the accuracy of the keywords it acquires, ensures that the acquired keywords better match the user's operation behavior features in the browser, and makes it easier for the browser management end to manage browser users by means of the recognized keywords.
The invention also provides a keyword recognition method based on user browsing and searching behaviors. The steps of the method are executed in support of the system's operation, so that keywords can be managed synchronously to a certain extent. The storage logic and the exchange period set for the keyword data make the system run more stably: in the storage space that holds the keywords, the exchange period prevents storage overflow or slow data calls caused by a large volume of stored data, and keeps excessive expired and useless data from accumulating in the storage space.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is evident that the drawings in the following description are only some embodiments of the present invention and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic diagram of a keyword recognition system based on user browsing and searching actions;
FIG. 2 is a flow chart of a keyword recognition method based on user browsing and searching actions;
The reference numerals in the drawings denote: 1. control terminal; 2. authentication module; 21. cloud storage unit; 3. analysis module; 31. association unit; 4. reading module; 5. computing module; 51. identification unit; 6. selection module; 7. output module; 71. evidence unit; 72. marking unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention is further described below with reference to examples.
Example 1
The keyword recognition system based on user browsing and searching behaviors of the present embodiment, as shown in fig. 1, includes:
the control terminal 1 is a main control end of the system and is used for sending out an execution command;
the authentication module 2 is used for authenticating the identity information of the user side to log in the system;
the analysis module 3 is used for analyzing the operation behavior characteristics executed after the user terminal logs in;
the reading module 4 is used for reading the browsing data content corresponding to the operation behavior characteristics of the user;
the computing module 5 is used for identifying the words in the browsing data content read by the reading module 4 and, with reference to the identified words, grouping near-synonymous words and counting them;
the selection module 6 is used for selecting, as keyword candidates, the words that the computing module 5 has grouped by near-synonymy and counted;
the output module 7 is configured to output a keyword corresponding to a current service user of the system to the cloud storage unit 21.
In this embodiment, the control terminal 1 controls the operation of the system. The user logs in to the system once the authentication module 2 has authenticated the user-side identity information. While the user browses or searches in the browser, the analysis module 3 then runs to analyze the operation behavior features executed after login and in turn triggers the reading module 4, which reads the browsing data content corresponding to those features. The computing module 5 identifies the words in the browsing data content read by the reading module 4, groups near-synonymous words and counts them. The selection module 6 selects the grouped and counted words as keyword candidates, and finally the output module 7 outputs the keywords corresponding to the user currently served by the system to the cloud storage unit 21, where they are stored.
As shown in FIG. 1, a subordinate sub-module is provided under the authentication module 2, including:
a cloud storage unit 21 for storing data information;
the cloud storage unit 21 runs synchronously with the authentication module 2. In operation it stores data information in storage folders named by user-side ID. When the authentication module 2 begins authenticating user identity information, the ID of the user being authenticated is extracted synchronously and looked up in the cloud storage unit 21; for an ID that is found, the corresponding storage folder is opened, and for an ID that is not found, a new storage folder is created.
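The open-or-create behavior of the cloud storage unit 21 can be illustrated with a minimal sketch. It assumes a local directory stands in for the cloud storage unit and that user IDs are already valid folder names; the function name `open_or_create_user_folder` is hypothetical and not part of the disclosure.

```python
import tempfile
from pathlib import Path


def open_or_create_user_folder(storage_root: Path, user_id: str):
    """Look up the folder named after user_id; create it if it is missing.

    Returns (folder_path, created), where created is True only when the
    user ID was not found and a new folder had to be made.
    Assumption: user_id is a trusted, filesystem-safe identifier.
    """
    folder = storage_root / user_id
    created = not folder.exists()
    folder.mkdir(parents=True, exist_ok=True)
    return folder, created


if __name__ == "__main__":
    # A temporary directory plays the role of the cloud storage unit.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        _, first = open_or_create_user_folder(root, "user-001")
        _, second = open_or_create_user_folder(root, "user-001")
        print(first, second)
```

On the first authentication of a given ID the folder is created; on later authentications the existing folder is simply opened.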
Example 2
At the level of specific implementation, and on the basis of Example 1, this example further describes the keyword recognition system based on user browsing and searching behaviors of Example 1 with reference to FIG. 1:
as shown in FIG. 1, the user operation behavior features analyzed by the analysis module 3 include browsing behavior features and post-search browsing behavior features.
As shown in FIG. 1, a sub-module is provided under the computing module 5, including:
the identification unit 51, which runs synchronously in real time while the computing module 5 operates and is used for identifying the near-synonyms of words;
wherein the near-synonyms of a word include its exact synonyms.
With this arrangement, the running system can obtain more keyword candidates for processing by identifying words together with their near-synonyms.
As shown in FIG. 1, a sub-module is provided between the analysis module 3 and the computing module 5, including:
the association unit 31, which is used for analyzing the current state of the operation behavior features of the user side and obtaining basic keywords with reference to a set period;
wherein the set period is set independently by the system end. Within the set period the analysis module 3 is controlled to run in real time and analyze the operation behavior features of the user side. When the analysis finds that both kinds of operation behavior features appear within the set period, the search words associated with the post-search browsing behavior features are taken as basic keywords and sent synchronously to the identification unit 51. The near-synonyms obtained by the identification unit 51 are matched against the browsing data content corresponding to the browsing behavior features, and each near-synonym that finds a match is judged to be a keyword candidate and sent to the selection module 6.
With this arrangement, when a user both browses directly and browses after searching, the system can take the search words from the search-then-browse behavior and use them for keyword recognition and extraction in the content the user browses, so that keywords are recognized and extracted more efficiently and accurately.
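The association unit's logic can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: events are simple records with a timestamp, whitespace splitting stands in for word identification, and the `NEAR_SYNONYMS` table is a hypothetical stand-in for the identification unit 51.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since some reference point
    kind: str         # "browse" or "search_browse"
    text: str         # page text for "browse", search words for "search_browse"


# Hypothetical near-synonym table standing in for the identification unit 51.
NEAR_SYNONYMS = {"laptop": {"laptop", "notebook"}}


def association_candidates(events, period_start, period_length,
                           synonyms=NEAR_SYNONYMS):
    """Return (basic_keywords, keyword_candidates) for one set period."""
    end = period_start + period_length
    in_period = [e for e in events if period_start <= e.timestamp < end]
    kinds = {e.kind for e in in_period}
    # Both behavior features must appear within the set period.
    if not {"browse", "search_browse"} <= kinds:
        return set(), set()
    # Search words become the basic keywords.
    basic = {w for e in in_period if e.kind == "search_browse"
             for w in e.text.lower().split()}
    browsed = {w for e in in_period if e.kind == "browse"
               for w in e.text.lower().split()}
    # A near-synonym that occurs in the browsed content is a candidate.
    candidates = set()
    for keyword in basic:
        for term in synonyms.get(keyword, {keyword}):
            if term in browsed:
                candidates.add(term)
    return basic, candidates


if __name__ == "__main__":
    log = [Event(10.0, "search_browse", "laptop"),
           Event(20.0, "browse", "A cheap notebook for students")]
    print(association_candidates(log, 0.0, 60.0))
```

For text in languages without whitespace word boundaries, the `split()` calls would have to be replaced by a proper word segmenter.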
As shown in FIG. 1, the keyword candidates selected by the selection module 6 are the top-ranked words by occurrence frequency, up to a user-defined number, after processing by the computing module 5. The keyword candidates obtained by the selection module 6 and the keyword candidates obtained by the association unit 31 are compared in the selection module 6, and the words common to both groups of candidates are judged to be keywords and sent to the output module 7;
wherein the content output by the output module 7 includes the keywords and the basic keywords.
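The comparison performed in the selection module 6 amounts to a top-N frequency cut followed by a set intersection. A minimal sketch, assuming the word counts from the computing module 5 arrive as a `collections.Counter` and using the hypothetical function name `select_keywords`:

```python
from collections import Counter


def select_keywords(word_counts: Counter, association_candidates: set,
                    top_n: int) -> set:
    """Keep words that are both top-N by frequency and association candidates."""
    # First group: the top_n most frequent words (user-defined number).
    frequency_candidates = {word for word, _ in word_counts.most_common(top_n)}
    # Words present in both candidate groups are judged to be keywords.
    return frequency_candidates & association_candidates


if __name__ == "__main__":
    counts = Counter({"notebook": 5, "cheap": 3, "student": 1})
    print(select_keywords(counts, {"notebook", "student"}, top_n=2))
```

Here "student" is an association candidate but falls outside the top two by frequency, so only "notebook" is kept.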
As shown in FIG. 1, sub-modules are provided between the output module 7 and the cloud storage unit 21, including:
an evidence unit 71, which is used for configuring network links corresponding to the keywords and feeding them back to the user side;
a marking unit 72, which is used for marking keywords;
wherein, after a network link fed back by the evidence unit 71 is clicked and opened by the user side, the system is synchronously controlled to run and acquire keywords. The acquired keywords are compared with the keywords output by the output module 7, and the keywords in the output module 7 that have a match are marked by the marking unit 72 and then sent to the cloud storage unit 21.
With these sub-modules, the weight of a keyword can be judged further after the keyword has been recognized and acquired. Offering network links prompts the user to act on them, so that the keyword weight is obtained while the user is also given a convenience, and the system end can better manage the browser and the related network through the acquired keywords and their weights.
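The interplay of the evidence unit 71 and the marking unit 72 can be sketched as below. The link format, the `example.com` address and both function names are assumptions made for illustration; the text does not specify how links are configured.

```python
from urllib.parse import urlencode


def build_link(keyword: str, base: str = "https://example.com/search") -> str:
    """Evidence unit sketch: configure a link for a keyword (placeholder URL)."""
    return f"{base}?{urlencode({'q': keyword})}"


def mark_keywords(output_keywords, link_keywords) -> dict:
    """Marking unit sketch: flag output keywords that reappear after a link click.

    output_keywords: keywords emitted by the output module.
    link_keywords:   keywords acquired after the user opened a fed-back link.
    """
    reacquired = set(link_keywords)
    return {keyword: keyword in reacquired for keyword in output_keywords}


if __name__ == "__main__":
    print(build_link("notebook"))
    print(mark_keywords(["notebook", "camera"], ["notebook"]))
```

A marked keyword is one that the user confirmed by following its link, which is the signal later used to judge keyword weight.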
As shown in FIG. 1, after the system has run repeatedly and a served user side logs in to the system again through the authentication module 2, the weight of the system's keyword acquisition path is calculated with reference to the keywords obtained for that user in the system's previous service run, and the calculation formula is as follows:
wherein: $x_i$ is the number of keywords obtained by the system operation;
$w_i$ is the proportion of keyword candidates judged to be keywords;
$b_z$ is the source offset of the keyword candidates;
$\sigma$ is the activation function;
$z$ is the weight value.
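One form consistent with the symbols listed above, assumed here rather than taken from the source, is $z = \sigma\left(\sum_i w_i x_i + b_z\right)$. The sketch below evaluates that assumed form; the logistic sigmoid is likewise an assumed choice of activation function, and `path_weight` is a hypothetical name.

```python
import math


def sigmoid(t: float) -> float:
    """Logistic sigmoid, used here as an assumed activation function."""
    return 1.0 / (1.0 + math.exp(-t))


def path_weight(x, w, b_z: float, activation=sigmoid) -> float:
    """Assumed form: z = activation(sum_i w_i * x_i + b_z).

    x:   keyword counts per acquisition path (x_i)
    w:   proportion of candidates judged to be keywords (w_i)
    b_z: source offset of the keyword candidates
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return activation(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b_z)


if __name__ == "__main__":
    print(path_weight([2, 4], [0.5, 0.25], b_z=-1.0))
```

With a sigmoid activation the weight stays between 0 and 1, which makes weights from different acquisition paths directly comparable.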
As shown in FIG. 1, the control terminal 1 is electrically connected to the authentication module 2 through a medium; the authentication module 2 is electrically connected to the cloud storage unit 21 through a medium; the authentication module 2 is electrically connected to the analysis module 3, the reading module 4 and the computing module 5 through a medium; the analysis module 3 is electrically connected to the association unit 31 through a medium; the computing module 5 is electrically connected to the identification unit 51 through a medium; the computing module 5 is electrically connected to the selection module 6 and the output module 7 through a medium; the selection module 6 is electrically connected to the association unit 31 through a medium; and the output module 7 is electrically connected to the cloud storage unit 21 through the evidence unit 71 and the marking unit 72.
Example 3
At the level of specific implementation, and on the basis of Example 1, this example provides further detail on the keyword recognition system based on user browsing and searching behaviors of Example 1 with reference to FIG. 2:
a keyword recognition method based on user browsing and searching behaviors comprises the following steps:
Step1: setting up a user identity authentication channel; after the user passes identity authentication, the user browses and searches through the browser, and user operation behavior feature data is captured in real time;
Step2: designing high-frequency word judgment logic and, with reference to the user operation behavior feature data, acquiring the words that appear with high frequency in the browsing and searching behaviors;
Step3: analyzing the synonyms and near-synonyms of the high-frequency words, and searching the user operation behavior feature data for those synonyms and near-synonyms;
Step4: receiving the matched items and packing them, together with the words entered by the user when performing search behaviors in the browser, into the same folder for storage;
Step5: constructing a data storage cloud space, editing the folder obtained in Step4 according to the user ID, and then sending it to the data storage cloud space for storage;
Step6: designing a data exchange period for the data storage cloud space; after each exchange period, feedback is sent to the management end, and the folders are either deleted by the management end or reviewed and then deleted by a management-end user.
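The exchange period of Step6 can be sketched as a periodic sweep over the stored folders. The sketch assumes that each folder is tracked by its creation time in an in-memory mapping and that a folder becomes due once its age reaches the exchange period; the names `purge_expired` and `auto_delete` are hypothetical.

```python
def purge_expired(folder_created_at: dict, now: float,
                  exchange_period: float, auto_delete: bool = True) -> list:
    """Report folders whose age has reached the exchange period.

    folder_created_at: maps user ID -> creation time of that user's folder.
    auto_delete=True:  the management end deletes the folders directly.
    auto_delete=False: folders are only reported, so that a management-end
                       user can review them before deleting.
    Returns the sorted list of user IDs fed back to the management end.
    """
    expired = sorted(user_id for user_id, created in folder_created_at.items()
                     if now - created >= exchange_period)
    if auto_delete:
        for user_id in expired:
            del folder_created_at[user_id]
    return expired


if __name__ == "__main__":
    store = {"u1": 0.0, "u2": 90.0}
    print(purge_expired(store, now=100.0, exchange_period=50.0))
    print(store)
```

Keeping the report and the deletion separate lets the same routine serve both deletion paths named in Step6.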
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.