Disclosure of Invention
The invention aims to provide a webpage data pushing method and system, which are used for solving the problems in the background technology.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a method of pushing web page data, the method comprising:
acquiring registration information of a user, and determining a capability histogram of the user according to the registration information; the abscissa of the capability histogram corresponds to a preset capability type, and the ordinate of the capability histogram is a capability score;
inquiring a retrieval flow in a storage library corresponding to the user based on the authority granted by the user, and compiling statistics on the retrieval flow based on the capability histogram to construct a user portrait; the user portrait is a data cluster based on the capability histogram;
receiving a search request input by a user, monitoring a search flow in real time, and updating a storage library corresponding to the user according to a monitoring result;
and matching the search flow monitored in real time with the user portraits of other users, and determining and displaying push data in the matching process.
As a further scheme of the invention: the step of obtaining the registration information of the user and determining the capability histogram of the user according to the registration information comprises the following steps:
receiving a registration request of a user, sending an information acquisition template to the user, and receiving filling information input by the user;
inputting the filling information into a preset information conversion model, and determining a capability score; the information conversion model is a character string comparison model and comprises a preset reference database, the reference database comprises information items and score items, the information items and the filling information are the two parties to the comparison, a similarity is generated in the comparison process, a score weight is determined by the similarity, and the capability score is calculated based on the score weight; the calculation process is as follows:
$$Z = \sum_{i=1}^{n} \alpha\big(S(a_i, b_i)\big)\, F_i$$
wherein $Z$ is the capability score, $n$ is the total number of items of filling information, $\alpha(x)$ is a preset weight function, $S(a_i, b_i)$ is the similarity between the i-th item of filling information and its corresponding information item, $a_i$ and $b_i$ are respectively the i-th item of filling information and its corresponding information item, and $F_i$ is the score corresponding to the i-th item of filling information;
and inquiring the capability arrangement sequence corresponding to the information acquisition template, counting the capability scores according to the capability arrangement sequence, and creating the capability histogram.
As a further scheme of the invention: the step of inquiring the retrieval flow in a storage library corresponding to the user based on the authority granted by the user, and compiling statistics on the retrieval flow based on the capability histogram to construct the user portrait comprises the following steps:
receiving the authority granted by the user, and querying the retrieval flow in the storage library corresponding to the user; the retrieval flow is a tag set containing a time stamp;
querying the capability required by each label, and marking the corresponding capability type in the capability histogram;
inputting each label into a preset transcoding model to obtain a label value;
performing dimension expansion on the rectangular units corresponding to the marked capability types based on the label values, and inserting a cut-off symbol determined by the timestamp when each dimension expansion is finished; the dimension expansion process converts rectangular units into rectangular columns;
and counting the expanded capability histogram containing the cut-off symbols to obtain the user portrait.
As a further scheme of the invention: the step of receiving the search request input by the user, monitoring the search flow in real time, and updating the storage library corresponding to the user according to the monitoring result comprises the following steps:
receiving a search request input by a user, recording a search time, and monitoring a search flow in real time; the monitoring process comprises an encryption request port, and when an encryption request input by a user is received, the monitored retrieval process is encrypted;
inputting the retrieval flow into a part-of-speech recognition model, and extracting retrieval keywords according to part-of-speech recognition results;
and packing the extracted search keywords, inserting the search time, and inputting the packing result into a storage library corresponding to the user based on the search time.
As a further scheme of the invention: the step of matching the search flow monitored in real time with the user portraits of other users, and determining and displaying push data in the matching process comprises the following steps:
reading a capability histogram of a user, matching the capability histogram of the user with capability histograms of other users, and determining a matched user of the user;
counting a search flow monitored in real time based on the capability histogram of the matched user to obtain a sub-portrait unit;
traversing the user portraits of the matched users according to the sub-portrait unit, and determining the matching degree, the matching time and the time span;
and selecting a target user based on the matching degree and the time span, reading a subsequent tag set from the target user based on the matching time, and determining push data according to the read subsequent tag set and displaying the push data.
As a further scheme of the invention: the step of reading the capability histogram of the user, matching the capability histogram of the user with the capability histograms of other users, and determining the matched user of the user comprises the following steps:
reading a capability histogram of a user and capability histograms of other users;
normalizing the two capability histograms, and converting the capability scores into numerical proportions;
calculating the Bhattacharyya coefficient between the normalized capability histograms as the similarity, and selecting a matched user of the user according to the similarity;
the calculation process of the Bhattacharyya coefficient is as follows:
$$\rho = \sum_{i=1}^{m} \sqrt{p(i)\, p'(i)}$$
wherein $\rho$ is the Bhattacharyya coefficient of the two capability histograms, $m$ is the total number of capability types, and $p(i)$ and $p'(i)$ are respectively the numerical proportions of the i-th capability type in the two capability histograms; the numerical proportion is the ratio of a capability score to the sum of all scores in the current capability histogram.
As a further scheme of the invention: the step of traversing the user portraits of the matched users according to the sub-portrait unit, and determining the matching degree, the matching time and the time span comprises the following steps:
acquiring a label in the sub-portrait unit, traversing the user portrait of a matched user based on the label, and determining a label existence value according to the traversing result; the label existence value is either 1 or 0, where 1 represents presence and 0 represents absence;
counting the label existence values of all labels in the sub-portrait unit, and calculating the matching degree;
when the matching degree reaches a preset matching threshold value, obtaining the matching position corresponding to each label with a label existence value of 1, reading one of the adjacent cut-off symbols based on the matching position, and obtaining the timestamp corresponding to the cut-off symbol as a matching time;
and calculating the maximum difference value among all the matching times as the time span.
The technical scheme of the invention also provides a webpage data pushing system, which comprises:
the capacity acquisition module is used for acquiring registration information of the user and determining a capacity histogram of the user according to the registration information; the abscissa of the capability histogram corresponds to a preset capability type, and the ordinate of the capability histogram is a capability score;
the user portrait construction module is used for inquiring the retrieval flow in a storage library corresponding to the user based on the authority granted by the user, and compiling statistics on the retrieval flow based on the capability histogram to construct the user portrait; the user portrait is a data cluster based on the capability histogram;
the storage library updating module is used for receiving a search request input by a user, monitoring a search flow in real time and updating a storage library corresponding to the user according to a monitoring result;
and the matching display module is used for matching the search flow monitored in real time with the user portraits of other users, and determining and displaying the push data in the matching process.
As a further scheme of the invention: the capability acquisition module includes:
the information receiving unit is used for receiving a registration request of a user, sending an information acquisition template to the user and receiving filling information input by the user;
the score determining unit is used for inputting the filling information into a preset information conversion model and determining a capability score; wherein the information conversion model is a character string comparison model and comprises a preset reference database, the reference database comprises information items and score items, the information items and the filling information are the two parties to the comparison, a similarity is generated in the comparison process, a score weight is determined by the similarity, and the capability score is calculated based on the score weight; the calculation process is as follows:
$$Z = \sum_{i=1}^{n} \alpha\big(S(a_i, b_i)\big)\, F_i$$
wherein $Z$ is the capability score, $n$ is the total number of items of filling information, $\alpha(x)$ is a preset weight function, $S(a_i, b_i)$ is the similarity between the i-th item of filling information and its corresponding information item, $a_i$ and $b_i$ are respectively the i-th item of filling information and its corresponding information item, and $F_i$ is the score corresponding to the i-th item of filling information;
the creating execution unit is used for inquiring the capability arrangement sequence corresponding to the information acquisition template, counting the capability scores according to the capability arrangement sequence and creating the capability histogram.
As a further scheme of the invention: the user portrait construction module comprises:
the flow inquiry unit is used for receiving the authority granted by the user and querying the retrieval flow in the storage library corresponding to the user; the retrieval flow is a tag set containing a time stamp;
the type marking unit is used for querying the capability required by each label and marking the corresponding capability type in the capability histogram;
the transcoding unit is used for inputting each label into a preset transcoding model to obtain a label value;
the dimension expanding unit is used for expanding the dimension of the rectangular units corresponding to the marked capability types based on the label values, and inserting a cut-off symbol determined by the timestamp when each dimension expansion is finished; the dimension expansion process converts rectangular units into rectangular columns;
and the statistics unit is used for counting the expanded capability histogram containing the cut-off symbols to obtain the user portrait.
Compared with the prior art, the invention has the beneficial effects that: the invention judges the capability of the user according to the user information, records the experience of the user according to the retrieval flow of the user, and simultaneously acquires the matched user according to the capability and the experience when receiving a new retrieval process.
Detailed Description
In order to make the technical problems, technical schemes and beneficial effects to be solved more clear, the invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a flow chart of a web page data pushing method, and in an embodiment of the present invention, a web page data pushing method is provided, where the method includes:
step S100: acquiring registration information of a user, and determining a capability histogram of the user according to the registration information; the abscissa of the capability histogram corresponds to a preset capability type, and the ordinate of the capability histogram is a capability score;
In the technical scheme of the invention, a user needs to register an account before any network interaction. The account involves personal information of the user, including age, profession, work experience and the like, from which the capabilities of the user, such as the capability of taking in numerical information and the speed of receiving information, can be preliminarily judged. These indexes are preset by the management side; once the capability under each index is obtained, a graph representing the state of the user, called the capability histogram, can be generated. Typically, the number of capability types in the capability histogram is in the single digits, for example logical reasoning ability, computing ability, spatial thinking ability and artistic appreciation; these evaluation criteria are not unique and are therefore not limited here.
Step S200: inquiring a retrieval flow in a storage library corresponding to the user based on the authority granted by the user, and compiling statistics on the retrieval flow based on the capability histogram to construct a user portrait; the user portrait is a data cluster based on the capability histogram;
The retrieval flow is private data. In daily life many people choose private (traceless) browsing, but in the present application the subsequent data processing, which is based on browsing records (the retrieval flow), cannot be carried out if the browsing records cannot be obtained. The application therefore has a preceding permission-acquisition step, and the subsequent steps are carried out only after the permission granted by the user has been obtained.
The retrieval flow of the user is inquired based on the authority granted by the user, the retrieval flow is converted into a set of keywords, the set is converted into numerical values, and the numerical values are filled into the corresponding capability types to obtain the user portrait. This process amounts to an expansion of the capability histogram: data are filled in along the direction perpendicular to the plane of the capability histogram, so that the original two-dimensional data are expanded into three-dimensional data and the rectangles in the capability histogram become rectangular columns. The resulting comprehensive data, which reflect both the capability of the user and the retrieval flow, are called the user portrait.
In addition, in consideration of the privacy of the retrieval process, a storage scheme with a higher encryption level needs to be adopted in the storage process of the retrieval process.
Step S300: receiving a search request input by a user, monitoring a search flow in real time, and updating a storage library corresponding to the user according to a monitoring result;
Step S300 is the application process. Steps S100 and S200 apply to all users, whereas step S300 applies to each individual user: each time a search request input by the user is received, the retrieval flow is updated and, correspondingly, the user portrait is updated.
Step S400: matching the search flow monitored in real time with the user portraits of other users, and determining and displaying push data in the matching process;
For each user, during a single browsing session, traversal matching can be performed over the user portraits of all other users according to the short retrieval flow observed so far, so as to determine users with similar capabilities and similar experience, namely the matched users. The subsequent retrieval flow of a matched user (which can be obtained directly from that user's portrait) is then read and used as the reference for generating push data, so that the push data fits the current user more closely.
Fig. 2 is a first sub-flowchart of a web page data pushing method, where the step of obtaining registration information of a user and determining a capability histogram of the user according to the registration information includes:
step S101: receiving a registration request of a user, sending an information acquisition template to the user, and receiving filling information input by the user;
step S102: inputting the filling information into a preset information conversion model, and determining a capability score; the information conversion model is a character string comparison model and comprises a preset reference database, the reference database comprises information items and score items, the information items and the filling information are the two parties to the comparison, a similarity is generated in the comparison process, a score weight is determined by the similarity, and the capability score is calculated based on the score weight; the calculation process is as follows:
$$Z = \sum_{i=1}^{n} \alpha\big(S(a_i, b_i)\big)\, F_i$$
wherein $Z$ is the capability score, $n$ is the total number of items of filling information, $\alpha(x)$ is a preset weight function, $S(a_i, b_i)$ is the similarity between the i-th item of filling information and its corresponding information item, $a_i$ and $b_i$ are respectively the i-th item of filling information and its corresponding information item, and $F_i$ is the score corresponding to the i-th item of filling information;
step S103: inquiring the capability arrangement sequence corresponding to the information acquisition template, counting the capability scores according to the capability arrangement sequence, and creating the capability histogram.
The logic for generating the capability histogram is simple. The management party first creates an information acquisition template, which is used for acquiring various items of information about the user so as to judge the capability of the user; once the information acquisition template is determined, the arrangement sequence of the capability types in the capability histogram is also determined.
Specifically, the information acquired by the information acquisition template is text information input by the user, whereas generating the capability histogram requires numerical information, so the filling information needs to be converted into a numerical value, called the capability score. The conversion rule is determined by the management party and generally takes the form of a table: given the type of filling information, the score can be read directly from the table.
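The conversion from filling information to capability scores described in steps S101 to S103 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the similarity is taken from Python's `difflib.SequenceMatcher`, the weight function `alpha` is a simple threshold rule, the corresponding information item is assumed to be the closest entry in the reference table, and all function names and table contents are hypothetical rather than prescribed by the application.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1]; difflib's ratio is an assumed stand-in."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def alpha(s, floor=0.5):
    """Assumed weight function: drop weak matches, else use the similarity."""
    return s if s >= floor else 0.0


def capability_score(fillings, reference):
    """Sum alpha(similarity) * score over all items of filling information.

    `reference` is a list of (information item, score) pairs.
    """
    total = 0.0
    for text in fillings:
        # The corresponding information item is taken to be the closest one.
        item, score = max(reference, key=lambda pair: similarity(text, pair[0]))
        total += alpha(similarity(text, item)) * score
    return total


def capability_histogram(template_order, fillings_by_type, reference_by_type):
    """Return (capability type, score) pairs in the template's fixed order."""
    return [
        (cap, capability_score(fillings_by_type.get(cap, []), reference_by_type[cap]))
        for cap in template_order
    ]
```

Because the order of `template_order` is fixed by the information acquisition template, histograms of different users built this way share the same abscissa and can be compared position by position.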
FIG. 3 is a second sub-flowchart of a web page data pushing method, wherein the steps of querying a search flow in a repository corresponding to a user based on rights granted by the user, counting the search flow based on the capability histogram, and constructing a user portrait include:
step S201: receiving the authority granted by the user, and querying the retrieval flow in the storage library corresponding to the user; the retrieval flow is a tag set containing a time stamp;
step S202: querying the capability required by each label, and marking the corresponding capability type in the capability histogram;
step S203: inputting each label into a preset transcoding model to obtain a label value;
step S204: performing dimension expansion on the rectangular units corresponding to the marked capability types based on the label values, and inserting a cut-off symbol determined by the timestamp when each dimension expansion is finished; the dimension expansion process converts rectangular units into rectangular columns;
step S205: counting the expanded capability histogram containing the cut-off symbols to obtain the user portrait;
The above content specifically defines the construction process of the user portrait. Each user has a dedicated storage library for storing the retrieval flow, and once the authority granted by the user is received, the retrieval flow can be queried accordingly.
The labels are of various types and require different capabilities; for example, when a label is mathematical, the required capabilities include computing capability, logical capability and the like. The capabilities required by different labels are predetermined by the management party. The capabilities required by a label are queried, and the corresponding rectangles are located and marked in the capability histogram. On this basis, the label is input into a preset transcoding model to obtain data in numerical form, called the label value; for each marked rectangle, the label value is taken as the height in the direction perpendicular to the plane of the capability histogram, so that the rectangle is converted into a rectangular column.
It should be noted that, each search process corresponds to a conversion process, each conversion process obtains three-dimensional data after expansion, and a cut-off symbol is inserted into the three-dimensional data, so that a rectangular column section corresponding to each search process can be determined.
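The dimension expansion of steps S202 to S205 can be pictured with the following minimal sketch. The data layout is an assumption made for illustration: each capability type keeps its score plus a "column" list that grows by one label value per matching label and is closed by a cut-off symbol after every retrieval flow. The transcoding model is replaced by a CRC-32 checksum purely as a placeholder, and the names `CutOff`, `transcode` and `build_portrait` are hypothetical.

```python
import zlib
from typing import NamedTuple


class CutOff(NamedTuple):
    """Cut-off symbol closing the segment produced by one retrieval flow."""
    timestamp: float


def transcode(tag):
    """Assumed transcoding model: a stable CRC-32 checksum of the label text."""
    return zlib.crc32(tag.encode("utf-8"))


def build_portrait(histogram, flows, required):
    """Expand each marked rectangle of the histogram into a column.

    histogram: dict mapping capability type -> capability score
    flows:     list of (timestamp, [labels]) retrieval flows
    required:  dict mapping label -> capability types it requires
    """
    portrait = {cap: {"score": score, "column": []} for cap, score in histogram.items()}
    for timestamp, tags in sorted(flows, key=lambda flow: flow[0]):
        marked = []
        for tag in tags:
            for cap in required.get(tag, []):
                if cap in portrait:
                    portrait[cap]["column"].append(transcode(tag))
                    if cap not in marked:
                        marked.append(cap)
        # One cut-off symbol per expanded column marks the end of this flow.
        for cap in marked:
            portrait[cap]["column"].append(CutOff(timestamp))
    return portrait
```

Keeping the cut-off symbol inside the column means that the timestamp of any stored label value can later be recovered by scanning forward to the next cut-off entry.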
Fig. 4 is a third sub-flowchart of a web page data pushing method, wherein the steps of receiving a search request input by a user, monitoring the search process in real time, and updating a repository corresponding to the user according to the monitoring result include:
step S301: receiving a search request input by a user, recording a search time, and monitoring a search flow in real time; the monitoring process comprises an encryption request port, and when an encryption request input by a user is received, the monitored retrieval process is encrypted;
step S302: inputting the retrieval flow into a part-of-speech recognition model, and extracting retrieval keywords according to part-of-speech recognition results;
step S303: and packing the extracted search keywords, inserting the search time, and inputting the packing result into a storage library corresponding to the user based on the search time.
Steps S301 to S303 specifically define the data storage process at a fine-grained level. Step S300 is the process that generates the retrieval flow and is the basis for building the storage library of each user. The specific working principle is as follows:
A search request input by the user is received, and the sending time of the request is recorded as the search time. The search flow is monitored in real time, the search keywords in the search flow are extracted as labels by means of an existing part-of-speech recognition model, and all labels corresponding to the same search flow are packed to obtain a tag set, which is the data stored in the storage library.
In the storage process, the storage sequence needs to be determined by the retrieval time, so that the order of the storage library is ensured.
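A minimal sketch of the storage process in steps S301 to S303 follows. The part-of-speech recognition model is not specified in the text, so a simple stop-word filter stands in for it here; the `TagSet` and `StorageLibrary` names, the stop-word list and the use of `bisect.insort` to keep records ordered by search time are all illustrative assumptions. The optional encryption of the monitored flow is omitted.

```python
import bisect
from dataclasses import dataclass, field

# Assumed stand-in vocabulary; a real part-of-speech model would replace this.
STOPWORDS = {"a", "an", "the", "of", "to", "how", "in", "for", "and", "is"}


def extract_keywords(query):
    """Stand-in for part-of-speech recognition: keep non-stop-word tokens."""
    return [word for word in query.lower().split() if word not in STOPWORDS]


@dataclass(order=True)
class TagSet:
    """Packed labels of one retrieval flow, ordered by search time only."""
    search_time: float
    tags: tuple = field(compare=False)


class StorageLibrary:
    """Per-user store that stays sorted by search time."""

    def __init__(self):
        self._records = []

    def store(self, search_time, query):
        record = TagSet(search_time, tuple(extract_keywords(query)))
        bisect.insort(self._records, record)  # insert at the ordered position
        return record

    def records(self):
        return list(self._records)
```

Inserting at the ordered position keeps the library sorted even when a record arrives late, which is what later allows a "subsequent" tag set to be read by position.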
FIG. 5 is a fourth sub-flowchart of a web page data pushing method, wherein the step of matching the search flow monitored in real time with the user portraits of other users, and determining and displaying push data in the matching process includes:
step S401: reading a capability histogram of a user, matching the capability histogram of the user with capability histograms of other users, and determining a matched user of the user;
each user has a capability histogram, and the capability histograms of different users are compared to determine whether the capability between the users is similar.
Step S402: counting a search flow monitored in real time based on the capability histogram of the matched user to obtain a sub-portrait unit;
Taking a certain user as an example, when the search flow of the user is monitored, a user portrait corresponding to that search flow is constructed based on the same logic (step S202 to step S205). Because only one search flow is analyzed, the resulting user portrait contains only a single segment of data and is therefore called a sub-portrait unit.
Step S403: traversing the user portraits of the matched users according to the sub-portrait unit, and determining the matching degree, the matching time and the time span;
As can be seen from the above, a sub-portrait unit is equivalent to a subset of a user portrait (both are three-dimensional data), so a traversing result can be obtained based on the sub-portrait unit. The traversing result comprises a matching degree, matching times and a time span. The matching degree represents the degree to which the user portrait contains the sub-portrait unit; a matching time is the timestamp corresponding to a position at which the sub-portrait unit is contained; since the sub-portrait unit is a tag set, the containing positions are not unique, there are time differences among the resulting timestamps, and the maximum of these time differences is called the time span.
Step S404: selecting a target user based on the matching degree and the time span, reading a subsequent tag set from the target user based on the matching time, and determining push data according to the read subsequent tag set and displaying the push data;
The user most similar to the current user is selected according to the matching degree and the time span; the subsequent tag set (subsequent retrieval flow) of that user is then read according to the matching time, and the push data are determined according to the subsequent tag set, so that the push data fit the current user very closely.
In summary, the matching process of the present application is as follows:
firstly, according to the comparison of the capability histogram of the user and the capability histograms of other users, the users with similar capabilities are determined, which is the first-layer matching, and the determined users with similar capabilities are not unique and are collectively called as matching users.
Next, the capability histogram of the matched user is read, and the retrieval flow of the current user is counted with the capability histogram of the matched user (not of the current user) as the reference to obtain the sub-portrait unit; because the same capability histogram underlies both, the subsequent comparison between the sub-portrait unit and the user portrait is more accurate.
Finally, the positions at which the labels of the sub-portrait unit appear in the user portrait of the matched user are obtained in turn, and the cut-off symbols near those positions are read, so that the times at which the matched user performed similar retrievals can be queried. Since there is more than one label, there is also more than one such time; generally, the smaller the maximum span of the times corresponding to all the labels, the more similar the retrieval flow of the matched user is to that of the current user.
The practical significance of the above process is that, for a given user, users with similar capabilities (capability histogram matching) and similar experience (retrieval flow matching) can be found, and the push data determined from such users fit the current user very closely.
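The final selection in step S404 can be sketched as below. The text only states that the target user is chosen "based on the matching degree and the time span", so the concrete rule used here (highest matching degree, ties broken by the smallest time span), the use of the latest matching time as the pivot, and the `limit` on how many subsequent tag sets are read are all assumptions; the function names are hypothetical.

```python
def select_target(candidates):
    """Pick a target user from {user_id: (degree, matching_times, span)}.

    Assumed rule: highest matching degree first, then smallest time span.
    Users without any matching time are skipped.
    """
    eligible = {uid: c for uid, c in candidates.items() if c[1]}
    if not eligible:
        return None
    return max(eligible, key=lambda uid: (eligible[uid][0], -eligible[uid][2]))


def push_labels(flows, matching_times, limit=2):
    """Collect labels from the tag sets stored after the latest matching time.

    flows: list of (timestamp, [labels]) taken from the target user's library.
    """
    pivot = max(matching_times)
    ordered = sorted(flows, key=lambda flow: flow[0])
    later = [tags for timestamp, tags in ordered if timestamp > pivot]
    seen, result = set(), []
    for tags in later[:limit]:
        for tag in tags:
            if tag not in seen:  # keep first occurrence, preserve order
                seen.add(tag)
                result.append(tag)
    return result
```

The labels returned by `push_labels` would then serve as the reference for generating the push data shown to the current user.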
As a preferred embodiment of the present invention, the step of reading the capability histogram of the user, matching the capability histogram of the user with the capability histograms of other users, and determining the matched user of the user includes:
reading a capability histogram of a user and capability histograms of other users;
normalizing the two capability histograms, and converting the capability scores into numerical proportions;
calculating the Bhattacharyya coefficient between the normalized capability histograms as the similarity, and selecting a matched user of the user according to the similarity;
the calculation process of the Bhattacharyya coefficient is as follows:
$$\rho = \sum_{i=1}^{m} \sqrt{p(i)\, p'(i)}$$
wherein $\rho$ is the Bhattacharyya coefficient of the two capability histograms, $m$ is the total number of capability types, and $p(i)$ and $p'(i)$ are respectively the numerical proportions of the i-th capability type in the two capability histograms; the numerical proportion is the ratio of a capability score to the sum of all scores in the current capability histogram.
The above content specifically defines the matching process of the capability histograms, whose aim is to determine two users with similar capabilities, that is, to compare the similarity of two capability histograms from a data perspective. The application adopts the Bhattacharyya coefficient as the evaluation index: the greater the Bhattacharyya coefficient, the higher the similarity.
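The histogram comparison can be illustrated with a short sketch that normalizes two score lists into proportions and sums the square roots of their element-wise products. The text does not say how matched users are selected from the similarity, so the fixed `threshold` in `matched_users` is an assumption, as are the function names.

```python
import math


def normalize(scores):
    """Convert capability scores into proportions that sum to 1."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("capability scores must have a positive sum")
    return [score / total for score in scores]


def bhattacharyya(scores_a, scores_b):
    """Bhattacharyya coefficient of two histograms over the same capability types."""
    if len(scores_a) != len(scores_b):
        raise ValueError("histograms must cover the same capability types")
    p, q = normalize(scores_a), normalize(scores_b)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def matched_users(user_scores, others, threshold=0.95):
    """Return ids of users whose coefficient reaches the assumed threshold."""
    return [
        uid for uid, scores in others.items()
        if bhattacharyya(user_scores, scores) >= threshold
    ]
```

Two histograms with the same shape give a coefficient of 1 regardless of their absolute scores, while histograms with no overlapping capability types give 0.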
As a preferred embodiment of the technical scheme of the invention, the step of traversing the user portraits of the matched users according to the sub-portrait unit, and determining the matching degree, the matching time and the time span comprises the following steps:
acquiring a label in the sub-portrait unit, traversing the user portrait of a matched user based on the label, and determining a label existence value according to the traversing result; the label existence value is either 1 or 0, where 1 represents presence and 0 represents absence;
counting the label existence values of all labels in the sub-portrait unit, and calculating the matching degree;
when the matching degree reaches a preset matching threshold value, obtaining the matching position corresponding to each label with a label existence value of 1, reading one of the adjacent cut-off symbols based on the matching position, and obtaining the timestamp corresponding to the cut-off symbol as a matching time;
and calculating the maximum difference value among all the matching times as the time span.
The above content provides a specific scheme for comparing the sub-portrait unit with a user portrait. A sub-portrait unit contains more than one label; if a label of the sub-portrait unit appears in the user portrait, its label existence value is 1, and the label existence values of all labels are counted to calculate the matching degree.
During the calculation of the label existence values, whenever a label existence value is 1 the matching position is read at the same time, that is, the position at which the label appears in the user portrait; the cut-off symbol adjacent to that position yields the matching time (the cut-off symbol is determined by the timestamp, so recovering the timestamp from the cut-off symbol is simply the inverse process and is not complex). The maximum difference among the matching times is then calculated to obtain the time span.
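The traversal described above can be sketched as follows. Several details are assumptions made for illustration: the user portrait is modelled as one list per capability type holding labels and cut-off entries, the matching degree is taken as the fraction of labels whose existence value is 1, only the first occurrence of each label is used, and the "adjacent" cut-off symbol is taken to be the next one after the matching position. The names `CutOff`, `label_matching_time` and `traverse` are hypothetical.

```python
from typing import NamedTuple


class CutOff(NamedTuple):
    """Cut-off symbol carrying the timestamp of one retrieval flow."""
    timestamp: float


def label_matching_time(label, portrait):
    """Timestamp of the cut-off symbol following the label's first occurrence.

    Returns None when the label does not appear (existence value 0).
    """
    for column in portrait.values():
        for position, item in enumerate(column):
            if item == label:
                for later in column[position + 1:]:
                    if isinstance(later, CutOff):
                        return later.timestamp
    return None


def traverse(sub_unit_labels, portrait, threshold=0.6):
    """Return (matching degree, matching times, time span) for one matched user."""
    times = {label: label_matching_time(label, portrait) for label in sub_unit_labels}
    presence = [0 if t is None else 1 for t in times.values()]
    degree = sum(presence) / len(presence) if presence else 0.0
    if degree < threshold:
        return degree, [], None  # below the preset matching threshold
    matched = sorted(t for t in times.values() if t is not None)
    span = matched[-1] - matched[0] if matched else None
    return degree, matched, span
```

A small time span indicates that the matched user looked up the same labels close together in time, which the text treats as a sign of a similar retrieval flow.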
Fig. 6 is a block diagram of a structure of a web page data pushing system, and as a preferred embodiment of the technical solution of the present invention, the present invention further provides a web page data pushing system, where the system 10 includes:
a capability obtaining module 11, configured to obtain registration information of a user, and determine a capability histogram of the user according to the registration information; the abscissa of the capability histogram corresponds to a preset capability type, and the ordinate of the capability histogram is a capability score;
a user portrait construction module 12, configured to query a retrieval flow in a storage library corresponding to a user based on the authority granted by the user, and to compile statistics on the retrieval flow based on the capability histogram to construct a user portrait; the user portrait is a data cluster based on the capability histogram;
the repository updating module 13 is configured to receive a search request input by a user, monitor a search process in real time, and update a repository corresponding to the user according to a monitoring result;
and the matching display module 14 is used for matching the search flow monitored in real time with the user portraits of other users, and determining and displaying push data in the matching process.
Further, the capability acquiring module 11 includes:
the information receiving unit is used for receiving a registration request of a user, sending an information acquisition template to the user and receiving filling information input by the user;
the score determining unit is used for inputting the filling information into a preset information conversion model and determining a capability score; the information conversion model is a character string comparison model and comprises a preset reference database, the reference database comprises information items and score items, the information items and the filling information are the two parties to the comparison, a similarity is generated in the comparison process, a score weight is determined by the similarity, and the capability score is calculated based on the score weight; the calculation process is as follows:
$$Z = \sum_{i=1}^{n} \alpha\big(S(a_i, b_i)\big)\, F_i$$
wherein $Z$ is the capability score, $n$ is the total number of items of filling information, $\alpha(x)$ is a preset weight function, $S(a_i, b_i)$ is the similarity between the i-th item of filling information and its corresponding information item, $a_i$ and $b_i$ are respectively the i-th item of filling information and its corresponding information item, and $F_i$ is the score corresponding to the i-th item of filling information;
the creating execution unit is used for inquiring the capability arrangement sequence corresponding to the information acquisition template, counting the capability scores according to the capability arrangement sequence and creating the capability histogram.
Specifically, the user portrayal construction module 12 includes:
the flow inquiry unit is used for receiving the authority granted by the user and querying the retrieval flow in the storage library corresponding to the user; the retrieval flow is a tag set containing a time stamp;
the type marking unit is used for querying the capability required by each label and marking the corresponding capability type in the capability histogram;
the transcoding unit is used for inputting each label into a preset transcoding model to obtain a label value;
the dimension expanding unit is used for expanding the dimension of the rectangular units corresponding to the marked capability types based on the label values, and inserting a cut-off symbol determined by the timestamp when each dimension expansion is finished; the dimension expansion process converts rectangular units into rectangular columns;
and the statistics unit is used for counting the expanded capability histogram containing the cut-off symbols to obtain the user portrait.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.