Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The inventor has found through research that, when determining the heat (popularity) of big data information, the prior art only counts the click records or search records for that information on the search engine. In an actual network environment, however, the heat of a piece of big data information may be deliberately manipulated, so a heat determined in this way cannot be guaranteed to be reliable.
To solve the above problem of the prior art, embodiments of the present invention provide a big data information heat analysis method and a cloud platform device, which analyze and track the communication behavior data of the terminal devices corresponding to the click records or search records of big data information within a set time period, so as to determine whether heat-hyping behavior (deliberate inflation of the heat) exists in those terminal devices, thereby ensuring the reliability of the heat of the big data information within the set time period.
To describe the above big data information heat analysis method in detail, please refer to FIG. 1, which is a schematic diagram of the communication architecture of a big data information heat analysis system 100 according to an embodiment of the present invention. The big data information heat analysis system 100 may include a heat analysis server 200, a search engine server 300, and a plurality of terminal devices 400 communicating with the search engine server 300, wherein the heat analysis server 200 is communicatively connected with the search engine server 300.
In a specific embodiment, each of the heat analysis server 200 and the search engine server 300 may be a desktop computer, a tablet computer, a notebook computer, or other electronic devices capable of implementing data processing and data communication, and the terminal device 400 may be an electronic device such as a mobile phone or a computer, which is not limited herein.
On this basis, please refer to FIG. 2, which is a flowchart of a big data information heat analysis method according to an embodiment of the present invention. The method may be applied to the heat analysis server 200 in FIG. 1 and may specifically include the following steps S21 to S24.
Step S21, detecting a real-time heat value of each piece of big data information in the search engine server in real time, and when detecting that the real-time heat value of the target big data information in the search engine server exceeds a set threshold, acquiring an operation record of the target big data information in the search engine server.
In this embodiment, the operation record includes each user behavior record of the target big data information counted by the search engine server within a set time period, and the user behavior record is a click record or a search record for the target big data information.
Step S22, parsing each user behavior record in the operation records to determine device signature information corresponding to each user behavior record, generating a request instruction for obtaining a device communication identifier corresponding to the device signature information according to the device signature information, and sending the request instruction to the search engine server.
In this embodiment, the device signature information is the operation trace left when the terminal device corresponding to the user behavior record performs a click operation or a search operation on the search engine server.
Step S23, acquiring the device communication identifier corresponding to the request instruction, which is extracted by the search engine server from the database corresponding to the search engine server based on the request instruction.
In this embodiment, each device communication identifier corresponds to one of the terminal devices with which the search engine server communicates. The device communication identifier may be a communication IP address, and the communication IP address of each terminal device 400 is fixed.
Step S24, extracting, from the target terminal device corresponding to each obtained device communication identifier, the communication behavior data corresponding to that target terminal device; judging, according to the communication behavior data, whether heat-hyping behavior exists in each target terminal device; when it is judged that heat-hyping behavior does not exist in any target terminal device, determining that the real-time heat value of the target big data information is a true value; and when it is judged that heat-hyping behavior exists in the target terminal devices, determining that the real-time heat value of the target big data information is a false value.
In this embodiment, the communication behavior data includes communication list information on the communication, within a set time period, between the target terminal device and other devices, where the other devices do not communicate with the search engine server 300.
Executing steps S21 to S24 achieves the following advantageous technical effects. First, the operation record of the target big data information whose real-time heat value exceeds the set threshold is obtained. Second, the operation record is parsed to determine a plurality of pieces of device signature information and to generate request instructions, which are sent to the search engine server. Third, the device communication identifiers fed back by the search engine server based on the request instructions are obtained. Finally, communication behavior data is extracted from the target terminal devices corresponding to the device communication identifiers, and whether the real-time heat value of the target big data information is a true value or a false value is judged according to the communication behavior data. In this way, the communication behavior data associated with the big data information within the set time period can be analyzed and tracked to judge whether heat-hyping behavior exists in the target terminal devices corresponding to the target big data information, which in turn ensures the reliability of the real-time heat value of the target big data information.
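As a non-limiting illustration, the overall flow of steps S21 to S24 can be sketched in Python as follows. The names `BehaviorRecord`, `analyze_heat`, `lookup_identifier`, `fetch_behavior_data`, and `has_hyping_behavior` are hypothetical and do not appear in the embodiments; the exchange with the search engine server and with the terminal devices is abstracted into caller-supplied functions, and the judging rule itself is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class BehaviorRecord:
    """One user behavior record (hypothetical structure)."""
    kind: str              # "click" or "search"
    device_signature: str  # operation trace left by the terminal device


def analyze_heat(
    real_time_heat: float,
    set_threshold: float,
    operation_record: Iterable[BehaviorRecord],
    lookup_identifier: Callable[[str], str],
    fetch_behavior_data: Callable[[str], dict],
    has_hyping_behavior: Callable[[List[dict]], bool],
) -> Optional[bool]:
    """Return None when the threshold is not exceeded, True when the
    real-time heat value is judged a true value, False when it is judged
    a false value."""
    # S21: only analyze information whose heat exceeds the set threshold.
    if real_time_heat <= set_threshold:
        return None
    # S22: determine the device signature information of each record.
    signatures = [record.device_signature for record in operation_record]
    # S23: obtain the device communication identifier for each signature
    # (stands in for the request to the search engine server).
    identifiers = [lookup_identifier(sig) for sig in signatures]
    # S24: extract communication behavior data and judge heat hyping.
    behavior_data = [fetch_behavior_data(ident) for ident in identifiers]
    return not has_hyping_behavior(behavior_data)
```

Passing the three lookups as functions keeps the sketch independent of any particular transport or storage, which the embodiments do not prescribe.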
In one implementation, judging whether heat-hyping behavior exists in each target terminal device according to the communication behavior data, as described in step S24, may specifically include the following steps (11) to (14).
(11) Extracting, from each piece of communication behavior data, the communication list information on the communication between the target terminal device and other devices within a set time period, and determining the target communication IP addresses of the other devices from the communication list information; wherein each target terminal device communicates with one other device within the set time period.
(12) Calculating the accumulated value of identical target communication IP addresses among all the determined target communication IP addresses, and judging whether the accumulated value exceeds a preset threshold.
(13) If the accumulated value exceeds the preset threshold, judging that heat-hyping behavior exists in the target terminal devices that communicated with the other device having the target communication IP address corresponding to the accumulated value.
(14) If the accumulated value does not exceed the preset threshold, judging that heat-hyping behavior does not exist in any target terminal device.
In a specific implementation, through steps (11) to (14), whether heat-hyping behavior exists in the target terminal devices can be accurately judged according to the consistency of the target communication IP addresses of the other devices that communicate with each target terminal device within the set time period, so that it can be determined whether the real-time heat value of the target big data information is a genuine heat value or a hyped one.
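Steps (11) to (14) amount to counting how many target terminal devices share the same peer IP address. A minimal Python sketch follows, assuming the communication list information has already been reduced to a mapping from each target terminal device's communication identifier to the single target communication IP address it communicated with; the function name `find_hyping_devices` and the example addresses are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def find_hyping_devices(peer_ip_by_device: Dict[str, str],
                        preset_threshold: int) -> Set[str]:
    """Return the target terminal devices judged to show heat-hyping
    behavior; an empty set corresponds to step (14)."""
    # (12) Accumulated value of each identical target communication IP.
    counts = Counter(peer_ip_by_device.values())
    # (13) Peer IPs whose accumulated value exceeds the preset threshold.
    flagged_ips = {ip for ip, n in counts.items() if n > preset_threshold}
    # Devices that communicated with a flagged peer IP.
    return {device for device, ip in peer_ip_by_device.items()
            if ip in flagged_ips}
```

For example, if three of four devices communicated with the same peer address and the preset threshold is 2, those three devices are returned; with a preset threshold of 3 the result is empty.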
It can be understood that steps (11) to (14) judge heat-hyping behavior by checking the consistency of the target communication IP addresses of the other devices corresponding to different target terminal devices. In some scenarios, however, heat may also be hyped by logging in with different account information on the same target terminal device. Therefore, to ensure that heat-hyping behavior is judged accurately and comprehensively, judging whether heat-hyping behavior exists in each target terminal device according to the communication behavior data, as described in step S24, may also specifically include the following steps (21) to (23).
(21) Extracting, from each piece of communication behavior data, the login information of the target terminal device within a set time period.
(22) For each piece of login information corresponding to each target terminal device, parsing the login information to obtain the corresponding login path, and determining the login server corresponding to that login information through the login path; the login servers may be servers corresponding to different search engines.
(23) Judging whether the plurality of login servers corresponding to each target terminal device are the same; if they are the same, judging that heat-hyping behavior exists in that target terminal device; otherwise, judging that heat-hyping behavior does not exist in that target terminal device.
Based on steps (21) to (23), the different login information of the same target terminal device can be analyzed, so that whether heat-hyping behavior exists in the target terminal device is judged based on the consistency of the login servers corresponding to the different login information. Heat-hyping behavior can therefore be judged accurately and comprehensively.
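Steps (21) to (23) can be sketched as follows. The sketch assumes, purely for illustration, that each login path is a URL and that the login server is identified by the URL's host name; the embodiments do not fix either representation. It also assumes that at least two pieces of login information are needed before the login servers can be compared. The function names are hypothetical.

```python
from typing import List
from urllib.parse import urlparse


def login_server(login_path: str) -> str:
    """(22) Derive the login server from a login path (assumed: URL host)."""
    return urlparse(login_path).hostname or ""


def device_is_hyping(login_paths: List[str]) -> bool:
    """(23) Judge heat hyping when all login servers of one target
    terminal device are the same."""
    servers = [login_server(path) for path in login_paths]
    # Assumption: a comparison needs at least two resolvable login servers.
    if len(servers) < 2 or not all(servers):
        return False
    return len(set(servers)) == 1
```

Under these assumptions, two logins with different accounts on `https://search.example.com/login` are judged as heat hyping, whereas logins on two different hosts are not.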
In applying the above method, the inventor found that, when the heat analysis server 200 acquires the communication behavior data of a target terminal device, the acquisition may fail because the privacy protection mechanism of the target terminal device is activated, so that the subsequent heat-hyping judgment cannot proceed smoothly. To address this problem, extracting the communication behavior data corresponding to each target terminal device from the target terminal device corresponding to each acquired device communication identifier, as described in step S24, may specifically include the following steps S241 to S244.
Step S241, determining, from each device communication identifier, a protocol signature and each data privacy level of the target terminal device corresponding to that device communication identifier; and, on the premise that each target terminal device is determined, based on the protocol signature, to contain a system privacy data group, determining the privacy weight ratio between each data privacy level of each target terminal device under its corresponding service privacy data group and each data privacy level of that target terminal device under its corresponding system privacy data group, according to the data privacy levels of each target terminal device under its corresponding system privacy data group and the firewall sequence of those data privacy levels.
Step S242, transferring to the system privacy data group each data privacy level of each target terminal device under its corresponding service privacy data group whose privacy weight ratio, relative to the data privacy levels under the corresponding system privacy data group, is within a set value interval.
Step S243, on the premise that the service privacy data group corresponding to each target terminal device contains a plurality of data privacy levels, determining the privacy weight ratios among the data privacy levels of each target terminal device under its corresponding service privacy data group, according to the data privacy levels of each target terminal device under its corresponding system privacy data group and the firewall sequence of those data privacy levels; marking each data privacy level under the service privacy data group corresponding to each target terminal device based on the privacy weight ratios among the data privacy levels; setting a privacy safety factor for each target data privacy level obtained by the marking, using the data privacy levels of each target terminal device under its corresponding system privacy data group and the firewall sequence of those data privacy levels; and transferring the target data privacy levels to the system privacy data group sequentially, in descending order of the privacy safety factors corresponding to each target terminal device; wherein the number of transferred target data privacy levels does not exceed a predetermined value.
Step S244, generating a communication connection request according to the number of data privacy levels of each target terminal device under the corresponding system privacy data group, and sending the communication connection request to the corresponding target terminal device; and acquiring an interface verification code fed back by each target terminal device based on the communication connection request, establishing communication connection with each target terminal device according to the interface verification code, and extracting communication behavior data corresponding to each target terminal device.
It is understood that, when the contents described in the above steps S241 to S244 are executed, the data privacy level of the target terminal device can be analyzed and adjusted based on the device communication identifier, so as to generate the communication connection request according to the number of data privacy levels under the system privacy data group corresponding to the target terminal device. Therefore, the communication connection is established with the target terminal equipment through the communication connection request, and the interception of the data acquisition behavior of the heat analysis server by a privacy protection mechanism of the target terminal equipment can be avoided. In this way, the heat analysis server 200 can be ensured to smoothly acquire the communication behavior data from the target terminal device, and the true and false value determination of the real-time heat value of the target big data information can be realized.
In a specific implementation, to prevent the search engine server 300 from mistakenly deleting the request instruction sent by the heat analysis server 200, generating, according to the device signature information, the request instruction for acquiring the device communication identifier corresponding to the device signature information, as described in step S22, may specifically include the following steps S2221 to S2223.
Step S2221, generating a first check code according to the device signature information, and performing a check calculation on the first check code and the device MAC address of the heat analysis server using a preset check algorithm to obtain a first check result.
Step S2222, sending the first check result and the device signature information to the search engine server; after receiving the first check result and the device signature information, the search engine server performs a check calculation on a second check code corresponding to the device signature information and a pre-stored target MAC address using the preset check algorithm to obtain a second check result, and, when the first check result and the second check result are judged to be consistent, feeds back authorization information to the heat analysis server and adds a digital signature corresponding to the heat analysis server to a preset white list.
Step S2223, upon receiving the authorization information, generating the request instruction according to the device signature information and embedding the digital signature of the heat analysis server in the request instruction.
Based on steps S2221 to S2223, the following technical effect can be achieved: the search engine server 300 authorizes and authenticates the heat analysis server 200 in advance and adds the digital signature of the heat analysis server 200 to a preset white list, and the heat analysis server 200 embeds the digital signature in the request instruction it generates, so that the search engine server 300 is prevented from mistakenly deleting the request instruction sent by the heat analysis server 200.
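The exchange in steps S2221 to S2223 can be sketched as follows. The embodiments do not specify the check code, the preset check algorithm, or how the digital signature reaches the search engine server, so the sketch makes three loudly labeled assumptions: the check code is a SHA-256 digest of the device signature information, the check algorithm is a SHA-256 hash over the check code and the normalized MAC address, and the digital signature is passed along with the authorization call. All class and function names are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def check_code(device_signature: str) -> bytes:
    """Assumed check code: SHA-256 digest of the device signature info."""
    return hashlib.sha256(device_signature.encode("utf-8")).digest()


def check_result(code: bytes, mac_address: str) -> str:
    """Assumed preset check algorithm: SHA-256 over code + normalized MAC."""
    mac = mac_address.lower().replace("-", ":").encode("ascii")
    return hashlib.sha256(code + mac).hexdigest()


class SearchEngineServer:
    def __init__(self, target_mac: str) -> None:
        self.target_mac = target_mac  # pre-stored target MAC address
        self.whitelist = set()        # preset white list

    def authorize(self, first_result: str, device_signature: str,
                  digital_signature: str) -> bool:
        # S2222: compute the second check result and compare.
        second = check_result(check_code(device_signature), self.target_mac)
        if hmac.compare_digest(first_result, second):
            self.whitelist.add(digital_signature)
            return True
        return False

    def accepts(self, request: dict) -> bool:
        # Requests carrying a white-listed signature are not deleted.
        return request.get("digital_signature") in self.whitelist


def build_request(device_signature: str, own_mac: str,
                  digital_signature: str,
                  server: SearchEngineServer) -> Optional[dict]:
    # S2221: first check result from the heat analysis server's own MAC.
    first = check_result(check_code(device_signature), own_mac)
    if not server.authorize(first, device_signature, digital_signature):
        return None
    # S2223: embed the digital signature in the request instruction.
    return {"device_signature": device_signature,
            "digital_signature": digital_signature}
```

In this sketch a request is only built when the MAC address used by the heat analysis server matches the one pre-stored on the search engine server; otherwise `build_request` returns `None` and nothing is added to the white list.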
In practical applications, the distribution of the heat value is different in different periods, and in order to reduce the processing load of the heat analysis server 200 and improve the accuracy of the heat value analysis, the following steps (31) and (32) may be further included on the basis of step S21.
(31) Acquiring a modification instruction for modifying the set threshold, the modification instruction being configured according to current time period information.
(32) Modifying the set threshold according to the modification instruction.
It can be understood that, through steps (31) and (32), the set threshold can be flexibly modified for different time periods, which reduces the processing load of the heat analysis server 200 and improves the accuracy of the heat value analysis.
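A minimal sketch of steps (31) and (32) follows, assuming that the time period information is the current hour of the day and that thresholds are configured per half-open hour range; both are illustrative assumptions, and the names `ModificationInstruction`, `instruction_for_hour`, and `HeatMonitor` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ModificationInstruction:
    """Instruction carrying the new set threshold (hypothetical)."""
    new_threshold: float


def instruction_for_hour(
    hour: int,
    thresholds_by_period: Dict[Tuple[int, int], float],
) -> ModificationInstruction:
    """(31) Configure a modification instruction from the current hour.
    Periods are half-open ranges [start, end) in hours."""
    for (start, end), value in thresholds_by_period.items():
        if start <= hour < end:
            return ModificationInstruction(value)
    raise ValueError(f"no threshold configured for hour {hour}")


class HeatMonitor:
    def __init__(self, set_threshold: float) -> None:
        self.set_threshold = set_threshold

    def apply(self, instruction: ModificationInstruction) -> None:
        """(32) Modify the set threshold according to the instruction."""
        self.set_threshold = instruction.new_threshold

    def exceeds(self, real_time_heat: float) -> bool:
        """Step S21 comparison against the current set threshold."""
        return real_time_heat > self.set_threshold
```

With a lower threshold configured for quiet hours and a higher one for busy hours, the same real-time heat value may trigger analysis at night but not during the day, which is how the sketch reflects the reduced processing load.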
In an alternative embodiment, in order to accurately determine the device signature information to avoid omission of the device signature information, in step S22, each user behavior record in the operation records is parsed to determine the device signature information corresponding to each user behavior record, which may specifically include the contents described in the following steps S2211 to S2215.
Step S2211, determining a registration information set corresponding to each user behavior record and a key random number set corresponding to each user behavior record; the registration information set is used for representing equipment registration information corresponding to each user behavior record, the key random number set is used for representing an encryption key sequence corresponding to each user behavior record, and the registration information set and the key random number set respectively comprise a plurality of information packets with different information association degrees.
Step S2212, obtaining the state parameter of one information packet of each user behavior record in the registration information set, and determining the information packet with the maximum information association degree in the key random number set as a reference information packet.
Step S2213, determining a transformation parameter of the state parameter in the reference information packet based on the list record information of the operation record; and establishing a mapping path between the registration information set and the key random number set of each user behavior record according to the matching degree of the parameter characteristics between the state parameters and the transformation parameters.
Step S2214, looking up a signature field sequence in the reference information packet with the transformation parameter as a reference, and mapping the signature field sequence to the information packet where the state parameter is located based on the mapping sequence between path nodes corresponding to the signature field sequence in the mapping path, so as to obtain the signature authority information corresponding to the signature field sequence in the information packet where the state parameter is located.
Step S2215, determining, according to the signature authority information, the device signature information that matches the authority level corresponding to the signature authority information from the embedded point information of each user behavior record.
When the contents described in the above steps S2211 to S2215 are executed, the device signature information can be accurately determined according to the signature authority level, so that omission of the device signature information is avoided.
Based on the same inventive concept, please refer to FIG. 3, which shows a big data information heat analysis cloud platform apparatus 210. The apparatus includes:
a record obtaining module 211, configured to detect a real-time heat value of each piece of big data information in the search engine server in real time, and obtain an operation record of the target big data information in the search engine server when detecting that the real-time heat value of the target big data information in the search engine server exceeds a set threshold;
a request sending module 212, configured to parse each user behavior record in the operation record to determine device signature information corresponding to each user behavior record, generate, according to the device signature information, a request instruction for obtaining a device communication identifier corresponding to the device signature information, and send the request instruction to the search engine server;
an identifier obtaining module 213, configured to obtain the device communication identifier corresponding to the request instruction, which is extracted by the search engine server, based on the request instruction, from the database corresponding to the search engine server;
a heat analysis module 214, configured to extract, from the target terminal device corresponding to each obtained device communication identifier, the communication behavior data corresponding to that target terminal device; judge, according to the communication behavior data, whether heat-hyping behavior exists in each target terminal device; when it is judged that heat-hyping behavior does not exist in any target terminal device, determine that the real-time heat value of the target big data information is a true value; and when it is judged that heat-hyping behavior exists in the target terminal devices, determine that the real-time heat value of the target big data information is a false value.
Optionally, the heat analysis module 214 is specifically configured to:
extracting, from each piece of communication behavior data, the communication list information on the communication between the target terminal device and other devices within a set time period, and determining the target communication IP addresses of the other devices from the communication list information, wherein each target terminal device communicates with one other device within the set time period; calculating the accumulated value of identical target communication IP addresses among all the determined target communication IP addresses, and judging whether the accumulated value exceeds a preset threshold; if the accumulated value exceeds the preset threshold, judging that heat-hyping behavior exists in the target terminal devices that communicated with the other device having the target communication IP address corresponding to the accumulated value; if the accumulated value does not exceed the preset threshold, judging that heat-hyping behavior does not exist in any target terminal device;
or for:
extracting, from each piece of communication behavior data, the login information of the target terminal device within a set time period; for each piece of login information corresponding to each target terminal device, parsing the login information to obtain the corresponding login path, and determining the login server corresponding to that login information through the login path, wherein the login servers may be servers corresponding to different search engines; judging whether the plurality of login servers corresponding to each target terminal device are the same; if they are the same, judging that heat-hyping behavior exists in that target terminal device; otherwise, judging that heat-hyping behavior does not exist in that target terminal device.
Optionally, the heat analysis module 214 is specifically configured to:
determining, from each device communication identifier, a protocol signature and each data privacy level of the target terminal device corresponding to that device communication identifier; on the premise that each target terminal device is determined, based on the protocol signature, to contain a system privacy data group, determining the privacy weight ratio between each data privacy level of each target terminal device under its corresponding service privacy data group and each data privacy level of that target terminal device under its corresponding system privacy data group, according to the data privacy levels of each target terminal device under its corresponding system privacy data group and the firewall sequence of those data privacy levels;
transferring to the system privacy data group each data privacy level of each target terminal device under its corresponding service privacy data group whose privacy weight ratio, relative to the data privacy levels under the corresponding system privacy data group, is within a set value interval;
on the premise that the service privacy data group corresponding to each target terminal device contains a plurality of data privacy levels, determining the privacy weight ratios among the data privacy levels of each target terminal device under its corresponding service privacy data group, according to the data privacy levels of each target terminal device under its corresponding system privacy data group and the firewall sequence of those data privacy levels; marking each data privacy level under the service privacy data group corresponding to each target terminal device based on the privacy weight ratios among the data privacy levels; setting a privacy safety factor for each target data privacy level obtained by the marking, using the data privacy levels of each target terminal device under its corresponding system privacy data group and the firewall sequence of those data privacy levels, and transferring the target data privacy levels to the system privacy data group sequentially, in descending order of the privacy safety factors corresponding to each target terminal device; wherein the number of transferred target data privacy levels does not exceed a predetermined value;
generating a communication connection request according to the number of data privacy levels of each target terminal device under its corresponding system privacy data group, and sending the communication connection request to the corresponding target terminal device; and acquiring an interface verification code fed back by each target terminal device based on the communication connection request, establishing a communication connection with each target terminal device according to the interface verification code, and extracting the communication behavior data corresponding to each target terminal device.
Optionally, the request sending module 212 is specifically configured to:
generating a first check code according to the device signature information, and performing a check calculation on the first check code and the device MAC address of the heat analysis server using a preset check algorithm to obtain a first check result;
sending the first check result and the device signature information to the search engine server; after receiving the first check result and the device signature information, the search engine server performs a check calculation on a second check code corresponding to the device signature information and a pre-stored target MAC address using the preset check algorithm to obtain a second check result, and, when the first check result and the second check result are judged to be consistent, feeds back authorization information to the heat analysis server and adds a digital signature corresponding to the heat analysis server to a preset white list;
and upon receiving the authorization information, generating the request instruction according to the device signature information and embedding the digital signature of the heat analysis server in the request instruction.
For a description of the above functional modules, refer to the description of the method shown in FIG. 2; details are not repeated here.
On the basis of the above, please refer to FIG. 4, which is a schematic diagram of the hardware structure of the heat analysis server 200. The heat analysis server 200 includes an execution processor 221 and a non-volatile memory 222 that communicate with each other; the execution processor 221 reads a computer program from the non-volatile memory 222 and implements steps S21 to S24 shown in FIG. 2 by executing the computer program.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.