CN113901349B - Strong relation analysis method, system and storage medium - Google Patents

Info

Publication number
CN113901349B
Authority
CN
China
Prior art keywords
analysis
behavior
characteristic
correlation
field
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111472424.4A
Other languages
Chinese (zh)
Other versions
CN113901349A (en)
Inventor
张广志
于笑博
成立立
杨占军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beiling Rongxin Datalnfo Science and Technology Ltd
Original Assignee
Beiling Rongxin Datalnfo Science and Technology Ltd
Application filed by Beiling Rongxin Datalnfo Science and Technology Ltd
Priority to CN202111472424.4A
Publication of CN113901349A
Application granted
Publication of CN113901349B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a strong relationship analysis method, system and storage medium. Feature information is extracted for a specific relationship analysis field, so that the specific relationship type of an object can be analyzed and the correlation analysis of object behavior is more targeted. The feature information is arranged by time, space and behavior features, and the feature fields are digitized and converted into a set of coordinate data, which hides the behavior information of the object, prevents the object's privacy from being disclosed or tampered with, and allows privacy-desensitized data from multi-source heterogeneous platforms to be fused. The relationship analysis result is obtained by analyzing the correlation of behavior characteristic curves, so behavior features are considered comprehensively, the result is less affected by the variability of population behavior, and strong-relationship social networks are captured more accurately.

Description

Strong relation analysis method, system and storage medium
Technical Field
The present application belongs to the technical field of data analysis, and more particularly, to a strong relationship analysis method, system and storage medium.
Background
With the development of internet information technology, people leave behind a large amount of network operation information through activities such as browsing and performing operations online. Analyzing this information reveals the relevance among events, from which people's behavior habits and preferences can be judged; this has become a main means of meeting practical needs such as internet cluster analysis, investigation and tracing, and active marketing.
A strong relationship generally means that the actors interact with each other to a high degree, and analyzing strong relationships between events helps to quickly grasp the association logic between them. However, because of high population mobility, changeable personal behavior and the complex factors behind the magnetic attraction effect of modern cities, traditional analysis methods suffer from large accuracy deviations and difficult data processing when capturing strong-relationship social networks.
Disclosure of Invention
In view of the above, the present application provides a strong relationship analysis method, system and storage medium, which can adapt to modern diversified population behavior patterns.
The specific technical scheme of the application is as follows:
The first aspect of the present application provides a strong relationship analysis method, including the following steps:
acquiring behavior data of an analysis object, wherein the behavior data comprises government affairs data, consumption data and browsing data, and extracting feature information flowing in an analysis field;
arranging the feature information in partitions according to attributes, and digitizing the feature fields under each partition to generate coordinate information;
drawing a behavior characteristic curve from the feature fields, and calculating the correlation between the behavior characteristic curves of different analysis objects;
and judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation.
Preferably, extracting the feature information flowing in the analysis field specifically includes:
defining the analysis field of the behavior data according to field attributes and semantic associations in the behavior data, wherein the analysis field comprises office matters, life matters and entertainment matters;
and extracting the temporal, spatial and behavioral features that fall within the analysis field so defined.
Preferably, digitizing the feature fields under each partition to generate the coordinate information specifically includes:
arranging the time fields sequentially at intervals, determining the interval distance according to the length of stay, and generating time coordinates;
and arranging the space fields around the resident place as the center, determining the spacing distance according to geographic position, and generating space coordinates.
Preferably, digitizing the feature fields under each partition to generate the coordinate information specifically includes:
extracting keywords of the behavior fields, and introducing different assignment operators according to the keywords;
and assigning values to the space coordinates based on the assignment operators to generate behavior coordinates.
Preferably, drawing the behavior characteristic curve from the feature fields specifically includes:
connecting the coordinate information in series, event by event and in order of occurrence time, wherein the behavior characteristic curve is a periodic fluctuation curve relating to time, space and behavior.
Preferably, calculating the correlation between the behavior characteristic curves of different analysis objects specifically includes:
partitioning each curve by event, selecting a calculation model for each partition according to the behavior attribute, and calculating the correlation coefficient of the two curve segments belonging to the same event;
and performing weight analysis on the correlation coefficients of the curve segments to obtain a correlation result.
Preferably, after calculating the correlation between the behavior characteristic curves of different analysis objects and before judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation, the method further includes:
extracting frequent item sets from the feature information, and comparing the coverage of the frequent item sets of different analysis objects;
and correcting the correlation result of the behavior characteristic curves according to the size of the coverage.
Preferably, the method further comprises the following steps:
judging whether the correlation of the behavior characteristic curves is within a preset range;
if the correlation is within the preset range, pushing the feature resources of the corresponding analysis object to the terminal of the target analysis object as recommended content;
and if the correlation is not within the preset range, judging whether the degree to which the correlation deviates from the preset range exceeds a preset value, and if so, pushing the feature resources of the corresponding analysis object to the terminal of the target analysis object as guide content.
A second aspect of the present application provides a strong relationship analysis system, including a memory and a processor, where the memory includes a strong relationship analysis program, and when the program is executed by the processor, the following steps are implemented:
acquiring behavior data of an analysis object, wherein the behavior data comprises government affairs data, consumption data and browsing data, and extracting feature information flowing in an analysis field;
arranging the feature information in partitions according to attributes, and digitizing the feature fields under each partition to generate coordinate information;
drawing a behavior characteristic curve from the feature fields, and calculating the correlation between the behavior characteristic curves of different analysis objects;
and judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation.
Preferably, extracting the feature information flowing in the analysis field specifically includes:
defining the analysis field of the behavior data according to field attributes and semantic associations in the behavior data, wherein the analysis field comprises office matters, life matters and entertainment matters;
and extracting the temporal, spatial and behavioral features that fall within the analysis field so defined.
Preferably, digitizing the feature fields under each partition to generate the coordinate information specifically includes:
arranging the time fields sequentially at intervals, determining the interval distance according to the length of stay, and generating time coordinates;
and arranging the space fields around the resident place as the center, determining the spacing distance according to geographic position, and generating space coordinates.
Preferably, digitizing the feature fields under each partition to generate the coordinate information specifically includes:
extracting keywords of the behavior fields, and introducing different assignment operators according to the keywords;
and assigning values to the space coordinates based on the assignment operators to generate behavior coordinates.
Preferably, drawing the behavior characteristic curve from the feature fields specifically includes:
connecting the coordinate information in series, event by event and in order of occurrence time, wherein the behavior characteristic curve is a periodic fluctuation curve relating to time, space and behavior.
Preferably, calculating the correlation between the behavior characteristic curves of different analysis objects specifically includes:
partitioning each curve by event, selecting a calculation model for each partition according to the behavior attribute, and calculating the correlation coefficient of the two curve segments belonging to the same event;
and performing weight analysis on the correlation coefficients of the curve segments to obtain a correlation result.
Preferably, after calculating the correlation between the behavior characteristic curves of different analysis objects and before judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation, the steps further include:
extracting frequent item sets from the feature information, and comparing the coverage of the frequent item sets of different analysis objects;
and correcting the correlation result of the behavior characteristic curves according to the size of the coverage.
Preferably, the steps further include:
judging whether the correlation of the behavior characteristic curves is within a preset range;
if the correlation is within the preset range, pushing the feature resources of the corresponding analysis object to the terminal of the target analysis object as recommended content;
and if the correlation is not within the preset range, judging whether the degree to which the correlation deviates from the preset range exceeds a preset value, and if so, pushing the feature resources of the corresponding analysis object to the terminal of the target analysis object as guide content.
A third aspect of the present application provides a computer-readable storage medium, which includes a strong relationship analysis program, and when the program is executed by a processor, the steps of the strong relationship analysis method are implemented.
In summary, the present application provides a strong relationship analysis method, system and storage medium, including: acquiring behavior data of an analysis object, and extracting feature information flowing in an analysis field; arranging the feature information in partitions according to attributes, and digitizing the feature fields under each partition to generate coordinate information; drawing a behavior characteristic curve from the feature fields, and calculating the correlation between the behavior characteristic curves of different analysis objects; and judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation. Because feature information is extracted for a specific relationship analysis field, the specific relationship type of an object can be analyzed and the correlation analysis of object behavior is more targeted. Because the feature information is arranged by time, space and behavior features, and the feature fields are digitized and converted into a set of coordinate data, the behavior information of the object is hidden, the object's privacy is protected from disclosure and tampering, and privacy-desensitized data from multi-source heterogeneous platforms can be fused. Because the relationship analysis result is obtained by analyzing the correlation of behavior characteristic curves, behavior features are considered comprehensively, the result is less affected by the variability of population behavior, and strong-relationship social networks are captured more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a strong relationship analysis method of the present application;
FIG. 2 is a block diagram of a strong relationship analysis system of the present application.
Detailed Description
In order to make the objects, features and advantages of the present application more obvious and understandable, the technical solutions in the embodiments of the present application are clearly and completely described, and it is obvious that the embodiments described below are only a part of the embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a flowchart of a strong relationship analysis method according to the present application.
A first aspect of an embodiment of the present application provides a strong relationship analysis method, including the following steps:
S102: acquiring behavior data of an analysis object, wherein the behavior data comprises government affairs data, consumption data and browsing data, and extracting feature information flowing in an analysis field;
S104: arranging the feature information in partitions according to attributes, and digitizing the feature fields under each partition to generate coordinate information;
S106: drawing a behavior characteristic curve from the feature fields, and calculating the correlation between the behavior characteristic curves of different analysis objects;
S108: judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation.
It should be noted that the behavior data of the analysis object in S102 may be derived from government-side data, user communication data, consumption data, web browsing logs, and the like. Feature information is extracted for the specific relationship analysis field, so that the specific relationship type of the object can be analyzed and the relationship analysis of object behavior is more targeted; for example, the relationship between objects may be classified as colleagues, friends or family.
In S104, the feature information may be arranged by time, space and behavior features, and the feature fields are digitized. For example, identical feature fields are mapped to the same number, and the different numbers are arranged in a certain order and converted into a set of coordinate data. This hides the behavior information of the object, protects the object's privacy from disclosure and tampering, and allows privacy-desensitized data from multi-source heterogeneous platforms to be fused.
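The digitization described for S104 can be sketched as follows. This is only an illustrative sketch: the function name `encode_fields` and the choice of assigning codes in order of first appearance are assumptions, since the source says only that identical fields share a number and that the numbers are arranged "in a certain order".

```python
from typing import Dict, Hashable, Iterable, List


def encode_fields(values: Iterable[Hashable]) -> List[int]:
    """Replace each feature field value with an integer code.

    Identical values always receive the same code. Codes are assigned in
    order of first appearance (an assumption made for this sketch).
    """
    codes: Dict[Hashable, int] = {}
    encoded: List[int] = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)  # next unused integer
        encoded.append(codes[value])
    return encoded


# Hypothetical place names; only the integer codes leave this function.
places = ["home", "office", "cafe", "office", "home"]
print(encode_fields(places))  # [0, 1, 2, 1, 0]
```

Downstream steps then see only the codes, not the original field text, which is the sense in which the behavior information is hidden.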
In S106, the coordinate information of the feature fields is connected in series, event by event, to draw a behavior characteristic curve. The behavior characteristic curve may be a periodic variation curve relating to time, space and behavior features, and reflects the behavior trajectory and preferences of the population within a certain time range. The correlation may be calculated by analyzing the degree of overlap of the curves through an empirical function, by calculating a correlation coefficient, or by fitting analysis.
In S108, the relationship analysis result is obtained by analyzing the correlation of the behavior characteristic curves, so behavior features are considered comprehensively, the result is less affected by the variability of population behavior, and strong-relationship social networks are captured more accurately. Besides judging whether a relationship is strong, the degree, type, influence and associations of the relationship can also be analyzed, which provides a solid basis for marketing schemes and policy making.
In summary, the strong relationship analysis method extracts feature information for a specific relationship analysis field, so that the specific relationship type of an object can be analyzed and the correlation analysis of object behavior is more targeted. The feature information is arranged by time, space and behavior features, and the feature fields are digitized and converted into a set of coordinate data, which hides the behavior information of the object, protects the object's privacy from disclosure and tampering, and allows privacy-desensitized data from multi-source heterogeneous platforms to be fused. The relationship analysis result is obtained by analyzing the correlation of the behavior characteristic curves, so behavior features are considered comprehensively, the result is less affected by the variability of population behavior, and strong-relationship social networks are captured more accurately.
According to the embodiment of the application, extracting the feature information flowing in the analysis field specifically comprises the following steps:
defining the analysis field of the behavior data according to field attributes and semantic associations in the behavior data, wherein the analysis field comprises office matters, life matters and entertainment matters;
and extracting the temporal, spatial and behavioral features that fall within the analysis field so defined.
It should be noted that the behavior data may be classified by attribute according to the literal or default attributes of its feature fields, such as living expenses or entertainment shopping in the consumption data. Alternatively, the attribute classification may be inferred from the semantic associations of the feature fields, such as the nature of the place corresponding to a geographic location. The time, space and behavior features in the corresponding analysis field are then extracted, and interfering data outside the analysis field are removed.
According to the embodiment of the application, digitizing the feature fields under each partition to generate the coordinate information specifically comprises the following steps:
arranging the time fields sequentially at intervals, determining the interval distance according to the length of stay, and generating time coordinates;
and arranging the space fields around the resident place as the center, determining the spacing distance according to geographic position, and generating space coordinates.
It should be noted that the time coordinates may be arranged in the horizontal direction and the space coordinates in the vertical direction, so that together they form a two-dimensional scatter plot. The interval between time coordinates is determined by the length of stay at each place; the spacing of the space coordinates is determined by the geographic distance from the resident place. The resident place may be the object's residence, workplace, etc., and the resident places of different objects, although different, may be aligned to the same horizontal reference line.
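One way to read the time and space coordinates described above is sketched below. It is a minimal illustration under stated assumptions, not the patented procedure: the `Visit` tuple layout, the use of hours for length of stay, and the haversine great-circle distance (with a mean Earth radius of 6371 km) as the "geographic distance from the resident place" are all choices made for this example.

```python
import math
from typing import List, Tuple

# Hypothetical record layout: (latitude, longitude, length of stay in hours).
Visit = Tuple[float, float, float]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def to_coordinates(visits: List[Visit], resident: Tuple[float, float]) -> List[Tuple[float, float]]:
    """Turn time-ordered visits into (time coordinate, space coordinate) points.

    The gap between consecutive time coordinates equals the length of stay;
    the space coordinate is the distance from the resident place, so every
    object's resident place sits on the same baseline y = 0.
    """
    points: List[Tuple[float, float]] = []
    t = 0.0
    for lat, lon, stay_hours in visits:
        y = haversine_km(resident[0], resident[1], lat, lon)
        points.append((t, y))
        t += stay_hours
    return points
```

Because the space coordinate is measured relative to each object's own resident place, two objects living in different districts still share the same horizontal reference line.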
According to the embodiment of the application, digitizing the feature fields under each partition to generate the coordinate information specifically comprises the following steps:
extracting keywords of the behavior fields, and introducing different assignment operators according to the keywords;
and assigning values to the space coordinates based on the assignment operators to generate behavior coordinates.
It should be noted that the keywords are identification data used to identify the category of a behavior field, and may be extracted by big data techniques or obtained by semantic association analysis. The behavior fields are assigned values according to their categories; for example, the behaviors that objects may perform at the same place are divided into consumption, sales, accompanying, supervision, and the like. Assigning values to the behavior fields simplifies the analysis of behavior features and facilitates subsequent data processing.
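The source does not define the assignment operators, so the following is only one possible reading: a keyword identifies a behavior category, and the operator attaches that category's numeric value to the space coordinate. The keyword table, the category values and the function name `behavior_coordinate` are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical keyword table: keyword -> behavior category.
KEYWORD_CATEGORY: Dict[str, str] = {
    "payment": "consumption",
    "purchase": "consumption",
    "sale": "sales",
    "escort": "accompany",
    "inspection": "supervision",
}

# Hypothetical assignment operators, reduced here to one number per category.
CATEGORY_VALUE: Dict[str, float] = {
    "consumption": 1.0,
    "sales": 2.0,
    "accompany": 3.0,
    "supervision": 4.0,
}


def behavior_coordinate(behavior_text: str, space_coordinate: float) -> Optional[Tuple[float, float]]:
    """Tag a space coordinate with the value of the first matching behavior category.

    Returns (space coordinate, category value), or None if no keyword matches.
    """
    text = behavior_text.lower()
    for keyword, category in KEYWORD_CATEGORY.items():
        if keyword in text:
            return (space_coordinate, CATEGORY_VALUE[category])
    return None


print(behavior_coordinate("Card payment at store", 3.2))  # (3.2, 1.0)
```

Two objects at the same place then receive different behavior coordinates when one is consuming and the other is selling, which is the distinction the categories above are meant to capture.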
According to the embodiment of the application, drawing the behavior characteristic curve from the feature fields specifically includes:
connecting the coordinate information in series, event by event and in order of occurrence time, wherein the behavior characteristic curve is a periodic fluctuation curve relating to time, space and behavior.
It should be noted that the behavior characteristic curve may be a two-dimensional periodic fluctuation curve with time as the abscissa and spatial position as the ordinate. The behavior attribute and preference of the object are recorded at each peak, so that the curve reflects the behavior trajectory of the object over a certain period of time.
According to the embodiment of the application, calculating the correlation between the behavior characteristic curves of different analysis objects specifically comprises the following steps:
partitioning each curve by event, selecting a calculation model for each partition according to the behavior attribute, and calculating the correlation coefficient of the two curve segments belonging to the same event;
and performing weight analysis on the correlation coefficients of the curve segments to obtain a correlation result.
It should be noted that the curve may be partitioned at the start and end time nodes of each event, or at the nodes where the spatial position changes. Because the behavior fluctuations generated under different behavior attributes have different characteristics, calculation models need to be established for different behavior purposes, and the parameters of each model can be determined from the mean values obtained by analyzing a large amount of object data. The correlation coefficients of the curve segments are then weighted and summed to obtain the correlation result of the two curves. For example, when analyzing a relationship between colleagues, the weight of work activities or of spare-time activities can be increased in turn when calculating the correlation, so as to identify the potential mode of business cooperation between the objects and uncover potential cooperative relationships in their behavior.
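The segment-wise calculation can be sketched as below. The source leaves the per-attribute calculation models unspecified, so this example substitutes the Pearson correlation coefficient for every segment and combines the segments with a weighted average; the dictionary layout (event name mapped to sampled space coordinates) and the weights are assumptions for illustration.

```python
import math
from typing import Dict, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("segments must have the same length, at least 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat segment contributes no correlation
    return cov / math.sqrt(var_x * var_y)


def weighted_correlation(
    curve_a: Dict[str, Sequence[float]],
    curve_b: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-event correlation coefficients.

    Each curve maps an event name to the space coordinates sampled at the
    same time points for both objects. Only events present in both curves
    and in the weight table are compared.
    """
    shared = [e for e in curve_a if e in curve_b and e in weights]
    total = sum(weights[e] for e in shared)
    if total == 0:
        return 0.0
    return sum(weights[e] * pearson(curve_a[e], curve_b[e]) for e in shared) / total


# Hypothetical curves: work segments move together, leisure segments oppose.
a = {"work": [0, 5, 5, 0], "leisure": [0, 2, 4, 6]}
b = {"work": [0, 10, 10, 0], "leisure": [6, 4, 2, 0]}
print(weighted_correlation(a, b, {"work": 3.0, "leisure": 1.0}))  # 0.5
```

Raising the weight of the work segment, as in the colleague example above, pulls the overall result toward the work-time correlation.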
According to the embodiment of the present application, after calculating the correlation between the behavior characteristic curves of different analysis objects and before judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation, the method further includes:
extracting frequent item sets from the feature information, and comparing the coverage of the frequent item sets of different analysis objects;
and correcting the correlation result of the behavior characteristic curves according to the size of the coverage.
It should be noted that a frequent item set refers to the minimum set of fields whose frequency of occurrence among the feature fields is higher than a threshold, and the coverage of the frequent item sets can be determined by their degree of similarity. The denser the frequent item sets of different analysis objects, the higher the relevance between them, and the degree of density can be used as a fixed coefficient to adjust the correlation of the behavior characteristic curves. The behavior characteristic curve reflects the frequency of the relationship and the frequent item set reflects its strength, so analyzing the two together reflects the degree of the relationship more accurately.
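A simplified sketch of this correction step follows. It makes several assumptions that the source does not state: frequent item sets are reduced to single frequent fields, coverage is measured with the Jaccard index, and the correction coefficient is a linear blend controlled by a hypothetical `floor` parameter.

```python
from collections import Counter
from typing import Sequence, Set


def frequent_items(fields: Sequence[str], min_support: float) -> Set[str]:
    """Fields whose relative frequency is at least min_support.

    Simplification: only single-field item sets are considered.
    """
    n = len(fields)
    if n == 0:
        return set()
    counts = Counter(fields)
    return {item for item, count in counts.items() if count / n >= min_support}


def coverage(items_a: Set[str], items_b: Set[str]) -> float:
    """Jaccard overlap of two frequent item sets (assumed coverage measure)."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0


def corrected_correlation(correlation: float, cov: float, floor: float = 0.5) -> float:
    """Scale the curve correlation by a coefficient between floor and 1."""
    return correlation * (floor + (1.0 - floor) * cov)


# Hypothetical feature fields for two analysis objects.
a = frequent_items(["gym", "cafe", "gym", "office", "gym", "cafe"], 0.3)
b = frequent_items(["cafe", "cafe", "library", "library", "cafe", "office"], 0.3)
print(sorted(a), sorted(b))  # ['cafe', 'gym'] ['cafe', 'library']
print(round(corrected_correlation(0.9, coverage(a, b)), 3))  # 0.6
```

With full coverage the curve correlation is left unchanged; with no shared frequent fields it is scaled down to `floor` times its value.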
According to the embodiment of the application, the method further comprises the following steps:
judging whether the correlation of the behavior characteristic curves is within a preset range;
if the correlation is within the preset range, pushing the feature resources of the corresponding analysis object to the terminal of the target analysis object as recommended content;
and if the correlation is not within the preset range, judging whether the degree to which the correlation deviates from the preset range exceeds a preset value, and if so, pushing the feature resources of the corresponding analysis object to the terminal of the target analysis object as guide content.
It should be noted that a relationship whose correlation is within the preset range is a strong relationship; it exhibits a magnetic attraction effect and is suitable for scenarios such as recommending similar products and querying associated traces. A relationship whose correlation is outside the preset range and below a certain preset value is a weak relationship; information and resources are not readily shared between the two objects, so such a relationship is suitable for offering an object new ideas and directions of choice and broadening its existing perceptions.
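The branching in these steps can be written out as a small decision function. The function name, the return labels and the example thresholds are hypothetical; the source gives no numeric values for the preset range or the preset deviation.

```python
from typing import Optional


def push_decision(correlation: float, low: float, high: float, max_deviation: float) -> Optional[str]:
    """Decide what to push to the target analysis object's terminal.

    Returns "recommend" when the correlation lies in [low, high],
    "guide" when it lies outside by more than max_deviation,
    and None when it is outside the range but only slightly.
    """
    if low <= correlation <= high:
        return "recommend"
    # Distance from the nearest edge of the preset range.
    deviation = low - correlation if correlation < low else correlation - high
    return "guide" if deviation > max_deviation else None


print(push_decision(0.8, 0.7, 1.0, 0.3))  # recommend
print(push_decision(0.2, 0.7, 1.0, 0.3))  # guide
print(push_decision(0.6, 0.7, 1.0, 0.3))  # None
```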
In another embodiment of the present application, after acquiring the behavior data of an analysis object and before extracting the feature information flowing in the analysis field, the method further includes:
acquiring the source address of the behavior data, and identifying according to the communication protocol whether the source address is of a public nature;
and if the source address is of a public nature, removing the behavior data.
It should be noted that when the behavior data come from a public address, such as certain Internet of Things cards, these data do not represent population behavior and need to be identified and eliminated.
In another embodiment of the present application, digitizing the feature fields under each partition to generate the coordinate information specifically includes:
if some of the necessary feature fields are missing from an event unit, using the existing feature fields as nodes to retrieve the signaling data generated by the analysis object under those feature fields;
screening the signaling data for feature information associated with the attribute of the missing feature field to generate a simulated feature field;
and arranging the simulated feature fields in a certain order to generate the coordinate information.
It should be noted that if part of the feature fields is missing, for example because acquisition of the time feature failed, the signaling data corresponding to the spatial feature of the event unit may be retrieved and the time information extracted from it. Alternatively, theoretical time information may be calculated from the spatial and behavior features of the event unit to obtain the simulated feature field. The simulated feature fields are then arranged by time sequence, spatial distance or behavior preference.
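The first option above, recovering a missing time field from signaling data, might look like the sketch below. The record layout (dictionaries with `place` and `time` keys), the choice of the earliest matching signaling timestamp, and the `simulated` flag are assumptions made for illustration only.

```python
from typing import Any, Dict, List


def fill_missing_time(event: Dict[str, Any], signaling: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the event with a simulated time field when the original is missing.

    The existing space field ("place") is used as the node for looking up
    signaling records; the earliest matching timestamp becomes the
    simulated time (an assumption of this sketch).
    """
    if event.get("time") is not None:
        return event  # nothing to repair
    candidates = [rec["time"] for rec in signaling if rec.get("place") == event.get("place")]
    if not candidates:
        return event  # no signaling data for this place; leave the gap
    return {**event, "time": min(candidates), "simulated": True}


# Hypothetical event unit with a failed time acquisition.
event = {"place": 7, "behavior": 1.0, "time": None}
signaling = [
    {"place": 7, "time": 1030},
    {"place": 3, "time": 900},
    {"place": 7, "time": 1015},
]
print(fill_missing_time(event, signaling))
# {'place': 7, 'behavior': 1.0, 'time': 1015, 'simulated': True}
```

Marking the field as simulated keeps it distinguishable from directly acquired data when the coordinates are later arranged.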
Fig. 2 is a block diagram of a strong relationship analysis system according to the present application.
A second aspect of the embodiment of the present application provides a strong relationship analysis system 2, including a memory 21 and a processor 22, where the memory 21 includes a strong relationship analysis program, and when the program is executed by the processor 22, the following steps are implemented:
acquiring behavior data of an analysis object, wherein the behavior data comprises government affair data, consumption data and browsing data, and extracting characteristic information circulating in the analysis field;
arranging the characteristic information in partitions according to attributes, and performing data processing on the characteristic field under each partition to generate coordinate information;
drawing a behavior characteristic curve related to the characteristic field, and calculating the correlation of the behavior characteristic curves of different analysis objects;
and judging the relationship strength of different analysis objects in the analysis field according to the correlation size.
It should be noted that the behavior data of the analysis object may be derived from government-side data, user communication data, consumption data, web browsing logs, and the like. Extracting the characteristic information relevant to a specific relationship analysis field makes it possible to analyze the specific relationship type of the object, so that the relationship analysis of the object's behavior is more targeted and the relationship type can be classified as colleague, friend or family. The characteristic information can be arranged according to time, space and behavior characteristics, and the characteristic fields are digitized: for example, identical characteristic fields are assigned the same number, and the different numbers are arranged in a certain order and converted into a group of coordinate data. This hides the behavior information of the object, prevents the object's private data from being disclosed or tampered with, and allows multi-source heterogeneous platform data that satisfies privacy desensitization to be fused. The coordinate information of the characteristic fields is connected in series, event by event, to draw a behavior characteristic curve; the curve may be a periodic variation curve of time, space and behavior characteristics that reflects the behavior trajectory and preferences of the population within a certain time range. The correlation may be calculated by analyzing the degree of overlap of the curves with an empirical function, by calculating a correlation coefficient, or by fitting analysis. Because the relationship analysis result is obtained from the correlation of the behavior characteristic curves, behavior characteristics are considered comprehensively, the result is less affected by the variability of population behavior, and strong-relationship social networks are captured more accurately.
In addition to identifying strong relationships, the analysis can cover relationship degree, relationship type, influence and associated relationships, providing a sound basis for marketing schemes and policy making.
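The digitization step mentioned above (identical characteristic fields receive the same number) can be illustrated with a minimal sketch. The function name `encode_fields` and the rule of numbering values in order of first appearance are assumptions; the application only states that the numbers are arranged in a certain order.

```python
def encode_fields(values):
    """Assign the same integer to identical field values, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes) + 1
        encoded.append(codes[value])
    return encoded, codes


places = ["home", "office", "gym", "office", "home"]
encoded, codebook = encode_fields(places)
# encoded is [1, 2, 3, 2, 1]; the raw place names are no longer needed downstream.
```

Only the holder of `codebook` can map the numbers back to the original fields, which is one simple way to read the privacy-desensitization remark above.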
The strong relationship analysis system extracts the characteristic information relevant to a specific relationship analysis field, so it can analyze the specific relationship type of the objects and make the relevance analysis of object behavior more targeted; the characteristic information is arranged according to time, space and behavior characteristics, and the characteristic fields are digitized and converted into a group of coordinate data, which hides the behavior information of the object, prevents the object's private data from being disclosed or tampered with, and allows multi-source heterogeneous platform data that satisfies privacy desensitization to be fused; and because the relationship analysis result is obtained from the correlation of the behavior characteristic curves, behavior characteristics are considered comprehensively, the result is less affected by the variability of population behavior, and strong-relationship social networks are captured more accurately.
According to the embodiment of the application, extracting the feature information circulating in the analysis field specifically comprises the following steps:
defining an analysis field of the behavior data according to field attributes and semantic association in the behavior data, wherein the analysis field comprises office matters, life matters and entertainment matters;
and extracting the temporal, spatial and behavioral features defined within the analysis field to which the behavior data belongs.
It should be noted that the behavior data may be classified by attribute according to the literal or default attributes of its characteristic fields, such as living expenses or entertainment and shopping expenses in the consumption data; alternatively, the classification may be inferred from the semantic association of the characteristic fields, such as the nature of the place corresponding to a geographic location. The time, space and behavior characteristics within the corresponding analysis field are then extracted, and interfering data outside the analysis field is removed.
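A simplified sketch of this classification and extraction is given below. The keyword table `DOMAIN_KEYWORDS`, the record keys and the plain keyword match (standing in for the semantic association analysis) are all assumptions for illustration.

```python
# Hypothetical keyword table for the three analysis fields named in the text.
DOMAIN_KEYWORDS = {
    "office": {"meeting", "commute", "office"},
    "life": {"grocery", "rent", "utilities"},
    "entertainment": {"cinema", "game", "concert"},
}


def classify_domain(record):
    """Return the analysis field of a record, or None if no keyword matches."""
    tokens = set(record["category"].lower().split())
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if tokens & keywords:
            return domain
    return None


def extract_features(records, domain):
    """Keep (time, space, behavior) triples inside one analysis field only."""
    return [
        (r["time"], r["place"], r["category"])
        for r in records
        if classify_domain(r) == domain
    ]


records = [
    {"time": 9.0, "place": "tower_a", "category": "weekly meeting"},
    {"time": 19.5, "place": "mall", "category": "cinema ticket"},
    {"time": 20.0, "place": "store", "category": "grocery shopping"},
]
office_features = extract_features(records, "office")
```

Records that fall outside the chosen field, here the cinema and grocery entries, are treated as interference and left out.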
According to the embodiment of the application, the characteristic fields under each partition are subjected to data processing, and the generation of the coordinate information specifically comprises the following steps:
arranging the time fields in a sequential and interval manner, determining an interval distance according to the length of stay, and generating a time coordinate;
and taking the resident place as a center, distributing and arranging the space fields, determining the spacing distance according to the geographic position and generating the space coordinates.
It should be noted that the time coordinates may be arranged in the horizontal direction and the space coordinates in the vertical direction, so that together they form a two-dimensional scatter plot. The spacing of the time coordinates is determined by the length of stay at each location; the spacing of the space coordinates is determined by the geographic distance from the resident place, which may be the object's residence, workplace, or the like. Arranging the coordinates in this way gives the resident place of every object the same horizontal reference, even though the actual resident places differ.
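One possible reading of the two coordinate rules is sketched below. Treating locations as planar (x, y) points and using Euclidean distance are simplifying assumptions; the application does not fix a distance measure.

```python
import math
from itertools import accumulate


def time_coordinates(stays):
    """Horizontal position of each visit: visits are in order, spaced by stay length."""
    if not stays:
        return []
    return [0.0] + list(accumulate(stays))[:-1]


def space_coordinates(places, resident):
    """Vertical position of each visit: distance from the resident place."""
    return [math.hypot(x - resident[0], y - resident[1]) for x, y in places]


stays = [2.0, 1.0, 3.0]                            # hours spent at each place
places = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]      # hypothetical planar positions
xs = time_coordinates(stays)                       # [0.0, 2.0, 3.0]
ys = space_coordinates(places, resident=(0.0, 0.0))  # [0.0, 5.0, 10.0]
scatter = list(zip(xs, ys))
```

Because every vertical value is measured from the object's own resident place, two objects with different homes still share the same zero line.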
According to the embodiment of the application, the characteristic fields under each partition are subjected to data processing, and the generation of the coordinate information specifically comprises the following steps:
extracting keywords of the behavior field, and introducing different assignment operators according to the keywords;
and assigning the space coordinate based on an assignment operator to generate a behavior coordinate.
It should be noted that a keyword is identification data that identifies the category of a behavior field; keywords may be extracted by big-data computation or by semantic association analysis. The behavior fields are assigned values by category: for example, the behaviors that objects may perform at the same place are divided into consumption, sales, accompaniment, supervision, and so on. Assigning values to the behavior fields simplifies the analysis of behavior characteristics and facilitates subsequent data processing.
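The keyword-driven assignment can be pictured with the following sketch. The operator table, its multipliers and the substring match used for keyword extraction are hypothetical; the application does not define the form of an assignment operator.

```python
# Hypothetical assignment operators, one per behavior keyword (assumed values).
OPERATORS = {
    "consumption": lambda y: y * 1.0,
    "sales": lambda y: y * 2.0,
    "accompany": lambda y: y * 3.0,
    "supervision": lambda y: y * 4.0,
}


def extract_keyword(behavior_field):
    """Return the first known keyword found in the behavior field, or None."""
    text = behavior_field.lower()
    for keyword in OPERATORS:
        if keyword in text:
            return keyword
    return None


def behavior_coordinate(space_coord, behavior_field):
    """Apply the operator chosen by the keyword to the space coordinate."""
    keyword = extract_keyword(behavior_field)
    if keyword is None:
        return None
    return OPERATORS[keyword](space_coord)


sales_coord = behavior_coordinate(5.0, "Retail sales shift")
```

Two objects at the same place then receive different behavior coordinates when one is selling and the other is consuming.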
According to the embodiment of the application, the drawing of the behavior characteristic curve related to the characteristic field specifically includes:
connecting the coordinate information in series, event by event, in order of occurrence time, wherein the behavior characteristic curve is a periodic fluctuation curve of time, space and behavior.
It should be noted that the behavior characteristic curve may be a two-dimensional periodic fluctuation curve with time as an abscissa and a spatial position as an ordinate, and the behavior attribute and preference of the object are recorded at each peak to reflect the behavior trajectory of the object over a certain period of time.
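Connecting the coordinates in series can be sketched as follows, assuming each event already carries its own list of (time, space, behavior) points; the key names `start` and `points` are illustrative.

```python
def build_curve(events):
    """Concatenate the points of each event in order of event start time."""
    ordered = sorted(events, key=lambda e: e["start"])
    curve = []
    for event in ordered:
        curve.extend(event["points"])
    return curve


events = [
    {"start": 12.0, "points": [(12.0, 1.5, "consume"), (13.0, 1.5, "consume")]},
    {"start": 9.0, "points": [(9.0, 5.0, "work"), (12.0, 5.0, "work")]},
]
curve = build_curve(events)
```

Plotting the first two values of each point gives the two-dimensional curve described above, with the third value available as the behavior annotation.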
According to the embodiment of the application, the calculation of the correlation of the behavior characteristic curves of different analysis objects specifically comprises the following steps:
partitioning the curve by taking an event as a unit, selecting a corresponding calculation model in the partition according to the behavior attribute, and calculating the correlation coefficient of two curve segments in the same event;
and carrying out weight analysis on the correlation coefficient of each curve segment to obtain a correlation result.
It should be noted that the curve may be partitioned at the start and end time nodes of each event, or at the nodes where the spatial position changes. Because the behavior fluctuations produced under different behavior attributes have different characteristics, a calculation model needs to be established for each behavior purpose, and the parameters of each model can be determined from mean values obtained by analyzing a large amount of object data. The correlation coefficients of the curve segments are then added, according to their weights, to obtain the correlation result of the two curves. For example, when analyzing a relationship between colleagues, the weight of main-job activity or of spare-time work can be increased in turn when calculating the correlation, so as to identify the potential business cooperation mode of the objects and the potential cooperative relationships in their behavior.
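A rough sketch of the per-event correlation and weighting follows. It uses the Pearson coefficient for every segment in place of the behavior-specific calculation models, and normalizes by the total weight; both choices, as well as the segment layout, are assumptions rather than the method of the application.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def weighted_correlation(segments_a, segments_b, weights):
    """Weighted mean of per-event correlations over the events both objects share."""
    shared = [e for e in segments_a if e in segments_b]
    total = sum(weights.get(e, 1.0) for e in shared)
    if total == 0:
        return 0.0
    return sum(
        weights.get(e, 1.0) * pearson(segments_a[e], segments_b[e]) for e in shared
    ) / total


# Space coordinates of two objects, sampled at the same instants within each event.
segments_a = {"work": [1, 2, 3, 4], "lunch": [2, 4, 6, 8]}
segments_b = {"work": [2, 4, 6, 8], "lunch": [8, 6, 4, 2]}
result = weighted_correlation(segments_a, segments_b, {"work": 3.0, "lunch": 1.0})
```

Raising the weight of the work segment, as in the colleague example above, pulls the combined result toward the correlation observed during working hours.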
According to the embodiment of the present application, after calculating the correlation of the behavior characteristic curves of different analysis objects, before judging the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation, the method further includes:
extracting a frequent item set in the characteristic information, and comparing the coverage of the frequent item set of different analysis objects;
and correcting the correlation result of the behavior characteristic curve according to the size of the coverage.
It should be noted that a frequent item set is the minimal set of fields whose frequency of occurrence among the characteristic fields is higher than a threshold, and the coverage of the frequent item sets can be judged by their degree of similarity. The denser the shared frequent item sets of different analysis objects, the higher the relevance between the objects, and the degree of density can be used as a fixed coefficient to adjust the correlation of the behavior characteristic curves. The behavior characteristic curve reflects the frequency of the relationship and the frequent item set reflects its strength; analyzing the two together reflects the degree of the relationship more accurately.
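The correction can be illustrated with a deliberately simplified sketch. It counts single fields only instead of mining multi-field item sets, measures coverage with Jaccard similarity, and multiplies the correlation by that coverage; all three choices are assumptions made for this example.

```python
from collections import Counter


def frequent_items(fields, min_count):
    """Fields that occur at least min_count times (single-item simplification)."""
    counts = Counter(fields)
    return {field for field, count in counts.items() if count >= min_count}


def coverage(items_a, items_b):
    """Jaccard similarity of two frequent item sets, used here as the coverage."""
    if not items_a and not items_b:
        return 0.0
    return len(items_a & items_b) / len(items_a | items_b)


def corrected_correlation(correlation, fields_a, fields_b, min_count=2):
    """Scale the curve correlation by the coverage of the frequent item sets."""
    shared = coverage(
        frequent_items(fields_a, min_count), frequent_items(fields_b, min_count)
    )
    return correlation * shared


fields_a = ["gym", "gym", "cafe", "cafe", "office"]
fields_b = ["gym", "gym", "cafe", "bar", "bar"]
adjusted = corrected_correlation(0.9, fields_a, fields_b)
```

Here the two objects share one frequent field out of three, so a curve correlation of 0.9 is scaled down to about 0.3.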
According to the embodiment of the application, the method further comprises the following steps:
judging whether the correlation of the behavior characteristic curve is in a preset range or not;
if the correlation of the behavior characteristic curve is within a preset range, pushing the characteristic resource of the corresponding analysis object to a target analysis object terminal as recommended content;
and if the correlation of the behavior characteristic curve is not within the preset range, judging whether the degree to which the correlation deviates from the preset range exceeds a preset value, and if so, pushing the characteristic resource of the corresponding analysis object to a target analysis object terminal as guide content.
It should be noted that a relationship whose correlation falls within the preset range is a strong relationship: it exerts a strong mutual attraction and is suited to uses such as similar-product recommendation and associated-trace queries. A relationship whose correlation falls outside the preset range and below a certain preset value is a weak relationship: information and resources do not flow easily between the two parties, so it is suited to offering the object new ideas and options and to broadening its established views.
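The push decision in the steps above can be sketched as a small function. The numeric range `(0.7, 1.0)` and the deviation limit `0.4` are placeholder values chosen for illustration; the application leaves both presets open.

```python
def push_decision(correlation, preset_range=(0.7, 1.0), deviation_limit=0.4):
    """Return 'recommend', 'guide' or 'none' for a given curve correlation."""
    low, high = preset_range
    if low <= correlation <= high:
        return "recommend"  # strong relationship: push as recommended content
    # Distance by which the correlation falls outside the preset range.
    deviation = low - correlation if correlation < low else correlation - high
    if deviation > deviation_limit:
        return "guide"      # weak relationship: push as guide content
    return "none"


decisions = [push_decision(c) for c in (0.85, 0.5, 0.2)]
```

With these placeholder values, a correlation of 0.85 leads to a recommendation, 0.2 leads to guide content, and 0.5 triggers no push.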
In another embodiment of the present application, after acquiring behavior data of an analysis object, before extracting feature information flowing in an analysis field, the method further includes:
acquiring a source address of the behavior data, and identifying, according to the communication protocol, whether the source address is of a public nature;
and if the source address is of a public nature, removing the behavior data.
It should be noted that when the behavior data comes from a public address, such as certain Internet of Things (IoT) cards, it does not reflect the behavior of people (non-population behavior data) and needs to be identified and eliminated.
In another embodiment of the present application, performing data processing on the feature field under each partition, and generating the coordinate information specifically includes:
if some of the necessary characteristic fields are missing from an event unit, taking an existing characteristic field as a node and retrieving the signaling data generated by the analysis object under that characteristic field;
screening the signaling data for feature information associated with the attribute of the missing characteristic field to generate a simulated characteristic field;
and arranging the simulated characteristic fields in a certain order to generate coordinate information.
It should be noted that if some characteristic fields are missing, for example because acquisition of the time characteristic failed, the signaling data corresponding to the spatial characteristic in the event unit may be retrieved and the time information extracted from it; alternatively, theoretical time information may be calculated from the spatial characteristic and the behavior characteristic of the event unit. Either way, a simulated characteristic field is obtained, and the simulated characteristic fields are arranged by time sequence, spatial distance or behavior preference.
A third aspect of the embodiments of the present application provides a computer-readable storage medium, where the computer-readable storage medium includes a strong relationship analysis program, and when the program is executed by a processor, the program implements the steps of the strong relationship analysis method.
The storage medium correspondingly carries out each step of the strong relationship analysis method; for details, refer to the description of the method steps of fig. 1, which is not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (6)

1. A strong relation analysis method is characterized by comprising the following steps: acquiring behavior data of an analysis object, wherein the behavior data comprises government affair data, consumption data and browsing data, and extracting characteristic information circulating in the analysis field;
arranging the characteristic information in partitions according to attributes, and performing data processing on the characteristic field under each partition to generate coordinate information;
drawing a behavior characteristic curve related to the characteristic field, and calculating the correlation of the behavior characteristic curves of different analysis objects;
judging the relationship strength of different analysis objects in the analysis field according to the correlation size;
the extracting of the characteristic information circulating in the analysis field specifically comprises:
defining an analysis field of the behavior data according to field attributes and semantic association in the behavior data, wherein the analysis field comprises office matters, life matters and entertainment matters;
extracting the temporal features, spatial features and behavioral features defined within the analysis field to which the behavior data belongs;
the step of performing data processing on the characteristic field under each partition to generate coordinate information specifically includes:
arranging the time fields in a sequential and interval manner, determining an interval distance according to the length of stay, and generating a time coordinate;
taking the resident place as a center, distributing and arranging the space fields, determining the interval distance according to the geographic position, and generating a space coordinate;
the step of performing data processing on the characteristic field under each partition to generate coordinate information specifically includes:
extracting keywords of the behavior field, and introducing different assignment operators according to the keywords;
assigning the space coordinate based on an assignment operator to generate a behavior coordinate;
the drawing of the behavior characteristic curve related to the characteristic field specifically comprises the following steps:
the coordinate information is connected in series by taking an event as a unit and taking the occurrence time as an order, and the behavior characteristic curve is a periodic fluctuation curve about time, space and behavior;
after the behavior data of the analysis object is obtained and before the feature information flowing in the analysis field is extracted, the method further comprises the following steps:
acquiring a source address of the behavior data, and identifying, according to the communication protocol, whether the source address is of a public nature;
and if the source address is of a public nature, removing the behavior data.
2. The strong relationship analysis method according to claim 1, wherein the calculating of the correlation of the behavior characteristic curves of different analysis objects specifically comprises:
partitioning the curve by taking an event as a unit, selecting a corresponding calculation model in the partition according to the behavior attribute, and calculating the correlation coefficient of two curve segments in the same event;
and carrying out weight analysis on the correlation coefficient of each curve segment to obtain a correlation result.
3. The strong relationship analysis method according to claim 1, wherein after calculating the correlation of the behavior characteristic curves of different analysis objects, before determining the relationship strength of the different analysis objects in the analysis field according to the magnitude of the correlation, the method further comprises:
extracting a frequent item set in the characteristic information, and comparing the coverage of the frequent item set of different analysis objects;
and correcting the correlation result of the behavior characteristic curve according to the size of the coverage.
4. The strong relationship analysis method according to claim 1, further comprising:
judging whether the correlation of the behavior characteristic curve is in a preset range or not;
if the correlation of the behavior characteristic curve is within a preset range, pushing the characteristic resource of the corresponding analysis object to a target analysis object terminal as recommended content;
and if the correlation of the behavior characteristic curve is not in the preset range, judging whether the degree of the correlation deviating from the preset range exceeds a preset value, and if so, pushing the characteristic resource corresponding to the analysis object to a target analysis object terminal as a guide content.
5. A strong relationship analysis system comprising a memory and a processor, the memory including a strong relationship analysis program, which when executed by the processor, performs the steps of the method of any one of claims 1 to 4.
6. A computer-readable storage medium, comprising a strong relationship analysis program which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
CN202111472424.4A 2021-12-06 2021-12-06 Strong relation analysis method, system and storage medium Active CN113901349B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111472424.4A CN113901349B (en) 2021-12-06 2021-12-06 Strong relation analysis method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111472424.4A CN113901349B (en) 2021-12-06 2021-12-06 Strong relation analysis method, system and storage medium

Publications (2)

Publication Number Publication Date
CN113901349A CN113901349A (en) 2022-01-07
CN113901349B true CN113901349B (en) 2022-03-29

Family

ID=79195322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111472424.4A Active CN113901349B (en) 2021-12-06 2021-12-06 Strong relation analysis method, system and storage medium

Country Status (1)

Country Link
CN (1) CN113901349B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874943A (en) * 2017-01-23 2017-06-20 腾讯科技(深圳)有限公司 Business object sorting technique and system
CN107403239B (en) * 2017-07-25 2021-02-12 南京工程学院 Parameter analysis method for control equipment in power system
US11769063B2 (en) * 2019-10-21 2023-09-26 International Business Machines Corporation Providing predictive analytics with predictions tailored for a specific domain
CN111612228A (en) * 2020-05-12 2020-09-01 国网河北省电力有限公司电力科学研究院 User electricity consumption behavior analysis method based on electricity consumption information

Also Published As

Publication number Publication date
CN113901349A (en) 2022-01-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant