CN111242147B - Method and device for identifying intimate contact person and frequent active area - Google Patents

Publication number
CN111242147B
Authority
CN
China
Prior art keywords
user
contact
short message
signaling data
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811434322.1A
Other languages
Chinese (zh)
Other versions
CN111242147A (en
Inventor
林宇俊
胡入祯
戴晶
邵妍
周宇飞
鲁银冰
许鑫伶
林甜甜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Hangzhou Information Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201811434322.1A priority Critical patent/CN111242147B/en
Publication of CN111242147A publication Critical patent/CN111242147A/en
Application granted granted Critical
Publication of CN111242147B publication Critical patent/CN111242147B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention discloses a method and a device for identifying intimate contact persons and frequent active areas, wherein the method comprises the steps of acquiring short message signaling data, call signaling data and position update signaling data of a user, extracting the short message behavior characteristics, the call behavior characteristics and the user position characteristics of the user according to the short message signaling data, the call signaling data and the position update signaling data, determining the intimate contact persons of the user according to the contact person classification, the short message behavior characteristics, the call behavior characteristics and preset scoring rules of the user, and determining the frequent active areas of the user according to the user position characteristics and the call behavior characteristics of the user. When the intimate contact person and the frequent active region are determined, the used features are derived from multiple dimensions, so that misjudgment of recognition can be effectively reduced, and recognition accuracy of a recognition model is improved.

Description

Method and device for identifying intimate contact person and frequent active area
Technical Field
The embodiment of the invention relates to the technical field of big data, in particular to a method and a device for identifying intimate contacts and frequent active areas.
Background
The main existing schemes for identifying intimate contacts are as follows. 1. Layered modeling: initial modeling is performed on a collected first model sample to form an initial affinity model; the variables of the initial affinity model are checked, invalid variables are removed from the first model sample, and new associated variables, or variables derived from the associated variables already in the first model sample, are added to form a second model sample; the second model sample is modeled again to form a rebuilt affinity model, which is evaluated and scored to obtain an affinity score, and the affinity between the user and a contact is judged according to that score. 2. Communication-record ranking: the communication records between a user and all contacts within a preset time period are acquired; a relation value between the user and each contact is obtained from the records; the contacts are sorted by relation value; and, in a first mode, the contacts are displayed in sorted order.
The layered scheme forms a first model sample and a second model sample in sequence, and its final output passes through three interdependent stages, so a large number of misjudgments in the first stage propagates to the whole subsequent output. The scheme that builds an intimate-contact model from the communication records of the user and all contacts within a preset time period has coarse model granularity, so its identification accuracy is difficult to guarantee.
Disclosure of Invention
The embodiment of the invention provides a method and a device for identifying intimate contact persons and frequent active areas, which are used for effectively reducing misjudgment of the intimate contact person identification and improving the model identification precision.
The method for identifying the intimate contact and the frequent active region provided by the embodiment of the invention comprises the following steps:
acquiring short message signaling data, call signaling data and position updating signaling data of a user;
extracting the short message behavior characteristics, the call behavior characteristics and the user position characteristics of the user according to the short message signaling data, the call signaling data and the position updating signaling data respectively;
determining intimate contacts of the user according to the contact classification, the short message behavior characteristics, the call behavior characteristics and the preset scoring rules of the user;
and determining a frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
Optionally, the determining the intimate contact of the user according to the contact classification, the short message behavior feature, the call behavior feature and the preset scoring rule of the user includes:
scoring the short message behavior characteristics and the call behavior characteristics of each contact according to the contact classification of the user and the preset scoring rule, to obtain affinity scores for the contact;
summarizing the affinity scores of each contact of the user and then standardizing them to obtain the affinity value of each contact of the user;
and, according to the affinity value of each contact and the distributions of call time periods and short message time periods between the user and each contact, determining as intimate contacts of the user those contacts whose affinity value is greater than a first threshold and whose call time periods and short message time periods fall within a first time period range.
Optionally, after the scores of the affinity of the contacts of the user are summarized, a normalization process is performed to obtain an affinity value of each contact of the user, including:
and after the scores of the contact affinities of the users are summarized, normalizing the scores of the contact affinities of the contacts to obtain the affinity value of each contact of the users.
Optionally, the determining the frequent active area of the user according to the user location feature and the call behavior feature of the user includes:
determining the residence of the user as a frequent active area of the user according to the call time in the call behavior characteristics and the cell location area code in the user location characteristics;
and counting, according to the user location characteristics, the accumulated time the user stays at each location within a second time period, and determining each location whose accumulated time is greater than a second threshold as a frequent active area of the user.
Optionally, the short message behavior feature includes one or any combination of the following features: the local number, the number of the opposite party, the city of the opposite party, the short message sending time, the event type and the short message length;
the call behavior feature comprises one or any combination of the following features: the local number, the city of the opposite party, the conversation starting time, the event type and the conversation duration;
the user location features include one or any combination of the following features: the local number, the area code of the cell, the type of the signaling event, the city of the cell, and the signaling start time.
Correspondingly, the embodiment of the invention also provides a device for identifying the intimate contact person and the frequent active area, which comprises:
the acquisition module is used for acquiring the short message signaling data, the call signaling data and the position updating signaling data of the user;
the processing module is used for extracting the short message behavior characteristics, the call behavior characteristics and the user position characteristics of the user according to the short message signaling data, the call signaling data and the position updating signaling data respectively;
the identification module is used for determining intimate contact persons of the user according to the contact person classification, the short message behavior characteristics, the call behavior characteristics and the preset scoring rule of the user; and determining a frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
Optionally, the identification module is specifically configured to:
scoring the short message behavior characteristics and the call behavior characteristics of each contact according to the contact classification of the user and the preset scoring rule, to obtain affinity scores for the contact;
summarizing the affinity scores of each contact of the user and then standardizing them to obtain the affinity value of each contact of the user;
and, according to the affinity value of each contact and the distributions of call time periods and short message time periods between the user and each contact, determining as intimate contacts of the user those contacts whose affinity value is greater than a first threshold and whose call time periods and short message time periods fall within a first time period range.
Optionally, the identification module is specifically configured to:
and after the scores of the contact affinities of the users are summarized, normalizing the scores of the contact affinities of the contacts to obtain the affinity value of each contact of the users.
Optionally, the identification module is specifically configured to:
determining the residence of the user as a frequent active area of the user according to the call time in the call behavior characteristics and the cell location area code in the user location characteristics;
and counting, according to the user location characteristics, the accumulated time the user stays at each location within a second time period, and determining each location whose accumulated time is greater than a second threshold as a frequent active area of the user.
Optionally, the short message behavior feature includes one or any combination of the following features: the local number, the number of the opposite party, the city of the opposite party, the short message sending time, the event type and the short message length;
the call behavior feature comprises one or any combination of the following features: the local number, the city of the opposite party, the conversation starting time, the event type and the conversation duration;
the user location features include one or any combination of the following features: the local number, the area code of the cell, the type of the signaling event, the city of the cell, and the signaling start time.
Accordingly, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the method for identifying the intimate contact and the frequent active area according to the obtained program.
Correspondingly, the embodiment of the invention also provides a computer-readable nonvolatile storage medium, which comprises computer-readable instructions, wherein when the computer reads and executes the computer-readable instructions, the computer is caused to execute the method for identifying the intimate contact and the frequent active area.
The embodiment of the invention shows that the short message signaling data, the call signaling data and the position updating signaling data of the user are obtained, the short message behavior characteristics, the call behavior characteristics and the user position characteristics of the user are extracted according to the short message signaling data, the call signaling data and the position updating signaling data, the intimate contact of the user is determined according to the contact person classification, the short message behavior characteristics, the call behavior characteristics and the preset scoring rule of the user, and the frequent active area of the user is determined according to the user position characteristics and the call behavior characteristics of the user. When the intimate contact person and the frequent active region are determined, the used features are derived from multiple dimensions, so that misjudgment of recognition can be effectively reduced, and recognition accuracy of a recognition model is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for identifying intimate contacts and frequent active areas according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a short message behavior feature provided in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a call behavior feature according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a user location feature according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a distribution of user affinity values according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a talk period distribution according to an embodiment of the present invention;
FIG. 8 is a schematic view of a residence model distance distribution according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a working model distance distribution according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a device for identifying intimate contacts and frequent active areas according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 illustrates a system architecture to which embodiments of the present invention are applicable, which may be a server 100, including a processor 110, a communication interface 120, and a memory 130.
The communication interface 120 is used for communicating with a terminal device, receiving and transmitting information transmitted by the terminal device, and realizing communication.
The processor 110 is a control center of the server 100, connects various parts of the entire server 100 using various interfaces and lines, and performs various functions of the server 100 and processes data by running or executing software programs and/or modules stored in the memory 130, and calling data stored in the memory 130. Optionally, the processor 110 may include one or more processing units.
The memory 130 may be used to store software programs and modules, and the processor 110 performs various functional applications and data processing by executing the software programs and modules stored in the memory 130. The memory 130 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, application programs required for at least one function, and the like; the storage data area may store data created according to business processes, etc. In addition, memory 130 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
It should be noted that the structure shown in fig. 1 is merely an example, and the embodiment of the present invention is not limited thereto.
Based on the foregoing description, fig. 2 illustrates a flow of a method for identifying intimate contacts and frequent active areas, which may be performed by an apparatus for identifying intimate contacts and frequent active areas, such as the server described above, according to an embodiment of the present invention.
As shown in fig. 2, the specific steps of the flow include:
Step 201, obtaining short message signaling data, call signaling data and location update signaling data of a user.
The original signaling data of the user may be acquired periodically, for example, may be acquired at monthly granularity, at weekly granularity, or at quarterly granularity.
Step 202, extracting the short message behavior feature, the call behavior feature and the user position feature of the user according to the short message signaling data, the call signaling data and the position update signaling data respectively.
After the signaling data is obtained, features can be extracted from it. For example, short message behavior features can be extracted from the short message signaling data at monthly granularity, and may include one or any combination of the following features, as shown in fig. 3: the local number, the number of the opposite party, the city of the opposite party, the short message sending time, the event type (marking whether the local number is the sender or the receiver), and the short message length. Call behavior features can be extracted from the call signaling data at monthly granularity, and may include one or any combination of the following features, as shown in fig. 4: the local number, the city of the opposite party, the call start time, the event type (marking whether the local number is the calling party or the called party), and the call duration. User location features can be extracted from the location update signaling data at monthly granularity, and may include one or any combination of the following features, as shown in fig. 5: the local number, the area code of the cell, the signaling event type, the city of the cell, and the signaling start time.
Step 203, determining intimate contacts of the user according to the contact classification, the short message behavior feature, the call behavior feature and the preset scoring rule of the user.
Specifically, the short message behavior characteristics and the call behavior characteristics of each contact are first scored according to the preset scoring rule to obtain affinity scores for the contact. The affinity scores of each contact of the user are then summarized and standardized to obtain the affinity value of each contact. Finally, according to the affinity value of each contact and the distributions of call time periods and short message time periods between the user and each contact, the contacts whose affinity value is greater than a first threshold and whose call time periods and short message time periods fall within a first time period range are determined as intimate contacts of the user. The first threshold and the first time period range may be set empirically.
The standardization of the summarized scores may be a normalization: after the affinity scores of each contact of the user are summarized, they are normalized to obtain the affinity value of each contact of the user.
The preset scoring rule may be set empirically. For example, the extracted short message features and call features are fused according to the contact classification to establish the contact intimacy model shown in table 1, and the scores converted according to the scoring rules in the table are summarized and standardized for ranking.
TABLE 1
Figure BDA0001883425040000071
Figure BDA0001883425040000081
The accumulated scores are then normalized using formula (1), which scales the affinity into the range [0,1]:
$$x' = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}} \qquad (1)$$
where x represents the Score in the table above, MinValue represents the minimum value of the Score, MaxValue represents the maximum value of the Score, and x' is the resulting affinity value.
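The normalization step can be sketched as follows. This is a minimal illustration that assumes formula (1) is ordinary min-max scaling, as the symbols x, MinValue and MaxValue and the target range [0,1] suggest. The function name, the sample contact names and the handling of the case where all scores are equal are illustrative assumptions, not part of the described method.

```python
def min_max_normalize(scores):
    """Scale raw affinity scores into [0, 1] with min-max normalization.

    `scores` maps a contact identifier to its summarized raw score.
    """
    min_value = min(scores.values())
    max_value = max(scores.values())
    if max_value == min_value:
        # All scores are equal, so the formula would divide by zero.
        # Returning 0.0 is an illustrative choice, not taken from the source.
        return {contact: 0.0 for contact in scores}
    span = max_value - min_value
    return {contact: (s - min_value) / span for contact, s in scores.items()}


# Hypothetical summarized scores for three contacts of one user.
raw = {"contact_a": 12.0, "contact_b": 30.0, "contact_c": 48.0}
normalized = min_max_normalize(raw)
```

With these sample values the lowest-scoring contact maps to 0, the highest to 1, and the middle one to 0.5.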
Fig. 6 shows an exemplary affinity distribution produced by the intimate contact identification model for a sample of 1948 users.
The basic communication statistics for 1948 users are as follows:
the number of calls is 27500, accounting for 27.50% of all communication records, with two call sub-counts reported: 11317 (11.32% of all communication records) and 12718 (12.72% of all communication records); the number of short messages is 72500, accounting for 72.50% of all communication records, of which 27189 are valid short messages, accounting for 27.19% of all communication records and 37.50% of the short messages.
By merging the cities of the users' communication contacts, 2916 affinity values are finally obtained between the 1948 users and domestic and international urban areas.
As shown in fig. 6, the distribution of the 2916 affinity values follows a regular statistical distribution, and it is reasonable to consider that a contact with an affinity value greater than or equal to 0.7 is far more likely to be an intimate contact than a contact with an affinity value less than 0.7.
The intimate contacts are then classified into five main categories: parents, spouses, boyfriends or girlfriends, ordinary friends, and working relations. The characteristics of their short message and call behavior are summarized in table 2.
TABLE 2
Figure BDA0001883425040000091
The call time period distribution and short message time period distribution of the 1948 users above are shown in fig. 7. The users' call and short message activity occurs mainly between 9:00 and 22:00.
In summary, a contact is selected as a final intimate contact when its affinity value is greater than or equal to 0.7 and its call and short message activity occurs mostly between 9:00 and 22:00, that is, when the ratio of the number of call and short message events occurring in hours 9 to 22 to the number occurring in hours 23 to 8 is greater than or equal to 7:1.
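The selection rule above can be sketched as follows. The thresholds 0.7 and 7:1 come from the text; everything else is an assumption made for illustration: the function and variable names, the choice to evaluate the day/night ratio per contact, and treating hours 9 through 22 inclusive as the daytime window.

```python
from datetime import datetime


def is_daytime(ts):
    """True if the event falls in hours 9..22 (assumed inclusive)."""
    return 9 <= ts.hour <= 22


def select_intimate_contacts(affinity, events, min_affinity=0.7, min_ratio=7.0):
    """Return contacts passing both the affinity and the time-period test.

    affinity: dict contact -> normalized affinity value in [0, 1]
    events:   dict contact -> list of datetimes of calls and short messages
    """
    selected = []
    for contact, value in affinity.items():
        if value < min_affinity:
            continue
        stamps = events.get(contact, [])
        day = sum(1 for ts in stamps if is_daytime(ts))
        night = len(stamps) - day
        # day / night >= min_ratio, written without division so that
        # night == 0 does not raise ZeroDivisionError.
        if day > 0 and day >= min_ratio * night:
            selected.append(contact)
    return selected


def at(hour):
    """Helper building a sample timestamp at the given hour (hypothetical data)."""
    return datetime(2018, 11, 1, hour, 0)


affinity = {"a": 0.8, "b": 0.9, "c": 0.5}
events = {
    "a": [at(10)] * 7 + [at(23)],  # 7 daytime events, 1 night event
    "b": [at(10)] * 3 + [at(2)],   # 3 daytime events, 1 night event
    "c": [at(12)] * 5,             # daytime only, but affinity too low
}
result = select_intimate_contacts(affinity, events)
```

In this sample only contact "a" is selected: "b" fails the 7:1 ratio and "c" fails the affinity threshold.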
Step 204, determining a frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
Specifically, the residence of the user is determined as a frequent active area of the user according to the call time in the call behavior characteristics and the cell location area code in the user location characteristics. In addition, according to the user location characteristics, the accumulated time the user stays at each location within the second time period is counted, and each location whose accumulated time is greater than a second threshold, that is, the user's workplace, is determined as a frequent active area of the user.
For example, the frequent active area identification model is subdivided by user activity area into two sub-models: a user residence discovery model and a user workplace discovery model. The following features are extracted from the location update signaling at monthly granularity: the local number, the laccell (Location Area Code Cell, the location area code of the cell) where the number is located, the signaling event type (location update), the city where the number is located, and the signaling start time. The user location features are fused with the user call features extracted in the preceding section to establish the frequent active area identification model.
1) User residence discovery algorithm
The residence of the user typically has the following features:
Feature 1: the residence of a user is usually the area where the user stays for a relatively stable period of time each day.
Feature 2: it is typically the area where the user spends the most time.
Feature 3: the user is typically at home from 12 o'clock at night to 6 o'clock the next morning.
The laccells in which the user made a call or had a location update between 0:00 and 6:00 during the most recent 30 days are ranked by number of occurrences, and the laccell with the most occurrences is taken as the user's residence.
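The residence discovery step can be sketched as a frequency count over night-time records. This is only an illustration: the record layout (timestamp, laccell), the function name and the sample identifiers are assumptions, and the window is taken as hours 0 through 5 so that 6:00 itself is excluded.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_residence(records, now, days=30):
    """Return the laccell seen most often between 0:00 and 6:00 in the window.

    records: iterable of (timestamp, laccell) pairs taken from call records
             and location updates of one user
    now:     end of the observation window
    """
    window_start = now - timedelta(days=days)
    counts = Counter(
        laccell
        for ts, laccell in records
        if window_start <= ts <= now and 0 <= ts.hour < 6
    )
    if not counts:
        return None  # no night-time records in the window
    return counts.most_common(1)[0][0]


now = datetime(2018, 11, 30, 12, 0)
records = [
    (datetime(2018, 11, 10, 1, 0), "lac1-cellA"),
    (datetime(2018, 11, 11, 2, 30), "lac1-cellA"),
    (datetime(2018, 11, 12, 5, 59), "lac1-cellA"),
    (datetime(2018, 11, 12, 3, 0), "lac2-cellB"),
    (datetime(2018, 11, 13, 12, 0), "lac3-cellC"),  # daytime, ignored
    (datetime(2018, 11, 14, 12, 0), "lac3-cellC"),  # daytime, ignored
    (datetime(2018, 9, 1, 1, 0), "lac2-cellB"),     # older than 30 days, ignored
    (datetime(2018, 9, 2, 1, 0), "lac2-cellB"),     # older than 30 days, ignored
    (datetime(2018, 9, 3, 1, 0), "lac2-cellB"),     # older than 30 days, ignored
]
residence = find_residence(records, now)
```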
2) User workplace discovery algorithm
Users are subdivided into four major categories according to their nature of work, as shown in table 3.
TABLE 3
Figure BDA0001883425040000101
The user workplace is identified by:
Users are divided into those who have a fixed workplace and those who do not; users with a fixed workplace have regular behavior tracks.
Some users work in the daytime and some at night; some work on weekdays and some on weekends. Assume that 90% of users work in the daytime and 10% work at night, and likewise that 90% of users work on weekdays and 10% work on weekends.
Assume that the daytime working hours are [9:00, 12:00] and [14:00, 17:00], and that the night working hours are [0:00, 6:00]. Formula (2) is used to calculate the likelihood that location A is the user's workplace:
Figure BDA0001883425040000111
where W represents the cumulative time the user appears at A in the daytime on weekdays, X represents the cumulative time the user appears at A at night on weekdays, Y represents the cumulative time the user appears at A in the daytime on weekends, Z represents the cumulative time the user appears at A at night on weekends, and F(A) is the likelihood score of location A. Each cumulative time is obtained as the time difference between the last signaling and the first signaling in the area within the specified time period.
During calculation, all locations where the user stays for less than 1 hour are deleted; the score of each remaining location is calculated according to formula (2), and the location with the largest score is output as the user's most probable workplace.
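The filtering and selection steps around formula (2) can be sketched as follows. Because formula (2) itself is not reproduced in the text, the scoring function is passed in as a parameter; the function names, the data layout and the sample scores are hypothetical. Only the dwell-time rule (last signaling minus first signaling) and the 1-hour cut-off come from the description.

```python
from datetime import datetime, timedelta


def dwell_time(timestamps):
    """Time spent in an area in one period: last signaling minus first signaling."""
    if not timestamps:
        return timedelta(0)
    return max(timestamps) - min(timestamps)


def most_likely_workplace(stays, score_fn, min_stay=timedelta(hours=1)):
    """Drop places with a stay shorter than `min_stay`, then pick the top score.

    stays:    dict place -> list of signaling timestamps in the period
    score_fn: callable place -> likelihood score (stand-in for formula (2))
    """
    candidates = [
        place for place, ts_list in stays.items()
        if dwell_time(ts_list) >= min_stay
    ]
    if not candidates:
        return None
    return max(candidates, key=score_fn)


day = datetime(2018, 11, 5)
stays = {
    "A": [day.replace(hour=9), day.replace(hour=12, minute=30)],   # 3.5 hours
    "B": [day.replace(hour=13), day.replace(hour=13, minute=20)],  # 20 minutes
    "C": [day.replace(hour=14), day.replace(hour=17)],             # 3 hours
}
# Hypothetical scores; "B" scores highest but is filtered out by the 1-hour rule.
scores = {"A": 0.6, "B": 0.95, "C": 0.4}
workplace = most_likely_workplace(stays, scores.get)
```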
For a better determination of the possibilities of the user's workplace, the determination can also be made in the following way:
first, the likelihood that any location L is the user's workplace is calculated by formula (3):
$$F(L) = 0.9\cdot\frac{0.9W + 0.1X}{W + X} + 0.1\cdot\frac{0.9Y + 0.1Z}{Y + Z} \qquad (3)$$
where F(L) is the likelihood score of location L, W represents the cumulative time the user appears at L in the daytime on weekdays, X represents the cumulative time the user appears at L at night on weekdays, Y represents the cumulative time the user appears at L in the daytime on weekends, and Z represents the cumulative time the user appears at L at night on weekends.
To simplify the calculation, let
$$m = \frac{W}{W + X}, \qquad n = \frac{Y}{Y + Z}$$
Then there is formula (4):
F(L)=0.72m+0.08n+0.1…………………………(4)
If a person has a workplace A and a residence B, formula (5) is obtained:
Figure BDA0001883425040000114
If the person works in the daytime on weekdays, W > X, so m_A = 1 and m_B = 0, which gives formula (6):
Figure BDA0001883425040000121
If the person works at night on weekdays, W < X, so m_A = 0 and m_B = 1, which gives formula (7):
Figure BDA0001883425040000122
If the person works in the daytime on holidays, Y > Z, so n_A = 1 and n_B = 0, which gives formula (8):
Figure BDA0001883425040000123
If the person works at night on holidays, Y < Z, so n_A = 0 and n_B = 1, which gives formula (9):
Figure BDA0001883425040000124
From the definitions of m and n, formula (10) follows:
Figure BDA0001883425040000125
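Formula (4) can be evaluated with a short sketch. The coefficients 0.72, 0.08 and 0.1 are taken from formula (4). The definitions m = W/(W+X) and n = Y/(Y+Z) are an assumption: they are one reading that agrees with the cases m_A = 1, m_B = 0 and n_A = 1, n_B = 0 discussed above, but the original definitions are given only as an image. The function name, the zero-denominator handling and the sample durations are also illustrative.

```python
def workplace_score(w, x, y, z):
    """Evaluate F(L) = 0.72*m + 0.08*n + 0.1 for one location.

    w, x: cumulative weekday daytime / weekday night hours at the location
    y, z: cumulative weekend daytime / weekend night hours at the location
    Assumed definitions (not confirmed by the source text):
        m = w / (w + x),  n = y / (y + z)
    """
    m = w / (w + x) if (w + x) > 0 else 0.0  # no weekday presence: assume m = 0
    n = y / (y + z) if (y + z) > 0 else 0.0  # no weekend presence: assume n = 0
    return 0.72 * m + 0.08 * n + 0.1


# Hypothetical user who works weekday daytimes at the office and sleeps at home.
office = workplace_score(w=40.0, x=0.0, y=0.0, z=0.0)
home = workplace_score(w=0.0, x=50.0, y=16.0, z=30.0)
```

Under these assumptions the office scores about 0.82 (m = 1, n = 0) and the home about 0.13, so the office is ranked as the more likely workplace.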
The above model is verified on 145 users whose residences and workplaces are known. Baidu Maps is used to calculate the distance between the lac-ci output by the model and the lac-ci of the real residence and workplace, giving the distributions shown in fig. 8 and fig. 9. As can be seen from fig. 8, the residence identified by the residence identification model lies within 3 km of the user's real residence with a probability of 83.7%; as can be seen from fig. 9, the workplace identified by the workplace identification model lies within 3 km of the user's real workplace with a probability of 50%.
The above embodiment shows that, short message signaling data, call signaling data and location update signaling data of a user are obtained, short message behavior features, call behavior features and user location features of the user are extracted according to the short message signaling data, the call signaling data and the location update signaling data, intimate contact of the user is determined according to contact person classification, the short message behavior features, the call behavior features and preset scoring rules of the user, and frequent active areas of the user are determined according to the user location features and the call behavior features of the user. When the intimate contact person and the frequent active region are determined, the used features are derived from multiple dimensions, so that misjudgment of recognition can be effectively reduced, and recognition accuracy of a recognition model is improved.
In order to avoid the defects existing in the prior art, the embodiment of the invention provides a scheme for identifying intimate contacts and frequent active areas based on user signaling data. The scheme extracts communication behavior characteristics (including short message behavior characteristics and call behavior characteristics) of a calling number from signaling data, then establishes an intimate contact identification model according to the communication behavior characteristics, and judges intimate contacts of a user by combining contact communication time periods; and extracting the active position characteristics of the user from the position update data, and establishing a frequent active region identification model according to the activity rule of the user by combining the call behavior characteristics of the user so as to judge the frequent active region of the user. Model features are derived from multiple dimensions, so that erroneous judgment is effectively reduced, and model recognition accuracy is improved.
Based on the same technical concept, fig. 10 illustrates an apparatus for identifying intimate contacts and frequent active areas provided by an embodiment of the present invention; the apparatus may perform the process of identifying intimate contacts and frequent active areas.
As shown in fig. 10, the apparatus may include:
an obtaining module 1001, configured to obtain short message signaling data, call signaling data, and location update signaling data of a user;
a processing module 1002, configured to extract short message behavior features, call behavior features, and user location features of the user according to the short message signaling data, the call signaling data, and the location update signaling data, respectively;
an identification module 1003, configured to determine intimate contacts of the user according to the user's contact classification, the short message behavior features, the call behavior features, and a preset scoring rule; and to determine a frequent active area of the user according to the user location features and the call behavior features of the user.
Optionally, the identification module 1003 is specifically configured to:
scoring the short message behavior features and the call behavior features of each contact according to the preset scoring rule and the user's contact classification, to obtain an affinity score for the contact;
summarizing the affinity scores of the user's contacts and then standardizing them, to obtain the affinity value of each contact of the user;
and determining, according to the affinity value of each contact and the distributions of call time periods and short message time periods between the user and each contact, a contact whose affinity value is greater than a first threshold and whose call time period and short message time period fall within a first time period range as an intimate contact of the user.
Optionally, the identification module 1003 is specifically configured to:
after the affinity scores of the user's contacts are summarized, normalizing the affinity score of each contact to obtain the affinity value of each contact of the user.
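The scoring, normalization and thresholding steps of the identification module can be sketched as follows. This is only an illustration under stated assumptions: the per-event scores are taken as already produced by some preset scoring rule (not specified here), normalization is assumed to be min-max scaling to [0, 1], and each contact's call and short message time periods are reduced to a single typical hour of day. The names `affinity_values`, `intimate_contacts` and `typical_hours` are hypothetical.

```python
from collections import defaultdict


def affinity_values(scored_events):
    """Sum the scores of each contact, then min-max normalise the totals.

    scored_events: iterable of (contact, score) pairs, where each score is
    assumed to come from the preset scoring rule.
    """
    totals = defaultdict(float)
    for contact, score in scored_events:
        totals[contact] += score
    if not totals:
        return {}
    lo, hi = min(totals.values()), max(totals.values())
    span = hi - lo
    # Design choice: if every contact has the same total, map all of them to 1.0.
    return {c: (t - lo) / span if span else 1.0 for c, t in totals.items()}


def intimate_contacts(affinity, typical_hours, threshold, period):
    """Contacts above the threshold whose typical hours fall inside the period.

    typical_hours: contact -> (call_hour, sms_hour), each an hour of day 0-23.
    period: (start_hour, end_hour), start inclusive and end exclusive.
    """
    start, end = period
    chosen = []
    for contact, value in affinity.items():
        call_hour, sms_hour = typical_hours[contact]
        in_period = start <= call_hour < end and start <= sms_hour < end
        if value > threshold and in_period:
            chosen.append(contact)
    return sorted(chosen)
```

The first threshold and the first time period range are left as parameters, since the text does not fix their values at this point.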
Optionally, the identification module 1003 is specifically configured to:
determining the residence of the user, as a frequent active area of the user, according to the call time in the call behavior features and the cell area code in the user location features;
and counting, according to the user location features, the accumulated duration of the user's stay at each location within a second period, and determining a location whose accumulated duration is greater than a second threshold as a frequent active area of the user.
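The accumulated-duration step can be sketched as follows. This is a minimal sketch under an assumption not stated in the text: the time spent at a cell is approximated as the interval between one location update event and the next, so the last event of the period contributes no duration. The function names and the `(unix_seconds, cell_id)` event layout are hypothetical.

```python
from collections import defaultdict


def dwell_seconds(events):
    """Total seconds attributed to each cell within the observed period.

    events: list of (unix_seconds, cell_id) location update records.
    Assumption: the user stays in a cell until the next record arrives.
    """
    totals = defaultdict(float)
    ordered = sorted(events)  # order by timestamp
    for (t0, cell), (t1, _next_cell) in zip(ordered, ordered[1:]):
        totals[cell] += t1 - t0
    return dict(totals)


def frequent_areas(events, min_seconds):
    """Cells whose accumulated duration exceeds the second threshold."""
    return sorted(c for c, s in dwell_seconds(events).items() if s > min_seconds)
```

In practice long gaps between records (for example when the phone is switched off) would probably need to be capped or discarded; that refinement is omitted here.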
Optionally, the short message behavior features include one or any combination of the following: the local number, the opposite party's number, the opposite party's city, the short message sending time, the event type and the short message length;
the call behavior features include one or any combination of the following: the local number, the opposite party's city, the call start time, the event type and the call duration;
the user location features include one or any combination of the following: the local number, the cell area code, the signaling event type, the city of the cell, and the signaling start time.
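For illustration, the three feature sets listed above could be held in simple record types such as the following. The class and field names are hypothetical, the field types are assumptions, and every field is optional because the text allows "one or any combination" of the listed features.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SmsFeature:
    """Short message behavior features (field names are illustrative)."""
    local_number: Optional[str] = None
    peer_number: Optional[str] = None
    peer_city: Optional[str] = None
    sent_at: Optional[int] = None      # assumed unix seconds
    event_type: Optional[str] = None
    length: Optional[int] = None       # assumed character count


@dataclass(frozen=True)
class CallFeature:
    """Call behavior features, mirroring the list given in the text."""
    local_number: Optional[str] = None
    peer_city: Optional[str] = None
    started_at: Optional[int] = None   # assumed unix seconds
    event_type: Optional[str] = None
    duration_s: Optional[int] = None   # assumed seconds


@dataclass(frozen=True)
class LocationFeature:
    """User location features taken from location update signaling."""
    local_number: Optional[str] = None
    cell_area_code: Optional[str] = None
    event_type: Optional[str] = None
    cell_city: Optional[str] = None
    started_at: Optional[int] = None   # assumed unix seconds
```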
Based on the same technical concept, the embodiment of the invention further provides a computing device, which comprises:
a memory for storing program instructions;
a processor, configured to call the program instructions stored in the memory and to execute, according to the obtained program, the method for identifying intimate contacts and frequent active areas.
Based on the same technical concept, the embodiment of the invention also provides a computer readable nonvolatile storage medium, which comprises computer readable instructions, wherein when the computer reads and executes the computer readable instructions, the computer is caused to execute the method for identifying the intimate contact and the frequent active area.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (12)

1. A method for identifying intimate contacts and frequent active areas, comprising:
acquiring short message signaling data, call signaling data and location update signaling data of a user;
extracting short message behavior features, call behavior features and user location features of the user according to the short message signaling data, the call signaling data and the location update signaling data, respectively;
determining intimate contacts of the user according to the user's contact classification, the short message behavior features, the call behavior features and a preset scoring rule;
determining a frequent active area of the user according to the user location features and the call behavior features of the user, wherein the frequent active area of the user comprises a user workplace, and the probability score of the user workplace satisfies the following form:
Figure QLYQS_1
wherein W represents the cumulative daytime duration of the user at location A, X represents the cumulative nighttime duration of the user at A, Y represents the cumulative weekend daytime duration of the user at A, Z represents the cumulative weekend nighttime duration of the user at A, and F(A) is the likelihood score of location A;
and calculating the score of each location and outputting the location with the largest score as the workplace of the user.
2. The method of claim 1, wherein the determining intimate contacts of the user according to the user's contact classification, the short message behavior features, the call behavior features and the preset scoring rule comprises:
scoring the short message behavior features and the call behavior features of each contact according to the preset scoring rule and the user's contact classification, to obtain an affinity score for the contact;
summarizing the affinity scores of the user's contacts and then standardizing them, to obtain the affinity value of each contact of the user;
and determining, according to the affinity value of each contact and the distributions of call time periods and short message time periods between the user and each contact, a contact whose affinity value is greater than a first threshold and whose call time period and short message time period fall within a first time period range as an intimate contact of the user.
3. The method of claim 2, wherein the summarizing the affinity scores of the user's contacts and then standardizing them to obtain the affinity value of each contact of the user comprises:
after the affinity scores of the user's contacts are summarized, normalizing the affinity score of each contact to obtain the affinity value of each contact of the user.
4. The method of claim 1, wherein the determining the frequent active area of the user according to the user location features and the call behavior features of the user comprises:
determining the residence of the user, as a frequent active area of the user, according to the call time in the call behavior features and the cell area code in the user location features;
and counting, according to the user location features, the accumulated duration of the user's stay at each location within a second period, and determining a location whose accumulated duration is greater than a second threshold as a frequent active area of the user.
5. The method according to any one of claims 1 to 4, wherein the short message behavior features comprise one or any combination of the following: the local number, the opposite party's number, the opposite party's city, the short message sending time, the event type and the short message length;
the call behavior features comprise one or any combination of the following: the local number, the opposite party's city, the call start time, the event type and the call duration;
the user location features comprise one or any combination of the following: the local number, the cell area code, the signaling event type, the city of the cell, and the signaling start time.
6. An apparatus for identifying intimate contacts and frequent active areas, comprising:
an acquisition module, configured to acquire short message signaling data, call signaling data and location update signaling data of a user;
a processing module, configured to extract short message behavior features, call behavior features and user location features of the user according to the short message signaling data, the call signaling data and the location update signaling data, respectively;
an identification module, configured to determine intimate contacts of the user according to the user's contact classification, the short message behavior features, the call behavior features and a preset scoring rule; and to determine a frequent active area of the user according to the user location features and the call behavior features of the user, wherein the frequent active area of the user comprises a user workplace, and the probability score of the user workplace satisfies the following form:
Figure QLYQS_2
wherein W represents the cumulative daytime duration of the user at location A, X represents the cumulative nighttime duration of the user at A, Y represents the cumulative weekend daytime duration of the user at A, Z represents the cumulative weekend nighttime duration of the user at A, and F(A) is the likelihood score of location A;
and calculating the score of each location and outputting the location with the largest score as the workplace of the user.
7. The apparatus of claim 6, wherein the identification module is specifically configured to:
scoring the short message behavior features and the call behavior features of each contact according to the preset scoring rule and the user's contact classification, to obtain an affinity score for the contact;
summarizing the affinity scores of the user's contacts and then standardizing them, to obtain the affinity value of each contact of the user;
and determining, according to the affinity value of each contact and the distributions of call time periods and short message time periods between the user and each contact, a contact whose affinity value is greater than a first threshold and whose call time period and short message time period fall within a first time period range as an intimate contact of the user.
8. The apparatus of claim 7, wherein the identification module is specifically configured to:
after the affinity scores of the user's contacts are summarized, normalizing the affinity score of each contact to obtain the affinity value of each contact of the user.
9. The apparatus of claim 6, wherein the identification module is specifically configured to:
determining the residence of the user, as a frequent active area of the user, according to the call time in the call behavior features and the cell area code in the user location features;
and counting, according to the user location features, the accumulated duration of the user's stay at each location within a second period, and determining a location whose accumulated duration is greater than a second threshold as a frequent active area of the user.
10. The apparatus according to any one of claims 6 to 9, wherein the short message behavior features comprise one or any combination of the following: the local number, the opposite party's number, the opposite party's city, the short message sending time, the event type and the short message length;
the call behavior features comprise one or any combination of the following: the local number, the opposite party's city, the call start time, the event type and the call duration;
the user location features comprise one or any combination of the following: the local number, the cell area code, the signaling event type, the city of the cell, and the signaling start time.
11. A computing device, comprising:
a memory for storing program instructions;
a processor for invoking program instructions stored in said memory to perform the method of any of claims 1-5 in accordance with the obtained program.
12. A computer readable non-transitory storage medium comprising computer readable instructions which, when read and executed by a computer, cause the computer to perform the method of any of claims 1 to 5.
CN201811434322.1A 2018-11-28 2018-11-28 Method and device for identifying intimate contact person and frequent active area Active CN111242147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811434322.1A CN111242147B (en) 2018-11-28 2018-11-28 Method and device for identifying intimate contact person and frequent active area

Publications (2)

Publication Number Publication Date
CN111242147A CN111242147A (en) 2020-06-05
CN111242147B CN111242147B (en) 2023-07-07

Family

ID=70874034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811434322.1A Active CN111242147B (en) 2018-11-28 2018-11-28 Method and device for identifying intimate contact person and frequent active area

Country Status (1)

Country Link
CN (1) CN111242147B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656686A (en) * 2021-07-26 2021-11-16 深圳市中元产教融合科技有限公司 Task report generation method based on birth teaching fusion and service system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BRPI0413441A (en) * 2003-08-08 2006-10-17 Networks In Motion Inc method and system for collecting, synchronizing, and reporting telecommunication call events and workflow information
CN106304015A (en) * 2015-05-28 2017-01-04 中兴通讯股份有限公司 The determination method and device of subscriber equipment
CN107203901A (en) * 2017-05-11 2017-09-26 中国联合网络通信集团有限公司 The method and device of product information is pushed to user
CN107547633A (en) * 2017-07-27 2018-01-05 腾讯科技(深圳)有限公司 Processing method, device and the storage medium of a kind of resident point of user
CN107871286A (en) * 2017-07-20 2018-04-03 上海前隆信息科技有限公司 User is with contacting human world cohesion decision method/system, storage medium and equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019125082A1 (en) * 2017-12-22 2019-06-27 Samsung Electronics Co., Ltd. Device and method for recommending contact information

Also Published As

Publication number Publication date
CN111242147A (en) 2020-06-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant