CN111242147A - Method and device for identifying close contact and frequent active area - Google Patents

Method and device for identifying close contact and frequent active area

Info

Publication number
CN111242147A
Authority
CN
China
Prior art keywords
user
contact
short message
call
behavior characteristics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811434322.1A
Other languages
Chinese (zh)
Other versions
CN111242147B (en)
Inventor
林宇俊
胡入祯
戴晶
邵妍
周宇飞
鲁银冰
许鑫伶
林甜甜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Hangzhou Information Technology Co Ltd
Priority to CN201811434322.1A
Publication of CN111242147A
Application granted
Publication of CN111242147B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a method and a device for identifying close contacts and frequent active areas. The method comprises: acquiring short message signaling data, call signaling data and position updating signaling data of a user; extracting short message behavior characteristics, call behavior characteristics and user position characteristics of the user from the short message signaling data, the call signaling data and the position updating signaling data, respectively; determining the close contacts of the user according to the contact classification of the user, the short message behavior characteristics, the call behavior characteristics and a preset scoring rule; and determining the frequent active areas of the user according to the user position characteristics and the call behavior characteristics of the user. Because the characteristics used to determine the close contacts and the frequent active areas come from multiple dimensions, misidentification can be effectively reduced and the identification precision of the identification model is improved.

Description

Method and device for identifying close contact and frequent active area
Technical Field
The embodiment of the invention relates to the technical field of big data, in particular to a method and a device for identifying close contacts and frequent active areas.
Background
The main contact identification schemes at present are as follows:
1. Layering the model: performing initial modeling on collected first model samples to form an initial intimacy model; performing variable inspection on the initial intimacy model, removing invalid variables from the first model samples, and adding new connection variables, or derivative variables derived from the connection variables, to form second model samples; modeling the second model samples again to form a reconstructed intimacy model; and evaluating and scoring the reconstructed intimacy model to obtain intimacy scores, from which the intimacy between the user and each contact is determined.
2. Acquiring the communication records between a user and all contacts within a preset time period; obtaining a relation value between the user and each contact from the communication records; sorting the contacts by relation value; and, in a first mode, displaying the contacts in the sorted order.
The first scheme layers the close-person model, forming the first and second model samples in sequence. The final model output involves three stages that depend on one another, so a large number of misjudgments in the first stage propagates to the entire subsequent output. The second scheme builds the close-person model only from the communication records between the user and all contacts within a preset time period; the model granularity is coarse, and the identification precision is difficult to guarantee.
Disclosure of Invention
The embodiment of the invention provides a method and a device for identifying close contact persons and frequent active areas, which are used for effectively reducing misjudgment of identification of the close contact persons and improving model identification precision.
The embodiment of the invention provides a method for identifying close contacts and frequent active areas, which comprises the following steps:
acquiring short message signaling data, call signaling data and position updating signaling data of a user;
extracting short message behavior characteristics, call behavior characteristics and user position characteristics of the user according to the short message signaling data, the call signaling data and the position updating signaling data respectively;
determining the close contact of the user according to the contact classification of the user, the short message behavior characteristics, the call behavior characteristics and a preset scoring rule;
and determining the frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
Optionally, the determining the close contact of the user according to the contact classification of the user, the short message behavior characteristic, the call behavior characteristic, and a preset scoring rule includes:
scoring, according to the contact classification of the user, the short message behavior characteristics and the call behavior characteristics of each contact according to a preset scoring rule to obtain contact intimacy scores;
aggregating the contact intimacy scores of the user and then performing standardization processing to obtain the intimacy value of each contact of the user;
and determining, according to the intimacy value of each contact and the call time period distribution and short message time period distribution between the user and each contact, the contacts whose intimacy values are larger than a first threshold and whose call time periods and short message time periods fall within a first time period range as the close contacts of the user.
Optionally, aggregating the contact intimacy scores of the user and then performing standardization processing to obtain the intimacy value of each contact of the user includes:
after the contact intimacy scores of the user are aggregated, normalizing the contact intimacy score of each contact to obtain the intimacy value of each contact of the user.
Optionally, the determining a frequent active area of the user according to the user location characteristic and the call behavior characteristic of the user includes:
determining, according to the call time in the call behavior characteristics and the area code of the cell in the user position characteristics, the residence of the user as a frequent active area of the user;
and counting, according to the user position characteristics, the cumulative duration for which the user stays at each location during a second time period, and determining each location whose cumulative duration is greater than a second threshold as a frequent active area of the user.
Optionally, the short message behavior characteristics include one or any combination of the following characteristics: local number, peer number, city of the peer, short message sending time, event type and short message length;
the call behavior characteristics include one or any combination of the following characteristics: local number, city of the peer, call start time, event type and call duration;
the user position characteristics include one or any combination of the following characteristics: local number, area code of the cell, type of signaling event, city, and signaling start time.
Correspondingly, the embodiment of the invention also provides a device for identifying close contacts and frequent active areas, which comprises:
the acquisition module is used for acquiring short message signaling data, call signaling data and position updating signaling data of a user;
the processing module is used for extracting the short message behavior characteristics, the call behavior characteristics and the user position characteristics of the user according to the short message signaling data, the call signaling data and the position updating signaling data respectively;
the identification module is used for determining the close contact of the user according to the contact classification of the user, the short message behavior characteristics, the call behavior characteristics and a preset scoring rule; and determining the frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
Optionally, the identification module is specifically configured to:
scoring, according to the contact classification of the user, the short message behavior characteristics and the call behavior characteristics of each contact according to a preset scoring rule to obtain contact intimacy scores;
aggregating the contact intimacy scores of the user and then performing standardization processing to obtain the intimacy value of each contact of the user;
and determining, according to the intimacy value of each contact and the call time period distribution and short message time period distribution between the user and each contact, the contacts whose intimacy values are larger than a first threshold and whose call time periods and short message time periods fall within a first time period range as the close contacts of the user.
Optionally, the identification module is specifically configured to:
after the contact intimacy scores of the user are aggregated, normalizing the contact intimacy score of each contact to obtain the intimacy value of each contact of the user.
Optionally, the identification module is specifically configured to:
determining, according to the call time in the call behavior characteristics and the area code of the cell in the user position characteristics, the residence of the user as a frequent active area of the user;
and counting, according to the user position characteristics, the cumulative duration for which the user stays at each location during a second time period, and determining each location whose cumulative duration is greater than a second threshold as a frequent active area of the user.
Optionally, the short message behavior characteristics include one or any combination of the following characteristics: local number, peer number, city of the peer, short message sending time, event type and short message length;
the call behavior characteristics include one or any combination of the following characteristics: local number, city of the peer, call start time, event type and call duration;
the user position characteristics include one or any combination of the following characteristics: local number, area code of the cell, type of signaling event, city, and signaling start time.
Correspondingly, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the method for identifying the close contact and the frequent active area according to the obtained program.
Accordingly, embodiments of the present invention also provide a computer-readable non-volatile storage medium, which includes computer-readable instructions, and when the computer reads and executes the computer-readable instructions, the computer is caused to execute the method for identifying the close contact and the frequent active area.
The embodiment of the invention shows that short message signaling data, call signaling data and position updating signaling data of a user are acquired; short message behavior characteristics, call behavior characteristics and user position characteristics of the user are extracted from the short message signaling data, the call signaling data and the position updating signaling data, respectively; the close contacts of the user are determined according to the contact classification of the user, the short message behavior characteristics, the call behavior characteristics and a preset scoring rule; and the frequent active area of the user is determined according to the user position characteristics and the call behavior characteristics of the user. Because the characteristics used to determine the close contacts and the frequent active areas come from multiple dimensions, misidentification can be effectively reduced and the identification precision of the identification model is improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a method for identifying close contacts and frequent active areas according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of short message behavior characteristics according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of call behavior characteristics according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of user position characteristics according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a distribution of user intimacy values according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a distribution of call time periods according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of the distance distribution of a residence model according to an embodiment of the present invention;
Fig. 9 is a schematic diagram of the distance distribution of a workplace model according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a device for identifying close contacts and frequent active areas according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 illustrates an exemplary system architecture to which embodiments of the present invention are applicable. The system architecture may be a server 100 including a processor 110, a communication interface 120, and a memory 130.
The communication interface 120 is used for communicating with a terminal device, sending and receiving the information transmitted by the terminal device.
The processor 110 is the control center of the server 100. It connects the various parts of the entire server 100 using various interfaces and lines, and performs the various functions of the server 100 and processes data by running or executing software programs and/or modules stored in the memory 130 and calling data stored in the memory 130. Optionally, the processor 110 may include one or more processing units.
The memory 130 may be used to store software programs and modules, and the processor 110 executes various functional applications and data processing by operating the software programs and modules stored in the memory 130. The memory 130 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to a business process, and the like. Further, the memory 130 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
It should be noted that the structure shown in fig. 1 is only an example, and the embodiment of the present invention is not limited thereto.
Based on the above description, fig. 2 exemplarily illustrates a flow of a method for close contact and frequent active area identification provided by an embodiment of the present invention, where the flow may be performed by an apparatus for close contact and frequent active area identification, such as the above server.
As shown in fig. 2, the specific steps of the process include:
Step 201, obtaining short message signaling data, call signaling data and location update signaling data of a user.
The original signaling data of the user may be obtained periodically, for example, at monthly granularity, at weekly granularity, or at quarterly granularity.
Step 202, extracting the short message behavior feature, the call behavior feature and the user position feature of the user according to the short message signaling data, the call signaling data and the position updating signaling data respectively.
After the signaling data is obtained, features may be extracted from it. For example, short message behavior characteristics may be extracted from the short message signaling data at monthly granularity, and may include one or any combination of the following: the local number, the peer number, the city of the peer, the short message sending time, the event type (marking whether the local number is the sender or the receiver), and the short message length, as shown in fig. 3. Call behavior characteristics may be extracted from the call signaling data at monthly granularity, and may include one or any combination of the following: the local number, the peer number, the city of the peer, the call start time, the event type (marking whether the local number is the calling party or the called party), and the call duration, as shown in fig. 4. User position characteristics may be extracted from the location update signaling data at monthly granularity, and may include one or any combination of the following: the local number, the area code of the cell, the type of signaling event, the city, and the signaling start time, as shown in fig. 5.
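The three feature sets described above can be pictured as simple record types. The following is a minimal sketch only: the class names, field names, the string encoding of the event type, and the `group_by_month` helper are all hypothetical, since the embodiment does not specify a record layout.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SmsFeature:
    local_number: str
    peer_number: str
    peer_city: str
    sent_at: datetime
    event_type: str  # hypothetical encoding: "send" or "receive"
    length: int


@dataclass(frozen=True)
class CallFeature:
    local_number: str
    peer_number: str
    peer_city: str
    started_at: datetime
    event_type: str  # hypothetical encoding: "caller" or "callee"
    duration_s: int


@dataclass(frozen=True)
class LocationFeature:
    local_number: str
    lac_cell: str
    event_type: str  # e.g. "location_update"
    city: str
    started_at: datetime


def group_by_month(records, time_attr):
    """Bucket records by (year, month) of the given timestamp attribute,
    mirroring extraction at monthly granularity."""
    buckets = defaultdict(list)
    for record in records:
        ts = getattr(record, time_attr)
        buckets[(ts.year, ts.month)].append(record)
    return dict(buckets)
```

Calling `group_by_month(calls, "started_at")` on a list of `CallFeature` records would return one list per calendar month.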
Step 203, determining the close contact of the user according to the contact classification of the user, the short message behavior characteristics, the call behavior characteristics and a preset scoring rule.
Specifically, first, according to the contact classification of the user, the short message behavior characteristics and the call behavior characteristics of each contact are scored according to a preset scoring rule to obtain contact intimacy scores. Then, the contact intimacy scores of the user are aggregated and standardized to obtain the intimacy value of each contact of the user. Finally, according to the intimacy value of each contact and the call time period distribution and short message time period distribution between the user and each contact, the contacts whose intimacy values are larger than a first threshold and whose call time periods and short message time periods fall within a first time period range are determined as the close contacts of the user. The first threshold and the first time period range may be set empirically.
The standardization of the aggregated scores may be as follows: after the contact intimacy scores of the user are aggregated, the contact intimacy score of each contact is normalized to obtain the intimacy value of each contact of the user.
The preset scoring rule may be set empirically. For example, according to the contact classification, the extracted short message characteristics and call characteristics are fused to establish the contact intimacy model shown in Table 1; the scores converted according to the scoring rule in the table are aggregated, standardized, and then ranked.
TABLE 1
Figure BDA0001883425040000071
Figure BDA0001883425040000081
The cumulative score values are then normalized using equation (1) to scale the intimacy to the [0,1] range, i.e., the score values are normalized.
$$x' = \frac{x - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}} \tag{1}$$
Where x represents the Score above, MinValue represents the minimum value of Score, and MaxValue represents the maximum value of Score.
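The aggregation and normalization steps can be sketched as follows. This assumes equation (1) is the usual min-max form, (x - MinValue) / (MaxValue - MinValue), which the [0,1] range and the MinValue and MaxValue definitions suggest. The per-event scores would come from the scoring rule of Table 1, which is not reproduced here, so the function names and inputs below are hypothetical.

```python
from collections import defaultdict


def aggregate_scores(scored_events):
    """Sum the per-event scores of each contact.

    scored_events: iterable of (contact, score) pairs, where each score
    is assumed to have been produced by the preset scoring rule.
    """
    totals = defaultdict(float)
    for contact, score in scored_events:
        totals[contact] += score
    return dict(totals)


def min_max_normalize(totals):
    """Scale aggregated scores to [0, 1] with min-max normalization."""
    if not totals:
        return {}
    lo, hi = min(totals.values()), max(totals.values())
    if hi == lo:
        # Assumption: when all scores are equal, map every contact to 0.
        return {contact: 0.0 for contact in totals}
    return {contact: (s - lo) / (hi - lo) for contact, s in totals.items()}
```

For example, aggregated scores of 5, 1 and 3 normalize to 1.0, 0.0 and 0.5.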
Fig. 6 illustrates the intimacy value distribution produced by the close contact identification model for a sample of 1948 users.
The basic communication statistics of the 1948 users are as follows:
there are 27500 calls, accounting for 27.50% of the total number of communication records, of which 11317 are outgoing calls (11.32% of the total) and 12718 are incoming calls (12.72% of the total); there are 72500 short messages, accounting for 72.50% of the total, of which 27189 are effective short messages, accounting for 27.19% of the total and 37.50% of the short messages.
By combining the cities of the users' communication contacts, 2916 intimacy values between the 1948 users and domestic and international city areas are finally obtained.
As shown in fig. 6, the 2916 intimacy values follow a statistical distribution, and it is reasonable to consider that a contact with an intimacy value greater than or equal to 0.7 is far more likely to be a close contact than one with an intimacy value below 0.7.
Then, the close contacts are classified into five categories, namely parents, spouses, boyfriends or girlfriends, ordinary friends and work relations, and the characteristics of the close contacts in short message and call behaviors are summarized as shown in Table 2.
TABLE 2
Figure BDA0001883425040000091
The call time period distribution and short message time period distribution of the above 1948 users are observed, as shown in fig. 7. It can be seen that the users' call and short message behaviors mainly occur between 9:00 and 22:00.
In summary, a contact whose intimacy value is greater than or equal to 0.7 and whose call and short message behaviors mostly occur between 9:00 and 22:00 (the ratio of the number of call and short message events occurring between 9:00 and 22:00 to the number occurring between 23:00 and 8:00 is greater than or equal to 7:1) is selected as a final close contact.
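The selection rule above can be written as a small predicate. This is a sketch under stated assumptions: events are represented only by their hour of day (0 to 23), hours 9 through 22 count as daytime and all other hours as night, and a contact with no night events at all is accepted as long as there is at least one daytime event. The function name and parameters are hypothetical.

```python
def is_close_contact(intimacy, event_hours,
                     intimacy_threshold=0.7, min_day_night_ratio=7.0):
    """Return True if the contact passes both the intimacy threshold
    and the day-to-night ratio test on call and short message events.

    event_hours: hour of day (0-23) of each call or short message event.
    """
    if intimacy < intimacy_threshold:
        return False
    day = sum(1 for h in event_hours if 9 <= h <= 22)
    night = len(event_hours) - day
    if night == 0:
        # Assumption: with no night events the ratio is treated as passing,
        # provided there is at least one daytime event.
        return day > 0
    return day / night >= min_day_night_ratio
```

With seven events at 10:00 and one at 23:00, the ratio is exactly 7:1, so a contact with an intimacy value of 0.8 is accepted.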
Step 204, determining a frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
Specifically, the residence of the user is determined as a frequent active area of the user according to the call time in the call behavior characteristics and the area code of the cell in the user position characteristics. In addition, according to the user position characteristics, the cumulative duration for which the user stays at each location during a second time period is counted, and each location whose cumulative duration is greater than a second threshold, i.e. the user's workplace, is determined as a frequent active area of the user.
For example, the frequent active area identification model is subdivided into two sub-models according to the type of user activity area: a user residence discovery model and a user workplace discovery model. The following features are extracted from the location update signaling at monthly granularity: the local number, the location area code and cell (lac-cell) where the number is located, the type of signaling event (location update), the city where the number is located, and the signaling start time. The user position characteristics are fused with the user call characteristics extracted in the previous section to establish the frequent active area identification model.
1) User residence discovery algorithm
The user's residence typically has the following features:
Feature 1: the home address of a user is usually the area where the user stays for a relatively stable period of time each day.
Feature 2: it is typically the area where the user spends the most time.
Feature 3: the period during which the user is at home is typically from 12 o'clock at night to 6 o'clock the next morning.
Accordingly, the lac-cells in which the user had a call or a location update between 0:00 and 6:00 during the last 30 days are ranked, and the lac-cell that occurs most frequently is taken as the residence of the user.
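The residence rule above amounts to a frequency count over night-time events. A minimal sketch, assuming events are available as (lac_cell, timestamp) pairs and that the 0:00 to 6:00 window includes hour 0 and excludes hour 6; the function name and input shape are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_residence(events, now, days=30):
    """Return the lac-cell seen most often between 0:00 and 6:00 in the
    last `days` days, or None if there are no such events.

    events: iterable of (lac_cell, timestamp) pairs taken from calls or
    location updates.
    """
    window_start = now - timedelta(days=days)
    counts = Counter(
        cell
        for cell, ts in events
        if window_start <= ts <= now and 0 <= ts.hour < 6
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Daytime events and events older than the window are ignored, so a cell that dominates working hours does not affect the result.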
2) User workplace discovery algorithm
Users are subdivided into four broad categories according to the nature of their work, as shown in Table 3.
TABLE 3
Figure BDA0001883425040000101
The user's workplace is identified as follows:
Users are divided into those with a fixed workplace and those without; users with a fixed workplace have regular behavior trajectories.
Some users work in the daytime and some at night; some work on weekdays and some on weekends. Suppose that 90% of users work during the day and 10% at night, and likewise that 90% of users work on weekdays and 10% on weekends.
Suppose the daytime working hours are [9:00, 12:00] and [14:00, 17:00], and the evening working hours are [0:00, 6:00]; the likelihood that the user's workplace is located at A is calculated using formula (2):
Figure BDA0001883425040000111
where W represents the cumulative duration for which the user is present at A during weekday daytime hours, X the cumulative duration at A during weekday evenings, Y the cumulative duration at A during weekend daytime hours, and Z the cumulative duration at A during weekend evenings; f(A) is the likelihood score of location A. Each cumulative duration is the time of the user's last signaling in the area minus the time of the first signaling in the area within the specified time period.
During calculation, all points at which the user stays for less than 1 hour are deleted, the score of each remaining point is calculated according to formula (2), and the point with the maximum score is output as the most likely workplace of the user.
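The duration bookkeeping described above (last signaling time minus first signaling time per area, then discarding stays under 1 hour) can be sketched as follows. The sketch covers a single time period only and does not compute the score of formula (2); the function names and the (lac_cell, timestamp) input shape are hypothetical.

```python
from datetime import datetime, timedelta


def dwell_durations(events):
    """Return {lac_cell: last_timestamp - first_timestamp} for the events
    of one time period.

    events: iterable of (lac_cell, timestamp) signaling events.
    """
    first, last = {}, {}
    for cell, ts in events:
        if cell not in first or ts < first[cell]:
            first[cell] = ts
        if cell not in last or ts > last[cell]:
            last[cell] = ts
    return {cell: last[cell] - first[cell] for cell in first}


def drop_short_stays(durations, min_stay=timedelta(hours=1)):
    """Keep only locations where the user stayed at least `min_stay`."""
    return {cell: d for cell, d in durations.items() if d >= min_stay}
```

The surviving locations would then be scored, and the one with the highest score reported as the most likely workplace.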
In order to better determine the possibility of the user's workplace, the determination can also be made by:
first, the probability that the user is located at an arbitrary location L is calculated by formula (3):
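$$F(L) = 0.9\left(0.9\,\frac{W}{W+X} + 0.1\,\frac{X}{W+X}\right) + 0.1\left(0.9\,\frac{Y}{Y+Z} + 0.1\,\frac{Z}{Y+Z}\right) \tag{3}$$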
where F(L) is the likelihood score of location L, W represents the cumulative duration for which the user is present at L during weekday daytime hours, X the cumulative duration at L during weekday evenings, Y the cumulative duration at L during weekend daytime hours, and Z the cumulative duration at L during weekend evenings.
To simplify the calculation, let
$$m = \frac{W}{W+X}, \qquad n = \frac{Y}{Y+Z}$$
Then formula (4) is obtained:
$$F(L) = 0.72m + 0.08n + 0.1 \tag{4}$$
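Formula (4) can be checked numerically against the 90%/10% assumptions stated earlier. The sketch below assumes m = W / (W + X) and n = Y / (Y + Z), which is consistent with the limits described in the text (m tends to 1 when daytime presence dominates) but is not confirmed by the source, and it treats a location with no weekday or no weekend presence as m = 0 or n = 0. The function names are hypothetical.

```python
def workplace_score(w, x, y, z, p_day=0.9, p_weekday=0.9):
    """Likelihood score of a location from cumulative durations.

    w, x: weekday daytime / weekday evening durations (any common unit)
    y, z: weekend daytime / weekend evening durations
    """
    # Assumed definitions of m and n; zero totals fall back to 0.
    m = w / (w + x) if (w + x) > 0 else 0.0
    n = y / (y + z) if (y + z) > 0 else 0.0
    weekday_term = p_day * m + (1 - p_day) * (1 - m)
    weekend_term = p_day * n + (1 - p_day) * (1 - n)
    return p_weekday * weekday_term + (1 - p_weekday) * weekend_term


def most_likely_workplace(durations_by_location):
    """Pick the location with the highest score.

    durations_by_location: {location: (w, x, y, z)}
    """
    return max(
        durations_by_location,
        key=lambda loc: workplace_score(*durations_by_location[loc]),
    )
```

With the default weights the expression expands to 0.72m + 0.08n + 0.1, so a location visited only on weekday daytimes and weekend evenings (m = 1, n = 0) scores 0.82.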
If a person's workplace is A and residence is B, formula (5) follows:
Figure BDA0001883425040000114
If the person works in the daytime on weekdays, then W > X, so m_A → 1 and, similarly, m_B → 0, which gives formula (6):
Figure BDA0001883425040000121
If the person works at night on weekdays, then W < X, so m_A → 0 and, similarly, m_B → 1, which gives formula (7):
Figure BDA0001883425040000122
If the person works in the daytime on holidays, then Y > Z, so n_A → 1 and, similarly, n_B → 0, which gives formula (8):
Figure BDA0001883425040000123
If the person works at night on holidays, then Y < Z, so n_A → 0 and, similarly, n_B → 1, which gives formula (9):
Figure BDA0001883425040000124
From the definitions of m and n, formula (10) follows:
Figure BDA0001883425040000125
The above model was validated on 145 users with known residences and workplaces. The distance between the LAC-CI output by the model and the LAC-CI of each user's actual residence or workplace was calculated using Baidu Maps, yielding the distributions shown in fig. 8 and fig. 9. As can be seen from fig. 8, the residence recognition model identifies a location within 3 km of the user's actual residence in 83.7% of cases; as can be seen from fig. 9, the workplace recognition model identifies a location within 3 km of the user's actual workplace in 50% of cases.
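A validation of this kind can be sketched as below. The source computes distances with Baidu Maps; this sketch substitutes the great-circle (haversine) distance, which is an assumption, and it also assumes that each LAC-CI has already been resolved to a latitude/longitude pair. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points, used here in place of a map service."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def share_within(pairs: Iterable[Tuple[LatLon, LatLon]], radius_km: float = 3.0) -> float:
    """Fraction of (predicted, actual) pairs whose distance is at most radius_km."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (predicted, actual) pair is required")
    hits = sum(1 for predicted, actual in pairs if haversine_km(predicted, actual) <= radius_km)
    return hits / len(pairs)
```

Applied to the 145 validation users, `share_within` would correspond to the 3 km hit rates read from fig. 8 and fig. 9, subject to the difference between great-circle and map-service distances.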
As the above embodiment shows, short message signaling data, call signaling data, and location update signaling data of the user are obtained; short message behavior characteristics, call behavior characteristics, and user location characteristics are extracted from these three data sources respectively; the user's close contacts are determined according to the user's contact classification, the short message behavior characteristics, the call behavior characteristics, and a preset scoring rule; and the user's frequent active area is determined according to the user location characteristics and the call behavior characteristics. Because the characteristics used to determine close contacts and frequent active areas come from multiple dimensions, misidentification is effectively reduced and the accuracy of the recognition models is improved.
To avoid the defects of the prior art, the embodiment of the invention provides a scheme for identifying close contacts and frequent active areas based on user signaling data. The scheme extracts the communication behavior characteristics (short message behavior characteristics and call behavior characteristics) of the calling number from the signaling data, builds a close contact recognition model from them, and determines the user's close contacts in combination with the time periods in which the user communicates with each contact. It also extracts user activity location characteristics from the location update data and, in combination with the user's call behavior characteristics, builds a frequent active area recognition model based on the user's activity patterns to determine the user's frequent active area. Because the model features come from multiple dimensions, misidentification is effectively reduced and recognition accuracy is improved.
Based on the same technical concept, fig. 10 exemplarily illustrates an apparatus for close contact and frequent active area identification according to an embodiment of the present invention, which may perform a process of close contact and frequent active area identification.
As shown in fig. 10, the apparatus may include:
an obtaining module 1001, configured to obtain short message signaling data, call signaling data, and location update signaling data of a user;
a processing module 1002, configured to extract a short message behavior feature, a call behavior feature, and a user location feature of the user according to the short message signaling data, the call signaling data, and the location update signaling data, respectively;
the identification module 1003 is configured to determine an intimate contact of the user according to the contact classification of the user, the short message behavior characteristic, the call behavior characteristic, and a preset scoring rule; and determining the frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
Optionally, the identifying module 1003 is specifically configured to:
according to the contact classification of the user, scoring the short message behavior characteristics and the call behavior characteristics of each contact according to a preset scoring rule to obtain contact intimacy scores;
aggregating the contact intimacy scores of the user and then standardizing them to obtain an intimacy value for each contact of the user;
and, according to the intimacy value of each contact and the distributions of call time periods and short message time periods between the user and that contact, determining as close contacts of the user those contacts whose intimacy value is greater than a first threshold and whose call time periods and short message time periods fall within a first time period range.
Optionally, the identifying module 1003 is specifically configured to:
after the scores of the contact intimacy of the user are aggregated, the scores of the contact intimacy of each contact are normalized to obtain the intimacy value of each contact of the user.
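The aggregate, normalize, and threshold steps of the identification module can be sketched as follows. Several points are assumptions made for illustration only: per-event scores are taken as already produced by the preset scoring rule, min-max scaling is used as the normalization, time periods are represented as hour-of-day integers, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def intimacy_values(scored_events: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum per-event scores by contact, then min-max normalize to [0, 1]."""
    totals: Dict[str, float] = defaultdict(float)
    for contact, score in scored_events:
        totals[contact] += score
    if not totals:
        return {}
    lo, hi = min(totals.values()), max(totals.values())
    if hi == lo:  # all contacts tie; avoid division by zero
        return {contact: 1.0 for contact in totals}
    return {contact: (total - lo) / (hi - lo) for contact, total in totals.items()}


def close_contacts(
    values: Dict[str, float],
    contact_hours: Dict[str, Set[int]],
    threshold: float,
    allowed_hours: Set[int],
) -> List[str]:
    """Keep contacts above the threshold whose observed hours all fall in the allowed range."""
    selected = []
    for contact, value in values.items():
        hours = contact_hours.get(contact, set())
        if value > threshold and hours and hours <= allowed_hours:
            selected.append(contact)
    return sorted(selected)
```

Requiring every observed hour to fall inside the allowed range is one strict reading of "within a first time period range"; a looser variant could require only that most communication falls inside it.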
Optionally, the identifying module 1003 is specifically configured to:
determining the residence of the user as the frequent activity area of the user according to the call time in the call behavior characteristics and the area code of the cell of the user location characteristics;
and according to the user position characteristics, counting the accumulated time length of the user in each position when the user travels in a second time period, and determining the position with the accumulated time length being greater than a second threshold value as a frequent activity area of the user.
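The cumulative-duration rule for frequent activity areas can be sketched as below. The `Visit` tuple layout, the clipping of visits to the time window, and the function name `frequent_areas` are assumptions; the source does not specify how the second time period or the second threshold are chosen.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

# Hypothetical record: (cell identifier, first signaling time, last signaling time)
Visit = Tuple[str, datetime, datetime]


def frequent_areas(
    visits: Iterable[Visit],
    window_start: datetime,
    window_end: datetime,
    min_hours: float,
) -> List[str]:
    """Return cells whose cumulative stay inside the window exceeds min_hours."""
    totals: Dict[str, float] = defaultdict(float)
    for cell, first, last in visits:
        # Clip each visit to the window so only time inside it is counted.
        start = max(first, window_start)
        end = min(last, window_end)
        if end > start:
            totals[cell] += (end - start).total_seconds() / 3600.0
    return sorted(cell for cell, hours in totals.items() if hours > min_hours)
```

Each visit's duration is the last signaling time minus the first signaling time in the cell, matching how stay durations are described for the workplace model.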
Optionally, the short message behavior characteristics include one or any combination of the following characteristics: local number, other party's number, other party's city, short message sending time, event type, and short message length;
the call behavior characteristics comprise one or any combination of the following characteristics: the phone number, the city of the opposite party, the call start time, the event type and the call duration;
the user position characteristics comprise one or any combination of the following characteristics: local number, area code of the cell, type of signaling event, city, and signaling start time.
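The three feature sets listed above can be represented as simple record types. The sketch below is illustrative only: the class and field names are hypothetical, and every field is optional because the text allows "one or any combination" of the listed characteristics.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ShortMessageFeature:
    local_number: Optional[str] = None
    peer_number: Optional[str] = None
    peer_city: Optional[str] = None
    send_time: Optional[datetime] = None
    event_type: Optional[str] = None
    message_length: Optional[int] = None


@dataclass
class CallFeature:
    phone_number: Optional[str] = None
    peer_city: Optional[str] = None
    call_start_time: Optional[datetime] = None
    event_type: Optional[str] = None
    call_duration_s: Optional[float] = None  # unit (seconds) is an assumption


@dataclass
class LocationFeature:
    local_number: Optional[str] = None
    cell_area_code: Optional[str] = None
    signaling_event_type: Optional[str] = None
    city: Optional[str] = None
    signaling_start_time: Optional[datetime] = None
```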
Based on the same technical concept, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the method for identifying the close contact and the frequent active area according to the obtained program.
Based on the same technical concept, the embodiment of the invention also provides a computer-readable non-volatile storage medium, which comprises computer-readable instructions, and when the computer reads and executes the computer-readable instructions, the computer is enabled to execute the method for identifying the close contact and the frequent active area.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (12)

1. A method for close contact and frequent active area identification, comprising:
acquiring short message signaling data, call signaling data and position updating signaling data of a user;
extracting short message behavior characteristics, call behavior characteristics and user position characteristics of the user according to the short message signaling data, the call signaling data and the position updating signaling data respectively;
determining the close contact of the user according to the contact classification of the user, the short message behavior characteristics, the call behavior characteristics and a preset scoring rule;
and determining the frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
2. The method of claim 1, wherein the determining the close contact of the user according to the contact classification of the user, the short message behavior characteristic, the call behavior characteristic and a preset scoring rule comprises:
according to the contact classification of the user, scoring the short message behavior characteristics and the call behavior characteristics of each contact according to a preset scoring rule to obtain contact intimacy scores;
aggregating the contact intimacy scores of the user and then standardizing them to obtain an intimacy value for each contact of the user;
and, according to the intimacy value of each contact and the distributions of call time periods and short message time periods between the user and that contact, determining as close contacts of the user those contacts whose intimacy value is greater than a first threshold and whose call time periods and short message time periods fall within a first time period range.
3. The method of claim 2, wherein the normalizing after aggregating the scores for the affinity of the user's contacts to obtain an affinity value for each contact of the user comprises:
after the scores of the contact intimacy of the user are aggregated, the scores of the contact intimacy of each contact are normalized to obtain the intimacy value of each contact of the user.
4. The method of claim 1, wherein said determining a frequently active region of the user based on user location characteristics and call behavior characteristics of the user comprises:
determining the residence of the user as the frequent activity area of the user according to the call time in the call behavior characteristics and the area code of the cell of the user location characteristics;
and according to the user position characteristics, counting the accumulated time length of the user in each position when the user travels in a second time period, and determining the position with the accumulated time length being greater than a second threshold value as a frequent activity area of the user.
5. The method of any one of claims 1 to 4, wherein the short message behavior characteristics comprise one or any combination of the following characteristics: local number, other party's number, other party's city, short message sending time, event type, and short message length;
the call behavior characteristics comprise one or any combination of the following characteristics: the phone number, the city of the opposite party, the call start time, the event type and the call duration;
the user position characteristics comprise one or any combination of the following characteristics: local number, area code of the cell, type of signaling event, city, and signaling start time.
6. An apparatus for close contact and frequent active area identification, comprising:
the acquisition module is used for acquiring short message signaling data, call signaling data and position updating signaling data of a user;
the processing module is used for extracting the short message behavior characteristics, the call behavior characteristics and the user position characteristics of the user according to the short message signaling data, the call signaling data and the position updating signaling data respectively;
the identification module is used for determining the close contact of the user according to the contact classification of the user, the short message behavior characteristics, the call behavior characteristics and a preset scoring rule; and determining the frequent active area of the user according to the user position characteristics and the call behavior characteristics of the user.
7. The apparatus of claim 6, wherein the identification module is specifically configured to:
according to the contact classification of the user, scoring the short message behavior characteristics and the call behavior characteristics of each contact according to a preset scoring rule to obtain contact intimacy scores;
aggregating the contact intimacy scores of the user and then standardizing them to obtain an intimacy value for each contact of the user;
and, according to the intimacy value of each contact and the distributions of call time periods and short message time periods between the user and that contact, determining as close contacts of the user those contacts whose intimacy value is greater than a first threshold and whose call time periods and short message time periods fall within a first time period range.
8. The apparatus of claim 7, wherein the identification module is specifically configured to:
after the scores of the contact intimacy of the user are aggregated, the scores of the contact intimacy of each contact are normalized to obtain the intimacy value of each contact of the user.
9. The apparatus of claim 6, wherein the identification module is specifically configured to:
determining the residence of the user as the frequent activity area of the user according to the call time in the call behavior characteristics and the area code of the cell of the user location characteristics;
and according to the user position characteristics, counting the accumulated time length of the user in each position when the user travels in a second time period, and determining the position with the accumulated time length being greater than a second threshold value as a frequent activity area of the user.
10. The apparatus of any one of claims 6 to 9, wherein the short message behavior characteristics include one or any combination of the following characteristics: local number, other party's number, other party's city, short message sending time, event type, and short message length;
the call behavior characteristics comprise one or any combination of the following characteristics: the phone number, the city of the opposite party, the call start time, the event type and the call duration;
the user position characteristics comprise one or any combination of the following characteristics: local number, area code of the cell, type of signaling event, city, and signaling start time.
11. A computing device, comprising:
a memory for storing program instructions;
a processor for calling program instructions stored in said memory to execute the method of any one of claims 1 to 5 in accordance with the obtained program.
12. A computer-readable non-transitory storage medium including computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method of any one of claims 1 to 5.
CN201811434322.1A 2018-11-28 2018-11-28 Method and device for identifying intimate contact person and frequent active area Active CN111242147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811434322.1A CN111242147B (en) 2018-11-28 2018-11-28 Method and device for identifying intimate contact person and frequent active area

Publications (2)

Publication Number Publication Date
CN111242147A true CN111242147A (en) 2020-06-05
CN111242147B CN111242147B (en) 2023-07-07

Family

ID=70874034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811434322.1A Active CN111242147B (en) 2018-11-28 2018-11-28 Method and device for identifying intimate contact person and frequent active area

Country Status (1)

Country Link
CN (1) CN111242147B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656686A (en) * 2021-07-26 2021-11-16 深圳市中元产教融合科技有限公司 Task report generation method based on birth teaching fusion and service system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BRPI0413441A (en) * 2003-08-08 2006-10-17 Networks In Motion Inc method and system for collecting, synchronizing, and reporting telecommunication call events and workflow information
CN106304015A (en) * 2015-05-28 2017-01-04 中兴通讯股份有限公司 The determination method and device of subscriber equipment
CN107203901A (en) * 2017-05-11 2017-09-26 中国联合网络通信集团有限公司 The method and device of product information is pushed to user
CN107547633A (en) * 2017-07-27 2018-01-05 腾讯科技(深圳)有限公司 Processing method, device and the storage medium of a kind of resident point of user
CN107871286A (en) * 2017-07-20 2018-04-03 上海前隆信息科技有限公司 User is with contacting human world cohesion decision method/system, storage medium and equipment
WO2019125082A1 (en) * 2017-12-22 2019-06-27 Samsung Electronics Co., Ltd. Device and method for recommending contact information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant