CN111385136A - Method and device for determining user communication identifier - Google Patents
- Publication number
- CN111385136A (application CN201811653353.6A)
- Authority
- CN
- China
- Prior art keywords
- target
- feature
- target user
- determining
- fused
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
- H04L51/52—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Telephonic Communication Services (AREA)
Abstract
The embodiments of the invention disclose a method and a device for determining a user communication identifier. The method comprises: acquiring N pieces of message attribute information corresponding to N historical messages published by a target account on a target social platform; determining N time windows according to the publishing time of each historical message, and determining the call ticket attribute information of any one of M target users in any time window according to the target call ticket data in that time window, so as to obtain N pieces of call ticket attribute information for each target user over the N time windows; extracting information features based on the N pieces of message attribute information and the N pieces of call ticket attribute information of each target user over the N time windows, to obtain M target feature sets corresponding to the M target users; and determining, according to the target feature set corresponding to each target user, a target user communication identifier uniquely associated with the target account. The embodiments of the invention can improve the complaint feedback efficiency and the user experience of a communication network.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for determining a user communication identifier.
Background
With the continuous development of computer network technology and communication technology, more and more people enjoy the convenience brought by internet technology. In particular, the rise of social networks has greatly changed people's daily lives. Social networks have penetrated every aspect of life and have broken through the information dissemination mode of traditional media, so that people can freely publish content they consider valuable on a social network in the form of text or multimedia (such as pictures or videos). However, while the rise of social networks brings convenience to people's lives, it also brings challenges to network providers. Because people can freely publish or browse messages on a social network, when they find that network service quality is poor they often choose to publish a network fault report or network complaint information, such as 'user experience of xxx network providers', on the social network. Such complaint information is public and is easily seen by many other users, so the brand of the network provider is adversely affected. Therefore, how to accurately locate the users who publish complaint information, and improve their network quality experience in a targeted manner, has become a major research topic in the field of network maintenance.
In the prior art, a network provider can accurately locate a user corresponding to a certain social account in a network communication system only on the premise that the network provider has a fixed cooperative relationship with a social network platform, so that user experience is improved in a targeted manner. If the network provider and some social network platforms do not have a fixed cooperative relationship, the network provider cannot analyze and process the complaint information issued by the users of some social network platforms, which results in low complaint feedback efficiency and poor user experience of the communication network provided by the network provider.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining a user communication identifier, which can enable a communication network to accurately locate a network user corresponding to a social account, and then carry out fault analysis and resolution in a targeted manner, so that the network complaint feedback efficiency of the communication network can be improved, and the user experience of the communication network can be improved.
In a first aspect, an embodiment of the present invention provides a method for determining a user communication identifier. First, N pieces of message attribute information corresponding to N historical messages published by a target account on a target social platform may be acquired. Here, one historical message corresponds to one piece of message attribute information. Then, N time windows may be determined according to the publishing time corresponding to each historical message. The call ticket attribute information of any one of M target users in any time window is determined according to the target call ticket data in that time window, so as to obtain N pieces of call ticket attribute information for each of the M target users over the N time windows. Here, a target user is a communication network user having service interaction with the target social platform within a time window, and the target call ticket data is the call ticket data associated with the target users. Information features are then extracted based on the N pieces of message attribute information and the N pieces of call ticket attribute information of each of the M target users over the N time windows, to obtain M target feature sets corresponding to the M target users. A target probability corresponding to each target user is determined according to the target feature set corresponding to that target user, the target user uniquely associated with the target account is determined according to the target probabilities, and the user communication identifier corresponding to that target user is determined as the target user communication identifier. Here, a target probability indicates the degree of association between a target user and the target account.
In the embodiment of the invention, after acquiring N message attribute information corresponding to N historical messages issued by a target social account on a social platform and call ticket attribute information of M users on a time window corresponding to each historical message, feature extraction can be performed on the message attribute information and the call ticket attribute information to obtain M target feature sets corresponding to M users. Then, according to the M feature sets, M target probabilities which can be used for indicating the degree of association between the target user and the target social account are determined. And finally, determining a target user uniquely associated with the target social account from the M target users according to the M target probabilities, and determining a user communication identifier corresponding to the target user in a communication system as a target user communication identifier. The association degree of a certain target user and a target social account is determined through information comparison and statistics between the message attribute of the historical message and ticket attribute information of a certain user on a time window corresponding to the historical message, and a target user communication identifier uniquely associated with the target social account is further determined, so that a network user corresponding to the social account can be accurately positioned by the communication network, then fault analysis and solution are performed in a targeted manner, the network complaint feedback efficiency of the communication network can be improved, and the user experience of the communication network is improved.
In some possible embodiments, a preset time period threshold t may be obtained. A time window TDi corresponding to any historical message i is then determined according to the preset time period threshold t and the publishing time Ti corresponding to that historical message, so as to obtain N time windows corresponding to the N historical messages. Here, TDi = [Ti - t, Ti + t]. Because the time windows are determined with the publishing time of each historical message as a reference, the call ticket data acquired in each time window is ensured to contain the call ticket data of the service that published the historical message, so that the subsequent information feature extraction and the determination of the target user communication identifier are reasonable and effective.
In some feasible implementations, the following information feature extraction may be performed on the message attribute information and the call ticket attribute information for any target user i of the M target users. First, a comparison feature set of the target user i in any time window is determined by comparing, and computing statistics over, the call ticket attribute information of the target user i in that time window and the message attribute information of the historical message corresponding to that time window, so as to obtain N comparison feature sets corresponding to the target user i over the N time windows. Here, one comparison feature set includes S different kinds of comparison features. Then, feature fusion is performed on the N comparison feature sets corresponding to the target user i over the N time windows to obtain the target feature set corresponding to the target user i. Finally, the M target feature sets corresponding to the M target users are determined according to the information feature extraction results for the message attribute information and the call ticket attribute information corresponding to each target user. The comparison feature sets obtained for each target user over the N time windows through information comparison and statistics represent the degree of matching between the call ticket attribute information and the message attribute information of the historical messages. Fusing them yields a target feature set that represents the degree of association between the target user and the target account that published the historical messages. The method is easy to implement, reasonable and effective, and can improve the efficiency of determining the user communication identifier.
In some possible embodiments, the S different comparison features include one or more of the following: a flag indicating whether the initiating terminal type is the same, a service occurrence time difference, a number of call tickets, an uplink traffic size, a relative downlink traffic size, a historical message size, a multimedia data flag, and a multimedia data size.
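The comparison features listed above can be illustrated with a minimal sketch for a single time window. The record types `MessageAttr` and `TicketAttr`, every field name, and the exact definition of each feature (for example, taking the relative downlink traffic size as the downlink-to-uplink ratio) are assumptions made for illustration; the text here names the features but does not define them.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class MessageAttr:
    """Hypothetical message attribute information for one historical message."""
    publish_time: datetime
    terminal_type: str
    message_size: int       # size of the message content, in bytes (assumed unit)
    has_multimedia: bool
    multimedia_size: int    # bytes, 0 when there is no multimedia data


@dataclass
class TicketAttr:
    """Hypothetical call ticket attribute information for one user in one window."""
    user_id: str
    terminal_type: str
    start_time: datetime
    uplink_bytes: int
    downlink_bytes: int
    ticket_count: int       # number of call tickets of this user in the window


def comparison_features(msg: MessageAttr,
                        ticket: Optional[TicketAttr]) -> Optional[Dict[str, float]]:
    """Compare one message with one user's call ticket attributes in its window."""
    if ticket is None:      # the user has no call ticket in this window
        return None
    return {
        "same_terminal_type": 1.0 if msg.terminal_type == ticket.terminal_type else 0.0,
        "time_diff_s": abs((ticket.start_time - msg.publish_time).total_seconds()),
        "ticket_count": float(ticket.ticket_count),
        "uplink_bytes": float(ticket.uplink_bytes),
        # Assumed definition: downlink traffic relative to uplink traffic.
        "downlink_ratio": ticket.downlink_bytes / max(ticket.uplink_bytes, 1),
        "message_size": float(msg.message_size),
        "has_multimedia": 1.0 if msg.has_multimedia else 0.0,
        "multimedia_size": float(msg.multimedia_size),
    }


msg = MessageAttr(datetime(2018, 1, 1, 13, 55, 0), "android", 120, True, 50000)
ticket = TicketAttr("user1", "android", datetime(2018, 1, 1, 13, 55, 20), 60000, 3000, 2)
features = comparison_features(msg, ticket)
```

Returning `None` for a user who is absent from a window is one possible way to carry the null call ticket attribute information through to the fusion step.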
In some possible embodiments, U feature groups to be fused may be determined from the N comparison feature sets corresponding to the target user i over the N time windows. Here, one feature group to be fused includes one or more comparison features of the target user i in each time window. A target feature value corresponding to any feature group to be fused is determined according to the feature fusion result of the comparison features included in that group, so as to obtain U target feature values corresponding to the U feature groups to be fused. The target feature set corresponding to the target user i is then determined according to the U target feature values. Feature fusion is performed on a single comparison feature or on multiple comparison features; the process is simple, and the fused target feature set can fully reflect the degree of association between the target user and the target account.
In some possible embodiments, the U feature groups to be fused include a first feature group to be fused, where the first feature group to be fused includes a first comparison feature of the target user i in each time window. The average of the feature values of the first comparison feature over the time windows is calculated, and the average is determined as the target feature value corresponding to the first feature group to be fused.
In some possible embodiments, the U feature groups to be fused include a second feature group to be fused, and the second feature group to be fused includes a second comparison feature and a third comparison feature of the target user i in each time window. A similarity value between the second comparison features over the time windows and the third comparison features over the time windows is calculated, and the similarity value is determined as the target feature value corresponding to the second feature group to be fused.
In some possible embodiments, the U feature groups to be fused include a third feature group to be fused, and the third feature group to be fused includes a fourth comparison feature of the target user i in each time window. The sum of the feature values of the fourth comparison feature over the time windows is calculated, and the ratio of that sum to the number N of historical messages is determined as the target feature value corresponding to the third feature group to be fused.
In some feasible embodiments, the target feature sets corresponding to the target users may be sequentially input into a preset classification model, and the target probability corresponding to each target user is determined based on the classification result produced by the classification model for that user's target feature set. Determining the target probability of each target user with a preset, trained classification model can improve the validity of the target probability.
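The three fusion rules described above (average, similarity value, and sum divided by N) can be sketched as follows. The function names are hypothetical, and cosine similarity is only an assumed choice: the text calls for a similarity value without naming a specific measure.

```python
import math
from typing import Sequence


def fuse_mean(values: Sequence[float]) -> float:
    """First kind of group: average of one comparison feature over the time windows."""
    return sum(values) / len(values)


def fuse_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Second kind of group: similarity between two comparison features over the
    time windows. Cosine similarity is an assumption made for this sketch."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse_sum_ratio(values: Sequence[float], n_messages: int) -> float:
    """Third kind of group: sum of one comparison feature divided by N."""
    return sum(values) / n_messages


# Illustrative per-window values for one target user over N = 4 time windows.
time_diffs = [10.0, 20.0, 30.0, 20.0]
mean_time_diff = fuse_mean(time_diffs)
same_terminal_rate = fuse_sum_ratio([1.0, 0.0, 1.0, 1.0], 4)
```

When every one of the N time windows contributes a value, `fuse_mean` and `fuse_sum_ratio` give the same result; they differ only if the list passed to `fuse_mean` leaves out some windows.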
In some possible embodiments, the target user corresponding to the highest target probability among the target probabilities corresponding to the target users may be determined as the target user uniquely associated with the target account.
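Selecting the target user with the highest target probability amounts to an arg-max over the M probabilities. The sketch below assumes the probabilities are held in a dictionary keyed by user communication identifier; the function name, the identifiers and the probability values are purely illustrative.

```python
from typing import Dict


def pick_target_user(target_probabilities: Dict[str, float]) -> str:
    """Return the user communication identifier with the highest target probability."""
    if not target_probabilities:
        raise ValueError("no target users to choose from")
    return max(target_probabilities, key=target_probabilities.get)


# Hypothetical identifiers and probabilities for M = 3 target users.
probabilities = {"user1": 0.12, "user2": 0.81, "user3": 0.33}
target_user_id = pick_target_user(probabilities)
```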
In the embodiment of the invention, the association degree between a certain target user and the target social account is determined through information comparison and statistics between the message attribute of the historical message and the ticket attribute information of the certain user in the time window corresponding to the historical message, and the target user communication identifier uniquely associated with the target social account is further determined. The network users corresponding to the social account can be accurately positioned by the communication network, and then fault analysis and solution are performed in a targeted manner, so that the network complaint feedback efficiency of the communication network can be improved, and the user experience of the communication network can be improved.
In a second aspect, an embodiment of the present invention provides a device for determining a user communication identifier. The device includes units configured to perform the method for determining a user communication identifier provided in any one of the possible implementations of the first aspect, and can therefore also achieve the beneficial effects of the method provided in the first aspect.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a processor and a memory, where the processor and the memory are connected to each other. The memory is used for storing a computer program, the computer program includes program instructions, and the processor is configured to invoke the program instructions to execute the method for determining a user communication identifier provided by the first aspect, so as to achieve the beneficial effects of the method for determining a user communication identifier provided by the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are run on a computer, the instructions cause the computer to perform the method for determining a user communication identifier provided in any one of the foregoing possible implementation manners of the first aspect, and also can achieve the beneficial effects of the method for determining a user communication identifier provided in the first aspect.
In a fifth aspect, an embodiment of the present invention provides a chip system, where the chip system includes a processor, configured to support a terminal device to implement the functions referred to in the foregoing first aspect, for example, to generate or process information referred to in the method for determining a user communication identifier provided in the foregoing first aspect. In one possible design, the above chip system further includes a memory for storing program instructions and data necessary for the terminal. The chip system may be formed by a chip, and may also include a chip and other discrete devices.
In a sixth aspect, an embodiment of the present invention provides a computer program product including instructions, which, when the computer program product runs on a computer, enables the computer to execute the method for determining a user communication identifier provided in the first aspect, and also can achieve the beneficial effects of the method for determining a user communication identifier provided in the first aspect.
Drawings
Fig. 1 is a flowchart illustrating a method for determining a user communication identifier according to an embodiment of the present invention;
Fig. 2 is a schematic diagram illustrating a corresponding relationship between a comparison feature set and each time window according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for determining a user communication identifier according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
The method for determining the user communication identifier according to the embodiment of the present invention may be executed by a terminal device with data processing capability, such as a desktop computer or a laptop computer, which is not limited herein. In the embodiment of the present invention, the terms "first" and "second" preceding, for example, the first feature group to be fused and the second feature group to be fused are only used to distinguish different feature groups to be fused and impose no other limitation. Likewise, "first" and "second" preceding the names of the first comparison feature and the second comparison feature mentioned below impose no other limitation.
Example one
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for determining a user communication identifier according to an embodiment of the present invention. For convenience of understanding and description, the present embodiment describes the method for determining the user communication identifier by using the terminal device as an execution subject. The determination method comprises the following steps:
S101, acquiring N pieces of message attribute information corresponding to N pieces of historical messages issued by a target account on a target social platform.
In some feasible embodiments, the terminal device may obtain N pieces of message attribute information corresponding to N pieces of historical messages issued by the target account on the target social platform in a certain period of time in the past. Here, the target account is a legal account that has already been registered on the target social platform. One history message corresponds to one message attribute information.
In a specific implementation, the terminal device may obtain the historical data of the target social platform for a certain past time period through an Application Programming Interface (API) provided by the target social platform, using methods such as web data capture. The historical data is then screened to obtain the N historical messages published by the target account within that time period, together with the associated information corresponding to each of the N historical messages. Here, the associated information corresponding to a historical message is information related to the publisher of the historical message that is disclosed on the target social platform, for example, information about the terminal device that published the historical message. Finally, the terminal device can determine, according to a preset first information type, the N pieces of message attribute information corresponding to the N historical messages from the N historical messages and their associated information. Optionally, the first information type may include a publishing time, a message content, a publishing terminal type, and attribute information of the multimedia data included in the message. Here, the multimedia data includes, without limitation, picture data and video data. The multimedia data attribute information indicates whether the historical message contains multimedia data and the size of the multimedia data it contains. Accordingly, the message attribute information corresponding to a historical message includes at least the publishing time of the historical message, the content of the historical message, the type of terminal that published the historical message, and the attribute information of the multimedia data included in the historical message.
Take obtaining the message attribute information corresponding to a historical message L among the N historical messages as an example. After obtaining the historical data of the target social platform through the API provided by the target social platform, the terminal device may screen out the historical message L published by the target account from the historical data, and at the same time obtain the associated information corresponding to the historical message L, such as the message publishing time and the publishing terminal type. Then, according to the preset first information type, the data that conforms to the information type is selected from the historical message L and its associated information, and is determined as the message attribute information corresponding to the historical message L.
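The screening and selection just described can be sketched as two small steps: keep only the records published by the target account, then keep only the fields named by the preset first information type. The record layout, all field names, and the account identifier below are hypothetical; a real platform API would define its own schema.

```python
from typing import Any, Dict, Iterable, List, Sequence

# Assumed field names standing in for the preset first information type.
FIRST_INFO_TYPE = ("publish_time", "content", "terminal_type", "multimedia")


def history_messages_for(records: Iterable[Dict[str, Any]],
                         account_id: str) -> List[Dict[str, Any]]:
    """Screen the platform's historical data for messages of the target account."""
    return [r for r in records if r.get("account_id") == account_id]


def extract_message_attributes(record: Dict[str, Any],
                               info_type: Sequence[str] = FIRST_INFO_TYPE) -> Dict[str, Any]:
    """Keep only the fields that conform to the preset information type."""
    return {field: record.get(field) for field in info_type}


# Illustrative historical data, as it might look after retrieval.
historical_data = [
    {"account_id": "acct_A", "publish_time": "13:55", "content": "network is slow",
     "terminal_type": "android", "multimedia": {"present": True, "size": 50000},
     "likes": 3},
    {"account_id": "acct_B", "publish_time": "13:57", "content": "hello",
     "terminal_type": "ios", "multimedia": {"present": False, "size": 0}},
]

messages = history_messages_for(historical_data, "acct_A")
message_attrs = [extract_message_attributes(m) for m in messages]
```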
S102, determining N time windows according to the release time corresponding to each historical message, and determining the ticket attribute information of any target user in M target users on any time window according to the target ticket data in any time window so as to obtain the N ticket attribute information of each target user in M target users on N time windows.
In some feasible embodiments, after the terminal device obtains the N historical messages, N time windows (a time window being a fixed time period) may be determined according to the publishing times corresponding to the N historical messages. The target call ticket data in each of the N time windows is then acquired. The target call ticket data is the call ticket data associated with the target social platform in each time window. The terminal device can then determine the call ticket attribute information corresponding to one or more target users in any time window according to the target call ticket data in that time window, and repeat this operation to finally obtain N pieces of call ticket attribute information for each of the M target users over the N time windows. Here, a target user is a communication network user having service interaction with the target social platform in a time window.
Optionally, in a specific implementation, after the terminal device acquires the N historical messages, it may determine the N publishing times corresponding to the N historical messages. Here, one historical message corresponds to one publishing time. Then, a preset time period threshold t is obtained, and for any historical message i among the N historical messages, the corresponding time window TDi is determined according to the preset time period threshold t and the publishing time Ti corresponding to the historical message i. Here, TDi = [Ti - t, Ti + t]. For example, if the preset time period threshold t is 1 minute and the publishing time corresponding to the historical message i is 13:55, the time window corresponding to the historical message i is [13:54, 13:56]. The terminal device performs this operation for each of the N publishing times to obtain the N time windows corresponding to the N historical messages. Because the time windows are determined with the publishing time of each historical message as a reference, the call ticket data acquired in each time window is ensured to contain the call ticket data of the service that published the historical message, so that the subsequent information feature extraction and the determination of the target user communication identifier are reasonable and effective.
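The window computation TDi = [Ti - t, Ti + t] can be written directly with Python's standard `datetime` types. The function names and the calendar date are illustrative assumptions; only the 13:55 publishing time and the 1-minute threshold come from the example above.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Window = Tuple[datetime, datetime]


def time_window(publish_time: datetime, t: timedelta) -> Window:
    """Return the closed window [Ti - t, Ti + t] around a publishing time Ti."""
    return (publish_time - t, publish_time + t)


def time_windows(publish_times: List[datetime], t: timedelta) -> List[Window]:
    """One window per historical message, in the same order as the messages."""
    return [time_window(ti, t) for ti in publish_times]


def in_window(moment: datetime, window: Window) -> bool:
    """True when a moment (e.g. a call ticket start time) falls inside the window."""
    start, end = window
    return start <= moment <= end


# Publishing time 13:55 with t = 1 minute (the date itself is arbitrary).
window = time_window(datetime(2018, 1, 1, 13, 55), timedelta(minutes=1))
```

A call ticket starting at 13:55:30 would fall inside this window, while one starting at 13:57 would not.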
After the N time windows are determined, the terminal device may obtain, through a Deep Packet Inspection (DPI) technique, DPI data in each of the N time windows. Then, the terminal device can extract call ticket data associated with the target social platform from the DPI data in each time window to determine target call ticket data corresponding to each time window in the N time windows. Here, it should be noted that the ticket data associated with the target social platform in a certain time window is the ticket data corresponding to the service request initiated by one or more target users in the time window and associated with the target social platform. Different service types generate different call ticket data, for example, a target user browses a webpage of a target social platform and the target user publishes a message on the target social platform can generate different call ticket data, so that one target user can correspond to one or more call ticket data in one time window. For example, in a time window TDi, the terminal device determines that data interaction exists between three users, namely a target user 1, a target user 2 and a target user 3, and the target social platform through DPI detection, and then the terminal device can determine all call ticket data of the target user 1 in the time window TDi, all call ticket data of the target user 2 in the time window TDi and all call ticket data of the target user 3 in the time window TDi as target call ticket data corresponding to the time window TDi. And then, the terminal equipment determines N pieces of ticket attribute information corresponding to each target user in the M target users on the N time windows from the target ticket data corresponding to each time window according to a preset second information type. 
Here, the second information type may include a terminal type corresponding to the ticket, a ticket start time, a user communication identifier included in the ticket, an uplink traffic size corresponding to the ticket, and a downlink traffic size corresponding to the ticket. Specifically, it is taken as an example to determine a corresponding ticket attribute information of the target user i in the time window time TDi. After acquiring the target call ticket data corresponding to the time window TDi, the terminal device may extract one or more call ticket data corresponding to the target user i, and determine the call ticket data with the maximum uplink flow in the one or more call ticket data as the suspected call ticket data corresponding to the target user i. Then, the terminal device may extract the ticket attribute information of the second information type of the symbol from the suspected ticket data, such as the terminal type corresponding to the ticket, the start time of the ticket, the user communication identifier included in the ticket, the uplink traffic size corresponding to the ticket, and the downlink traffic size corresponding to the ticket, and then determine the ticket attribute information corresponding to the suspected ticket data as the ticket attribute information corresponding to the target user i in the time window TDi. It should be further noted that the M target users are all target users that appear in the N time windows. 
For example, assuming that N is 2, a target user 1, a target user 2, a target user 3, and a target user 4 exist in the time window TD1, and a target user 1, a target user 2, and a target user 4 exist in the time window 2, after obtaining target call ticket data corresponding to 2 time windows, the terminal device may determine 4 call ticket attribute information corresponding to 4 target users, namely the target user 1, the target user 2, the target user 3, and the target user 4 (the call ticket attribute information corresponding to the target user 3 in the time window TD2 is null).
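The selection of suspected call ticket data (the call ticket with the largest uplink traffic per user and per window) and the collection of the M target users as everyone who appears in any window can be sketched as below. Call tickets are modelled as plain dictionaries with hypothetical keys `user_id` and `uplink_bytes`; the traffic figures are invented for the N = 2 example.

```python
from typing import Any, Dict, List

Ticket = Dict[str, Any]


def suspected_tickets(window_tickets: List[Ticket]) -> Dict[str, Ticket]:
    """For one time window, keep each user's call ticket with the largest uplink traffic."""
    best: Dict[str, Ticket] = {}
    for ticket in window_tickets:
        uid = ticket["user_id"]
        if uid not in best or ticket["uplink_bytes"] > best[uid]["uplink_bytes"]:
            best[uid] = ticket
    return best


def all_target_users(per_window: List[Dict[str, Ticket]]) -> List[str]:
    """The M target users are all users that appear in at least one time window."""
    users = set()
    for best in per_window:
        users.update(best)
    return sorted(users)


# N = 2: users 1-4 appear in TD1; users 1, 2 and 4 appear in TD2.
td1 = [
    {"user_id": "user1", "uplink_bytes": 500},
    {"user_id": "user1", "uplink_bytes": 60000},   # publishing tends to upload more
    {"user_id": "user2", "uplink_bytes": 800},
    {"user_id": "user3", "uplink_bytes": 300},
    {"user_id": "user4", "uplink_bytes": 900},
]
td2 = [
    {"user_id": "user1", "uplink_bytes": 40000},
    {"user_id": "user2", "uplink_bytes": 700},
    {"user_id": "user4", "uplink_bytes": 100},
]

per_window = [suspected_tickets(td1), suspected_tickets(td2)]
users = all_target_users(per_window)
```

Looking up `per_window[1].get("user3")` yields `None`, which corresponds to the null call ticket attribute information of target user 3 in TD2.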
The following illustrates a process in which the terminal device obtains N pieces of ticket attribute information of each target user in M target users over N time windows. Assume that the terminal device determines 4 time windows, time window TD1, time window TD2, time window TD3, and time window TD 4. The terminal equipment can acquire first target ticket data corresponding to a time window TD1, second target ticket data corresponding to a time window TD2, third target ticket data corresponding to a time window TD3 and fourth target ticket data corresponding to a time window TD4 through a DPI probe. Here, it is assumed that the first target call ticket data includes call ticket data corresponding to a target user 1, a target user 2, a target user 3, and a target user 4; the second target call ticket data comprises call ticket data corresponding to a target user 1, a target user 3 and a target user 4; the third target call ticket data comprises call ticket data corresponding to a target user 1, a target user 2, a target user 3 and a target user 4; the fourth target call ticket data comprises a target user 1, a target user 2, a target user 3 and corresponding call ticket data. 
Then, the terminal device can determine, according to the first, second, third and fourth target ticket data, a first suspected ticket, a second suspected ticket, a third suspected ticket and a fourth suspected ticket corresponding to the target user 1 in the time window TD1, the time window TD2, the time window TD3 and the time window TD4, respectively. It then extracts the terminal type, the start time, the user communication identifier, the uplink traffic size and the downlink traffic size corresponding to each of these four suspected tickets, so as to obtain the 4 pieces of ticket attribute information of the target user 1 in the time window TD1, the time window TD2, the time window TD3 and the time window TD4. Similarly, the terminal device may determine the 4 pieces of ticket attribute information corresponding to each of the target user 2, the target user 3 and the target user 4 in the time window TD1, the time window TD2, the time window TD3 and the time window TD4, where the ticket attribute information corresponding to the target user 2 in the time window TD2 may be null, and the ticket attribute information corresponding to the target user 4 in the time window TD4 may be null. Finally, the terminal device can determine the ticket attribute information of each of the 4 target users in each time window.
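The selection of suspected tickets described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Ticket` record, its field names and units, and the helper names `suspected_ticket` and `ticket_attributes` are all hypothetical, and the sample data is made up to mirror the N = 2 example (target user 3 absent from TD2).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Ticket:
    user_id: str        # user communication identifier carried in the ticket
    terminal_type: str
    start_time: float   # ticket start time, in seconds (unit is an assumption)
    uplink: int         # uplink traffic size, in bytes (unit is an assumption)
    downlink: int       # downlink traffic size, in bytes (unit is an assumption)


def suspected_ticket(window: List[Ticket], user_id: str) -> Optional[Ticket]:
    """Return the user's ticket with the largest uplink traffic, or None if absent."""
    own = [t for t in window if t.user_id == user_id]
    return max(own, key=lambda t: t.uplink) if own else None


def ticket_attributes(windows: List[List[Ticket]]) -> Dict[str, List[Optional[Ticket]]]:
    """Map each user seen in any window to one entry (or None) per window."""
    users = sorted({t.user_id for w in windows for t in w})
    return {u: [suspected_ticket(w, u) for w in windows] for u in users}


# Made-up data mirroring the N = 2 example: user 3 is missing from TD2.
td1 = [Ticket("user1", "phone", 10.0, 500, 90), Ticket("user1", "phone", 12.0, 900, 40),
       Ticket("user2", "pad", 11.0, 300, 70), Ticket("user3", "phone", 13.0, 200, 20),
       Ticket("user4", "phone", 14.0, 100, 10)]
td2 = [Ticket("user1", "phone", 60.0, 400, 30), Ticket("user2", "pad", 61.0, 250, 80),
       Ticket("user4", "phone", 62.0, 150, 15)]
attrs = ticket_attributes([td1, td2])
```

In this sketch a missing user in a window is represented by `None`, which plays the role of the null ticket attribute information mentioned in the text.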
S103, information feature extraction is carried out on the basis of the N message attribute information and the N ticket attribute information of each target user in the M target users on the N time windows, so that M target feature sets corresponding to the M target users are obtained.
In some feasible implementation manners, after acquiring the N message attribute information and the N ticket attribute information of each target user in the M target users on the N time windows, the terminal device may perform information feature extraction based on the N message attribute information and the N ticket attribute information of each target user in the M target users on the N time windows, so as to obtain M target feature sets corresponding to the M target users. Here, one target user corresponds to one target feature set. For convenience of understanding and description, a process of determining M target feature sets corresponding to M target users by the terminal device is described below by taking a process of determining a target feature set corresponding to a target user i as an example.
Optionally, in a specific implementation, after determining the ticket attribute information of the target user i in each time window, the terminal device may compare and perform statistics on the message attribute information corresponding to any time window TDi and the ticket attribute information of the target user i in that time window, so as to obtain the comparison feature set of the target user i in the time window TDi. Here, the comparison feature set includes S kinds of comparison features. Optionally, the comparison feature set may specifically include 8 comparison features, namely a same-originating-terminal-type flag V1, a service occurrence time difference V2, a number of tickets V3, an uplink traffic size V4, a relative downlink traffic size V5, a history message size V6, a multimedia data flag V7, and a multimedia data size V8. The following description takes these 8 comparison features as an example.
Suppose that the time window TDi corresponds to the history message Z, whose message attribute information is Zi, where the message attribute information Zi includes the publishing time t1 of message Z, the terminal type used to publish message Z, the content size Q1 of message Z, and the multimedia attribute information of message Z. The ticket attribute information corresponding to the target user i includes the ticket terminal type, the ticket start time t2, the user communication identifier in the ticket, the ticket uplink traffic size Q2 and the ticket downlink traffic size Q3.
The terminal device can compare whether the terminal type used to publish message Z is consistent with the ticket terminal type; if so, the value of the comparison feature V1 is determined to be 1, and if not, the value of the comparison feature V1 is determined to be 0. The terminal device can calculate the difference between the publishing time t1 of message Z and the ticket start time t2, and determine the value of the comparison feature V2 to be t1 - t2. The terminal device can determine the value of the comparison feature V3 according to the number of tickets of the target user i in the time window TDi. The terminal device can determine the ticket uplink traffic size Q2 in the ticket attribute information corresponding to the target user i as the value of the comparison feature V4. The terminal device may determine the relative downlink traffic size of the suspected ticket of the target user i in the time window TDi as the value of the comparison feature V5. For example, the terminal device can compare the downlink traffic size of the suspected ticket of the target user i in the time window TDi (i.e. the Q3 mentioned above) with the downlink traffic sizes of the two tickets adjacent to the suspected ticket in occurrence time, so as to determine the relative downlink traffic size of the suspected ticket. For instance, assume that the downlink traffic size of the suspected ticket of the target user i in the time window TDi is y1, and that the downlink traffic sizes of the two tickets immediately before and after it in occurrence time are y0 and y2, respectively. The predetermined relative sizes are 1, 2 and 3.
The terminal device can compare y0, y1 and y2 to obtain the rank of y1 among the three values; if the order is y0 < y1 < y2, the terminal device can determine that the relative downlink traffic size of the suspected ticket of the target user i is 2, that is, the value of the comparison feature V5 is 2. The terminal device may also determine the content size Q1 of the history message Z as the value of the comparison feature V6. The terminal device can also judge, according to the multimedia attribute information of message Z, whether message Z contains multimedia data. If it does, the value of the comparison feature V7 is determined to be 1, and if it does not, the value of the comparison feature V7 is determined to be 0. The terminal device may further determine the value of the comparison feature V8 according to the size of the multimedia data included in the multimedia attribute information of message Z. Finally, the terminal device may compose the comparison feature set corresponding to the target user i in the time window TDi from the comparison features V1 to V8. Similarly, the terminal device performs the above comparison and statistics operations on the message attribute information in each time window and the ticket attribute information corresponding to the target user i, so as to obtain N comparison feature sets corresponding to the target user i in the N time windows. Moreover, each comparison feature set includes the above 8 comparison features.
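As an illustration only, the computation of V1 to V8 for one time window might be sketched in Python as follows. The record types, field names and function names are hypothetical. The text fixes V5 only for the order y0 < y1 < y2, so the ascending rank and the treatment of ties used here are assumptions, as is the requirement that the ticket attribute information is not null.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MessageAttrs:
    publish_time: float    # t1
    terminal_type: str     # terminal type used to publish message Z
    content_size: int      # Q1
    has_multimedia: bool
    multimedia_size: int


@dataclass
class TicketAttrs:
    terminal_type: str
    start_time: float      # t2
    uplink: int            # Q2
    downlink: int          # Q3, i.e. y1


def relative_rank(y0: float, y1: float, y2: float) -> int:
    """1-based ascending rank of y1 among (y0, y1, y2); ties do not raise the rank."""
    return 1 + sum(1 for y in (y0, y2) if y < y1)


def comparison_features(msg: MessageAttrs, ticket: TicketAttrs,
                        ticket_count: int, y0: float, y2: float) -> Dict[str, float]:
    """Build the 8 comparison features for one target user in one time window."""
    return {
        "V1": 1 if msg.terminal_type == ticket.terminal_type else 0,
        "V2": msg.publish_time - ticket.start_time,   # t1 - t2
        "V3": ticket_count,
        "V4": ticket.uplink,
        "V5": relative_rank(y0, ticket.downlink, y2),
        "V6": msg.content_size,
        "V7": 1 if msg.has_multimedia else 0,
        "V8": msg.multimedia_size,
    }


# Made-up values for a single window.
msg = MessageAttrs(100.0, "phone", 2048, True, 1500)
ticket = TicketAttrs("phone", 97.0, 1800, 20)
features = comparison_features(msg, ticket, ticket_count=3, y0=10, y2=30)
```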
Here, it is easily understood that, by performing the above comparison and statistics operations on each of the M target users, the terminal device may obtain the comparison feature set corresponding to each of the M target users in each time window. For ease of understanding, refer to fig. 2, which is a schematic diagram illustrating the correspondence between comparison feature sets and time windows according to an embodiment of the present invention. As can be seen from the figure, any one of the M target users corresponds to one comparison feature set in a given time window, that is, one target user corresponds to N comparison feature sets in the N time windows (for example, the target user 1 corresponds to comparison feature set 1 through comparison feature set N, N comparison feature sets in total), and the M target users correspond to M × N comparison feature sets in total in the N time windows.
In some feasible embodiments, after the terminal device determines the N comparison feature sets corresponding to the target user i in the N time windows, the terminal device may perform feature fusion on the N comparison feature sets to obtain the target feature set corresponding to the target user i. Specifically, the terminal device may determine U feature groups to be fused according to the N comparison feature sets. Here, one feature group to be fused may include one or more kinds of comparison features of the target user i in each time window. For example, one feature group to be fused may include the N comparison features V1 of the target user i in the N time windows. Another feature group to be fused may include the N comparison features V4 and the N comparison features V6 of the target user i in the N time windows. Then, the terminal device may fuse the comparison features included in each of the U feature groups to be fused to obtain U target feature values. Here, one target feature value is obtained by fusing one feature group to be fused. Finally, the terminal device may combine the U target feature values into the target feature set corresponding to the target user i.
In a specific implementation, it is optionally assumed that the U feature groups to be fused may include a first feature group to be fused, where the first feature group to be fused includes first comparison features of the target user i in the time windows. The terminal equipment can calculate the average value of the characteristic values of the first comparison characteristics on each time window, and determines the average value as the target characteristic value corresponding to the first characteristic group to be fused. Optionally, it is assumed that the U feature groups to be fused include a second feature group to be fused, and the second feature group to be fused includes a second comparison feature and a third comparison feature of the target user i in each time window. The terminal equipment can calculate the similarity value between the second comparison characteristic on each time window and the third comparison characteristic on each time window, and determines the similarity value as the target characteristic value corresponding to the second feature group to be fused. Optionally, it is assumed that the U feature groups to be fused include a third feature group to be fused, and the third feature group to be fused includes fourth comparison features of the target user i in each time window. The terminal equipment can calculate the sum of the feature values of the fourth comparison features on each time window, and determine the ratio of the sum of the feature values to the number N of the historical messages as the target feature value corresponding to the third feature group to be fused.
For example, it is assumed that 8 feature groups to be fused are preset, and the first feature group to be fused includes the N comparison features V1 of the target user i in the N time windows. The terminal device can calculate the ratio B' of the sum of the feature values of the N comparison features V1 to the number N of the historical messages, and determine log10(B') as the target feature value corresponding to the first feature group to be fused. The second feature group to be fused includes the N comparison features V2 of the target user i in the N time windows, and the terminal device may calculate the average of the feature values of the N comparison features V2, and determine the average as the target feature value corresponding to the second feature group to be fused. The third feature group to be fused includes the N comparison features V3 of the target user i in the N time windows, and the terminal device may calculate the average of the feature values of the N comparison features V3, and determine the average as the target feature value corresponding to the third feature group to be fused. The fourth feature group to be fused includes the N comparison features V4 and the N comparison features V6 of the target user i in the N time windows, and the terminal device may calculate a similarity value between a first sequence composed of the feature values of the N comparison features V4 and a second sequence composed of the feature values of the N comparison features V6, and determine the similarity value as the target feature value corresponding to the fourth feature group to be fused. Here, optionally, the terminal device can calculate the Pearson coefficient of the first sequence and the second sequence and determine the coefficient as the target feature value corresponding to the fourth feature group to be fused.
The fifth feature group to be fused includes the N comparison features V5 of the target user i in the N time windows, and the terminal device may calculate the mean of the feature values of the N comparison features V5, and determine the mean as the target feature value corresponding to the fifth feature group to be fused. The sixth feature group to be fused includes the N comparison features V7 of the target user i in the N time windows, and the terminal device may calculate the sum of the feature values of the N comparison features V7, and determine the ratio of this sum to the number N of the historical messages as the target feature value corresponding to the sixth feature group to be fused. The seventh feature group to be fused includes the N comparison features V4, the N comparison features V7, and the N comparison features V8 corresponding to the target user i in the N time windows. The terminal device may calculate, window by window, the ratio of each comparison feature V8 to the comparison feature V4 of the same time window to obtain N ratios. For example, assuming that there are 4 comparison features V8, namely V81, V82, V83 and V84, and 4 comparison features V4, namely V41, V42, V43 and V44, the terminal device can calculate the 4 ratios V81/V41, V82/V42, V83/V43 and V84/V44. Then, the terminal device may determine the sum of the N ratios and the sum of the feature values of the N comparison features V7, and finally determine the ratio of the sum of the N ratios to the sum of the feature values of the N comparison features V7 as the target feature value of the seventh feature group to be fused. The eighth feature group to be fused includes the N comparison features V3 of the target user i in the N time windows. The terminal device may determine N logical values according to the N comparison features V3, where the logical value is 1 when the value of the comparison feature V3 is not 0, and the logical value is 0 when the value of the comparison feature V3 is 0.
The terminal device may calculate the sum of the N logical values, and determine the ratio of this sum to the number N of the historical messages as the target feature value of the eighth feature group to be fused. Finally, the terminal device may combine the 8 target feature values corresponding to the first through eighth feature groups to be fused into the target feature set corresponding to the target user i.
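The eight fusion rules of this example can be sketched in Python as below. This is a hedged illustration rather than the patented implementation: the function names and the dictionary layout of a comparison feature set are hypothetical, and the text does not say what happens when B' is 0, when a sequence is constant, when V4 is 0, or when the sum of V7 is 0, so the fallback values used for those cases are assumptions.

```python
import math
from statistics import mean
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant sequence (assumption)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def fuse(sets: List[Dict[str, float]]) -> List[float]:
    """Fuse N comparison feature sets (keys V1..V8) into 8 target feature values."""
    n = len(sets)
    v1, v2, v3, v4, v5, v6, v7, v8 = ([s[f"V{k}"] for s in sets] for k in range(1, 9))
    b = sum(v1) / n
    f1 = math.log10(b) if b > 0 else float("-inf")     # log10(B'); -inf is an assumption
    f2 = mean(v2)                                      # average of V2
    f3 = mean(v3)                                      # average of V3
    f4 = pearson(v4, v6)                               # similarity of V4 and V6 sequences
    f5 = mean(v5)                                      # mean of V5
    f6 = sum(v7) / n                                   # share of messages with multimedia
    ratios = [m / u if u else 0.0 for m, u in zip(v8, v4)]   # V8/V4 per window
    f7 = sum(ratios) / sum(v7) if sum(v7) else 0.0     # 0.0 fallback is an assumption
    f8 = sum(1 for x in v3 if x != 0) / n              # share of windows with tickets
    return [f1, f2, f3, f4, f5, f6, f7, f8]


# Made-up comparison feature sets for N = 2 time windows.
sets = [
    {"V1": 1, "V2": 2.0, "V3": 3, "V4": 100, "V5": 2, "V6": 50, "V7": 1, "V8": 40},
    {"V1": 1, "V2": 4.0, "V3": 0, "V4": 200, "V5": 3, "V6": 100, "V7": 0, "V8": 0},
]
target = fuse(sets)
```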
Here, it is easily understood that the terminal device may repeat the above feature fusion operation on the N comparison feature sets of each of the M target users, so as to obtain the M target feature sets corresponding to the M target users.
S104, determining a target probability corresponding to each target user according to the target feature set corresponding to each target user, determining a target user uniquely associated with the target account according to the target probability corresponding to each target user, and determining a user communication identifier corresponding to the target user as a target user communication identifier.
In some feasible embodiments, after acquiring M target feature sets corresponding to the M target users, the terminal device may perform data analysis and processing on each target feature set in the M target feature sets to obtain a target probability corresponding to each target user. Then, the terminal device can determine a target user uniquely associated with the target account according to the target probability corresponding to each target user, and determine a user communication identifier corresponding to the target user as a target user communication identifier. The user communication identifier corresponding to the target user can be determined by the call ticket data of the target user. A target probability is used to indicate the degree of association between a target user and the target account.
Taking the process of determining the target probability corresponding to the target user i according to the target feature set j corresponding to the target user i as an example, the process of determining the target probability corresponding to each target user by the terminal device according to the target feature set corresponding to each target user is described below.
Optionally, in a specific implementation, the terminal device may input the target feature set j into a classification model trained in advance, and then determine the target probability corresponding to the target user i according to the output result of the classification model. Here, the classification model may include a classification model based on a random forest machine learning algorithm, a classification model based on a neural network algorithm, and the like, which is not limited herein. Similarly, by sequentially inputting the target feature set corresponding to each of the M target users into the classification model, the terminal device can determine the M target probabilities corresponding to the M target users according to the output results of the classification model.
After the terminal device obtains the M target probabilities corresponding to the M users, the terminal device may determine, according to the target probabilities corresponding to the target users, a target user uniquely associated with the target account. Optionally, the terminal device may determine a maximum target probability among the M target probabilities, and determine a target user corresponding to the maximum target probability as a target user uniquely associated with the target account. And finally, the terminal equipment can extract the user communication identification corresponding to the target user from the ticket data corresponding to the target user, and determines the user communication identification corresponding to the target user as the target user communication identification uniquely matched with the target account.
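The selection of the maximum target probability can be sketched as follows. The trained classification model is replaced here by a toy logistic scoring function, which is purely a stand-in and not the model of the embodiment. The names `pick_target_user` and `toy_model`, the user labels and the feature values are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple


def pick_target_user(feature_sets: Dict[str, List[float]],
                     predict_proba: Callable[[List[float]], float]) -> Tuple[str, float]:
    """Score every target user's feature set and return the user with the highest probability."""
    probs = {user: predict_proba(fs) for user, fs in feature_sets.items()}
    best = max(probs, key=probs.get)
    return best, probs[best]


def toy_model(features: List[float]) -> float:
    """Stand-in for a trained classifier: logistic function of the feature sum (assumption)."""
    return 1.0 / (1.0 + math.exp(-sum(features)))


# Made-up target feature sets keyed by a placeholder user communication identifier.
feature_sets = {
    "id-A": [0.2, -1.0],
    "id-B": [1.5, 0.5],
    "id-C": [-0.3, 0.1],
}
best_user, best_prob = pick_target_user(feature_sets, toy_model)
```

Any model that returns one probability per target feature set could be passed in place of `toy_model`; the source names random forest and neural network classifiers as options.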
Optionally, the terminal device may further train a preset classification model to be trained to obtain the trained classification model. Specifically, the terminal device may obtain a positive sample feature set and a negative sample feature set corresponding to E positive sample users. In the following, a process of acquiring the positive sample target feature set and the negative sample target feature set corresponding to the positive sample user c by the terminal device is taken as an example. The terminal device can first acquire message attribute information corresponding to F pieces of historical messages issued by a positive sample user c on a target social platform through a sample account in a past preset time period. And then F time windows are determined based on F issuing moments corresponding to the F historical messages. For a specific process, reference may be made to the process for determining N time windows described above, and details are not repeated here. Then, the terminal device may obtain F pieces of ticket data attribute information corresponding to the positive sample user c over the F time windows, and determine F sample comparison feature sets corresponding to the positive sample user c over the F time windows based on the F pieces of ticket data attribute information corresponding to the positive sample user c over the F time windows and the message attribute information corresponding to the F pieces of historical messages. Meanwhile, the terminal device may further obtain a negative sample target feature set corresponding to each of one or more negative sample users other than the sample user c. Similarly, the above operations are repeated, and the terminal device may obtain E positive sample target feature sets corresponding to the E positive sample users and negative sample target feature sets corresponding to a plurality of negative sample users. 
Then, the terminal device may label E positive sample target feature sets corresponding to the E positive sample users and negative sample target feature sets corresponding to the negative sample users. The label is used for indicating whether a sample user associated with the corresponding sample target feature set is uniquely matched with the sample account. For example, a positive sample target feature set may be labeled as 1 and a negative sample target feature set may be labeled as 0. And finally, the terminal equipment can sequentially input the E labeled positive sample target feature sets and the negative sample target feature sets corresponding to the negative sample users into the classification model to be trained, and repeatedly train the classification model to be trained until the model parameters of the classification model to be trained are converged, so that the trained classification model can be obtained.
Optionally, in practical application, in order to expand the number of positive samples or negative samples, after the terminal device obtains the F sample comparison feature sets corresponding to the positive sample user c, it may randomly sample one or more sample comparison feature sets from the F sample comparison feature sets. For example, the terminal device can randomly sample F times, where 1 sample comparison feature set is sampled the first time, 2 sample comparison feature sets are sampled the second time, and so on, until F sample comparison feature sets are sampled the F-th time. Feature fusion is then performed on each of the F combinations of sample comparison feature sets obtained by the F rounds of sampling, so as to obtain F positive sample target feature sets corresponding to the positive sample user c. Similarly, the negative sample target feature sets may also be expanded by using the above method.
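A minimal sketch of this sample expansion, assuming sampling without replacement within each round (the text does not specify this), might look as follows. The function names are hypothetical, and `toy_fuse` is a deliberately simplified stand-in for the full feature fusion.

```python
import random
from typing import Callable, Dict, List


def expand_samples(sample_sets: List[Dict[str, float]],
                   fuse: Callable[[List[Dict[str, float]]], List[float]],
                   rng: random.Random) -> List[List[float]]:
    """Round k (k = 1..F) draws k comparison feature sets and fuses them into one sample."""
    f = len(sample_sets)
    return [fuse(rng.sample(sample_sets, k)) for k in range(1, f + 1)]


def toy_fuse(subset: List[Dict[str, float]]) -> List[float]:
    """Simplified fusion used only for illustration: the mean of V1 over the subset."""
    return [sum(s["V1"] for s in subset) / len(subset)]


# Made-up sample comparison feature sets for F = 4 historical messages.
sample_sets = [{"V1": 1}, {"V1": 0}, {"V1": 1}, {"V1": 1}]
expanded = expand_samples(sample_sets, toy_fuse, random.Random(0))
```

The last round always draws all F sets, so its fused result equals the fusion of the complete set regardless of the random seed.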
In the embodiment of the invention, after acquiring N message attribute information corresponding to N historical messages issued by a target social account on a social platform and call ticket attribute information of M users on a time window corresponding to each historical message, feature extraction can be performed on the message attribute information and the call ticket attribute information to obtain M target feature sets corresponding to M users. Then, according to the M feature sets, M target probabilities which can be used for indicating the degree of association between the target user and the target social account are determined. And finally, determining a target user uniquely associated with the target social account from the M target users according to the M target probabilities, and determining a user communication identifier corresponding to the target user in a communication system as a target user communication identifier. The association degree of a certain target user and a target social account is determined through information comparison and statistics between the message attribute of the historical message and ticket attribute information of a certain user on a time window corresponding to the historical message, and a target user communication identifier uniquely associated with the target social account is further determined, so that a network user corresponding to the social account can be accurately positioned by the communication network, then fault analysis and solution are performed in a targeted manner, the network complaint feedback efficiency of the communication network can be improved, and the user experience of the communication network is improved.
Example two
Referring to fig. 3, fig. 3 is a schematic structural diagram of a device for determining a user communication identifier according to an embodiment of the present invention.
The determination device includes:
the message attribute information determining unit 10 is configured to acquire N pieces of message attribute information corresponding to N pieces of history messages issued by the target account on the target social platform. Here, one history message corresponds to one message attribute information.
The ticket attribute information determining unit 20 is configured to determine N time windows according to the release time corresponding to each historical message acquired by the message attribute information determining unit 10, and determine ticket attribute information of any one of M target users on any one time window according to target ticket data in any one time window, so as to obtain N ticket attribute information of each target user of the M target users on the N time windows. Here, the target user is a communication network user having service interaction with the target social platform within each time window, and the target call ticket data is call ticket data associated with the target user.
An information feature extraction unit 30, configured to perform information feature extraction based on the N pieces of message attribute information determined by the message attribute information determination unit 10 and the N pieces of ticket attribute information of each target user of the M target users on the N time windows determined by the ticket attribute information determination unit 20, so as to obtain M target feature sets corresponding to the M target users.
A user communication identifier determining unit 40, configured to determine a target probability corresponding to each target user according to the target feature set corresponding to each target user determined by the information feature extracting unit 30, determine a target user uniquely associated with the target account according to the target probability corresponding to each target user, and determine a user communication identifier corresponding to the target user as a target user communication identifier. Here, a target probability is used to indicate the degree of association between a target user and the target account.
In some possible embodiments, the ticket attribute information determining unit 20 is configured to:
and acquiring a preset time period threshold t. And determining a time window TDi corresponding to any historical message i according to the preset time period threshold t and the release time Ti corresponding to any historical message i to obtain N time windows corresponding to N historical messages. Here, TDi = [Ti - t, Ti + t].
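For illustration, the construction of the N time windows from the release times Ti and the threshold t might be sketched in Python as follows. The function names and the use of seconds are assumptions, as is treating both window endpoints as inclusive.

```python
from typing import List, Tuple


def time_windows(release_times: List[float], t: float) -> List[Tuple[float, float]]:
    """Return one window (Ti - t, Ti + t) per historical message release time Ti."""
    return [(ti - t, ti + t) for ti in release_times]


def in_window(timestamp: float, window: Tuple[float, float]) -> bool:
    """Check whether a ticket timestamp falls inside a window (endpoints inclusive)."""
    low, high = window
    return low <= timestamp <= high


# Made-up release times (seconds) and threshold.
windows = time_windows([100, 250], 30)
```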
In some possible embodiments, the information feature extraction unit 30 is configured to:
and performing the following information feature extraction operation on the message attribute information and the ticket attribute information for any target user i in the M target users: and determining a comparison feature set of the target user i on any time window according to comparison and statistics of the ticket attribute information of the target user i on any time window determined by the ticket attribute information determination unit 20 and the message attribute information on any time window determined by the message attribute information determination unit 10, so as to obtain N comparison feature sets corresponding to the target user i on N time windows. Here, one comparison feature set includes S different kinds of comparison features. And performing feature fusion on the N comparison feature sets corresponding to the target user i on the N time windows to obtain a target feature set corresponding to the target user i. And determining M target feature sets corresponding to the M target users according to the information feature extraction results of the message attribute information and the ticket attribute information corresponding to each target user.
In some possible embodiments, the S different kinds of comparison features include one or more of the following: a same-originating-terminal-type flag, a service occurrence time difference, a number of tickets, an uplink traffic size, a relative downlink traffic size, a historical message size, a multimedia data flag, and a multimedia data size.
In some possible embodiments, the information feature extraction unit 30 is configured to:
and determining U feature groups to be fused from N comparison feature sets corresponding to the target user i in N time windows. And one feature group to be fused comprises one or more comparison features of the target user i on each time window. And determining a target characteristic value corresponding to any feature group to be fused according to a feature fusion result of the comparison features included in any feature group to be fused so as to obtain U target characteristic values corresponding to the U feature groups to be fused. And determining a target feature set corresponding to the target user i according to the U target feature values.
In some possible embodiments, the U feature groups to be fused include a first feature group to be fused, where the first feature group to be fused includes a first comparison feature of the target user i in each time window. The information feature extraction unit 30 is configured to: and calculating the average value of the characteristic values of the first comparison characteristics on each time window, and determining the average value as the target characteristic value corresponding to the first characteristic group to be fused.
In some possible embodiments, the U feature groups to be fused include a second feature group to be fused, and the second feature group to be fused includes a second comparison feature and a third comparison feature of the target user i in each time window. The information feature extraction unit 30 is configured to: and calculating similarity values between the second comparison features on each time window and the third comparison features on each time window, and determining the similarity values as target feature values corresponding to the second feature group to be fused.
In some possible embodiments, the U feature groups to be fused include a third feature group to be fused, and the third feature group to be fused includes a fourth comparison feature of the target user i in each time window. The information feature extraction unit 30 is configured to: and calculating the sum of the feature values of the fourth comparison features on each time window, and determining the ratio of the sum of the feature values to the number N of the historical messages as a target feature value corresponding to the third feature group to be fused.
In some possible embodiments, the user communication identification determination unit 40 is configured to:
and sequentially inputting the target feature sets corresponding to the target users determined by the information feature extraction unit into a preset classification model, and determining the target probability corresponding to each target user based on the classification result of the target feature set corresponding to each target user by the classification model.
In some possible embodiments, the user communication identifier determining unit 40 is configured to:
determine, among the target probabilities corresponding to the target users, the target user corresponding to the maximum target probability as the target user uniquely associated with the target account.
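By way of example only, scoring each target feature set and selecting the user with the maximum target probability might look like the sketch below. The logistic scoring function, its weights and bias, and the user identifiers are all hypothetical stand-ins, since the embodiment only requires a preset classification model without specifying its type.

```python
import math
from typing import Dict, List


def target_probability(features: List[float], weights: List[float], bias: float) -> float:
    """Stand-in classifier: logistic scoring with hypothetical parameters."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def select_target_user(feature_sets: Dict[str, List[float]],
                       weights: List[float], bias: float) -> str:
    """Score every target user in turn and return the one with the largest probability."""
    probabilities = {
        user_id: target_probability(features, weights, bias)
        for user_id, features in feature_sets.items()
    }
    return max(probabilities, key=probabilities.get)


# Hypothetical target feature sets for M = 3 target users (identifiers are made up).
feature_sets = {
    "user_a": [0.2, 0.1, 0.25],
    "user_b": [0.9, 0.8, 0.75],
    "user_c": [0.5, 0.4, 0.50],
}
best = select_target_user(feature_sets, weights=[1.0, 1.0, 1.0], bias=-1.5)
```

The user communication identifier recorded for the selected user would then be taken as the target user communication identifier.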
In some possible embodiments, the message attribute information determining unit 10 may obtain N pieces of message attribute information corresponding to N historical messages published by the target account on the target social platform, where one historical message corresponds to one piece of message attribute information. For the specific process, reference may be made to the process of obtaining the N pieces of message attribute information corresponding to the N historical messages described in step S101 in the first embodiment, and details are not repeated here. The ticket attribute information determining unit 20 may determine N time windows according to the release time corresponding to each historical message acquired by the message attribute information determining unit 10. For the specific process, reference may be made to the process of determining the N time windows described in step S102 in the first embodiment, and details are not repeated here. Then, the ticket attribute information determining unit 20 may determine the ticket attribute information of any one of the M target users in any time window according to the target ticket data in that time window, so as to obtain N pieces of ticket attribute information of each of the M target users in the N time windows. For the specific process, reference may be made to the process of determining each piece of ticket attribute information described in step S102 in the first embodiment, and details are not repeated here.
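As an illustrative sketch only, the time windows and the selection of ticket data falling inside a window could be expressed as follows. It assumes that the window for a message released at time Ti is the closed interval from Ti - t to Ti + t, and the record field names ("user", "start_time") are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Window = Tuple[datetime, datetime]


def time_windows(release_times: List[datetime], t: timedelta) -> List[Window]:
    """One window [Ti - t, Ti + t] per historical message released at Ti."""
    return [(ti - t, ti + t) for ti in release_times]


def tickets_in_window(tickets: List[dict], window: Window) -> List[dict]:
    """Keep the ticket records whose start time falls inside the window."""
    start, end = window
    return [rec for rec in tickets if start <= rec["start_time"] <= end]


# Hypothetical release times of N = 2 historical messages and a 5-minute threshold t.
release_times = [datetime(2018, 12, 1, 10, 0), datetime(2018, 12, 2, 18, 30)]
windows = time_windows(release_times, t=timedelta(minutes=5))

# Hypothetical ticket records of one communication network user.
tickets = [
    {"user": "user_a", "start_time": datetime(2018, 12, 1, 10, 3)},
    {"user": "user_a", "start_time": datetime(2018, 12, 1, 12, 0)},
]
first_window_tickets = tickets_in_window(tickets, windows[0])
```

Only the first record lies within five minutes of the first release time, so it alone would contribute to the ticket attribute information for that window.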
Then, the information feature extraction unit 30 may perform information feature extraction based on the N pieces of message attribute information determined by the message attribute information determining unit 10 and the N pieces of ticket attribute information of each of the M target users in the N time windows determined by the ticket attribute information determining unit 20, so as to obtain M target feature sets corresponding to the M target users. For the specific process, reference may be made to the process of determining the M target feature sets corresponding to the M target users described in step S103 in the first embodiment, and details are not repeated here. Finally, the user communication identifier determining unit 40 may determine the target probability corresponding to each target user according to the target feature set corresponding to that target user determined by the information feature extraction unit 30. The user communication identifier determining unit 40 may then determine the target user uniquely associated with the target account according to the target probabilities corresponding to the target users, and determine the user communication identifier corresponding to that target user as the target user communication identifier. For the specific process, reference may be made to the process of determining the target user communication identifier described in step S104 in the first embodiment, and details are not repeated here.
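Purely as an illustration of the comparison-and-statistics step, a comparison feature set for one target user in one time window might be computed as in the sketch below. The record fields ("release_time", "terminal_type", "start_time", "uplink_bytes"), the choice of the nearest ticket for the time difference, and the four features shown are assumptions made for the example, not details taken from the embodiment.

```python
from datetime import datetime
from typing import Dict, List


def comparison_features(message: Dict, tickets: List[Dict]) -> Dict[str, float]:
    """Compare one historical message with a user's tickets in its time window.

    Assumes `tickets` is non-empty, since a target user has service
    interaction with the social platform in each time window.
    """
    def gap_seconds(rec: Dict) -> float:
        return abs((rec["start_time"] - message["release_time"]).total_seconds())

    nearest = min(tickets, key=gap_seconds)  # ticket closest to the release time
    return {
        "same_terminal_type": 1.0 if nearest["terminal_type"] == message["terminal_type"] else 0.0,
        "time_diff_s": gap_seconds(nearest),
        "ticket_count": float(len(tickets)),
        "uplink_bytes": float(sum(rec["uplink_bytes"] for rec in tickets)),
    }


# Hypothetical message attribute information and ticket records for one window.
message = {"release_time": datetime(2018, 12, 1, 10, 0), "terminal_type": "android"}
tickets = [
    {"start_time": datetime(2018, 12, 1, 10, 3), "terminal_type": "android", "uplink_bytes": 1200},
    {"start_time": datetime(2018, 12, 1, 9, 58), "terminal_type": "android", "uplink_bytes": 800},
]
features = comparison_features(message, tickets)
```

Repeating this for each of the N time windows would yield the N comparison feature sets that are subsequently fused into the target feature set of the user.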
In the embodiment of the invention, the degree of association between a target user and a target social account is determined by comparing, and computing statistics over, the message attribute information of each historical message and the ticket attribute information of that user in the time window corresponding to the historical message, and the target user communication identifier uniquely associated with the target social account is then determined. The communication network can thereby accurately locate the network user corresponding to the social account and analyze and resolve faults in a targeted manner, which can improve the efficiency of handling network complaints and improve the user experience of the communication network.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device provided by the embodiment of the present invention includes a processor 401, a memory 402, and a bus system 403. The processor 401 and the memory 402 are connected by a bus system 403.
The memory 402 is used for storing programs. In particular, a program may include program code, and the program code includes computer operating instructions. The memory 402 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM). Only one memory is shown in fig. 4, but multiple memories may be provided as needed.
The memory 402 may also be a memory in the processor 401, which is not limited herein.
The memory 402 stores the following elements, executable modules or data structures, or a subset thereof, or an extended set thereof:
Operating instructions: including various operating instructions for implementing various operations.
Operating system: including various system programs for implementing various basic services and for handling hardware-based tasks.
The processor 401 controls the operation of the electronic device, and the processor 401 may be one or more Central Processing Units (CPUs). In the case where the processor 401 is one CPU, the CPU may be a single-core CPU or a multi-core CPU.
In a particular application, the components of the electronic device are coupled together by the bus system 403, where the bus system 403 may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as the bus system 403 in fig. 4, and the bus system is drawn only schematically in fig. 4.
The method for determining a user communication identifier disclosed in the embodiment of the invention can be applied to the processor 401, or can be implemented by the processor 401. The processor 401 may be an integrated circuit chip having signal processing capabilities.
An embodiment of the present invention provides a computer-readable storage medium, which stores instructions that, when executed on a computer, implement the method for determining a user communication identifier described in the first embodiment.
The computer-readable storage medium may be an internal storage unit of the device for determining the user communication identifier in the first embodiment. The computer-readable storage medium may also be an external storage device of the terminal device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the terminal device. The computer-readable storage medium stores the computer program and other programs and data required by the terminal device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
One of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Claims (22)
1. A method for determining a user communication identifier, the method comprising:
acquiring N pieces of message attribute information corresponding to N pieces of historical messages issued by a target account on a target social platform, wherein one piece of historical message corresponds to one piece of message attribute information;
determining N time windows according to the release time corresponding to each historical message, and determining the ticket attribute information of any target user in M target users on any time window according to the target ticket data in any time window to obtain the N ticket attribute information of each target user in M target users on the N time windows, wherein the target user is a communication network user with service interaction with the target social platform in each time window, and the target ticket data is ticket data associated with the target user;
performing information feature extraction based on the N message attribute information and N ticket attribute information of each target user in the M target users on the N time windows to obtain M target feature sets corresponding to the M target users;
determining a target probability corresponding to each target user according to the target feature set corresponding to each target user, determining a target user uniquely associated with the target account according to the target probability corresponding to each target user, and determining a user communication identifier corresponding to the target user as a target user communication identifier, wherein one target probability is used for indicating the association degree between one target user and the target account.
2. The method of claim 1, wherein the determining N time windows according to the publication time corresponding to each historical message comprises:
acquiring a preset time period threshold t;
determining a time window TDi corresponding to any historical message i according to the preset time period threshold t and the release time Ti corresponding to any historical message i to obtain N time windows corresponding to N historical messages;
wherein TDi = [Ti - t, Ti + t].
3. The method of claim 2, wherein the performing feature extraction based on the N message attribute information and the N ticket attribute information of each target user in the M target users over the N time windows to obtain M target feature sets corresponding to the M target users comprises:
performing the following information feature extraction operation on the message attribute information and the ticket attribute information for any target user i in the M target users:
determining a comparison feature set of the target user i on any time window according to comparison and statistics of the ticket attribute information of the target user i on any time window and the message attribute information of the target user i on any time window so as to obtain N comparison feature sets corresponding to the target user i on N time windows, wherein one comparison feature set comprises S comparison features of different types;
performing feature fusion on N comparison feature sets corresponding to the target user i on N time windows to obtain a target feature set corresponding to the target user i;
and determining M target feature sets corresponding to the M target users according to the information feature extraction results of the message attribute information and the call ticket attribute information corresponding to each target user.
4. The method of claim 3, wherein the S comparison features of different types comprise one or more of: a flag indicating whether the initiating terminal type is the same, a service occurrence time difference, a ticket count, an uplink traffic size, a relative downlink traffic size, a historical message size, a multimedia data flag, and a multimedia data size.
5. The method according to claim 3 or 4, wherein the performing feature fusion on the N comparison feature sets corresponding to the target user i over N time windows to obtain the target feature set corresponding to the target user i comprises:
determining U feature groups to be fused from N comparison feature sets corresponding to the target user i on N time windows, wherein one feature group to be fused comprises one or more comparison features of the target user i on each time window;
determining a target characteristic value corresponding to any feature group to be fused according to a feature fusion result of the comparison features included in the feature group to be fused to obtain U target characteristic values corresponding to the U feature groups to be fused;
and determining a target feature set corresponding to the target user i according to the U target feature values.
6. The method according to claim 5, wherein the U feature groups to be fused include a first feature group to be fused, and the first feature group to be fused includes a first comparison feature of the target user i in each time window;
the determining, according to the feature fusion result of the comparison features included in any feature group to be fused, a target feature value corresponding to any feature group to be fused includes:
calculating the average of the feature values of the first comparison feature over the time windows, and determining the average as the target feature value corresponding to the first feature group to be fused.
7. The method according to claim 5 or 6, wherein the U feature groups to be fused include a second feature group to be fused, and the second feature group to be fused includes a second comparison feature and a third comparison feature of the target user i in each time window;
the determining, according to the feature fusion result of the comparison features included in any feature group to be fused, a target feature value corresponding to any feature group to be fused includes:
calculating a similarity value between the second comparison features in the time windows and the third comparison features in the time windows, and determining the similarity value as the target feature value corresponding to the second feature group to be fused.
8. The method according to any one of claims 5 to 7, wherein the U feature groups to be fused include a third feature group to be fused, and the third feature group to be fused includes a fourth comparison feature of the target user i in each time window;
the determining, according to the feature fusion result of the comparison features included in any feature group to be fused, a target feature value corresponding to any feature group to be fused includes:
calculating the sum of the feature values of the fourth comparison feature over the time windows, and determining the ratio of that sum to the number N of historical messages as the target feature value corresponding to the third feature group to be fused.
9. The method according to any one of claims 1 to 8, wherein the determining the target probability corresponding to each target user according to the target feature set corresponding to each target user comprises:
sequentially inputting the target feature sets corresponding to the target users into a preset classification model, and determining the target probability corresponding to each target user based on the classification result produced by the classification model for the target feature set corresponding to that target user.
10. The method according to any one of claims 1 to 9, wherein the determining, according to the target probability corresponding to each target user, a target user uniquely associated with the target account includes:
determining, among the target probabilities corresponding to the target users, the target user corresponding to the maximum target probability as the target user uniquely associated with the target account.
11. An apparatus for determining a user communication identifier, the apparatus comprising:
the message attribute information determining unit is used for acquiring N pieces of message attribute information corresponding to N pieces of historical messages issued by a target account on a target social platform, wherein one piece of historical message corresponds to one piece of message attribute information;
a ticket attribute information determining unit, configured to determine N time windows according to release moments corresponding to the historical messages, and determine ticket attribute information of any one of M target users on any one time window according to target ticket data in any one time window, so as to obtain N ticket attribute information of each target user in the M target users on the N time windows, where the target user is a communication network user having service interaction with the target social platform in each time window, and the target ticket data is ticket data associated with the target user;
an information feature extraction unit, configured to perform information feature extraction based on the N pieces of message attribute information determined by the message attribute information determination unit and the N pieces of ticket attribute information of each target user of the M target users on the N time windows determined by the ticket attribute information determination unit, so as to obtain M target feature sets corresponding to the M target users;
and the user communication identifier determining unit is used for determining a target probability corresponding to each target user according to the target feature set corresponding to each target user determined by the information feature extracting unit, determining a target user uniquely associated with the target account according to the target probability corresponding to each target user, and determining a user communication identifier corresponding to the target user as a target user communication identifier, wherein one target probability is used for indicating the association degree between one target user and the target account.
12. The apparatus of claim 11, wherein the ticket attribute information determining unit is configured to:
acquiring a preset time period threshold t;
determining a time window TDi corresponding to any historical message i according to the preset time period threshold t and the release time Ti corresponding to any historical message i to obtain N time windows corresponding to N historical messages;
wherein TDi = [Ti - t, Ti + t].
13. The determination apparatus according to claim 12, wherein the information feature extraction unit is configured to:
performing the following information feature extraction operation on the message attribute information and the ticket attribute information for any target user i in the M target users:
determining a comparison feature set of the target user i on any time window according to comparison and statistics of the ticket attribute information of the target user i on any time window determined by the ticket attribute information determination unit and the message attribute information of the target user i on any time window determined by the message attribute information determination unit, so as to obtain N comparison feature sets corresponding to the target user i on N time windows, wherein one comparison feature set comprises S comparison features of different types;
performing feature fusion on N comparison feature sets corresponding to the target user i on N time windows to obtain a target feature set corresponding to the target user i;
and determining M target feature sets corresponding to the M target users according to the information feature extraction results of the message attribute information and the call ticket attribute information corresponding to each target user.
14. The apparatus according to claim 13, wherein the S comparison features of different types comprise one or more of: a flag indicating whether the initiating terminal type is the same, a service occurrence time difference, a ticket count, an uplink traffic size, a relative downlink traffic size, a historical message size, a multimedia data flag, and a multimedia data size.
15. The determination apparatus according to claim 13 or 14, wherein the information feature extraction unit is configured to:
determining U feature groups to be fused from N comparison feature sets corresponding to the target user i on N time windows, wherein one feature group to be fused comprises one or more comparison features of the target user i on each time window;
determining a target characteristic value corresponding to any feature group to be fused according to a feature fusion result of the comparison features included in the feature group to be fused to obtain U target characteristic values corresponding to the U feature groups to be fused;
and determining a target feature set corresponding to the target user i according to the U target feature values.
16. The apparatus according to claim 15, wherein the U feature groups to be fused include a first feature group to be fused, and the first feature group to be fused includes a first comparison feature of the target user i in each time window;
the information feature extraction unit is configured to: calculate the average of the feature values of the first comparison feature over the time windows, and determine the average as the target feature value corresponding to the first feature group to be fused.
17. The apparatus according to claim 15 or 16, wherein the U feature groups to be fused include a second feature group to be fused, and the second feature group to be fused includes a second comparison feature and a third comparison feature of the target user i in each time window;
the information feature extraction unit is configured to: calculate a similarity value between the second comparison features in the time windows and the third comparison features in the time windows, and determine the similarity value as the target feature value corresponding to the second feature group to be fused.
18. The apparatus according to any one of claims 15 to 17, wherein the U feature groups to be fused include a third feature group to be fused, and the third feature group to be fused includes a fourth comparison feature of the target user i in each time window;
the information feature extraction unit is configured to: calculate the sum of the feature values of the fourth comparison feature over the time windows, and determine the ratio of that sum to the number N of historical messages as the target feature value corresponding to the third feature group to be fused.
19. The determination apparatus according to any of claims 11-18, wherein the user communication identifier determining unit is configured to:
sequentially input the target feature sets corresponding to the target users determined by the information feature extraction unit into a preset classification model, and determine the target probability corresponding to each target user based on the classification result produced by the classification model for the target feature set corresponding to that target user.
20. The determination apparatus according to any of claims 11-19, wherein the user communication identifier determining unit is configured to:
determine, among the target probabilities corresponding to the target users, the target user corresponding to the maximum target probability as the target user uniquely associated with the target account.
21. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1-10.
22. An electronic device, comprising a memory for storing program code, a processor for invoking the program code stored by the memory to perform the method of any of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811653353.6A CN111385136B (en) | 2018-12-29 | 2018-12-29 | Method and device for determining user communication identifier |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111385136A (en) | 2020-07-07 |
CN111385136B (en) | 2023-01-06 |
Family
ID=71221249
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811653353.6A, Active, CN111385136B (en) | Method and device for determining user communication identifier | 2018-12-29 | 2018-12-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111385136B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN111385136B (en) | 2023-01-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||