WO2018130201A1 - Method for determining associated account, server and storage medium

Info

Publication number
WO2018130201A1
Authority
WO
WIPO (PCT)
Prior art keywords
user account
usage
terminal device
preset
score
Prior art date
Application number
PCT/CN2018/072381
Other languages
French (fr)
Chinese (zh)
Inventor
戴智君
谢毅
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2018130201A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40: Support for services or applications
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/08: Network architectures or network communication protocols for network security for authentication of entities
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50: Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5061: Network service management, e.g. ensuring proper service fulfilment according to agreements characterised by the interaction between service providers and their network customers, e.g. customer relationship management
    • H04L41/5064: Customer relationship management
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/51: Discovery or management thereof, e.g. service location protocol [SLP] or web services

Definitions

  • The present application relates to Internet technologies in the field of communications, and in particular to a method, server and storage medium for determining associated accounts.
  • A user works with a business application through a client, for example a video application, and issues service requests such as a request to play a video or a request to download a video.
  • The server collects these service requests and identifies the user according to them.
  • Based on confidence and similar measures, the server intelligently recommends relevant media information (such as a variety-show video or a TV drama video) or user information to the user, which helps the user find more media information of interest or other users with the same hobbies.
  • For example, when the user watches a video, the video client plays the video according to the type of video the user selected, and the server intelligently recommends associated videos to the user according to the type of video being played, which makes selection easier for the user. How to mine friends or items with the same interest from massive item data is therefore a problem that needs attention.
  • The example of the present application provides a method for determining an associated account, which is applied to a server, and the method includes:
  • the usage data of each first user account includes: the identifier of the terminal device, the first user account, and a usage record of the first user account;
  • The examples of the present application also provide a server comprising one or more processors and one or more memories, the one or more memories storing computer readable instructions configured to be executed by the one or more processors to implement:
  • the usage data of each first user account includes: an identifier of the terminal device, the first user account, and a usage record of the first user account;
  • The examples of the present application also provide a non-transitory computer readable storage medium storing computer readable instructions that cause at least one processor to perform the method described above.
  • FIG. 1 is a schematic diagram of various hardware entities in a data processing system in an example of the present application.
  • FIG. 2A is a schematic flowchart of an account identification association method provided by an example of the present application.
  • FIG. 2B is a schematic diagram 1 of a framework for an account identification association method provided by an example of the present application.
  • FIG. 3 is a schematic diagram 2 of a framework for an account identification association method provided by an example of the present application.
  • FIG. 4 is a schematic diagram 3 of a framework for an account identification association method provided by an example of the present application.
  • FIG. 5 is a diagram 1 showing an exemplary relationship between a terminal and user accounts provided by an example of the present application.
  • FIG. 6 is a diagram 2 showing an exemplary relationship between a terminal and user accounts provided by an example of the present application.
  • FIG. 7 is a schematic diagram 4 of a framework for an account identification association method provided by an example of the present application.
  • FIG. 8 is a schematic structural diagram 1 of a server provided by an example of the present application.
  • FIG. 9 is a schematic structural diagram 2 of a server provided by an example of the present application.
  • FIG. 10 is a schematic structural diagram 3 of a server provided by an example of the present application.
  • Terminal device: a mobile electronic device, also known as a mobile device, handheld device, or wearable device. It is a computing device based on an embedded chip and usually has a small display, touch input, or a small keyboard.
  • Machine learning: relies on theories of probability, statistics, and neural communication to enable computers to simulate human learning behavior, acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve their own performance.
  • Model training: manually selected samples are input to the machine learning system, and the model parameters are continuously adjusted to optimize the accuracy with which the samples are recognized.
  • International Mobile Equipment Identity (IMEI): the unique identification number of a mobile phone.
  • RFM model: in the RFM model, R (Recency) indicates how long ago the customer last used the service, F (Frequency) indicates the number of times the customer used it in the most recent period, and M (Monetary) indicates the amount the customer spent in the most recent period. In this scheme, R indicates the reporting time, F the reporting frequency, and M the reporting source.
  • IMEI-user account: relationship data between a terminal (IMEI) and a user account.
  • FIG. 1 is a schematic diagram of the hardware entities in the architecture of a data processing system in an example of the present application.
  • FIG. 1 includes one or more servers 1, terminal devices 21-25, and a network 3. The network 3 contains network entities such as routers and gateways, which are not shown in FIG. 1.
  • The terminal devices 21-25 exchange service product information with the server 1 over a wired or wireless network, so that time-related data generated by users' use of the terminal devices 21-25 is acquired from them and transmitted to the server 1.
  • The data is usage data generated by the user using the terminal device.
  • The usage data may include an identifier of the terminal device, a user account, and a historical access record (recorded data of the user browsing news and articles, watching videos, and visiting social networking sites).
  • The types of terminal device shown in FIG. 1 include a mobile phone (terminal 23), a tablet or PDA (terminal 25), a desktop computer (terminal 22), a PC (terminal 24), an all-in-one computer (terminal 21), and the like.
  • The terminal device is equipped with the application function modules that users need, such as applications with entertainment functions (for example a video application, an audio playback application, a game application, or reading software), applications with service functions (for example a map navigation application, a group purchase application, or a camera application), and system functions such as a settings application.
  • Based on the hardware entities shown in FIG. 1, a user generates usage data by using the terminal or an application on the terminal. The usage data includes the identifier of the terminal device, the user account, and the usage record generated by the user's use of the terminal device.
  • The user account may be a communication account corresponding to the terminal (for example, a mobile phone number) or a login account of an application on the terminal device.
  • The terminal device transmits the usage data to the server 1.
  • The identifier of the terminal device may be an IMEI (International Mobile Equipment Identity).
  • The server uses the usage data, acquired from the terminal device, of each user account corresponding to that terminal device to calculate a score for each user account. Combining each score with a preset correspondence between scores and confidence levels, the server determines the confidence of each user account. The confidence characterizes how accurate it is to treat a user account as the user account associated with the terminal device. A confidence level is then selected according to a preset selection rule, and associated data may be pushed to the terminal device according to the usage data of the user account corresponding to the selected confidence level.
  • the server 1 can be a push platform, such as an advertisement push platform, an article push platform, and the like.
  • The account identification association method provided by the example of the present application is applied to the server. Based on the usage data of each user account, a preset association recommendation model produces a score for each user account, and the user account associated with the terminal is then determined.
  • By determining similarity or relevance, the server intelligently recommends to a first user relevant data about a second user or an item associated with what the first user follows or requests.
  • The server uses the most frequently used user account as the user account associated with the terminal.
  • FIG. 1 is only one example of a system architecture that implements an example of the present application.
  • The examples of the present application are not limited to the system structure described in FIG. 1; the various examples of the present application are proposed based on this system architecture.
  • the example of the present application provides a method for determining an associated account. As shown in FIG. 2A, the method may include:
  • Obtain the usage data (corresponding to the relationship data mentioned above) generated when one or more users use the terminal device through their respective first user accounts. The usage data of each first user account includes: an identifier of the terminal device, the first user account, and a usage record of the first user account. Then determine, according to the usage record corresponding to each first user account, the usage parameters of at least two dimensions of each first user account (corresponding to the at least two dimensions mentioned above).
  • The account identification association method is the process in which the server takes the usage data of each user account of the terminal, obtains scores for the multiple user accounts corresponding to the terminal through a preset association recommendation model, and thereby determines the user account associated with the terminal.
  • When a user uses the terminal device or an application on the terminal, the terminal reports the usage data to the server 1.
  • The format of the usage data may be: {identifier of the terminal device, user account identifier, usage record}, where the usage record includes the reporting time of the usage data and the source of use.
  • The reporting time of the usage data may be the time of logging in to the first user account or the time of logging out of the first user account.
  • The source of use may be the identifier of the application on the terminal device that the user uses through the user account, for example WeChat, Weibo, news, or video, or the identifier of a website that the user browses using the user account.
  • Sources of use are further divided into active sources and passive sources.
  • An active source is the source of usage data actively reported by a terminal served by the server in this application.
  • A passive source is the source of usage data obtained from other platforms.
  • The multiple pieces of usage data of a terminal constitute the first history record of that terminal.
  • The first history record includes multiple pieces of usage data in the foregoing format, for example {terminal 1, account 1, usage record 1}, {terminal 1, account 2, usage record 2}, {terminal 1, account 1, usage record 1}, {terminal 1, account 2, usage record 2}.
  • the usage data of the terminal acquired by the server may be the usage data reported by the terminal in the latest period of time, for example, the usage data reported by the terminal in the most recent month, so that the determined user account associated with the terminal is more accurate.
  • the terminal in the example of the present application is an electronic device in which various applications are installed.
  • Since the server exchanges data with the terminal, when a user uses or operates an application on the terminal, the server can obtain the usage data reported by the terminal, where the usage data includes the identifier of the terminal device.
  • One terminal can correspond to multiple user accounts (multiple users use the terminal). Each time a user uses the terminal through a user account, a piece of usage data is generated.
  • the terminal can report the usage data to the server, and the server adds each usage data to the first history record corresponding to the terminal.
  • the server can determine the usage data corresponding to each user account according to the first history record of the terminal.
  • the server may obtain a first history record corresponding to one terminal, and determine usage data corresponding to each user account according to the first history record.
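  • As an illustration of the first history record and of how usage data can be split by user account, the following is a minimal sketch. The class name `UsageData`, its field names, and the function `group_by_account` are hypothetical; the text only specifies that each item carries a terminal identifier, a user account, and a usage record containing a reporting time and a source of use.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UsageData:
    """One reported item: {terminal identifier, user account, usage record}."""
    terminal_id: str       # e.g. the IMEI of the terminal device
    account: str           # user account identifier
    reported_at: datetime  # reporting time (part of the usage record)
    source: str            # source of use (part of the usage record)


def group_by_account(history):
    """Split a terminal's first history record into per-account lists."""
    per_account = defaultdict(list)
    for item in history:
        per_account[item.account].append(item)
    return dict(per_account)


# Hypothetical first history record of one terminal.
history = [
    UsageData("terminal 1", "account 1", datetime(2018, 1, 3, 9, 0), "video"),
    UsageData("terminal 1", "account 2", datetime(2018, 1, 4, 20, 30), "news"),
    UsageData("terminal 1", "account 1", datetime(2018, 1, 5, 8, 15), "video"),
    UsageData("terminal 1", "account 2", datetime(2018, 1, 6, 21, 0), "Weibo"),
]
by_account = group_by_account(history)
```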
  • The usage parameters of at least two dimensions of each first user account include at least two of: the usage time of each user account, the number of uses of each user account, and the source of use of each user account.
  • The latest reporting time among the pieces of usage data of a user account is taken as the usage time of that account, the number of pieces of usage data corresponding to the account is taken as its number of uses, and the source of use in each piece of usage data is taken as a source of use of the account.
  • The example of the present application does not limit the number or the data types of the usage parameters of the at least two dimensions. A usage parameter of one data type corresponds to one dimension; for example, the usage parameters may cover the dimensions of usage time, number of uses, and source of use. Each dimension may have one or more usage parameters; for example, there may be multiple sources of use.
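  • The derivation of the usage parameters described above can be sketched as follows. This is only an illustration: the function name `usage_parameters`, the tuple layout of the input, and the dictionary keys are assumptions, not part of the application.

```python
from datetime import datetime


def usage_parameters(records):
    """Derive per-account usage parameters for one terminal.

    records: iterable of (account, reported_at, source) tuples
    (a hypothetical flattening of the usage data).
    """
    params = {}
    for account, reported_at, source in records:
        p = params.setdefault(
            account,
            {"usage_time": reported_at, "usage_count": 0, "sources": set()},
        )
        # Usage time: the latest reporting time seen for this account.
        p["usage_time"] = max(p["usage_time"], reported_at)
        # Number of uses: the number of usage data items for this account.
        p["usage_count"] += 1
        # Sources of use: every source that appears in the account's usage data.
        p["sources"].add(source)
    return params


records = [
    ("account 1", datetime(2018, 1, 3, 9, 0), "video"),
    ("account 2", datetime(2018, 1, 4, 20, 30), "news"),
    ("account 1", datetime(2018, 1, 5, 8, 15), "WeChat"),
]
params = usage_parameters(records)
```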
  • The server may use the RFM model to process the usage parameters of at least two dimensions of each user account.
  • The server may acquire usage parameters such as the usage time, the number of uses, and the source of use of each user account corresponding to the terminal.
  • The RFM model determines a first score of each user account according to the usage parameters of at least two dimensions of that account; the first user account associated with the terminal device is then determined according to the first score corresponding to each first user account.
  • The user account may be a communication account corresponding to the terminal (for example, a mobile phone number) or a login account of an application on the terminal; that is, the at least one user account includes at least one communication account corresponding to the terminal or at least one login account of an application on the terminal.
  • The example of the present application does not limit this.
  • The application is a functional application that requires the user to log in or register; the specific application type is not limited.
  • The identifier of the terminal device can be represented by the IMEI.
  • When the terminal reports usage data based on user behavior on the terminal to the server, it also reports its own identifier and the user account; for example, the usage data includes the IMEI of the terminal device and the user account.
  • S112. Call the preset association recommendation model to process at least two dimension usage records of each first user account, and output a first score corresponding to each first user account.
  • The first score corresponding to each first user account is calculated from the usage parameters of the at least two dimensions of that account and the preset association recommendation model.
  • After the server obtains the usage data of the terminal, which is based on the historical behavior of users on the terminal, it determines the usage parameters of at least two dimensions of each user account corresponding to the terminal according to that usage data. The preset association recommendation model is established in the server and is used to give each user account a comprehensive score across multiple dimensions according to these usage parameters. The server therefore calls the preset association recommendation model to process the usage parameters of each user account and outputs a first score for each first user account. In this way the server obtains at least one first score corresponding to the at least one first user account of the terminal.
  • The preset association recommendation model may consist of two parts: a preset first model that gives an importance score (second score) to the usage parameter of each dimension, and a preset second model that performs a comprehensive weighted scoring of the second scores output by the preset first model for the usage parameters of each dimension.
  • The first score is the comprehensive score obtained by evaluating, across multiple dimensions, each first user account of the at least one first user account corresponding to the terminal. The comprehensive score represents the accuracy of the correspondence between the first user account and the terminal device. That is, the higher the comprehensive score of a first user account, the more accurate its correspondence with the terminal, so the server determines that this first user account is a commonly used user account on the terminal and associates it with the identifier of the terminal device.
  • When the usage parameters of at least two dimensions of each first user account include the usage time, the number of uses, and the source of use of each first user account, the server may use the preset association recommendation model to calculate the importance score (second score) of the usage time, of the number of uses, and of the source of use of each first user account, and then obtain the comprehensive score of each first user account from these importance scores and the preset association recommendation model.
  • The server determines, according to the usage data of the terminal, the usage parameters of at least two dimensions of each first user account corresponding to the terminal; determines the comprehensive score (first score) of each first user account according to these usage parameters and the preset association recommendation model; and determines the first user account associated with the terminal device according to the comprehensive score of each first user account.
  • Because each first user account is evaluated on usage parameters of multiple dimensions, the determined first user account associated with the terminal device is more accurate.
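  • The two-part structure of the preset association recommendation model can be sketched as follows. The sketch takes the per-dimension second scores as given inputs and combines them with a weighted average; the weights, the example second scores, and the choice of a normalized weighted average as the "comprehensive weighted score" are all assumptions made for illustration, since the text does not specify them.

```python
def first_score(second_scores, weights):
    """Second model (assumed form): weighted average of per-dimension second scores."""
    total_weight = sum(weights[dim] for dim in second_scores)
    weighted_sum = sum(second_scores[dim] * weights[dim] for dim in second_scores)
    return weighted_sum / total_weight


def associated_account(second_scores_by_account, weights):
    """Score every account and return the one with the highest first score."""
    scores = {
        account: first_score(second_scores, weights)
        for account, second_scores in second_scores_by_account.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical weights and second scores (outputs of the first model).
weights = {"usage_time": 0.4, "usage_count": 0.4, "source": 0.2}
second_scores_by_account = {
    "account 1": {"usage_time": 0.9, "usage_count": 0.8, "source": 1.0},
    "account 2": {"usage_time": 0.5, "usage_count": 0.3, "source": 0.6},
}
best, scores = associated_account(second_scores_by_account, weights)
```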
  • the method for determining an associated account further includes the following steps:
  • S11 Push the associated data to the terminal device according to the usage record corresponding to the first user account associated with the terminal device.
  • The determined first user account associated with the terminal is the user account most commonly used on the terminal. Associated data is recommended to the terminal according to the usage data of this most commonly used account, so that the recommended data is more accurate.
  • The interest features of the user of the account associated with the terminal are determined according to the usage record in the usage data of that account, and associated data, for example advertisements, news, or articles, is recommended to the terminal according to those interest features.
  • the account identification association method provided by the present application further includes the following steps:
  • The server calls the preset association recommendation model to process the usage parameters of at least two dimensions of each first user account and obtains a first score for each first user account. The at least one first score describes the correspondence between the terminal and the at least one user account. The server stores a correspondence between preset scores and confidence levels and obtains the confidence corresponding to each first score from that correspondence. That is, the server obtains the confidence of the correspondence between each first user account and the terminal.
  • The server matches the at least one first score against the correspondence between preset scores and confidence levels. For example, when a first score matches the fourth score in the correspondence, the server determines the confidence level corresponding to the fourth score as the confidence corresponding to that first score.
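  • The lookup from a first score to a confidence can be sketched as a table of score bands. The band boundaries and confidence values below are hypothetical, and treating "matching" as "falling into a score band" is an assumption; the text only states that a preset correspondence between scores and confidence levels is stored on the server.

```python
import bisect

# Hypothetical preset correspondence: each lower bound of a score band
# maps to the confidence at the same index.
SCORE_LOWER_BOUNDS = [0.0, 0.25, 0.5, 0.75]
CONFIDENCES = [0.2, 0.5, 0.8, 0.95]


def confidence_for(score):
    """Return the confidence of the band that the score falls into."""
    index = bisect.bisect_right(SCORE_LOWER_BOUNDS, score) - 1
    index = max(index, 0)  # scores below the first bound use the first band
    return CONFIDENCES[index]


confidence_by_account = {
    account: confidence_for(score)
    for account, score in {"account 1": 0.88, "account 2": 0.44}.items()
}
```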
  • A confidence level that satisfies the preset selection rule is selected from the confidence levels corresponding to the first user accounts, and associated data is pushed to the terminal device according to the usage record of the first user account corresponding to that confidence level.
  • the server may perform the terminal and the first according to the confidence level.
  • For different types of recommended associated data, the server may adopt different rules, that is, preset selection rules, to select a confidence level from the at least one confidence level, and then push associated data to the terminal according to the usage data of the first user account corresponding to the selected confidence level (the selected confidence may correspond to one or more pieces of relationship data).
  • The server may select the highest confidence from the at least one confidence level and push associated data to the terminal according to the usage data of the first user account corresponding to that confidence level. Specifically, the associated data is pushed to the terminal based on the usage record in the usage data.
  • For example, when the server is to recommend an associated video to the terminal, it uses the usage records in the usage data of the one or more first user accounts corresponding to the selected confidence level to determine the interest feature corresponding to the terminal, and may select a video that is highly relevant to that interest feature as the recommended video.
  • The higher the confidence, the more accurate the association. The server can therefore take the usage record of the user account with the highest confidence as a reference for the user's preferences or hobbies and recommend associated data accordingly.
  • The server may alternatively obtain, from the at least one confidence level, the confidence level that corresponds to the most first user accounts, and push associated data to the terminal according to the usage records in the usage data of the first user accounts corresponding to that confidence level.
  • For example, the acquired confidence levels include a first confidence and a second confidence.
  • The first confidence is greater than the second confidence. The first confidence corresponds to one first user account (that is, to one user, who is the common user of the terminal), and the second confidence corresponds to three user accounts (that is, to three users, who are not common users of the terminal).
  • The second confidence is selected, and related advertisements are recommended to the terminal according to the usage records in the usage data of the user accounts corresponding to the second confidence. That is, user characteristics are determined from the usage records of these three users, and advertisements are pushed to the terminal according to those characteristics. The pushed advertisements thus match more of the users who use the terminal, so that as many users as possible become interested in buying. In this example of the present application, the server therefore takes the usage records in the usage data of the confidence level that corresponds to the most first user accounts as a reference for user preferences or hobbies when recommending associated data.
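  • Two possible preset selection rules can be sketched side by side: picking the highest confidence, and picking the confidence level shared by the most accounts (as in the example with one account at the first confidence and three accounts at the second). The function names and the concrete confidence values are hypothetical, and breaking ties in favor of the higher confidence is an assumption.

```python
from collections import defaultdict


def accounts_by_confidence(confidence_by_account):
    """Group accounts under the confidence level they were assigned."""
    groups = defaultdict(list)
    for account, confidence in confidence_by_account.items():
        groups[confidence].append(account)
    return dict(groups)


def select_highest(confidence_by_account):
    """Rule 1: choose the highest confidence level."""
    groups = accounts_by_confidence(confidence_by_account)
    best = max(groups)
    return best, groups[best]


def select_most_accounts(confidence_by_account):
    """Rule 2: choose the level shared by the most accounts (ties: higher level)."""
    groups = accounts_by_confidence(confidence_by_account)
    best = max(groups, key=lambda level: (len(groups[level]), level))
    return best, groups[best]


# One common user at the first confidence, three other users at the second.
confidence_by_account = {
    "account A": 0.9,
    "account B": 0.6,
    "account C": 0.6,
    "account D": 0.6,
}
highest = select_highest(confidence_by_account)
widest = select_most_accounts(confidence_by_account)
```

  • With rule 1 the recommendation follows the single common user; with rule 2 it follows the three users who share the second confidence, which suits content such as advertisements that should reach as many users of the terminal as possible.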
  • FIG. 2B is a detailed flow chart for pushing associated data to a terminal based on confidence.
  • Using the preset association recommendation model, the server can calculate the first score of each first user account corresponding to the terminal and, from it, the confidence of each first user account, so that the server can select different confidence levels according to preset selection rules. The user of the account corresponding to the selected confidence level is treated as the user to be recommended to, and associated data is pushed to the terminal according to the usage record in the usage data of that user account. When the terminal corresponds to multiple user accounts, the user account associated with the terminal is determined adaptively, so that associated data is recommended for the user of that account and the accuracy of the association recommendation is improved.
  • The example of the present application provides an account identification association method in which the server invokes a preset association recommendation model that processes the usage parameters of at least two dimensions of each first user account and outputs a first score corresponding to each first user account.
  • The preset association recommendation model includes a first model and a second model, and the process of obtaining at least one first score may include:
  • S201 Call the preset first model to process the at least two dimension usage records of each first user account, and output at least two second scores corresponding to the at least two dimension usage records of each first user account, where the preset association recommendation model includes a preset first model, and the preset first model is used to score the importance of the usage record of each of the at least two dimensions of each first user account.
  • The preset association recommendation model may consist of two parts: a preset first model that scores the importance of the usage parameter of each dimension, and a preset second model that performs a comprehensive weighted scoring of the second scores output by the preset first model for the usage parameters of each dimension.
  • When the usage parameters of at least two dimensions of each first user account include the usage time, the number of uses, and the source of use of each first user account, the server may use the first model in the preset association recommendation model to calculate the importance score (second score) of the usage time and the importance score of the number of uses of each first user account.
  • the preset first model may include at least two of formula (1), formula (2), and formula (3), wherein formula (1), formula (2), and formula (3) are :
  • the importance score of the use source is determined by the formula (1)
  • the importance score of the use time is determined by the formula (2)
  • the importance score of the use count is determined by the formula (3).
  • m is the total number of sources of use of each first user account
  • h_i is the preset score corresponding to the i-th source of use of each first user account
  • H is the total number of preset use sources
  • M is a preset first normalization parameter, and m is less than H.
  • n is the number of days between the usage time of each first user account and the current time
  • N is a preset second normalization parameter
  • j is the number of times each first user account has been used
  • k is a preset time
  • J is a preset third normalization parameter.
  • M, N and J are normalization parameters and are positive integers.
  • the value of M can be the absolute value of the upper limit of the interval between two adjacent integers within which the value being normalized falls;
  • the value of N can be the absolute value of the upper limit of the interval between two adjacent integers within which the value of lg(1/(1+(n/30)*n)) falls;
  • the value of J may be the absolute value of the upper limit of the interval between two adjacent integers within which the value of lg(j/k) falls.
  • formula (1) is a model for calculating a second score of the source of use of each first user account
  • formula (2) is a model for calculating a second score of the usage time of each first user account
  • formula (3) is a model for calculating a second score for the number of uses of each first user account.
  • When the server scores the source of use of each first user account, the source of use can be classified as an active source or a passive source, so the value of h_i, that is, the preset score corresponding to the i-th source of use of each first user account, differs between usage sources.
  • The value of h_i corresponding to an active source is higher than the value of h_i corresponding to a passive source. The value of h_i corresponding to a passive source may be assigned according to its accuracy rate calculated against the active source: the more closely its data matches that of the active source, the higher the value of h_i.
  • To calculate this accuracy rate, the one or more IMEI-user account correspondences included in the usage data obtained from passive sources such as other platforms are compared with the one or more IMEI-user account correspondences included in the usage data acquired by the active source.
  • the value of h i corresponding to the active source may be 2, and the value range of h i corresponding to the passive source may be between 1 and 1.9.
  • The more sources of use each first user account has, the higher the score corresponding to the source of use obtained by formula (1).
  • Formula (2) indicates that the larger n is, that is, the further the usage time of a first user account is from the current time, the lower the score obtained for the usage time. To optimize the time decay, a first user account reported within a 30-day period is given a higher score that decays more slowly, whereas data reported more than 30 days ago decays faster; the factor (n/30) is therefore added to the denominator inside the logarithm in formula (2).
  • The closer the usage time of each first user account is to the current time, the higher the score.
  • When the server scores the number of uses of each first user account, the higher the number of uses, the higher the score.
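  • As a minimal sketch of the decay behaviour described for formula (2), the Python snippet below evaluates only the logarithm term lg(1/(1+(n/30)*n)) quoted in this description. The normalization by N is left out, the function names are hypothetical, and the comparison baseline lg(1/(1+n)) is an assumption introduced purely to show the effect of the (n/30) factor; it is not taken from the application.

```python
import math


def raw_time_term(n_days: float) -> float:
    """Unnormalized decay term lg(1 / (1 + (n / 30) * n)) quoted in the text.

    n_days is the number of days between the usage time and the current time.
    """
    return math.log10(1.0 / (1.0 + (n_days / 30.0) * n_days))


def plain_log_term(n_days: float) -> float:
    """Hypothetical baseline without the (n / 30) factor, for comparison only."""
    return math.log10(1.0 / (1.0 + n_days))


if __name__ == "__main__":
    # Within 30 days the (n / 30) factor slows the decay; beyond 30 days it speeds it up.
    for n in (0, 10, 30, 60, 90):
        print(n, round(raw_time_term(n), 3), round(plain_log_term(n), 3))
```

  • Under these assumptions the two curves cross at n = 30: usage reported more recently than that is penalized less than with a plain logarithm, and older usage is penalized more.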
  • formula (1), formula (2), and formula (3) can also be correspondingly transformed into: formula (4), formula (5), and formula (6), as follows:
  • The preset second model is called to process the at least two second scores, and the first score corresponding to each first user account is output until the at least one first score is obtained, where the preset association recommendation model further includes the preset second model, and the preset second model is used to weight the at least two second scores to obtain a total score.
  • In step S202, the second model is executed to process the second score of each dimension usage record of each first user account and to output the first score of each first user account, where the second model is configured to weight the second scores of the dimension usage records of each first user account to obtain the first score of that account.
  • After the server calls the preset first model to process the usage parameters of at least two dimensions of each first user account and outputs the at least two corresponding second scores, the server may call the preset second model, which weights the at least two second scores into a total score, to process them and output the first score corresponding to that first user account. In the same manner, the server may obtain the first score corresponding to each first user account in the at least one first user account.
  • The server may weight the at least two second scores corresponding to each first user account to obtain a comprehensive score, namely the first score. Among the three dimensions of usage source, usage time, and usage count, usage time is the most important factor, so usage time is given a higher weight and the other two dimensions are given lower weights.
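  • The weighting step can be sketched in Python as a plain weighted sum. The weights 0.7, 0.2, and 0.1 are the example values this description gives for usage time, usage count, and usage source; the account names, the per-dimension second scores, and the helper name first_score are hypothetical and serve only as illustration.

```python
from typing import Dict

# Example weights mentioned in this description for one trained outcome:
# usage time 0.7, usage count 0.2, usage source 0.1.
EXAMPLE_WEIGHTS: Dict[str, float] = {"time": 0.7, "count": 0.2, "source": 0.1}


def first_score(second_scores: Dict[str, float],
                weights: Dict[str, float] = EXAMPLE_WEIGHTS) -> float:
    """Weighted sum of the per-dimension second scores (hypothetical helper)."""
    return sum(weights[dim] * second_scores[dim] for dim in weights)


# Hypothetical second scores for two accounts seen on the same terminal.
accounts = {
    "account_a": {"time": 0.9, "count": 0.4, "source": 0.5},
    "account_b": {"time": 0.3, "count": 0.9, "source": 0.9},
}
scores = {name: first_score(s) for name, s in accounts.items()}
best = max(scores, key=scores.get)  # account with the highest first score
```

  • With these made-up inputs, the account with the more recent usage wins even though the other account has higher count and source scores, which reflects the higher weight placed on usage time.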
  • the preset second model in the example of the present application can be obtained according to the training model, and the specific implementation process will be described in the following examples.
  • The preset second model in the example of the present application can be constructed or generated by a common machine learning classification method, for example, a support vector machine, logistic regression, a decision tree, GBDT, or a neural network.
  • The constructed samples are used for training, and the weight parameters are adjusted to obtain an optimal model capable of comprehensive scoring based on multiple dimensions.
  • Because the server can calculate the first score corresponding to each first user account by using the preset association recommendation model, and thereby obtain the confidence level corresponding to each first user account, the server can select different confidence levels in different situations according to the preset selection rule, and the user of the user account corresponding to the selected confidence level is taken as the user to be recommended to.
  • When the terminal corresponds to multiple user accounts, the user account associated with the terminal is adaptively determined, so that associated data is recommended according to the usage data of the user account associated with the terminal, and the accuracy of the association recommendation is improved.
  • the example of the present application provides a preset second model based on the introduction of machine learning technology, and all feature dimensions are considered when the first score is obtained, and then the judgment is comprehensively performed.
  • In the initial stage of forming the preset second model, it is still necessary to manually select as many features as possible (that is, the characteristics of the samples) for training the machine learning model, and to decide which features to keep according to how well they discriminate the first training result.
  • The comprehensive evaluation takes the usage parameters of multiple dimensions into account together, which improves the accuracy of the comprehensive score.
  • The model itself is capable of evolutionary learning. Even if the allowable range is updated or deleted, the preset second model can be adjusted simply by re-training the model (sometimes with fine-tuning of the features), so as to maximize the accuracy of the comprehensive scoring result.
  • the application example provides a method for forming a preset second model. As shown in FIG. 4, the method includes:
  • S301: Acquire a positive sample and a negative sample according to a preset configuration ratio, where the positive sample and the negative sample include a correspondence between each terminal of the at least two terminals and at least one second user account, and at least two third scores of each second user account of each terminal obtained by the preset first model.
  • The second user account and the third user account corresponding to each terminal device in the at least one terminal device are obtained, together with the usage parameters of at least two dimensions of the second user account and of the third user account; at least two third scores of the second user account and at least two third scores of the third user account corresponding to each terminal device are then determined according to the first model.
  • the training sample selects multiple terminals, each terminal corresponds to a positive sample and one or more negative samples, and the positive sample is a user account that the terminal is using.
  • the positive sample is the second user account
  • the negative sample is the third user account. Determining, according to the usage data corresponding to the second user account, the usage parameters of the at least two dimensions of the second user account, and determining at least two third scores of the second user account, where each third score corresponds to the usage parameter of each dimension . In the same manner, at least two third scores of each third user account are determined.
  • The negative sample is an account that has been used on the terminal but is not currently in use.
  • The configuration ratio is the ratio of positive samples to negative samples.
  • the configuration of the training data by the server (the sample of the existing user behavior and the corresponding comprehensive scoring result) also needs to be set according to the configuration ratio.
  • the positive sample and the negative sample are user accounts corresponding to the first terminal.
  • Determining a user account associated with the terminal device according to the first rating of the second user account and the first rating of the third user account.
  • In the example of the present application, the server obtains, through the preset first model, at least two third scores for each second user account (positive sample) and each third user account (negative sample); the process of obtaining the at least two third scores follows the same principle as obtaining the at least two second scores corresponding to each first user account.
  • the parameters of the second model are preset.
  • the second model is used to perform weighted summation of the third scores of the respective dimensions to obtain a first score, and the parameters of the second model include weights corresponding to the second scores of the respective dimensions.
  • The weight corresponding to the second score of each dimension is preset. For any sample among the positive sample (second user account) and the negative samples (third user accounts) of each terminal, the first score of the sample is determined according to the preset parameters of the second model.
  • The user account associated with a terminal is then determined from the first scores of the user accounts corresponding to that terminal. For example, the user account with the highest first score is determined as the user account associated with the terminal.
  • If the user account associated with the terminal is the second user account (positive sample), the user account determined by the second model for the terminal is correct.
  • That is, the account corresponding to the terminal device is determined; when that account is the positive sample, the evaluation is correct, and otherwise the evaluation is wrong.
  • The parameters of the second model are adjusted until the accuracy of the second model satisfies a preset condition.
  • The accuracy of the second model is determined from the terminal devices that are evaluated correctly according to the user account determined for each terminal device. For example, the ratio of the number of correctly evaluated terminal devices to the total number of terminal devices may be used as the accuracy of the second model.
  • the parameters of the second model are adjusted, that is, the weights of the second scores are adjusted, and step S302 is repeatedly performed until the accuracy of the acquired second model reaches a maximum, and the parameters of the second model at this time are optimal.
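  • The adjust-and-evaluate loop described above can be sketched in Python as follows. The sketch assumes three dimensions ordered as (time, count, source) and substitutes a simple grid search over weights summing to 1 for whatever training procedure is actually used; the sample data, the step size, and the function names are hypothetical. Accuracy is computed as in the description: the share of terminals whose highest-scoring account is the positive sample.

```python
from itertools import product
from typing import List, Tuple

# One candidate account: (is_positive_sample, third scores for time, count, source).
Candidate = Tuple[bool, Tuple[float, float, float]]
Terminal = List[Candidate]


def accuracy(weights: Tuple[float, float, float], terminals: List[Terminal]) -> float:
    """Share of terminals whose highest weighted-score account is the positive sample."""
    correct = 0
    for candidates in terminals:
        best = max(candidates, key=lambda c: sum(w * s for w, s in zip(weights, c[1])))
        correct += best[0]  # True counts as 1
    return correct / len(terminals)


def search_weights(terminals: List[Terminal], steps: int = 10):
    """Grid search (a stand-in for training) over weights that sum to 1."""
    best_w, best_acc = None, -1.0
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue
        w = (a / steps, b / steps, (steps - a - b) / steps)
        acc = accuracy(w, terminals)
        if acc > best_acc:  # keep the first weight vector reaching the best accuracy
            best_w, best_acc = w, acc
    return best_w, best_acc


# Hypothetical training data: two terminals, each with one positive and one negative sample.
terminals = [
    [(True, (0.9, 0.2, 0.5)), (False, (0.3, 0.9, 0.5))],
    [(True, (0.8, 0.1, 0.4)), (False, (0.2, 0.7, 0.6))],
]
best_w, best_acc = search_weights(terminals)
```

  • On this toy data only weight vectors that give the time dimension a non-zero weight identify both positive samples, which mirrors the description's observation that usage time carries the most weight.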
  • The input of the training model includes the features of the at least two dimensions described above. If a feature has no favorable influence on the first training result, or leads to wrong results, the weight of that feature or data is lowered; if the feature has a favorable influence on the first training result, the weight of that feature or data is increased. If the weight of a parameter is reduced to 0, that feature no longer has any effect in the training model.
  • The features of the different dimensions that ultimately have a positive impact on the first training result are the long-term features (that is, the at least two third scores in the examples of the present application).
  • The formation process of the preset second model substantially includes: inputting the at least two third scores corresponding to the usage parameters of at least two dimensions of the positive sample or the negative sample into the training model (that is, invoking the training model), and obtaining the first training result from the training model.
  • the training model constructed therein has at least two third scores, and each of the third scores has a corresponding weight (preset priority).
  • the first training result is continuously monitored until the preset condition is met (when the accuracy of the second model reaches a maximum), then the training model is taken as the preset second model.
  • the first training result is a determined user account associated with each terminal.
  • The preset condition in the example of the present application may be that the accuracy of the comprehensive result reaches a preset threshold. The preset threshold may be 99%; the specific value may be set as required and is not limited in the examples of the present application. The higher the preset threshold is set, the more accurate the comprehensive score of a preset second model that reaches the preset threshold or preset condition.
  • The accuracy relationship between the terminal and the user account is shown in FIG. 1.
  • The server expresses the accuracy of the comprehensive result by the confidence accuracy rate, with a preset threshold of 99%.
  • Take as an example the case where the RFM model is used to obtain usage parameters of three dimensions, namely the usage time (R), the usage frequency (F), and the usage source (S) of each first user account; the usage parameters of the at least two dimensions are as shown in FIG. 5.
  • When the weight of the usage time of each first user account, that is, the weight of R, is 0.7, the confidence accuracy rate satisfies the preset condition.
  • Therefore, in the preset second model trained by the server, the usage time of each first user account corresponds to a weight of 0.7, and the weights of the usage count and the usage source of each first user account sum to 0.3. Specifically, the weight corresponding to the usage count may be 0.2 and the weight corresponding to the usage source may be 0.1.
  • Accuracy rate chart 2 shows that if the server adopts a single dimension (R, F, or M), the confidence accuracy rate (76.5%) is lower than the confidence accuracy rate (88.20%) achieved by using the weighted total score of the three dimensions obtained with the RFM model.
  • Whether the server uses a single dimension (R, F, or M) or directly sums the second scores of the multiple dimensions (RFM total score), the confidence accuracy rate is lower than that achieved by the weight-weighted total score of the usage records of the dimensions. The confidence accuracy rate is highest when the weight of R is 0.7, at which point it is 99%.
  • Because the example of the present application adopts a comprehensive scoring method based on the preset second model, a comprehensive score based on multiple dimensions is computed from the at least two second scores of each first user account in the relationship data between the terminal and the at least one first user account. The preset second model makes full use of the usage parameters of multiple dimensions corresponding to each first user account on the terminal, so a trustworthiness indicator for the use of each first user account on the terminal can be obtained effectively, which enables evaluation of each first user account on the terminal.
  • The example of the present application trains the training model with the different usage parameters in the usage data, determines the weight of the second score of each dimension in the second model according to the first training result, and then determines the comprehensive score of each first user account according to the second model, which improves the accuracy of the comprehensive score of the user account.
  • A notable feature of the preset second model adopted in the application example is that the model can self-evolve: it automatically adjusts the weights as the usage records of the at least two dimensions change, which avoids frequent manual, rule-based adjustment of parameters.
  • The application example uses the usage data of the terminal, and takes the usage parameters of multiple dimensions of each first user account corresponding to the terminal, determined according to the usage data, as the primary data source.
  • the scoring process and the model construction process are simple and easy, and do not need to use various complicated coding, clustering, and filtering methods to perform complex construction and processing on the features, which greatly reduces the workload of data processing and makes the preset second model simple. Available.
  • the server may obtain a correspondence between a preset score and a confidence level, which may include:
  • S304 Call the preset second model to process the positive sample and the negative sample, and obtain the second training result.
  • The server may input the positive sample and the negative sample into the preset second model (that is, call the preset second model) to obtain the second training result.
  • the second training result is a first score corresponding to each first user account
  • the second training result in the example of the present application is a comprehensive score for each sample on the basis of the highest confidence accuracy.
  • S305 Call a second training result and a correspondence between the preset sample and the confidence accuracy, and obtain a confidence accuracy rate corresponding to the second training result.
  • After the server inputs the positive sample and the negative sample into the preset second model and obtains the second training result, the server knows the correspondence between the second training result and each sample (each sample in the positive sample and the negative sample).
  • The correspondence between the preset sample and the confidence accuracy rate is further set in the server: the higher the second score of the usage time in a sample, the higher the corresponding confidence accuracy rate is set. That is, the closer the usage time of the user account in the relationship data between the terminal and the user account is to the current time, the higher the confidence accuracy rate of the server's comprehensive score; formula (2) and formula (5) are based on this principle.
  • the server matches the correspondence between each sample and the preset sample and the confidence accuracy rate, and obtains the confidence accuracy rate corresponding to each sample, and according to the correspondence between the second training result and each sample, The first confidence accuracy corresponding to the second training result can be obtained.
  • the correspondence between the second training result and the first confidence accuracy is used as a correspondence between the preset score and the confidence.
  • After the server obtains the first confidence accuracy rate corresponding to the second training result, a higher first score obtained by the server through the preset second model indicates that, within the relationship data between the terminal and the at least one first user account, the relationship corresponding to that first score has the highest relationship accuracy rate; that is, the first user account corresponding to that first score is the most commonly used user account in the relationship data between the terminal and the at least one first user account. The correspondence between the second training result and the first confidence accuracy rate may therefore be used to represent the relationship accuracy of the relationship data between the terminal and the at least one first user account corresponding to the first score.
  • The server may therefore use the correspondence between the second training result and the first confidence accuracy rate as the correspondence between the preset score and the confidence. Through this correspondence, the server can determine the user of the user account corresponding to the confidence that the preset selection rule selects.
  • the second training result is obtained by formula (1), formula (2) and formula (3), and preset second model.
  • Table 1 the second training result (weighted total score), the confidence level, and the correspondence table of the preset samples are summarized.
  • Through the preset selection rule, the server can adaptively select the confidence of the most commonly used user on the terminal to push associated data, or select the most reliable confidence to push associated data.
  • Because the server can calculate the first score corresponding to each first user account by using the preset association recommendation model, and thereby obtain the confidence level corresponding to each first user account, the server can select the confidence level under different conditions according to the preset selection rule, and push associated data to the terminal according to the usage data of the user account corresponding to the selected confidence level.
  • the account associated with the terminal is adaptively determined, so that the associated data is recommended for the user corresponding to the associated account on the terminal, and the accuracy of the association recommendation is improved.
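  • As a rough illustration of looking up a confidence from a first score and then applying a selection rule, the Python sketch below uses a small step table. The score thresholds and confidence values are entirely hypothetical (the values of Table 1 are not reproduced in this text), and the rule shown, picking the account with the highest confidence, is only one possible reading of selecting the most reliable confidence.

```python
from bisect import bisect_right
from typing import Dict

# Hypothetical correspondence between first-score ranges and confidence values.
# These numbers are placeholders, not values from the application.
SCORE_THRESHOLDS = [0.2, 0.4, 0.6, 0.8]
CONFIDENCES = [0.50, 0.65, 0.80, 0.90, 0.99]


def confidence_for(score: float) -> float:
    """Map a first score to a confidence via the step table above."""
    return CONFIDENCES[bisect_right(SCORE_THRESHOLDS, score)]


def select_account(first_scores: Dict[str, float]) -> str:
    """Assumed selection rule: return the account with the highest confidence."""
    return max(first_scores, key=lambda acct: confidence_for(first_scores[acct]))


chosen = select_account({"account_a": 0.76, "account_b": 0.48})
```

  • A different selection rule, for example one keyed to the type of associated data being pushed, could replace select_account without changing the lookup.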
  • a server 1 which may include:
  • the obtaining unit 10 is configured to acquire a first history record of the user, where the first history record of the user includes relationship data of at least one first user account corresponding to the user, and each first user account in the relationship data. Use records for at least two dimensions.
  • The calling unit 11 is configured to invoke a preset association recommendation model, where the preset association recommendation model is configured to process at least two dimension usage records of each of the first user accounts and output the first score corresponding to each of the first user accounts, until at least one first score is obtained, wherein the at least one first user account corresponds to the at least one first score respectively.
  • The obtaining unit 10 is further configured to obtain, according to the at least one first score and the correspondence between the preset score and the confidence, at least one confidence level corresponding to the at least one first score; to obtain, from the at least one confidence level, a first confidence level that satisfies the preset selection rule; and to push associated data for the terminal by using the at least two dimension usage records of the first user account corresponding to the first confidence level, where the preset selection rule is determined by the type of associated data actually pushed.
  • the at least two dimension usage records of each of the first user accounts acquired by the obtaining unit 10 include: a usage time of each of the first user accounts, a usage count of each of the first user accounts, and the At least two of the sources of use of each first user account.
  • The calling unit 11 is configured to: call the preset first model to process the at least two dimension usage records of each of the first user accounts, and output at least two second scores corresponding to the at least two dimension usage records of each of the first user accounts, where the preset association recommendation model includes the preset first model, and the preset first model is used to score the importance of the usage record of each of the at least two dimensions of each of the first user accounts; and call the preset second model to process the at least two second scores, and output the first score corresponding to each of the first user accounts until the at least one first score is obtained, where the preset association recommendation model further includes the preset second model, and the preset second model is configured to weight the at least two second scores to obtain a total score.
  • the server 1 further includes a detecting unit 12.
  • the obtaining unit 10 is further configured to obtain a positive sample and a negative sample according to a preset configuration ratio, where the positive sample and the negative sample are correspondences between the first terminal and the at least one second user account, and each of the first At least two third scores obtained by the two user accounts through the preset first model.
  • the calling unit 11 is further configured to invoke the set training model to process the positive sample or the negative sample to obtain a first training result.
  • the detecting unit 12 is configured to continuously detect the training model until the first training result satisfies a preset condition.
  • The acquiring unit 10 is further configured to use, as the preset second model, the training model whose first training result meets the preset condition, where the preset condition is used to represent that the user account determined as the most commonly used on the terminal, according to the output of the preset second model, is closest to the real user account of the terminal.
  • The calling unit 11 is further configured to: after the training model whose first training result meets the preset condition is used as the preset second model, call the preset second model to process the positive sample and the negative sample to obtain the second training result.
  • the acquiring unit 10 is further configured to acquire, according to the correspondence between the second training result and the preset sample and the confidence accuracy, the first confidence accuracy rate corresponding to the second training result; Corresponding relationship between the second training result and the first confidence accuracy rate is used as a correspondence between the preset score and the confidence.
  • the preset first model includes:
  • m is the total number of sources of use of each of the first user accounts
  • h_i is a preset score corresponding to the i-th source of use of each of the first user accounts
  • H is the total number of preset usage sources
  • M is a preset first normalization parameter, and m is less than H
  • n is the number of days between the usage time of each first user account and the current time
  • N is a preset second normalization parameter
  • j is the number of times each first user account is used
  • k is a preset time
  • J is a preset third normalization parameter.
  • the obtaining unit 10 is specifically configured to obtain the first confidence that is the most reliable from the at least one confidence level.
  • the obtaining unit 10 is specifically configured to obtain, from the at least one confidence level, the first confidence level that corresponds to the at least one first user account.
  • the at least one first user account acquired by the acquiring unit 10 includes: at least one communication account corresponding to the terminal or at least one login account of the first application on the terminal.
  • the present application also provides a server, including a processor 14 and a storage medium 15, which is linked to the processor 14 via a system bus 16.
  • the processor 14 is embodied by a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA).
  • the storage medium 15 is for storing executable program code, and the program code includes computer operation instructions.
  • the storage medium 15 may include a high speed RAM memory, and may also include a nonvolatile memory, for example, at least one disk storage.
  • the program code stored in the memory is configured to be executed by the processor to implement the method of determining an associated account in the present application described above and to implement the functions of the various modules in the server in the present application.
  • examples of the present application can be provided as a method, system, or computer program product. Accordingly, the examples of the present application may take the form of a hardware instance, a software example, or an example of combining software and hardware aspects. Moreover, the examples of the present application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) including computer usable program code.
  • The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction apparatus, where the instruction apparatus implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Abstract

Disclosed in the embodiments of the present application is a method for determining an associated account, the method being applied to a server. The method comprises: acquiring, from a terminal device, usage data concerning one or more users using the terminal device by means of their respective first user accounts, the usage data of each first user account comprising the identifier of the terminal device, the first user account, and the usage records of the first user account; determining, according to the usage records corresponding to each first user account, the usage parameters of at least two dimensions of each first user account; calculating, using the usage parameters of at least two dimensions of each first user account and a preset association recommendation model, the first score corresponding to each first user account; and determining, according to the first score corresponding to each first user account, the first user account associated with the terminal device. Further provided in the embodiments of the present application are a corresponding server and storage medium.

Description

Method, server, and storage medium for determining an associated account
This application claims priority to Chinese Patent Application No. 201710032683.2, entitled "Account identification and association method and server" and filed with the Chinese Patent Office on January 16, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to Internet technologies in the field of communications, and in particular to a method, a server, and a storage medium for determining an associated account.
Background
With the rapid development of communication technology, clients have become increasingly feature-rich and intelligent. When a user uses a service through a client, for example playing a video through a video application installed on the client, the client sends service requests such as a playback request or a request to download a video. The server collects these service requests, determines a confidence level for the user from them, and uses the confidence level and similar signals to intelligently recommend associated media information (such as a variety show video or a TV drama video) or user information, making it easier for the user to find more media of interest or other users with the same interests. For example, when a user watches a video, the video client plays a video of the type the user selected, and the server recommends associated videos according to the type of video played by the client, for the user to choose from. How to mine friend users or items with the same interests from a large and varied body of item data is therefore a problem that deserves attention.
Technical Content
An example of the present application provides a method for determining an associated account, applied to a server, the method including:
acquiring, from a terminal device, usage data generated when one or more users use the terminal device through their respective first user accounts, the usage data of each first user account including: an identifier of the terminal device, the first user account, and a usage record of the first user account;
determining, according to the usage record corresponding to each first user account, usage parameters of at least two dimensions for each first user account;
calculating a first score corresponding to each first user account by using the usage parameters of at least two dimensions of each first user account and a preset association recommendation model; and
determining, according to the first score corresponding to each first user account, a first account associated with the terminal device.
An example of the present application further provides a server including one or more processors and one or more memories, the one or more memories including computer-readable instructions configured to be executed by the one or more processors to implement:
acquiring, from a terminal device, usage data generated when one or more users use the terminal device through their respective first user accounts, the usage data of each first user account including: an identifier of the terminal device, the first user account, and a usage record of the first user account;
determining, according to the usage record corresponding to each first user account, usage parameters of at least two dimensions for each first user account;
calculating a first score corresponding to each first user account by using the usage parameters of at least two dimensions of each first user account and a preset association recommendation model; and
determining, according to the first score corresponding to each first user account, a first account associated with the terminal device.
An example of the present application further provides a non-volatile computer-readable storage medium storing computer-readable instructions that cause at least one processor to perform the method described above.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the hardware entities in a data processing system in an example of the present application;
FIG. 2A is a schematic flowchart of an account identification and association method provided by an example of the present application;
FIG. 2B is a first schematic framework diagram of an account identification and association method provided by an example of the present application;
FIG. 3 is a second schematic framework diagram of an account identification and association method provided by an example of the present application;
FIG. 4 is a third schematic framework diagram of an account identification and association method provided by an example of the present application;
FIG. 5 is a first exemplary chart of the accuracy of relationships between terminals and user accounts provided by an example of the present application;
FIG. 6 is a second exemplary chart of the accuracy of relationships between terminals and user accounts provided by an example of the present application;
FIG. 7 is a fourth schematic framework diagram of an account identification and association method provided by an example of the present application;
FIG. 8 is a first schematic structural diagram of a server provided by an example of the present application;
FIG. 9 is a second schematic structural diagram of a server provided by an example of the present application; and
FIG. 10 is a third schematic structural diagram of a server provided by an example of the present application.
Detailed Description
The technical solutions in the examples of the present application are described clearly and completely below with reference to the accompanying drawings.
Terminal device: a mobile electronic device, also known as a mobile device, handheld device, wearable device, and so on. It is a computing device based on an embedded chip, usually with a small display and touch input or a small keyboard.
Machine learning: drawing on theories such as probability theory, statistics, and neural propagation, machine learning enables a computer to simulate human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve its own performance.
Model training: manually selected samples are input to a machine learning system, and the model parameters are adjusted repeatedly until the final model's accuracy in recognizing the samples is optimal.
International Mobile Equipment Identity (IMEI): the unique identification number of a mobile phone.
RFM model: in the RFM model, R (Recency) indicates how long ago the customer's most recent use was, F (Frequency) indicates the number of times the customer used the service in a recent period, and M (Monetary) indicates the amount the customer spent in a recent period. In this solution, R denotes the reporting time, F denotes the reporting frequency, and M denotes the reporting source.
IMEI-user account: one item of relationship data between a terminal (IMEI) and a user account.
FIG. 1 is a schematic diagram of the hardware entities in the architecture of a data processing system in an example of the present application. The system includes one or more servers 1, terminal devices 21-25, and a network 3. The network 3 includes network entities such as routers and gateways, which are not shown in FIG. 1. The terminal devices 21-25 exchange service product information with the server 1 over a wired or wireless network, so that time-related data generated by users' behavior on the terminal devices is collected from the terminals 21-25 and transmitted to the server 1. This data is the usage data generated when a user uses a terminal device; for example, it may include the identifier of the terminal device, the user account, and historical access records (records of the user browsing news and articles, watching videos, and visiting social networking sites). As shown in FIG. 1, the terminal device types include a mobile phone (terminal 23), a tablet computer or PDA (terminal 25), a desktop computer (terminal 22), a PC (terminal 24), and an all-in-one computer (terminal 21). The terminal devices have installed the application function modules that users need, such as applications with entertainment functions (for example, video applications, audio playback applications, game applications, and reading software), applications with service functions (for example, map navigation applications, group-buying applications, and camera applications), and system functions such as a settings application.
Based on the hardware entities shown in FIG. 1, a user generates usage data by using a terminal or an application on the terminal. The usage data includes the identifier of the terminal device, the user account, and the usage record generated when the user uses the terminal device through that user account. The user account may be a communication account corresponding to the terminal (for example, a mobile phone number) or a login account of an application on the terminal device. The terminal device sends the usage data to the server 1. When the terminal device is a mobile phone, its identifier may be the IMEI (International Mobile Equipment Identity). When different users use the terminal device or an application on it, they correspond to different user accounts, that is, they log in to the application with different user accounts, and the server stores the usage data corresponding to each of the terminal device's user accounts. Using the usage data of each user account corresponding to the terminal device, the server calculates a score for each user account and, with a preset correspondence between scores and confidence levels, determines the confidence level of each user account. The confidence level characterizes how accurate it is to identify a given user account as the user account associated with the terminal device. A confidence level is selected according to a preset selection rule, and associated data can subsequently be pushed to the terminal device according to the usage data of the user account corresponding to the selected confidence level. The server 1 may be a push platform, such as an advertisement push platform or an article push platform. The account identification and association method provided by the examples of the present application is applied to a server: based on the usage data of each user account, a score for each user account is obtained through a preset association recommendation model, and the user account associated with the terminal is then determined.
In some examples, the server judges similarity or degree of association in order to intelligently recommend to a first user related data, such as a second user or an item, associated with what the user follows or requests. In practice, however, people often change terminal devices or the communication account (user account) corresponding to a terminal device, so the full set of relationship data contains cases where one terminal corresponds to multiple user accounts and cases where one user account corresponds to multiple terminals. In such cases, the server takes the most frequently used user account as the user account associated with the terminal.
However, because the user account used on a terminal may change, the account currently used on the terminal may not be the account that the statistics show as most frequently used on it. Treating the account with the highest usage count as the account associated with the terminal can therefore produce an inaccurate result. FIG. 1 is only one example of a system architecture for implementing the examples of the present application; the examples are not limited to the system structure shown in FIG. 1, and the examples below are proposed on the basis of this architecture.
To solve the above technical problem, an example of the present application provides a method for determining an associated account. As shown in FIG. 2A, the method may include:
S110: Acquire a first history record of users, where the first history record includes relationship data between the terminal and at least one first user account used on it, and usage records of at least two dimensions for each first user account in the relationship data.
Step S110 is carried out by acquiring, from the terminal device, usage data generated when one or more users use the terminal device through their respective first user accounts (corresponding to the relationship data above), the usage data of each first user account including: the identifier of the terminal device, the first user account, and the usage record of the first user account; and by determining, according to the usage record corresponding to each first user account, usage parameters of at least two dimensions for each first user account (corresponding to the usage records of at least two dimensions above).
It should be noted that, in the account identification and association method provided by the examples of the present application, the server uses the usage data of each user account of a terminal and a preset association recommendation model to obtain scores for the multiple user accounts corresponding to the terminal, and then determines the user account associated with the terminal.
When a user uses a terminal device or an application on the terminal, usage data is reported to the server 1. The format of the usage data may be {terminal device identifier, user account identifier, usage record}, where the usage record includes the reporting time and the usage source of the usage data. The reporting time may be the time at which the first user account logged in or the time at which it logged out. The usage source may be the identifier of an application on the terminal device that the user used through the user account, for example WeChat, Weibo, a news application, or a video application; it may also be the domain name of a website that the user browsed through the user account, for example NetEase Mail or the Sibo forum. Usage sources are further divided into active sources and passive sources: an active source is the source of usage data actively reported by a terminal served by the server in the present application, and a passive source is the source of usage data obtained from other platforms. Multiple items of usage data from one terminal constitute the first history record of that terminal. When multiple users use the same terminal, the first history record includes multiple items of usage data in the above format, for example {terminal 1, account 1, usage record 1}, {terminal 1, account 2, usage record 2}, {terminal 1, account 1, usage record 1}, {terminal 1, account 2, usage record 2}. The usage data that the server acquires for a terminal may be the usage data reported by the terminal within a recent period, for example the most recent month, so that the user account determined to be associated with the terminal is more accurate.
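The usage-data format described above can be sketched as a small record type plus a grouping step that builds a terminal's first history record. This is a minimal illustration, not the applicant's implementation: the class name `UsageData`, its field names, the function `build_first_history`, and the sample values are all assumptions introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UsageData:
    """One reported item: {terminal identifier, user account, usage record}."""
    device_id: str         # terminal device identifier, e.g. an IMEI (hypothetical field name)
    account_id: str        # user account identifier
    reported_at: datetime  # reporting time carried in the usage record
    source: str            # usage source, e.g. an application or site name


def build_first_history(reports):
    """Group reported items as {device_id: {account_id: [UsageData, ...]}}."""
    history = defaultdict(lambda: defaultdict(list))
    for item in reports:
        history[item.device_id][item.account_id].append(item)
    return history


# Made-up sample reports for one terminal used by two accounts.
reports = [
    UsageData("terminal-1", "account-1", datetime(2017, 1, 3, 9, 0), "video"),
    UsageData("terminal-1", "account-2", datetime(2017, 1, 5, 20, 30), "news"),
    UsageData("terminal-1", "account-1", datetime(2017, 1, 9, 8, 15), "news"),
]
history = build_first_history(reports)
```

Grouping first by terminal and then by account mirrors the text: one terminal's history may hold items from several accounts, and later steps work on the items of each account separately.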
The terminal in the examples of the present application is an electronic device on which various applications are installed.
In the examples of the present application, the server exchanges data with the terminal, so when a user uses or operates an application on the terminal, the server can obtain the usage data reported by the terminal. The usage data includes the identifier of the terminal device and the identifier of the user account, as well as the usage record of the terminal by the user corresponding to that user account. One terminal may correspond to multiple user accounts (multiple users use the terminal). Each time a user uses the terminal through a user account, one item of usage data is generated. The terminal reports the usage data to the server, and the server adds each item to the first history record corresponding to the terminal. In short, the server can obtain the first history record corresponding to a terminal and determine from it the usage data corresponding to each user account.
When determining the usage parameters of at least two dimensions from the usage data of each user account, the server first determines the one or more items of usage data corresponding to each user account and then derives the account's usage parameters of at least two dimensions from those items. In the examples of the present application, the usage parameters of at least two dimensions of each first user account include at least two of the following: the usage time of the user account, the usage count of the user account, and the usage source of the user account. Given the one or more items of usage data corresponding to a user account, the reporting time closest to the present is taken as the account's usage time, the number of items of usage data is taken as the account's usage count, and the sources of the items are taken as the account's usage sources.
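The derivation just described (latest reporting time, number of items, set of sources) can be sketched as follows. The function name `usage_parameters`, the dictionary keys, and the sample records are assumptions made for illustration; each record is represented here simply as a `(reporting time, source)` pair.

```python
from datetime import datetime


def usage_parameters(records):
    """Derive three usage parameters from one account's usage records.

    records: list of (reported_at, source) pairs for a single user account.
    """
    if not records:
        raise ValueError("an account needs at least one usage record")
    return {
        # usage time: the reporting time closest to the present
        "usage_time": max(reported_at for reported_at, _ in records),
        # usage count: the number of items of usage data
        "usage_count": len(records),
        # usage sources: the distinct sources of the items
        "usage_sources": sorted({source for _, source in records}),
    }


# Made-up records for one account.
records = [
    (datetime(2017, 1, 3, 9, 0), "video"),
    (datetime(2017, 1, 9, 8, 15), "news"),
    (datetime(2017, 1, 5, 20, 30), "news"),
]
params = usage_parameters(records)
```

Taking the maximum reporting time assumes all items were reported in the past, so the latest one is the one closest to the present.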
However, the examples of the present application do not limit the number or the data types of the usage parameters of at least two dimensions. Each data type of usage parameter corresponds to one dimension. For example, the usage parameters may cover three dimensions (usage time, usage count, and usage source), and each dimension may have one or more usage parameters; for example, there may be multiple usage sources.
Preferably, in an example of the present application, the server may use an RFM model to process the usage parameters of at least two dimensions of each user account. For example, the server may obtain usage parameters such as the usage time, usage count, and usage source of each user account corresponding to the terminal. The RFM model determines the first score of each user account from the account's usage parameters of at least two dimensions, and the first account associated with the terminal device is determined according to the first score corresponding to each first user account.
In the examples of the present application, the user account may be a communication account corresponding to the terminal (for example, a mobile phone number) or a login account of an application on the terminal. That is, the at least one user account includes at least one communication account corresponding to the terminal or at least one login account of an application on the terminal, which is not limited in the examples of the present application. The applications are the various functional applications installed on the terminal that require user login or registration; the specific application type is not limited in the examples of the present application.
It should be noted that the identifier of the terminal device may be represented by the IMEI. When the terminal reports user-behavior-based usage data to the server, it reports its own identifier together with the user account; for example, the usage data includes the IMEI of the terminal device and the user account.
S112: Invoke the preset association recommendation model to process the usage records of at least two dimensions of each first user account, and output a first score corresponding to each first user account.
Step S112 is carried out by calculating the first score corresponding to each first user account using the usage parameters of at least two dimensions of each first user account and the preset association recommendation model.
After the server obtains the terminal's usage data, which reflects users' historical behavior on the terminal, it determines from the usage data the usage parameters of at least two dimensions for each user account corresponding to the terminal. A preset association recommendation model has already been established in the server; it is used to compute a comprehensive multi-dimensional score for each user account from the account's usage parameters of at least two dimensions. The server therefore invokes the preset association recommendation model to process the usage parameters of each user account and outputs a first score corresponding to each first user account. In this way the server obtains at least one first score, one for each of the at least one first user account corresponding to the terminal.
It should be noted that, in the examples of the present application, the preset association recommendation model may consist of two parts: a preset first model that assigns an importance score (second score) to the usage parameter of each dimension, and a preset second model that computes a comprehensive weighted score from the second scores that the preset first model outputs for the dimensions.
In the examples of the present application, the first score is the comprehensive result of a multi-dimensional evaluation of each of the at least one first user account corresponding to the terminal, and it characterizes the accuracy of the correspondence between that first user account and the terminal device. The higher a first user account's comprehensive score, the higher the corresponding accuracy; the server accordingly judges that first user account to be the commonly used user account on the terminal and associates it with the identifier of the terminal device.
For example, suppose the usage parameters of at least two dimensions of each first user account include the account's usage time, usage count, and usage source. The server can then use the preset association recommendation model to calculate an importance score (second score) for the usage time, an importance score (second score) for the usage count, and an importance score (second score) for the usage source of each first user account, and from these three importance scores and the preset association recommendation model obtain the comprehensive score of each first user account.
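The two-part structure described above (per-dimension second scores, then a weighted first score) might be sketched as below. This passage does not give the actual first model, second model, or weights, so everything numeric here is an assumption: the recency decay `1 / (1 + days)`, the capped count ratio, the source rule, the `WEIGHTS` values, and all function names are hypothetical stand-ins chosen only to make the structure concrete.

```python
from datetime import datetime

# Hypothetical weights for the second (weighting) model; not taken from the source.
WEIGHTS = {"time": 0.5, "count": 0.3, "source": 0.2}


def second_scores(params, now, max_count, active_sources):
    """Hypothetical first model: one importance score in [0, 1] per dimension."""
    days_ago = (now - params["usage_time"]).days
    has_active = bool(set(params["usage_sources"]) & active_sources)
    return {
        "time": 1.0 / (1.0 + days_ago),                         # more recent, higher score
        "count": min(params["usage_count"] / max_count, 1.0),  # capped frequency ratio
        "source": 1.0 if has_active else 0.5,                   # assumed preference for active sources
    }


def first_score(scores, weights=WEIGHTS):
    """Hypothetical second model: weighted sum of the per-dimension second scores."""
    return sum(weights[dim] * scores[dim] for dim in weights)


def associated_account(params_by_account, now, max_count, active_sources):
    """Return the highest-scoring account and all first scores."""
    totals = {
        account: first_score(second_scores(p, now, max_count, active_sources))
        for account, p in params_by_account.items()
    }
    return max(totals, key=totals.get), totals


# Made-up parameters: account-2 was used more often, account-1 more recently.
params_by_account = {
    "account-1": {"usage_time": datetime(2017, 1, 9), "usage_count": 2, "usage_sources": ["news"]},
    "account-2": {"usage_time": datetime(2016, 12, 20), "usage_count": 5, "usage_sources": ["video"]},
}
best, totals = associated_account(
    params_by_account, now=datetime(2017, 1, 10), max_count=5, active_sources={"news"}
)
# best == "account-1"
```

With these assumed weights the recently used account wins even though the other account has the higher usage count, which is the situation the multi-dimensional score is meant to handle.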
With the account identification and association method provided by the present application, the server determines, from the terminal's usage data, the usage parameters of at least two dimensions for each first user account corresponding to the terminal; determines a comprehensive score (first score) for each first user account from those usage parameters and the preset association recommendation model; and determines the first user account associated with the terminal device according to the comprehensive scores. Because each first user account is evaluated on usage parameters of multiple dimensions, the first user account determined to be associated with the terminal device is more accurate.
In some examples, the method for determining an associated account provided by the present application further includes the following step:
S11: Push associated data to the terminal device according to the usage record corresponding to the first user account associated with the terminal device.
The first user account determined to be associated with the terminal is the account commonly used on the terminal. Recommending associated data to the terminal according to the usage data of that account makes the recommended data more accurate. For example, the interest characteristics of the user corresponding to the associated account are determined from the usage records in the account's usage data, and associated data, such as advertisements, news, or articles, is recommended to the terminal according to those interest characteristics.
In some examples, the account identification and association method provided by the present application further includes the following steps:
S21. Determine the confidence level corresponding to each first user account according to the first score corresponding to each first user account and a preset correspondence between scores and confidence levels.
The server calls the preset association recommendation model to process the usage parameters in at least two dimensions of each first user account and obtains the first score corresponding to each first user account. Once at least one first score has been obtained, the server holds at least one first score for the correspondence between the terminal and at least one user account. The server also stores the preset correspondence between scores and confidence levels, from which the confidence level corresponding to each first score can be obtained. In other words, the server obtains the confidence level of the correspondence between each first user account and the terminal.
Specifically, the server matches each of the at least one first score in turn against the preset correspondence between scores and confidence levels. For example, when a first score matches the fourth score in the preset correspondence, the server takes the confidence level corresponding to that fourth score as the confidence level corresponding to the first score.
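The lookup in S21 can be sketched as follows. This is a minimal illustration, assuming the preset correspondence is stored as score bands; the band edges, confidence values, and the names `SCORE_BANDS`, `CONFIDENCES`, and `confidence_for` are hypothetical and do not come from the application.

```python
from bisect import bisect_right

# Hypothetical preset correspondence: lower edge of each score band -> confidence.
# Both lists are illustrative values, not taken from the application.
SCORE_BANDS = [0.0, 0.4, 0.7, 0.9]
CONFIDENCES = [0.1, 0.5, 0.8, 0.95]


def confidence_for(first_score):
    """Return the confidence of the band that the first score falls into."""
    idx = bisect_right(SCORE_BANDS, first_score) - 1
    idx = max(idx, 0)  # scores below the lowest band edge use the lowest band
    return CONFIDENCES[idx]


# One first score per first user account of a terminal (made-up numbers).
first_scores = {"account_a": 0.92, "account_b": 0.45}
confidences = {acct: confidence_for(s) for acct, s in first_scores.items()}
```

A banded table is only one way to realise the "matching" described above; an exact-match dictionary would serve equally well if the preset scores are discrete.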
S22. Obtain, from the at least one confidence level, a first confidence level that satisfies a preset selection rule, and push associated data to the terminal according to the usage records in at least two dimensions of the first user account corresponding to that first confidence level. The preset selection rule is determined by the type of associated data actually being pushed.
Performing step S22 comprises obtaining, from the confidence levels corresponding to the first user accounts, a confidence level that satisfies the preset selection rule, and pushing associated data to the terminal device according to the usage record of the first user account corresponding to that confidence level.
After the server has obtained at least one confidence level corresponding to the at least one first score from the preset correspondence between scores and confidence levels, it can use the confidence levels to judge the degree of association between the terminal and each first user account. The server can therefore select one confidence level from the at least one confidence level and recommend associated data according to the usage data of the first user account corresponding to it.
It should be noted that the server may recommend many types of associated data to the terminal, for example advertisements, video, and audio. For different types of recommended data, the server can therefore apply different rules (the preset selection rules) to select a confidence level from the at least one confidence level, and push associated data to the terminal according to the usage data of the first user account corresponding to the selected confidence level (the selected confidence level may correspond to one or more items of relationship data).
Specifically, in the examples of the present application, the server may select the highest confidence level from the at least one confidence level and push associated data to the terminal according to the usage data of the first user account corresponding to it, more precisely according to the usage records in that usage data. For example, when the server is to recommend videos to the terminal, it determines the interest characteristics of the terminal from the usage records in the usage data of the one or more first user accounts corresponding to the selected confidence level, and may choose videos highly relevant to those interest characteristics as the recommended videos. Because a higher confidence level indicates a more accurate association, the server can take the usage records of the user account with the highest confidence level as the reference for the user's preferences or hobbies when recommending associated data.
In other examples, the server may instead select, from the at least one confidence level, the confidence level that corresponds to the largest number of first user accounts, and push associated data to the terminal according to the usage records in the usage data of the first user accounts corresponding to that confidence level. For example, when the server is to recommend advertisements to the terminal, it may select the confidence level shared by the most first user accounts. Suppose the obtained confidence levels include a first confidence level and a second confidence level, where the first is greater than the second; the first confidence level corresponds to one first user account (that is, one user, who is the common user of the terminal), and the second confidence level corresponds to three first user accounts (three users, who are not the common users of the terminal). In this case the second confidence level is selected, and advertisements are recommended to the terminal according to the usage records in the usage data of the first user accounts corresponding to the second confidence level. That is, user characteristics are determined from the usage records of those three users, and advertisements are pushed to the terminal according to those characteristics, so that the pushed advertisements match the larger group of users of the terminal. In this way the recommended advertisements can interest, and lead to purchases by, as many users as possible. Therefore, in the examples of the present application, the server takes the user usage records in the usage data corresponding to the confidence level shared by the most first user accounts as the reference for user preferences or hobbies when recommending associated data. FIG. 2B is a detailed flowchart of pushing associated data to the terminal according to confidence level.
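The two selection rules described above (highest confidence for video, most widely shared confidence for advertisements) might be sketched as below. The function name `select_accounts`, the content-type strings, and the tie-breaking choice are assumptions made for illustration only.

```python
from collections import defaultdict


def select_accounts(confidences, content_type):
    """Pick the accounts whose usage records drive the push, per content type.

    `confidences` maps a first user account to its confidence level.
    """
    by_conf = defaultdict(list)
    for account, conf in confidences.items():
        by_conf[conf].append(account)

    if content_type == "advertisement":
        # Rule from the advertisement example: the confidence shared by the
        # most accounts. Ties go to the higher confidence (an assumption).
        chosen = max(by_conf, key=lambda c: (len(by_conf[c]), c))
    else:
        # Rule from the video example: the highest confidence.
        chosen = max(by_conf)
    return sorted(by_conf[chosen])


# One common user (u1) and three occasional users sharing a lower confidence.
confs = {"u1": 0.9, "u2": 0.6, "u3": 0.6, "u4": 0.6}
video_targets = select_accounts(confs, "video")
ad_targets = select_accounts(confs, "advertisement")
```

With these made-up confidences, the video rule returns the single common user, while the advertisement rule returns the three accounts that share the lower confidence, mirroring the example in the text.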
It can be understood that the server can use the preset association recommendation model to compute the first score of each first user account corresponding to the terminal, and from it the confidence level corresponding to each first user account. The server can then select different confidence levels in different situations according to the preset selection rule, treat the user of the user account corresponding to the selected confidence level as the user to be recommended to, and push associated data to the terminal according to the usage records in that account's usage data. Where a terminal corresponds to multiple user accounts, the user account associated with the terminal is thus determined adaptively and associated data is recommended for the user of that account, which improves the accuracy of association-based recommendation.
Building on the implementation process of the foregoing examples, and as shown in FIG. 3, an example of the present application provides an account identification and association method in which the server calls a preset association recommendation model. The model processes the usage parameters in at least two dimensions of each first user account and outputs the first score corresponding to each first user account. The preset association recommendation model includes a first model and a second model, and the process of obtaining at least one first score may include:
S201. Call the preset first model to process the usage records in at least two dimensions of each first user account, and output at least two second scores corresponding to those usage records. The preset association recommendation model includes the preset first model, which scores the importance of the usage record in each of the at least two dimensions of each first user account.
Performing S201 comprises calling the first model to compute over the usage parameters in at least two dimensions of each first user account and outputting a second score for the usage parameter in each dimension of each first user account.
In the examples of the present application, the preset association recommendation model may consist of two parts: a preset first model that scores the importance of the usage parameter in each dimension, and a preset second model that computes a comprehensive weighted score from the second scores that the preset first model outputs for the usage parameters in each dimension.
For example, suppose the usage parameters in at least two dimensions of each first user account include the usage time, the number of uses, and the usage sources of each first user account. The server can then use the first model in the preset association recommendation model to compute an importance score (a second score) for the usage time, an importance score (a second score) for the number of uses, and an importance score (a second score) for the usage sources of each first user account.
In the examples of the present application, the preset first model may include at least two of formula (1), formula (2), and formula (3), which are:
Figure PCTCN2018072381-appb-000001
lg(1/(1+(n/30)*n))/N   (2)
lg(j/k)/J   (3)
Formula (1) determines the importance score of the usage sources, formula (2) determines the importance score of the usage time, and formula (3) determines the importance score of the number of uses. Here, m is the total number of usage sources of each first user account, h_i is the preset score corresponding to the i-th usage source of each first user account, H is the preset total number of usage sources, M is a preset first normalization parameter, and m is less than H. n is the number of days between the usage time of each first user account and the current time, and N is a preset second normalization parameter. j is the number of uses of each first user account, k is a preset time, and J is a preset third normalization parameter.
In the examples of the present application, M, N, and J are normalization parameters and are positive integers. Specifically, the value of M may be the absolute value of the upper bound of the pair of adjacent integers between which the value of
Figure PCTCN2018072381-appb-000002
falls; the value of N may be the absolute value of the upper bound of the pair of adjacent integers between which the value of lg(1/(1+(n/30)*n)) falls; and the value of J may be the absolute value of the upper bound of the pair of adjacent integers between which the value of lg(j/k) falls.
It should be noted that, in the examples of the present application, formula (1) is the model for computing the second score of the usage sources of each first user account, formula (2) is the model for computing the second score of the usage time of each first user account, and formula (3) is the model for computing the second score of the number of uses of each first user account.
Specifically, when the server scores the usage sources of each first user account, the sources can be divided into active sources and passive sources, so the value of h_i (the preset score corresponding to the i-th usage source of each first user account) differs by source. The h_i of an active source is higher than that of a passive source, and the h_i of a passive source can be assigned different values according to its computed accuracy relative to the data reported by the active source: the more closely it agrees with the active source, the higher its h_i. The more IMEI-to-user-account correspondences in the usage data obtained from a passive source (such as another platform) coincide with the IMEI-to-user-account correspondences in the usage data obtained from the active source, the closer the passive source is considered to be to the active source. For example, the h_i of an active source may be 2, and the h_i of a passive source may correspondingly lie between 1 and 1.9. The more usage sources a first user account has, the higher the usage-source score obtained from formula (1).
When the server scores the usage time of each first user account, formula (2) expresses that the larger n is (the further the usage time is from the current time), the lower the usage-time score. To tune the time decay, a first user account reported within the last 30 days receives a higher score that decays more slowly, while data reported more than 30 days ago decays faster; this is why the factor (n/30) appears in the denominator inside the logarithm in formula (2). In short, the closer the usage time of a first user account is to the current time, the higher the score. When the server scores the number of uses of each first user account, the more times the account has been used, the higher the score.
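Formulas (2) and (3) can be evaluated directly, as in the sketch below. It assumes that lg denotes the base-10 logarithm and treats the normalization parameters N and J as values supplied by the caller; the function and parameter names are hypothetical. Formula (1) is not sketched because it appears only as an image in the source.

```python
import math


def usage_time_score(days_since_use, n_norm):
    """Formula (2): lg(1 / (1 + (n/30) * n)) / N, with lg taken as log base 10."""
    n = days_since_use
    return math.log10(1.0 / (1.0 + (n / 30.0) * n)) / n_norm


def usage_count_score(use_count, preset_time, j_norm):
    """Formula (3): lg(j / k) / J, with lg taken as log base 10."""
    return math.log10(use_count / preset_time) / j_norm


# Illustrative inputs only: N = 4 and J = 2 are made-up normalization values.
recent = usage_time_score(3, 4)
month_old = usage_time_score(30, 4)
two_months_old = usage_time_score(60, 4)
frequent = usage_count_score(1000, 10, 2)
occasional = usage_count_score(100, 10, 2)
```

Under these assumptions the usage-time score is 0 for an account used today and becomes more negative as n grows, while the usage-count score grows with j, which matches the qualitative behaviour described in the text.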
Further, the examples of the present application do not limit the form of the preset first model, provided that it can produce scores for the usage records in multiple dimensions. For example, formulas (1), (2), and (3) may be transformed into formulas (4), (5), and (6), respectively, as follows:
Figure PCTCN2018072381-appb-000003
1-lg(1/(1+(n/30)*n))/N   (5)
1-lg(j/k)/J   (6)
Finally, the logarithm in formulas (1) to (6) serves to normalize the scores.
S202. Call the preset second model to process the at least two second scores and output the first score corresponding to each first user account, until at least one first score is obtained. The preset association recommendation model further includes the preset second model, which weights the at least two second scores to obtain a total score.
Performing step S202 comprises calling the second model to process the second scores of the usage records in each dimension of each first user account and outputting the first score of each first user account, where the second model weights the second scores of the usage records in each dimension of each first user account to obtain that account's first score.
The server calls the preset first model to process the usage parameters in at least two dimensions of each first user account and outputs at least two second scores corresponding to those usage parameters. Because the preset second model weights the at least two second scores to obtain a total score, the server can then call the preset second model to process the at least two second scores and output the first score corresponding to each first user account. Applying the same procedure to every account, the server obtains the first score corresponding to each of the at least one first user account.
In the examples of the present application, the at least two second scores corresponding to each first user account have different weights in the preset second model, and the server weights them to obtain the comprehensive score, that is, the first score. Of the three dimensions (usage source, usage time, and number of uses), usage time is the most important factor, so its weight is higher and the other two weights are lower. Following this principle, the preset second model in the examples of the present application can be obtained by training a training model; the specific implementation is described in the subsequent examples.
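The weighting performed by the second model can be sketched as a plain weighted sum. The weights below are hypothetical; they only follow the stated principle that usage time carries the largest weight, and the names `WEIGHTS` and `first_score` are illustrative.

```python
# Hypothetical weights: usage time is weighted highest, as the text suggests.
WEIGHTS = {"time": 0.5, "count": 0.25, "source": 0.25}


def first_score(second_scores, weights=WEIGHTS):
    """Combine per-dimension second scores into one first score (weighted sum)."""
    return sum(weights[dim] * score for dim, score in second_scores.items())


# Made-up second scores for two first user accounts on the same terminal.
account_scores = {
    "account_a": {"time": -0.2, "count": 0.4, "source": 0.8},
    "account_b": {"time": -0.9, "count": 0.6, "source": 0.8},
}
totals = {acct: first_score(s) for acct, s in account_scores.items()}
associated = max(totals, key=totals.get)
```

In this sketch the account with the highest first score is taken as the one associated with the terminal; the trained second model described later would replace the hand-picked weights.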
The preset second model in the examples of the present application can be constructed or generated with common machine-learning classification methods, for example a support vector machine, logistic regression, a decision tree, GBDT, or a neural network. In the examples of the present application, the at least two second scores corresponding to each first user account are taken as the target variables, the constructed samples are used for training, and the weight parameters are adjusted to obtain an optimal model capable of comprehensive scoring across multiple dimensions.
It can be understood that the server can use the preset association recommendation model to compute the first score of each first user account corresponding to the terminal and then obtain the confidence level corresponding to each first user account. The server can then select different confidence levels in different situations according to the preset selection rule and treat the user of the user account corresponding to the selected confidence level as the user to be recommended to. Where a terminal corresponds to multiple user accounts, the user account associated with the terminal is thus determined adaptively and associated data is recommended according to the usage data of that account, which improves the accuracy of association-based recommendation.
Building on the description of the implementation process of the foregoing examples, an example of the present application forms the preset second model by introducing machine learning, so that all feature dimensions are considered together when the first score is obtained. In the initial stage of forming the preset second model, features in as many dimensions as possible (the features of the samples) still need to be selected manually for training the machine-learning model, and which features are retained is decided by how well each feature discriminates the first training result. There is essentially no manual selection of parameters here; the machine-learning model learns suitable parameters by itself. Because features with a clear meaning are more intuitive than parameters without one, the model is also easier to interpret when considered together with the feature distributions. First, the comprehensive evaluation is produced in real time by the machine-learning model and takes the usage parameters in multiple dimensions into account, which improves the accuracy of the comprehensive score. In addition, the model can evolve through learning: even if the allowable range is updated or reduced, the preset second model can be adjusted simply by retraining (sometimes with minor tuning of the features), so that the comprehensive scoring result remains as accurate as possible.
The application of machine learning to comprehensive scoring across multiple dimensions can be freely shared and disseminated, because the machine-learning comprehensive score is comprehensive, can evolve on its own, and does not target any specific user account. The scoring approach based on the machine-learning model can therefore be disclosed even to the different user accounts of the same terminal. Building on the foregoing examples, an example of the present application provides a method of forming the preset second model. As shown in FIG. 4, the method includes:
S301. Obtain positive samples and negative samples according to a preset configuration ratio. The positive samples and the negative samples include the correspondence between each of at least two terminals and at least one second user account, and at least two third scores, obtained through the first model, for each second user account of each terminal.
Performing step S301 comprises obtaining the second user account and the third user account corresponding to each of at least one terminal device, obtaining the usage parameters in at least two dimensions of the second user account and the third user account corresponding to each terminal device, and determining, according to the first model, at least two third scores of the second user account and at least two third scores of the third user account corresponding to each terminal device.
Multiple terminals are selected for the training samples, and each terminal corresponds to one positive sample and one or more negative samples. The positive sample is the user account currently in use on the terminal; for example, the positive sample is the second user account and the negative samples are third user accounts. The usage parameters in at least two dimensions of the second user account are determined from its usage data, and from them at least two third scores of the second user account, where each third score corresponds to the usage parameter in one dimension. At least two third scores of each third user account are determined in the same way. A negative sample is an account that has been used on the terminal but is not currently in use. In practice the positive and negative samples occur in a certain ratio, which is the configuration ratio. When the preset second model is formed, the training data configured by the server (samples of existing user behavior and the corresponding comprehensive scoring results) must also follow this configuration ratio. The positive sample and the negative samples are user accounts corresponding to the first terminal.
S302. Call the configured training model to process the positive samples or negative samples to obtain a first training result.
Performing step S302 comprises the following steps:
presetting the parameters of the second model;
performing the following processing for each terminal device of the at least one terminal device:
determining the first score of the second user account from the at least two third scores of the second user account corresponding to the terminal device, and determining the first score of the third user account from the at least two third scores of the third user account corresponding to the terminal device;
determining the user account associated with the terminal device according to the first score of the second user account and the first score of the third user account.
It should be noted that the process by which the server in the examples of the present application obtains, through the preset first model, at least two third scores for each second user account and for each third user account among the positive and negative samples follows the same principle as obtaining the at least two second scores corresponding to each first user account.
It can be understood that the more complete the allowable range covered by the positive and negative samples in the examples of the present application, the more accurate the subsequent comprehensive scoring.
When the second model is determined through machine learning, the parameters of the second model are first preset. The second model computes a weighted sum of the third scores in each dimension to obtain the first score, and its parameters are the weights corresponding to the scores in each dimension. The weights are first preset; then, for each sample among the positive sample (the second user account) and the negative samples (the third user accounts) of each terminal, the first score of the sample is determined using the preset parameters of the second model, and the user account associated with the terminal is determined from the user accounts corresponding to that terminal. For example, the user account with the highest first score is determined to be the user account associated with the terminal. When the user account associated with the terminal is the second user account (the positive sample), the association determined by the second model is correct. In this way the account corresponding to each of the at least one terminal device can be determined: when the corresponding account is the positive sample, the evaluation is correct; otherwise, the evaluation is wrong.
S303、持续检测训练模型,直至第一训练结果满足预设条件,并将该第一训练结果满足预设条件的所述训练模型作为预设第二模型,该预设条件用于表征根据该预设第二模型得到的数据输出结果运用于确定终端最常用用户账号时,最接近该终端的真实常用用户账号。S303. Continuously detecting the training model until the first training result satisfies the preset condition, and using the training model that meets the preset condition as the preset second training model, the preset condition is used to represent the pre-condition The data output result obtained by the second model is used to determine the most common user account of the terminal, and the real common user account closest to the terminal.
Performing step S303 includes the following steps:
determining the accuracy of the second model according to the user account associated with each terminal device in the at least one terminal device; and
adjusting the parameters of the second model until the accuracy of the second model satisfies the preset condition.
The terminal devices that are evaluated correctly are identified according to the user account corresponding to each terminal device, and the accuracy of the second model is determined from them. For example, the proportion of correctly evaluated terminal devices among all terminal devices may be taken as the accuracy of the second model. The parameters of the second model are then adjusted, that is, the weight of each second score is adjusted, and step S302 is repeated until the accuracy of the second model reaches its maximum; the parameters of the second model at that point are optimal.
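The accuracy measure and the repeated weight adjustment described above can be sketched as follows. This is a minimal illustration only: the function names, the three-dimension sample layout, and the exhaustive grid search are assumptions made for the sketch, since the text does not say how the weights are adjusted between iterations.

```python
from itertools import product


def first_score(scores, weights):
    """Weighted sum of per-dimension scores (the role of the second model)."""
    return sum(w * s for w, s in zip(weights, scores))


def accuracy(terminals, weights):
    """Share of terminals whose top-scoring account is the positive sample.

    `terminals` is a list with one entry per terminal; each entry is a list
    of (is_positive, (r, f, m)) tuples. This layout is hypothetical.
    """
    correct = 0
    for samples in terminals:
        best = max(samples, key=lambda s: first_score(s[1], weights))
        correct += best[0]  # True counts as 1, False as 0
    return correct / len(terminals)


def search_weights(terminals, step=0.1):
    """Try every weight triple on a grid that sums to 1 (assumed strategy)."""
    n = round(1 / step)
    best_w, best_acc = None, -1.0
    for a, b in product(range(n + 1), repeat=2):
        if a + b > n:
            continue
        w = (a * step, b * step, (n - a - b) * step)
        acc = accuracy(terminals, w)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc
```

A grid search is used here only because it is easy to verify; any procedure that re-evaluates the accuracy after each weight change would fit the description equally well.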
In the examples of the present application, whichever training model is used, the input of the training model at the start of training includes the features of the at least two dimensions described above. If, over many trials, a feature has no favorable influence on the first training result or leads to errors, the weight of that feature or data is reduced; if a feature has a favorable influence on the first training result, the weight of that feature or data is increased. If the weight of a parameter is reduced to 0, that feature no longer has any effect in the training model. In the final experiments of the examples of the present application, the features of the different dimensions that ultimately had a positive influence on the first training result were the long-term features (that is, the at least two third scores in the examples of the present application). Assume below that the features of the different dimensions include only the at least two third scores corresponding to the usage parameters of the at least two dimensions (that is, all other unsuitable features have already been eliminated). The formation of the preset second model then roughly includes: inputting the at least two third scores corresponding to the usage parameters of the at least two dimensions of a positive or negative sample into the training model (that is, invoking the training model), and obtaining the first training result from the training model. The training model is constructed from the at least two third scores, each of which has a corresponding weight (preset priority). The first training result is monitored continuously until the preset condition is satisfied (that is, when the accuracy of the second model reaches its maximum), and the training model is then taken as the preset second model. The first training result is the determined user account associated with each terminal.
The preset condition in the examples of the present application may be that the accuracy of the comprehensive result reaches a preset threshold, which may be, for example, 99%. The specific preset threshold is configurable and is not limited by the examples of the present application; however, the higher the preset threshold, the more accurate the comprehensive score of a preset second model that reaches the threshold or the preset condition.
For example, FIG. 5 shows chart 1 of the accuracy of the relationship between terminals and user accounts. In obtaining the preset second model, the server uses the confidence accuracy to express the accuracy of the comprehensive result, with a preset threshold of 99%. Take as an example the case where the RFM model is used to obtain usage parameters of three dimensions, namely the usage time (R), the usage count (F), and the usage source (M) of each first user account, as the usage parameters of the at least two dimensions. FIG. 5 shows that when the weight of the usage time of each first user account, that is, the weight of R, is 0.7, the confidence accuracy satisfies the preset condition. Therefore, in the preset second model trained by the server, the weight corresponding to the usage time of each first user account is 0.7, and the weights of the usage count and the usage source of each first user account sum to 0.3. Specifically, the weight corresponding to the usage count may be 0.2 and the weight corresponding to the usage source may be 0.1.
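As a worked illustration of the weights in this example, the sketch below applies 0.7, 0.2, and 0.1 to the R, F, and M second scores of two accounts on one terminal. The account names and the second-score values are made up for the illustration; only the three weights come from the text.

```python
# Weights from the FIG. 5 example: usage time (R), usage count (F), usage source (M).
WEIGHTS = {"R": 0.7, "F": 0.2, "M": 0.1}


def weighted_total(second_scores):
    """First score of one account: weighted sum of its per-dimension second scores."""
    return sum(WEIGHTS[dim] * second_scores[dim] for dim in WEIGHTS)


def associated_account(accounts):
    """Account with the highest first score on a terminal."""
    return max(accounts, key=lambda name: weighted_total(accounts[name]))


# Hypothetical second scores for two accounts seen on the same terminal.
accounts = {
    "account_a": {"R": 0.9, "F": 0.3, "M": 0.5},
    "account_b": {"R": 0.4, "F": 0.9, "M": 0.9},
}
```

With these made-up values, `account_a` scores 0.63 + 0.06 + 0.05 = 0.74 and `account_b` scores 0.28 + 0.18 + 0.09 = 0.55, so the recently used account is selected even though the other account has the higher usage count and source scores.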
Further, by way of example, chart 2 of the terminal-to-user-account relationship accuracy in FIG. 6 shows that when the server uses a single dimension (R, F, or M), the confidence accuracy (76.5%) is lower than the confidence accuracy (88.20%) achieved with the weighted total score of the three dimensions of usage data obtained with the RFM model. FIG. 5 shows that when the server uses a single dimension (R, F, or M), or scores the multi-dimensional second scores directly without weights (the RFM total score), the confidence accuracy is lower than that achieved with the weighted total score of the three dimensions of usage records obtained with the RFM model; the confidence accuracy is highest when the weight of R is 0.7, reaching 99%.
As the above flow shows: 1) The examples of the present application adopt a comprehensive scoring approach based on the preset second model. For each first user account in the relationship data between the terminal and the at least one first user account, the at least two second scores are combined into a comprehensive score over multiple dimensions. This makes full use of the usage parameters of the multiple dimensions corresponding to each first user account on the terminal to obtain the preset second model, and effectively yields an indicator of how trustworthy the use of each first user account on the terminal is, so that each first user account on the relevant terminal can be evaluated. 2) The examples of the present application use the usage parameters of the different dimensions in the usage data to train the training model, determine the weight of the second score of each dimension in the second model according to the first training result, and then determine the comprehensive score of each first user account according to the second model, which improves the accuracy of the comprehensive score of the user account. 3) A notable feature of the preset second model is that it can evolve on its own: the weights are adjusted automatically as the usage records of the at least two dimensions change, which avoids frequent manual, rule-based parameter adjustment.
It can be understood that, instead of the various complex behavior data used in existing approaches, the examples of the present application use the usage data of the terminal and, according to that usage data, determine the usage parameters of multiple dimensions of each first user account corresponding to the terminal as the main data source. Both the scoring process and the model construction process are simple; there is no need for complex encoding, clustering, or filtering to construct and process features, which greatly reduces the data processing workload and makes the preset second model simple and usable.
Further, as shown in FIG. 7, after S303, in the method for determining an associated account provided by the examples of the present application, the server can obtain the correspondence between preset scores and confidence levels, which may specifically include:
S304. Invoke the preset second model to process the positive samples and negative samples to obtain a second training result.
After the server obtains the specific preset second model, that is, after the weight of each second score is determined, the server may input the positive samples and negative samples into the preset second model (that is, invoke the preset second model) to obtain the second training result. The second training result is the first score corresponding to each first user account.
It should be noted that the second training result in the examples of the present application is the comprehensive score of each sample, obtained on the basis of the highest confidence accuracy.
S305. Based on the second training result and the correspondence between preset samples and confidence accuracy, obtain the confidence accuracy corresponding to the second training result.
After the server inputs the positive and negative samples into the preset second model and obtains the second training result, the server knows the correspondence between the second training result and each sample (each of the positive and negative samples). In the examples of the present application, the server also stores a correspondence between preset samples and confidence accuracy. This correspondence is set on the principle that the higher the second score of the usage time in a sample, the higher the corresponding confidence accuracy. In other words, the closer the usage time of a user account in the terminal-to-user-account relationship data is to the current time, the higher the confidence accuracy of the comprehensive score computed by the server; formula (2) or formula (5) computes the usage time score according to this principle. The server therefore matches each sample against the correspondence between preset samples and confidence accuracy to obtain the confidence accuracy of each sample, and then, from the correspondence between the second training result and each sample, obtains the first confidence accuracy corresponding to the second training result.
S306. Take the correspondence between the second training result and the first confidence accuracy as the correspondence between preset scores and confidence levels.
After the server obtains the first confidence accuracy corresponding to the second training result, a higher first score obtained through the preset second model indicates a higher relationship accuracy for the corresponding entry in the relationship data between the terminal and the at least one first user account. In other words, the first user account with the highest first score is the most commonly used user account in that relationship data. The correspondence between the second training result and the first confidence accuracy can thus represent the accuracy of the relationship between the terminal and the at least one first user account for a given first score. The server can therefore take this correspondence as the correspondence between preset scores and confidence levels, and use it to determine the user of the user account whose confidence level the preset selection rule is meant to select.
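One simple way to hold a correspondence between preset scores and confidence levels is a banded lookup table, sketched below. The band boundaries, the confidence values, and the score scale are all hypothetical placeholders: the actual correspondence is derived from the second training result and is not reproduced here.

```python
from bisect import bisect_right

# Hypothetical lower bounds of score bands and the confidence level of each band.
# These numbers are placeholders, not values from the patent.
SCORE_LOWER_BOUNDS = [0.0, 0.3, 0.6, 0.8]
CONFIDENCE = [0.50, 0.75, 0.90, 0.99]


def confidence_for(score):
    """Look up the confidence level of the band that contains `score`."""
    idx = bisect_right(SCORE_LOWER_BOUNDS, score) - 1
    # Scores below the first bound fall back to the lowest band.
    return CONFIDENCE[max(idx, 0)]
```

The table is monotonic by construction, which matches the stated property that a higher first score corresponds to a higher confidence level.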
For example, taking the three RFM dimensions, the second training result is obtained through formula (1), formula (2), formula (3), and the preset second model. Table 1 summarizes the correspondence among the second training result (weighted total score), the confidence level, and the preset samples.
Figure PCTCN2018072381-appb-000004
Table 1
Table 1 shows that the higher the weighted total score, the higher the confidence level, and the confidence level represents the accuracy of the relationship between the terminal and the at least one first user account. Through the preset selection rule, the server can therefore choose adaptively between pushing associated data based on the confidence level of the most commonly used user on the terminal and pushing it based on the confidence level shared by the most users.
It can be understood that the server can compute, through the preset association recommendation model, the first score of each first user account corresponding to the terminal, and then obtain the confidence level corresponding to each first user account. The server can thus select a confidence level suited to the situation according to the preset selection rule, and push associated data to the terminal according to the usage data of the user account corresponding to the selected confidence level. When the user account used on the terminal changes, the account associated with the terminal is determined adaptively, and associated data is recommended for the user of that account, which improves the accuracy of association recommendation.
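The two selection rules mentioned in this part of the description (the highest confidence level, or the confidence level shared by the most user accounts) can be sketched as follows. The function names and the dictionary layout are assumptions for the illustration; the text does not specify how the rule is represented on the server.

```python
from collections import Counter


def pick_highest(conf_by_account):
    """Rule 1: the account whose confidence level is highest."""
    return max(conf_by_account, key=conf_by_account.get)


def pick_most_shared(conf_by_account):
    """Rule 2: the confidence level held by the most accounts, with those accounts."""
    level, _count = Counter(conf_by_account.values()).most_common(1)[0]
    accounts = [a for a, c in conf_by_account.items() if c == level]
    return level, accounts


# Hypothetical confidence levels for three accounts on one terminal.
conf = {"account_a": 0.99, "account_b": 0.80, "account_c": 0.80}
```

Under rule 1 the sample data selects `account_a`; under rule 2 it selects the level 0.80 together with `account_b` and `account_c`. Which rule applies would depend on the type of associated data being pushed.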
As shown in FIG. 8, the examples of the present application provide a server 1, which may include:
An obtaining unit 10, configured to obtain a first user history record, where the first user history record includes relationship data between the terminal and at least one corresponding first user account, and usage records of at least two dimensions for each first user account in the relationship data.
A calling unit 11, configured to invoke a preset association recommendation model. The preset association recommendation model processes the usage records of the at least two dimensions of each first user account and outputs a first score corresponding to each first user account, yielding at least one first score, where the at least one first user account corresponds to the at least one first score respectively.
The obtaining unit 10 is further configured to: obtain, based on the at least one first score and the correspondence between preset scores and confidence levels, at least one confidence level corresponding to the at least one first score; obtain, from the at least one confidence level, a first confidence level that satisfies a preset selection rule; and push associated data to the terminal according to the usage records of the at least two dimensions of the first user account corresponding to the first confidence level. The preset selection rule is determined by the type of associated data actually pushed.
The usage records of the at least two dimensions of each first user account obtained by the obtaining unit 10 include at least two of: the usage time, the usage count, and the usage source of each first user account.
The calling unit 11 is specifically configured to: invoke a preset first model to process the usage records of the at least two dimensions of each first user account and output at least two second scores corresponding to those usage records, where the preset association recommendation model includes the preset first model, which scores the importance of each of the usage records of the at least two dimensions of each first user account; and invoke the preset second model to process the at least two second scores and output the first score corresponding to each first user account, until the at least one first score is obtained, where the preset association recommendation model further includes the preset second model, which weights the at least two second scores to obtain a total score.
Based on FIG. 8, as shown in FIG. 9, the server 1 further includes a detecting unit 12.
The obtaining unit 10 is further configured to obtain positive samples and negative samples according to a preset ratio, where the positive samples and negative samples are correspondences between a first terminal and at least one second user account, together with at least two third scores obtained for each second user account through the preset first model.
The calling unit 11 is further configured to invoke the configured training model to process the positive samples or the negative samples to obtain a first training result.
The detecting unit 12 is configured to continuously monitor the training model until the first training result satisfies a preset condition.
The obtaining unit 10 is further configured to take the training model whose first training result satisfies the preset condition as the preset second model. The preset condition indicates that, when the data output by the preset second model is used to determine the most commonly used user account of the terminal, the result is closest to the terminal's real commonly used user account.
The calling unit 11 is further configured to, after the training model whose first training result satisfies the preset condition is taken as the preset second model, invoke the preset second model to process the positive samples and the negative samples to obtain a second training result.
The obtaining unit 10 is further configured to: obtain, according to the second training result and the correspondence between preset samples and confidence accuracy, a first confidence accuracy corresponding to the second training result; and take the correspondence between the second training result and the first confidence accuracy as the correspondence between preset scores and confidence levels.
The preset first model includes at least two of the following:
Figure PCTCN2018072381-appb-000005
lg(1/(1+(n/30)*n))/N, and lg(j/k)/J;
where m is the total number of usage sources of each first user account, h_i is the preset score corresponding to the i-th usage source of each first user account, H is the preset total number of usage sources, M is a preset first normalization parameter, and m is less than H; n is the number of days between the usage time of each first user account and the current time, and N is a preset second normalization parameter; j is the usage count of each first user account, k is a preset time, and J is a preset third normalization parameter.
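The two formulas that are given in text form, for the usage time and the usage count, can be evaluated as in the sketch below. It assumes that lg denotes the base-10 logarithm, and the function and argument names are chosen for the illustration. The usage source formula is only available as a figure, so it is not sketched here.

```python
import math


def usage_time_score(n_days, N):
    """lg(1/(1+(n/30)*n))/N, with n the days since the account was last used.

    Assumes lg is the base-10 logarithm; N is the preset second
    normalization parameter.
    """
    return math.log10(1 / (1 + (n_days / 30) * n_days)) / N


def usage_count_score(j_uses, k, J):
    """lg(j/k)/J, with j the usage count and k the preset time.

    Requires j_uses > 0 and k > 0, since the logarithm of zero is undefined.
    """
    return math.log10(j_uses / k) / J
```

For an account used today (n = 0) the argument of the logarithm is 1, so the unnormalized usage time term is 0, and the term decreases as n grows; this agrees with the stated principle that more recent use should score higher, provided N is positive.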
The obtaining unit 10 is specifically configured to obtain, from the at least one confidence level, the first confidence level with the highest confidence.
The obtaining unit 10 is specifically configured to obtain, from the at least one confidence level, the first confidence level that corresponds to the largest number of the at least one first user account.
The at least one first user account obtained by the obtaining unit 10 includes: at least one communication account corresponding to the terminal, or at least one login account of a first application on the terminal.
As shown in FIG. 10, the present application further provides a server including a processor 14 and a storage medium 15, where the storage medium 15 is connected to the processor 14 through a system bus 16. The processor 14 may be implemented as a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like. The storage medium 15 stores executable program code, which includes computer operation instructions. The storage medium 15 may include high-speed RAM and may also include non-volatile memory, for example, at least one disk storage. The program code stored in the memory is configured to be executed by the processor to implement the method for determining an associated account described in the present application and the functions of the modules of the server described in the present application.
Those skilled in the art will appreciate that the examples of the present application may be provided as a method, a system, or a computer program product. Accordingly, the examples of the present application may take the form of a hardware example, a software example, or an example combining software and hardware. Moreover, the examples of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) that contain computer-usable program code.
The examples of the present application are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the examples of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are merely preferred examples of the present application and are not intended to limit the scope of protection of the examples of the present application.

Claims (19)

  1. A method for determining an associated account, applied to a server, the method comprising:
    obtaining, from a terminal device, usage data generated when one or more users use the terminal device through their respective first user accounts, the usage data of each first user account comprising: an identifier of the terminal device, the first user account, and a usage record of the first user account;
    determining usage parameters of at least two dimensions of each first user account according to the usage record corresponding to that first user account;
    calculating a first score corresponding to each first user account by using the usage parameters of the at least two dimensions of each first user account and a preset association recommendation model; and
    determining, according to the first score corresponding to each first user account, a first account associated with the terminal device.
  2. The method according to claim 1, wherein the method further comprises:
    pushing associated data to the terminal device according to the usage record corresponding to the first user account associated with the terminal device.
  3. The method according to claim 1, wherein the method further comprises:
    determining a confidence level corresponding to each first user account according to a correspondence between preset scores and confidence levels and the first score corresponding to each first user account; and
    obtaining, from the confidence levels corresponding to the first user accounts, a confidence level that satisfies a preset selection rule, and pushing associated data to the terminal device according to the usage record corresponding to the first user account that corresponds to the confidence level satisfying the preset selection rule.
  4. The method according to claim 1, wherein
    the usage parameters of the at least two dimensions of each first user account comprise at least two of the following three types of data: the usage time of each first user account, the usage count of each first user account, and the usage source of each first user account.
  5. The method according to claim 1, wherein the preset association recommendation model comprises a first model and a second model, and wherein calculating the first score corresponding to each first user account by using the usage parameters of the at least two dimensions of each first user account and the preset association recommendation model comprises:
    invoking the first model to process the usage parameters of the at least two dimensions of each first user account, and outputting a second score for the usage parameter of each dimension of each first user account; and
    invoking the second model to weight the second scores of the usage parameters of the dimensions of each first user account, to obtain the first score of each first user account.
  6. The method according to claim 5, wherein the usage parameters of the at least two dimensions of each first user account comprise: the usage time of each first user account, the usage count of each first user account, and the usage source of each first user account;
    wherein invoking the first model to process the usage parameters of the at least two dimensions of each first user account, and outputting the second score for the usage parameter of each dimension of each first user account, comprises:
    determining the second score of the usage source of each first user account by the following formula (1):
    Figure PCTCN2018072381-appb-100001
    Figure PCTCN2018072381-appb-100001
    通过以下公式(2)确定各第一用户账号的使用时间的第二评分;Determining a second rating of the usage time of each first user account by the following formula (2);
    lg(1/(1+(n/30)*n))/N    (2)Lg(1/(1+(n/30)*n))/N (2)
    通过以下公式(3)确定各第一用户账号的使用次数的第二评分;Determining a second score of the number of uses of each first user account by the following formula (3);
    lg(j/k)/J    (3)Lg(j/k)/J (3)
    其中,m为每个第一用户账号的使用来源的总个数,hi为每个第一用户账号的第i个使用来源对应的预设分数,H为预设使用来源总数,M为预设第一归一化参数,并且m小于H;n为每个第一用户账号的使用时间距当前时间的天数,N为预设第二归一化参数;j为每个第一用户账号的使用次数,k为预设时间,J为预设第三归一化参数。Where m is the total number of sources used by each first user account, hi is the preset score corresponding to the ith source of each first user account, H is the total number of preset usage sources, and M is a preset The first normalization parameter, and m is less than H; n is the number of days from the current time of each first user account, N is a preset second normalization parameter; j is the use of each first user account The number of times, k is the preset time, and J is the preset third normalization parameter.
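Formulas (2) and (3) can be evaluated directly, as in the following Python sketch. It assumes that `lg` denotes the base-10 logarithm; the function names and the input checks are illustrative and not part of the claim. Formula (1) is omitted because it survives only as an image reference.

```python
import math


def usage_time_score(n_days: float, N: float) -> float:
    """Formula (2): lg(1 / (1 + (n / 30) * n)) / N.

    n_days is the number of days between the account's usage time and now;
    N is the preset second normalization parameter.
    """
    if n_days < 0:
        raise ValueError("n_days must be non-negative")
    return math.log10(1.0 / (1.0 + (n_days / 30.0) * n_days)) / N


def usage_count_score(j: float, k: float, J: float) -> float:
    """Formula (3): lg(j / k) / J.

    j is the usage count, k the preset time, J the preset third
    normalization parameter. The logarithm needs j / k > 0.
    """
    if j <= 0 or k <= 0:
        raise ValueError("j and k must be positive")
    return math.log10(j / k) / J


if __name__ == "__main__":
    print(usage_time_score(0, 1.0))    # used today
    print(usage_time_score(30, 1.0))   # used 30 days ago
    print(usage_count_score(100, 10, 2.0))
```

For a positive N, formula (2) is zero when n = 0 and becomes more negative as n grows, so more recent usage yields a higher second score.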
  7. The method according to claim 5, further comprising:
    obtaining a second user account and a third user account corresponding to each terminal device of at least one terminal device, and obtaining usage parameters of at least two dimensions of the second user account and the third user account corresponding to each terminal device;
    determining, according to the first model, at least two third scores of the second user account and at least two third scores of the third user account corresponding to each terminal device;
    presetting parameters of the second model;
    performing the following processing for each terminal device of the at least one terminal device:
    determining the first score of the second user account according to the at least two third scores of the second user account corresponding to the terminal device, and determining the first score of the third user account according to the at least two third scores of the third user account corresponding to the terminal device; and
    determining, according to the first score of the second user account and the first score of the third user account, the second user account or the third user account as the user account associated with the terminal device;
    the method further comprising:
    determining an accuracy rate of the second model according to the user account associated with each terminal device of the at least one terminal device; and
    adjusting the parameters of the second model until the accuracy rate of the second model satisfies a preset condition.
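The tuning loop of claim 7 might look like the following Python sketch. Several points are assumptions not stated in the claim: the second model is a weighted sum, each terminal device has a known correct account (labelled "second" or "third") against which accuracy is measured, the preset condition is a minimum accuracy, and parameters are adjusted by trying a list of candidate weight vectors.

```python
from typing import List, Optional, Sequence, Tuple

# One sample per terminal device: third scores of the second account,
# third scores of the third account, and the (assumed known) correct label.
Sample = Tuple[Sequence[float], Sequence[float], str]


def weighted(scores: Sequence[float], weights: Sequence[float]) -> float:
    """First score as a weighted sum of per-dimension scores (assumption)."""
    return sum(w * s for w, s in zip(weights, scores))


def accuracy(samples: List[Sample], weights: Sequence[float]) -> float:
    """Fraction of devices whose predicted associated account matches the label."""
    correct = 0
    for second_scores, third_scores, label in samples:
        if weighted(second_scores, weights) >= weighted(third_scores, weights):
            predicted = "second"
        else:
            predicted = "third"
        correct += predicted == label
    return correct / len(samples)


def tune_weights(
    samples: List[Sample],
    candidates: List[Sequence[float]],
    target: float,
) -> Optional[Sequence[float]]:
    """Return the first candidate whose accuracy reaches the target, else None."""
    for weights in candidates:
        if accuracy(samples, weights) >= target:
            return weights
    return None


if __name__ == "__main__":
    samples = [
        ((0.9, 0.1), (0.2, 0.3), "second"),
        ((0.6, 0.1), (0.5, 0.9), "third"),
    ]
    print(tune_weights(samples, [(1.0, 0.0), (0.5, 0.5)], target=0.9))
```

A candidate list keeps the sketch short; any search procedure that stops once the accuracy condition holds would fit the same description.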
  8. The method according to claim 3, wherein obtaining, from the confidence levels corresponding to the first user accounts, the confidence level that satisfies the preset selection rule comprises:
    obtaining the highest confidence level from the confidence levels corresponding to the first user accounts.
  9. The method according to claim 3, wherein obtaining, from the confidence levels corresponding to the first user accounts, the confidence level that satisfies the preset selection rule comprises:
    obtaining, from the confidence levels corresponding to the first user accounts, the confidence level to which the largest number of first user accounts correspond.
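The two selection rules in claims 8 and 9 can be contrasted in a short Python sketch. The score bands in `BANDS`, the integer confidence levels, and all function names are hypothetical; the claims only require some preset correspondence between scores and confidence levels.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical preset correspondence: (minimum first score, confidence level),
# listed from the highest band to the lowest.
BANDS: List[Tuple[float, int]] = [(0.8, 3), (0.5, 2), (0.0, 1)]


def confidence_for(score: float, bands: List[Tuple[float, int]] = BANDS) -> int:
    """Map a first score to a confidence level; 0 if below every band."""
    for minimum, level in bands:
        if score >= minimum:
            return level
    return 0


def highest_confidence(levels: Dict[str, int]) -> int:
    """Claim 8 rule: the highest confidence level among the accounts."""
    return max(levels.values())


def most_common_confidence(levels: Dict[str, int]) -> int:
    """Claim 9 rule: the confidence level shared by the most accounts."""
    return Counter(levels.values()).most_common(1)[0][0]


if __name__ == "__main__":
    first_scores = {"a": 0.9, "b": 0.6, "c": 0.55, "d": -0.2}
    levels = {acct: confidence_for(s) for acct, s in first_scores.items()}
    print(levels, highest_confidence(levels), most_common_confidence(levels))
```

With the sample scores above, the two rules select different confidence levels, which shows why the claims treat them as separate alternatives.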
  10. A server, comprising one or more processors and one or more memories, the one or more memories comprising computer-readable instructions configured to be executed by the one or more processors to implement:
    obtaining, from a terminal device, usage data generated when one or more users use the terminal device through their respective first user accounts, the usage data of each first user account comprising: an identifier of the terminal device, the first user account, and a usage record of the first user account;
    determining usage parameters of at least two dimensions of each first user account according to the usage record corresponding to each first user account;
    calculating a first score corresponding to each first user account by using the usage parameters of the at least two dimensions of each first user account and a preset association recommendation model; and
    determining, according to the first score corresponding to each first user account, a first account associated with the terminal device.
  11. The server according to claim 10, wherein the one or more processors execute the computer-readable instructions to implement:
    pushing associated data to the terminal device according to the usage record corresponding to the first user account associated with the terminal device.
  12. The server according to claim 10, wherein the one or more processors execute the computer-readable instructions to implement:
    determining a confidence level corresponding to each first user account according to a preset correspondence between scores and confidence levels and the first score corresponding to each first user account; and
    obtaining, from the confidence levels corresponding to the first user accounts, a confidence level that satisfies a preset selection rule, and pushing associated data to the terminal device according to the usage record of the first user account corresponding to the confidence level that satisfies the preset selection rule.
  13. The server according to claim 10, wherein the one or more processors execute the computer-readable instructions such that:
    the usage records of the at least two dimensions of each first user account comprise at least two of the following three types of data: the usage time of each first user account, the usage count of each first user account, and the usage source of each first user account.
  14. The server according to claim 10, wherein the preset association recommendation model comprises a first model and a second model, and the one or more processors execute the computer-readable instructions to implement:
    invoking the first model to process the usage parameters of the at least two dimensions of each first user account, and outputting a second score for the usage parameter of each dimension of each first user account; and
    invoking the second model to weight the second scores of the usage parameters of the dimensions of each first user account, to obtain the first score of each first user account.
  15. The server according to claim 14, wherein the usage parameters of the at least two dimensions of each first user account comprise: the usage time of each first user account, the usage count of each first user account, and the usage source of each first user account;
    wherein the one or more processors execute the computer-readable instructions to implement:
    determining the second score of the usage source of each first user account by the following formula (1):
    [Formula (1): rendered as image PCTCN2018072381-appb-100002 in the original publication; not recoverable from the text]
    determining the second score of the usage time of each first user account by the following formula (2):
    lg(1/(1+(n/30)*n))/N    (2)
    determining the second score of the usage count of each first user account by the following formula (3):
    lg(j/k)/J    (3)
    where m is the total number of usage sources of each first user account, h_i is the preset score corresponding to the i-th usage source of each first user account, H is the preset total number of usage sources, M is a preset first normalization parameter, and m is less than H; n is the number of days between the usage time of each first user account and the current time, and N is a preset second normalization parameter; j is the usage count of each first user account, k is a preset time, and J is a preset third normalization parameter.
  16. The server according to claim 14, wherein the one or more processors execute the computer-readable instructions to implement:
    obtaining a second user account and a third user account corresponding to each terminal device of at least one terminal device, and obtaining usage parameters of at least two dimensions of the second user account and the third user account corresponding to each terminal device;
    determining, according to the first model, at least two third scores of the second user account and at least two third scores of the third user account corresponding to each terminal device;
    presetting parameters of the second model;
    performing the following processing for each terminal device of the at least one terminal device:
    determining the first score of the second user account according to the at least two third scores of the second user account corresponding to the terminal device, and determining the first score of the third user account according to the at least two third scores of the third user account corresponding to the terminal device; and
    determining the user account associated with the terminal device according to the first score of the second user account and the first score of the third user account;
    wherein the one or more processors further execute the computer-readable instructions to implement:
    determining an accuracy rate of the second model according to the user account associated with each terminal device of the at least one terminal device; and
    adjusting the parameters of the second model until the accuracy rate of the second model satisfies a preset condition.
  17. The server according to claim 12, wherein the one or more processors execute the computer-readable instructions to implement:
    obtaining the highest confidence level from the confidence levels corresponding to the first user accounts.
  18. The server according to claim 12, wherein the one or more processors execute the computer-readable instructions to implement:
    obtaining, from the confidence levels corresponding to the first user accounts, the confidence level to which the largest number of first user accounts correspond.
  19. A non-volatile computer-readable storage medium storing computer-readable instructions that cause at least one processor to perform the method according to any one of claims 1 to 9.
PCT/CN2018/072381 2017-01-16 2018-01-12 Method for determining associated account, server and storage medium WO2018130201A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710032683.2 2017-01-16
CN201710032683.2A CN108322317B (en) 2017-01-16 2017-01-16 Account identification association method and server

Publications (1)

Publication Number Publication Date
WO2018130201A1 (en) 2018-07-19

Family

ID=62839658

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/072381 WO2018130201A1 (en) 2017-01-16 2018-01-12 Method for determining associated account, server and storage medium

Country Status (2)

Country Link
CN (1) CN108322317B (en)
WO (1) WO2018130201A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177670A (en) * 2019-12-17 2020-05-19 腾讯云计算(北京)有限责任公司 Heterogeneous account association method, device, equipment and storage medium
CN111768263A (en) * 2020-03-31 2020-10-13 北京京东尚科信息技术有限公司 Information pushing method and device, server and storage medium
CN113760939A (en) * 2020-07-01 2021-12-07 北京沃东天骏信息技术有限公司 Account type determination method, device and equipment
CN115374370A (en) * 2022-10-26 2022-11-22 小米汽车科技有限公司 Content pushing method and device based on multiple models and electronic equipment
CN115730251A (en) * 2022-12-06 2023-03-03 贝壳找房(北京)科技有限公司 Relationship recognition method

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111385658B (en) * 2018-12-28 2022-07-29 深圳Tcl新技术有限公司 Control method and control system for account information synchronization among multiple devices
CN109902921A (en) * 2019-01-17 2019-06-18 平安城市建设科技(深圳)有限公司 Management method, device, equipment and the storage medium of user's growth system
CN110941769B (en) * 2019-11-19 2023-03-28 腾讯科技(深圳)有限公司 Target account determination method and device and electronic device
CN112100505B (en) * 2020-11-04 2022-02-08 腾讯科技(深圳)有限公司 Content pushing method and device, computer equipment and storage medium
CN112541015B (en) * 2020-11-26 2023-05-16 杭州数跑科技有限公司 Anonymous user identification method and device and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103647800A (en) * 2013-11-19 2014-03-19 乐视致新电子科技(天津)有限公司 Method and system of recommending application resources
CN104104660A (en) * 2013-04-07 2014-10-15 中国移动通信集团浙江有限公司 Method of acquiring user data and system
CN105007184A (en) * 2015-07-22 2015-10-28 胡东雁 Acquisition method for user behavior habits
US20160306815A1 (en) * 2015-04-16 2016-10-20 Comcast Cable Communications, Llc Methods And Systems For Providing Persistent Storage
CN106056444A (en) * 2016-05-25 2016-10-26 腾讯科技(深圳)有限公司 Data processing method and device
CN106202190A (en) * 2016-06-27 2016-12-07 乐视控股(北京)有限公司 A kind of browser account information storage method and mobile terminal

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101616101B (en) * 2008-06-26 2012-01-18 阿里巴巴集团控股有限公司 Method and device for filtering user information
CN104780193B (en) * 2014-01-15 2016-11-09 腾讯科技(深圳)有限公司 Information-pushing method, device and system
CN105227429B (en) * 2014-06-25 2019-10-18 腾讯科技(深圳)有限公司 A kind of information-pushing method and device
CN104811758B (en) * 2015-03-30 2018-09-04 腾讯科技(北京)有限公司 Programme providing method and device
CN105262794B (en) * 2015-09-17 2018-08-17 腾讯科技(深圳)有限公司 Content put-on method and device
CN105681835B (en) * 2016-02-26 2019-11-19 腾讯科技(深圳)有限公司 A kind of method and server of information push
CN106027380B (en) * 2016-07-28 2019-01-11 宇龙计算机通信科技(深圳)有限公司 A kind of information push method and device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104104660A (en) * 2013-04-07 2014-10-15 中国移动通信集团浙江有限公司 Method of acquiring user data and system
CN103647800A (en) * 2013-11-19 2014-03-19 乐视致新电子科技(天津)有限公司 Method and system of recommending application resources
US20160306815A1 (en) * 2015-04-16 2016-10-20 Comcast Cable Communications, Llc Methods And Systems For Providing Persistent Storage
CN105007184A (en) * 2015-07-22 2015-10-28 胡东雁 Acquisition method for user behavior habits
CN106056444A (en) * 2016-05-25 2016-10-26 腾讯科技(深圳)有限公司 Data processing method and device
CN106202190A (en) * 2016-06-27 2016-12-07 乐视控股(北京)有限公司 A kind of browser account information storage method and mobile terminal

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177670A (en) * 2019-12-17 2020-05-19 腾讯云计算(北京)有限责任公司 Heterogeneous account association method, device, equipment and storage medium
CN111768263A (en) * 2020-03-31 2020-10-13 北京京东尚科信息技术有限公司 Information pushing method and device, server and storage medium
CN113760939A (en) * 2020-07-01 2021-12-07 北京沃东天骏信息技术有限公司 Account type determination method, device and equipment
CN115374370A (en) * 2022-10-26 2022-11-22 小米汽车科技有限公司 Content pushing method and device based on multiple models and electronic equipment
CN115374370B (en) * 2022-10-26 2023-04-07 小米汽车科技有限公司 Content pushing method and device based on multiple models and electronic equipment
CN115730251A (en) * 2022-12-06 2023-03-03 贝壳找房(北京)科技有限公司 Relationship recognition method
CN115730251B (en) * 2022-12-06 2024-06-07 贝壳找房(北京)科技有限公司 Relationship identification method

Also Published As

Publication number Publication date
CN108322317A (en) 2018-07-24
CN108322317B (en) 2022-07-29

Similar Documents

Publication Publication Date Title
WO2018130201A1 (en) Method for determining associated account, server and storage medium
US11659050B2 (en) Discovering signature of electronic social networks
CN107341268B (en) Hot searching ranking method and system
US8417654B1 (en) Decision tree refinement
US8732017B2 (en) Methods, systems, and media for applying scores and ratings to web pages, web sites, and content for safe and effective online advertising
US10747771B2 (en) Method and apparatus for determining hot event
CN110012060B (en) Information pushing method and device of mobile terminal, storage medium and server
US10846613B2 (en) System and method for measuring and predicting content dissemination in social networks
EP2407897A1 (en) Device for determining internet activity
WO2021196639A1 (en) Message pushing method and apparatus, and computer device and storage medium
CN109635206B (en) Personalized recommendation method and system integrating implicit feedback and user social status
US11989784B1 (en) Monitored alerts
CN112231570B (en) Recommendation system support attack detection method, device, equipment and storage medium
US8725735B2 (en) Information processing system, information processing method, program, and non-transitory information storage medium
US20170124468A1 (en) Bias correction in content score
WO2020135642A1 (en) Model training method and apparatus employing generative adversarial network
CN113569129A (en) Click rate prediction model processing method, content recommendation method, device and equipment
JP2011227721A (en) Interest extraction device, interest extraction method, and interest extraction program
CN111522724A (en) Abnormal account determination method and device, server and storage medium
CN110825868A (en) Topic popularity based text pushing method, terminal device and storage medium
US20230306263A1 (en) Pattern-based classification
CN115841366A (en) Article recommendation model training method and device, electronic equipment and storage medium
CN105447148B (en) A kind of Cookie mark correlating method and device
CN110196805B (en) Data processing method, data processing apparatus, storage medium, and electronic apparatus
CN111461188A (en) Target service control method, device, computing equipment and storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18739143; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18739143; Country of ref document: EP; Kind code of ref document: A1)