WO2018205458A1 - Method, apparatus, electronic device, and medium for acquiring a target user - Google Patents

Method, apparatus, electronic device, and medium for acquiring a target user

Info

Publication number
WO2018205458A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
target
user
feature information
public
Prior art date
Application number
PCT/CN2017/099699
Other languages
English (en)
French (fr)
Inventor
王健宗
黄章成
吴天博
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2018205458A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/52 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail, for supporting social networking services

Definitions

  • The present application belongs to the field of information processing technologies, and in particular relates to a method, an apparatus, an electronic device, and a medium for acquiring a target user.
  • In the prior art, users are classified based on keywords or identifiers in user behavior data, and target users are then selected. For example, if a user browses products for newborns, the user can be tagged with an infant-products label.
  • The prior art has at least the following deficiency: a user who once followed a certain kind of content, such as newborn-related content, may no longer be interested in infant products. The above method therefore cannot accurately determine the target user.
  • The embodiments of the present invention provide a method, an apparatus, an electronic device, and a medium for acquiring a target user, so as to solve the prior-art problem that the target user cannot be accurately determined because the influence of the time factor on user classification is not taken into account.
  • a first aspect of the embodiments of the present invention provides a method for acquiring a target user, including:
  • a second aspect of the embodiments of the present invention provides an apparatus for acquiring a target user, including:
  • An information obtaining module configured to acquire public information published by a user's social account, where the public information includes information content and a publishing time;
  • a determining module configured to determine public information related to the target feature information according to the target feature information and each piece of the public information
  • a processing module configured to determine, according to each piece of public information related to the target feature information determined by the determining module, whether the user is a target user.
  • A third aspect of the embodiments of the present invention provides an electronic device for acquiring a target user, including a memory and a processor, wherein the memory stores a computer program executable on the processor, and the processor executes the computer program.
  • A computer-readable storage medium stores a computer program which, when executed by at least one processor, implements the following steps:
  • The public information published by the user's social account, including the information content and the publishing time, is obtained; the public information related to the target feature information is determined according to the target feature information and each piece of public information; and whether the user is a target user is then determined according to the determined pieces of public information related to the target feature information. Since the public information includes the publishing time, the influence of the time factor on target user acquisition can be fully considered, so that the target user can be determined more accurately.
  • FIG. 1 is a flowchart of a method for acquiring a target user according to an embodiment of the present invention
  • FIG. 2 is a flowchart of an implementation of step S101 in FIG. 1;
  • FIG. 3 is a specific flowchart of a method for acquiring a target user according to an embodiment of the present invention
  • FIG. 4 is a flowchart of an implementation of step S302 in FIG. 3;
  • FIG. 5 is a flowchart of an implementation of step S303 in FIG. 3;
  • FIG. 6 is a schematic diagram of an operating environment of a program for acquiring a target user according to an embodiment of the present invention;
  • FIG. 7 is a block diagram of the program modules of a program for acquiring a target user according to an embodiment of the present invention.
  • FIG. 1 is a flowchart showing an implementation process of a method for acquiring a target user according to an embodiment of the present invention, which is described in detail as follows:
  • Step S101: Acquire the public information published by the user's social account, where the public information includes the information content and the publishing time, and determine the public information related to the target feature information according to the target feature information and each piece of the public information.
  • the social account includes, but is not limited to, a Weibo account and an instant messaging platform account.
  • The public information published by the user's social account may be information related to the user's hobbies, life, work, and so on, and can represent the various topics the user cares about. Moreover, since the public information includes both the information content and the publishing time, it can also characterize which topics the user cared about in each time period.
  • The target feature information is preset feature information used to determine target users among users; for example, the target feature information includes, but is not limited to, finance, sports, and entertainment. Specifically, if the target feature information is finance and the public information published by the user's social account includes financial information, the user may be a target user.
  • Each piece of microblog (Weibo) information published by user u carries time information.
  • Different types of tags l are set for each piece of microblog information $w_i$ using different methods.
  • A text-based label classification algorithm is used (the classification result is 0/1, that is, whether the microblog information is related to the tag l) to obtain the set of all microblog information of user u related to the tag l: $W_u(l) = \{w_1, w_2, \ldots, w_n\}$, where n is the number of pieces of microblog information related to the tag l among the microblog information published by the user, and n is less than or equal to the number of all pieces of microblog information published by user u.
  • The tag l indicates that the microblog information published by the user corresponds to one kind of feature information, such as finance, sports, or entertainment.
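  • The construction of the tag-related set of posts described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Post` type, the `related_posts` helper, and the keyword-based `finance_classifier` are hypothetical names standing in for whatever 0/1 text classifier is actually used.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass(frozen=True)
class Post:
    """One piece of public information: its content and its publishing time."""
    content: str
    published_at: datetime


def related_posts(posts: List[Post], is_related: Callable[[str], bool]) -> List[Post]:
    """Return the posts that the 0/1 classifier marks as related to the tag."""
    return [post for post in posts if is_related(post.content)]


def finance_classifier(text: str) -> bool:
    """Hypothetical stand-in classifier for the tag 'finance' (assumption)."""
    lowered = text.lower()
    return any(word in lowered for word in ("stock", "fund", "interest rate"))


posts = [
    Post("Bought an index fund today", datetime(2017, 3, 1)),
    Post("Great football match last night", datetime(2017, 4, 2)),
    Post("Stock markets closed higher", datetime(2016, 1, 15)),
]

# The related set keeps the publishing time of every post, so later steps
# can still take the time factor into account.
w_u_l = related_posts(posts, finance_classifier)
```

  • Because the classifier is passed in as a function, the same helper works for any tag; the size of the returned list corresponds to n and can never exceed the number of posts published by the user.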
  • Determining the public information related to the target feature information according to the target feature information and each piece of the public information in step S101 may be implemented by the following process:
  • Step S201: Extract first classification feature information of each piece of the public information, where the first classification feature information includes a keyword and/or an identifier.
  • The public information published by the user through the social account may contain classification feature information about the user's hobbies, life, work, and so on, so first classification feature information including a keyword and/or an identifier may be extracted from the public information published by the user.
  • the keywords include, but are not limited to, words related to the user's hobbies, life, work, etc.
  • the identifiers include, but are not limited to, identifiers such as pictures, expressions, and the like related to the user's hobbies, life, work, and the like.
  • Step S202 Determine, according to the first classification feature information of the public information and the target feature information, whether each piece of the public information is related to the target feature information.
  • The target feature information may include at least one keyword and at least one identifier. Specifically, after the first classification feature information is extracted in step S201, it may be matched against the target feature information. If the matching degree between the first classification feature information and the target feature information is greater than a first threshold, the public information is determined to be related to the target feature information; otherwise, the public information is determined not to be related to the target feature information.
  • When the first classification feature information is a keyword, it may be matched against each keyword in the target feature information. If the matching succeeds, the public information is determined to be related to the target feature information; otherwise, the public information is determined not to be related to the target feature information.
  • When the first classification feature information is an identifier, it may be matched against the identifiers in the target feature information. If the matching degree is greater than the first threshold, the public information is determined to be related to the target feature information; otherwise, the public information is determined not to be related to the target feature information.
  • The keywords or identifiers may also be prioritized, and the first classification feature information is then matched against the target feature information in order of priority.
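  • Steps S201 and S202 can be illustrated with a small keyword-matching sketch. The text does not define how the matching degree is computed, so the definition used here (the fraction of target keywords found in the text) is an assumption, as are the function names and the sample threshold value.

```python
from typing import Iterable, Set


def extract_keywords(text: str, vocabulary: Iterable[str]) -> Set[str]:
    """Step S201 (sketch): collect the vocabulary terms that occur in the text."""
    lowered = text.lower()
    return {term for term in vocabulary if term.lower() in lowered}


def matching_degree(extracted: Set[str], target_keywords: Set[str]) -> float:
    """Assumed definition: fraction of the target keywords that were extracted."""
    if not target_keywords:
        return 0.0
    return len(extracted & target_keywords) / len(target_keywords)


def is_related(text: str, target_keywords: Set[str], first_threshold: float) -> bool:
    """Step S202 (sketch): related if the matching degree exceeds the threshold."""
    extracted = extract_keywords(text, target_keywords)
    return matching_degree(extracted, target_keywords) > first_threshold


# Hypothetical target feature information for "finance".
TARGET_KEYWORDS = {"stock", "fund", "loan", "bond"}

degree = matching_degree(
    extract_keywords("My fund and stock picks for 2017", TARGET_KEYWORDS),
    TARGET_KEYWORDS,
)
related = is_related("My fund and stock picks for 2017", TARGET_KEYWORDS, 0.25)
unrelated = is_related("Great football match last night", TARGET_KEYWORDS, 0.25)
```

  • Plain substring matching is used only to keep the sketch short; a real system would need tokenization suited to the language of the posts, and identifiers such as pictures would need their own matching method.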
  • Step S102: Determine, according to the determined pieces of public information related to the target feature information, whether the user is a target user.
  • Whether the user is a target user may be determined according to the degree of relevance between each determined piece of public information and the target feature information. Specifically, the degrees of relevance between the target feature information and each piece of public information related to it may be averaged, and whether the user is a target user is then determined by comparing the average value with a second threshold.
  • s and $x_0$ are preset coefficients, and x represents the time difference between the publishing time of a piece of public information related to the classification feature information l and the crawler acquisition time.
  • A weight value is obtained for each piece of public information associated with tag l, and whether the user is a target user is determined according to the size of these weight values. For example, if tag l represents financial information, the public information about finance published by the user has low relevance to financial information, and the average value is less than the second threshold, then the user may be determined not to be a target user, or to be a non-quality target customer; otherwise, the user is determined to be a target user.
  • The time difference between the publishing time of the public information and the crawler acquisition time is measured in years.
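  • The averaging rule of step S102 can be sketched as below. The weight formula itself is not reproduced in this text, so the sketch takes the weight function as a parameter; the `inverse_age_weight` function used in the example is purely hypothetical and is not the patent's formula. All function names are illustrative.

```python
from datetime import datetime
from typing import Callable, List

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_between(published_at: datetime, crawled_at: datetime) -> float:
    """Time difference x between publishing and crawler acquisition, in years."""
    return (crawled_at - published_at).total_seconds() / SECONDS_PER_YEAR


def is_target_user(
    ages_in_years: List[float],
    weight_fn: Callable[[float], float],
    second_threshold: float,
) -> bool:
    """Average the per-post weights and compare the mean with the threshold."""
    if not ages_in_years:
        return False  # no related public information at all
    weights = [weight_fn(x) for x in ages_in_years]
    # Below the threshold means not a target user; otherwise a target user.
    return sum(weights) / len(weights) >= second_threshold


def inverse_age_weight(x: float) -> float:
    """Hypothetical weight that shrinks as the post gets older (assumption)."""
    return 1.0 / (1.0 + x)


crawled_at = datetime(2017, 8, 30)
one_year = years_between(datetime(2016, 8, 30), crawled_at)

recent_user = is_target_user([0.1, 0.2], inverse_age_weight, 0.5)
stale_user = is_target_user([4.0, 5.0], inverse_age_weight, 0.5)
```

  • With this choice, a user whose related posts are a few months old passes the threshold, while a user whose related posts are several years old does not, which is the effect the time factor is meant to have.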
  • FIG. 3 shows a specific flowchart of the method for acquiring a target user; content already described above is not repeated.
  • Step S301: Acquire the public information published by the user's social account, where the public information includes the information content and the publishing time, and determine the public information related to the target feature information according to the target feature information and each piece of the public information.
  • For details of this step, refer to the related content of step S101; they are not described here again.
  • Step S302: Acquire the target account information followed by the user's social account, where the target account information includes the classification information of the target account and the ranking information of the target account, and determine the target account information related to the target feature information according to the target feature information and each piece of the target account information.
  • The target account information followed by the user's social account may be account information related to the user's hobbies, life, work, and so on, and can represent the various topics the user cares about. Moreover, since the target account information includes the classification information and the ranking information of the target account, it can also characterize which topics the user cared about in each time period.
  • The user may be a target user.
  • Almost every user on social media uses the follow function to subscribe to accounts of interest or to follow friends they know.
  • The user's interests can be inferred from the accounts the user follows (including their profile descriptions and posted content). For example, following a celebrity account indicates that the user is a fan of that celebrity; following a parenting account indicates that the user is interested in newborn-related topics.
  • The set of accounts followed by user u that are related to tag l is $V_u(l) = \{v_1, v_2, \ldots, v_k\}$, where k is the number of accounts related to tag l among the target accounts followed by the user, and k is less than or equal to the number of all target accounts followed by the user.
  • The tag l indicates that the target account information followed by the user corresponds to one kind of feature information, such as finance, sports, or entertainment.
  • determining the target account information related to the target feature information according to the target feature information and each of the target account information in step S302 may be implemented by using the following process:
  • Step S401 Extract second classification feature information of each target account information, where the second classification feature information includes a keyword and/or an identifier.
  • The classification information of the target account in the target account information followed by the user through the social account may contain classification feature information about the user's hobbies, life, work, and so on, so second classification feature information including a keyword and/or an identifier may be extracted from the target account information in order to classify each piece of target account information.
  • the keywords include, but are not limited to, words related to the user's hobbies, life, work, etc.
  • the identifiers include, but are not limited to, identifiers such as pictures, expressions, and the like related to the user's hobbies, life, work, and the like.
  • Step S402: Determine, according to the second classification feature information of each piece of target account information and the target feature information, whether each piece of the target account information is related to the target feature information.
  • The target feature information may include at least one keyword and at least one identifier. Specifically, after the second classification feature information is extracted in step S401, it may be matched against the target feature information. If the matching degree between the second classification feature information and the target feature information is greater than a third threshold, the target account information is determined to be related to the target feature information; otherwise, the target account information is determined not to be related to the target feature information.
  • When the second classification feature information is a keyword, it may be matched against each keyword in the target feature information. If the matching succeeds, the target account information is determined to be related to the target feature information; otherwise, the target account information is determined not to be related to the target feature information.
  • When the second classification feature information is an identifier, it may be matched against the identifiers in the target feature information. If the matching degree is greater than the third threshold, the target account information is determined to be related to the target feature information; otherwise, the target account information is determined not to be related to the target feature information.
  • The keywords or identifiers may also be prioritized, and the second classification feature information is then matched against the target feature information in order of priority.
  • Step S303: Determine whether the user is a target user according to the determined pieces of public information and pieces of target account information related to the target feature information.
  • The degree of relevance between each determined piece of public information and the target feature information, and the degree of relevance between each piece of target account information and the target feature information, may be considered together to determine whether the user is a target user.
  • step S303 can be implemented by the following process:
  • Step S501: Establish a weight model of the user according to the determined public information and target account information related to the target feature information.
  • The weight model of the user may be expressed in terms of the following quantities:
  • l represents one kind of classification feature information;
  • $S_u(l)$ is the weight of user u with respect to the classification feature information l;
  • n is the number of pieces of public information related to the classification feature information l published by the user;
  • k is the number of target accounts followed by the user that are related to the classification feature information l;
  • t and $y_0$ are preset coefficients;
  • y represents the ranking information of a target account related to the classification feature information l.
  • The weight values of all the target account information related to tag l form a set.
  • Step S502: Determine whether the user is a target user according to the weight model of the user.
  • The weight model makes it possible to consider both the public information published by the user and the target account information the user follows when determining whether the user is a target user.
  • The value calculated by the weight model may be compared with a fourth threshold to determine whether the user is a target user.
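  • Steps S501 and S502 can be illustrated with the sketch below. The actual weight model formula is not reproduced in this text, so the form used here is an assumption: a logistic decay applied to the age x of each related post (with coefficients s and x0) and to the rank y of each related followed account (with coefficients t and y0), summed into a single user weight. The function names and the direction of the comparison with the fourth threshold are likewise assumptions.

```python
import math
from typing import Sequence


def logistic_decay(value: float, slope: float, midpoint: float) -> float:
    """Assumed weight shape: near 1 for small values, 0.5 at the midpoint."""
    exponent = min(slope * (value - midpoint), 700.0)  # avoid float overflow
    return 1.0 / (1.0 + math.exp(exponent))


def user_weight(
    post_ages_years: Sequence[float],  # x for each of the n related posts
    account_ranks: Sequence[float],    # y for each of the k related accounts
    s: float,
    x0: float,
    t: float,
    y0: float,
) -> float:
    """Assumed model: sum of post weights plus sum of account weights."""
    post_part = sum(logistic_decay(x, s, x0) for x in post_ages_years)
    account_part = sum(logistic_decay(y, t, y0) for y in account_ranks)
    return post_part + account_part


def is_target_user(weight: float, fourth_threshold: float) -> bool:
    """Assumed rule: a weight above the fourth threshold marks a target user."""
    return weight > fourth_threshold


# One post exactly x0 years old and one account ranked exactly y0:
# each contributes 0.5 under the assumed logistic shape.
midpoint_weight = user_weight([1.0], [100.0], s=2.0, x0=1.0, t=0.01, y0=100.0)

active = user_weight([0.1, 0.3, 0.5], [3.0, 20.0], s=2.0, x0=1.0, t=0.01, y0=100.0)
inactive = user_weight([6.0], [], s=2.0, x0=1.0, t=0.01, y0=100.0)
```

  • Under these assumptions, recent posts and highly ranked followed accounts both raise the user's weight, so a user with several recent related posts clears a threshold that a user with a single old post does not.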
  • The method for acquiring a target user first obtains the public information published by the user's social account, which includes the information content and the publishing time, and the target account information followed by the user's social account, which includes the classification information and the ranking information of the target account. It then determines the public information related to the target feature information according to the target feature information and each piece of public information, and determines the target account information related to the target feature information according to the target feature information and each piece of target account information. Finally, it determines whether the user is a target user according to the determined public information and target account information related to the target feature information. Since the public information includes the publishing time and the target account information includes the ranking information of the target account, the influence of the time factor on target user acquisition can be fully considered, so that target users can be identified more accurately.
  • FIG. 6 is a schematic diagram of the operating environment of the acquisition target user program provided by an embodiment of the present invention. For convenience of explanation, only the parts related to this embodiment are shown.
  • the acquisition target user program 600 is installed and runs in the electronic device 60.
  • the electronic device 60 can be a mobile terminal, a palmtop computer, a server, or the like.
  • the electronic device 60 can include, but is not limited to, a memory 603, a processor 601, and a display 602.
  • FIG. 6 shows only the electronic device 60 having components 601-603, but it should be understood that not all of the illustrated components are required, and that more or fewer components may be implemented instead.
  • the memory 603 may be an internal storage unit of the electronic device 60, such as a hard disk or memory of the electronic device 60, in some embodiments.
  • In other embodiments, the memory 603 may also be an external storage device of the electronic device 60, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 60.
  • the memory 603 may also include both an internal storage unit of the electronic device 60 and an external storage device.
  • the memory 603 is configured to store application software and various types of data installed in the electronic device 60, such as the program code of the acquisition target user program 600, and the like.
  • the memory 603 can also be used to temporarily store data that has been output or is about to be output.
  • The processor 601 may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip, and is used to run the program code or process the data stored in the memory 603, for example to execute the acquisition target user program 600.
  • In some embodiments, the display 602 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, or the like.
  • The display 602 is used to display information processed in the electronic device 60 and to display a visual user interface, such as an application menu interface or an application icon interface.
  • the components 601-603 of the electronic device 60 communicate with one another via a system bus.
  • FIG. 7 is a block diagram of the program modules of the acquisition target user program 600 according to an embodiment of the present invention.
  • The acquisition target user program 600 may be divided into one or more modules, which are stored in the memory 603 and executed by one or more processors (in this embodiment, by the processor 601) to implement the present invention.
  • the acquisition target user program 600 can be divided into an information acquisition module 701, a determination module 702, and a processing module 703.
  • a module referred to in the present invention refers to a series of computer program instruction segments capable of performing a particular function, and is more suitable than the program to describe the execution process of the acquisition target user program 600 in the electronic device 60. The following description will specifically describe the functions of the modules 701-703.
  • the information obtaining module 701 is configured to obtain public information published by the social account of the user, where the public information includes the information content and the publishing time.
  • the determining module 702 is configured to determine public information related to the target feature information according to the target feature information and each piece of the public information.
  • The processing module 703 is configured to determine, according to each piece of public information related to the target feature information determined by the determining module 702, whether the user is a target user.
  • The information obtaining module 701 is further configured to obtain the target account information followed by the social account of the user, where the target account information includes the classification information of the target account and the ranking information of the target account.
  • the determining module 702 is further configured to determine target account information related to the target feature information according to the target feature information and each of the target account information.
  • the processing module 703 is specifically configured to: determine, according to each piece of public information and each piece of target account information related to the target feature information determined by the determining module, whether the user is a target user.
  • the determining module 702 can be divided into an extracting unit 801 and a determining unit 802.
  • the extracting unit 801 is configured to extract first classified feature information of each piece of the public information, where the first classified feature information includes a keyword and/or an identifier.
  • the determining unit 802 is configured to determine, according to the first classification feature information and the target feature information of each piece of the public information, whether each piece of the public information is related to the target feature information.
  • The extracting unit 801 is further configured to extract second classification feature information of each piece of the target account information, where the second classification feature information includes a keyword and/or an identifier.
  • The determining unit 802 is further configured to determine, according to the second classification feature information of each piece of the target account information and the target feature information, whether each piece of the target account information is related to the target feature information.
  • the processing module 703 can be divided into a model establishing unit 901 and a determining unit 902.
  • the model establishing unit 901 is configured to establish a weight model of the user according to the public information and the target account information related to the target feature information determined by the determining module.
  • the determining unit 902 is configured to determine, according to the weight model of the user, whether the user is a target user.
  • The weight model established by the model establishing unit 901 is expressed in terms of the following quantities:
  • l represents one kind of classification feature information;
  • $S_u(l)$ is the weight of user u with respect to the classification feature information l;
  • n is the number of pieces of public information related to the classification feature information l published by the user;
  • k is the number of target accounts followed by the user that are related to the classification feature information l;
  • s and $x_0$ are preset coefficients, and x represents the time difference between the publishing time of a piece of public information related to the classification feature information l and the crawler acquisition time;
  • t and $y_0$ are preset coefficients;
  • y represents the ranking information of a target account related to the classification feature information l.
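  • The division of the program into modules 701 to 703 can be sketched as three small classes wired into a pipeline. This is only an illustration of the module boundaries: the class and method names, the in-memory `store` standing in for a crawler, and the simple mean-age decision rule in the processing module are all assumptions and do not reproduce the weight model.

```python
from typing import Callable, Dict, List, Tuple

# One piece of public information: (information content, age in years).
PublicInfo = Tuple[str, float]


class InformationObtainingModule:
    """Sketch of module 701: returns the public information of a user."""

    def __init__(self, store: Dict[str, List[PublicInfo]]) -> None:
        # `store` is a hypothetical in-memory stand-in for a crawler.
        self._store = store

    def obtain(self, user_id: str) -> List[PublicInfo]:
        return list(self._store.get(user_id, []))


class DeterminingModule:
    """Sketch of module 702: keeps items related to the target feature."""

    def __init__(self, is_related: Callable[[str], bool]) -> None:
        self._is_related = is_related

    def determine(self, items: List[PublicInfo]) -> List[PublicInfo]:
        return [item for item in items if self._is_related(item[0])]


class ProcessingModule:
    """Sketch of module 703: decides whether the user is a target user."""

    def __init__(self, max_mean_age_years: float) -> None:
        # Assumed rule: related posts must be recent on average.
        self._max_mean_age_years = max_mean_age_years

    def is_target_user(self, related: List[PublicInfo]) -> bool:
        if not related:
            return False
        mean_age = sum(age for _, age in related) / len(related)
        return mean_age <= self._max_mean_age_years


store = {
    "u1": [("fund tips", 0.2), ("stock news", 0.5)],
    "u2": [("fund tips", 6.0), ("football", 0.1)],
}
obtaining = InformationObtainingModule(store)
determining = DeterminingModule(lambda text: "fund" in text or "stock" in text)
processing = ProcessingModule(max_mean_age_years=1.0)

target_users = [
    user_id
    for user_id in store
    if processing.is_target_user(determining.determine(obtaining.obtain(user_id)))
]
```

  • Keeping the three responsibilities in separate classes mirrors the statement that the functions may be reassigned to different units as needed: any one of the classes can be replaced without touching the other two.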
  • The division into the functional units and modules described above is only an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above.
  • The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
  • the specific names of the respective functional units and modules are only for the purpose of facilitating mutual differentiation, and are not intended to limit the scope of protection of the present application.
  • For the specific working process of the unit and the module in the foregoing system reference may be made to the corresponding process in the foregoing method embodiment, and details are not described herein again.
  • the disclosed apparatus and method may be implemented in other manners.
  • the system embodiment described above is merely illustrative.
  • The division into modules or units is only a division by logical function.
  • In an actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the embodiments of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

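The present application is applicable to the field of information processing technology and provides a method, an apparatus, an electronic device, and a medium for acquiring a target user. The method includes: acquiring public information published by a user's social account, the public information including information content and a publication time, and determining, according to target feature information and each item of the public information, the public information related to the target feature information; and determining, according to each item of the determined public information related to the target feature information, whether the user is a target user. The present application fully considers the influence of the time factor on target-user acquisition and can therefore determine target users more accurately.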

Description

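Method, Apparatus, Electronic Device, and Medium for Acquiring a Target User
Technical Field
The present application belongs to the field of information processing technology and in particular relates to a method, an apparatus, an electronic device, and a medium for acquiring a target user.
Background
When target users are determined, users are usually classified on the basis of certain keywords or identifiers in their behavior data, and target users are then selected. For example, if a user browses products for newborns, the user can be tagged as interested in infant products. However, in the course of implementing the present invention, the inventors found that the prior art has at least the following deficiency: if a user paid attention to a certain kind of content some time ago, for example newborn-related content, the user may no longer be much interested in infant products now, so the above approach cannot determine target users accurately.
Technical Problem
In view of this, embodiments of the present invention provide a method, an apparatus, an electronic device, and a medium for acquiring a target user, to solve the problem in the prior art that target users cannot be determined accurately because the influence of the time factor on user classification is not considered.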
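Technical Solution
A first aspect of the embodiments of the present invention provides a method for acquiring a target user, including:
acquiring public information published by a user's social account, the public information including information content and a publication time, and determining, according to target feature information and each item of the public information, the public information related to the target feature information;
determining, according to the determined public information related to the target feature information, whether the user is a target user.
A second aspect of the embodiments of the present invention provides an apparatus for acquiring a target user, including:
an information acquisition module, configured to acquire public information published by a user's social account, the public information including information content and a publication time;
a determination module, configured to determine, according to target feature information and each item of the public information, the public information related to the target feature information;
a processing module, configured to determine, according to each item of public information related to the target feature information as determined by the determination module, whether the user is a target user.
A third aspect of the embodiments of the present invention provides an electronic device for acquiring a target user, including a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor implements the following steps when executing the computer program:
acquiring public information published by a user's social account, the public information including information content and a publication time, and determining, according to target feature information and each item of the public information, the public information related to the target feature information;
determining, according to the determined public information related to the target feature information, whether the user is a target user.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program implements the following steps when executed by at least one processor:
acquiring public information published by a user's social account, the public information including information content and a publication time, and determining, according to target feature information and each item of the public information, the public information related to the target feature information;
determining, according to the determined public information related to the target feature information, whether the user is a target user.
Beneficial Effects
In the embodiments of the present invention, public information including information content and a publication time is acquired from the user's social account; the public information related to the target feature information is determined according to the target feature information and each item of public information; and whether the user is a target user is then determined according to each item of public information so determined. Because the public information includes the publication time, the influence of the time factor on target-user acquisition is fully taken into account, so target users can be determined more accurately.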
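Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the method for acquiring a target user provided by an embodiment of the present invention;
FIG. 2 is a flowchart of an implementation of step S101 in FIG. 1;
FIG. 3 is a detailed flowchart of the method for acquiring a target user provided by an embodiment of the present invention;
FIG. 4 is a flowchart of an implementation of step S302 in FIG. 3;
FIG. 5 is a flowchart of an implementation of step S303 in FIG. 3;
FIG. 6 is a schematic diagram of the operating environment of the target-user acquisition program provided by an embodiment of the present invention;
FIG. 7 is a program module diagram of the target-user acquisition program provided by an embodiment of the present invention.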
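Embodiments of the Invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To illustrate the technical solutions of the present invention, specific embodiments are described below.
Embodiment 1
FIG. 1 shows the implementation flow of the method for acquiring a target user provided by an embodiment of the present invention, detailed as follows:
Step S101: acquire public information published by a user's social account, the public information including information content and a publication time, and determine, according to target feature information and each item of the public information, the public information related to the target feature information.
Social accounts include, but are not limited to, microblog accounts and instant messaging platform accounts. The public information published by the user's social account may be public information related to the user's hobbies, life, work, and so on, and can characterize the various aspects the user cares about. Moreover, because the public information includes both information content and a publication time, it can also characterize what the user paid attention to or cared about in each time period.
The target feature information is preset feature information used to determine the target users among users; for example, it includes but is not limited to finance, sports, and entertainment. Specifically, if the target feature information is finance and the public information published by the user's social account includes financial information, the user may be a target user.
The following takes a microblog account as an example of a social account, without being limited thereto. Every microblog post published by user u carries time information. Based on the text content of the posts, different methods are used to assign labels of different types, drawn from a label set L, to each post wi. Taking a label l∈L as an example, a text-based label classification algorithm (whose result is usually a 0/1 value, that is, whether or not the post is related to label l) yields the set of all of user u's posts related to label l, wu(l)={w1,w2,…,wn}, where n is the number of the user's posts related to label l, and n is less than or equal to the total number of posts published by user u. Label l indicates that these posts correspond to one kind of feature information, such as finance, sports, or entertainment.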
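The selection of the label-related subset wu(l) can be sketched as follows. This is a minimal illustration only: the `Post` type, the `LABEL_KEYWORDS` table, and the substring test are hypothetical stand-ins, since the text does not specify which text-based label classification algorithm is used.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Post:
    """One public post: information content plus publication time."""
    text: str
    published: datetime


# Hypothetical keyword lists per label (assumption, not from the source).
LABEL_KEYWORDS = {
    "finance": ("stock", "fund", "interest rate"),
    "sports": ("match", "league", "marathon"),
}


def is_related(post: Post, label: str) -> bool:
    """Binary (0/1) decision: is this post related to the label?"""
    text = post.text.lower()
    return any(keyword in text for keyword in LABEL_KEYWORDS[label])


def related_posts(posts: list[Post], label: str) -> list[Post]:
    """Return the subset of the user's posts related to `label`, i.e. wu(l)."""
    return [post for post in posts if is_related(post, label)]
```

By construction the returned list is never longer than the input, which mirrors the condition that n does not exceed the total number of posts.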
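Referring to FIG. 2, in some embodiments, determining in step S101 the public information related to the target feature information according to the target feature information and each item of the public information may be implemented through the following process:
Step S201: extract first classification feature information from each item of the public information, the first classification feature information including keywords and/or identifiers.
It can be understood that the public information a user publishes through a social account contains classification feature information about the user's hobbies, life, work, and so on. First classification feature information including keywords and/or identifiers can therefore be extracted from the published public information in order to classify each item. Keywords include, but are not limited to, words related to the user's hobbies, life, work, and so on; identifiers include, but are not limited to, pictures, emoticons, and similar markers related to the user's hobbies, life, work, and so on.
Step S202: determine, according to the first classification feature information of each item of the public information and the target feature information, whether each item of the public information is related to the target feature information.
The target feature information may include at least one keyword and at least one identifier. Specifically, after the first classification feature information is extracted in step S201, it may be matched against the target feature information. If the degree of match between the first classification feature information and the target feature information is greater than a first threshold, the item of public information is determined to be related to the target feature information; otherwise, it is determined to be unrelated.
For example, when the first classification feature information is a keyword, it may be matched against each keyword in the target feature information; if the match succeeds, the item of public information is determined to be related to the target feature information; otherwise, it is determined to be unrelated.
As another example, when the first classification feature information is an identifier, it may be matched against the identifiers in the target feature information; if the degree of match is greater than the first threshold, the item of public information is determined to be related to the target feature information; otherwise, it is determined to be unrelated.
As a further example, when the first classification feature information includes both keywords and identifiers, a priority may be set for keywords or identifiers, and the first classification feature information is matched against the target feature information in priority order.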
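Steps S201 and S202 can be illustrated with a small sketch. The text does not define how the degree of match is computed or how priorities are applied, so two assumptions are made here: the match degree is the fraction of extracted features that also appear in the target features, and only the highest-priority non-empty feature type is consulted. The function names and the default threshold are hypothetical.

```python
def match_degree(extracted: set[str], target: set[str]) -> float:
    """Assumed measure: share of extracted features found in the target set."""
    if not extracted:
        return 0.0
    return len(extracted & target) / len(extracted)


def is_relevant(
    keywords: set[str],
    identifiers: set[str],
    target_keywords: set[str],
    target_identifiers: set[str],
    first_threshold: float = 0.5,
    keywords_first: bool = True,
) -> bool:
    """Match the higher-priority feature type first; fall back to the other
    type only when the higher-priority one is empty (assumed semantics)."""
    pairs = [(keywords, target_keywords), (identifiers, target_identifiers)]
    if not keywords_first:
        pairs.reverse()
    for extracted, target in pairs:
        if extracted:
            # Related only if the match degree strictly exceeds the threshold.
            return match_degree(extracted, target) > first_threshold
    return False
```

Switching `keywords_first` changes which feature type decides the outcome, which is one way to read "matching according to priority".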
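Step S102: determine, according to each item of the determined public information related to the target feature information, whether the user is a target user.
Whether the user is a target user may be determined according to the degree of relevance between each determined item of public information and the target feature information. Specifically, the relevance degrees of the items of public information related to the target feature information may be averaged, and whether the user is a target user is then determined according to how the average compares with a second threshold.
For example, every new item of public information a user publishes has a publication time. On the idea that the items closest to the present are the most timely, a sigmoid function can be used to transform the public information wu(l) related to label l into new weight values:
Figure PCTCN2017099699-appb-000001
where s and x0 are preset coefficients, and x denotes the time difference between the publication time of an item of public information related to classification feature information l and the time at which the crawler acquired it.
The set of weight values of all public information related to label l is
Figure PCTCN2017099699-appb-000002
Whether the user is a target user is determined according to the weight values of the items of public information. For example, if label l denotes financial information and the public information published by the user that relates to financial information has only low relevance to it, so that the average is less than the second threshold, the user may be determined not to be a target user, or not to be a high-quality target customer; otherwise, the user is determined to be a target user.
In practice, different sigmoid parameters were tried to adjust the curve, and the values s=-0.2 and x0=12 were found to work well. Note that in this embodiment the time difference between publication and crawling is measured in years.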
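The time-based weighting can be sketched as below. The formula itself appears only as an image in this text, so the standard logistic form 1/(1+exp(-s(x-x0))) is assumed here; with the stated s=-0.2 and x0=12 it decreases as the time difference grows, which agrees with the idea that recent items count most. The function names are hypothetical.

```python
import math


def time_weight(age: float, s: float = -0.2, x0: float = 12.0) -> float:
    """Assumed logistic weight; `age` is the publication-to-crawl time
    difference, in the unit stated in the embodiment."""
    return 1.0 / (1.0 + math.exp(-s * (age - x0)))


def is_target_by_posts(ages: list[float], second_threshold: float) -> bool:
    """Average the weights of the label-related posts and compare with the
    second threshold; below the threshold means not a target user."""
    if not ages:
        return False
    weights = [time_weight(age) for age in ages]
    return sum(weights) / len(weights) >= second_threshold
```

With these parameters a post whose time difference equals x0 receives a weight of exactly 0.5, and weights fall toward 0 for older posts.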
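FIG. 3 shows a detailed flowchart of the method for acquiring a target user; points already covered are not repeated.
Step S301: acquire public information published by a user's social account, the public information including information content and a publication time, and determine, according to target feature information and each item of the public information, the public information related to the target feature information.
For details of this step, refer to the description of step S101; they are not repeated here.
Step S302: acquire target account information of the accounts followed by the user's social account, the target account information including classification information of the target account and ranking information of the target account, and determine, according to the target feature information and each item of the target account information, the target account information related to the target feature information.
The target account information followed by the user's social account may be account information related to the user's hobbies, life, work, and so on, and can characterize the various aspects the user cares about. Moreover, because the target account information includes classification information and ranking information of the target account, it can also characterize what the user paid attention to or cared about in each time period.
It can be understood that if the target feature information is finance and the classification information of the target accounts followed by the user's social account includes financial information, the user may be a target user.
The following takes a microblog account as an example of a social account, without being limited thereto. Almost every user on social media uses the follow function to subscribe to accounts of interest or to follow friends they know. The accounts a user follows (including their profiles and published content) can be used to infer the user's interests. For example, following a celebrity account indicates that the user is a fan of that celebrity; following a parenting account indicates that the user is interested in newborn-related topics. Given a list of labelled accounts, find the set of accounts followed by user u that fall in the list for label l, Vu(l)={v1,v2,…,vk}, where k is the number of accounts followed by the user whose target account information is related to label l, and k is less than or equal to the total number of target accounts the user follows. Label l indicates that the target account information followed by the user corresponds to one kind of feature information, such as finance, sports, or entertainment.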
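Building the set Vu(l) from a follow list can be sketched as follows. The account names and the `followed_in_label` helper are hypothetical, and the sketch assumes the follow list is ordered so that position 1 is the top-ranked account; the rank is kept alongside each account because the ranking information is used later as y.

```python
def followed_in_label(
    followed: list[str], label_accounts: set[str]
) -> list[tuple[int, str]]:
    """Return Vu(l) as (rank, account) pairs, preserving follow-list order.

    Rank 1 is the first entry of the follow list (assumed ordering).
    """
    return [
        (rank, account)
        for rank, account in enumerate(followed, start=1)
        if account in label_accounts
    ]
```

The result can never contain more entries than the follow list, matching the condition that k does not exceed the number of followed accounts.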
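Referring to FIG. 4, in some embodiments, determining in step S302 the target account information related to the target feature information according to the target feature information and each item of the target account information may be implemented through the following process:
Step S401: extract second classification feature information from each item of the target account information, the second classification feature information including keywords and/or identifiers.
It can be understood that the classification information of the target accounts a user follows through a social account contains classification feature information about the user's hobbies, life, work, and so on. Second classification feature information including keywords and/or identifiers can therefore be extracted from the target account information in order to classify each item of target account information. Keywords include, but are not limited to, words related to the user's hobbies, life, work, and so on; identifiers include, but are not limited to, pictures, emoticons, and similar markers related to the user's hobbies, life, work, and so on.
Step S402: determine, according to the second classification feature information of each item of the target account information and the target feature information, whether each item of the target account information is related to the target feature information.
The target feature information may include at least one keyword and at least one identifier. Specifically, after the second classification feature information is extracted in step S401, it may be matched against the target feature information. If the degree of match between the second classification feature information and the target feature information is greater than a third threshold, the item of target account information is determined to be related to the target feature information; otherwise, it is determined to be unrelated.
For example, when the second classification feature information is a keyword, it may be matched against each keyword in the target feature information; if the match succeeds, the item of target account information is determined to be related to the target feature information; otherwise, it is determined to be unrelated.
As another example, when the second classification feature information is an identifier, it may be matched against the identifiers in the target feature information; if the degree of match is greater than the third threshold, the item of target account information is determined to be related to the target feature information; otherwise, it is determined to be unrelated.
As a further example, when the second classification feature information includes both keywords and identifiers, a priority may be set for keywords or identifiers, and the second classification feature information is matched against the target feature information in priority order.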
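Step S303: determine, according to each item of the determined public information and each item of the determined target account information related to the target feature information, whether the user is a target user.
Whether the user is a target user may be determined by jointly considering the degree of relevance between each determined item of public information and the target feature information and the degree of relevance between each item of target account information and the target feature information.
Referring to FIG. 5, in some embodiments, step S303 may be implemented through the following process:
Step S501: establish a weight model of the user according to the determined public information and target account information related to the target feature information.
Specifically, the weight model of the user may be:
Figure PCTCN2017099699-appb-000003
where l denotes one item of classification feature information, Su(l) is the weight of the user with respect to classification feature information l,
Figure PCTCN2017099699-appb-000004
is the weight of the user with respect to classification feature information l on public information,
Figure PCTCN2017099699-appb-000005
is the weight of the user with respect to classification feature information l on target account information, α∈[0,1], n is the number of items of public information published by the user that are related to classification feature information l, and k is the number of target accounts followed by the user that are related to classification feature information l.
Figure PCTCN2017099699-appb-000006
where s and x0 are preset coefficients, and x denotes the time difference between the publication time of an item of public information related to classification feature information l and the time at which the crawler acquired it. In practice, different sigmoid parameters were tried to adjust the curve, and the values s=-0.2 and x0=12 were found to work well. Note that in this embodiment the time difference between publication and crawling is measured in years.
Figure PCTCN2017099699-appb-000007
where t and y0 are preset coefficients, and y denotes the ranking information of a target account related to classification feature information l. Specifically, the higher a target account ranks, the more recently the user followed it, and the smaller y is. In practice, different sigmoid parameters were tried to adjust the curve, and the values t=-0.2 and y0=12 were found to work well. The set of weight values of all target account information related to label l is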
Figure PCTCN2017099699-appb-000008
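The formula images referenced above are not reproduced in this text. One form that is consistent with the symbol definitions given (α, n, k, s, x0, t, y0) is written out below purely as an assumption; in particular, the logistic shape of the two weight functions and the normalization by n and k are guesses and may differ from the published formulas. The names f and g for the two weight functions are introduced here for illustration.

```latex
% Assumed reconstruction for illustration; not taken from the published formula images.
\begin{align}
S_u(l) &= \alpha \cdot \frac{1}{n} \sum_{i=1}^{n} f(x_i)
        + (1 - \alpha) \cdot \frac{1}{k} \sum_{j=1}^{k} g(y_j),
        \qquad \alpha \in [0, 1] \\
f(x)   &= \frac{1}{1 + e^{-s\,(x - x_0)}} \\
g(y)   &= \frac{1}{1 + e^{-t\,(y - y_0)}}
\end{align}
```

Under this reading, negative s and t (such as the stated -0.2) make both weights decrease as the time difference x or the rank y grows.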
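Step S502: determine whether the user is a target user according to the weight model of the user.
Through the weight model, the public information published by the user and the target account information followed by the user can be considered jointly to determine whether the user is a target user. When α=0.5, the weight model gives equal consideration to the public information published by the user and the target account information followed by the user; other values of α shift the emphasis between the two. For example, when α>0.5, the weight model relies more on the public information published by the user to determine whether the user is a target user; when α<0.5, it relies more on the target account information followed by the user.
Specifically, the value computed by the weight model may be compared with a fourth threshold to determine whether the user is a target user.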
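A combined score of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the patented formula: both weight functions are taken to be logistic, each part is averaged over its items, and a score at or above the fourth threshold is treated as a target user (the text does not state the direction of the comparison). All function names are hypothetical.

```python
import math


def logistic(value: float, slope: float, midpoint: float) -> float:
    """Assumed logistic weight 1 / (1 + exp(-slope * (value - midpoint)))."""
    return 1.0 / (1.0 + math.exp(-slope * (value - midpoint)))


def user_score(
    post_ages: list[float],
    follow_ranks: list[float],
    alpha: float = 0.5,
    s: float = -0.2,
    x0: float = 12.0,
    t: float = -0.2,
    y0: float = 12.0,
) -> float:
    """Blend the post-based and follow-based parts with weight alpha."""
    post_part = (
        sum(logistic(x, s, x0) for x in post_ages) / len(post_ages)
        if post_ages else 0.0
    )
    follow_part = (
        sum(logistic(y, t, y0) for y in follow_ranks) / len(follow_ranks)
        if follow_ranks else 0.0
    )
    return alpha * post_part + (1.0 - alpha) * follow_part


def meets_fourth_threshold(score: float, fourth_threshold: float) -> bool:
    """Assumed decision rule: at or above the threshold means target user."""
    return score >= fourth_threshold
```

Setting `alpha=1.0` makes the score depend only on the posts, and `alpha=0.0` only on the followed accounts, which matches the described role of α.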
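In the above method for acquiring a target user, public information including information content and a publication time is first acquired from the user's social account, together with target account information, followed by the user's social account, that includes classification information and ranking information of the target accounts. The public information related to the target feature information is then determined according to the target feature information and each item of public information, and the target account information related to the target feature information is determined according to the target feature information and each item of target account information. Finally, whether the user is a target user is determined according to the determined public information and target account information related to the target feature information. Because the public information includes the publication time and the target account information includes the ranking information of the target accounts, the influence of the time factor on target-user acquisition is fully taken into account, so target users can be determined more accurately.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Corresponding to the method for acquiring a target user described in the above embodiments, FIG. 6 shows a schematic diagram of the operating environment of the target-user acquisition program provided by an embodiment of the present invention. For ease of description, only the parts related to this embodiment are shown.
In this embodiment, the target-user acquisition program 600 is installed and runs in an electronic device 60. The electronic device 60 may be a mobile terminal, a handheld computer, a server, or the like. The electronic device 60 may include, but is not limited to, a memory 603, a processor 601, and a display 602. FIG. 6 shows only the electronic device 60 with components 601-603; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
In some embodiments, the memory 603 may be an internal storage unit of the electronic device 60, such as a hard disk or internal memory of the electronic device 60. In other embodiments, the memory 603 may be an external storage device of the electronic device 60, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device 60. Further, the memory 603 may include both an internal storage unit and an external storage device of the electronic device 60. The memory 603 is used to store application software installed on the electronic device 60 and various kinds of data, such as the program code of the target-user acquisition program 600. The memory 603 may also be used to temporarily store data that has been output or is to be output.
In some embodiments, the processor 601 may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip, and is used to run the program code stored in the memory 603 or to process data, for example to execute the target-user acquisition program 600.
In some embodiments, the display 602 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display 602 is used to display information processed in the electronic device 60 and to display a visual user interface, such as an application menu interface or an application icon interface. The components 601-603 of the electronic device 60 communicate with each other through a system bus.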
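Refer to FIG. 7, which is a program module diagram of the target-user acquisition program 600 provided by an embodiment of the present invention. In this embodiment, the target-user acquisition program 600 may be divided into one or more modules, which are stored in the memory 603 and executed by one or more processors (in this embodiment, the processor 601) to carry out the present invention. For example, in FIG. 7, the target-user acquisition program 600 may be divided into an information acquisition module 701, a determination module 702, and a processing module 703. A module, as referred to in the present invention, is a series of computer program instruction segments capable of performing a specific function, and is better suited than a program for describing the execution of the target-user acquisition program 600 in the electronic device 60. The functions of modules 701-703 are described below.
The information acquisition module 701 is configured to acquire public information published by a user's social account, the public information including information content and a publication time.
The determination module 702 is configured to determine, according to target feature information and each item of the public information, the public information related to the target feature information.
The processing module 703 is configured to determine, according to each item of public information related to the target feature information as determined by the determination module 702, whether the user is a target user.
Optionally, the information acquisition module 701 is further configured to acquire target account information followed by the user's social account, the target account information including classification information of the target account and ranking information of the target account. The determination module 702 is further configured to determine, according to the target feature information and each item of the target account information, the target account information related to the target feature information. The processing module 703 is specifically configured to determine, according to each item of public information and each item of target account information related to the target feature information as determined by the determination module, whether the user is a target user.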
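The division into modules 701-703 can be pictured as three callables wired into one program object. The sketch below is only one hypothetical way to express that division; the class name, method name, and signatures are illustrative and are not taken from the source.

```python
from typing import Callable

# Illustrative type aliases for the three modules (assumed signatures).
Acquire = Callable[[str], list[str]]
Determine = Callable[[list[str], str], list[str]]
Process = Callable[[list[str]], bool]


class TargetUserProgram:
    """Hypothetical wiring of the three program modules."""

    def __init__(self, acquire: Acquire, determine: Determine, process: Process):
        self.acquire = acquire      # plays the role of module 701
        self.determine = determine  # plays the role of module 702
        self.process = process      # plays the role of module 703

    def run(self, user_id: str, target_feature: str) -> bool:
        """Acquire the user's items, keep the related ones, then decide."""
        items = self.acquire(user_id)
        related = self.determine(items, target_feature)
        return self.process(related)
```

Because each module is passed in as a callable, the same wiring works whether the functions are merged into one unit or split further, in line with the remark that the division is only logical.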
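Optionally, the determination module 702 may be divided into an extraction unit 801 and a determination unit 802. The extraction unit 801 is configured to extract first classification feature information from each item of the public information, the first classification feature information including keywords and/or identifiers. The determination unit 802 is configured to determine, according to the first classification feature information of each item of the public information and the target feature information, whether each item of the public information is related to the target feature information.
The extraction unit 801 is further configured to extract second classification feature information from each item of the target account information, the second classification feature information including keywords and/or identifiers. The determination unit 802 is further configured to determine, according to the second classification feature information of each item of the target account information and the target feature information, whether each item of the target account information is related to the target feature information.
Optionally, the processing module 703 may be divided into a model establishment unit 901 and a judgment unit 902. The model establishment unit 901 is configured to establish a weight model of the user according to the public information and target account information related to the target feature information as determined by the determination module. The judgment unit 902 is configured to determine whether the user is a target user according to the weight model of the user.
Specifically, the weight model established by the model establishment unit 901 is:
Figure PCTCN2017099699-appb-000009
where l denotes one item of classification feature information, Su(l) is the weight of the user related to classification feature information l,
Figure PCTCN2017099699-appb-000010
is the weight of the user related to classification feature information l on public information,
Figure PCTCN2017099699-appb-000011
is the weight of the user related to classification feature information l on target account information, α∈[0,1], n is the number of items of public information published by the user that are related to classification feature information l, and k is the number of target accounts followed by the user that are related to classification feature information l.
Figure PCTCN2017099699-appb-000012
where s and x0 are preset coefficients, and x denotes the time difference between the publication time of an item of public information related to classification feature information l and the time at which the crawler acquired it.
Figure PCTCN2017099699-appb-000013
where t and y0 are preset coefficients, and y denotes the ranking information of a target account related to classification feature information l.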
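A person skilled in the art can clearly understand that, for convenience and brevity of description, the above division into functional units and modules is only an example. In practice, the above functions may be assigned to different functional units or modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, refer to the corresponding process in the foregoing method embodiments; details are not repeated here.
A person of ordinary skill in the art can appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the system embodiments described above are merely illustrative. The division into modules or units is only a division by logical function; in actual implementation there may be other ways of dividing them, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.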

Claims (20)

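  1. A method for acquiring a target user, characterized by comprising:
    acquiring public information published by a social account of a user, the public information comprising information content and a publication time, and determining, according to target feature information and each item of the public information, public information related to the target feature information;
    determining, according to each item of the determined public information related to the target feature information, whether the user is a target user.
  2. The method for acquiring a target user according to claim 1, characterized by further comprising:
    acquiring target account information followed by the social account of the user, the target account information comprising classification information of a target account and ranking information of the target account, and determining, according to the target feature information and each item of the target account information, each item of target account information related to the target feature information;
    wherein the determining, according to each item of the determined public information related to the target feature information, whether the user is a target user is specifically:
    determining, according to each item of the determined public information and each item of the determined target account information related to the target feature information, whether the user is a target user.
  3. The method for acquiring a target user according to claim 2, characterized in that the determining, according to target feature information and each item of the public information, public information related to the target feature information comprises:
    extracting first classification feature information from each item of the public information, the first classification feature information comprising keywords and/or identifiers;
    determining, according to the first classification feature information of each item of the public information and the target feature information, whether each item of the public information is related to the target feature information;
    and the determining, according to the target feature information and each item of the target account information, target account information related to the target feature information comprises:
    extracting second classification feature information from each item of the target account information, the second classification feature information comprising keywords and/or identifiers;
    determining, according to the second classification feature information of each item of the target account information and the target feature information, whether each item of the target account information is related to the target feature information.
  4. The method for acquiring a target user according to claim 2, characterized in that the determining, according to the determined public information and target account information related to the target feature information, whether the user is a target user comprises:
    establishing a weight model of the user according to the determined public information and target account information related to the target feature information;
    determining whether the user is a target user according to the weight model of the user.
  5. The method for acquiring a target user according to claim 4, characterized in that the weight model of the user is specifically:
    Figure PCTCN2017099699-appb-100001
    where l denotes one item of classification feature information, Su(l) is the weight of the user related to classification feature information l,
    Figure PCTCN2017099699-appb-100002
    is the weight of the user related to classification feature information l on public information,
    Figure PCTCN2017099699-appb-100003
    is the weight of the user related to classification feature information l on target account information, α∈[0,1], n is the number of items of public information published by the user that are related to classification feature information l, and k is the number of target accounts followed by the user that are related to classification feature information l;
    Figure PCTCN2017099699-appb-100004
    where s and x0 are preset coefficients, and x denotes the time difference between the publication time of an item of public information related to classification feature information l and the time at which the crawler acquired it;
    Figure PCTCN2017099699-appb-100005
    where t and y0 are preset coefficients, and y denotes the ranking information of a target account related to classification feature information l.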
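  6. An apparatus for acquiring a target user, characterized by comprising:
    an information acquisition module, configured to acquire public information published by a social account of a user, the public information comprising information content and a publication time;
    a determination module, configured to determine, according to target feature information and each item of the public information, public information related to the target feature information;
    a processing module, configured to determine, according to each item of public information related to the target feature information as determined by the determination module, whether the user is a target user.
  7. The apparatus for acquiring a target user according to claim 6, characterized in that the information acquisition module is further configured to acquire target account information followed by the social account of the user, the target account information comprising classification information of a target account and ranking information of the target account;
    the determination module is further configured to determine, according to the target feature information and each item of the target account information, target account information related to the target feature information;
    the processing module is specifically configured to determine, according to each item of public information and each item of target account information related to the target feature information as determined by the determination module, whether the user is a target user.
  8. The apparatus for acquiring a target user according to claim 7, characterized in that the determination module comprises:
    an extraction unit, configured to extract first classification feature information from each item of the public information, the first classification feature information comprising keywords and/or identifiers;
    a determination unit, configured to determine, according to the first classification feature information of each item of the public information and the target feature information, whether each item of the public information is related to the target feature information;
    the extraction unit is further configured to extract second classification feature information from each item of the target account information, the second classification feature information comprising keywords and/or identifiers;
    the determination unit is further configured to determine, according to the second classification feature information of each item of the target account information and the target feature information, whether each item of the target account information is related to the target feature information.
  9. The apparatus for acquiring a target user according to claim 7, characterized in that the processing module comprises:
    a model establishment unit, configured to establish a weight model of the user according to the public information and target account information related to the target feature information as determined by the determination module;
    a judgment unit, configured to determine whether the user is a target user according to the weight model of the user.
  10. The apparatus for acquiring a target user according to claim 9, characterized in that the weight model established by the model establishment unit is specifically:
    Figure PCTCN2017099699-appb-100006
    where l denotes one item of classification feature information, Su(l) is the weight of the user related to classification feature information l,
    Figure PCTCN2017099699-appb-100007
    is the weight of the user related to classification feature information l on public information,
    Figure PCTCN2017099699-appb-100008
    is the weight of the user related to classification feature information l on target account information, α∈[0,1], n is the number of items of public information published by the user that are related to classification feature information l, and k is the number of target accounts followed by the user that are related to classification feature information l;
    Figure PCTCN2017099699-appb-100009
    where s and x0 are preset coefficients, and x denotes the time difference between the publication time of an item of public information related to classification feature information l and the time at which the crawler acquired it;
    Figure PCTCN2017099699-appb-100010
    where t and y0 are preset coefficients, and y denotes the ranking information of a target account related to classification feature information l.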
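  11. An electronic device for acquiring a target user, characterized by comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor implements the following steps when executing the computer program:
    acquiring public information published by a social account of a user, the public information comprising information content and a publication time, and determining, according to target feature information and each item of the public information, public information related to the target feature information;
    determining, according to each item of the determined public information related to the target feature information, whether the user is a target user.
  12. The electronic device for acquiring a target user according to claim 11, characterized in that the processor further implements the following steps when executing the computer program:
    acquiring target account information followed by the social account of the user, the target account information comprising classification information of a target account and ranking information of the target account, and determining, according to the target feature information and each item of the target account information, each item of target account information related to the target feature information;
    wherein the determining, according to each item of the determined public information related to the target feature information, whether the user is a target user is specifically:
    determining, according to each item of the determined public information and each item of the determined target account information related to the target feature information, whether the user is a target user.
  13. The electronic device for acquiring a target user according to claim 12, characterized in that the determining, according to target feature information and each item of the public information, public information related to the target feature information comprises:
    extracting first classification feature information from each item of the public information, the first classification feature information comprising keywords and/or identifiers;
    determining, according to the first classification feature information of each item of the public information and the target feature information, whether each item of the public information is related to the target feature information;
    and the determining, according to the target feature information and each item of the target account information, target account information related to the target feature information comprises:
    extracting second classification feature information from each item of the target account information, the second classification feature information comprising keywords and/or identifiers;
    determining, according to the second classification feature information of each item of the target account information and the target feature information, whether each item of the target account information is related to the target feature information.
  14. The electronic device for acquiring a target user according to claim 12, characterized in that the determining, according to the determined public information and target account information related to the target feature information, whether the user is a target user comprises:
    establishing a weight model of the user according to the determined public information and target account information related to the target feature information;
    determining whether the user is a target user according to the weight model of the user.
  15. The electronic device for acquiring a target user according to claim 14, characterized in that the weight model of the user is specifically:
    Figure PCTCN2017099699-appb-100011
    where l denotes one item of classification feature information, Su(l) is the weight of the user related to classification feature information l,
    Figure PCTCN2017099699-appb-100012
    is the weight of the user related to classification feature information l on public information,
    Figure PCTCN2017099699-appb-100013
    is the weight of the user related to classification feature information l on target account information, α∈[0,1], n is the number of items of public information published by the user that are related to classification feature information l, and k is the number of target accounts followed by the user that are related to classification feature information l;
    Figure PCTCN2017099699-appb-100014
    where s and x0 are preset coefficients, and x denotes the time difference between the publication time of an item of public information related to classification feature information l and the time at which the crawler acquired it;
    Figure PCTCN2017099699-appb-100015
    where t and y0 are preset coefficients, and y denotes the ranking information of a target account related to classification feature information l.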
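  16. A computer-readable storage medium storing a computer program, characterized in that the computer program implements the following steps when executed by at least one processor:
    acquiring public information published by a social account of a user, the public information comprising information content and a publication time, and determining, according to target feature information and each item of the public information, public information related to the target feature information;
    determining, according to each item of the determined public information related to the target feature information, whether the user is a target user.
  17. The computer-readable storage medium according to claim 16, characterized in that the computer program further implements the following steps when executed by at least one processor:
    acquiring target account information followed by the social account of the user, the target account information comprising classification information of a target account and ranking information of the target account, and determining, according to the target feature information and each item of the target account information, each item of target account information related to the target feature information;
    wherein the determining, according to each item of the determined public information related to the target feature information, whether the user is a target user is specifically:
    determining, according to each item of the determined public information and each item of the determined target account information related to the target feature information, whether the user is a target user.
  18. The computer-readable storage medium according to claim 17, characterized in that the determining, according to target feature information and each item of the public information, public information related to the target feature information comprises:
    extracting first classification feature information from each item of the public information, the first classification feature information comprising keywords and/or identifiers;
    determining, according to the first classification feature information of each item of the public information and the target feature information, whether each item of the public information is related to the target feature information;
    and the determining, according to the target feature information and each item of the target account information, target account information related to the target feature information comprises:
    extracting second classification feature information from each item of the target account information, the second classification feature information comprising keywords and/or identifiers;
    determining, according to the second classification feature information of each item of the target account information and the target feature information, whether each item of the target account information is related to the target feature information.
  19. The computer-readable storage medium according to claim 17, characterized in that the determining, according to the determined public information and target account information related to the target feature information, whether the user is a target user comprises:
    establishing a weight model of the user according to the determined public information and target account information related to the target feature information;
    determining whether the user is a target user according to the weight model of the user.
  20. The computer-readable storage medium according to claim 19, characterized in that the weight model of the user is specifically:
    Figure PCTCN2017099699-appb-100016
    where l denotes one item of classification feature information, Su(l) is the weight of the user related to classification feature information l,
    Figure PCTCN2017099699-appb-100017
    is the weight of the user related to classification feature information l on public information,
    Figure PCTCN2017099699-appb-100018
    is the weight of the user related to classification feature information l on target account information, α∈[0,1], n is the number of items of public information published by the user that are related to classification feature information l, and k is the number of target accounts followed by the user that are related to classification feature information l;
    Figure PCTCN2017099699-appb-100019
    where s and x0 are preset coefficients, and x denotes the time difference between the publication time of an item of public information related to classification feature information l and the time at which the crawler acquired it;
    Figure PCTCN2017099699-appb-100020
    where t and y0 are preset coefficients, and y denotes the ranking information of a target account related to classification feature information l.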
PCT/CN2017/099699 2017-05-10 2017-08-30 Method, apparatus, electronic device, and medium for acquiring a target user WO2018205458A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710326329.0A CN107656918B (zh) 2017-05-10 2017-05-10 Method and apparatus for acquiring a target user
CN201710326329.0 2017-05-10

Publications (1)

Publication Number Publication Date
WO2018205458A1 true WO2018205458A1 (zh) 2018-11-15

Family

ID=61127595

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/099699 WO2018205458A1 (zh) 2017-05-10 2017-08-30 Method, apparatus, electronic device, and medium for acquiring a target user

Country Status (2)

Country Link
CN (1) CN107656918B (zh)
WO (1) WO2018205458A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110362831A (zh) * 2019-07-17 2019-10-22 武汉斗鱼鱼乐网络科技有限公司 Target user identification method and apparatus, electronic device, and storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
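CN110619070B (zh) * 2018-06-04 2022-05-10 北京百度网讯科技有限公司 Article generation method and apparatus
CN111385136B (zh) * 2018-12-29 2023-01-06 华为技术服务有限公司 Method and apparatus for determining a user communication identifier
CN111198992A (zh) * 2020-01-07 2020-05-26 精硕科技(北京)股份有限公司 Method and apparatus for identifying a mother-and-baby population, electronic device, and storage medium
CN112104642B (zh) * 2020-09-11 2021-12-28 腾讯科技(深圳)有限公司 Abnormal account determination method and related apparatus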

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090216620A1 (en) * 2008-02-22 2009-08-27 Samjin Lnd., Ltd Method and system for providing targeting advertisement service in social network
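CN103870538A (zh) * 2014-01-28 2014-06-18 百度在线网络技术(北京)有限公司 Method for personalized recommendation to users, user modeling device, and system
CN104268130A (zh) * 2014-09-24 2015-01-07 南开大学 Twitter-oriented method for analyzing the deliverability of social advertisements
CN104317959A (zh) * 2014-11-10 2015-01-28 北京字节跳动网络技术有限公司 Data mining method and apparatus based on a social platform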

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
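CN103489117B (zh) * 2012-06-12 2015-07-01 深圳市腾讯计算机系统有限公司 Information delivery method and system
CN103577988B (zh) * 2012-07-24 2017-08-04 阿里巴巴集团控股有限公司 Method and apparatus for identifying a specific user
CN103544312B (zh) * 2013-11-04 2017-06-16 成都数之联科技有限公司 Recruitment information matching method based on social networks
CN104036037A (zh) * 2014-06-30 2014-09-10 小米科技有限责任公司 Method and apparatus for handling spam users
CN106354822A (zh) * 2016-08-30 2017-01-25 五八同城信息技术有限公司 Method and apparatus for acquiring a target user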

Also Published As

Publication number Publication date
CN107656918B (zh) 2019-07-05
CN107656918A (zh) 2018-02-02

Similar Documents

Publication Publication Date Title
WO2018205458A1 (zh) Method, apparatus, electronic device, and medium for acquiring a target user
US8311950B1 (en) Detecting content on a social network using browsing patterns
WO2022141861A1 (zh) Emotion classification method and apparatus, electronic device, and storage medium
US9519723B2 (en) Aggregating electronic content items from different sources
WO2020082596A1 (zh) Method and system for automatically generating user profiles based on data processing
JP5960274B2 (ja) 特徴抽出に基づいた画像の得点付け
US9223849B1 (en) Generating a reputation score based on user interactions
AU2012216321B2 (en) Share box for endorsements
JP6661790B2 (ja) テキストタイプを識別する方法、装置及びデバイス
WO2019041521A1 (zh) User keyword extraction apparatus and method, and computer-readable storage medium
US11361045B2 (en) Method, apparatus, and computer-readable storage medium for grouping social network nodes
US20160225030A1 (en) Social data collection and automated social replies
US9524526B2 (en) Disambiguating authors in social media communications
US20190199519A1 (en) Detecting and treating unauthorized duplicate digital content
WO2014085495A1 (en) Customized predictors for user actions in an online system
US20150278367A1 (en) Determination and Presentation of Content from Connections
US10146815B2 (en) Query-goal-mission structures
JP2019519019A5 (zh)
WO2011163132A2 (en) Social network user list detection and searching
WO2014085341A1 (en) Querying features based on user actions in online systems
US10970293B2 (en) Ranking search result documents
US10630632B2 (en) Systems and methods for ranking comments
WO2019062081A1 (zh) Salesperson profile forming method, electronic apparatus, and computer-readable storage medium
US10592782B2 (en) Image analysis enhanced related item decision
US10497045B2 (en) Social network data processing and profiling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17908853

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17908853

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 13/01/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 17908853

Country of ref document: EP

Kind code of ref document: A1