WO2022199494A1 - User interest-based content recommendation method, and terminal device

Publication number
WO2022199494A1
WO2022199494A1 PCT/CN2022/081770 CN2022081770W WO2022199494A1 WO 2022199494 A1 WO2022199494 A1 WO 2022199494A1 CN 2022081770 W CN2022081770 W CN 2022081770W WO 2022199494 A1 WO2022199494 A1 WO 2022199494A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
operation behavior
behavior data
user operation
interest
Application number
PCT/CN2022/081770
Other languages
French (fr)
Chinese (zh)
Inventor
邢超
赵洋
赵路德
陈少杰
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2022199494A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Definitions

  • The present application relates to the technical field of data processing, and in particular to a content recommendation method based on user interests and a terminal device.
  • Embodiments of the present application provide a content recommendation method based on user interests and a terminal device, so as to improve the accuracy and efficiency of interest-based content recommendation while protecting user privacy data.
  • In a first aspect, an embodiment of the present application provides a method for recommending content based on user interests, and the method can be applied to a terminal device.
  • The method includes: collecting a plurality of user operation behavior data input by a user when the user uses a target application one or more times within a set duration; performing desensitization processing on the collected user operation behavior data, where the desensitization processing filters out the private data related to the user from the user operation behavior data; and sending the desensitized user operation behavior data to a server, so that the server can analyze the desensitized user operation behavior data to obtain a topic interest table of users of the target application.
  • The terminal device collects the user operation behavior data in the target application, but the user privacy data is desensitized on the terminal device before being uploaded to the server, so the server does not obtain the user's private data.
  • The target application may be any application to which the interest-based content recommendation provided in the embodiments of the present application is applicable, such as a browser.
  • The terminal device collecting the plurality of user operation behavior data input by the user when the user uses the target application one or more times within the set duration may be implemented as: when an instruction from the user to start the target application is detected, starting the target application; after the target application is started, collecting at least one piece of operation data performed by the user on the target application; when an instruction from the user to exit the target application is detected, closing the target application; and storing the operation data collected between starting and closing the target application as one set of user operation behavior data.
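  • The start-to-close collection described above can be sketched as follows. This is a minimal illustration, not the implementation in the application: the class name `SessionRecorder`, its callback names, and the string encoding of operations are all hypothetical.

```python
from typing import List, Optional


class SessionRecorder:
    """Groups the operations of one app session (start to close) into one set."""

    def __init__(self) -> None:
        self.sessions: List[List[str]] = []        # one list per completed session
        self._current: Optional[List[str]] = None  # None while the app is closed

    def on_app_start(self) -> None:
        # A start instruction opens a new, empty session.
        self._current = []

    def on_operation(self, operation: str) -> None:
        # Operations are only recorded while the app is running.
        if self._current is not None:
            self._current.append(operation)

    def on_app_exit(self) -> None:
        # An exit instruction stores the session if it holds at least one operation.
        if self._current:
            self.sessions.append(self._current)
        self._current = None


recorder = SessionRecorder()
recorder.on_app_start()
recorder.on_operation("search:football scores")
recorder.on_operation("open:article-17")
recorder.on_app_exit()
recorder.on_operation("ignored: app is closed")
```

  • Each entry in `recorder.sessions` then corresponds to one set of user operation behavior data for one use of the target application.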
  • The terminal device collects the user operation behavior data generated when the user uses the target application, desensitizes it, and then uploads it to the server. This makes it convenient for the server to perform big data analysis on the desensitized user operation behavior data and obtain the user interests in the target application.
  • For one or more of the plurality of user operation behavior data, the terminal device randomly replaces the data with user operation behavior data under the same interest topic based on a differential privacy algorithm, where the interest topic is determined according to the topic interest table; the terminal device also strips the user information contained in the plurality of user operation behavior data.
  • In this way, the terminal device masks the real user operation behavior data and thereby protects user privacy.
  • In addition, before uploading the user operation behavior data to the server, the terminal device can strip the user information so that the server cannot collect the user's private data, thereby ensuring the privacy and security of the user data.
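  • The two desensitization steps (random replacement within the same interest topic, and stripping of user information) can be sketched as below. The application does not say which differential privacy algorithm is used; this sketch swaps in k-ary randomized response as one possible choice. The record layout, the field names in `PRIVATE_FIELDS`, the `catalog` mapping from topic to candidate items, and the assumption that each record already carries its topic are all hypothetical.

```python
import math
import random
from typing import Dict, List, Optional

PRIVATE_FIELDS = {"user_id", "device_id"}  # hypothetical names of user-identifying fields


def randomized_replace(item: str, topic_items: List[str], epsilon: float,
                       rng: random.Random) -> str:
    """k-ary randomized response over the items of one interest topic."""
    k = len(topic_items)
    if k < 2:
        return item  # nothing to swap with
    keep_prob = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < keep_prob:
        return item
    # Otherwise report a uniformly chosen different item from the same topic.
    others = [candidate for candidate in topic_items if candidate != item]
    return rng.choice(others)


def desensitize(records: List[Dict[str, str]], catalog: Dict[str, List[str]],
                epsilon: float = 1.0, seed: Optional[int] = None) -> List[Dict[str, str]]:
    """Strip user information, then randomly replace each item within its topic."""
    rng = random.Random(seed)
    cleaned = []
    for record in records:
        clean = {key: value for key, value in record.items() if key not in PRIVATE_FIELDS}
        clean["item"] = randomized_replace(clean["item"], catalog[clean["topic"]], epsilon, rng)
        cleaned.append(clean)
    return cleaned


catalog = {"sports": ["match report", "transfer news", "league table"]}
records = [{"user_id": "u-1", "device_id": "d-9", "topic": "sports", "item": "match report"}]
uploaded = desensitize(records, catalog, epsilon=1.0, seed=7)
```

  • A smaller `epsilon` makes replacement more likely, which masks the real operations more strongly at the cost of noisier statistics on the server.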
  • The terminal device may also determine the sequence length of each user operation behavior data, and truncate or supplement (pad) the user operation behavior data according to a preset value to obtain user operation behavior data with a specified sequence length.
  • If the user operation behavior data contains too little operation data, that is, the sequence length is short, accurate user interests cannot be analyzed from it.
  • If the user operation behavior data contains too much operation data, that is, the sequence length is long, the amount of calculation becomes excessive. Therefore, sampling the user operation behavior data according to the preset value to obtain data with a relatively uniform sequence length improves the efficiency of desensitizing the user operation behavior data, and improves the efficiency and accuracy of the server's statistical analysis of the user operation behavior data.
  • The terminal device truncating or supplementing the user operation behavior data according to the preset value to obtain user operation behavior data with the specified sequence length may be implemented as: if the sequence length of the user operation behavior data is less than the preset value, supplementing the user operation behavior data with predefined user operation behavior data of a target length to obtain user operation behavior data of the specified sequence length; or, if the sequence length of the user operation behavior data is greater than the preset value, truncating the target length from the user operation behavior data to obtain user operation behavior data of the specified sequence length; where the target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value.
  • A specific scenario of sampling according to the preset value is given. After the sequence length of the user operation behavior data is determined, user operation behavior data whose sequence length is less than the preset value can be supplemented with predefined default user operation behavior data, and user operation behavior data whose sequence length is greater than the preset value can be randomly truncated. After the user operation behavior data is sampled according to the preset value in this way, user operation behavior data with a relatively uniform sequence length is obtained, which facilitates desensitization processing.
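  • The padding and truncation rule above can be sketched as follows. The placeholder `DEFAULT_OPERATION` and the function name `fit_to_length` are hypothetical, and "random truncation" is interpreted here, as an assumption, as keeping a randomly positioned contiguous window of the preset length.

```python
import random
from typing import List, Optional

DEFAULT_OPERATION = "<default>"  # hypothetical predefined filler operation


def fit_to_length(sequence: List[str], preset: int,
                  rng: Optional[random.Random] = None) -> List[str]:
    """Return a copy of `sequence` whose length equals `preset`."""
    rng = rng or random.Random()
    # Target length: absolute difference between the sequence length and the preset value.
    target = abs(len(sequence) - preset)
    if len(sequence) < preset:
        # Too short: append `target` predefined default operations.
        return list(sequence) + [DEFAULT_OPERATION] * target
    if len(sequence) > preset:
        # Too long: drop `target` operations by keeping a random contiguous window.
        start = rng.randint(0, target)
        return list(sequence[start:start + preset])
    return list(sequence)


padded = fit_to_length(["search", "open"], 4)
trimmed = fit_to_length(list("abcdef"), 4, random.Random(0))
```

  • Both results have exactly four entries, so every sequence handed to the desensitization step has the same length.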
  • In a second aspect, an embodiment of the present application provides a method for recommending content based on user interests, and the method can be applied to a server.
  • The method includes: receiving a plurality of desensitized user operation behavior data sent by one or more terminal devices, where the desensitized user operation behavior data is obtained by the one or more terminal devices collecting a plurality of user operation behavior data input by a user when the user uses a target application one or more times within a set duration and desensitizing the collected user operation behavior data, and the desensitization processing filters out the private data related to the user from the user operation behavior data; analyzing the desensitized user operation behavior data to obtain a topic interest table of users of the target application; and sending the topic interest table to the one or more terminal devices.
  • Because the server has better computing power than the terminal device and can integrate the user operation behavior data sent by a plurality of terminal devices to analyze group user operation behavior, it can obtain more timely interest topics, which can also be understood as the interest topics that currently attract the attention of most users, and generate a topic interest table. The server can then send the topic interest table to the terminal device, so that the terminal device can perform real-time recommendation in combination with the topic interest table and improve the user experience.
  • The server analyzing the desensitized user operation behavior data to obtain the topic interest table of users of the target application may be implemented as: inputting the desensitized user operation behavior data into a pre-built topic interest model, so as to perform unsupervised learning on the desensitized user operation behavior data; and obtaining the topic interest table output by the pre-built topic interest model.
  • In this way, a topic interest table can be obtained from a large amount of user operation behavior data sent by multiple terminal devices, and the resulting table better reflects the interest topics that most users currently pay attention to.
  • In a third aspect, an embodiment of the present application provides a method for recommending content based on user interests, and the method can be applied to a terminal device.
  • The method includes: receiving a topic interest table sent by a server, where the topic interest table is obtained by the server analyzing a plurality of desensitized user operation behavior data, the desensitized user operation behavior data is obtained by one or more terminal devices collecting a plurality of user operation behavior data input by a user when the user uses a target application one or more times within a set duration and desensitizing the collected user operation behavior data, and the desensitization processing filters out the private data related to the user from the user operation behavior data; and when an instruction from the user to start the target application is detected, starting the target application and displaying a first recommendation interface, where the first recommendation interface includes at least one item of recommended content determined according to the topic interest table.
  • The terminal device can perform real-time recommendation in combination with the topic interest table sent by the server.
  • The terminal device can make recommendations according to the topic interest table. By recommending content corresponding to the interest topics that most users currently pay attention to, it can avoid the cold-start problem in the target application, that is, the target application recommending content for unpopular interest topics, or being unable to recommend anything at all, and thus failing to arouse the user's browsing interest.
  • The starting the target application and displaying the first recommendation interface may be implemented as: starting the target application; displaying the first recommendation interface after the target application is started; and taking one or more interest topics included in the topic interest table as the user interest, obtaining at least one item of recommended content according to the user interest, and displaying the obtained recommended content in the first recommendation interface; where each interest topic has an associated weight value, and the greater the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
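  • The relationship between weight values and the proportion of recommended content can be illustrated with a simple allocation of recommendation slots. The application only states that a larger weight yields a higher proportion; the largest-remainder rounding used here, the function name `allocate_slots`, and the example weights are assumptions made for illustration.

```python
from typing import Dict


def allocate_slots(topic_weights: Dict[str, float], total_slots: int) -> Dict[str, int]:
    """Split `total_slots` recommendation slots across topics in proportion to weight."""
    total_weight = sum(topic_weights.values())
    if total_weight <= 0 or total_slots <= 0:
        return {topic: 0 for topic in topic_weights}
    # Exact (fractional) share of slots for every topic.
    exact = {topic: total_slots * weight / total_weight
             for topic, weight in topic_weights.items()}
    # Start from the rounded-down shares.
    slots = {topic: int(share) for topic, share in exact.items()}
    # Hand the remaining slots to the topics with the largest fractional remainders.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(exact, key=lambda topic: exact[topic] - slots[topic], reverse=True)
    for topic in by_remainder[:leftover]:
        slots[topic] += 1
    return slots


topic_interest_table = {"sports": 5, "finance": 3, "current affairs": 2}
first_page = allocate_slots(topic_interest_table, total_slots=10)
```

  • With these example weights, half of the ten slots on the first recommendation interface go to sports content, and the rest are split between finance and current affairs in a 3:2 ratio.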
  • After the target application is started and the first recommendation interface is displayed, the method further includes: collecting one or more user operation behavior data input by the user while using the target application; and when an instruction from the user to refresh the first recommendation interface is detected, displaying a second recommendation interface, where the recommended content included in the second recommendation interface is determined based on the one or more user operation behavior data and the topic interest table.
  • The user's real-time operation data can be analyzed to obtain the interest topics that the user pays more attention to, so that the user's interests are adjusted in time and a recommendation interface that better matches the user's interests can be displayed promptly.
  • The displaying of the second recommendation interface may be implemented as: determining one or more corresponding interest topics according to the one or more user operation behavior data, and assigning an associated weight value to each of these interest topics; taking the one or more interest topics corresponding to the user operation behavior data and one or more of the interest topics included in the topic interest table as the user interest; obtaining at least one item of recommended content according to the user interest; and displaying the obtained recommended content in the second recommendation interface; where each interest topic included in the topic interest table has an associated weight value, and the greater the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
  • The user's personal interests can also be taken into account, so that recommended content more in line with the user's interests is obtained, improving the user experience.
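  • One way to combine the user's real-time interest topics with the topic interest table is a weighted blend, sketched below. The application does not specify how the weight values of the two sources are combined; the blending factor `personal_share`, the frequency-based personal weights, and the assumptions that operations are already mapped to topics and that the table weights sum to 1 are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def merge_interests(session_topics: List[str], topic_table: Dict[str, float],
                    personal_share: float = 0.5) -> Dict[str, float]:
    """Blend topics seen in the current session with the server's topic interest table."""
    counts = Counter(session_topics)
    total = sum(counts.values())
    if total == 0:
        # No real-time data yet: fall back to the topic interest table alone.
        return dict(topic_table)
    # Scale down the group-level weights to make room for personal interests.
    merged = {topic: (1 - personal_share) * weight for topic, weight in topic_table.items()}
    # Add each session topic with a weight proportional to how often it occurred.
    for topic, count in counts.items():
        merged[topic] = merged.get(topic, 0.0) + personal_share * count / total
    return merged


table = {"sports": 1.0}
refreshed = merge_interests(["finance", "finance"], table)
```

  • The merged weights could then drive the content mix of the second recommendation interface in the same way the table weights drive the first.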
  • The acquiring at least one item of recommended content according to the user interest may be implemented as: searching for the recommended content corresponding to the user interest in locally cached content; or acquiring the recommended content corresponding to the user interest from the content providing server that provides it.
  • After the terminal device determines the user's interest, it can obtain recommended content related to that interest, such as hot articles and hot news, in various possible ways, improving the diversity of the recommended content.
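  • The two acquisition paths (local cache and content providing server) can be sketched as a cache lookup with a fallback. The function name `get_recommended_content`, the callable `fetch_from_server`, and the choice to store fetched content back into the cache are hypothetical; the application only states that either source may be used.

```python
from typing import Callable, Dict, List


def get_recommended_content(topic: str, local_cache: Dict[str, List[str]],
                            fetch_from_server: Callable[[str], List[str]]) -> List[str]:
    """Return content for `topic` from the local cache, else from the content server."""
    cached = local_cache.get(topic)
    if cached:
        return cached
    # Cache miss: ask the content providing server and remember the answer.
    fetched = fetch_from_server(topic)
    local_cache[topic] = fetched
    return fetched


def demo_fetch(topic: str) -> List[str]:
    # Stand-in for a network request to a content providing server.
    return [f"{topic}-article"]


cache = {"sports": ["cached-sports-article"]}
sports_content = get_recommended_content("sports", cache, demo_fetch)
finance_content = get_recommended_content("finance", cache, demo_fetch)
```

  • Serving from the cache first keeps the recommendation interface responsive, while the fallback covers topics the device has not seen before.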
  • An embodiment of the present application further provides a terminal device, including: one or more processors; and one or more memories, the one or more memories being used for storing one or more computer programs and data information, where the one or more computer programs include instructions; when the instructions are executed by the one or more processors, the terminal device is caused to perform the method according to any implementation of the first aspect above, or the method according to any implementation of the third aspect above.
  • An embodiment of the present application further provides a server, including: one or more processors; and one or more memories, the one or more memories being used for storing one or more computer programs and data information, where the one or more computer programs include instructions; when the instructions are executed by the one or more processors, the server is caused to perform the method according to any implementation of the second aspect above.
  • An embodiment of the present application further provides a communication system, including a terminal device and a server; the terminal device can perform the steps of the terminal device in the method provided in the first aspect above, or the steps of the terminal device in the method provided in the third aspect above; the server can perform the steps of the server in the method provided in the second aspect above.
  • An embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program (also referred to as code or instructions) that, when run on a computer, causes the computer to execute the above-mentioned first
  • An embodiment of the present application provides a computer program product.
  • The computer program product includes a computer program (also referred to as code or instructions) which, when executed, causes a computer to execute the method in any possible implementation of the first aspect above, the method in any possible implementation of the second aspect above, or the method in any possible implementation of the third aspect above.
  • An embodiment of the present application further provides a graphical user interface on a terminal device, where the terminal device has a display screen, one or more memories, and one or more processors, and the one or more processors are used for executing one or more computer programs stored in the one or more memories; the graphical user interface includes the graphical user interface displayed when the terminal device executes any possible implementation of the first aspect of the embodiments of the present application, or the graphical user interface displayed when it executes any possible implementation of the third aspect of the embodiments of the present application.
  • FIG. 1 is an application scenario diagram of a method for recommending content based on user interests provided by an embodiment of the present application;
  • FIG. 2a is a schematic diagram of a hardware architecture of a terminal device provided by an embodiment of the present application;
  • FIG. 2b is a block diagram of a software structure of a terminal device provided by an embodiment of the present application;
  • FIG. 3 is a schematic structural diagram of a content recommendation method based on user interests provided by an embodiment of the present application;
  • FIG. 4 is a first schematic diagram of a user interface of a method for recommending content based on user interests provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of a radar chart of user interest topics provided by an embodiment of the present application;
  • FIG. 6 is a schematic flowchart of a method for recommending content based on user interests provided by an embodiment of the present application;
  • FIG. 7 is a second schematic diagram of a user interface of a method for recommending content based on user interests provided by an embodiment of the present application;
  • FIG. 8a is a third schematic diagram of a user interface of a method for recommending content based on user interests provided by an embodiment of the present application;
  • FIG. 8b is a fourth schematic diagram of a user interface of a method for recommending content based on user interests provided by an embodiment of the present application;
  • FIG. 8c is a fifth schematic diagram of a user interface of a method for recommending content based on user interests provided by an embodiment of the present application;
  • FIG. 9 is a schematic structural diagram of a terminal device or a server provided by an embodiment of the present application.
  • Terminal devices such as mobile phones are becoming more and more popular.
  • Terminal devices not only have communication functions, but also have powerful processing capabilities, storage capabilities, and camera functions.
  • The terminal device runs applications through its operating system (for example, the Android operating system), and the user can use the terminal device to make calls, send short messages, browse web pages, take pictures, play games, watch videos, and so on.
  • The terminal device can recommend content according to the user's interests.
  • For example, the terminal device can recommend content that the user may be interested in according to the user's operation behavior, such as search words, search history, and currently browsed content.
  • The user interest can be a topic interest classification customized in the application, or a commonly used topic interest classification.
  • For example, the topic interest classification in a browser can include sports, finance, current affairs, etc.; a shopping app has its own topic interest classification.
  • A content recommendation system based on user interests needs to collect a large amount of user operation behavior data to complete content recommendation.
  • However, there is not yet a good solution for improving both the recommendation accuracy and the recommendation efficiency of a content recommendation system based on user interests.
  • The present application provides a content recommendation method based on user interests, in which user operation behavior data is collected on the terminal device side, and the user operation behavior data is uploaded to the server side after the user's sensitive private data is stripped.
  • The server side builds a topic interest table from the large amount of user operation behavior data uploaded by multiple terminal devices, and returns the topic interest table to the terminal device.
  • The terminal device may determine the user's interest by combining the topic interest table delivered by the server side, the user's real-time operation behavior data on the terminal device, historical user interests, and other factors.
  • The terminal device may request content related to the user's interest from the server side, or may obtain it from the local cache of the terminal device, so as to realize real-time content recommendation.
  • FIG. 1 is an application scenario diagram of a content recommendation method based on user interests provided by an embodiment of the present application.
  • The application scenario may include a terminal device 110, a server 120, and a database 130.
  • An application may be installed in the terminal device 110.
  • The server 120 may be a background server that communicates with the terminal device, or may be a separate server for mining potential objects.
  • The application may be a web version application or an application pre-installed in the terminal device 110.
  • The application in this application may be any type of application that makes content recommendations, for example a browser application, a short-video application, or a shopping application.
  • Both the terminal device 110 and the server 120 can access the database 130 and store the access logs generated during the user's access in the database 130.
  • The database 130 may be set on the server 120, or may be set relatively independently of the server 120.
  • The database 130 may be implemented by a server cluster, a cloud server, or a distributed storage server. It should be noted that this application does not limit the number and types of terminal devices 110, servers 120, and databases 130 included in the application scenario; for example, there may be multiple terminal devices 110 as shown in FIG. 1.
  • The user operation behavior data may be, for example, the user's operations such as searching and browsing in the browser.
  • The terminal device 110 can perform content recommendation according to the user operation behavior data and the topic interest table issued by the server 120.
  • The terminal device 110 can desensitize the user operation behavior data and upload it to the server 120 after the application is closed; the server 120 can then generate or update the topic interest table according to the massive user operation behavior data and return it to the terminal device 110.
  • The server 120 may also store the generated topic interest table or the received user operation behavior data in the database 130.
  • The terminal device 110 in this embodiment of the present application may be, for example, a mobile phone, a tablet computer, a wearable device (for example, a watch, a wristband, a helmet, or a headset), a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a laptop, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or a smart home device (for example, a smart TV, a smart speaker, or a smart camera).
  • Exemplary embodiments of the terminal device 110 to which the embodiments of this application can be applied include, but are not limited to, portable terminal devices carrying various operating systems.
  • The above-mentioned portable terminal device may also be another portable terminal device, such as a laptop computer having a touch-sensitive surface (e.g., a touch panel).
  • FIG. 2a shows a schematic diagram of the hardware structure of a possible terminal device.
  • The terminal device 110 includes components such as a radio frequency (RF) circuit 210, a power supply 220, a processor 230, a memory 240, an input unit 250, a display unit 260, an audio circuit 270, a communication interface 280, and a wireless fidelity (Wi-Fi) module 290.
  • FIG. 2a does not constitute a limitation on the terminal device. The terminal device provided in this embodiment of the present application may include more or fewer components than those shown in the figure, may combine two or more components, or may have a different component configuration.
  • The various components shown in FIG. 2a may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
  • The RF circuit 210 can be used for data reception and transmission during communication or a call. In particular, after receiving downlink data from the base station, the RF circuit 210 sends it to the processor 230 for processing; in addition, it sends the uplink data to be sent to the base station.
  • The RF circuit 210 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like.
  • The RF circuit 210 may also communicate with networks and other devices via wireless communication.
  • The wireless communication can use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, and short messaging service (SMS).
  • Wi-Fi is a short-distance wireless transmission technology, and the terminal device 110 can connect to an access point (AP) through the Wi-Fi module 290, thereby gaining access to a data network.
  • The Wi-Fi module 290 can be used for data reception and transmission during communication.
  • The terminal device 110 can be physically connected to other devices through the communication interface 280.
  • The communication interface 280 is connected to the communication interface of the other device through a cable to realize data transmission between the terminal device 110 and the other device.
  • The terminal device 110 can implement communication services and interact with the server side, so the terminal device 110 needs to have a data transmission function, that is, the terminal device 110 needs to include a communication module.
  • Although FIG. 2a shows communication modules such as the RF circuit 210, the Wi-Fi module 290, and the communication interface 280, it can be understood that the terminal device 110 includes at least one of the above components or another communication module (such as a Bluetooth module) used to implement communication for data transmission.
  • For example, when the terminal device 110 is a mobile phone, the terminal device 110 may include the RF circuit 210 and may also include the Wi-Fi module 290; when the terminal device 110 is a computer, the terminal device 110 may include the communication interface 280 and may also include the Wi-Fi module 290; when the terminal device 110 is a tablet computer, the terminal device 110 may include the Wi-Fi module 290.
  • The memory 240 may be used to store software programs and modules.
  • The processor 230 executes the various functional applications and data processing of the terminal device 110 by running the software programs and modules stored in the memory 240.
  • The memory 240 may mainly include a program storage area and a data storage area.
  • The program storage area can store the operating system (mainly including the software programs or modules corresponding to the kernel layer, the system layer, the application framework layer, and the application layer).
  • The application layer may include various applications, and in applications that can make recommendations, content recommendation based on user interests can be implemented by using the method provided by the embodiments of the present application.
  • The memory 240 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
  • the input unit 250 can be used to receive editing operations of various types of data objects such as numbers or character information input by the user, and generate key signal input related to user settings and function control of the terminal device 110 .
  • the input unit 250 may include a touch panel 251 and other input devices 252 .
  • the touch panel 251, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 251 using any suitable object or accessory, such as a finger or a stylus), and drive the corresponding connection device according to a preset program.
  • the other input devices 252 may include, but are not limited to, one or more of physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 260 may be used to display information input by the user or information provided to the user and various menus of the terminal device 110 .
  • the display unit 260 is the display system of the terminal device 110, and is used for presenting an interface and realizing human-computer interaction.
  • the display unit 260 may include a display panel 261 .
  • the display panel 261 may be configured in the form of a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED) or the like.
  • the display unit 260 may display a visual page corresponding to the user's operation on the terminal device. For example, after the user enters a search term, the display unit 260 displays the information flow, web page, etc.
  • the processor 230 is the control center of the terminal device 110. It uses various interfaces and lines to connect the various components, runs or executes the software programs and/or modules stored in the memory 240, and invokes the data stored in the memory 240 to execute the various functions of the terminal device 110 and process data, thereby realizing various services based on the terminal device.
  • the processor 230 is configured to implement the method provided by the embodiment of the present application, so as to perform more accurate content recommendation for the user.
  • the terminal device 110 also includes a power source 220 (such as a battery) for powering the various components.
  • the power supply 220 may be logically connected to the processor 230 through a power management system, so as to implement functions such as managing charging, discharging, and power consumption through the power management system.
  • the terminal device 110 further includes an audio circuit 270 , a microphone 271 and a speaker 272 , which can provide an audio interface between the user and the terminal device 110 .
  • the audio circuit 270 can be used to convert the audio data into a signal that can be recognized by the speaker 272, and transmit the signal to the speaker 272, and the speaker 272 converts it into a sound signal and outputs it.
  • the microphone 271 is used to collect external sound signals (such as voices of people speaking or other sounds, etc.), convert the collected external sound signals into signals that can be recognized by the audio circuit 270 , and send them to the audio circuit 270 .
  • the audio circuit 270 can also be used to convert the signal sent by the microphone 271 into audio data, and then output the audio data to the RF circuit 210 for transmission to, for example, another terminal, or output the audio data to the memory 240 for further processing.
  • the terminal device 110 may further include at least one type of sensor, camera, etc., which will not be repeated here.
  • the operating system (operating system, OS) involved in the embodiments of the present application is the most basic system software running on the terminal device 110 .
  • the operating system may be an Android system or an iOS system.
  • the following embodiments take the Android system as an example for introduction. Those skilled in the art can understand that a similar method can also be used for implementation in other operating systems.
  • the software system of the terminal device 110 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the embodiments of the present application take an Android system with a layered architecture as an example to illustrate the software structure of the terminal device 110.
  • FIG. 2b shows a software structural block diagram of an Android system provided by an embodiment of the present application.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate with each other through software interfaces.
  • the Android system is divided into five layers, which are, from top to bottom, the application layer, the application framework layer, the Android runtime and system library, the hardware abstraction layer, and the kernel layer.
  • the application layer is the top layer of the operating system and can include a series of application packages.
  • the application layer may include native applications of the operating system and third-party applications. The native applications of the operating system may include the user interface (UI), browser, camera, settings, phone manager, music, text messages, calls, etc.; third-party applications may include maps, shopping apps, short-video apps, etc.
  • the applications mentioned below may be native applications of the operating system installed on the terminal device 110 when it leaves the factory, or may be third-party applications downloaded from the network or acquired from other terminal devices 110 by the user during the use of the terminal device 110 .
  • the application layer may be used to implement the presentation of an editing interface, and the above-mentioned editing interface may be used by a user to perform operations.
  • the user may perform user operations such as inputting a search term on the editing interface correspondingly presented by the browser.
  • an application can be developed in the Java language by calling the application programming interface (API) provided by the application framework layer. Through the application framework layer, developers can interact with the lower layers of the system (such as the hardware abstraction layer and the kernel layer) to develop their own applications.
  • the application framework layer is mainly a series of services and management systems of the operating system.
  • the application framework layer provides application programming interfaces and programming frameworks for applications in the application layer.
  • the application framework layer includes some predefined functions. As shown in Figure 2b, the application framework layer may include a window manager, content provider, view system, telephony manager, resource manager, notification manager, etc.
  • a window manager is used to manage window programs.
  • the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, take screenshots, etc.
  • Content providers are used to store and retrieve data and make these data accessible to applications.
  • the data may include video, images, audio, calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as text controls that display text, picture controls that display pictures, and so on. View systems can be used to build applications.
  • a display interface can consist of one or more views.
  • the telephony manager is used to provide communication functions of the terminal device 110, such as management of call status display (including connecting, hanging up, etc.).
  • the resource manager provides various resources for the application, such as localization strings, icons, pictures, layout files, video files, etc.
  • the application framework layer is mainly responsible for invoking a service interface that communicates with the hardware abstraction layer, so as to transmit the user's operation request to the hardware abstraction layer. The operation request may include an operation request corresponding to the user opening a certain app, or an operation request corresponding to a search term entered by the user in an app, etc.
  • the hardware abstraction layer generates the corresponding content recommendation service according to the operation request passed by the application layer.
  • the content recommendation service may include a data collection module, a data calibration module, a real-time recommendation module, a privacy protection module, and the like for implementing the method provided by the present application.
  • the data collection module is used to collect the user operation behavior of the user on the client terminal on the terminal device, so as to obtain the user operation behavior data.
  • the data calibration module is used for preprocessing the user operation behavior data collected by the data collection module to obtain user operation behavior data with a relatively uniform sequence length.
  • the privacy protection module is used for desensitizing the collected user operation behavior data, stripping or replacing the user privacy data involved in the user operation behavior data, etc., so as to obtain user operation behavior data that does not reveal the user's privacy.
  • the desensitized user operation behavior data is transmitted to the server side, and the desensitized user operation behavior data is used to construct a topic interest model and generate a topic interest table.
  • the real-time recommendation module is used to perform real-time content recommendation according to the determined user interests.
  • Android runtime includes core libraries and virtual machines.
  • the Android runtime is responsible for the scheduling and management of the Android system.
  • the core library of the Android system consists of two parts: one is the functions that the Java language needs to call, and the other is the core library of Android.
  • the application layer and the application framework layer run in virtual machines. Taking Java as an example, the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, safety and exception management, and garbage collection.
  • a system library can include multiple functional modules, for example: a surface manager, a media library, a three-dimensional (3D) graphics processing library (e.g., OpenGL ES), a two-dimensional (2D) graphics engine (e.g., SGL), and so on.
  • the Surface Manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the hardware abstraction layer is the support of the application framework layer and an important link between the application framework layer and the kernel layer. It can provide services for developers through the application framework layer.
  • the function of the content recommendation service in the embodiment of the present application may be implemented by configuring a first process in the hardware abstraction layer, and the first process may be a sub-process independently constructed in the hardware abstraction layer.
  • the first process may include modules such as a content recommendation service configuration interface, a content recommendation service controller, and the like.
  • the content recommendation service configuration interface is a service interface that communicates with the application framework layer.
  • the content recommendation service controller is used to monitor the content recommendation service configuration interface, for example, to control whether the content recommendation service needs to be authenticated, etc., and is also responsible for monitoring whether the data input in the terminal device 110 needs to be cached or updated.
  • the hardware abstraction layer may further include a daemon process, and the daemon process may be used to cache data in the first process, and the daemon process may also be a subprocess constructed separately in the hardware abstraction layer.
  • the kernel layer can be the Linux kernel (Linux kernel) layer, which is an abstraction layer between hardware and software.
  • the kernel layer has many drivers related to the terminal device 110, including at least display drivers, Linux-based frame buffer drivers, keyboard drivers and mouse drivers as input devices, flash drivers based on memory technology devices, audio drivers, Bluetooth drivers, etc. This embodiment of the present application does not impose any limitation on this.
  • the Linux kernel layer is used to provide the core system services of the operating system, such as security, memory management, process management, network protocol stack and driver model, all based on the Linux kernel.
  • the terminal device 110 can run multiple applications at the same time. In the simple case, one application corresponds to one process; in more complex cases, one application can correspond to multiple processes. Each process has a process number (process ID).
  • the following describes, by way of example, how the terminal device 110 implements the present application in the scenario of content recommendation based on user interests.
  • "At least one" refers to one or more, and "a plurality" refers to two or more.
  • "And/or" describes the association relationship between associated objects and means that three relationships are possible. For example, "A and/or B" can mean that A exists alone, that A and B exist at the same time, or that B exists alone, where A and B can be singular or plural.
  • the character "/" generally indicates that the associated objects are in an "or" relationship.
  • "At least one of the following (items)" or a similar expression refers to any combination of these items, including any combination of a single item or a plurality of items.
  • for example, "at least one of a, b, or c" may represent: a, b, c, a and b, a and c, b and c, or a, b and c, where a, b, and c can each be single or multiple.
  • the multiple involved in the embodiments of the present application refers to greater than or equal to two.
  • in the embodiments of the present application, "terminal device", "device", "mobile phone", etc. may be used interchangeably; they all refer to devices that can be used to implement the embodiments of the present application. "Application" and "client" may also be used interchangeably; both refer to programs that have certain service provision capabilities. For example, browser clients and game clients can also be called browser applications or game applications.
  • the hardware structure of the terminal device may be as shown in FIG. 2a, and the software architecture may be as shown in FIG. 2b. The software programs and/or modules corresponding to the software architecture in the terminal device may be stored in the memory 240, and the processor 230 may run the software programs and applications stored in the memory 240 to execute the flow of the method for recommending content based on user interests provided by the embodiments of the present application.
  • each functional module in each embodiment of the present application may be integrated in one processor, or may exist independently physically, or two or more modules may be integrated in one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software programs.
  • the terminal device side may include a data collection module 301, a data calibration module 302, a privacy protection module 303, and a real-time recommendation module 306; the server side may include a data statistics module 304 and a data analysis module 305.
  • the data collection module 301 on the terminal device side is used to collect user operation behavior data. The collected user operation behavior data can be passed on to the data calibration module 302 for preprocessing, and can also be sent to the real-time recommendation module 306 for recommending content based on user interests.
  • the privacy protection module 303 on the terminal device side is configured to further perform desensitization processing on the user operation behavior data preprocessed by the data calibration module 302, and then send it to the data statistics module 304 on the server side.
  • the data statistics module 304 on the server side is configured to statistically summarize the desensitized user operation behavior data sent by one or more terminal devices (only one terminal device is shown as an example in FIG. 3; if there are multiple terminal devices, the processing procedures of the other terminal devices are similar and are not repeated here), and then send the summarized data to the data analysis module 305. The data analysis module 305 performs training on the large amount of aggregated user operation behavior data to generate a topic interest table.
  • the data analysis module 305 on the server side can return the generated topic interest table to the terminal device.
  • on the one hand, the data analysis module 305 can send the generated topic interest table to the data calibration module 302, so that the data calibration module 302 can use it as a reference when preprocessing the user operation behavior data; on the other hand, the data analysis module 305 can also send the generated topic interest table to the real-time recommendation module 306, so that the real-time recommendation module 306 can perform real-time content recommendation based on user interests in combination with the topic interest table.
  • the data statistics module 304 on the server side can also be integrated with the data analysis module 305 into one module.
  • Stage 1: the data collection module 301 on the terminal device collects user operation behavior data.
  • the implementation of the content recommendation method based on the user's interests often needs to be based on massive user operation behavior data.
  • a large amount of user operation behavior data is generally generated by the user in the process of using the terminal device.
  • the user operations from the time the user starts an application to the time the user closes it can be regarded as one complete group of user operation behaviors. The operation data contained in this group of user operation behaviors can be recorded, for example, through a session object. A session refers to the time interval during which a terminal user communicates with a server that provides application services, usually the time elapsed from when the user registers and enters the server that provides the service until the user logs out of the server.
  • a session object may contain one or more operation data, and the number and type of operation data are not limited in this application.
  • the time from user A opening the browser to exiting the browser may be recorded as a session.
  • user A opens the browser, which may be recorded as the beginning of a session, such as interface 1 in FIG. 4 .
  • the terminal device can record all operations of user A in this session through the data collection module 301, as shown in interface 2 in FIG. 4, which shows the home page interface of the browser.
  • when the terminal device detects the user operation behavior of user A exiting the browser, this is recorded as the end of the session. For example, the terminal device detects that the current display interface has changed to the main interface of the mobile phone (interface 3 in FIG. 4), or that the current display interface has switched to the display interface of another application, which indicates that the current display interface of the mobile phone no longer stays on the browser.
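  • the session boundaries described above can be illustrated with a minimal Python sketch. The class names, field names, behavior labels, and the "launcher" identifier for the main interface are hypothetical and are not taken from the embodiments; the sketch only shows operations being grouped between the moment an application comes to the foreground and the moment the display leaves it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Operation:
    key_identifier: str   # e.g. a search term or a web page address
    behavior_type: str    # hypothetical label such as "search" or "feed"


@dataclass
class Session:
    app: str
    operations: List[Operation] = field(default_factory=list)
    closed: bool = False


class SessionRecorder:
    """Groups the operations between application start and exit into one session."""

    HOME = "launcher"  # hypothetical name for the main interface of the phone

    def __init__(self) -> None:
        self.current: Optional[Session] = None
        self.finished: List[Session] = []

    def on_foreground_changed(self, app: str) -> None:
        # Switching to the main interface or to another application ends the session.
        if self.current is not None and self.current.app != app:
            self.current.closed = True
            self.finished.append(self.current)
            self.current = None
        # Any application other than the main interface starts a new session.
        if self.current is None and app != self.HOME:
            self.current = Session(app=app)

    def record(self, key_identifier: str, behavior_type: str) -> None:
        # Operations are only recorded while a session is open.
        if self.current is not None:
            self.current.operations.append(Operation(key_identifier, behavior_type))
```

  • in this sketch, calling on_foreground_changed("browser"), recording several operations, and then calling on_foreground_changed("launcher") yields one closed session containing those operations.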
  • the user operation behavior data collected by the data collection module 301 is embodied in the characteristics of multi-domain behavior.
  • the multi-domain behavior refers to operations performed by the user on display pages that differ in protocol, and/or domain name, and/or port. For example, if two display pages included in the user operation behavior data use the same protocol, domain name, and port, the two display pages belong to the same domain; conversely, if two display pages included in the user operation behavior data differ in protocol, domain name, or port, the two display pages belong to different domains, that is, the user operation behavior data is represented as multi-domain behavior.
  • the web pages corresponding to different information flows generally belong to different domains, so the user operation behavior data of the user in the browser generally has the characteristics of multi-domain behavior, that is, of cross-domain behavior.
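  • the same-domain test described above (same protocol, domain name, and port) can be sketched in Python with the standard urllib.parse module. The function names are hypothetical, and treating a missing port as the default port of the protocol is an assumption of this sketch rather than something stated in the embodiments.

```python
from urllib.parse import urlsplit

# Assumption: a URL without an explicit port uses the default port of its protocol.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (protocol, domain name, port) triple of a display page URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_domain(url_a: str, url_b: str) -> bool:
    """Two pages belong to the same domain only if all three parts match."""
    return origin(url_a) == origin(url_b)


def is_multi_domain(urls) -> bool:
    """A session shows multi-domain behavior if its pages span more than one domain."""
    return len({origin(u) for u in urls}) > 1
```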
  • the topic interest tables in multiple domains can be synchronized, and the user interests obtained after synthesizing the interest features of user operation behaviors in multiple domains can be determined, so that more accurate content recommendations can be made.
  • the data calibration module 302 on the terminal device preprocesses the user operation behavior data collected by the data collection module 301 .
  • the sequence length of the user operation behavior data may differ from session to session, where the sequence length of the user operation behavior data is determined by the number of operations performed by the user. For example, if the user performs only a few user operation behaviors in the browser after opening it, the sequence length of the user operation behavior data collected during this session is shorter; if the user opens the browser and then performs many user operation behaviors such as searching, browsing information flows, and browsing web pages, the sequence length of the user operation behavior data collected during this session is longer.
  • Table 1 is an example of user operation behavior data collected during a session, as follows:
  • each row of the user operation behavior data in Table 1 above represents one sequence, and Table 1 contains 4 rows of actual user operation behavior data, so the sequence length in Table 1 can be considered to be 4. Similar tables in the subsequent embodiments follow the same definition, which is not repeated below.
  • the information types of the user operation behavior data shown in Table 1 may include: the user ID, the key identifier used to reflect the user operation behavior, the user operation behavior type, the interest topic corresponding to the key identifier, additional summary information (optional), and other information.
  • the user operation behavior data may also include more or fewer information types than those in Table 1 above, for example, application identifiers, which is not limited in this application.
  • the storage form of the user operation behavior data may be in a tabular form or in other forms, which is also not limited in this application.
  • the key identifiers in Table 1 can be obtained according to user operation behavior, and the key identifiers can be search words, information flow keywords, web page addresses, and the like.
  • the key identifier may be "NBA”.
  • the key identifier may be "basketball”.
  • the key identifier may be the web page keyword "real economy”, or the URL of the currently browsed web page, etc.
  • the interest topics in Table 1 may not be directly obtainable from the user's operation behavior (for example, when the user performs a search operation, the terminal device can obtain the key identifier from the user's operation behavior but cannot determine the interest topic); in this case, the terminal device can determine the interest topic according to the key identifier in the user's operation behavior and the topic interest table obtained from the server side.
  • the topic interest table is used to indicate the mapping relationship between key identifiers and interest topics, and is obtained by the server side through statistics on the user operation behavior data that is received from one or more terminal devices after processing by the data calibration module 302 and the privacy protection module 303.
  • the topic interest table on the terminal device side may be acquired from the server side periodically and stored on the terminal device.
  • the topic interest table may be actively requested by the terminal device, periodically issued by the server, or automatically issued by the server after it detects that the topic interest table has been updated. The implementation by which the terminal device acquires the topic interest table from the server side is not limited in this application.
  • the terminal device can further determine the interest topic corresponding to at least one key identifier included in the user operation behavior according to the topic interest table. For example, if the key identifier obtained by the terminal device is "NBA", and the topic interest table contains the mapping relationship between "NBA" and the interest topic "sports", the terminal device can determine, by querying the topic interest table, that the interest topic corresponding to the key identifier "NBA" is "sports". The interest topic and the key identifier are then stored together as user operation behavior data, such as the data content shown in the second row of Table 1 above.
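  • the lookup described above can be sketched as a plain dictionary query in Python. Only the "NBA" to "sports" mapping comes from the example above; the other table entry, the field names, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Topic interest table: key identifier -> interest topic.
TOPIC_INTEREST_TABLE: Dict[str, str] = {
    "NBA": "sports",            # mapping given in the example above
    "real economy": "finance",  # hypothetical entry, for illustration only
}


def lookup_interest_topic(key_identifier: str, table: Dict[str, str]) -> Optional[str]:
    """Return the interest topic mapped to a key identifier, or None if unmapped."""
    return table.get(key_identifier)


def annotate(record: dict, table: Dict[str, str]) -> dict:
    """Store the interest topic together with the key identifier in one record."""
    enriched = dict(record)  # leave the collected record untouched
    enriched["interest_topic"] = lookup_interest_topic(record["key_identifier"], table)
    return enriched
```

  • a key identifier that is absent from the table yields None here; how the embodiments handle such identifiers is not specified, so this is only one possible choice.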
  • the user operation behavior data collected during each session is sampled according to a preset value, so that the user operation behavior data collected during each session has a relatively fixed sequence length. Specifically, user operation behavior data with a shorter sequence length is supplemented with default user operation behavior data, and user operation behavior data with a longer sequence length is randomly truncated and sampled.
  • in this way, user operation behavior data with a relatively uniform sequence length can be obtained. This avoids the problem that the user's interests cannot be analyzed accurately when the sequence length is short (that is, when there is too little sample data), and also avoids the problem of redundant analysis and low processing efficiency when the sequence length is long (that is, when there is too much sample data).
  • Table 2a is an example of user operation behavior data collected during a session, from which it can be seen that the sequence length of the user operation behavior data is short.
  • for example, assume that the data calibration module 302 performs sampling with a preset value of 3 for the sequence length of the user operation behavior data. When the sequence length of the user operation behavior data collected by the data collection module 301 is short (that is, less than the preset value; for example, the sequence length of 2 in Table 2a is less than the preset value of 3), predefined default user operation behavior data of the target length can be supplemented, thereby obtaining user operation behavior data with a sequence length of 3.
  • the target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value. For example, if the sequence length in Table 2a is 2 and the preset value is 3, the target length is |2 - 3| = 1, so one sequence of default user operation behavior data is added, as shown in Table 2b.
  • "0" is used to represent the default key identifier, and "0" is likewise used to represent the default interest topic mapped to the default key identifier.
  • the default key identifier and the default interest topic can be set in advance or determined according to certain rules (for example, according to user operation behaviors related to current hot topics). For example, the default key identifier can be "COVID-19", and the default interest topic can be "current affairs".
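  • the supplementing of a short sequence can be sketched in Python as follows. The default record uses "0" for both the key identifier and the interest topic, as described above; the field names, the function name, and the sample rows are hypothetical, since the contents of Table 2a are not reproduced here.

```python
from typing import Dict, List

# Default user operation behavior: "0" stands for the default key identifier
# and for the default interest topic mapped to it.
DEFAULT_RECORD: Dict[str, str] = {"key_identifier": "0", "interest_topic": "0"}


def pad_to_length(sequence: List[dict], preset_length: int,
                  default: Dict[str, str] = DEFAULT_RECORD) -> List[dict]:
    """Append default records until the sequence reaches the preset length."""
    if len(sequence) >= preset_length:
        return list(sequence)  # nothing to supplement
    # Target length = |sequence length - preset value|.
    target_length = abs(len(sequence) - preset_length)
    return list(sequence) + [dict(default) for _ in range(target_length)]
```

  • with a sequence of length 2 and a preset value of 3, the target length is 1, so exactly one default record is appended.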
  • Table 3a is an example of user operation behavior data collected during a session, from which it can be seen that the sequence length of the user operation behavior data is relatively long.
  • for example, assume that the data calibration module 302 performs sampling with a preset value of 3 for the sequence length of the user operation behavior data. When the sequence length of the user operation behavior data collected by the data collection module 301 is long (that is, greater than the preset value; for example, the sequence length of 5 in Table 3a is greater than the preset value of 3), sequences of the target length can be randomly truncated to obtain user operation behavior data with a sequence length of 3.
  • the target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value. For example, if the sequence length in Table 3a is 5 and the preset value is 3, the target length is |5 - 3| = 2, so two sequences of user operation behavior data are truncated, as shown in Table 3b.
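  • random truncation of a long sequence can be sketched in Python as follows. The embodiments do not specify which rows are dropped; this sketch assumes that the target number of rows is removed at uniformly random positions while the remaining rows keep their original order. The function name and the optional random generator parameter are hypothetical.

```python
import random
from typing import List, Optional


def truncate_randomly(sequence: List, preset_length: int,
                      rng: Optional[random.Random] = None) -> List:
    """Drop randomly chosen rows until the sequence has the preset length."""
    rng = rng or random.Random()
    if len(sequence) <= preset_length:
        return list(sequence)  # nothing to truncate
    # Target length = |sequence length - preset value| rows are removed.
    target_length = abs(len(sequence) - preset_length)
    dropped = set(rng.sample(range(len(sequence)), target_length))
    # Keep the surviving rows in their original order.
    return [row for i, row in enumerate(sequence) if i not in dropped]
```

  • with a sequence of length 5 and a preset value of 3, two rows are removed and three remain.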
  • in addition to randomly truncating the target length, weighted sampling can also be performed according to the behavior type of the user operation behavior, in which the preset number of sequences with the larger weights are selected from the sequences included in the user operation behavior data.
  • for example, the terminal device obtains the weights of the 5 groups of sequences contained in Table 3a according to the type of user operation behavior and selects the 3 groups of sequences with the larger weights, such as the 3 groups of sequences shown in Table 3b. Alternatively, other sampling methods that yield user operation behavior data whose sequence length equals the preset value may also be adopted, which is not limited in this application.
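  • selection by behavior-type weight can be sketched in Python as follows. The weight values, behavior type labels, and field names are hypothetical, since the embodiments do not give concrete weights; ties are broken in favor of earlier rows, which is also an assumption of this sketch.

```python
from typing import Dict, List

# Hypothetical weights per user operation behavior type.
BEHAVIOR_WEIGHTS: Dict[str, float] = {"search": 3.0, "feed_click": 2.0, "page_view": 1.0}


def select_by_weight(sequence: List[dict], preset_length: int,
                     weights: Dict[str, float] = BEHAVIOR_WEIGHTS) -> List[dict]:
    """Keep the preset number of rows whose behavior types have the larger weights."""
    if len(sequence) <= preset_length:
        return list(sequence)
    # Rank row indices by weight, highest first; sorted() is stable, so rows
    # with equal weights keep their original relative order.
    ranked = sorted(
        range(len(sequence)),
        key=lambda i: weights.get(sequence[i]["behavior_type"], 0.0),
        reverse=True,
    )
    keep = sorted(ranked[:preset_length])  # restore chronological order
    return [sequence[i] for i in keep]
```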
  • the user operation behavior data collected in stage 1 can be preprocessed into a data structure with a uniform sequence length, so that various user operation behavior data generated in various scenarios can be processed.
  • the modeling for realizing the content recommendation system based on the user's interests will be introduced in detail in the following embodiments, and will not be repeated here.
  • the privacy protection module 303 on the terminal device performs desensitization processing on the user operation behavior data preprocessed by the data calibration module 302 to obtain desensitized user operation behavior data.
  • The user operation behavior data may be desensitized by one or a combination of the following modes, or in other possible ways, which are not limited in this application. Examples include:
  • Mode 1: The terminal device uses a differential privacy algorithm to perform random replacement processing on the user operation behavior data under the same interest topic. Specifically, each user operation behavior included in the user operation behavior data of a session remains unchanged with a certain probability and is randomly replaced with a certain probability.
  • To keep the interest topic unchanged, the terminal device randomly looks up the topic interest table, selects a key identifier under the same interest topic, and uses it to replace the original key identifier. Alternatively, additional information such as summary information in the user operation behavior data may be regenerated.
  • Mode 2: The terminal device strips the user privacy data from the user operation behavior data.
  • If the user operation behavior data collected by the terminal device includes user privacy data such as the user ID and the user operation behavior type, this privacy data may be stripped.
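  • The two desensitization modes above can be sketched in Python as follows. This is a simplified, randomized-replacement illustration and not a calibrated differential privacy mechanism; the source does not specify the probabilities or record layout, so `keep_prob`, the field names, and the `TOPIC_TABLE` contents are assumptions made for this example.

```python
import random

# Hypothetical topic interest table: interest topic -> key identifiers.
TOPIC_TABLE = {
    "sports": ["NBA", "soccer", "basketball"],
    "fruit": ["pitaya", "apple"],
}


def random_replace(records, topic_table, keep_prob=0.7, rng=random):
    """Mode 1 sketch: keep each key identifier with probability keep_prob,
    otherwise replace it with a random key identifier under the same topic."""
    out = []
    for rec in records:
        rec = dict(rec)  # copy so the caller's records are not modified
        if rng.random() >= keep_prob:
            rec["key"] = rng.choice(topic_table[rec["topic"]])
        out.append(rec)
    return out


def strip_private_fields(records, private_fields=("user_id", "behavior_type")):
    """Mode 2 sketch: drop the fields treated as user privacy data."""
    return [
        {k: v for k, v in rec.items() if k not in private_fields}
        for rec in records
    ]


records = [
    {"user_id": "u1", "behavior_type": "search", "topic": "sports", "key": "NBA"},
    {"user_id": "u1", "behavior_type": "browse", "topic": "fruit", "key": "pitaya"},
]
desensitized = strip_private_fields(
    random_replace(records, TOPIC_TABLE, rng=random.Random(42))
)
```

  • In this sketch the interest topic of every record is preserved, so the server side can still count popular topics even though individual key identifiers may have been replaced.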
  • Table 5 is an example of user operation behavior data after stripping user-related information, as follows:
  • The user operation behavior data may also be processed by both modes in combination.
  • Table 6 is an example of the desensitized user operation behavior data obtained after the user operation behavior data is desensitized by Mode 1 and Mode 2, as follows:
  • The desensitized user operation behavior data obtained in this way better protects user privacy. Its main purpose is to highlight the user interests reflected in current user operation behaviors, so as to obtain the currently popular interest topics and the popular key identifiers under each interest topic. This helps the server side generate or update the topic interest table, and thereby improves the timeliness and accuracy of content recommendation based on user interests.
  • the privacy protection module 303 on the terminal device uploads the processed and desensitized user operation behavior data to the data statistics module 304 on the server side.
  • the server side can be connected to one or more terminal devices, so the server side can obtain multiple sets of desensitized user operation behavior data uploaded by the privacy protection module 303 in one or more terminal devices.
  • the data statistics module 304 in the server performs summary statistics on the received desensitized user operation behavior data uploaded by the privacy protection module 303 in one or more terminal devices. And, the data analysis module 305 in the server analyzes the desensitized user operation behavior data collected and counted by the data statistics module 304 to generate a topic interest table.
  • When this application is implemented, the computing capability of the terminal device side is taken into account. If the topic interest table were generated by training on the terminal device, the terminal device would need to meet high performance requirements; such an implementation is costly and does not improve recommendation efficiency well. Therefore, training on the collected massive user operation behavior data to obtain the topic interest table can generally be implemented on the server side.
  • In addition, to protect user privacy data, the terminal device performs the desensitization processing introduced above on the user operation behavior data before uploading it to the server; the terminal device then sends the resulting desensitized user operation behavior data to the server, which generates the topic interest table.
  • The data statistics module 304 may perform statistical analysis on the large amount of desensitized user operation behavior data received.
  • The statistical analysis may cover one or a combination of the following: the correspondence between the desensitized user operation behavior data and key identifiers, the correspondence between the desensitized user operation behavior data and interest topics, and the correspondence between interest topics and key identifiers.
  • From this analysis, the popular key identifiers currently searched or browsed by users can be obtained, so that the terminal device can perform content recommendation according to the popular key identifiers, that is, mainly obtain and recommend content such as information flows and web pages related to the popular key identifiers, so that the content recommended by the terminal device is content that the user is more likely to be interested in. For example, if the statistical analysis result indicates that the key identifier "NBA" is included the most times in a large amount of desensitized user operation behavior data, the terminal device determines that "NBA" is a popular key identifier currently searched or browsed by users. In this scenario, the terminal device can obtain some content related to "NBA" for recommendation.
  • The terminal device can also perform content recommendation according to the interest topic.
  • For example, if the statistical analysis result indicates that, among a large amount of desensitized user operation behavior data, the interest topic "sports" is included the most times, the terminal device determines that "sports" is a popular interest topic currently searched or browsed by users. In this scenario, the terminal device can obtain some related content under the "sports" interest topic for recommendation.
  • The popular key identifiers under each interest topic can also be obtained, so that, when recommending content based on popular interest topics, the terminal device can further recommend content based on popular key identifiers.
  • For example, suppose the statistical analysis result indicates that, in a large amount of desensitized user operation behavior data, the key identifiers included under the interest topic "sports" include "NBA", "soccer", etc. If "NBA" is included the most times, the terminal device determines that the popular key identifier under the "sports" interest topic is "NBA" and can obtain more content related to "NBA" for recommendation.
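  • The three kinds of counts described above (popular key identifiers, popular interest topics, and popular key identifiers under each interest topic) can be sketched in Python as follows. The function name `summarize`, the record fields, and the sample uploads are hypothetical and serve only to illustrate the counting; they are not taken from the data statistics module 304.

```python
from collections import Counter, defaultdict


def summarize(uploads):
    """Count topics, key identifiers, and key identifiers per topic
    across all desensitized sessions uploaded by terminal devices."""
    topic_counts = Counter()
    key_counts = Counter()
    keys_by_topic = defaultdict(Counter)
    for session in uploads:
        for rec in session:
            topic_counts[rec["topic"]] += 1
            key_counts[rec["key"]] += 1
            keys_by_topic[rec["topic"]][rec["key"]] += 1
    return topic_counts, key_counts, keys_by_topic


# Hypothetical uploads from two terminal devices.
uploads = [
    [{"topic": "sports", "key": "NBA"}, {"topic": "sports", "key": "soccer"}],
    [{"topic": "sports", "key": "NBA"},
     {"topic": "current affairs", "key": "new crown"}],
]
topic_counts, key_counts, keys_by_topic = summarize(uploads)
popular_topic = topic_counts.most_common(1)[0][0]
popular_key = key_counts.most_common(1)[0][0]
popular_key_in_sports = keys_by_topic["sports"].most_common(1)[0][0]
```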
  • In a specific implementation, after the data statistics module 304 receives a large amount of desensitized user operation behavior data uploaded by the privacy protection module 303 in one or more terminal devices, it determines the mapping relationship between multiple key identifiers and interest topics according to the multiple sequences contained in each set of desensitized user operation behavior data.
  • the data statistics module 304 can determine the set of key identifiers contained under each topic of interest based on different topics of interest as categories, so as to obtain a mapping relationship between each topic of interest and the set of key identifiers contained in the topic of interest. For example, following the example in Table 6, after the summary statistics of the data statistics module 304, the mapping relationship of Table 7 is obtained, as follows:
  • the data analysis module 305 can train the topic interest model after obtaining the statistical summary result of the desensitized user operation behavior data.
  • the construction of the topic interest model may be implemented by an algorithm such as a latent Dirichlet allocation (LDA) topic model algorithm, which is not limited in this application.
  • the topic interest model can generate topic interest tables, user interest topic radar charts, etc. to determine user interests.
  • The topic interest table can take the form shown in Table 7: each column represents an interest topic and contains one or more key identifiers, where the key identifiers included in each interest topic column are obtained by analyzing massive desensitized user operation behavior data.
  • Each interest topic column can also be associated with a corresponding weight value, which represents the degree of user interest in the interest topic. The weight value can be obtained through statistics and analysis of a large amount of desensitized user operation behavior data; for example, an interest topic that appears more frequently in the desensitized user operation behavior data has a larger weight value. It can be understood that the larger the weight value associated with an interest topic, the greater the interest of most users in that interest topic.
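  • As a rough illustration of how frequency could translate into weight values, the following Python sketch uses the relative frequency of each interest topic as its weight. This is a simple frequency-based stand-in chosen for the example, not the LDA topic model mentioned above; the function name `build_topic_interest_table` and the table layout are assumptions.

```python
from collections import Counter


def build_topic_interest_table(records):
    """Build {topic: {"weight": relative frequency, "keys": [...]}}.
    Key identifiers are listed from most to least frequent."""
    topic_freq = Counter(rec["topic"] for rec in records)
    total = sum(topic_freq.values())
    table = {}
    for topic, count in topic_freq.items():
        keys = Counter(r["key"] for r in records if r["topic"] == topic)
        table[topic] = {
            "weight": count / total,
            "keys": [k for k, _ in keys.most_common()],
        }
    return table


# Hypothetical desensitized records after statistical summary.
records = [
    {"topic": "sports", "key": "NBA"},
    {"topic": "sports", "key": "NBA"},
    {"topic": "sports", "key": "soccer"},
    {"topic": "current affairs", "key": "new crown"},
]
table = build_topic_interest_table(records)
```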
  • FIG. 5 is an example of a radar chart of user interest topics shown in an embodiment of the present application. Assume that, by analyzing a large amount of desensitized user operation behavior data, the data analysis module 305 learns that the interest topic "sports" is the one most searched and browsed by users, followed by "current affairs", with "finance and economics" searched and browsed less. It then generates the radar chart shown in FIG. 5, which reflects the degree of interest in each interest topic in the desensitized user operation behavior data.
  • In stage 6, the data analysis module 305 on the server side sends the topic interest table to the data calibration module 302 on the terminal device side, so as to implement content recommendation based on user interests.
  • The topic interest table generated by the data analysis module 305 on the server side can be used by the data calibration module 302 on the terminal device side to preprocess the user operation behavior data and obtain the preprocessed user operation behavior data.
  • the data calibration module 302 on the terminal device side can combine the topic interest table obtained from the server in the process of preprocessing the collected desensitized user operation behavior data. Therefore, it can be realized that the desensitized user operation behavior data after preprocessing can include key identifiers and interest topics, thereby obtaining more accurate desensitized user operation behavior data.
  • the topic interest table generated by the data analysis module 305 on the server side can also be used by the privacy protection module 303 on the terminal device side to perform a differential privacy algorithm on the preprocessed user operation behavior data. processing to obtain desensitized user operation behavior data.
  • the privacy protection module 303 of the terminal device can randomly replace the content in the preprocessed user operation behavior data based on the topic interest table, so as to protect the privacy of the user operation behavior data .
  • The topic interest table generated by the data analysis module 305 on the server side can also be used by the real-time recommendation module 306 on the terminal device side to perform real-time content recommendation, as introduced in stage 7 below, and is not described in detail here.
  • the real-time recommendation module 306 on the terminal device determines user interests according to the user operation behavior data collected in real time by the data collection module 301 and the topic interest table obtained from the server side, and performs real-time recommendation according to the user interests.
  • the real-time recommendation by the terminal device may include the following scenarios:
  • Scenario 1: The user opens the application as a new user.
  • The terminal device does not store the user's historical user interests for the application.
  • After the terminal device detects the user operation behavior of the user entering the application program, and before it detects any other user operation behavior of the user in the application program, content recommendation can be carried out according to the weight values of the interest topics included in the topic interest table.
  • For example, assume that the terminal device detects that the user has opened the browser but has not yet detected any user operation behavior in the browser, and that the topic interest table contains the interest topics "sports", "current affairs" and "finance and economics". The terminal device can then obtain the relevant content of these interest topics from the server side and recommend it through the information flow on the home page of the browser; the larger the weight value of an interest topic, the higher the proportion of its recommended relevant content.
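  • One possible way to turn weight values into proportions of recommended content is sketched below in Python. The source only states that a larger weight value yields a higher proportion; the largest-remainder allocation, the function name `allocate_slots`, and the weight values are assumptions made for this example.

```python
def allocate_slots(topic_weights, total_slots):
    """Split total_slots across topics in proportion to their weights,
    using the largest-remainder method so the counts sum to total_slots."""
    total_weight = sum(topic_weights.values())
    exact = {t: total_slots * w / total_weight for t, w in topic_weights.items()}
    slots = {t: int(x) for t, x in exact.items()}  # round down first
    leftover = total_slots - sum(slots.values())
    # Hand the remaining slots to the topics with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda t: exact[t] - slots[t], reverse=True)
    for t in by_remainder[:leftover]:
        slots[t] += 1
    return slots


# Hypothetical weight values for the three interest topics.
weights = {"sports": 5, "current affairs": 3, "finance and economics": 2}
slots = allocate_slots(weights, 10)
```

  • With these hypothetical weights, a home page with 10 information flow items would show more "sports" items than "current affairs" items, and more "current affairs" items than "finance and economics" items.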
  • the topic interest table is as shown in the following table 8a:
  • If the terminal device detects the user's real-time operation behavior in the application program, it can determine the user's interest by combining the user's real-time operation behavior with the topic interest table obtained from the server side, and then perform real-time recommendation according to the user's interest.
  • For example, if the user's real-time operation behavior is entering the search term "Pitaya" in the browser, the terminal device generates the user's interest according to the key identifier "Pitaya" and its corresponding interest topic "Fruit", together with the topic interest table (as shown in Table 8b below).
  • The terminal device may associate different weight values with the interest topic according to the type of user operation behavior. For example, since a search operation better reflects the user's personal interests, a higher weight value may be assigned to the interest topic "fruit", so that the terminal device recommends a higher proportion of "fruit" related content. As another example, the terminal device detects that the user clicks on a certain information stream while browsing the information flow on the home page interface, and determines that the key identifier contained in the information stream is "Leo" and the corresponding interest topic is "Constellation"; this can be added to the user interests. Since an information flow browsing operation generally expresses the user's immediate interest, a lower weight can be assigned to the interest topic "constellation", so that the terminal device recommends a lower proportion of "constellation" related content.
  • the weight of the corresponding topic of interest may also be updated according to the number of operations performed by the user. For example, if the user browses the content related to "constellation" for many times in the browser subsequently, the weight value allocated to the "constellation" can be increased as the number of times the user browses increases.
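  • The weighting by behavior type and by number of operations described above can be sketched in Python as follows. The increments in `BEHAVIOR_INCREMENT` are hypothetical values; the source only indicates that a search should weigh more than information flow browsing and that repeated operations can increase the weight value. The function name `record_behavior` is likewise an assumption.

```python
# Hypothetical increments per behavior type (search counts more than browse).
BEHAVIOR_INCREMENT = {"search": 3.0, "browse": 1.0}


def record_behavior(user_interest, topic, key, behavior_type):
    """Add the behavior's increment to the topic weight and remember the key."""
    entry = user_interest.setdefault(topic, {"weight": 0.0, "keys": []})
    entry["weight"] += BEHAVIOR_INCREMENT.get(behavior_type, 1.0)
    if key not in entry["keys"]:
        entry["keys"].append(key)
    return user_interest


interest = {}
record_behavior(interest, "fruit", "Pitaya", "search")
record_behavior(interest, "constellation", "Leo", "browse")
record_behavior(interest, "constellation", "Leo", "browse")  # repeated browsing
# Interest topics ordered from the largest to the smallest weight value.
ranked = sorted(interest, key=lambda t: interest[t]["weight"], reverse=True)
```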
  • user interests can be reflected in the form of personalized topic interest tables, as shown in Table 8b below:
  • the subject interest table shown in Table 8b above is only a possible example, and is not used to limit the embodiment of the user's interest.
  • the interest topics shown in Table 8b can also be sorted from left to right according to the weight value, and the key identifiers included in each interest topic can also be sorted from top to bottom according to the weight value. That is to say, it can be understood that the weight value associated with the topic of interest "fruit" in Table 8b is currently the largest, so the terminal device displays the highest proportion of the content recommended for "fruit".
  • The user interest obtained from the user's real-time operation behavior this time can also be stored as the user's historical user interest, which can serve as a reference for determining user interests the next time the user enters the application.
  • Scenario 2: The user opens the application as an old user.
  • the terminal device generally stores the user's historical user interests for the application. It should be noted that, if the terminal device detects that the user is an old user, but does not store the user's historical user interests, the user interests can also be determined according to the implementation manner described in the foregoing scenario 1.
  • the terminal device can combine historical user interests and the topic interest table obtained from the server side to perform content recommendation.
  • the user interest determined by the terminal device can be determined through the personalized topic interest table shown in Table 8b. For example, “fruit” and “constellation” in Table 8b are historical user interests, while “Sports” in Table 8b , “Finance”, and “Current Affairs” are obtained from the subject interest table obtained from the server side.
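  • A minimal Python sketch of combining historical user interests with the topic interest table from the server side is given below. The source does not specify how the two are combined; the additive merge, the `server_scale` parameter, the function name `merge_interests`, and all sample values are assumptions for illustration only.

```python
def merge_interests(history, server_table, server_scale=1.0):
    """Return a personalized table containing the historical topics plus
    the server-side topics, without modifying either input."""
    merged = {
        topic: {"weight": entry["weight"], "keys": list(entry["keys"])}
        for topic, entry in history.items()
    }
    for topic, entry in server_table.items():
        target = merged.setdefault(topic, {"weight": 0.0, "keys": []})
        target["weight"] += server_scale * entry["weight"]
        for key in entry["keys"]:
            if key not in target["keys"]:
                target["keys"].append(key)
    return merged


# Hypothetical historical user interests stored on the terminal device.
history = {
    "fruit": {"weight": 3.0, "keys": ["Pitaya"]},
    "constellation": {"weight": 1.0, "keys": ["Leo"]},
}
# Hypothetical topic interest table obtained from the server side.
server_table = {
    "sports": {"weight": 0.5, "keys": ["NBA"]},
    "current affairs": {"weight": 0.3, "keys": ["new crown"]},
}
personalized = merge_interests(history, server_table)
```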
  • the terminal device detects the user's real-time operation behavior in the application program, it can update the user's interest in combination with the user's real-time operation behavior and the above-mentioned personalized theme interest table, and then perform real-time recommendation according to the updated user interest.
  • For example, if the terminal device detects from the user's real-time operation behavior that the number of times the user browses the interest topic "constellation" has increased significantly, and the key identifiers browsed also include "capricornus" and "horoscope", the terminal device increases the weight value assigned to "constellation" and updates the key identifiers under "constellation".
  • the personalized topic interest table corresponding to the updated user interests can be shown in Table 8c below:
  • the recommended related content and the manner of acquiring the related content are not limited.
  • The terminal device may search the locally cached content for the recommended content corresponding to the user interest, or may obtain it from a content providing server that provides the recommended content corresponding to the user interest, and so on.
  • The terminal device can determine the user's real-time interest according to the topic interest table obtained from the server side in combination with the user's real-time operation behavior and the like, so that the terminal device can recommend content according to the user's real-time interest.
  • The real-time interests of users are generated on the terminal device side.
  • Compared with directly recommending the user interests generated on the server side to different users, the content recommendation method based on user interests provided by the present application can update user interests in time according to the user's real-time operations, thereby ensuring that the recommended content better reflects what the user is currently interested in.
  • the method provided in this application can be mainly divided into two parts: the sampling part and the recommendation part.
  • the terminal device may collect each operation of the user to obtain user operation behavior data. Then, the terminal device processes the user operation behavior data and sends it to the server side, so that the server side generates a topic interest table according to a large amount of collected user operation behavior data after desensitization processing.
  • the server may also send a topic interest table to the terminal device, so that the terminal device can perform content recommendation according to the topic interest table.
  • FIG. 6 is a schematic flowchart of a content recommendation based on user interests provided by an embodiment of the present application, including the following steps:
  • the terminal device detects the user's instruction to start the target application, and starts the target application.
  • The user may enter the target application by clicking the application icon on the main interface of the terminal device, by waking up the target application through voice, or by quickly entering the target application from any display interface on the terminal device, and so on, which is not limited in this application.
  • For example, when the terminal device detects the user operation behavior of the user clicking the browser icon, refer to the content shown in 1 in FIG. 7;
  • when the terminal device detects the wake-up word of the browser, or receives the user's click on the shortcut entry identifier of the browser contained in the drop-down interface, etc., refer to the content shown in 2 in FIG.
  • The terminal device collects at least one operation data performed by the user on the target application. For example, as shown in FIG. 4, after the terminal device enters the target application program, the user's operation behavior is collected in real time.
  • When the terminal device detects the user's instruction to quit the target application, the terminal device closes the target application.
  • The user may close the display interface of the target application and return to the main display interface of the terminal device; or the user may stop the target application through background cleaning; or the target application may be forced to exit because the program is unresponsive, and so on, which is not limited in this application.
  • the terminal device stores at least one operation data of the processing procedures of S601 to S603 as a set of user operation behavior data.
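  • The start, collect, and exit steps above can be sketched as a small Python class that groups the operations of one use of the target application into one set of user operation behavior data. The class name `SessionRecorder`, its method names, and the record fields are hypothetical and are not part of the described modules.

```python
class SessionRecorder:
    """Group operations collected between start and exit into one set."""

    def __init__(self):
        self._current = None   # operations of the running session, if any
        self.sessions = []     # stored sets of user operation behavior data

    def on_start(self):
        """The target application was started."""
        self._current = []

    def on_operation(self, operation):
        """Record an operation, but only while the application is running."""
        if self._current is not None:
            self._current.append(operation)

    def on_exit(self):
        """The target application was closed; store non-empty sessions."""
        if self._current:
            self.sessions.append(self._current)
        self._current = None


recorder = SessionRecorder()
recorder.on_start()
recorder.on_operation({"type": "search", "key": "Pitaya"})
recorder.on_operation({"type": "browse", "key": "Leo"})
recorder.on_exit()
recorder.on_operation({"type": "browse", "key": "ignored"})  # app not running
```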
  • The terminal device preprocesses the user operation behavior data and obtains preprocessed user operation behavior data whose sequence length is a preset value.
  • Each group of user operation behavior data may have a different sequence length, and the terminal device may sample user operation behavior data of different sequence lengths based on a preset value. Specifically, if the sequence length of the user operation behavior data is less than the preset value, the user operation behavior data is supplemented with a default user operation behavior, which may be obtained from the topic interest table or be predefined, to obtain user operation behavior data of the specified sequence length.
  • If the sequence length of the user operation behavior data is greater than the preset value, random truncation and sampling processing is performed on the user operation behavior data to obtain user operation behavior data with the specified sequence length.
  • the preset value may be customized by the terminal device, or obtained based on historical experience, or determined according to other rules, which is not limited in this application.
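  • The padding and truncation rules above can be sketched in Python as follows. The `DEFAULT_BEHAVIOR` record and the function name `normalize_length` are assumptions; the source only says that the default user operation behavior may come from the topic interest table or be predefined.

```python
import random

# Hypothetical predefined default user operation behavior.
DEFAULT_BEHAVIOR = {"topic": "default", "key": "default"}


def normalize_length(sequence, preset_len, default=DEFAULT_BEHAVIOR, rng=random):
    """Pad short sequences with the default behavior and randomly
    truncate long ones so the result always has preset_len entries."""
    seq = list(sequence)
    if len(seq) < preset_len:
        missing = preset_len - len(seq)
        seq.extend(dict(default) for _ in range(missing))
    elif len(seq) > preset_len:
        keep = sorted(rng.sample(range(len(seq)), preset_len))
        seq = [seq[i] for i in keep]  # original order is preserved
    return seq


short = [{"topic": "fruit", "key": "Pitaya"}]
long_seq = [{"topic": "sports", "key": str(i)} for i in range(5)]
padded = normalize_length(short, 3)
cut = normalize_length(long_seq, 3, rng=random.Random(1))
```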
  • the terminal device performs desensitization processing on the preprocessed user operation behavior data to obtain desensitized user operation behavior data.
  • the terminal device can randomly replace the content contained in the preprocessed user operation behavior data according to the topic interest table, so that the interest characteristics of the user operation behavior can be blurred to a certain extent.
  • the terminal device may also perform user information stripping on the obfuscated user operation behavior data to obtain desensitized user operation behavior data, thereby avoiding leakage of user privacy.
  • the subject interest table involved in the implementation of the above S605 and S606 may be stored after the terminal device obtains it from the server side. Therefore, S6050 is located before S605 and S606, but the execution sequence between S6050 and S601 to S604 is not limited.
  • the terminal device acquires the topic interest table generated on the server side.
  • the server side may automatically send the topic interest table to the terminal device side in real time, or periodically, or after the topic interest table is updated.
  • the terminal device may also send the request information to the server side, and the server sends the latest topic interest table to the terminal device after receiving the request information.
  • the terminal device uploads the desensitized user operation behavior data to the server side.
  • When the server can train the topic interest model based on more user operation behavior data, training yields a more accurate and comprehensive topic interest table.
  • Therefore, the desensitized user operation behavior data obtained after the terminal device performs processing such as S605 and S606 on the user operation behavior data is uploaded to the server.
  • The terminal device side processes the collected user operation behavior data to obtain desensitized user operation behavior data and only then uploads it to the server side. The server side therefore cannot collect the user's private data, which improves the security of the user operation behavior data.
  • the server performs statistical summary on the desensitized user operation behavior data uploaded by one or more terminal devices.
  • the server may be connected to one or more terminal devices, and FIG. 6 only takes one terminal device as an example for description, and the interactions between other terminal devices and the server are similar.
  • Through the statistical summary of the desensitized user operation behavior data, the currently popular interest topics and the key identifiers included in each interest topic can be obtained.
  • the server generates or updates a topic interest table based on the user operation behavior data after desensitization processing after statistical aggregation.
  • the server may use the statistical aggregated user operation behavior data as a training sample to perform unsupervised learning on user interests, thereby obtaining a topic interest table that can reflect the mapping relationship between interest topics and key identifiers.
  • the topic interest table may include multiple interest topics and key identifiers included in each interest topic.
  • the weight value of each interest topic and the weight value of each key identifier included in each interest topic can also be associated in the topic interest table, and the weight value reflects the interest topics and key identifiers that most users are interested in.
  • the interest topic "sports" contained in the topic interest table has the largest weight value, indicating that most users are currently interested in the sports topic; further, the key identifier "basketball” contained in the "sports" topic has the largest weight value, It means that most current users are more interested in basketball keywords.
  • the server sends the topic interest table to the one or more terminal devices.
  • the terminal device detects the user's instruction to start the target application, and starts the target application. For example, as shown in FIG. 7 , the terminal device detects a user operation behavior of the user opening the browser again.
  • S601-S603 and S611-S613 represent the processing performed by the terminal device in response to user operations in two different scenarios: S601-S603 collect user operations in real time, while S611-S613 perform content recommendation based on user interests and can also collect user operation behaviors in real time at the same time.
  • the terminal device determines the user's interest.
  • the determination of the user interest by the terminal device may include the following possible scenarios:
  • Scenario A The user opens the browser for the first time.
  • the terminal device can recommend the information flow in the browser homepage interface for the user according to the topic interest table, that is, display the first recommendation interface.
  • For example, the interest topics acquired by the terminal device from the server side include "sports", "finance and economics" and "current affairs"; the key identifiers contained in "sports" include basketball and https://china.nba.com/, the key identifiers included in "finance and economics" include the real economy, and the key identifiers included in "current affairs" include the new crown.
  • the browser homepage interface can be as shown in Figure 8a.
  • The content displayed on the homepage interface of the browser includes the recommended information flow "basketball association-homepage" related to the key identifier "basketball", the NBA China official website related to "https://china.nba.com/", articles related to the "real economy", and hot interpretation articles related to the "new crown"; in addition, by sliding down the home page of the browser, the user can browse more recommended content (not shown in FIG. 8a) related to the interest topics contained in the topic interest table.
  • Scenario B: It is not the first time that the user opens the browser, and the user has not yet performed any operation behavior.
  • historical user interests may be stored in the browser, and the terminal device may recommend the information flow in the browser homepage interface for the user according to the historical user interests and the topic interest table.
  • the browser home page interface may be as shown in FIG. 8b.
  • the terminal device can recommend the information flow in the browser homepage interface for the user according to the user's real-time operation behavior, historical user interest and topic interest table, that is, the first recommendation interface is displayed.
  • the updated personalized topic interest table obtained by the user's real-time operation behavior, it can be obtained that the user is more interested in the topic interest "constellation", then in the content recommendation of the browser home page interface, ""
  • The browser home page interface after the terminal device refreshes according to the updated personalized topic interest table may be as shown in FIG. 8c. Since the updated personalized topic interest table indicates that the user is more interested in "constellations", the proportion of recommended content related to "constellations" in the browser homepage interface increases, and this content occupies a higher position in the browser homepage interface.
  • The second recommendation interface may be displayed after the terminal device detects the user's instruction to refresh the first recommendation interface. For example, in an application, when the user swipes down from the top of the terminal device, it means that the user wants to refresh the current interface. At this time, the terminal device can display the recommended content corresponding to the updated user interests, which can also be understood as displaying the second recommendation interface.
  • A user's interest does not refer to one specific interest. It can represent a collection of multiple interest topics, each of which is associated with a weight value that reflects the degree of user interest: the greater the weight value of an interest topic, the more interested the user is in that interest topic.
  • the terminal device recommends content according to the user's interests.
  • the terminal device may obtain the relevant content corresponding to the user's interests from the local cache, or may also send an obtaining request to the server to obtain the relevant content corresponding to the user's interests from the server.
  • FIG. 9 shows a terminal device 900 provided by an embodiment of the present application.
  • the terminal device 900 includes one or more processors 901 ; one or more memories 902 ; a communication interface 903 , and one or more computer programs 904 .
  • the communication interface 903 is used to implement communication with other devices (such as terminal devices), for example, the communication interface may be a transceiver.
  • the one or more computer programs 904 are stored in the aforementioned memory 902 and configured to be executed by the one or more processors 901; the one or more computer programs 904 include instructions that can be used to perform the following steps:
  • collecting the plurality of user operation behavior data input by the user during a single use of the target application is specifically implemented as: starting the target application when an instruction of the user to start the target application is detected; after the target application is started, collecting at least one piece of operation data for operations performed by the user on the target application; closing the target application when an instruction of the user to exit the target application is detected; and storing the at least one piece of operation data collected between the start and the close of the target application as a set of user operation behavior data.
  • performing desensitization processing on the multiple user operation behavior data collected is specifically implemented as, for one or more user operation behavior data in the multiple user operation behavior data, based on a differential privacy algorithm.
  • the sequence length of each user operation behavior data is determined;
  • the operation behavior data is truncated and compensated to obtain the user operation behavior data of the specified sequence length.
  • the truncation and compensation processing performed on the user operation behavior data according to the preset value, to obtain user operation behavior data of the specified sequence length, is specifically implemented as follows: if the sequence length of the user operation behavior data is less than the preset value, the user operation behavior data is supplemented with predefined user operation behavior data of a target length; if the sequence length of the user operation behavior data is greater than the preset value, the user operation behavior data is truncated by the target length; in both cases the result is user operation behavior data of the specified sequence length, where the target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value.
  • FIG. 9 may also show a server 900 provided by an embodiment of the present application.
  • the server 900 includes one or more processors 901 ; one or more memories 902 ; a communication interface 903 , and one or more computer programs 904 .
  • the communication interface 903 is used to implement communication with other devices (such as terminal devices), for example, the communication interface may be a transceiver.
  • the one or more computer programs 904 are stored in the aforementioned memory 902 and configured to be executed by the one or more processors 901; the one or more computer programs 904 include instructions that can be used to perform the following steps:
  • analyzing the desensitized plurality of user operation behavior data to obtain the topic interest table of the user using the target application is specifically implemented as:
  • the user operation behavior data is input into a pre-built topic interest model to perform unsupervised learning on the desensitized multiple user operation behavior data; a topic interest table output by the pre-built topic interest model is obtained.
  • FIG. 9 may also show a terminal device 900 provided by an embodiment of the present application.
  • the terminal device 900 includes one or more processors 901 ; one or more memories 902 ; a communication interface 903 , and one or more computer programs 904 .
  • the communication interface 903 is used to implement communication with other devices (such as terminal devices), for example, the communication interface may be a transceiver.
  • the one or more computer programs 904 are stored in the aforementioned memory 902 and configured to be executed by the one or more processors 901; the one or more computer programs 904 include instructions that can be used to perform the following steps:
  • the terminal device collects multiple user operation behavior data input by the user when the user uses the target application program one or more times in the set duration, and desensitizes the collected multiple user operation behavior data;
  • the desensitization process is to filter out the private data related to the user in the user operation behavior data; when detecting the user's instruction to start the target application, start the target application and display the first recommendation
  • the first recommendation interface includes at least one item of recommended content; the at least one item of recommended content is determined according to the topic interest table.
  • the starting the target application and displaying the first recommendation interface is specifically implemented as: starting the target application; displaying the first recommendation interface after starting the target application;
  • one or more interest topics included in the topic interest table are taken as the user interest, at least one item of recommended content is obtained according to the user interest, and the obtained recommended content is displayed in the first recommendation interface; each interest topic has an associated weight value, and the greater the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
  • the target application is started and the first recommendation interface is displayed, one or more user operation behavior data input by the user when the user uses the target application is received and collected;
  • a second recommendation interface is displayed; the recommended content included in the second recommendation interface is determined according to the one or more user operation behavior data and the topic interest table .
  • the displaying of the second recommendation interface is specifically implemented as: determining one or more corresponding interest topics according to the one or more user operation behavior data, and assigning an associated weight value to each of those interest topics; taking the one or more interest topics corresponding to the user operation behavior data, together with one or more of the interest topics included in the topic interest table, as the user interest; obtaining at least one item of recommended content according to the user interest; and displaying the obtained recommended content in the second recommendation interface; each interest topic included in the topic interest table has an associated weight value, and the greater the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
  • the acquiring at least one piece of the recommended content according to the user's interest is specifically implemented as searching for the recommended content corresponding to the user's interest from locally cached content;
  • acquiring the recommended content corresponding to the user's interest from a content providing server that provides the recommended content corresponding to the user's interest.
  • Each functional unit in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
  • a computer-readable storage medium includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: flash memory, removable hard disk, read-only memory, random access memory, magnetic disk or optical disk and other media that can store program codes.

Abstract

A user interest-based content recommendation method, and a terminal device, which are used for improving the accuracy and efficiency of user interest-based content recommendation on the premise that user privacy data is protected. The terminal device collects behavior data of multiple user operations input by a user who uses a target application one or more times in a set duration; the collected behavior data of multiple user operations is desensitized; the desensitized behavior data of multiple user operations is sent to a server, such that the server analyses the desensitized behavior data of multiple user operations to obtain a topic interest table of the target application used by the user. The terminal device receives a topic interest table, and displays a first recommendation interface when the user starts the target application, wherein the first recommendation interface comprises at least one recommended content item which is determined according to the topic interest table.

Description

A content recommendation method and terminal device based on user interests
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 202110307500.X, filed with the China Patent Office on March 23, 2021 and entitled "A Content Recommendation Method and Terminal Device Based on User Interests", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to the technical field of data processing, and in particular to a content recommendation method and terminal device based on user interests.
BACKGROUND
With the development of Internet technology, more and more applications and systems focus on personalized content recommendation, in which each user sees different content, in order to improve the user experience. Content recommendation systems based on user interests usually need to be built on massive amounts of user operation behavior data, which raises the problem of protecting users' private data. A user's private data includes, for example, the user's personal information and personalized interests.
In the prior art, there are technical solutions that protect users' private data by means of secure encryption (for example, homomorphic encryption). Although secure encryption is applied to the user's private data when user operation behavior data is transmitted and stored between the terminal device and the server, a risk of privacy leakage remains while the user operation behavior data is being transmitted. The prior art also includes training recommendation models with distributed federated learning. This approach protects user privacy by keeping user operation behavior data on the terminal device, but it requires the terminal device to be capable of training the model, so it places high performance requirements on the terminal device and suffers from low recommendation efficiency.
Therefore, improving the accuracy and efficiency of content recommendation based on user interests while protecting users' private data remains a major challenge.
SUMMARY OF THE INVENTION
Embodiments of the present application provide a content recommendation method and terminal device based on user interests, which improve the accuracy and efficiency of content recommendation based on user interests while protecting users' private data.
In a first aspect, an embodiment of the present application provides a content recommendation method based on user interests, which can be applied to a terminal device. The method includes: collecting a plurality of user operation behavior data input by a user when the user uses a target application one or more times within a set duration; desensitizing the collected plurality of user operation behavior data, where the desensitization filters out the private data related to the user in the user operation behavior data; and sending the desensitized plurality of user operation behavior data to a server, so that the server analyzes the desensitized plurality of user operation behavior data to obtain a topic interest table of the user using the target application.
With the method provided by the embodiments of the present application, the terminal device collects the user's operation behavior data in the target application, but the user's private data is desensitized on the terminal device before anything is uploaded to the server. The server therefore does not obtain the user's private data, which better protects user privacy and improves the user experience. The target application may be any application to which the user interest-based content recommendation provided by the embodiments of the present application is applicable, such as a browser.
In a possible design, the terminal device collecting the plurality of user operation behavior data input by the user when the user uses the target application one or more times within the set duration may be implemented as: starting the target application when an instruction of the user to start the target application is detected; after the target application is started, collecting at least one piece of operation data for operations performed by the user on the target application; closing the target application when an instruction of the user to exit the target application is detected; and storing the at least one piece of operation data collected between the start and the close of the target application as a set of user operation behavior data.
In this design, the terminal device collects the user operation behavior data generated while the user uses the target application, desensitizes it, and then uploads it to the server. This makes it convenient for the server to perform big data analysis on the desensitized user operation behavior data and so obtain the user's interests in the target application.
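The collection step can be illustrated with a minimal Python sketch. The class name `SessionCollector`, its callback names, and the representation of each operation as a string are assumptions made for illustration; the application does not prescribe a data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SessionCollector:
    """Groups the operations recorded between app start and app exit into one set.

    Hypothetical helper: names and the string encoding of operations are assumptions.
    """
    sessions: List[List[str]] = field(default_factory=list)
    _current: Optional[List[str]] = None

    def on_launch(self) -> None:
        # The start instruction was detected: open a new, empty session.
        self._current = []

    def on_operation(self, op: str) -> None:
        # Operations are only recorded while the target application is running.
        if self._current is not None:
            self._current.append(op)

    def on_exit(self) -> None:
        # The exit instruction was detected: store the session if it holds
        # at least one operation, then close it.
        if self._current:
            self.sessions.append(self._current)
        self._current = None
```

Each stored list corresponds to one "set of user operation behavior data"; several launches within the set duration yield several such sets.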
In a possible design, for one or more of the plurality of user operation behavior data, the terminal device randomly replaces user operation behavior data with other user operation behavior data under the same interest topic based on a differential privacy algorithm, where the interest topic is determined according to the topic interest table; and the terminal device strips the user information contained in the plurality of user operation behavior data.
In this design, by randomly replacing user operation behavior data based on a differential privacy algorithm, the terminal device masks the real user operation behavior data and thereby protects user privacy. In addition, before uploading the user operation behavior data to the server, the terminal device can strip the user information so that the server cannot collect the user's private data, which safeguards the privacy and security of the user data.
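As a hedged illustration of this design, the sketch below uses k-ary randomized response, one well-known local differential privacy mechanism. The application does not name a specific mechanism, so this choice, the record layout (`user_id`, `ops`, `topic`, `item`), the `catalog` mapping, and the function names are all assumptions.

```python
import math
import random
from typing import Dict, List


def randomized_replace(item: str, topic_items: List[str],
                       epsilon: float, rng: random.Random) -> str:
    """k-ary randomized response over the (unique) items of one interest topic."""
    k = len(topic_items)
    if k < 2:
        return item  # nothing to replace with
    # Probability of keeping the true item: e^eps / (e^eps + k - 1),
    # written in a form that does not overflow for large epsilon.
    keep_prob = 1.0 / (1.0 + (k - 1) * math.exp(-epsilon))
    if rng.random() < keep_prob:
        return item
    # Otherwise pick uniformly among the other items of the same topic.
    return rng.choice([x for x in topic_items if x != item])


def desensitize(record: Dict, catalog: Dict[str, List[str]],
                epsilon: float, rng: random.Random) -> Dict:
    """Replace items within their topic and drop every user-identifying field."""
    ops = [
        {"topic": op["topic"],
         "item": randomized_replace(op["item"], catalog[op["topic"]], epsilon, rng)}
        for op in record["ops"]
    ]
    return {"ops": ops}  # user_id and any other user information are not copied
```

Because replacements stay inside the same interest topic, the topic-level signal that the server needs is preserved while the concrete item is masked. In this sketch the privacy parameter applies per item; a smaller `epsilon` replaces more often.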
In a possible design, before the random replacement of user operation behavior data under the same interest topic based on the differential privacy algorithm, the terminal device may also determine the sequence length of each user operation behavior data, and perform truncation and compensation processing on the user operation behavior data according to a preset value to obtain user operation behavior data of a specified sequence length.
In this design, if user operation behavior data contains too few operations, that is, its sequence length is short, accurate user interests cannot be derived from it. If user operation behavior data contains too many operations, that is, its sequence length is long, the amount of computation becomes excessive. Therefore, the user operation behavior data is sampled according to the preset value to obtain user operation behavior data with a relatively uniform sequence length. This improves the efficiency of desensitizing the user operation behavior data, as well as the efficiency and accuracy of the server's statistical analysis of that data.
Optionally, the terminal device performing truncation and compensation processing on the user operation behavior data according to the preset value to obtain user operation behavior data of the specified sequence length may be implemented as: if the sequence length of the user operation behavior data is less than the preset value, supplementing the user operation behavior data with predefined user operation behavior data of a target length to obtain user operation behavior data of the specified sequence length; or, if the sequence length of the user operation behavior data is greater than the preset value, truncating the user operation behavior data by the target length to obtain user operation behavior data of the specified sequence length; where the target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value.
This design gives a concrete scenario for sampling according to the preset value. The sequence length of the user operation behavior data is checked: data whose sequence length is less than the preset value can be padded with predefined default user operation behavior data, and data whose sequence length is greater than the preset value can be randomly truncated. After this sampling, user operation behavior data with a relatively uniform sequence length is obtained, which facilitates desensitization.
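The truncation and compensation rule can be sketched as follows. The function name `fit_length`, the `"<pad>"` placeholder standing in for the predefined default operation, and the choice to truncate by keeping a randomly positioned contiguous window are assumptions; the text only states that over-long data "can be randomly truncated".

```python
import random
from typing import List, Optional


def fit_length(ops: List[str], preset: int, pad_op: str = "<pad>",
               rng: Optional[random.Random] = None) -> List[str]:
    """Return a copy of ops whose length equals the preset value."""
    rng = rng or random.Random()
    # Target length: absolute difference between sequence length and preset.
    target = abs(len(ops) - preset)
    if len(ops) < preset:
        # Too short: append `target` copies of the predefined default operation.
        return list(ops) + [pad_op] * target
    if len(ops) > preset:
        # Too long: drop `target` operations by keeping a random contiguous window.
        start = rng.randint(0, target)
        return list(ops[start:start + preset])
    return list(ops)
```

For example, `fit_length(["a", "b"], 4)` pads the sequence to four entries, while a seven-entry sequence with `preset=3` is cut down to three consecutive operations.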
In a second aspect, an embodiment of the present application provides a content recommendation method based on user interests, which can be applied to a server. The method includes: receiving desensitized user operation behavior data sent by one or more terminal devices, where the desensitized user operation behavior data is obtained by the one or more terminal devices collecting a plurality of user operation behavior data input by a user when the user uses a target application one or more times within a set duration and desensitizing the collected data, and the desensitization filters out the private data related to the user in the user operation behavior data; analyzing the desensitized plurality of user operation behavior data to obtain a topic interest table of users using the target application; and sending the topic interest table to the one or more terminal devices.
In this method, the server has more computing power than a terminal device, and it can combine the user operation behavior data sent by multiple terminal devices to analyze user operation behavior at the group level. It can thus obtain timely hot interest topics, which can also be understood as the interest topics that most users currently pay attention to, and generate the topic interest table. The server can also send the topic interest table to the terminal devices, so that a terminal device can make real-time recommendations in combination with the topic interest table and improve the user experience.
In a possible design, the server analyzing the desensitized plurality of user operation behavior data to obtain the topic interest table of users using the target application may be implemented as: inputting the desensitized plurality of user operation behavior data into a pre-built topic interest model, so as to perform unsupervised learning on the desensitized plurality of user operation behavior data; and obtaining the topic interest table output by the pre-built topic interest model.
In this design, the pre-built topic interest model produces a topic interest table from the large amount of user operation behavior data sent by multiple terminal devices, and the resulting topic interest table reflects well the interest topics that most users currently pay attention to.
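The application does not disclose the internals of the topic interest model. Purely to show the shape of the server-side input and output, the sketch below substitutes a simple frequency aggregation over all uploaded records for the model; it is not the unsupervised model itself, and the function name, the record layout, and the `top_n` parameter are assumptions.

```python
from collections import Counter
from typing import Dict, List


def build_topic_interest_table(records: List[Dict], top_n: int = 3) -> Dict[str, float]:
    """Aggregate desensitized records from many devices into topic weights.

    Stand-in for the pre-built topic interest model: each weight is the share
    of all uploaded operations that fall under the topic.
    """
    counts = Counter(op["topic"] for record in records for op in record["ops"])
    total = sum(counts.values())
    if total == 0:
        return {}
    # Keep only the most frequent topics, i.e. the currently "hot" ones.
    return {topic: count / total for topic, count in counts.most_common(top_n)}
```

The returned mapping of interest topic to weight plays the role of the topic interest table that the server sends back to the terminal devices.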
In a third aspect, an embodiment of the present application provides a content recommendation method based on user interests, which can be applied to a terminal device. The method includes: receiving a topic interest table sent by a server, where the topic interest table is obtained by the server analyzing desensitized user operation behavior data; the desensitized user operation behavior data is obtained by one or more terminal devices collecting a plurality of user operation behavior data input by the user when the user uses a target application one or more times within a set duration and desensitizing the collected data, and the desensitization filters out the private data related to the user in the user operation behavior data; and, when an instruction of the user to start the target application is detected, starting the target application and displaying a first recommendation interface, where the first recommendation interface includes at least one item of recommended content determined according to the topic interest table.
In this method, the terminal device can make real-time recommendations in combination with the topic interest table sent by the server. If the user is a new user, or the terminal device has not stored any historical user interests for the user, the terminal device can recommend according to the topic interest table. Recommending content that corresponds to the interest topics most users currently pay attention to avoids the cold-start problem in the target application, in which the target application recommends content for unpopular interest topics, or cannot make recommendations at all, and therefore fails to arouse the user's interest in browsing.
In a possible design, starting the target application and displaying the first recommendation interface may be implemented as: starting the target application; displaying the first recommendation interface after the target application is started; and taking one or more interest topics included in the topic interest table as the user interest, obtaining at least one item of recommended content according to the user interest, and displaying the obtained recommended content in the first recommendation interface; where each interest topic has an associated weight value, and the greater the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
In this design, content is recommended according to the weight values associated with the interest topics in the topic interest table, so that content related to more popular interest topics makes up a larger share of the recommendation interface. This increases the likelihood that the user is interested in the content and improves the user experience.
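One way to turn weight values into proportions of the recommendation interface is to split a fixed number of content slots with the largest-remainder method. This is only a sketch under assumptions: the application does not say how proportions are computed, and the function name `allocate_slots` and the notion of "slots" are hypothetical.

```python
from typing import Dict


def allocate_slots(weights: Dict[str, float], total_slots: int) -> Dict[str, int]:
    """Split total_slots among topics in proportion to their weight values."""
    total = sum(weights.values())
    if total <= 0 or total_slots <= 0:
        return {topic: 0 for topic in weights}
    # Exact (fractional) share of each topic.
    exact = {t: total_slots * w / total for t, w in weights.items()}
    # Start from the integer part of each share.
    slots = {t: int(share) for t, share in exact.items()}
    leftover = total_slots - sum(slots.values())
    # Hand the remaining slots to the topics with the largest fractional parts.
    by_remainder = sorted(weights, key=lambda t: exact[t] - slots[t], reverse=True)
    for topic in by_remainder[:leftover]:
        slots[topic] += 1
    return slots
```

A topic with a larger weight never receives fewer slots than a topic with a smaller weight, which matches the stated relation between weight value and share of recommended content.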
In a possible design, after the target application is started and the first recommendation interface is displayed, the method further includes: receiving and collecting one or more user operation behavior data input by the user while the user uses the target application; and, when an instruction of the user to refresh the first recommendation interface is detected, displaying a second recommendation interface, where the recommended content included in the second recommendation interface is determined according to the one or more user operation behavior data and the topic interest table.
In this design, after the target application obtains the user's real-time operation data, analyzing that data reveals the interest topics the user pays more attention to. The user interest can then be adjusted promptly, and a recommendation interface that better matches the user interest can be displayed in time.
In a possible design, displaying the second recommendation interface may be implemented as: determining one or more corresponding interest topics according to the one or more user operation behavior data, and assigning an associated weight value to each of those interest topics; taking the one or more interest topics corresponding to the user operation behavior data, together with one or more of the interest topics included in the topic interest table, as the user interest; obtaining at least one item of recommended content according to the user interest; and displaying the obtained recommended content in the second recommendation interface; where each interest topic included in the topic interest table has an associated weight value, and the greater the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
In this design, combining the user's real-time user operation behavior data with the topic interest table generated by the server takes the user's personal interests into account on top of the currently popular interest topics. The resulting recommended content better matches the user's interests and improves the user experience.
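A minimal sketch of combining the two sources is given below. The linear blend controlled by `alpha`, the function name, and the representation of real-time operations as a list of topic labels are assumptions; the application only states that real-time topics receive weight values and are used together with the topic interest table.

```python
from collections import Counter
from typing import Dict, List


def merge_interests(topic_table: Dict[str, float], realtime_topics: List[str],
                    alpha: float = 0.5) -> Dict[str, float]:
    """Blend server-side topic weights with weights derived from real-time operations.

    alpha is the share given to real-time behavior (hypothetical parameter).
    """
    counts = Counter(realtime_topics)
    total = sum(counts.values())
    # Scale the server-provided weights by the remaining share.
    merged = {topic: (1 - alpha) * weight for topic, weight in topic_table.items()}
    # Assign each real-time topic a weight proportional to how often it occurred.
    for topic, count in counts.items():
        merged[topic] = merged.get(topic, 0.0) + alpha * count / total
    return merged
```

With a table that weights "sports" and "constellation" equally, a user who has just opened two "constellation" articles ends up with a higher merged weight for "constellation", so the refreshed interface would show more of that content.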
In a possible design, obtaining at least one item of the recommended content according to the user interest may be implemented as: searching locally cached content for recommended content corresponding to the user interest; and/or obtaining the recommended content corresponding to the user interest from a content providing server that provides it.
In this design, after the terminal device determines the user interest, it can obtain recommended content related to the user interest, such as trending articles and trending news, in several possible ways, which increases the diversity of the recommended content.
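The two acquisition paths can be combined as a cache lookup with a server fallback, as in the sketch below. The function name and the `fetch_from_server` callable are hypothetical; how the terminal device actually contacts the content providing server is not specified in the text.

```python
from typing import Callable, Dict, List


def get_recommended_content(topic: str, cache: Dict[str, List[str]],
                            fetch_from_server: Callable[[str], List[str]]) -> List[str]:
    """Return content for a topic from the local cache, else from the content server."""
    items = cache.get(topic)
    if items:
        # Cache hit: no request to the content providing server is needed.
        return items
    # Cache miss: request the content and remember it for later lookups.
    items = fetch_from_server(topic)
    cache[topic] = items
    return items
```

Passing the server access in as a callable keeps the sketch independent of any particular network API; a real terminal device could equally query both sources and merge the results, as the "and/or" in the text allows.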
In a fourth aspect, an embodiment of the present application further provides a terminal device, including: one or more processors; and one or more memories configured to store one or more computer programs and data information, where the one or more computer programs include instructions. When the instructions are executed by the one or more processors, the terminal device is caused to perform the method according to any one of the implementations of the first aspect, or the method according to any one of the implementations of the third aspect.
In a fifth aspect, an embodiment of the present application further provides a server, including: one or more processors; and one or more memories configured to store one or more computer programs and data information, where the one or more computer programs include instructions. When the instructions are executed by the one or more processors, the server is caused to perform the method according to any one of the implementations of the second aspect.
In a sixth aspect, an embodiment of the present application further provides a communication system, including a terminal device and a server. The terminal device can perform the steps of the terminal device in the method provided in the first aspect, or the steps of the terminal device in the method provided in the third aspect; the server can perform the steps of the server in the method provided in the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium. The computer-readable medium stores a computer program (also referred to as code or instructions) which, when run on a computer, causes the computer to perform the method in any possible implementation of the first aspect, the method in any possible implementation of the second aspect, or the method in any possible implementation of the third aspect.
In an eighth aspect, an embodiment of the present application provides a computer program product. The computer program product includes a computer program (also referred to as code or instructions) which, when run, causes a computer to perform the method in any possible implementation of the first aspect, the method in any possible implementation of the second aspect, or the method in any possible implementation of the third aspect.
In a ninth aspect, an embodiment of the present application further provides a graphical user interface on a terminal device. The terminal device has a display screen, one or more memories, and one or more processors configured to execute one or more computer programs stored in the one or more memories. The graphical user interface includes the graphical user interface displayed when the terminal device performs any possible implementation of the first aspect of the embodiments of the present application, or the graphical user interface displayed when it performs any possible implementation of the third aspect.
For the beneficial effects of any one of the fourth to ninth aspects, refer to the beneficial effects of the various possible designs in the first to third aspects; they are not repeated here.
Description of Drawings
FIG. 1 is an application scenario diagram of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 2a is a schematic diagram of a hardware architecture of a terminal device provided by an embodiment of the present application;
FIG. 2b is a block diagram of a software structure of a terminal device provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 4 is a first schematic diagram of a user interface of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a radar chart of user interest topics provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 7 is a second schematic diagram of a user interface of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 8a is a third schematic diagram of a user interface of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 8b is a fourth schematic diagram of a user interface of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 8c is a fifth schematic diagram of a user interface of a content recommendation method based on user interests provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a terminal device or a server provided by an embodiment of the present application.
Detailed Description
With the rapid development of society, terminal devices such as mobile phones are becoming increasingly widespread. A terminal device has not only communication functions but also powerful processing capabilities, storage capabilities, camera functions, and so on. The terminal device runs applications through an operating system (for example, the Android operating system), and a user can use the terminal device to make calls, send short messages, browse web pages, take pictures, play games, watch videos, and so on. In some applications (APPs) that hold data from different fields of interest, the terminal device can recommend content according to user interests. For example, when the user searches or browses web pages in a browser, browses short videos in a short-video APP, or shops in a shopping APP, the terminal device can recommend content that may interest the user according to user operation behaviors such as the user's search terms, search history, and currently browsed content. The user interest may follow a topic interest classification defined within the application or a commonly used topic interest classification. For example, the topic interest classification in a browser may include sports, finance, and current affairs, and the topic interest classification in a shopping APP may include clothing, daily necessities, and food.
As described in the background, a content recommendation system based on user interests needs to collect massive amounts of user operation behavior data to complete content recommendation. At present, however, there is no good solution for improving the recommendation accuracy and recommendation efficiency of such a system while still protecting user privacy.
In view of this, the present application provides a content recommendation method based on user interests. User operation behavior data is collected on the terminal device side, the user's sensitive private data is stripped from it, and the resulting user operation behavior data is uploaded to the server side. The server side builds a topic interest table from the large amount of user operation behavior data uploaded by multiple terminal devices, and returns the topic interest table to the terminal device. The terminal device can determine the user interest by combining factors such as the topic interest table delivered by the server side, the user's real-time operation behavior data on the terminal device, and historical user interests. Finally, according to the determined user interest, the terminal device may request content related to the user interest from the server side, or may obtain such content from its local cache, so as to perform real-time content recommendation on the terminal device.
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
FIG. 1 is an application scenario diagram of a content recommendation method based on user interests provided by an embodiment of the present application. The application scenario may include a terminal device 110, a server 120, and a database 130. An application may be installed on the terminal device 110. The server 120 may be a background server that communicates with the terminal device, or a separate server used for mining potential objects. The application may be a web application or an application pre-installed on the terminal device 110; it may be any type of application that can recommend content, such as a browser application, a short-video application, or a shopping application. Both the terminal device 110 and the server 120 can access the database 130 and store in it the access logs generated during user access. The database 130 may be deployed on the server 120 or deployed independently of it; for example, the database 130 may be implemented by a server cluster, a cloud server, or a distributed storage server. It should be noted that the present application does not limit the number or types of terminal devices 110, servers 120, and databases 130 in the application scenario; for example, there may be multiple terminal devices 110 as shown in FIG. 1.
For example, when the current user starts an application on the terminal device 110 and performs user operations in it, user operation behavior data is generated, such as data about the user's searching and browsing in a browser. The terminal device 110 can recommend content according to the user operation behavior data and the topic interest table delivered by the server 120. After the application is closed, the terminal device 110 can desensitize the user operation behavior data and upload it to the server 120. The server 120 can then generate or update the topic interest table from the massive user operation behavior data and return it to the terminal device 110. In addition, the server 120 may store the generated topic interest table or the received user operation behavior data in the database 130.
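Purely as an illustration of the server-side step, a topic interest table could be derived by counting how often each interest topic appears across the desensitized uploads and normalizing the counts into weights. The function `build_topic_interest_table`, the `"topics"` record field and the `top_n` cutoff are hypothetical, and frequency counting is a stand-in chosen for this sketch rather than the topic interest model of the application.

```python
from collections import Counter


def build_topic_interest_table(uploads, top_n=3):
    """Aggregate topic counts from many devices into normalized weights.

    Each upload is assumed to be a dict with a "topics" list and no
    user-identifying fields (they were stripped on the terminal device).
    """
    counts = Counter()
    for record in uploads:
        counts.update(record["topics"])
    total = sum(counts.values())
    if total == 0:
        return {}
    # Keep only the most frequent topics; weights are shares of all mentions.
    return {topic: n / total for topic, n in counts.most_common(top_n)}
```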
It can be understood that the terminal device 110 in the embodiments of the present application may be, for example, a mobile phone, a tablet computer, a wearable device (for example, a watch, a wristband, a helmet, or a headset), a vehicle-mounted device, an augmented reality (AR) or virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or a smart home device (for example, a smart TV, a smart speaker, or a smart camera). The embodiments of the present application place no limitation on the specific type of the terminal device 110.
Exemplary embodiments of the terminal device 110 to which the embodiments of the present application can be applied include, but are not limited to, portable terminal devices running the operating systems shown in Figure PCTCN2022081770-appb-000001 and Figure PCTCN2022081770-appb-000002, or other operating systems. The portable terminal device may also be another type of portable terminal device, such as a laptop computer with a touch-sensitive surface (for example, a touch panel).
FIG. 2a shows a schematic diagram of the hardware structure of a possible terminal device. The terminal device 110 includes components such as a radio frequency (RF) circuit 210, a power supply 220, a processor 230, a memory 240, an input unit 250, a display unit 260, an audio circuit 270, a communication interface 280, and a wireless fidelity (Wi-Fi) module 290. Those skilled in the art can understand that the hardware structure shown in FIG. 2a does not limit the terminal device: the terminal device provided in the embodiments of the present application may include more or fewer components than shown, may combine two or more components, or may have a different component configuration. The components shown in FIG. 2a may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application-specific integrated circuits.
The components of the terminal device 110 are described in detail below with reference to FIG. 2a:
The RF circuit 210 can be used to receive and send data during communication or a call. In particular, after receiving downlink data from a base station, the RF circuit 210 sends it to the processor 230 for processing; it also sends uplink data to the base station. Typically, the RF circuit 210 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), and a duplexer.
In addition, the RF circuit 210 may communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, and short messaging service (SMS).
Wi-Fi is a short-range wireless transmission technology. Through the Wi-Fi module 290, the terminal device 110 can connect to an access point (AP) and thereby access a data network. The Wi-Fi module 290 can be used to receive and send data during communication.
The terminal device 110 can be physically connected to other devices through the communication interface 280. Optionally, the communication interface 280 is connected to the communication interface of the other device through a cable, so as to transmit data between the terminal device 110 and the other device.
In the embodiments of the present application, the terminal device 110 implements communication services and interacts with the server side, so the terminal device 110 needs a data transmission function; that is, it needs to contain a communication module. Although FIG. 2a shows communication modules such as the RF circuit 210, the Wi-Fi module 290, and the communication interface 280, it can be understood that the terminal device 110 contains at least one of these components, or another communication module (such as a Bluetooth module), for data transmission.
For example, when the terminal device 110 is a mobile phone, it may include the RF circuit 210 and may also include the Wi-Fi module 290; when the terminal device 110 is a computer, it may include the communication interface 280 and may also include the Wi-Fi module 290; when the terminal device 110 is a tablet computer, it may include the Wi-Fi module 290.
The memory 240 can be used to store software programs and modules. The processor 230 executes the various functional applications and data processing of the terminal device 110 by running the software programs and modules stored in the memory 240. Optionally, the memory 240 may mainly include a program storage area and a data storage area. The program storage area can store the operating system (mainly including the software programs or modules corresponding to the kernel layer, the system layer, the application framework layer, the application layer, and so on). The application layer may contain various applications; applications capable of recommendation can implement content recommendation based on user interests by using the method provided in the embodiments of the present application.
In addition, the memory 240 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The input unit 250 can be used to receive editing operations on many types of data objects, such as numeric or character information entered by the user, and to generate key signal inputs related to user settings and function control of the terminal device 110. Optionally, the input unit 250 may include a touch panel 251 and other input devices 252.
The touch panel 251, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 251 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected apparatus according to a preset program.
Optionally, the other input devices 252 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick.
The display unit 260 can be used to display information entered by the user or provided to the user, as well as the various menus of the terminal device 110. The display unit 260 is the display system of the terminal device 110 and is used to present interfaces and enable human-computer interaction. The display unit 260 may include a display panel 261. Optionally, the display panel 261 may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. In the embodiments of the present application, the display unit 260 can, for example, display the visual page corresponding to the user's operation on the terminal device; for instance, after the user enters a search term, the display unit 260 displays the information feed, web pages, and so on corresponding to that search term.
The processor 230 is the control center of the terminal device 110. It connects the components through various interfaces and lines, and performs the various functions of the terminal device 110 and processes data by running or executing the software programs and/or modules stored in the memory 240 and invoking the data stored in the memory 240, thereby implementing the various services based on the terminal device. In the embodiments of the present application, the processor 230 implements the method provided in the embodiments of the present application, so as to recommend content of interest to the user more accurately.
The terminal device 110 further includes a power supply 220 (such as a battery) for supplying power to the components. Optionally, the power supply 220 may be logically connected to the processor 230 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
As shown in FIG. 2a, the terminal device 110 further includes an audio circuit 270, a microphone 271, and a speaker 272, which provide an audio interface between the user and the terminal device 110. The audio circuit 270 can convert audio data into a signal that the speaker 272 can recognize and transmit the signal to the speaker 272, which converts it into a sound signal for output. The microphone 271 collects external sound signals (such as a person's voice or other sounds), converts them into signals that the audio circuit 270 can recognize, and sends them to the audio circuit 270. The audio circuit 270 can also convert the signal sent by the microphone 271 into audio data, and then output the audio data to the RF circuit 210 for sending to, for example, another terminal, or output the audio data to the memory 240 for further processing.
Although not shown, the terminal device 110 may further include at least one sensor, a camera, and so on, which are not described here.
The operating system (OS) involved in the embodiments of the present application is the most basic system software running on the terminal device 110. Taking a smartphone as an example, the operating system may be the Android system or the iOS system. The following embodiments use the Android system as an example. Those skilled in the art can understand that similar methods can also be used on other operating systems.
The software system of the terminal device 110 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of the present application use an Android system with a layered architecture as an example to illustrate the software structure of the terminal device 110.
FIG. 2b shows a block diagram of the software structure of the Android system provided by an embodiment of the present application. The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into five layers, which from top to bottom are the application layer, the application framework layer, the Android runtime and system libraries, the hardware abstraction layer, and the kernel layer.
The application layer is the top layer of the operating system and may include a series of application packages. As shown in FIG. 2b, the application layer may include native applications of the operating system and third-party applications. The native applications may include the user interface (UI), browser, camera, settings, phone manager, music, short messages, calls, and so on; the third-party applications may include maps, a shopping APP, a short-video APP, and so on. An application mentioned below may be a native application of the operating system installed on the terminal device 110 at the factory, or a third-party application that the user downloads from the network or obtains from another terminal device 110 while using the terminal device 110.
In some embodiments of the present application, the application layer may be used to present an editing interface on which the user can perform operations. For example, the user may perform user operation behaviors such as entering a search term on the editing interface presented by the browser.
In a possible implementation, an application may be developed in the Java language by calling the application programming interface (API) provided by the application framework layer. Through the application framework layer, developers can interact with the lower layers of the operating system (for example, the hardware abstraction layer and the kernel layer) to develop their own applications. The application framework layer mainly consists of a series of services and management systems of the operating system.
The application framework layer provides an application programming interface and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions. As shown in FIG. 2b, the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and so on.
The window manager manages window programs. It can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, and so on. The content provider stores and retrieves data and makes the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and so on. The view system includes visual controls, such as text controls for displaying text and picture controls for displaying pictures. The view system can be used to build applications. A display interface may consist of one or more views. The telephony manager provides the communication functions of the terminal device 110, such as management of call status (including connected, hung up, and so on). The resource manager provides applications with various resources, such as localized strings, icons, pictures, layout files, and video files.
In some embodiments of this application, the application framework layer is mainly responsible for invoking the service interface that communicates with the hardware abstraction layer, so as to pass the user's operation requests to the hardware abstraction layer. An operation request may be, for example, the request corresponding to the user opening an APP, or the request corresponding to the user typing a search term in an APP. The hardware abstraction layer generates the corresponding content recommendation service according to the operation request passed by the application layer.
For example, the content recommendation service may include a data collection module, a data calibration module, a real-time recommendation module, a privacy protection module, and other modules that implement the method provided in this application. The data collection module collects the user's operations on clients on the terminal device to obtain user operation behavior data. The data calibration module preprocesses the user operation behavior data collected by the data collection module to obtain user operation behavior data with a relatively uniform sequence length. The privacy protection module desensitizes the collected user operation behavior data, for example by stripping or replacing the parts that involve the user's private data, so as to obtain user operation behavior data that does not reveal the user's privacy, and transmits the desensitized data to the server side, where it is used to build a topic interest model and generate a topic interest table. The real-time recommendation module performs real-time content recommendation according to the determined user interests.
The Android runtime includes core libraries and a virtual machine, and is responsible for scheduling and management of the Android system. The core libraries consist of two parts: the functions that the Java language needs to call, and the Android core libraries.
The application layer and the application framework layer run in the virtual machine. Taking Java as an example, the virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system library may include multiple functional modules, for example: a surface manager, media libraries, a three-dimensional graphics processing library (for example, OpenGL ES), and a two-dimensional (2D) graphics engine (for example, SGL).
The surface manager manages the display subsystem and composites 2D and 3D layers for multiple applications. The media libraries support playback and recording of a variety of commonly used audio and video formats, as well as still image files, and can support multiple audio, video, and image encoding formats, for example MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG. The three-dimensional graphics processing library implements three-dimensional drawing, image rendering, compositing, and layer processing. The 2D graphics engine is the drawing engine for 2D drawing.
The hardware abstraction layer (HAL) supports the application framework layer and is an important link between the application framework layer and the kernel layer. It can provide services to developers through the application framework layer.
For example, the content recommendation service of the embodiments of this application may be implemented by configuring a first process in the hardware abstraction layer; the first process may be a subprocess built separately in the hardware abstraction layer. The first process may include modules such as a content recommendation service configuration interface and a content recommendation service controller. The content recommendation service configuration interface is the service interface that communicates with the application framework layer. The content recommendation service controller monitors the content recommendation service configuration interface, for example by controlling whether the content recommendation service requires authentication. It is also responsible for monitoring whether data input on the terminal device 110 needs to be cached or has been updated; when it does, the controller can notify the application framework layer to cache or update the corresponding data, so that the display interface shows the latest data. The hardware abstraction layer may further include a daemon process that caches the data of the first process; the daemon process may also be a subprocess built separately in the hardware abstraction layer.
The kernel layer may be a Linux kernel layer, which is the abstraction layer between hardware and software. The kernel layer contains many drivers related to the terminal device 110, including at least a display driver; a Linux-based frame buffer driver; keyboard and mouse drivers for input devices; a Flash driver based on memory technology devices; an audio driver; a Bluetooth driver; and the like. The embodiments of this application place no limitation on this. The Linux kernel layer provides the core system services of the operating system: security, memory management, process management, the network protocol stack, and the driver model are all implemented on the Linux kernel.
The terminal device 110 can usually run multiple applications at the same time. In the simpler case, one application corresponds to one process; in the more complex case, one application corresponds to multiple processes. Each process has a process number (process ID).
With reference to the description of the hardware structure of the terminal device 110 in FIG. 2a and of its software framework in FIG. 2b, the following describes, for the scenario of content recommendation based on user interests, how the software and hardware of the terminal device 110 work when it performs the user interest-based content recommendation method proposed in the embodiments of this application.
It should be understood that, in the embodiments of this application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or a similar expression refers to any combination of the listed items, including any combination of a single item or multiple items. For example, "at least one of a, b, or c" can mean: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c may be single or multiple.
In the embodiments of this application, "a plurality of" means two or more.
In addition, it should be understood that, in the description of this application, terms such as "first" and "second" are used only to distinguish between the things described. They should not be understood as indicating or implying relative importance, nor as indicating or implying an order.
Furthermore, in the embodiments of this application, "terminal device", "device", "mobile phone", and similar terms may be used interchangeably; they all refer to devices that can be used to implement the embodiments. "Application" and "application program" may also be used interchangeably; both refer to a program or client that is able to provide some service. In other words, "application" and "client" are also interchangeable: a browser client or a game client may equally be called a browser application or a game application.
It should be understood that the hardware structure of the terminal device may be as shown in FIG. 2a and its software architecture may be as shown in FIG. 2b. The software programs and/or modules corresponding to the software architecture may be stored in the memory 240, and the processor 230 may run the software programs and applications stored in the memory 240 to perform the flow of the user interest-based content recommendation method provided in the embodiments of this application.
To make the user interest-based content recommendation method provided in this application easier to understand, the steps of the method are described below with reference to the structural diagram shown in FIG. 3. Note that the division into modules in the embodiments of this application is illustrative and is only a division by logical function; other divisions are possible in an actual implementation. In addition, the functional modules in the embodiments may be integrated in one processor, may exist as separate physical units, or two or more modules may be integrated in one unit. Such an integrated unit may be implemented in hardware or as a software program.
As shown in FIG. 3, when this application is implemented, and according to the division by logical function, the terminal device side may include a data collection module 301, a data calibration module 302, a privacy protection module 303, and a real-time recommendation module 306; the server side may include a data statistics module 304 and a data analysis module 305.
The data collection module 301 on the terminal device side collects user operation behavior data. The collected data can be passed on to the data calibration module 302 for preprocessing, and can also be fed into the real-time recommendation module 306 for content recommendation based on user interests. The privacy protection module 303 on the terminal device side further desensitizes the user operation behavior data that the data calibration module 302 has preprocessed, and then sends it to the data statistics module 304 on the server side.
The data statistics module 304 on the server side aggregates the desensitized user operation behavior data sent by one or more terminal devices (FIG. 3 shows only one terminal device as an example; if there are several, the others are processed in a similar way, which is not described again). The aggregated data is fed into the data analysis module 305, which trains on the large volume of aggregated user operation behavior data to generate a topic interest table. The data analysis module 305 can return the generated topic interest table to the terminal device. On the one hand, it can send the table to the data calibration module 302, which can refer to it when preprocessing user operation behavior data. On the other hand, it can send the table to the real-time recommendation module 306, which can combine it with other inputs to make real-time content recommendations based on user interests. The data statistics module 304 on the server side may also be integrated with the data analysis module 305 into a single module.
Based on the above description of the structural diagram in FIG. 3, the method provided in this application can be divided into several stages, which are described in detail below.
Stage 1: The data collection module 301 on the terminal device collects user operation behavior data.
For the content recommended on the terminal device to track the user's preferences and habits more accurately, a user interest-based content recommendation method usually has to be built on a large volume of user operation behavior data, which is generally produced while users use their terminal devices. For any application on the terminal device, the user operations performed between the user starting the application and the user closing it can usually be treated as one complete set of user operation behavior. For example, the operation data in this set can be recorded in a session object (a session is the time interval during which an end user communicates with the server that provides the application service, typically the time from when the user logs in to that server until the user logs out). A session object may contain one or more pieces of operation data, and this application does not limit the number or type of the operation data.
As an example, suppose user A performs some search operations in a browser application; the period from user A opening the browser to user A exiting it can be recorded as one session. Specifically, with reference to FIG. 4, user A opening the browser is recorded as the start of a session (interface 1 in FIG. 4). While user A uses the browser, the terminal device can record all of user A's operations in this session through the data collection module 301. Interface 2 in FIG. 4 shows the browser home page, where user A can search for keywords through the search box, browse the information feed (for example, the entries displayed on the browser home page; user A can also swipe up and down to browse more of the feed, which FIG. 4 does not show), or tap a feed entry on the home page to browse a web page (for example, user A taps the feed entry "Basketball Association - Home" to open a detail page with more basketball news, which FIG. 4 does not show). Finally, when the terminal device detects that user A has exited the browser, it records the end of this session. For example, the terminal device detects that the current display interface has changed to the home screen of the mobile phone (interface 3 in FIG. 4), or has switched to the display interface of another application, or is in any other state indicating that the current display interface of the mobile phone is no longer the browser.
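The session bookkeeping in this example can be sketched as follows. This is a minimal illustration in Python rather than the implementation of the data collection module 301; the class names, method names, and behavior-type strings (`SessionRecorder`, `on_app_left`, `"search"`, and so on) are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Operation:
    behavior_type: str   # hypothetical labels, e.g. "search", "feed_browse"
    key_identifier: str  # e.g. a search term, a feed keyword, or a URL


@dataclass
class Session:
    user_id: str
    app_id: str
    operations: List[Operation] = field(default_factory=list)


class SessionRecorder:
    """Groups the operations between opening and leaving an app into one session."""

    def __init__(self) -> None:
        self.current: Optional[Session] = None
        self.finished: List[Session] = []

    def on_app_opened(self, user_id: str, app_id: str) -> None:
        # Opening the app marks the start of a session.
        self.current = Session(user_id, app_id)

    def on_operation(self, behavior_type: str, key_identifier: str) -> None:
        # Operations are recorded only while a session is open.
        if self.current is not None:
            self.current.operations.append(Operation(behavior_type, key_identifier))

    def on_app_left(self) -> None:
        # Called when the display is no longer the app (home screen or app switch).
        if self.current is not None:
            self.finished.append(self.current)
            self.current = None
```

A session may hold any number of operations, which is why the sequence length of the recorded data varies from one session to the next.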
Because user operations can occur in many application scenarios, the user operation behavior data collected by the data collection module 301 exhibits multi-domain behavior. Multi-domain behavior means that the user operates on display pages that differ in protocol, and/or domain name, and/or port. If any two display pages in the user operation behavior data use the same protocol, domain name, and port, the two pages belong to the same domain. Conversely, if two display pages use different protocols, for example, they belong to different domains, and the user operation behavior data represents multi-domain behavior. For instance, a user searches, browses the information feed, and browses web pages in a browser; the web pages behind different feed entries generally belong to different domains, so the user's operation behavior data in the browser generally has multi-domain, that is, cross-domain, characteristics.
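The same-domain test described here (same protocol, domain name, and port) can be sketched with Python's standard `urllib.parse`. The function names and the default-port table are assumptions made for illustration; the application does not specify how the comparison is carried out.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when a URL does not state a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, domain name, port) triple of a page URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_domain(url_a, url_b):
    """Two pages are in the same domain only if all three parts match."""
    return origin(url_a) == origin(url_b)


def is_multi_domain(urls):
    """A set of visited pages is multi-domain if it spans more than one triple."""
    return len({origin(u) for u in urls}) > 1
```

Under this sketch, `https://china.nba.com/a` and `https://china.nba.com:443/b` fall in the same domain, while the `http` and `https` versions of the same host do not.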
By collecting user operation behavior data with multi-domain characteristics and statistically analyzing a large volume of it, the topic interest tables of multiple domains can be synchronized. The resulting user interest combines the interest characteristics of the user's operations across those domains, which allows more accurate content recommendation.
Stage 2: The data calibration module 302 on the terminal device preprocesses the user operation behavior data collected by the data collection module 301.
For example, because the user operates differently each time, the sequence length of the user operation behavior data may differ from session to session; the sequence length is determined by the number of operations the user performs. If the user opens the browser and performs only a few operations, the sequence of user operation behavior data collected in that session is short. If the user opens the browser and performs many operations such as searching, browsing the information feed, and browsing web pages, the sequence collected in that session is long. Table 1 is an example of the user operation behavior data collected during one session:
Table 1
Figure PCTCN2022081770-appb-000003
In the embodiments of this application, one row of user operation behavior data, as in Table 1 above, represents one sequence entry. Table 1 contains 4 rows of actual user operation behavior data, so its sequence length is considered to be 4. Similar tables in later embodiments follow the same definition, which is not repeated. The information types of the user operation behavior data shown in Table 1 may include: the user ID, a key identifier that reflects the user operation, the type of the user operation, the interest topic corresponding to the key identifier, and additional summary information (optional). User operation behavior data may also include more or fewer information types than Table 1, for example an application identifier; this application does not limit this. The data may be stored in tabular form or in other forms, which this application likewise does not limit.
In the above example, the key identifiers in Table 1 are derived from the user operations. A key identifier may be a search term, a feed keyword, a web page address, and so on. For example, if the user operation is a search with the search term "NBA", the key identifier may be "NBA". If the user operation is feed browsing and the feed keyword is "basketball" (see the second and third feed entries in interface 2 of FIG. 4), the key identifier may be "basketball". If the user operation is web page browsing and the keyword of the page content is "real economy", the key identifier may be the page keyword "real economy", or it may be the URL of the page being browsed.
In one possible preprocessing scenario, the interest topic in Table 1 cannot be obtained from the user operation itself (for example, when the user searches for a term, the terminal device can obtain the key identifier from the operation but cannot determine the interest topic). The terminal device can then determine the interest topic from the key identifier in the user operation and the topic interest table obtained from the server side. The topic interest table indicates the mapping between key identifiers and interest topics. The server side generates it by statistically analyzing the user operation behavior data that it receives from one or more terminal devices after processing by the data calibration module 302 and the privacy protection module 303; how it is generated is described in detail in later embodiments. The terminal device may obtain the topic interest table from the server side periodically and store it locally. The table may be requested actively by the terminal device, delivered periodically by the server, or pushed by the server after it detects that the table has been updated; this application does not limit how the terminal device obtains the table from the server side.
Specifically, after the terminal device obtains the key identifiers from a user operation, it can use the topic interest table to determine the interest topic corresponding to each of the one or more key identifiers contained in that operation. For example, if the key identifier obtained is "NBA" and the topic interest table contains a mapping between "NBA" and the interest topic "Sports", the terminal device can determine by querying the table that the interest topic for "NBA" is "Sports", and store the interest topic together with the key identifier as user operation behavior data, as in the second row of Table 1 above.
In another possible preprocessing scenario, to ensure both accuracy and processing efficiency when modeling on the user operation behavior data collected over a large number of sessions, the data collected during each session is sampled to a preset value, so that the data from every session has a fairly fixed sequence length. Specifically, user operation behavior data with a short sequence is padded with default user operations, and data with a long sequence is randomly truncated. This yields user operation behavior data with a relatively uniform sequence length. It avoids the case where the sequence is too short (too few samples) for the user's interests to be analyzed accurately, and the case where the sequence is too long (too many samples), which makes the analysis redundant and the processing inefficient.
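The lookup in the first preprocessing scenario, where a key identifier is mapped to an interest topic through the topic interest table, can be sketched as a dictionary query. This is an illustrative sketch only: the function name, the in-memory `dict` representation of the table, and the `unknown` fallback are assumptions, since the application does not say how the table is stored or what happens when a key identifier is missing from it.

```python
def attach_interest_topics(key_identifiers, topic_interest_table, unknown=None):
    """Pair each key identifier with its interest topic from the table.

    topic_interest_table maps key identifier -> interest topic, as obtained
    from the server side. Identifiers absent from the table get `unknown`
    (an assumed fallback, not something the application specifies).
    """
    return [(key, topic_interest_table.get(key, unknown)) for key in key_identifiers]


# Example table holding the mappings mentioned in the text.
table = {"NBA": "Sports", "real economy": "Finance"}
records = attach_interest_topics(["NBA", "real economy"], table)
```

Each resulting pair corresponds to one row of user operation behavior data, with the key identifier and the interest topic stored together.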
As an example, the processing performed by the data calibration module 302 is described below with reference to Table 2 (Tables 2a and 2b) and Table 3 (Tables 3a and 3b):
Table 2a
Key identifier | Interest topic
NBA | Sports
Real economy | Finance
Table 2a is an example of the user operation behavior data collected during one session; its sequence length is short.
Table 2b
Key identifier | Interest topic
NBA | Sports
Real economy | Finance
"0" | "0"
Tables 2a and 2b show the following. Suppose the data calibration module 302 samples to a preset sequence length of 3. When the sequence of user operation behavior data collected by the data collection module 301 is short (that is, shorter than the preset value; the sequence length of 2 in Table 2a is less than the preset value of 3), predefined default user operation behavior data of the target length can be appended, giving user operation behavior data with a sequence length of 3. The target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value. For example, the sequence length in Table 2a is 2 and the preset value is 3, so the target length is |2-3| = 1, and one entry of user operation behavior data is appended in Table 2b. In Table 2b, "0" stands for the default key identifier and "0" stands for the default interest topic mapped to it. The default key identifier and default interest topic may be set in advance or determined by some rule (for example, according to currently trending user operations); for instance, the default key identifier may be "COVID-19" and the default interest topic "Current affairs".
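The padding step from Table 2a to Table 2b can be sketched as below. The function and constant names are hypothetical; the default entry `("0", "0")` and the target-length rule follow the text.

```python
DEFAULT_RECORD = ("0", "0")  # default key identifier and default interest topic


def pad_sequence(records, preset_length, default=DEFAULT_RECORD):
    """Append default entries until the sequence reaches the preset length.

    The number of appended entries is the target length, i.e. the absolute
    difference between the current sequence length and the preset value.
    Sequences that already meet the preset length are returned unchanged.
    """
    if len(records) >= preset_length:
        return list(records)
    target_length = abs(len(records) - preset_length)
    return list(records) + [default] * target_length


table_2a = [("NBA", "Sports"), ("real economy", "Finance")]
table_2b = pad_sequence(table_2a, preset_length=3)
```

With a preset length of 3, the two entries of Table 2a gain one default entry, which matches the three rows of Table 2b.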
Table 3a
Key identifier | Interest topic
NBA | Sports
Real economy | Finance
Football | Sports
https://china.nba.com/ | Sports
COVID-19 | Current affairs
Table 3a is an example of user operation behavior data collected during any one session; in this example the sequence is relatively long.
Table 3b
Key identifier | Interest topic
NBA | Sports
Football | Sports
COVID-19 | Current affairs
As Tables 3a and 3b show, assume that the data calibration module 302 samples user operation behavior data with a preset sequence length of 3. When the sequence collected by the data collection module 301 is long (that is, longer than the preset value; for example, the sequence in Table 3a has length 5, which is greater than the preset value 3), entries amounting to the target length can be randomly truncated to obtain user operation behavior data with a sequence length of 3. The target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value. For example, the sequence length in Table 3a is 5 and the preset value is 3, so the target length is |5-3| = 2, and two entries of user operation behavior data are truncated in Table 3b. Optionally, when the sequence length of the user operation behavior data is greater than the preset sampling value, weighted sampling based on the behavior type of each user operation can be used instead of random truncation, and the entries with the larger weights are then selected up to the preset value. For example, after the terminal device obtains the weights of the 5 entries in Table 3a according to the type of user operation behavior, it selects the 3 entries with the larger weights, such as the 3 entries shown in Table 3b. Other ways of sampling to obtain user operation behavior data whose sequence length equals the preset value may also be used; this application does not limit them.
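The two reduction options described above, random truncation and weight-based selection, might be sketched as follows. The behavior types attached to the Table 3a entries and the numeric values in `BEHAVIOR_WEIGHTS` are hypothetical; the source gives neither. Both helpers keep the surviving entries in their original order, which is also an assumption.

```python
import random
from typing import Dict, List, Optional, Tuple

# (key identifier, interest topic, behavior type); behavior types are hypothetical.
Record = Tuple[str, str, str]

# Hypothetical weights per behavior type; the source gives no concrete values.
BEHAVIOR_WEIGHTS: Dict[str, float] = {
    "search": 3.0,
    "web_browsing": 2.0,
    "feed_browsing": 1.0,
}


def truncate_randomly(records: List[Record], preset: int,
                      rng: Optional[random.Random] = None) -> List[Record]:
    """Randomly drop len(records) - preset entries, keeping the original order."""
    if len(records) <= preset:
        return list(records)
    rng = rng or random.Random()
    kept = sorted(rng.sample(range(len(records)), preset))
    return [records[i] for i in kept]


def select_by_weight(records: List[Record], preset: int,
                     weights: Dict[str, float] = BEHAVIOR_WEIGHTS) -> List[Record]:
    """Keep the preset number of entries with the largest behavior-type weights."""
    if len(records) <= preset:
        return list(records)
    ranked = sorted(range(len(records)),
                    key=lambda i: weights.get(records[i][2], 0.0),
                    reverse=True)
    kept = sorted(ranked[:preset])  # restore the original order
    return [records[i] for i in kept]


table_3a = [
    ("NBA", "Sports", "search"),
    ("Real economy", "Finance", "feed_browsing"),
    ("Football", "Sports", "search"),
    ("https://china.nba.com/", "Sports", "feed_browsing"),
    ("COVID-19", "Current affairs", "web_browsing"),
]
table_3b = select_by_weight(table_3a, preset=3)
```

Under these assumed behavior types, `select_by_weight` keeps the "NBA", "Football", and "COVID-19" entries, matching Table 3b. `truncate_randomly` returns some 3 of the 5 entries and accepts a seeded `random.Random` for reproducibility.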
In implementing this application, the processing in stage 2 preprocesses the user operation behavior data collected in stage 1 into a data structure with a uniform sequence length. Once the varied user operation behavior data generated in different scenarios has been preprocessed in this way, it is easier for the privacy protection module 303 to continue with desensitization, and easier to model the content recommendation system based on user interest. The modeling of that system is described in detail in subsequent embodiments and is not repeated here.
Stage 3: The privacy protection module 303 on the terminal device desensitizes the user operation behavior data preprocessed by the data calibration module 302 to obtain desensitized user operation behavior data.
In implementing this application, the user operation behavior data can be desensitized in one or a combination of the following manners, or in other possible manners; this application does not limit them. Examples include:
Manner 1: The terminal device uses a differential privacy algorithm to randomly replace user operation behavior data within the same interest topic. Specifically, each user operation behavior contained in the user operation behavior data of a session has a certain probability of remaining unchanged and a certain probability of being randomly replaced. Optionally, the terminal device keeps the interest topic unchanged, randomly looks up the topic interest table, selects a key identifier under that interest topic, and replaces the original key identifier with it. Alternatively, it regenerates the additional summary information in the user operation behavior data. Continuing the example of user operation behavior data in Table 1, assume that the privacy processing performed by the terminal device keeps the interest topic "Sports" in row 2 of Table 1 unchanged, replaces the key identifier in row 2 from "NBA" with "Basketball", and changes the additional summary information in row 5 of Table 1 from "Vaccine, pneumonia, asymptomatic" to "Wuhan". The user operation behavior data after privacy processing is then as shown in Table 4:
Table 4
User ID | Key identifier | User operation behavior type | Interest topic | Additional summary information (optional)
User A | Basketball | Search | Sports | 
User A | https://china.nba.com/ | Web browsing | Sports | NBA
User A | Real economy | Information feed browsing | Finance | 
User A | COVID-19 | Web browsing | Current affairs | Wuhan
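The keep-or-replace behavior of Manner 1 can be illustrated with a randomized-response style sketch. This is an assumption-laden illustration, not the application's algorithm: the source does not specify the probabilities or the replacement rule, the `keep_probability` parameter and the contents of `TOPIC_TABLE` are hypothetical, and the sketch makes no claim about a formal differential privacy guarantee.

```python
import random
from typing import Dict, List, Optional, Tuple

# (key identifier, interest topic); this tuple layout is an assumption.
Record = Tuple[str, str]

# Hypothetical topic interest table: interest topic -> key identifiers.
TOPIC_TABLE: Dict[str, List[str]] = {
    "Sports": ["NBA", "Basketball", "Football"],
    "Finance": ["Real economy"],
    "Current affairs": ["COVID-19"],
}


def randomize_record(record: Record, topic_table: Dict[str, List[str]],
                     keep_probability: float, rng: random.Random) -> Record:
    """Keep the record with the given probability; otherwise swap in a random
    key identifier drawn from the same interest topic."""
    key, topic = record
    candidates = topic_table.get(topic, [])
    if not candidates or rng.random() < keep_probability:
        return record
    return (rng.choice(candidates), topic)


def randomize_session(records: List[Record], topic_table: Dict[str, List[str]],
                      keep_probability: float = 0.7,
                      rng: Optional[random.Random] = None) -> List[Record]:
    """Apply the keep-or-replace step to every record of one session."""
    rng = rng or random.Random()
    return [randomize_record(r, topic_table, keep_probability, rng)
            for r in records]


session = [("NBA", "Sports"), ("Real economy", "Finance"),
           ("COVID-19", "Current affairs")]
desensitized = randomize_session(session, TOPIC_TABLE)
```

The interest topic of every record is preserved, so a replacement such as "NBA" to "Basketball" stays within "Sports", as in Table 4. Because the replacement is drawn from the whole topic, it may occasionally coincide with the original key identifier.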
Manner 2: The terminal device strips the user privacy data from the user operation behavior data. Optionally, if the user operation behavior data collected by the terminal device contains user privacy data such as the user ID and the user operation behavior type, that privacy data can be stripped. Continuing the example of user operation behavior data in Table 1, Table 5 shows the user operation behavior data after the user-related information is stripped:
Table 5
Key identifier | Interest topic | Additional summary information (optional)
NBA | Sports | 
https://china.nba.com/ | Sports | NBA
Real economy | Finance | 
COVID-19 | Current affairs | Vaccine, pneumonia, asymptomatic
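Stripping the privacy fields in Manner 2 amounts to dropping columns from each record. A minimal sketch, assuming each record is a dictionary; the field names (`user_id`, `behavior_type`, and so on) are hypothetical labels for the columns of Tables 4 and 5.

```python
from typing import Dict, List, Sequence

# Hypothetical field names for the columns treated as user privacy data.
PRIVATE_FIELDS = ("user_id", "behavior_type")


def strip_private_fields(records: List[Dict[str, str]],
                         private_fields: Sequence[str] = PRIVATE_FIELDS
                         ) -> List[Dict[str, str]]:
    """Return copies of the records without the privacy-related fields."""
    return [
        {name: value for name, value in record.items()
         if name not in private_fields}
        for record in records
    ]


table_1_rows = [
    {"user_id": "User A", "key_identifier": "NBA", "behavior_type": "search",
     "interest_topic": "Sports", "summary": ""},
    {"user_id": "User A", "key_identifier": "COVID-19",
     "behavior_type": "web_browsing", "interest_topic": "Current affairs",
     "summary": "Vaccine, pneumonia, asymptomatic"},
]
table_5_rows = strip_private_fields(table_1_rows)
```

The input records are left untouched; only the returned copies omit the user ID and behavior type, which corresponds to the three remaining columns of Table 5.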
In the above example, if the terminal device desensitizes the user operation behavior data using both Manner 1 and Manner 2, the resulting data combines the outcomes of both manners. For example, Table 6 shows the desensitized user operation behavior data after processing by Manner 1 and Manner 2:
Table 6
Key identifier | Interest topic | Additional summary information (optional)
Basketball | Sports | 
https://china.nba.com/ | Sports | NBA
Real economy | Finance | 
COVID-19 | Current affairs | Wuhan
As Table 6 shows, after the preprocessed user operation behavior data is further desensitized in stage 3, the resulting desensitized data protects user privacy well. Its main purpose is to highlight the user interests reflected in current user operation behavior, so as to obtain the currently popular interest topics and the hot key identifiers under each interest topic. This makes it easier for the server side to generate or update the topic interest table, improving the timeliness and accuracy of content recommendation based on user interest.
Stage 4: The privacy protection module 303 on the terminal device uploads the desensitized user operation behavior data to the data statistics module 304 on the server side.
The server side can be connected to one or more terminal devices, so it can obtain multiple sets of desensitized user operation behavior data uploaded by the privacy protection modules 303 of one or more terminal devices.
Stage 5: The data statistics module 304 in the server aggregates the desensitized user operation behavior data uploaded by the privacy protection modules 303 of one or more terminal devices. The data analysis module 305 in the server then analyzes the desensitized user operation behavior data aggregated by the data statistics module 304 and generates the topic interest table.
Note that, given the computing capability of the terminal device side, having the terminal device train and generate the topic interest table would require a high-performance terminal device; such an implementation is costly and does not improve recommendation efficiency well. Therefore, training the topic interest table from the massive collected user operation behavior data is generally done on the server side. In addition, to protect user privacy data, the terminal device desensitizes the sensitive data in the user operation behavior data before uploading it to the server, as in the preprocessing introduced in stage 2 and the desensitization introduced in stage 3 of the above embodiment. The terminal device then sends the resulting desensitized user operation behavior data to the server, which generates the topic interest table.
For example, because each piece of user operation behavior data contains mappings between multiple key identifiers and interest topics, the data statistics module 304 can statistically analyze the large volume of desensitized user operation behavior data it receives. The statistical analysis may cover one or a combination of the following: the correspondence between desensitized user operation behavior and key identifiers, the correspondence between desensitized user operation behavior data and interest topics, and the correspondence between interest topics and key identifiers.
1) Analyzing the correspondence between desensitized user operation behavior and key identifiers yields the hot key identifiers that users are currently searching for or browsing. The terminal device can then recommend content according to the hot key identifiers, that is, it mainly obtains information feeds, web pages, and other content related to those hot key identifiers, so that the recommended content is likely to interest the user. For example, if the statistical analysis indicates that the key identifier "NBA" appears most often in a large volume of desensitized user operation behavior data, the terminal device determines that "NBA" is a hot key identifier currently searched for or browsed by users, and it can obtain content related to "NBA" for recommendation.
2) Analyzing the correspondence between desensitized user operation behavior data and interest topics yields the popular interest topics that users are currently searching for or browsing. The terminal device can then recommend content according to the popular interest topics. For example, if the statistical analysis indicates that the interest topic "Sports" appears most often in a large volume of desensitized user operation behavior data, the terminal device determines that "Sports" is a popular interest topic currently searched for or browsed by users, and it can obtain content under the "Sports" interest topic for recommendation.
3) Analyzing the correspondence between interest topics and key identifiers yields the hot key identifiers under each interest topic. When recommending content according to popular interest topics, the terminal device can then further refine the recommendation according to the hot key identifiers. For example, suppose the statistical analysis of a large volume of desensitized user operation behavior data determines that the key identifiers under the interest topic "Sports" include "NBA", "Football", and others, and that "NBA" appears most often. The terminal device then determines that the hot key identifier under "Sports" is "NBA". Having determined that the popular interest topic is "Sports", the terminal device can obtain more "NBA"-related content when it obtains content under the "Sports" interest topic for recommendation.
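The three kinds of statistics listed above reduce to frequency counts over the uploaded records. The following sketch uses `collections.Counter`; the function name `summarize`, the record layout, and the sample uploads are hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Tuple

# (key identifier, interest topic); this tuple layout is an assumption.
Record = Tuple[str, str]


def summarize(uploads: List[List[Record]]):
    """Count key identifiers, interest topics, and key identifiers per topic
    across all desensitized sessions uploaded by terminal devices."""
    key_counts: Counter = Counter()
    topic_counts: Counter = Counter()
    keys_by_topic: DefaultDict[str, Counter] = defaultdict(Counter)
    for session in uploads:
        for key, topic in session:
            key_counts[key] += 1            # 1) hot key identifiers
            topic_counts[topic] += 1        # 2) popular interest topics
            keys_by_topic[topic][key] += 1  # 3) hot key identifiers per topic
    return key_counts, topic_counts, keys_by_topic


# Hypothetical uploads from three terminal devices.
uploads = [
    [("NBA", "Sports"), ("Real economy", "Finance"), ("COVID-19", "Current affairs")],
    [("NBA", "Sports"), ("Football", "Sports"), ("COVID-19", "Current affairs")],
    [("NBA", "Sports"), ("Football", "Sports"), ("Real economy", "Finance")],
]
key_counts, topic_counts, keys_by_topic = summarize(uploads)
hot_key = key_counts.most_common(1)[0][0]
hot_topic = topic_counts.most_common(1)[0][0]
hot_key_in_sports = keys_by_topic["Sports"].most_common(1)[0][0]
```

For these sample uploads, "NBA" is the most frequent key identifier overall and within "Sports", and "Sports" is the most frequent interest topic, mirroring the examples in items 1) to 3).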
Take as an example the data statistics module 304 collecting statistics on the correspondence between interest topics and key identifiers. After receiving a large volume of desensitized user operation behavior data uploaded by the privacy protection modules 303 of one or more terminal devices, the data statistics module 304 determines the mappings between key identifiers and interest topics from the entries contained in each piece of desensitized user operation behavior data. Using the different interest topics as categories, it determines the set of key identifiers under each interest topic, thereby obtaining a mapping between each interest topic and the set of key identifiers it contains. For example, continuing the example in Table 6, the aggregation by the data statistics module 304 yields the mapping in Table 7:
Table 7
Sports | Finance | Current affairs
Basketball | Real economy | COVID-19
https://china.nba.com/ |  | 
Further, after obtaining the aggregated statistics of the desensitized user operation behavior data, the data analysis module 305 can train a topic interest model. For example, the topic interest model may be built with an algorithm such as the latent Dirichlet allocation (LDA) topic model; this application does not limit the choice.
The topic interest model can generate a topic interest table, a radar chart of user interest topics, and the like to determine user interests. The topic interest table may take the form shown in Table 7: each column represents one interest topic and contains one or more key identifiers, which are obtained by analyzing the massive desensitized user operation behavior data. In addition, each interest topic column can be associated with a weight value that represents how interested users are in that topic. The weight value can be obtained through statistics and analysis of a large volume of desensitized user operation behavior data; for example, an interest topic that appears more often in that data has a larger weight value. The larger the weight value associated with an interest topic, the more interested most users are in that topic.
Alternatively, FIG. 5 is an example of a radar chart of user interest topics according to an embodiment of this application. Assume that, by analyzing the desensitized user operation behavior data, the data analysis module 305 learns that "Sports" is the interest topic users search for and browse most, followed by "Current affairs", with "Finance" less frequent. It then generates the radar chart shown in FIG. 5, which reflects the degree of interest in each interest topic as indicated by the desensitized user operation behavior.
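One simple way to derive the per-topic weight values from occurrence counts is to normalize the counts so that they sum to 1. The source only states that more frequent topics receive larger weights, so the normalization and the sample counts below are assumptions.

```python
from collections import Counter
from typing import Dict


def topic_weights(topic_counts: Counter) -> Dict[str, float]:
    """Turn occurrence counts into weights that sum to 1 (an assumed scheme)."""
    total = sum(topic_counts.values())
    if total == 0:
        return {}
    return {topic: count / total for topic, count in topic_counts.items()}


# Hypothetical counts consistent with the ordering described for FIG. 5.
counts = Counter({"Sports": 5, "Current affairs": 3, "Finance": 2})
weights = topic_weights(counts)
ranking = sorted(weights, key=weights.get, reverse=True)
```

With these counts, the ranking is "Sports", then "Current affairs", then "Finance". The same values could serve as the axis values of a radar chart.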
Stage 6: The data analysis module 305 on the server side sends the topic interest table to the data calibration module 302 on the terminal device side, to enable content recommendation based on user interest.
In one optional implementation, the topic interest table generated by the data analysis module 305 on the server side can be used by the data calibration module 302 on the terminal device side to preprocess user operation behavior data and obtain preprocessed user operation behavior data. As described in stage 2 above, the data calibration module 302 can use the topic interest table obtained from the server when preprocessing the collected user operation behavior data, so that the preprocessed data contains both key identifiers and interest topics, which in turn yields more accurate desensitized user operation behavior data.
In another optional implementation, the topic interest table generated by the data analysis module 305 on the server side can also be used by the privacy protection module 303 on the terminal device side to process the preprocessed user operation behavior data with a differential privacy algorithm and obtain desensitized user operation behavior data. As described in stage 3 above, the privacy protection module 303 can randomly replace content in the preprocessed user operation behavior data based on the topic interest table, thereby protecting the privacy of the user operation behavior data.
In yet another optional implementation, the topic interest table generated by the data analysis module 305 on the server side can be used by the real-time recommendation module 306 on the terminal device side for real-time content recommendation, as described in stage 7 below; it is not detailed here.
Stage 7: The real-time recommendation module 306 on the terminal device determines user interests according to the user operation behavior data collected in real time by the data collection module 301 and the topic interest table obtained from the server side, and makes real-time recommendations according to those user interests.
For example, real-time recommendation by the terminal device may involve the following scenarios:
Scenario 1: The user opens the application as a new user. In this scenario, the terminal device has no stored historical user interests of this user for the application.
In implementation, after the terminal device detects the user entering the application and before it detects any other user operation behavior in the application, it can recommend content according to the weight values of the interest topics in the topic interest table. For example, assume that the terminal device detects that the user has opened the browser, that it has not yet detected any user operation behavior in the browser, and that the topic interest table contains the interest topics "Sports", "Current affairs", and "Finance". The terminal device can obtain content related to these interest topics from the server side and recommend it through the information feed on the browser home page; the larger the weight value of an interest topic, the higher the proportion of recommended content related to it. An example topic interest table is shown in Table 8a:
Table 8a
Sports | Finance | Current affairs
Basketball | Real economy | COVID-19
https://china.nba.com/ |  | 
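Recommending content in proportion to topic weights, as in the new-user case above, could be done by splitting a fixed number of feed slots across topics. The sketch below uses the largest-remainder method, which is one possible choice and not something the source prescribes; the weights and slot counts are hypothetical because Table 8a lists no numeric values.

```python
from typing import Dict


def allocate_slots(weights: Dict[str, float], total_slots: int) -> Dict[str, int]:
    """Split total_slots across topics in proportion to their weights,
    handing leftover slots to the largest fractional remainders."""
    total_weight = sum(weights.values())
    if total_weight <= 0 or total_slots <= 0:
        return {topic: 0 for topic in weights}
    exact = {t: total_slots * w / total_weight for t, w in weights.items()}
    slots = {t: int(share) for t, share in exact.items()}  # floor of each share
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(weights, key=lambda t: exact[t] - slots[t], reverse=True)
    for topic in by_remainder[:leftover]:
        slots[topic] += 1
    return slots


# Hypothetical weights for the three topics of Table 8a.
server_weights = {"Sports": 5, "Current affairs": 3, "Finance": 2}
feed_plan = allocate_slots(server_weights, total_slots=10)
```

With a 10-item feed and these assumed weights, "Sports" receives 5 slots, "Current affairs" 3, and "Finance" 2. For totals that do not divide evenly, the slot counts still sum to the requested total.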
Further, once the terminal device detects the user's real-time operation behavior in the application, it can determine user interests by combining that real-time operation behavior with the topic interest table obtained from the server side, and then make real-time recommendations according to those user interests. For example, if the real-time operation behavior is the user entering the search term "Dragon fruit" in the browser, the terminal device combines the key identifier "Dragon fruit" and its corresponding interest topic "Fruit" with the topic interest table to generate the user interests of that user (as shown in Table 8b below).
In a specific implementation, the terminal device can associate different weight values with interest topics according to the type of user operation behavior. For example, because a search operation better reflects the user's personal interest, a higher weight value can be assigned to the interest topic "Fruit", so that the terminal device recommends a higher proportion of "Fruit"-related content. As another example, the terminal device detects that, while browsing the information feed on the home page, the user clicks an information feed item, and it determines that the key identifier of that item is "Leo" and the corresponding interest topic is "Constellation"; this can then be added to the user interests. Because browsing the information feed generally indicates a momentary interest, a lower weight can be assigned to the interest topic "Constellation", so that the terminal device recommends a lower proportion of "Constellation"-related content.
Note that, as the number of the user's operations in the browser increases, the weight of the corresponding interest topic can also be updated according to the number of operations. For example, if the user later browses "Constellation"-related content in the browser many more times, the weight value assigned to "Constellation" can be increased as the number of views grows. User interests can be represented as a personalized topic interest table, as shown in Table 8b:
Table 8b
Fruit | Sports | Finance | Current affairs | Constellation
Dragon fruit | Basketball | Real economy | COVID-19 | Leo
 | https://china.nba.com/ |  |  | 
The topic interest table shown in Table 8b is only one possible example and does not limit the form in which user interests are represented. In implementing this application, the interest topics shown in Table 8b can also be sorted from left to right by weight value, and the key identifiers under each interest topic can be sorted from top to bottom by weight value. In other words, the weight value associated with the interest topic "Fruit" in Table 8b is currently the largest, so the terminal device displays the highest proportion of recommended content for "Fruit".
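A personalized topic interest table with behavior-dependent weights, count-based updates, and weight-ordered columns might be modeled as follows. This is a sketch under stated assumptions: the class `PersonalInterestTable`, the increments in `BEHAVIOR_INCREMENT`, and the initial server weights are all hypothetical, chosen so that the resulting column order matches Table 8b.

```python
from typing import Dict, List

# Hypothetical weight increments per behavior type; the source only says that
# a search reflects personal interest more strongly than feed browsing.
BEHAVIOR_INCREMENT: Dict[str, float] = {
    "search": 3.0,
    "web_browsing": 2.0,
    "feed_browsing": 1.0,
}


class PersonalInterestTable:
    """Per-user interest topics and key identifiers, each with a weight."""

    def __init__(self, server_topic_weights: Dict[str, float]) -> None:
        # Start from the topic weights of the server-side topic interest table.
        self.topic_weight: Dict[str, float] = dict(server_topic_weights)
        self.key_weight: Dict[str, Dict[str, float]] = {}

    def record(self, key: str, topic: str, behavior_type: str) -> None:
        """Raise the weights of a topic and key identifier after one operation."""
        step = BEHAVIOR_INCREMENT.get(behavior_type, 1.0)
        self.topic_weight[topic] = self.topic_weight.get(topic, 0.0) + step
        keys = self.key_weight.setdefault(topic, {})
        keys[key] = keys.get(key, 0.0) + step

    def ranked_topics(self) -> List[str]:
        """Interest topics from largest to smallest weight (left to right)."""
        return sorted(self.topic_weight, key=self.topic_weight.get, reverse=True)

    def ranked_keys(self, topic: str) -> List[str]:
        """Key identifiers of one topic from largest to smallest weight."""
        keys = self.key_weight.get(topic, {})
        return sorted(keys, key=keys.get, reverse=True)


table = PersonalInterestTable({"Sports": 2.5, "Finance": 2.0, "Current affairs": 1.5})
table.record("Dragon fruit", "Fruit", "search")
table.record("Leo", "Constellation", "feed_browsing")
columns = table.ranked_topics()
```

Under these assumed numbers, `columns` is ordered "Fruit", "Sports", "Finance", "Current affairs", "Constellation". Repeated calls to `record` for "Constellation" would raise its weight and eventually move it to the front.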
In addition, after the terminal device detects that the user has left the application, the user interests obtained from the user's real-time operation behavior in this session can be stored as the user's historical user interests, to serve as a reference for determining user interests the next time the user enters the application.
Scenario 2: The user opens the application as a returning user. In this scenario, the terminal device generally has stored historical user interests of this user for the application. Note that if the terminal device detects that the user is a returning user but has no stored historical user interests for that user, it can determine user interests in the manner described in Scenario 1.
In implementation, after the terminal device detects the user's instruction to start the application and before it detects any other user operation behavior in the browser, it can recommend content by combining the historical user interests with the topic interest table obtained from the server side. At this point, the user interests determined by the terminal device can be given by a personalized topic interest table such as Table 8b. For example, "Fruit" and "Constellation" in Table 8b are historical user interests, whereas "Sports", "Finance", and "Current affairs" come from the topic interest table obtained from the server side.
Further, once the terminal device detects the user's real-time operation behavior in the application, it can update the user interests by combining that real-time operation behavior with the personalized topic interest table above, and then make real-time recommendations according to the updated user interests. For example, if the terminal device detects from the user's real-time operation behavior that the user's views of the interest topic "Constellation" have increased significantly, and that the related key identifiers browsed also include "Capricorn" and "Horoscope", the terminal device increases the weight value assigned to "Constellation" and updates the key identifiers under "Constellation". The personalized topic interest table corresponding to the updated user interests may be as shown in Table 8c:
Table 8c
Constellation | Fruit | Sports | Finance | Current affairs
Leo | Dragon fruit | Basketball | Real economy | COVID-19
Capricorn |  | https://china.nba.com/ |  | 
Horoscope |  |  |  | 
Note that, once the terminal device has determined the user interests, neither the recommended content nor the way it is obtained is limited. For example, the terminal device may look up recommended content corresponding to the user interests in its locally cached content, or it may obtain such recommended content from a content providing server that supplies it.
The content recommendation approach provided by this application has the following benefits. First, it can avoid the cold-start problem of an application's information feed on the terminal device. This problem generally arises when the user opens the application for the first time: because the terminal device has not stored any historical user interests, it cannot recommend an information feed. With the approach provided by this application, the terminal device can determine the user's real-time interest from the topic interest table obtained from the server side, combined with the user's real-time operation behavior, and can therefore recommend content according to that real-time interest.
Second, in this application the user's real-time interest is generated on the terminal device side. In the related art, user interests are generated on the server side from big data, and largely identical content is then recommended to different users directly from those server-generated interests. By contrast, the user interest-based content recommendation method provided by this application can update the user interest promptly according to the user's real-time operations, so that the recommended content better reflects what the user is currently interested in.
To better explain the overall flow of the method provided by this application, the method is further described below with reference to FIG. 6. The method can be divided into two main parts: a sampling part and a recommendation part. The terminal device can collect each operation of the user to obtain user operation behavior data. The terminal device then processes the user operation behavior data and sends it to the server side, so that the server side generates a topic interest table from the large amount of desensitized user operation behavior data it collects. The server can also send the topic interest table to the terminal device, so that the terminal device can recommend content according to it. FIG. 6 is a schematic flowchart of user interest-based content recommendation provided by an embodiment of this application, and includes the following steps:
1. Sampling part
S601. The terminal device detects the user's instruction to start the target application and starts the target application. Optionally, the user may enter the target application by tapping the application icon on the home screen of the terminal device, by waking up the target application by voice, or through a shortcut entry to the target application included in any display interface of the terminal device; this application does not limit this. For example, assuming the target application is a browser, the terminal device may detect the user tapping the browser icon (see the content shown at 1 in FIG. 7); or it may receive a wake-up phrase such as "open the browser", or detect the user tapping the browser shortcut entry in the pull-down interface (see the content shown at 2 in FIG. 7). In each case the terminal device can open the browser.
S602. The terminal device collects at least one piece of operation data generated by the user's operations on the target application. For example, as shown in FIG. 4, after entering the target application, the terminal device collects the user's operation behavior in real time.
S603. When the terminal device detects the user's instruction to exit the target application, it closes the target application. For example, the user may exit the target application by closing its display interface and returning to the home screen of the terminal device, or by terminating it through background cleanup; the target application may also be forced to exit because it has stopped responding. This application does not limit this.
S604. The terminal device stores the at least one piece of operation data collected during S601 to S603 as one group of user operation behavior data.
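Steps S601 to S604 amount to grouping everything the user does between launch and exit into one record. A minimal sketch of that grouping is shown below; the class name `SessionRecorder`, its method names, and the operation strings are hypothetical, and skipping sessions with no operations is an assumption based on the "at least one" wording.

```python
class SessionRecorder:
    """Groups operations between app launch (S601) and exit (S603) into one record (S604)."""

    def __init__(self):
        self.stored_groups = []   # one list of operations per completed session
        self._current = None      # None while the target application is closed

    def on_launch(self):
        # S601: the target application starts, so open a new group.
        self._current = []

    def on_operation(self, op):
        # S602: collect operations only while the application is running.
        if self._current is not None:
            self._current.append(op)

    def on_exit(self):
        # S603 + S604: close the app and store the group if it has any operation.
        if self._current:
            self.stored_groups.append(self._current)
        self._current = None


recorder = SessionRecorder()
recorder.on_launch()
recorder.on_operation("search:Leo")
recorder.on_operation("open:article")
recorder.on_exit()
```

Each entry of `recorder.stored_groups` is then one group of user operation behavior data, ready for the preprocessing in S605.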
S605. The terminal device preprocesses the user operation behavior data to obtain preprocessed user operation behavior data whose sequence length equals a preset value. For example, because user operation behavior can vary widely, each group of user operation behavior data may have a different sequence length, and the terminal device can resample groups of different lengths based on the preset value. Specifically, if the sequence length of the user operation behavior data is less than the preset value, default user operation behaviors are appended to it to obtain user operation behavior data of the specified sequence length; the default user operation behaviors may be obtained from the topic interest table or may be predefined. If the sequence length is greater than the preset value, random truncation sampling is applied to the user operation behavior data to obtain data of the specified sequence length. Preprocessing the sequence length in this way helps avoid two problems: sequences too short to fully reflect the user's operation behavior, and sequences so long that the sample data becomes excessively large. It should be noted that the preset value may be defined by the terminal device, derived from historical experience, or determined according to other rules; this application does not limit this.
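The length normalization in S605 can be sketched as below. The source does not define "random truncation sampling" precisely; this sketch assumes it means keeping a randomly positioned contiguous window of the preset length. The function name `fit_to_length` and the `default_op` placeholder are also assumptions.

```python
import random


def fit_to_length(ops, preset, default_op, rng=None):
    """Pad or randomly truncate a list of operations to exactly `preset` items."""
    rng = rng or random.Random()
    if len(ops) < preset:
        # Too short: append default user operation behaviors.
        return list(ops) + [default_op] * (preset - len(ops))
    if len(ops) > preset:
        # Too long: keep a random contiguous window (assumed interpretation).
        start = rng.randint(0, len(ops) - preset)
        return list(ops[start:start + preset])
    return list(ops)


padded = fit_to_length(["search:Leo", "open:article"], 4, "DEFAULT")
window = fit_to_length(list("abcdefg"), 3, "DEFAULT", random.Random(0))
```

Passing a seeded `random.Random` makes the truncation reproducible for testing; on a device the default generator would normally be used.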
S606. The terminal device desensitizes the preprocessed user operation behavior data to obtain desensitized user operation behavior data. In one possible design, the terminal device randomly replaces content in the preprocessed user operation behavior data according to the topic interest table, which blurs the interest characteristics of the user's operation behavior to some extent. In another possible design, the terminal device additionally strips user information from the blurred user operation behavior data to obtain the desensitized data, thereby avoiding leakage of the user's privacy.
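The two designs in S606 can be combined in a short sketch. It swaps a key identifier for another one from the same interest topic with some probability, in the spirit of randomized response, and then drops identifying fields. It is not claimed to meet any formal privacy guarantee. The record layout, the field names `user_id` and `device_id`, the `flip_prob` parameter, and the extra key identifier "football" are all hypothetical.

```python
import random


def desensitize(record, topic_table, flip_prob=0.3, rng=None):
    """Blur key identifiers within their topic, then strip user information.

    record:      {"user_id": ..., "device_id": ..., "ops": [(topic, key), ...]}
    topic_table: {topic: [key identifier, ...]} taken from the topic interest table
    """
    rng = rng or random.Random()
    blurred = []
    for topic, key in record["ops"]:
        candidates = topic_table.get(topic, [])
        # With probability flip_prob, replace the key with one from the same topic.
        if candidates and rng.random() < flip_prob:
            key = rng.choice(candidates)
        blurred.append((topic, key))
    # Only the operation sequence is returned; identifying fields are dropped.
    return {"ops": blurred}


topic_table = {"sports": ["basketball", "football"], "zodiac": ["Leo", "Capricorn"]}
record = {
    "user_id": "u1",
    "device_id": "d1",
    "ops": [("sports", "basketball"), ("zodiac", "Leo"), ("fruit", "pitaya")],
}
upload = desensitize(record, topic_table, rng=random.Random(1))
```

Replacing only within the same topic keeps the topic-level signal that the server needs while hiding which specific item the user actually viewed.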
The topic interest table used in S605 and S606 above may be one that the terminal device obtained from the server side and stored. S6050 is therefore performed before S605 and S606, but the order of S6050 relative to S601 to S604 is not limited.
S6050. The terminal device obtains the topic interest table generated on the server side. Optionally, the server side may send the topic interest table to the terminal device automatically, in real time, periodically, or whenever the table is updated. Alternatively, the terminal device may send a request to the server side, and the server delivers the latest topic interest table to the terminal device after receiving the request.
S607. The terminal device uploads the desensitized user operation behavior data to the server side. The upload is motivated by the limited computing power of the terminal device and by the fact that the server can train the topic interest model on more user operation behavior data, which helps produce a more accurate and comprehensive topic interest table. In this application, the terminal device processes the user operation behavior data as in S605 and S606 and uploads the resulting desensitized data to the server. Because the terminal device processes the collected data before uploading it, the server side does not collect the user's private data, which improves the security of the user operation behavior data.
S608. The server compiles statistics on the desensitized user operation behavior data uploaded by one or more terminal devices. The server may be connected to one or more terminal devices; FIG. 6 uses a single terminal device as an example, and the interaction between other terminal devices and the server is similar. For example, the statistics can reveal the currently popular interest topics and the key identifiers included in each interest topic.
S609. The server generates or updates the topic interest table based on the aggregated, desensitized user operation behavior data. For example, the server may use the aggregated user operation behavior data as training samples and perform unsupervised learning on user interests, obtaining a topic interest table that reflects the mapping between interest topics and key identifiers. The topic interest table may contain multiple interest topics and the key identifiers included in each. It may also associate a weight value with each interest topic and with each key identifier under each topic, where the magnitude of the weight reflects which interest topics and key identifiers most users are interested in. For example, if the interest topic "sports" has the largest weight value in the topic interest table, most users are currently more interested in sports; further, if the key identifier "basketball" has the largest weight value under "sports", most users are currently more interested in the keyword basketball.
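To make the shape of the topic interest table concrete, the sketch below builds one from uploaded operation sequences. It deliberately substitutes plain frequency counting for the unsupervised learning that the source describes, so it illustrates the output format (topics, key identifiers, weights), not the model. The function name `build_topic_interest_table` and the normalization of weights to fractions are assumptions.

```python
from collections import Counter, defaultdict


def build_topic_interest_table(uploads):
    """Aggregate desensitized uploads into {topic: {"weight", "keys"}}.

    uploads: iterable of operation sequences, each a list of (topic, key) pairs.
    Weights are relative frequencies; this stands in for the trained model.
    """
    topic_counts = Counter()
    key_counts = defaultdict(Counter)
    for ops in uploads:
        for topic, key in ops:
            topic_counts[topic] += 1
            key_counts[topic][key] += 1
    total = sum(topic_counts.values())
    table = {}
    for topic, n in topic_counts.most_common():
        table[topic] = {
            "weight": n / total,  # share of all observed operations
            "keys": {k: c / n for k, c in key_counts[topic].most_common()},
        }
    return table


uploads = [
    [("sports", "basketball"), ("sports", "basketball"), ("finance", "real economy")],
    [("sports", "https://china.nba.com/"), ("current affairs", "COVID-19")],
]
topic_interest_table = build_topic_interest_table(uploads)
```

With these illustrative uploads, "sports" receives the largest topic weight and "basketball" the largest key weight under it, matching the example in S609.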
S610. The server sends the topic interest table to the one or more terminal devices.
2. Recommendation part
S611. The terminal device detects the user's instruction to start the target application and starts the target application. For example, as shown in FIG. 7, the terminal device detects the user opening the browser again. It should be noted that, for clarity, S601 to S603 and S611 to S613 represent the terminal device's processing in response to user operations in two different scenarios. While collecting user operation behavior in real time in S601 to S603, the terminal device can also recommend content according to user interest; while recommending content according to user interest in S611 to S613, it can also collect user operation behavior in real time.
S612. The terminal device determines the user interest. For example, this may involve the following possible scenarios:
Scenario A: the user opens the browser for the first time. In this scenario, no historical user interest exists in the browser, so the terminal device can recommend the information feed on the browser home page according to the topic interest table, that is, display the first recommendation interface. For example, according to Table 7 above, the interest topics the terminal device obtained from the server side include "sports", "finance", and "current affairs"; the key identifiers under "sports" are basketball and https://china.nba.com/, the key identifier under "finance" is real economy, and the key identifier under "current affairs" is COVID-19. When the user opens the browser for the first time, the browser home page can then be as shown in FIG. 8a. As FIG. 8a shows, the home page contains a recommended feed item for the basketball association home page related to the key identifier "basketball", an entry to the official NBA China website related to "https://china.nba.com/", an article related to "real economy", and a hot-topic explainer related to "COVID-19". By scrolling down the home page, the user can also browse more recommended content related to the interest topics in the topic interest table (not shown in FIG. 8a).
Scenario B: the user is not opening the browser for the first time and has not yet performed any user operation. In this scenario, historical user interests may be stored in the browser, and the terminal device can recommend the information feed on the browser home page according to the historical user interests and the topic interest table. For example, the topic interest table obtained as in Table 8b above contains, in addition to "sports", "finance", and "current affairs" from Scenario A, the historical user interests "fruit" and "zodiac". When the user opens the browser, the browser home page can then be as shown in FIG. 8b.
Scenario C: the user is not opening the browser for the first time and has already performed some user operations. In this scenario, the terminal device can recommend the information feed on the browser home page according to the user's real-time operation behavior, the historical user interests, and the topic interest table, that is, display the first recommendation interface. For example, per Table 8c above, the updated personalized topic interest table derived from the user's real-time operation behavior indicates that the user is more interested in the interest topic "zodiac". The browser home page refreshed by the terminal device according to the updated table can then be as shown in FIG. 8c: the proportion of recommended content related to "zodiac" increases, and that content appears nearer the top of the home page.
In the above example, the second recommendation interface may be displayed when the terminal device detects the user's instruction to refresh the first recommendation interface. For example, when the user swipes down from the top of the terminal device's screen within the application, this indicates that the user wants to refresh the current interface. The terminal device can then display the recommended content corresponding to the updated user interest, which can also be understood as displaying the second recommendation interface.
It should be noted that "user interest" does not refer to one particular interest; it can denote a set of multiple interest topics, each associated with a weight value. The weight value reflects the user's degree of interest: the larger the weight of an interest topic, the more interested the user is in that topic.
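Scenarios A to C differ only in which sources feed the user interest, so they can be expressed as one merge over weighted topic sets. The sketch below is one possible reading: the function name `determine_user_interest`, the choice to add weights from different sources, and every numeric weight are assumptions not stated in the source.

```python
def determine_user_interest(topic_table, history=None, realtime=None):
    """Merge topic -> weight mappings into a single user interest mapping.

    Scenario A: only topic_table (first launch, no history).
    Scenario B: topic_table plus stored historical interests.
    Scenario C: topic_table, history, and real-time behavior from this session.
    """
    interest = dict(topic_table)
    for source in (history, realtime):
        for topic, weight in (source or {}).items():
            # Assumed combination rule: weights from different sources add up.
            interest[topic] = interest.get(topic, 0.0) + weight
    return interest


server_table = {"sports": 0.5, "finance": 0.3, "current affairs": 0.2}
history = {"fruit": 0.4, "zodiac": 0.4}
realtime = {"zodiac": 0.5}

scenario_a = determine_user_interest(server_table)
scenario_b = determine_user_interest(server_table, history)
scenario_c = determine_user_interest(server_table, history, realtime)
```

In `scenario_c`, "zodiac" ends up with the largest weight, which corresponds to the refreshed home page in FIG. 8c giving that topic more space.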
S613. The terminal device recommends content according to the user interest. Optionally, the terminal device may obtain content corresponding to the user interest from the local cache, or may send a request to the server to obtain such content from the server.
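The two acquisition paths in S613 can be combined as a cache-first lookup with a server fallback. The source only says that either path may be used, so the ordering here is an assumption, as are the function name `get_recommendations` and the `fetch_remote` callback, which stands in for whatever request the terminal device would send to the server.

```python
def get_recommendations(topics, cache, fetch_remote):
    """Return {topic: [content item, ...]}, preferring locally cached content.

    cache:        dict mapping topic -> list of cached content items
    fetch_remote: callable(topic) -> list of items; hypothetical server request
    """
    results = {}
    for topic in topics:
        items = cache.get(topic)
        if not items:
            # Cache miss: ask the server and remember the answer locally.
            items = fetch_remote(topic)
            cache[topic] = items
        results[topic] = items
    return results


def fake_fetch(topic):
    # Stand-in for a network call, used only to make the sketch runnable.
    return [f"{topic}-article"]


local_cache = {"sports": ["cached NBA story"]}
feed = get_recommendations(["sports", "zodiac"], local_cache, fake_fetch)
```

Injecting the fetch function keeps the sketch independent of any particular network library or server interface.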
Based on the same technical concept, FIG. 9 shows a terminal device 900 provided by an embodiment of this application. The terminal device 900 includes one or more processors 901, one or more memories 902, a communication interface 903, and one or more computer programs 904; these components may be connected through one or more communication buses 905. The communication interface 903 is used to communicate with other devices (such as terminal devices); for example, the communication interface may be a transceiver. The one or more computer programs 904 are stored in the memory 902 and configured to be executed by the one or more processors 901. The one or more computer programs 904 include instructions that can be used to perform the following steps:
Collect multiple pieces of user operation behavior data input by the user while using the target application one or more times within a set duration; desensitize the collected user operation behavior data, where desensitization filters out the user's private data from the user operation behavior data; and send the desensitized user operation behavior data to the server, so that the server analyzes it to obtain a topic interest table for the user's use of the target application.
For example, collecting the user operation behavior data input by the user during one use of the target application is implemented as follows: when the user's instruction to start the target application is detected, start the target application; after the target application starts, collect at least one piece of operation data generated by the user's operations on the target application; when the user's instruction to exit the target application is detected, close the target application; and store the at least one piece of operation data collected between the start and the closing of the target application as one group of user operation behavior data.
Optionally, desensitizing the collected user operation behavior data is implemented as follows: for one or more of the pieces of user operation behavior data, randomly replace user operation behavior data within the same interest topic based on a differential privacy algorithm, where the interest topic is determined according to the topic interest table; and strip the user information contained in the user operation behavior data.
In a possible embodiment, before the random replacement of user operation behavior data within the same interest topic based on the differential privacy algorithm, the sequence length of each piece of user operation behavior data is determined, and the user operation behavior data is truncated or padded according to a preset value to obtain user operation behavior data of the specified sequence length.
Truncating or padding the user operation behavior data according to the preset value to obtain data of the specified sequence length is implemented as follows: if the sequence length of the user operation behavior data is less than the preset value, append predefined user operation behavior data of the target length to obtain user operation behavior data of the specified sequence length; if the sequence length is greater than the preset value, truncate the user operation behavior data by the target length to obtain user operation behavior data of the specified sequence length. Here, the target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value.
Based on the same technical concept, FIG. 9 may also show a server 900 provided by an embodiment of this application. The server 900 includes one or more processors 901, one or more memories 902, a communication interface 903, and one or more computer programs 904; these components may be connected through one or more communication buses 905. The communication interface 903 is used to communicate with other devices (such as terminal devices); for example, the communication interface may be a transceiver. The one or more computer programs 904 are stored in the memory 902 and configured to be executed by the one or more processors 901. The one or more computer programs 904 include instructions that can be used to perform the following steps:
Receive desensitized user operation behavior data sent by one or more terminal devices, where the desensitized data is obtained by the one or more terminal devices collecting multiple pieces of user operation behavior data input by the user while using the target application one or more times within a set duration and then desensitizing the collected data, and where desensitization filters out the user's private data from the user operation behavior data; analyze the desensitized user operation behavior data to obtain a topic interest table for the user's use of the target application; and send the topic interest table to the one or more terminal devices.
For example, analyzing the desensitized user operation behavior data to obtain the topic interest table for the user's use of the target application is implemented as follows: input the desensitized user operation behavior data into a pre-built topic interest model to perform unsupervised learning on that data, and obtain the topic interest table output by the pre-built topic interest model.
Based on the same technical concept, FIG. 9 may also show a terminal device 900 provided by an embodiment of this application. The terminal device 900 includes one or more processors 901, one or more memories 902, a communication interface 903, and one or more computer programs 904; these components may be connected through one or more communication buses 905. The communication interface 903 is used to communicate with other devices (such as terminal devices); for example, the communication interface may be a transceiver. The one or more computer programs 904 are stored in the memory 902 and configured to be executed by the one or more processors 901. The one or more computer programs 904 include instructions that can be used to perform the following steps:
Receive the topic interest table sent by the server, where the topic interest table is obtained by the server analyzing desensitized user operation behavior data; the desensitized data is obtained by one or more terminal devices collecting multiple pieces of user operation behavior data input by the user while using the target application one or more times within a set duration and then desensitizing the collected data, and desensitization filters out the user's private data from the user operation behavior data; and, when the user's instruction to start the target application is detected, start the target application and display a first recommendation interface that contains at least one item of recommended content, where the at least one item of recommended content is determined according to the topic interest table.
For example, starting the target application and displaying the first recommendation interface is implemented as follows: start the target application; display the first recommendation interface after the target application starts; and take one or more interest topics in the topic interest table as the user interest, obtain at least one item of recommended content according to the user interest, and display the obtained content in the first recommendation interface. Each interest topic has an associated weight value; the larger the weight value associated with an interest topic, the higher the proportion of content related to that topic in the recommended content.
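The rule that a larger weight yields a higher proportion of related content can be illustrated by dividing a fixed number of feed slots in proportion to the weights. The source does not specify how the proportion is computed; this sketch uses the largest-remainder method, and the function name `allocate_slots` and the weights shown are assumptions.

```python
def allocate_slots(weights, n_slots):
    """Split n_slots feed positions across topics in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0 or n_slots <= 0:
        return {topic: 0 for topic in weights}
    exact = {t: n_slots * w / total for t, w in weights.items()}
    slots = {t: int(x) for t, x in exact.items()}  # round each share down first
    leftover = n_slots - sum(slots.values())
    # Hand the remaining slots to the topics with the largest fractional parts.
    by_remainder = sorted(weights, key=lambda t: exact[t] - slots[t], reverse=True)
    for topic in by_remainder[:leftover]:
        slots[topic] += 1
    return slots


# Illustrative integer weights; the source gives no concrete values.
layout = allocate_slots({"zodiac": 5, "sports": 3, "finance": 2}, 10)
```

The same call with a smaller `n_slots` still fills every slot, because the slots lost to rounding down are redistributed.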
In a possible embodiment, after the target application is started and the first recommendation interface is displayed, one or more pieces of user operation behavior data input by the user while using the target application are received and collected; when the user's instruction to refresh the first recommendation interface is detected, a second recommendation interface is displayed, where the recommended content in the second recommendation interface is determined according to the one or more pieces of user operation behavior data and the topic interest table.
In a possible design, displaying the second recommendation interface is specifically implemented as follows: determine one or more corresponding interest topics according to the one or more pieces of user operation behavior data, and assign an associated weight value to each of these interest topics; take the one or more interest topics corresponding to the user operation behavior data, together with the one or more interest topics included in the topic interest table, as the user interests, obtain at least one item of recommended content according to the user interests, and display the obtained content in the second recommendation interface. Each interest topic included in the topic interest table has an associated weight value, and the larger the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
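One way to picture the combination of in-session topics with the topic interest table is sketched below. The application does not say how session topics are derived or how the two sets of weights are combined, so the event-to-topic mapping `event_topic`, the count-based session weights, and the linear blend controlled by `session_share` are all assumptions made for illustration.

```python
from collections import Counter


def session_topic_weights(session_events: list[str], event_topic: dict[str, str]) -> dict[str, float]:
    """Weight each topic by its share of the events seen in the current session.

    event_topic is a hypothetical lookup from an operation event to a topic.
    """
    counts = Counter(event_topic[e] for e in session_events if e in event_topic)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}


def merge_interests(table_weights: dict[str, float],
                    session_weights: dict[str, float],
                    session_share: float = 0.5) -> dict[str, float]:
    """Blend table weights and session weights into one set of user interests.

    session_share is an assumed tuning knob: 0 ignores the session,
    1 ignores the topic interest table.
    """
    topics = set(table_weights) | set(session_weights)
    return {
        t: (1 - session_share) * table_weights.get(t, 0.0)
           + session_share * session_weights.get(t, 0.0)
        for t in topics
    }


if __name__ == "__main__":
    table = {"sports": 0.5, "travel": 0.5}
    events = ["view_match", "view_match", "open_review"]
    mapping = {"view_match": "sports", "open_review": "technology"}
    print(merge_interests(table, session_topic_weights(events, mapping)))
```

The merged weights can then drive the same proportional selection used for the first recommendation interface.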
In a possible design, obtaining the at least one item of recommended content according to the user interests is specifically implemented as follows: search locally cached content for recommended content corresponding to the user interests; and/or obtain recommended content corresponding to the user interests from a content providing server that provides such content.
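The "local cache and/or content providing server" lookup can be sketched as follows. The cache layout (topic to list of item identifiers), the callable `fetch_from_server`, and the cache-first ordering are assumptions for illustration; the application allows either source alone or both together.

```python
from typing import Callable


def get_recommendations(topic: str,
                        count: int,
                        local_cache: dict[str, list[str]],
                        fetch_from_server: Callable[[str, int], list[str]]) -> list[str]:
    """Return up to `count` items for a topic, using the local cache first.

    fetch_from_server(topic, n) stands in for a request to a content
    providing server; its signature is hypothetical.
    """
    # Take whatever the local cache already holds for this topic.
    items = list(local_cache.get(topic, []))[:count]

    # Ask the server only for the items that are still missing.
    missing = count - len(items)
    if missing > 0:
        for item in fetch_from_server(topic, missing):
            if item not in items:
                items.append(item)
    return items[:count]


if __name__ == "__main__":
    cache = {"sports": ["s1", "s2"]}

    def fake_server(topic: str, n: int) -> list[str]:
        # Stand-in for a network call to the content providing server.
        return [f"{topic}-remote-{i}" for i in range(n)]

    print(get_recommendations("sports", 4, cache, fake_server))
```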
From the foregoing description of the implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, the division into the foregoing functional modules is used merely as an example. In practical applications, the foregoing functions may be allocated to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to implement all or some of the functions described above. For the specific working processes of the system, apparatus, and units described above, refer to the corresponding processes in the foregoing method embodiments; details are not described herein again.
The functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes various media that can store program code, such as a flash memory, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of the embodiments of this application, but the protection scope of the embodiments of this application is not limited thereto. Any variation or replacement within the technical scope disclosed in the embodiments of this application shall fall within the protection scope of the embodiments of this application. Therefore, the protection scope of the embodiments of this application shall be subject to the protection scope of the claims.

Claims (17)

  1. A content recommendation method based on user interests, characterized in that the method is applied to a terminal device and comprises:
    collecting multiple pieces of user operation behavior data input by a user when the user uses a target application one or more times within a set duration;
    performing desensitization processing on the collected multiple pieces of user operation behavior data, wherein the desensitization processing filters out private data of the user from the user operation behavior data;
    sending the desensitized multiple pieces of user operation behavior data to a server, so that the server analyzes the desensitized multiple pieces of user operation behavior data to obtain a topic interest table of the user's use of the target application.
  2. The method according to claim 1, characterized in that collecting the multiple pieces of user operation behavior data input by the user when the user uses the target application once comprises:
    starting the target application when an instruction from the user to start the target application is detected;
    after the target application is started, collecting at least one piece of operation data of operations performed by the user on the target application;
    closing the target application when an instruction from the user to exit the target application is detected;
    storing the at least one piece of operation data collected between the start and the closing of the target application as one group of user operation behavior data.
  3. The method according to claim 1 or 2, characterized in that performing desensitization processing on the collected multiple pieces of user operation behavior data comprises:
    for one or more of the multiple pieces of user operation behavior data, randomly replacing user operation behavior data under the same interest topic based on a differential privacy algorithm, wherein the interest topic is determined according to the topic interest table;
    stripping the user information contained in the multiple pieces of user operation behavior data.
  4. The method according to claim 3, characterized in that, before the random replacement of user operation behavior data under the same interest topic based on the differential privacy algorithm, the method further comprises:
    determining the sequence length of each piece of user operation behavior data;
    performing truncation and padding processing on the user operation behavior data according to a preset value to obtain user operation behavior data of a specified sequence length.
  5. The method according to claim 4, characterized in that performing truncation and padding processing on the user operation behavior data according to the preset value to obtain the user operation behavior data of the specified sequence length comprises:
    if the sequence length of the user operation behavior data is less than the preset value, supplementing the user operation behavior data with predefined user operation behavior data of a target length to obtain the user operation behavior data of the specified sequence length;
    if the sequence length of the user operation behavior data is greater than the preset value, truncating the user operation behavior data by the target length to obtain the user operation behavior data of the specified sequence length;
    wherein the target length is the absolute value of the difference between the sequence length of the user operation behavior data and the preset value.
  6. A content recommendation method based on user interests, characterized in that the method is applied to a server and comprises:
    receiving desensitized multiple pieces of user operation behavior data sent by one or more terminal devices, wherein the desensitized user operation behavior data is obtained by the one or more terminal devices collecting multiple pieces of user operation behavior data input by a user when the user uses a target application one or more times within a set duration, and performing desensitization processing on the collected multiple pieces of user operation behavior data; the desensitization processing filters out private data of the user from the user operation behavior data;
    analyzing the desensitized multiple pieces of user operation behavior data to obtain a topic interest table of the user's use of the target application;
    sending the topic interest table to the one or more terminal devices.
  7. The method according to claim 6, characterized in that analyzing the desensitized multiple pieces of user operation behavior data to obtain the topic interest table of the user's use of the target application comprises:
    inputting the desensitized multiple pieces of user operation behavior data into a pre-built topic interest model, so as to perform unsupervised learning on the desensitized multiple pieces of user operation behavior data;
    obtaining the topic interest table output by the pre-built topic interest model.
  8. A content recommendation method based on user interests, characterized in that the method is applied to a terminal device and comprises:
    receiving a topic interest table sent by a server, wherein the topic interest table is obtained by the server by analyzing desensitized multiple pieces of user operation behavior data; the desensitized user operation behavior data is obtained by one or more terminal devices collecting multiple pieces of user operation behavior data input by a user when the user uses a target application one or more times within a set duration, and performing desensitization processing on the collected multiple pieces of user operation behavior data; the desensitization processing filters out private data of the user from the user operation behavior data;
    when an instruction from the user to start the target application is detected, starting the target application and displaying a first recommendation interface, wherein the first recommendation interface contains at least one item of recommended content, and the at least one item of recommended content is determined according to the topic interest table.
  9. The method according to claim 8, characterized in that starting the target application and displaying the first recommendation interface comprises:
    starting the target application;
    displaying the first recommendation interface after the target application is started;
    taking one or more interest topics included in the topic interest table as user interests, obtaining at least one item of recommended content according to the user interests, and displaying the obtained at least one item of recommended content in the first recommendation interface; wherein each interest topic has an associated weight value, and the larger the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
  10. The method according to claim 8, characterized in that, after the target application is started and the first recommendation interface is displayed, the method further comprises:
    receiving and collecting one or more pieces of user operation behavior data input by the user when the user uses the target application;
    when an instruction from the user to refresh the first recommendation interface is detected, displaying a second recommendation interface, wherein the recommended content included in the second recommendation interface is determined according to the one or more pieces of user operation behavior data and the topic interest table.
  11. The method according to claim 10, characterized in that displaying the second recommendation interface comprises:
    determining one or more corresponding interest topics according to the one or more pieces of user operation behavior data, and assigning an associated weight value to each of the interest topics;
    taking the one or more interest topics corresponding to the user operation behavior data and the one or more interest topics included in the topic interest table as user interests, obtaining at least one item of recommended content according to the user interests, and displaying the obtained at least one item of recommended content in the second recommendation interface; wherein each interest topic included in the topic interest table has an associated weight value, and the larger the weight value associated with an interest topic, the higher the proportion of content related to that interest topic in the recommended content.
  12. The method according to claim 9 or 11, characterized in that obtaining the at least one item of recommended content according to the user interests comprises:
    searching locally cached content for recommended content corresponding to the user interests; and/or
    obtaining recommended content corresponding to the user interests from a content providing server that provides recommended content corresponding to the user interests.
  13. A terminal device, characterized by comprising: one or more processors; and one or more memories;
    the one or more memories being configured to store one or more computer programs and data information, wherein the one or more computer programs comprise instructions;
    when the instructions are executed by the one or more processors, the terminal device is caused to perform the method according to any one of claims 1 to 5, or to perform the method according to any one of claims 8 to 12.
  14. A server, characterized by comprising: one or more processors; and one or more memories;
    the one or more memories being configured to store one or more computer programs and data information, wherein the one or more computer programs comprise instructions;
    when the instructions are executed by the one or more processors, the server is caused to perform the method according to claim 6 or 7.
  15. A communication system, characterized by comprising the terminal device according to claim 13 and the server according to claim 14.
  16. A computer-readable storage medium, characterized by comprising a computer program or instructions which, when run on a computer, cause the method according to any one of claims 1 to 5 to be performed, or the method according to claim 6 or 7 to be performed, or the method according to any one of claims 8 to 12 to be performed.
  17. A graphical user interface on a terminal device, characterized in that the terminal device has a display screen, one or more memories, and one or more processors, the one or more processors being configured to execute one or more computer programs stored in the one or more memories, and the graphical user interface comprises the graphical user interface displayed when the terminal device performs the method according to any one of claims 1 to 5 or the method according to any one of claims 8 to 12.
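
As an illustration of the truncation and padding step recited in claim 5, the following is a minimal sketch. It assumes a piece of user operation behavior data is a list of event strings, that the predefined filler is a single placeholder event, and that truncation drops events from the end of the sequence; the claims do not specify the filler or which end is truncated, so `PAD_EVENT` and `fix_length` are hypothetical.

```python
PAD_EVENT = "<pad>"  # assumed predefined placeholder operation, not named in the source


def fix_length(sequence: list[str], preset_length: int, pad_event: str = PAD_EVENT) -> list[str]:
    """Bring a behavior sequence to exactly preset_length events.

    target is the absolute difference between the sequence length and the
    preset value, as in claim 5: it is the number of events to add or drop.
    """
    target = abs(len(sequence) - preset_length)
    if len(sequence) < preset_length:
        # Too short: append `target` predefined placeholder events.
        return list(sequence) + [pad_event] * target
    if len(sequence) > preset_length:
        # Too long: drop `target` events (here, from the end; an assumption).
        return list(sequence)[:len(sequence) - target]
    return list(sequence)


if __name__ == "__main__":
    print(fix_length(["open", "scroll"], 4))
    print(fix_length(["open", "scroll", "tap", "share", "close"], 3))
```

Fixing every sequence to one length before the random replacement means the length itself no longer distinguishes one user's session from another's.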
PCT/CN2022/081770 2021-03-23 2022-03-18 User interest-based content recommendation method, and terminal device WO2022199494A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110307500.XA CN115114515A (en) 2021-03-23 2021-03-23 Content recommendation method based on user interest and terminal equipment
CN202110307500.X 2021-03-23

Publications (1)

Publication Number Publication Date
WO2022199494A1 true WO2022199494A1 (en) 2022-09-29

Family

ID=83324238

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/081770 WO2022199494A1 (en) 2021-03-23 2022-03-18 User interest-based content recommendation method, and terminal device

Country Status (2)

Country Link
CN (1) CN115114515A (en)
WO (1) WO2022199494A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784092A (en) * 2017-10-11 2018-03-09 深圳市金立通信设备有限公司 A kind of method, server and computer-readable medium for recommending hot word
CN108763502A (en) * 2018-05-30 2018-11-06 腾讯科技(深圳)有限公司 Information recommendation method and system
CN109784092A (en) * 2019-01-23 2019-05-21 北京工业大学 A kind of recommended method based on label and difference secret protection
US20190158443A1 (en) * 2017-11-17 2019-05-23 International Business Machines Corporation Real-time recommendation of message recipients based on recipient interest level in message
CN111797210A (en) * 2020-03-03 2020-10-20 中国平安人寿保险股份有限公司 Information recommendation method, device and equipment based on user portrait and storage medium

Also Published As

Publication number Publication date
CN115114515A (en) 2022-09-27

Similar Documents

Publication Publication Date Title
US11036744B2 (en) Personalization of news articles based on news sources
US11012753B2 (en) Computerized system and method for determining media based on selected motion video inputs
US11341153B2 (en) Computerized system and method for determining applications on a device for serving media
US10885076B2 (en) Computerized system and method for search query auto-completion
US9361385B2 (en) Generating content for topics based on user demand
US9589149B2 (en) Combining personalization and privacy locally on devices
US20190347287A1 (en) Method for screening and injection of media content based on user preferences
US8725849B1 (en) Browser cache pre-population
US20140122697A1 (en) Providing content to linked devices associated with a user
EP2586008A1 (en) Infinite browse
KR101960873B1 (en) Detecting digital content visibility
US10164936B2 (en) Providing content to devices in a cluster
US9946794B2 (en) Accessing special purpose search systems
US20150287069A1 (en) Personal digital engine for user empowerment and method to operate the same
US20170374001A1 (en) Providing communication ranking scheme based on relationship graph
US10909146B2 (en) Providing automated hashtag suggestions to categorize communication
US20130204857A1 (en) Asynchronous caching to improve user experience
WO2022199494A1 (en) User interest-based content recommendation method, and terminal device
EP3455805A1 (en) Enhancing contact card based on knowledge graph
WO2023035893A1 (en) Search processing method and apparatus, and device, medium and program product
WO2015056112A1 (en) A system and method for determining a search response to a research query
EP4315110A1 (en) Server and method for generating digital content for users of a recommendation system
WO2016086797A1 (en) Method for searching for information and processing and identifying picture in browser, and browser client

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22774155

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22774155

Country of ref document: EP

Kind code of ref document: A1