CN111026967A - Method, device, equipment and medium for obtaining user interest tag - Google Patents

Info

Publication number
CN111026967A
Authority
CN
China
Prior art keywords
user
application program
interest
content
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911247424.7A
Other languages
Chinese (zh)
Other versions
CN111026967B (en
Inventor
邵和明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201911247424.7A priority Critical patent/CN111026967B/en
Publication of CN111026967A publication Critical patent/CN111026967A/en
Application granted granted Critical
Publication of CN111026967B publication Critical patent/CN111026967B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9538Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/955Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562Bookmark management

Abstract

The method for obtaining the interest tag of the user comprises the steps of obtaining display content in a target content application program used by the user, extracting text information from the display content, and estimating the interest tag of the user according to the extracted text information. Therefore, the interest tag of the user is determined through the display content in the target content application program used by the user, a more detailed and objective user interest tag can be obtained, and the accuracy and precision of the user interest tag are improved.

Description

Method, device, equipment and medium for obtaining user interest tag
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a medium for obtaining a user interest tag.
Background
With the development of internet technology and the spread of intelligent terminals, the number of applications keeps increasing. To provide targeted services, a user's interest tags are usually obtained from the user's own selections, and content matching the user's needs is then recommended according to those tags. However, the accuracy of interest tags obtained in this way is low.
Therefore, a technical solution that can improve the accuracy of the interest tag of the user is urgently needed.
Disclosure of Invention
The embodiment of the application provides a method, a device, equipment and a medium for obtaining a user interest tag, which are used for improving the accuracy of the user interest tag when the user interest tag is obtained.
In one aspect, a method for obtaining a user interest tag is provided, which includes:
acquiring display content in a target content application program used by a user;
extracting text information from the display content;
and estimating the interest label of the user according to the extracted text information.
In one aspect, an apparatus for obtaining a tag of interest of a user is provided, including:
an obtaining unit, configured to acquire display content in a target content application program used by a user;
an extraction unit, configured to extract text information from the display content;
and an estimation unit, configured to estimate the interest tag of the user according to the extracted text information.
Preferably, the obtaining unit is further configured to:
acquiring a recommended application program set containing a plurality of application programs of a specified type through a server;
screening out local application programs contained in the recommended application program set;
and determining the screened local application program as a target content application program.
Preferably, the obtaining unit is further configured to:
acquiring the use duration of each local application program in a specified time range;
and screening the local application programs with the use time longer than a preset time length threshold value.
Preferably, the obtaining unit is further configured to:
and screening out again, from the screened-out local application programs, a specified number of local application programs with the longest use duration, in descending order of use duration.
Preferably, the obtaining unit is configured to:
when it is determined that the specified application access right of the target content application program exists, respectively reading the display content within the specified area range in each target content application program;
or, when it is determined that the specified application access right of the target content application program exists, respectively capturing pictures within the specified area range in each target content application program, and performing image-text recognition on each picture to obtain the display content;
or, when it is determined that the specified application access right of the target content application program does not exist, acquiring a user screenshot of each screened target content application program, and performing image-text recognition on the acquired user screenshots to obtain the display content.
Preferably, the extraction unit is configured to:
according to a preset text analysis algorithm, performing semantic analysis on the acquired display content to obtain semantic analysis content;
and extracting the keywords of the semantic analysis content according to a preset keyword extraction algorithm.
Preferably, the estimation unit is configured to:
determining the extracted keywords as interest tags of the user; or obtaining the interest tag correspondingly set by the extracted keyword.
Preferably, the estimation unit is further configured to:
respectively counting the frequency of each keyword in the extracted keywords;
respectively determining the obtained frequency of each interest tag according to the frequency of each keyword;
and determining a corresponding label weight value according to the frequency of each interest label.
In one aspect, a control device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to perform the steps of any of the above-mentioned methods of obtaining a tag of interest of a user.
In one aspect, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of any of the above-mentioned methods of obtaining a user interest tag.
In the method, the device, the equipment and the medium for obtaining the user interest tag, the display content in the target content application program used by the user is obtained, the text information is extracted from the display content, and the user interest tag is estimated according to the extracted text information. Therefore, the interest tag of the user is determined through the display content in the target content application program used by the user without the need of the user to subjectively select the interest tag, so that a more detailed and objective interest tag can be obtained, and the accuracy and precision of the interest tag of the user are improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of a system architecture for obtaining a user interest tag according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart illustrating a process of obtaining a user interest tag according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an apparatus for obtaining a tag of interest of a user according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a control device in an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solution and beneficial effects of the present application more clear and more obvious, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
First, some terms referred to in the embodiments of the present application are explained to facilitate understanding by those skilled in the art.
Terminal device: an electronic device, mobile or fixed, on which various applications can be installed and which can display objects provided in the installed applications. Examples include a mobile phone, a tablet computer, various wearable devices, a vehicle-mounted device, a Personal Digital Assistant (PDA), a point of sale (POS) terminal, or other electronic devices capable of implementing the above functions.
Application: an application program, that is, a computer program that can perform one or more services and typically has a visual display interface through which it interacts with a user. For example, electronic maps and WeChat are applications.
Artificial Intelligence (AI): the method is a theory, method, technology and application system for simulating, extending and expanding human intelligence by using a digital computer or a machine controlled by the digital computer, sensing the environment, acquiring knowledge and obtaining the best result by using the knowledge. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer Vision (CV): the science of how to make a machine "see". More specifically, it uses cameras and computers in place of human eyes to identify, track and measure targets, and further processes the resulting images so that they are more suitable for human observation or for transmission to an instrument for detection. Theories and techniques related to computer vision research attempt to build artificial intelligence systems that can obtain information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, as well as common biometric technologies such as face recognition and fingerprint recognition.
Natural Language Processing (NLP): the processing, understanding and use of human languages (such as Chinese and English) by computers. It is a branch of artificial intelligence and an interdisciplinary field of computer science and linguistics, often also called computational linguistics.
Optical Character Recognition (OCR): refers to the process in which an electronic device (e.g., a scanner or digital camera) examines characters in an image, determines their shape by detecting dark and light patterns, and then translates the shape into computer text using character recognition methods.
Interest recommendation application (APP): an APP that recommends products, information and commodities to users according to their usage habits and preferences, such as Toutiao, Douyin and Taobao.
Cold start: the process of converting target users into seed users at the initial stage of a product.
The design concept of the embodiment of the present application is described below.
With the development of internet technology and the spread of intelligent terminals, the number of applications is also increasing rapidly. People can entertain themselves, study, work and so on through the network. To better provide targeted services for users, improve user experience and satisfaction, and improve product retention, user interest tags are usually determined according to user preferences.
In the prior art, when determining a user interest tag, the following method is generally adopted:
The user manually selects interest tags from a provided interest tag list, and the selected tags are taken as the interest tags of the user.
However, this approach requires the user to select tags manually, which makes the operation cumbersome. In addition, to avoid annoying the user with too many tags, the interest tag list usually contains only a small number of tags, so the tags are coarse-grained. The user's perception of his or her own interests may also be biased. As a result, the accuracy of the interest tags is low.
The conventional technology thus provides no technical solution for obtaining user interest tags with high accuracy. A technical solution that improves the accuracy of user interest tags when they are obtained is therefore urgently needed.
In view of the above analysis and consideration, the present embodiment provides a data processing scheme, in which the display content in the target content application used by the user is acquired, the text information is extracted from the display content, and the interest tag of the user is estimated according to the extracted text information.
To further illustrate the technical solutions provided by the embodiments of the present application, the following detailed description is made with reference to the accompanying drawings and the detailed description. Although the embodiments of the present application provide method steps as shown in the following embodiments or figures, more or fewer steps may be included in the method based on conventional or non-inventive efforts. In steps where no necessary causal relationship exists logically, the order of execution of the steps is not limited to that provided by the embodiments of the present application. The method can be executed in sequence or in parallel according to the method shown in the embodiment or the figure when the method is executed in an actual processing procedure or a device.
In the embodiment of the application, the interest recommendation application in the terminal device of the user is taken as the execution subject. In the application scenario, the interest recommendation application estimates the interest tag of the user according to the display content in the target content application programs of the user, and then recommends corresponding content to the user according to the obtained interest tag.
The following describes an embodiment of the present application with a system for obtaining a user interest tag. FIG. 1 is a schematic diagram of a system for obtaining a user interest tag. The system includes a server 100 and a terminal device 101. A plurality of application programs, including an interest recommendation application, are installed in the terminal device 101.
The server 100 is used for receiving a recommended application request sent by the interest recommendation application in the terminal device 101, returning to the interest recommendation application a recommended application program set containing a plurality of application programs of a specified type, and receiving and storing the interest tag of the user sent by the interest recommendation application.
Further, the server 100 may be further configured to recommend corresponding application content, such as products, articles, videos, and the like, to the user through the interest recommendation application according to the interest tag of the user.
The terminal device 101 is used for screening out, through the interest recommendation application, the target content application programs used by the user according to the recommended application program set obtained through the server 100, extracting text information from the display content in the target content application programs, estimating the interest tags of the user according to the extracted text information, and uploading the interest tags of the user to the server 100.
In the embodiment of the application, the interest tags of the user are inferred from the display content of the target content application programs used by the user, without any subjective selection by the user, so that more objective and finer-grained interest tags can be obtained and the accuracy of the interest tags of the user is improved.
The following describes a specific process for obtaining the user interest tag by using an interest recommendation application in the terminal device of the user as an execution subject. Referring to fig. 2, a flowchart of an implementation of a method for obtaining a user interest tag according to the present application is shown. The method comprises the following specific processes:
Step 200: Screen the local application programs to obtain the target content application programs used by the user.
Specifically, when step 200 is executed, the following steps may be adopted:
S2001: Obtain, through a server, a recommended application program set containing a plurality of application programs of a specified type.
Specifically, a recommended application request is sent to the server, and a recommended application set including a plurality of application programs of a specified type returned by the server is received.
It should be noted that the recommended application request may be sent to the server when the interest recommended application is started each time, or the recommended application request may be sent to the server periodically according to a specified duration, so as to continuously update the obtained recommended application program set, or the recommended application request may be triggered to be sent to the server according to other manners, which is not limited herein.
In one embodiment, the interest recommendation application is a newly launched product application. When a user opens the new product application, a recommended application request is sent to the server, and the recommended application program set returned by the server is obtained. For example, the interest recommendation application may be a video application.
Optionally, the application program of the specified type may be an interest recommendation APP, that is, an APP that recommends products, information, commodities and the like to the user according to the user's usage habits and preferences, such as Toutiao, Douyin and Taobao.
In practical applications, the corresponding designated type may also be set according to the application purpose of the interest tag of the user, which is not limited herein.
Optionally, the recommended application request may further include the specified type being requested, that is, the interest recommendation application determines the type of each application program in the recommended application program set.
S2002: Screen out the local application programs contained in the recommended application program set.
Specifically, each local application program is matched with each application program in the recommended application program set, and the successfully matched local application programs are screened out.
In this way, local application programs of the specified type can be screened out.
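The matching in S2002 can be sketched as a simple membership test. This is only an illustration: the patent does not say how applications are identified, so the package-style identifiers and the function name `screen_local_apps` below are assumptions.

```python
def screen_local_apps(local_apps, recommended_apps):
    """Return the local apps that also appear in the recommended set.

    Apps are compared by identifier (an assumption; the source does not
    specify the matching key). The order of the local list is preserved.
    """
    recommended = set(recommended_apps)
    return [app for app in local_apps if app in recommended]


# Hypothetical identifiers, for illustration only.
local_apps = ["com.example.news", "com.example.notes", "com.example.video"]
recommended_apps = ["com.example.video", "com.example.news", "com.example.shop"]

matched_apps = screen_local_apps(local_apps, recommended_apps)
print(matched_apps)
```

Converting the recommended set to a `set` keeps each lookup constant-time, which matters little for a handful of apps but costs nothing.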
Optionally, before executing S2002, the following steps may also be executed:
The use duration of each local application program within a specified time range is acquired, and the local application programs whose use duration is longer than a preset duration threshold are screened out.
In practical application, both the specified time range and the preset time threshold may be set according to practical application scenarios, for example, the specified time range may be within a week, a month, or a year, and the preset time threshold may be 10 hours, which is not limited herein.
Therefore, local application programs of specified types commonly used by the user can be screened out according to the use duration of the user, so that the accuracy of the subsequently obtained interest tags is ensured.
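A minimal sketch of the duration filter follows, assuming the use durations for the specified time range have already been collected into a dictionary of hours per application. How the durations are measured is platform-specific and not described in the source; the names `filter_by_usage` and `usage_hours` are hypothetical.

```python
def filter_by_usage(usage_hours, threshold_hours=10.0):
    """Keep apps whose use duration is strictly longer than the threshold.

    usage_hours maps an app identifier to hours of use within the
    specified time range. The 10-hour default mirrors the example
    threshold mentioned in the text.
    """
    return [app for app, hours in usage_hours.items() if hours > threshold_hours]


# Hypothetical one-week usage figures, for illustration only.
usage_hours = {
    "com.example.news": 12.5,
    "com.example.notes": 3.0,
    "com.example.video": 10.0,
}

frequent_apps = filter_by_usage(usage_hours)
print(frequent_apps)
```

An app used for exactly the threshold duration is excluded here, because the text says "longer than" the threshold.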
Optionally, after executing S2002, the following steps may also be executed:
In descending order of use duration, a specified number of local application programs with the longest use duration are screened out again from the screened-out local application programs.
The specified number is a positive integer and may be set according to the actual application scenario; for example, the specified number may be 5. This is not limited herein.
In one embodiment, the screened local applications are sorted by use duration from high to low, and the first n local applications are obtained, where n is the specified number and is a positive integer.
In one embodiment, the screened local applications are sorted by use duration from low to high, and the last n local applications are obtained, where n is the specified number and is a positive integer.
Therefore, a specified number of local application programs can be screened out, so that the data processing amount is reduced, and the processing efficiency and the accuracy are improved.
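The top-n selection can be sketched as below, assuming the same hypothetical `usage_hours` mapping of application identifier to use duration. Sorting in descending order and taking the first n entries selects the same applications as sorting in ascending order and taking the last n, provided there are no ties at the cut-off.

```python
def top_n_by_usage(usage_hours, n):
    """Return the n apps with the longest use duration, longest first."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    ranked = sorted(usage_hours, key=usage_hours.get, reverse=True)
    return ranked[:n]


# Hypothetical usage figures, for illustration only.
usage_hours = {"app_a": 5.0, "app_b": 20.0, "app_c": 12.0}

top_apps = top_n_by_usage(usage_hours, 2)
print(top_apps)
```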
S2003: and determining the screened local application program as a target content application program.
In this way, a specified number of local applications that are commonly used by the user and of a specified type can be screened out.
Step 201: Acquire the display content in the target content application program used by the user.
Specifically, when step 201 is executed, the following methods may be adopted:
The first mode: when it is determined that the specified application access right of the target content application program exists, the display content within the specified area range in each target content application program is read respectively.
The specified application access right can be obtained in the following ways:
a) The user is asked through a pop-up window whether to grant the specified application access right of the target content application program, and the right is obtained when the user agrees.
b) The system has the specified application access right of the target content application program by default.
The designated application access right includes a right for accessing the display content of the target content application, and may also include a right such as a screenshot, and may be set according to an actual application scenario, which is not limited herein.
Optionally, the display content in the designated area range may be recommended content of the target content application program, may also be access content or search content of the target content application program for the user, and may also be display content of other set areas, which is not limited herein.
Optionally, when the display content is read, the corresponding content in the target content application program may be read through the background of the terminal device, and the target content application program may also be controlled to open and read the display content.
Thus, the display content can be read in the background.
The second mode: when it is determined that the specified application access right of the target content application program exists, pictures within the specified area range in each target content application program are captured respectively, and image-text recognition is performed on each picture to obtain the display content.
In one implementation, the specified area range in each target content application program is opened, a screenshot of the opened specified area range is taken to obtain each picture, and image-text recognition is performed on each picture using an image character recognition technology such as OCR to obtain the display content.
Thus, the display content can be obtained through a software screenshot.
The third mode: when it is determined that the specified application access right of the target content application program does not exist, user screenshots of each target content application program are acquired, and image-text recognition is performed on the acquired user screenshots to obtain the display content.
In one embodiment, the user is asked through a pop-up window whether to grant the specified application access right of the target content application program, and when the user declines, it is determined that the specified application access right does not exist. A user screenshot of the target content application program uploaded by the user is then acquired, and image-text recognition is performed on the user screenshot using an image character recognition technology such as OCR to obtain the display content.
Therefore, when the appointed application access right of the target content application program does not exist, the display content can be obtained according to the user screenshot uploaded by the user.
Further, the core picture elements in the user screenshot or the picture may be analyzed to obtain the display content, and in practical application, the display content may also be obtained in other manners according to a practical application scene, which is not limited herein.
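The choice among the three modes above can be sketched as a small dispatch function. Everything platform-specific (reading content, capturing a region, loading a user screenshot, running OCR) is passed in as a callable, because the source does not name any concrete API; all function and parameter names here are hypothetical, and the stubs in the usage lines stand in for real implementations.

```python
from typing import Callable, Optional


def acquire_display_content(
    has_access_right: bool,
    read_content: Optional[Callable[[], str]] = None,
    capture_region: Optional[Callable[[], bytes]] = None,
    load_user_screenshot: Optional[Callable[[], bytes]] = None,
    recognize_text: Optional[Callable[[bytes], str]] = None,
) -> str:
    """Pick an acquisition mode based on the access right.

    Mode 1: access right exists, read the content directly.
    Mode 2: access right exists, capture a picture and recognize its text.
    Mode 3: no access right, recognize text in a user-supplied screenshot.
    """
    if has_access_right:
        if read_content is not None:
            return read_content()
        if capture_region is not None and recognize_text is not None:
            return recognize_text(capture_region())
        raise ValueError("access granted but no reader or capture function given")
    if load_user_screenshot is not None and recognize_text is not None:
        return recognize_text(load_user_screenshot())
    raise ValueError("no access right and no user screenshot available")


# Stub callables standing in for real content readers and OCR.
fake_ocr = lambda image: image.decode("utf-8")

direct = acquire_display_content(True, read_content=lambda: "peony care tips")
captured = acquire_display_content(
    True, capture_region=lambda: b"rose garden", recognize_text=fake_ocr
)
uploaded = acquire_display_content(
    False, load_user_screenshot=lambda: b"rabbit videos", recognize_text=fake_ocr
)
print(direct, captured, uploaded, sep=" | ")
```

Preferring direct reading over capture when both are available is a design choice of this sketch, not something the source prescribes.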
Step 202: Extract text information from the display content.
Specifically, when step 202 is executed, the following steps may be adopted:
S2021: Perform semantic analysis on the acquired display content according to a preset text analysis algorithm to obtain semantic analysis content.
The text analysis algorithm is an algorithm for semantic analysis, for example, the text analysis algorithm may adopt an NLP algorithm.
In one embodiment, the NLP algorithm is used to perform semantic analysis on the display content of each target content application program to obtain semantic analysis content.
S2022: Extract the keywords of the semantic analysis content according to a preset keyword extraction algorithm.
The keyword extraction algorithm is an algorithm for extracting keywords in the text.
Furthermore, the number of occurrences of each extracted keyword can be counted to determine its frequency.
In this way, the keywords in the display content and their corresponding frequencies can be extracted.
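The source leaves the text analysis and keyword extraction algorithms unspecified. As a deliberately simple stand-in, the sketch below tokenizes English text with a regular expression, drops a small hypothetical stop-word list, and counts the remaining words. A real system would likely use a proper NLP pipeline (word segmentation for Chinese, part-of-speech filtering, and so on); only the frequency counting mirrors the text directly.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real extractor would use a fuller one.
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "in", "to", "for", "how"})


def extract_keyword_frequencies(display_texts):
    """Count how often each non-stop-word occurs across the display texts."""
    counts = Counter()
    for text in display_texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts


# Hypothetical display content from target content applications.
display_texts = [
    "How to care for a peony",
    "Rose and peony garden",
    "Rabbit in the garden",
]

keyword_freq = extract_keyword_frequencies(display_texts)
print(keyword_freq.most_common(2))
```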
Step 203: Estimate the interest tag of the user according to the extracted text information.
Specifically, when step 203 is executed, either of the following modes may be adopted:
The first mode: the extracted keywords are determined as the interest tags of the user.
The second mode: the interest tags set in correspondence with the extracted keywords are acquired.
Specifically, a correspondence between keywords and interest tags is preset, and the interest tags corresponding to the extracted keywords are obtained according to this correspondence.
For example, the keywords are: peony, rose, ginkgo and rabbit. Assuming that the interest tags corresponding to peony, rose and ginkgo are all plants, and the interest tag corresponding to rabbit is an animal, the obtained interest tags are plants and animals.
Therefore, each keyword can be divided to obtain corresponding interest tags.
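The second mode amounts to a table lookup. The sketch below reuses the peony/rose/ginkgo/rabbit example from the text; the dictionary name `KEYWORD_TO_TAG` and the decision to skip keywords that have no preset tag are assumptions of this illustration.

```python
# Preset correspondence between keywords and interest tags (from the example).
KEYWORD_TO_TAG = {
    "peony": "plant",
    "rose": "plant",
    "ginkgo": "plant",
    "rabbit": "animal",
}


def tags_for_keywords(keywords, keyword_to_tag):
    """Return the distinct interest tags for the keywords, in first-seen order.

    Keywords without a preset tag are skipped (an assumption; the source
    does not say how unmapped keywords are handled).
    """
    tags = []
    for keyword in keywords:
        tag = keyword_to_tag.get(keyword)
        if tag is not None and tag not in tags:
            tags.append(tag)
    return tags


interest_tags = tags_for_keywords(["peony", "rose", "ginkgo", "rabbit"], KEYWORD_TO_TAG)
print(interest_tags)
```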
Further, a tag weight value of the interest tag may also be determined.
When determining the tag weight value of the interest tag, the following steps may be adopted:
S2031: Determine the frequency of each obtained interest tag according to the frequency of each keyword.
Specifically, for each interest tag, the sum of the frequencies of the keywords corresponding to the interest tag is calculated.
For example, the keywords are peony, rose, ginkgo and rabbit, each with a frequency of 1. Because the interest tag corresponding to peony, rose and ginkgo is plant, and the interest tag corresponding to rabbit is animal, the frequency of plant is determined to be 3, and the frequency of animal is determined to be 1.
S2032: Determine the corresponding tag weight value according to the frequency of each interest tag.
In one embodiment, a correspondence between the frequency and the tag weight value is preset, and the tag weight value of each interest tag is determined according to the correspondence.
In one embodiment, the frequency of each interest tag is used directly as its tag weight value.
For example, if the frequency of interest tag a is 3, then the tag weight value is 3.
In practical application, the corresponding relationship between the frequency and the tag weight value may be set according to a practical application scenario, which is not limited herein.
In this way, each interest tag, and corresponding tag weight value, may be determined.
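Steps S2031 and S2032 can be sketched as follows, under stated assumptions. The function names and the optional `freq_to_weight` callable are hypothetical; by default the sketch uses the embodiment in which the frequency itself is the tag weight value.

```python
from collections import Counter


def tag_frequencies(keyword_freqs, keyword_to_tag):
    """S2031: sum the keyword frequencies that fall under each interest tag."""
    totals = Counter()
    for keyword, freq in keyword_freqs.items():
        tag = keyword_to_tag.get(keyword)
        if tag is not None:  # assumption: unmapped keywords are ignored
            totals[tag] += freq
    return totals


def tag_weights(tag_freqs, freq_to_weight=None):
    """S2032: map each tag frequency to a tag weight value."""
    if freq_to_weight is None:
        # Embodiment in which the weight equals the frequency.
        return dict(tag_freqs)
    # Otherwise apply a preset frequency-to-weight correspondence.
    return {tag: freq_to_weight(freq) for tag, freq in tag_freqs.items()}


if __name__ == "__main__":
    mapping = {"peony": "plant", "rose": "plant", "ginkgo": "plant", "rabbit": "animal"}
    freqs = {"peony": 1, "rose": 1, "ginkgo": 1, "rabbit": 1}
    print(tag_weights(tag_frequencies(freqs, mapping)))
    # prints {'plant': 3, 'animal': 1}
```

Passing a different `freq_to_weight` function corresponds to setting the frequency-to-weight correspondence according to the application scenario.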
Step 204: Send the obtained interest tags of the user to a server.
Further, the tag weight values corresponding to the interest tags of the user can also be sent to the server.
Further, the server stores the received interest tags of the user, or the interest tags and the corresponding tag weight values.
One application scenario of the interest tags of a user is as follows: an interest recommendation application recommends corresponding content to the user according to the interest tags of the user.
Another application scenario of the interest tags of a user is as follows: an interest recommendation application recommends corresponding content to the user according to the interest tags and the tag weight values of the user.
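The data sent in step 204 might, for example, be serialized as shown below. The text does not specify a message format or transport, so the JSON layout, the field names `user_id`, `interest_tags`, `tag` and `weight`, and the function name `build_tag_payload` are all assumptions made for illustration.

```python
import json


def build_tag_payload(user_id, tag_weights):
    """Serialize a user's interest tags and tag weight values as JSON text."""
    payload = {
        "user_id": user_id,  # hypothetical field name
        "interest_tags": [
            {"tag": tag, "weight": weight}
            # Sorting by tag name keeps the output deterministic.
            for tag, weight in sorted(tag_weights.items())
        ],
    }
    return json.dumps(payload, ensure_ascii=False)


if __name__ == "__main__":
    print(build_tag_payload("user-1", {"plant": 3, "animal": 1}))
```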
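A recommendation based on interest tags and tag weight values could, as one possible sketch, score each candidate content item by the sum of the weights of the tags it carries. The scoring rule, the function name `rank_content` and the sample items are assumptions; the text does not prescribe how the recommendation application uses the weights.

```python
def rank_content(items, tag_weights, top_k=3):
    """Rank content items by the total weight of their matching interest tags.

    items: list of (title, set_of_tags) pairs.
    tag_weights: dict mapping interest tag -> tag weight value.
    """
    scored = []
    for title, tags in items:
        score = sum(tag_weights.get(tag, 0) for tag in tags)
        if score > 0:  # items sharing no tag with the user are not recommended
            scored.append((score, title))
    # Highest score first; ties are broken by title for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


if __name__ == "__main__":
    catalog = [
        ("Rose care", {"plant"}),
        ("Rabbit diet", {"animal"}),
        ("Stock news", {"finance"}),
    ]
    print(rank_content(catalog, {"plant": 3, "animal": 1}))
    # prints ['Rose care', 'Rabbit diet']
```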
Optionally, the interest tags of the user and the corresponding tag weight values may also be used in a friend-making application program to match more suitable friend-making objects.
In one implementation, at the initial stage of a new product application, target users can be converted into seed users by recommending corresponding content according to the interest tags of the target users. Content that meets the user's needs can thus be recommended during the cold start of the new product application, which improves user satisfaction and the retention rate of the product application.
According to the embodiment of the application, based on the display content of the target content application programs that the user has used for a period of time, the dimensions of the user's points of interest and the degree of attention paid to each point of interest are analyzed by means such as optical character recognition (OCR), a natural language processing (NLP) algorithm and a keyword extraction algorithm, so as to obtain the interest tags of the user and the corresponding tag weight values. Content that better meets the user's needs can then be recommended according to the interest tags and the corresponding tag weight values, which improves the accuracy of the user's interest tags, the satisfaction of the user, and the product retention rate.
Based on the same inventive concept, the embodiment of the present application further provides an apparatus for obtaining an interest tag of a user. Because the principle by which the apparatus and the device solve the problem is similar to that of the method for obtaining an interest tag of a user, the implementation of the apparatus can refer to the implementation of the method, and repeated details are not described again.
Fig. 3 is a schematic structural diagram of an apparatus for obtaining a tag of interest of a user according to an embodiment of the present application. An apparatus for obtaining a tag of interest of a user, comprising:
an obtaining unit 301, configured to obtain display content in a target content application used by a user;
an extracting unit 302, configured to extract text information from the display content;
an estimating unit 303, configured to estimate an interest tag of the user according to the extracted text information.
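The cooperation of the three units can be sketched as a simple pipeline. This is only an illustrative structure: the class name `InterestTagApparatus` and the representation of each unit as a callable are assumptions, and the lambdas in the usage example are placeholders rather than real implementations of the units.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class InterestTagApparatus:
    obtain: Callable[[], List[str]]             # role of obtaining unit 301
    extract: Callable[[List[str]], List[str]]   # role of extracting unit 302
    estimate: Callable[[List[str]], List[str]]  # role of estimating unit 303

    def run(self) -> List[str]:
        """Chain the units: display content -> text information -> interest tags."""
        display_content = self.obtain()
        text_info = self.extract(display_content)
        return self.estimate(text_info)


if __name__ == "__main__":
    apparatus = InterestTagApparatus(
        obtain=lambda: ["rose garden tips"],
        extract=lambda pages: [word for page in pages for word in page.split()],
        estimate=lambda words: sorted(set(words)),
    )
    print(apparatus.run())
    # prints ['garden', 'rose', 'tips']
```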
Preferably, the obtaining unit 301 is further configured to:
acquiring a recommended application program set containing a plurality of application programs of a specified type through a server;
screening out local application programs contained in the recommended application program set;
and determining the screened local application program as a target content application program.
Preferably, the obtaining unit 301 is further configured to:
acquiring the usage duration of each local application program within a specified time range;
and screening out the local application programs whose usage duration is longer than a preset duration threshold.
Preferably, the obtaining unit 301 is further configured to:
sorting the screened local application programs by usage duration in descending order, and selecting from them a specified number of local application programs with the longest usage duration.
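The screening performed by the obtaining unit 301 can be sketched as follows. The function name `select_target_apps`, its parameters and the sample data are hypothetical; the order of the steps (duration threshold, then membership in the recommended set, then the specified number of longest-used programs) follows the description above.

```python
def select_target_apps(recommended, usage_seconds, min_seconds, max_apps):
    """Pick target content application programs from the local ones.

    recommended: set of application ids obtained from the server.
    usage_seconds: dict of local application id -> usage duration in the time range.
    min_seconds: preset duration threshold.
    max_apps: specified number of application programs to keep.
    """
    # Keep local application programs used longer than the threshold.
    long_used = {app: secs for app, secs in usage_seconds.items() if secs > min_seconds}
    # Keep those that are contained in the recommended application program set.
    candidates = [app for app in long_used if app in recommended]
    # Sort by usage duration, longest first (ties by id), and keep the top ones.
    candidates.sort(key=lambda app: (-long_used[app], app))
    return candidates[:max_apps]


if __name__ == "__main__":
    usage = {"news": 7200, "video": 600, "games": 9000, "books": 3600}
    print(select_target_apps({"news", "video", "books"}, usage, 1800, 2))
    # prints ['news', 'books']
```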
Preferably, the obtaining unit 301 is configured to:
when it is determined that the specified application access right of the target content application program is available, reading the display content within a specified area range in each target content application program; or, when it is determined that the specified application access right of the target content application program is available, capturing the pictures within the specified area range in each target content application program and performing image-text recognition on the pictures to obtain the display content;
and when it is determined that the specified application access right of the target content application program is not available, acquiring the user screenshots of each screened target content application program and performing image-text recognition on the acquired user screenshots to obtain the display content.
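The branching on the access right can be sketched as follows. Every callable passed in (`read_region_text`, `capture_region_image`, `load_user_screenshot`, `recognize_text`) is a hypothetical placeholder for platform-specific functionality that the text does not name; in particular, `recognize_text` stands in for whatever image-text recognition (OCR) engine is used.

```python
from typing import Callable, Optional


def obtain_display_content(
    has_access_right: bool,
    read_region_text: Optional[Callable[[], str]],
    capture_region_image: Callable[[], bytes],
    load_user_screenshot: Callable[[], bytes],
    recognize_text: Callable[[bytes], str],
) -> str:
    """Obtain display content for one target content application program."""
    if has_access_right:
        if read_region_text is not None:
            # Right available: read the content of the specified area directly.
            return read_region_text()
        # Right available: capture the specified area and recognize its text.
        return recognize_text(capture_region_image())
    # Right not available: fall back to screenshots taken by the user.
    return recognize_text(load_user_screenshot())


if __name__ == "__main__":
    fake_ocr = lambda image: image.decode("utf-8")
    print(obtain_display_content(False, None, lambda: b"", lambda: b"rose sale", fake_ocr))
    # prints rose sale
```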
Preferably, the extracting unit 302 is configured to:
according to a preset text analysis algorithm, performing semantic analysis on the acquired display content to obtain semantic analysis content;
and extracting the keywords of the semantic analysis content according to a preset keyword extraction algorithm.
Preferably, the estimating unit 303 is configured to:
determining the extracted keywords as interest tags of the user; or obtaining the interest tag correspondingly set by the extracted keyword.
Preferably, the estimating unit 303 is further configured to:
respectively counting the frequency of each keyword in the extracted keywords;
respectively determining the obtained frequency of each interest tag according to the frequency of each keyword;
and determining a corresponding label weight value according to the frequency of each interest label.
Fig. 4 shows a schematic configuration of a control device 4000. Referring to fig. 4, the control apparatus 4000 includes: processor 4010, memory 4020, power supply 4030, display unit 4040, and input unit 4050.
The processor 4010 is a control center of the control apparatus 4000, connects each component by various interfaces and lines, and executes various functions of the control apparatus 4000 by running or executing software programs and/or data stored in the memory 4020, thereby performing overall monitoring of the control apparatus 4000.
In this embodiment of the application, the processor 4010 executes the method for obtaining a tag of interest of a user as provided in the embodiment shown in fig. 2 when calling the computer program stored in the memory 4020.
Optionally, processor 4010 may comprise one or more processing units; preferably, the processor 4010 may integrate an application processor, which mainly handles operating systems, user interfaces, applications, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 4010. In some embodiments, the processor and the memory may be implemented on a single chip, or in some embodiments, they may be implemented separately on separate chips.
The memory 4020 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, various applications, and the like; the data storage area may store data created according to the use of the control apparatus 4000, and the like. Further, the memory 4020 may include a high speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The control device 4000 may further include a power supply 4030 (e.g., a battery) to provide power to the various components, which may be logically coupled to the processor 4010 via a power management system to enable management of charging, discharging, and power consumption via the power management system.
The display unit 4040 may be configured to display information input by a user or information provided to the user, and various menus of the control apparatus 4000. The display unit 4040 may include a display panel 4041. The Display panel 4041 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The input unit 4050 may be used to receive information such as numbers or characters input by a user. The input unit 4050 may include a touch panel 4051 and other input devices 4052. Touch panel 4051, also referred to as a touch screen, may collect touch operations by a user on or near the touch panel 4051 (e.g., operations by a user on or near touch panel 4051 using a finger, a stylus, or any other suitable object or attachment).
Specifically, the touch panel 4051 may detect a touch operation of the user, detect signals generated by the touch operation, convert the signals into touch point coordinates, transmit the touch point coordinates to the processor 4010, receive a command transmitted from the processor 4010, and execute the command. In addition, the touch panel 4051 may be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. Other input devices 4052 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, power on and off keys, etc.), a trackball, a mouse, a joystick, and the like.
Of course, the touch panel 4051 may cover the display panel 4041, and when the touch panel 4051 detects a touch operation thereon or nearby, the touch operation is transmitted to the processor 4010 to determine the type of the touch event, and then the processor 4010 provides a corresponding visual output on the display panel 4041 according to the type of the touch event. Although in fig. 4, the touch panel 4051 and the display panel 4041 are two separate components to implement the input and output functions of the control device 4000, in some embodiments, the touch panel 4051 and the display panel 4041 may be integrated to implement the input and output functions of the control device 4000.
The control device 4000 may also include one or more sensors, such as pressure sensors, gravitational acceleration sensors, proximity light sensors, and the like. Of course, the control device 4000 may further include other components such as a camera, which are not shown in fig. 4 and will not be described in detail since they are not components that are used in the embodiment of the present application.
Those skilled in the art will appreciate that fig. 4 is merely an example of a control device and is not intended to be limiting; the control device may include more or fewer components than those shown, may combine some components, or may use different components.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for obtaining a user interest tag in any of the above method embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and certainly can also be implemented by hardware. Based on such understanding, the above technical solutions substantially or partially contributing to the related art may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a control device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (15)

1. A method for obtaining a user interest tag, comprising:
acquiring display content in a target content application program used by a user;
extracting text information from the display content;
and estimating the interest label of the user according to the extracted text information.
2. The method of claim 1, prior to obtaining the presentation content in the targeted content application for use by the user, further comprising:
acquiring a recommended application program set containing a plurality of application programs of a specified type through a server;
screening out the local application programs contained in the recommended application program set;
and determining the screened local application program as a target content application program.
3. The method of claim 2, further comprising, prior to screening out local applications contained by the set of recommended applications:
acquiring the use duration of each local application program in a specified time range;
and screening the local application programs with the use time longer than a preset time length threshold value.
4. The method of claim 3, after screening out the local applications contained by the set of recommended applications, further comprising:
sorting the screened local application programs by usage duration in descending order, and selecting from them a specified number of local application programs with the longest usage duration.
5. The method of any one of claims 1-4, wherein obtaining the presentation content in the targeted content application used by the user comprises:
when it is determined that the specified application access right of the target content application program is available, reading the display content within a specified area range in each target content application program; or, when it is determined that the specified application access right of the target content application program is available, capturing the pictures within the specified area range in each target content application program and performing image-text recognition on the pictures to obtain the display content;
and when it is determined that the specified application access right of the target content application program is not available, acquiring the user screenshots of each screened target content application program and performing image-text recognition on the acquired user screenshots to obtain the display content.
6. The method of any one of claims 1-4, wherein extracting textual information from the presentation content comprises:
according to a preset text analysis algorithm, performing semantic analysis on the acquired display content to obtain semantic analysis content;
and extracting the keywords of the semantic analysis content according to a preset keyword extraction algorithm.
7. The method of claim 6, wherein estimating the interest tag of the user based on the extracted textual information comprises:
determining the extracted keywords as interest tags of the user; or
and acquiring the interest tag correspondingly set by the extracted keyword.
8. The method of claim 7, further comprising:
respectively counting the frequency of each keyword in the extracted keywords;
respectively determining the obtained frequency of each interest tag according to the frequency of each keyword;
and determining a corresponding label weight value according to the frequency of each interest label.
9. An apparatus for obtaining a tag of interest of a user, comprising:
an obtaining unit, configured to obtain display content in a target content application program used by a user;
an extracting unit, configured to extract text information from the display content;
and an estimating unit, configured to estimate the interest tag of the user according to the extracted text information.
10. The apparatus of claim 9, wherein the obtaining unit is further configured to:
acquiring a recommended application program set containing a plurality of application programs of a specified type through a server;
screening out the local application programs contained in the recommended application program set;
and determining the screened local application program as a target content application program.
11. The apparatus of claim 10, wherein the obtaining unit is further configured to:
acquiring the use duration of each local application program in a specified time range;
and screening the local application programs with the use time longer than a preset time length threshold value.
12. The apparatus of claim 11, wherein the obtaining unit is further configured to:
sorting the screened local application programs by usage duration in descending order, and selecting from them a specified number of local application programs with the longest usage duration.
13. The apparatus of any one of claims 9-12, wherein the obtaining unit is to:
when it is determined that the specified application access right of the target content application program is available, reading the display content within a specified area range in each target content application program; or, when it is determined that the specified application access right of the target content application program is available, capturing the pictures within the specified area range in each target content application program and performing image-text recognition on the pictures to obtain the display content;
and when it is determined that the specified application access right of the target content application program is not available, acquiring the user screenshots of each screened target content application program and performing image-text recognition on the acquired user screenshots to obtain the display content.
14. A control device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1-8 are implemented when the program is executed by the processor.
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
CN201911247424.7A 2019-12-09 2019-12-09 Method, device, equipment and medium for obtaining user interest labels Active CN111026967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911247424.7A CN111026967B (en) 2019-12-09 2019-12-09 Method, device, equipment and medium for obtaining user interest labels

Publications (2)

Publication Number Publication Date
CN111026967A true CN111026967A (en) 2020-04-17
CN111026967B CN111026967B (en) 2023-08-04

Family

ID=70204784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911247424.7A Active CN111026967B (en) 2019-12-09 2019-12-09 Method, device, equipment and medium for obtaining user interest labels

Country Status (1)

Country Link
CN (1) CN111026967B (en)

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 40022567; country of ref document: HK)
SE01 Entry into force of request for substantive examination
GR01 Patent grant