CN110674410B - User portrait construction and content recommendation method, device and equipment - Google Patents

Publication number
CN110674410B
Authority
CN
China
Prior art keywords
user
content
reading
determining
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910951783.4A
Other languages
Chinese (zh)
Other versions
CN110674410A (en)
Inventor
黄涛
姜伟
杨令铎
李来林
伍绪青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Luka Beijing Intelligent Technology Co ltd
Original Assignee
Beijing Wuling Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wuling Technology Co ltd
Priority to CN201910951783.4A
Publication of CN110674410A
Application granted
Publication of CN110674410B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

The application discloses a user portrait construction method and device, and a content recommendation method, device and equipment. The user portrait construction method comprises the following steps: acquiring historical reading data of a user; obtaining the tags of each piece of historical reading content and determining the weight of each tag; filtering the tags according to the weights; and constructing a reading interest portrait of the user from the filtered tags. The content recommendation method comprises the following steps: constructing a reading interest portrait according to the above method; determining a multivariate intelligent theoretical value of the user according to the user's historical reading data, wherein the multivariate intelligent theoretical value reflects the user's reading in each of multiple intelligence categories; and determining the content to recommend to the user according to the reading interest portrait and the multivariate intelligent theoretical value. With this scheme, the user's reading preferences can be captured more accurately, so that content the user is interested in can be recommended; in addition, content that the user needs to read can also be recommended, so that development is not limited to the intelligence categories the user is already interested in.

Description

User portrait construction and content recommendation method, device and equipment
Technical Field
The application relates to the technical field of computers, in particular to a user portrait construction and content recommendation method, device and equipment.
Background
The picture book is widely considered a kind of book well suited to children. Through attractive pictures and concise text, picture books can effectively promote children's language development and cultivate their interest in reading. With the development of Artificial Intelligence (AI) technology in recent years, AI can be used to help parents and children read picture books, which addresses problems such as a lack of time to read and difficulty in reading, improves children's concentration, and helps them form good reading habits.
For example, a conventional picture-book robot uses image and voice technologies to read aloud the picture book placed in front of it. Specifically, the user shows a picture book to the robot, the robot photographs it with its camera, uses computer vision to determine which picture book is being shown, and reads the content of the inner page aloud. Each time the user turns a page, the robot reads that page; the robot can also support skipping pages.
Because each child is an independent individual, the things and books that interest them differ, and each child's reading preferences can be learned from what they read. For example, some children are interested in one kind of topic, such as habits, while others are interested in another, such as family life. It is therefore necessary to understand each child's reading interests and provide targeted reading recommendations.
Disclosure of Invention
The embodiments of the application provide a user portrait construction method and device, and a content recommendation method, device and equipment, which are used to recommend personalized reading content to a user.
The user portrait construction method provided by the embodiment of the application comprises the following steps:
acquiring historical reading data of a user;
obtaining the label of each piece of historical reading content in the historical reading data, and determining the weight of each label;
filtering the tags according to the weights;
and constructing a reading interest portrait of the user according to the filtered tags.
In this method, tags that describe the user's reading interests are determined from the user's historical data and then filtered by weight. The filtered tags better reflect the user's reading preferences, and reading content recommended on the basis of these more accurate preferences better meets the user's reading needs.
The content recommendation method provided by the embodiment of the application comprises the following steps:
constructing a reading interest portrait of a user according to the method of the embodiment;
determining a multivariate intelligent theoretical value of a user according to historical reading data of the user, wherein the multivariate intelligent theoretical value is used for reflecting the reading condition of the user in multiple intelligent categories;
and determining the content recommended for the user according to the reading interest portrait and the multivariate intelligent theoretical value.
In this method, content is recommended both according to the user's reading interests and according to the user's multivariate intelligent theoretical value, so that the recommended content is not only interesting to the user but can also include content that the user needs to read, rather than developing only the intelligence categories the user is already interested in.
The user portrait construction device provided by the embodiment of the application includes:
the acquisition module is used for acquiring historical reading data of a user;
the determining module is used for acquiring the label of each piece of historical reading content in the historical reading data and determining the weight of each label;
the filtering module is used for filtering the labels according to the weight;
and the construction module is used for constructing the reading interest portrait of the user according to the filtered tags.
An embodiment of the present application provides a content recommendation device, including:
a construction module for constructing a reading interest representation according to the method of any one of claims 1 to 5;
the determination module is used for determining a multivariate intelligent theoretical value of a user according to historical reading data of the user, wherein the multivariate intelligent theoretical value is used for reflecting the reading condition of the user in multiple intelligent categories;
and the recommending module is used for determining the content recommended for the user according to the reading interest portrait and the multivariate intelligent theoretical value.
An embodiment of the present application provides content recommendation equipment, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and
the instructions are executed by the at least one processor to enable the at least one processor to perform the content recommendation method described above.
An embodiment of the present application provides a computer-readable storage medium, where instructions are stored on the computer-readable storage medium, and when the instructions are executed by a processor, the user portrait construction method or the content recommendation method is implemented.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic flow chart of a user portrait construction method according to an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating a reading content tag generation method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a content recommendation method according to an embodiment of the present application;
FIG. 4 is a second flowchart illustrating a content recommendation method according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a user portrait constructing apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a content recommendation device according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a user portrait creation apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a content recommendation device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In order to understand the reading interests of each user and then recommend reading content according to those interests, the embodiment of the present application provides a user portrait construction method for determining the reading interests of the user.
The method can be applied to a reading device, such as a picture-book robot, or to another intelligent terminal, for example one on which an application program implementing the method is installed.
Referring to FIG. 1, which is a schematic flow chart of the user portrait construction method provided in an embodiment of the present application, the method may include the following steps:
Step 101, obtaining historical reading data of a user.
For example, a picture-book robot may obtain data on the picture books the user has read. This may include picture books provided by the user that the robot recognized and read using computer vision, picture books stored on the robot itself or downloaded from the internet, and picture books that the user has ordered or added to a collection.
Taking reading software installed on an intelligent terminal as an example, by executing the software the terminal can obtain data on the novels and comics the user has read and the audio novels the user has listened to, as well as the novels and comics the user has ordered or added to a collection.
Step 102, obtaining the tags of each piece of historical reading content in the historical reading data, and determining the weight of each tag.
Each piece of reading content can correspond to one or more tags. Extracting the tags of the user's historical reading content makes it easier to understand and analyze the user's reading interests. Optionally, a tag may be set according to the type of the content: for example, the tag for a book of Danish fairy tales may be "fairy tale", and the tag for a picture book about animals may be "animal". There may also be several type tags: for example, a book on the psychological education of teenagers may carry the tags "teenager", "psychology" and "education". In addition to tags based on the content type of the book, tags may be set according to the author, the publisher, the suitable age group, and so on.
In one possible implementation, the tags of each piece of historical reading content may be obtained by web crawling, automatic generation, or manual input. A web crawler is a program or script that automatically captures web information according to certain rules; here, it automatically captures tags for the content that already exist on other web pages. Manual input refers to tags entered by the user for the reading content. Automatic generation means that the picture-book robot or intelligent terminal uses AI technology to generate the corresponding tags from the acquired reading content.
Optionally, when the tags of the reading content are generated automatically, keywords may be extracted from the text of the reading content. For example, as shown in FIG. 2, Natural Language Processing (NLP) techniques may be used to perform word segmentation and part-of-speech tagging, so that only key words such as nouns and verbal nouns are retained, and noise such as stop words and punctuation marks is removed. The weight of each keyword is then determined along three dimensions: word frequency, maximum entropy, and a keyword extraction algorithm.
The word frequency algorithm may be Term Frequency-Inverse Document Frequency (TF-IDF). Its principle is as follows: if a word or phrase appears frequently in one article (high TF) but rarely in other articles, it is considered to discriminate well between articles and is suitable for classification. For example, if the word "plant" occurs frequently in a picture book but infrequently in the corpus as a whole, the keyword "plant" can be given a high weight. The maximum entropy principle is a criterion for choosing the statistical characteristics of a random variable that best fit objective conditions: when only partial knowledge of an unknown distribution is available, the probability distribution that is consistent with that knowledge and has the largest entropy is selected.
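The TF-IDF principle described above can be illustrated with a minimal sketch. The function name `tf_idf`, the toy corpus, and the smoothed IDF variant are assumptions made for illustration; the application does not specify which TF-IDF formulation is used.

```python
import math
from collections import Counter

def tf_idf(doc_tokens, corpus):
    """Return {term: TF-IDF weight} for one tokenized document.

    doc_tokens: tokens of the document being tagged.
    corpus: list of token lists used as the reference corpus
            (assumed here to include the document itself).
    """
    tf = Counter(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for term, count in tf.items():
        # Document frequency: how many corpus documents contain the term.
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed IDF (an assumed variant, chosen to avoid division by zero).
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        weights[term] = (count / len(doc_tokens)) * idf
    return weights

# Toy corpus: "plant" is frequent in the first book and absent elsewhere,
# while "sun" appears in every book.
corpus = [
    ["plant", "seed", "plant", "water", "sun"],
    ["bear", "family", "home", "sun"],
    ["train", "station", "sun"],
]
weights = tf_idf(corpus[0], corpus)
```

In this toy corpus, "plant" receives the largest weight for the first book, while "sun", which occurs in every book, receives a lower weight.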
Then, k keywords are selected from the extracted keywords as the tags of the reading content by using a multidimensional Top-k ranking technique. The specific process may be as shown in FIG. 2. The keywords are first clustered and merged: for example, the keywords "home" and "family" have similar meanings and may be merged by a clustering technique. A top-k algorithm is then applied to the merged keywords, and the top k keywords are retained as the tags of the reading content, which completes the automatic generation of the tags.
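The merge-then-rank step can be sketched as follows. This is only an illustration: the hand-written `SYNONYMS` map is a hypothetical stand-in for the clustering technique, which the application does not specify, and summing the weights of merged keywords is likewise an assumption.

```python
import heapq

# Hypothetical synonym map standing in for the clustering step.
SYNONYMS = {"family": "home"}

def merge_and_top_k(keyword_weights, k, synonyms=SYNONYMS):
    """Merge similar keywords, then keep the k highest-weighted ones."""
    merged = {}
    for word, weight in keyword_weights.items():
        canonical = synonyms.get(word, word)
        # Assumption: weights of merged keywords are summed.
        merged[canonical] = merged.get(canonical, 0.0) + weight
    return heapq.nlargest(k, merged.items(), key=lambda kv: kv[1])

tags = merge_and_top_k(
    {"home": 0.4, "family": 0.3, "bear": 0.5, "sun": 0.1}, k=2
)
```

Here "home" and "family" are merged before ranking, so the merged keyword outranks "bear" even though neither of its parts would on its own.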
After the tags of each piece of reading content are acquired, the weight of each tag needs to be determined. In one possible implementation, the following are determined for each tag: a first weight for the corresponding user behavior type, the number of times that behavior occurred, a time decay factor, and a second weight determined according to word frequency. The tag weight may then be determined by the following formula:
label weight = fun(behavior type weight, behavior count, time decay factor, TF-IDF label weight)
Here, the function fun may be an algorithm that combines its inputs using weighting factors; the user behavior types may include reading, purchasing, collecting and so on, and different behavior types may correspond to different weights; the TF-IDF label weight is the weight determined by applying the TF-IDF algorithm.
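One possible reading of the formula above is a simple product of the four factors. The sketch below follows that reading; the multiplicative form, the name `label_weight`, and the values in `BEHAVIOR_WEIGHTS` are all assumptions, since the application only states that fun may be weighted by weighting factors. A weighted sum would be equally consistent with the text.

```python
# Hypothetical behavior-type weights; the application only says that
# different behavior types may correspond to different weights.
BEHAVIOR_WEIGHTS = {"read": 1.0, "collect": 1.5, "purchase": 2.0}

def label_weight(behavior_type, behavior_count, time_decay, tfidf_weight):
    """Combine the four factors into one tag weight (assumed product form)."""
    return (
        BEHAVIOR_WEIGHTS[behavior_type]
        * behavior_count
        * time_decay
        * tfidf_weight
    )

# A tag attached to a book purchased twice, with a decay factor of 0.5
# and a TF-IDF weight of 0.3.
w = label_weight("purchase", 2, 0.5, 0.3)
```

Under this form, a behavior that happened long ago (small decay factor) contributes less than the same behavior performed recently.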
Step 103, filtering the tags according to the weights.
Specifically, the labels may be filtered using an interest label library and the weights of the labels; alternatively, a threshold may be set for the tag weight, so as to filter the tags; alternatively, k labels with higher weights may be selected.
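The three filtering options can be sketched in one helper. The function name `filter_tags` and its parameters are hypothetical; the text presents the options as alternatives, and allowing them to be combined here is a convenience of the sketch rather than something the application states.

```python
def filter_tags(tag_weights, interest_library=None, threshold=None, top_k=None):
    """Filter {tag: weight} by an interest tag library, a weight
    threshold, and/or a top-k cut, and return the surviving tags."""
    items = list(tag_weights.items())
    if interest_library is not None:
        # Keep only tags that appear in the interest tag library.
        items = [(t, w) for t, w in items if t in interest_library]
    if threshold is not None:
        # Keep only tags whose weight reaches the threshold.
        items = [(t, w) for t, w in items if w >= threshold]
    # Sort by descending weight so that the top-k cut keeps the heaviest tags.
    items.sort(key=lambda tw: tw[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]
    return dict(items)

raw = {"animal": 0.9, "fairy tale": 0.6, "noise": 0.05}
by_threshold = filter_tags(raw, threshold=0.1)
by_top_k = filter_tags(raw, top_k=1)
```

The returned dictionary keeps the weights, so it can be used directly as a weighted tag set.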
Step 104, determining the reading interest portrait of the user according to the filtered tags.
For example, the set of filtered tags may be used as the reading interest portrait of the user; alternatively, the reading interest portrait may be generated from the filtered tags according to a preset rule.
Furthermore, in addition to the tags, the reading interest portrait of the user may include the weight corresponding to each tag, so as to reflect the degree of the user's interest in different areas.
The user portrait construction method provided by this embodiment can be used to construct the user's reading interest portrait, giving a full picture of the user's reading interests and laying the groundwork for subsequently recommending reading content to the user. Because user portrait construction for picture-book reading in the prior art is still immature, with incomplete tagging of picture books and no interest portraits of children for picture books, the method is particularly suitable for constructing children's reading interest portraits for picture books.
Based on the same technical concept, the embodiment of the application also provides a content recommendation method for recommending reading content that the user may be interested in. The method can be applied to a reading device, such as a picture-book robot, or to another intelligent terminal, for example one on which an application program implementing the method is installed.
Referring to FIG. 3, which is a schematic flow chart of the content recommendation method provided in the embodiment of the present application, the method may include the following steps:
Step 301, obtaining historical reading data of a user.
Similar to the foregoing embodiment, the obtained historical reading data may include content read by the reading device by using a computer vision technology, reading content stored by the reading device itself or downloaded from the internet, and reading content ordered or collected by the user.
Step 302, obtaining the label of each piece of historical reading content in the historical reading data, and determining the weight of each label; and filtering the labels according to the weight, and determining the reading interest portrait of the user according to the filtered labels.
As described above, for each piece of reading content, the tag related to the reading content may be obtained through a web crawler technology, or a manually input tag is received, or a corresponding tag may be automatically generated, and the method for automatically generating a tag is similar to the foregoing embodiment, and is not described here again.
Step 303, determining a multivariate intelligent theoretical value of the user according to the historical reading data of the user, wherein the multivariate intelligent theoretical value is used for reflecting the reading condition of the user in multiple intelligent categories.
The theory of multiple intelligences (referred to here as the multivariate intelligent theory) was proposed in 1983 by Howard Gardner of Harvard University in the United States. Traditionally, schools have emphasized the development of logical-mathematical and linguistic intelligence (mainly reading and writing), but these are not the whole of human intelligence, which is a combination of intelligences along multiple dimensions. Taking the common eight-intelligence version of the theory as an example, the following introduces each of the human intelligences and how it manifests in children:
Self-cognitive intelligence: the ability to look inward, understand one's own inner world, and use that understanding to guide one's behavior. It manifests as the child having a deep understanding of himself or herself.
Music intelligence: the ability to perceive, appreciate, play, sing and create music. It manifests as the child being sensitive to rhythm, pitch, timbre and melody.
Interpersonal intelligence: the ability to understand others and cooperate with them. It manifests as the child perceiving the emotional changes of others and reacting appropriately.
Language intelligence: the ability to master and use spoken and written language. It manifests as the child being able to describe events in words and express ideas when communicating with others.
Physical kinesthetic intelligence: the ability to use the whole body or a part of the body (including the mouth and hands) to solve a problem or create a product.
Logical mathematical intelligence: the ability to perform logical reasoning, mathematical operations and scientific analysis. It manifests as the child being interested in causal, logical and similar relationships between things.
Space intelligence: the ability to transform what is observed into a model or image in the mind. It manifests as the child being sensitive to lines, shapes, colors, space and so on.
Naturalist intelligence: the ability to study, summarize and classify the things found in nature. It manifests as the child enjoying exploring nature, growing plants and raising animals.
Different people have different combinations of intelligences. For example, architects and sculptors have stronger space intelligence, athletes and ballet dancers have stronger physical kinesthetic intelligence, customs personnel have stronger interpersonal intelligence, and writers have stronger self-cognitive intelligence.
In order to comprehensively understand the user's development across the multiple intelligences, the user's development in each intelligence category can be judged from the user's historical reading content or from the user's reading interest portrait.
The following takes determining the multivariate intelligent theoretical value from the user's reading interest portrait as an example. As described in the previous embodiments, the reading interest portrait may include a set of tags. After the reading interest portrait is obtained, it can be determined for each tag whether it belongs to a particular intelligence category. For example, suppose the user's reading interest portrait includes the following tags: "natural spelling", "astronomical knowledge", "natural knowledge", "everyday words", "Chinese character learning", "human body manufacturer", and so on. Analysis with AI technology shows that the tags "natural spelling", "everyday words" and "Chinese character learning" reflect the user's reading interest in the language intelligence category, while the tags "astronomical knowledge", "natural knowledge" and "human body manufacturer" reflect the user's reading interest in the naturalist intelligence category.
Alternatively, a corresponding tag may be set for each intelligent category in advance, and then the tag in the user reading interest portrait may be matched with the tag corresponding to each intelligent category, as shown in the following table.
Intelligent category      Label 1                  Label 2             Label 3                      ...
Language intelligence     Natural spelling         Everyday words      Chinese character learning   ...
Naturalist intelligence   Astronomical knowledge   Natural knowledge   Human body manufacturer      ...
...                       ...                      ...                 ...                          ...
If the user matches a large number of tags in a certain intelligence category, the user can be considered to be well developed in that category. In another possible case, the reading interest portrait includes a set of tags together with a weight for each tag; when judging the user's development in a certain intelligence category, the weights of the matched tags can then also be taken into account. For example, if the reading interest portrait includes the tag "natural knowledge" and the weight of that tag is high, the user may be considered to be well developed in the naturalist intelligence category, to which the tag "natural knowledge" corresponds.
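The tag-matching approach can be sketched as follows. The mapping `CATEGORY_TAGS` is a hypothetical excerpt built from the example tags in the text, with "naturalist" denoting Gardner's naturalist intelligence, and summing the weights of matched tags is an assumed way of producing a per-category value; the application does not fix the exact calculation.

```python
# Hypothetical tag sets per intelligence category, taken from the examples.
CATEGORY_TAGS = {
    "language": {"natural spelling", "everyday words", "Chinese character learning"},
    "naturalist": {"astronomical knowledge", "natural knowledge"},
}

def intelligence_values(portrait, category_tags=CATEGORY_TAGS):
    """Map a weighted portrait {tag: weight} to {category: value}.

    Each category's value is the sum of the weights of the portrait
    tags that match that category (an assumed scoring rule).
    """
    return {
        category: sum(w for tag, w in portrait.items() if tag in tags)
        for category, tags in category_tags.items()
    }

portrait = {"natural knowledge": 0.8, "everyday words": 0.3, "trains": 0.5}
values = intelligence_values(portrait)
```

With this portrait, the naturalist category scores higher than the language category, and the unmatched tag "trains" contributes to neither.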
Step 304, determining the content recommended for the user according to the reading interest portrait of the user and the multivariate intelligent theoretical value.
When the reading content is recommended for the user according to the reading interest portrait of the user, the recommended content is more likely to be the content in which the user is interested, that is, the possibility that the user reads the recommended content is higher.
When recommending content for the user according to the multivariate intelligent theoretical value, on one hand, the content which the user is interested in can be recommended for the user, for example, if the user develops well in the language intelligent category, the user is more likely to be interested in the content related to the language intelligent category; on the other hand, the content related to the intelligent category with relatively weak development of the user can be recommended for the user to help the user to realize the comprehensive development, for example, if the user develops relatively weak in the aspect of the logic mathematical intelligent category, the reading content related to the logic mathematical intelligence can be recommended for the user; in addition, recommendation can be performed according to user requirements, for example, if the user desires to become an architect, and the architect needs strong space intelligence, reading content related to the space intelligence can be recommended for the user.
Therefore, the content recommended to the user is determined according to the reading interest portrait of the user and the multivariate intelligent theoretical value, so that the reading interest of the user can be met, and the development requirement of the user can be met.
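As a rough sketch of how the two signals might be combined, the example below scores each candidate by the user's interest in its tags plus a bonus for intelligence categories in which the user is relatively weak. The scoring rule, the `balance` parameter, and the candidate titles are assumptions made for illustration; the application does not prescribe a formula.

```python
def score_content(item_tags, item_category, portrait, values, balance=0.5):
    """Score one candidate: interest match plus a development bonus."""
    # Interest term: total portrait weight of the candidate's tags.
    interest = sum(portrait.get(tag, 0.0) for tag in item_tags)
    # Development term: larger when the user's value in the candidate's
    # intelligence category is low relative to the strongest category.
    max_value = max(values.values(), default=0.0) or 1.0
    gap = 1.0 - values.get(item_category, 0.0) / max_value
    return interest + balance * gap

def recommend(candidates, portrait, values, n=2):
    """Return the titles of the n highest-scoring candidates."""
    ranked = sorted(
        candidates,
        key=lambda c: score_content(c[1], c[2], portrait, values),
        reverse=True,
    )
    return [title for title, _, _ in ranked[:n]]

portrait = {"animal": 0.9, "numbers": 0.1}
values = {"naturalist": 0.9, "logical mathematical": 0.1}
candidates = [
    ("Forest Friends", ["animal"], "naturalist"),
    ("Counting Fun", ["numbers"], "logical mathematical"),
    ("City Trains", ["trains"], "space"),
]
picks = recommend(candidates, portrait, values)
```

In this example the first pick matches the user's strongest interest, while the second is promoted mainly because it belongs to a category in which the user is weak.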
Further, after step 301, information such as the user's age, gender and location may be inferred by statistical analysis of the obtained historical reading data, and reading content may be recommended to the user according to this inferred information. For example, if most of the picture books read by the user are suitable for children aged 3 to 5, then in step 304 content suitable for children aged 3 to 5 may be recommended to the user; if the user's historical reading content is content that interests adolescent girls, content suitable for adolescent girls may be recommended.
In addition, reading content can be recommended to the user according to popularity. For example, if picture books A and B, which are suitable for children aged 6 to 10, have high read or click counts, picture books A and B can be recommended to users aged 6 to 10.
In one embodiment, the specific flow for determining recommended content for a user may be as shown in FIG. 4. First, the user's behavior data is obtained, including reading history, data captured and recorded through computer vision, ordering history, and collection history. Then, information such as the user's age (or age group) and gender is analyzed from this data, the user's reading interest portrait is constructed, and the user's multivariate intelligent theoretical value is determined; the analyzed information, the reading interest portrait and the multivariate intelligent theoretical value are input into the recommendation system through a user feature embedding layer. Next, the large amount of reading content in the reading library is filtered, for example according to the user's age (group), gender, and content popularity, and the candidate reading content obtained after filtering is input into the recommendation system, where it passes through a dense layer and a dropout layer (discarding regularization layer). The dense layer is used for classification, and the dropout layer temporarily drops a portion of the neural network units from the network with a certain probability during training of the deep learning network. Finally, the output of these two layers is fed into a binary classification layer (sigmoid), which judges for each piece of reading content whether it should be recommended to the user, and the reading content determined to be recommended is output.
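The dense, dropout and sigmoid stages can be illustrated with a minimal pure-Python forward pass. The layer sizes, the ReLU activation, the hand-picked parameters in `PARAMS`, and the 0.5 decision cut-off are all assumptions; the application does not disclose the network's dimensions, activations or trained weights.

```python
import math
import random

def dense(x, weights, biases):
    """Fully connected layer with ReLU activation (activation is assumed)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]

def dropout(x, rate, training, rng=random):
    """Inverted dropout: active only during training, identity otherwise."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    # Each unit is dropped with probability `rate`; kept units are rescaled.
    return [xi / keep if rng.random() < keep else 0.0 for xi in x]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, params, training=False):
    """Return the probability that a candidate should be recommended."""
    hidden = dense(features, params["w1"], params["b1"])
    hidden = dropout(hidden, params["dropout_rate"], training)
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return sigmoid(logit)

# Hand-picked, untrained parameters for a 2-input, 2-hidden-unit network.
PARAMS = {
    "w1": [[1.0, -1.0], [0.5, 0.5]],
    "b1": [0.0, 0.0],
    "dropout_rate": 0.5,
    "w2": [1.0, 1.0],
    "b2": -1.0,
}
p = predict([1.0, 0.0], PARAMS)
recommended = p >= 0.5  # assumed decision threshold
```

In a real system, the feature vector would combine the embedded user features with the features of a candidate item, and the parameters would be learned from behavior data rather than set by hand.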
Based on the same technical concept, an embodiment of the present application further provides a user portrait construction apparatus. As shown in fig. 5, the apparatus may include:
an obtaining module 501, configured to obtain historical reading data of a user;
a determining module 502, configured to obtain a tag of each piece of historical reading content in the historical reading data, and determine a weight of each tag;
a filtering module 503, configured to filter the tags according to the weights;
and a construction module 504, configured to construct a reading interest representation of the user according to the filtered tags.
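The filtering and construction steps performed by modules 503 and 504 can be sketched as follows, assuming the tag weights have already been determined. The threshold `min_weight`, the cap `max_tags`, and the normalization of the remaining weights are assumptions made for illustration; the embodiment only states that tags are filtered according to their weights and that the portrait is built from the filtered tags.

```python
def build_reading_interest_portrait(tag_weights, min_weight=0.2, max_tags=5):
    """Filter tags by weight, then build a normalized tag -> weight portrait."""
    # Filtering step: drop tags whose weight is below the (assumed) threshold.
    kept = {tag: w for tag, w in tag_weights.items() if w >= min_weight}
    # Keep at most max_tags of the heaviest tags.
    ranked = sorted(kept.items(), key=lambda kv: kv[1], reverse=True)[:max_tags]
    total = sum(w for _, w in ranked)
    if total <= 0:
        return {}
    # Construction step: normalize so that the portrait weights sum to 1.
    return {tag: w / total for tag, w in ranked}


# Made-up tag weights for one user.
weights = {"dinosaurs": 3.0, "space": 1.0, "cooking": 0.1}
print(build_reading_interest_portrait(weights))
```

With these made-up weights, the low-weight tag "cooking" is filtered out and the portrait keeps "dinosaurs" and "space".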
Optionally, the determining module 502 is specifically configured to:
determining, for each label, a first weight of the corresponding user behavior type, the number of times the user behavior occurred, a time attenuation factor, and a second weight determined according to the word frequency;
and determining the weight of the label according to the first weight, the number of times the user behavior occurred, the time attenuation factor, and the second weight.
Optionally, the time decay factor is calculated by the following formula:
$N(t) = N_0 e^{-\alpha (t + l)}$
wherein $t$ represents the decay time, $N_0$ denotes the initial value of the attenuation, $\alpha$ denotes the attenuation constant, and $l$ denotes the amount of leftward shift.
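The time attenuation factor above, and one possible way of combining it with the other three quantities into a label weight, can be sketched as follows. The exponential form follows the formula in the text. Multiplying the four quantities together is an assumption made for illustration, since this passage only states that the label weight is determined according to them; the default parameter values are also hypothetical.

```python
import math


def time_decay(t, n0=1.0, alpha=0.1, l=0.0):
    """N(t) = N0 * exp(-alpha * (t + l)), as in the formula above."""
    return n0 * math.exp(-alpha * (t + l))


def label_weight(behavior_weight, behavior_count, t, tf_weight, **decay_params):
    """Combine the four quantities as a product (an assumed combination rule)."""
    return behavior_weight * behavior_count * time_decay(t, **decay_params) * tf_weight


# A label tied to a behavior type of weight 2.0 that occurred 3 times,
# 10 time units ago, with a word-frequency weight of 0.5 (made-up values).
print(label_weight(2.0, 3, t=10, tf_weight=0.5, alpha=0.1))
```

Under this sketch, older behavior contributes less: the same label observed at t=0 has weight 3.0, while at t=10 with alpha=0.1 the weight is scaled by exp(-1).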
Optionally, the determining module 502 is specifically configured to:
obtaining the label of each piece of historical reading content in one or more of the following ways: web crawling, automatic generation, and acquiring manually entered labels.
Optionally, the determining module 502 is specifically configured to:
acquiring text content of each piece of historical reading content, and acquiring keywords from the text content;
determining the weight of each keyword by using a word frequency, maximum entropy and keyword extraction algorithm;
and taking the N keywords with the largest weight as the labels of the historical reading content, wherein N is an integer greater than or equal to 1.
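The automatic label generation steps above can be illustrated with a simplified sketch that weights keywords by relative word frequency only and takes the N heaviest as labels. It deliberately omits the maximum entropy component and any dedicated keyword extraction algorithm mentioned in the text; the English stopword list, the tokenization rule, and the function name `top_n_labels` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not part of the embodiment).
STOPWORDS = {"the", "a", "and", "of", "to", "in", "is"}


def top_n_labels(text, n=3):
    """Return the n keywords with the largest word-frequency weight."""
    if n < 1:
        raise ValueError("n must be an integer greater than or equal to 1")
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return []
    weights = {w: c / total for w, c in counts.items()}
    # Sort by descending weight; break ties alphabetically for determinism.
    return sorted(weights, key=lambda w: (-weights[w], w))[:n]


text = "The dragon and the knight. The dragon flies; the dragon sleeps in a cave."
print(top_n_labels(text, n=2))  # prints ['dragon', 'cave']
```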
Based on the same technical concept, an embodiment of the present application further provides a content recommendation apparatus, as shown in fig. 6, the apparatus may include:
a construction module 601, configured to construct a reading interest portrait of a user according to any embodiment of the user reading interest construction method;
a determining module 602, configured to determine a multivariate intelligent theoretical value of a user according to historical reading data of the user, where the multivariate intelligent theoretical value is used to reflect reading conditions of the user in multiple intelligent categories;
and the recommending module 603 is configured to determine content recommended for the user according to the reading interest portrait and the multivariate intelligent theoretical value.
Optionally, the determining module 602 is specifically configured to:
obtaining a label of each piece of historical reading content in the historical reading data, and determining the intelligent category to which the content belongs according to the label;
counting the historical reading number of each intelligent category;
and generating a multivariate intelligent theoretical value according to the historical reading number of each intelligent category.
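The three steps above can be sketched as follows. The label-to-category mapping, the category names, and the choice to express the multivariate intelligent theoretical value as the share of historical reading in each category are assumptions made for illustration; the text only states that the value is generated from the historical reading number of each intelligent category.

```python
from collections import Counter

# Hypothetical mapping from content labels to intelligent categories.
LABEL_TO_CATEGORY = {
    "counting": "logical-mathematical",
    "puzzles": "logical-mathematical",
    "rhymes": "linguistic",
    "songs": "musical",
}


def multivariate_intelligence_value(history):
    """history: one list of labels per piece of historical reading content."""
    counts = Counter()
    for labels in history:
        # A piece of content counts once per category it belongs to.
        categories = {LABEL_TO_CATEGORY[lab] for lab in labels if lab in LABEL_TO_CATEGORY}
        counts.update(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Express each category as its share of the counted readings.
    return {category: n / total for category, n in counts.items()}


history = [["counting"], ["puzzles", "counting"], ["rhymes"], ["songs", "unknown"]]
print(multivariate_intelligence_value(history))
```

A recommender could use such a value, for example, to notice categories in which the user has read little; how the value is actually combined with the reading interest portrait is not specified in this passage.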
Optionally, the apparatus may further include a deep learning module 604, configured to perform deep learning according to the historical reading data;
the recommending module 603 is further configured to determine content recommended for the user according to the deep learning result.
Based on the same technical concept, an embodiment of the present application further provides a user portrait construction device. As shown in fig. 7, the device 700 includes: at least one processor 710, and a memory 720 communicatively coupled to the at least one processor 710;
the at least one processor 710 is configured to read a program in the memory, and to perform the following steps:
acquiring historical reading data of a user;
obtaining the label of each piece of historical reading content in the historical reading data, and determining the weight of each label;
filtering the tags according to the weights;
and constructing a reading interest portrait of the user according to the filtered tags.
Optionally, when determining the weight of each tag, the processor 710 is specifically configured to:
determining, for each label, a first weight of the corresponding user behavior type, the number of times the user behavior occurred, a time attenuation factor, and a second weight determined according to the word frequency;
and determining the weight of the label according to the first weight, the number of times the user behavior occurred, the time attenuation factor, and the second weight.
Optionally, the time attenuation factor is calculated by the following formula:
$N(t) = N_0 e^{-\alpha (t + l)}$
wherein $t$ represents the decay time, $N_0$ denotes the initial value of the attenuation, $\alpha$ denotes the attenuation constant, and $l$ denotes the amount of leftward shift.
Optionally, when the processor 710 obtains the tag of each piece of historical reading content, the processor is specifically configured to:
obtaining the label of each piece of historical reading content in one or more of the following ways: web crawling, automatic generation, and acquiring manually entered labels.
Optionally, when the processor 710 obtains the tag of each piece of historical reading content in an automatic generation manner, the processor is specifically configured to:
acquiring text content of each piece of historical reading content, and acquiring keywords from the text content;
determining the weight of each keyword by using a word frequency, maximum entropy and keyword extraction algorithm;
and taking the N keywords with the largest weight as the labels of the historical reading content, wherein N is an integer greater than or equal to 1.
Based on the same technical concept, an embodiment of the present application further provides a content recommendation device, as shown in fig. 8, where the device 800 includes: at least one processor 810, a memory 820 communicatively coupled to the at least one processor 810;
the at least one processor 810 is configured to read a program in the memory, and is configured to perform the following steps:
constructing a reading interest portrait of the user according to the user portrait construction method of any of the foregoing embodiments;
determining a multivariate intelligent theoretical value of a user according to historical reading data of the user, wherein the multivariate intelligent theoretical value is used for reflecting the reading condition of the user in multiple intelligent categories;
and determining the content recommended for the user according to the reading interest portrait and the multivariate intelligent theoretical value.
Optionally, when determining the multivariate intelligent theoretical value of the user according to the historical reading data, the processor 810 is specifically configured to:
obtaining a label of each piece of historical reading content in the historical reading data, and determining the intelligent category to which the content belongs according to the label;
counting the historical reading number of each intelligent category;
and generating a multivariate intelligent theoretical value according to the historical reading number of each intelligent category.
Optionally, the processor 810 is further configured to:
and performing deep learning according to the historical reading data, and determining the content recommended for the user according to the deep learning result.
Based on the same technical concept, embodiments of the present application further provide a computer-readable storage medium, where instructions are stored on the computer-readable storage medium, and when executed by a processor, the instructions may implement the user portrait construction method or the content recommendation method.
In addition, other identical elements exist. As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (16)

1. A user portrait construction method, comprising:
obtaining historical reading data of a user;
obtaining the label of each piece of historical reading content in the historical reading data, and determining the weight of each label;
filtering the tags according to the weights;
constructing a reading interest portrait of the user according to the filtered tags;
determining a multivariate intelligent theoretical value of the user according to historical reading data of the user, wherein the multivariate intelligent theoretical value is used for reflecting the reading condition of the user in multiple intelligent categories;
and determining the content recommended for the user according to the reading interest portrait and the multivariate intelligent theoretical value.
2. The method of claim 1, wherein the determining the weight for each tag comprises:
determining, for each label, a first weight of the corresponding user behavior type, the number of times the user behavior occurred, a time attenuation factor, and a second weight determined according to the word frequency;
and determining the weight of the label according to the first weight, the number of times the user behavior occurred, the time attenuation factor, and the second weight.
3. The method of claim 2, wherein the time decay factor is calculated by the following equation:
$N(t) = N_0 e^{-\alpha (t + l)}$
wherein $t$ represents the decay time, $N_0$ denotes the initial value of the attenuation, $\alpha$ denotes the attenuation constant, and $l$ denotes the amount of leftward shift.
4. The method of claim 1, wherein obtaining the label of each piece of historical reading content comprises:
obtaining the label of each piece of historical reading content in one or more of the following ways: web crawling, automatic generation, and acquiring manually entered labels.
5. The method of claim 4, wherein the obtaining the label of each piece of historical reading content by automatic generation comprises:
acquiring text content of each piece of historical reading content, and acquiring keywords from the text content;
determining the weight of each keyword by using a word frequency, maximum entropy and keyword extraction algorithm;
and taking the N keywords with the largest weight as the labels of the historical reading content, wherein N is an integer greater than or equal to 1.
6. The method of claim 1, wherein said determining a multivariate intelligent theoretical value of the user according to the historical reading data comprises:
obtaining a label of each piece of historical reading content in the historical reading data, and determining the intelligent category to which the content belongs according to the label;
counting the historical reading number of each intelligent category;
and generating a multivariate intelligent theoretical value according to the historical reading number of each intelligent category.
7. The method of claim 1, further comprising:
and performing deep learning according to the historical reading data, and determining the content recommended for the user according to the deep learning result.
8. A user portrait construction apparatus, comprising:
the acquisition module is used for acquiring historical reading data of a user;
the determining module is used for acquiring the label of each piece of historical reading content in the historical reading data and determining the weight of each label;
the filtering module is used for filtering the labels according to the weight;
the construction module is used for constructing a reading interest portrait of the user according to the filtered tags;
the determining module is used for determining a multivariate intelligent theoretical value of the user according to historical reading data of the user, wherein the multivariate intelligent theoretical value is used for reflecting the reading condition of the user in multiple intelligent categories;
and the recommending module is used for determining the content recommended for the user according to the reading interest portrait and the multivariate intelligent theoretical value.
9. The apparatus of claim 8, wherein the determination module is specifically configured to:
determining, for each label, a first weight of the corresponding user behavior type, the number of times the user behavior occurred, a time attenuation factor, and a second weight determined according to the word frequency;
and determining the weight of the label according to the first weight, the number of times the user behavior occurred, the time attenuation factor, and the second weight.
10. The apparatus of claim 9, wherein the time decay factor is calculated by the following equation:
$N(t) = N_0 e^{-\alpha (t + l)}$
wherein $t$ represents the decay time, $N_0$ denotes the initial value of the attenuation, $\alpha$ denotes the attenuation constant, and $l$ denotes the amount of leftward shift.
11. The apparatus of claim 8, wherein the determination module is specifically configured to:
obtaining the label of each piece of historical reading content in one or more of the following ways: web crawling, automatic generation, and acquiring manually entered labels.
12. The apparatus of claim 11, wherein the determination module is specifically configured to:
acquiring text content of each piece of historical reading content, and acquiring keywords from the text content;
determining the weight of each keyword by using a word frequency, maximum entropy and keyword extraction algorithm;
and taking N keywords with the largest weight as the labels of the historical reading contents, wherein N is an integer greater than or equal to 1.
13. The apparatus of claim 8, wherein the determination module is specifically configured to:
obtaining a label of each piece of historical reading content in the historical reading data, and determining the intelligent category to which the content belongs according to the label;
counting the historical reading number of each intelligent category;
and generating a multivariate intelligent theoretical value according to the historical reading number of each intelligent category.
14. The apparatus of claim 8, further comprising a deep learning module to perform deep learning based on the historical reading data;
and the recommending module is also used for determining the recommended content for the user according to the deep learning result.
15. A content recommendation device, characterized by comprising: at least one processor, a memory communicatively coupled to the at least one processor;
the at least one processor is configured to read a program in the memory for performing the method of any of claims 1-7.
16. A computer-readable storage medium having stored thereon instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1-7.
CN201910951783.4A 2019-10-08 2019-10-08 User portrait construction and content recommendation method, device and equipment Active CN110674410B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910951783.4A CN110674410B (en) 2019-10-08 2019-10-08 User portrait construction and content recommendation method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910951783.4A CN110674410B (en) 2019-10-08 2019-10-08 User portrait construction and content recommendation method, device and equipment

Publications (2)

Publication Number Publication Date
CN110674410A CN110674410A (en) 2020-01-10
CN110674410B true CN110674410B (en) 2022-05-24

Family

ID=69081051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910951783.4A Active CN110674410B (en) 2019-10-08 2019-10-08 User portrait construction and content recommendation method, device and equipment

Country Status (1)

Country Link
CN (1) CN110674410B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581452B (en) * 2020-03-26 2023-10-17 浙江口碑网络技术有限公司 Recommendation object data obtaining method and device and electronic equipment
CN111931022A (en) * 2020-06-10 2020-11-13 北京雅邦网络技术发展有限公司 AI hot spot content intelligent editing system
CN111753199A (en) * 2020-06-22 2020-10-09 北京百度网讯科技有限公司 User portrait construction method and device, electronic device and medium
CN111935265A (en) * 2020-08-03 2020-11-13 腾讯科技(深圳)有限公司 Media information processing method and device
CN114077713A (en) * 2020-08-11 2022-02-22 华为技术有限公司 Content recommendation method, electronic device and server
CN112182153B (en) * 2020-09-24 2024-03-08 武汉大学 Reading content theme recombination frame generation method and device
CN112487285A (en) * 2020-11-18 2021-03-12 中国人寿保险股份有限公司 Message pushing method and device
CN112632389B (en) * 2020-12-30 2024-03-15 广州博冠信息科技有限公司 Information processing method, information processing apparatus, storage medium, and electronic device
CN113076487B (en) * 2021-04-30 2024-03-08 北京爱奇艺科技有限公司 User interest characterization and content recommendation method, device and equipment
CN113610680A (en) * 2021-08-17 2021-11-05 山西传世科技有限公司 AI-based interactive reading material personalized recommendation method and system
CN113688626A (en) * 2021-09-02 2021-11-23 北京方正阿帕比技术有限公司 Method for extracting reader interest tag
CN115796607A (en) * 2023-01-30 2023-03-14 国网山西省电力公司营销服务中心 Acquisition terminal security portrait assessment method based on power consumption information analysis

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929959A (en) * 2012-10-10 2013-02-13 杭州东信北邮信息技术有限公司 Book recommendation method based on user actions
CN105930507A (en) * 2016-05-10 2016-09-07 腾讯科技(深圳)有限公司 Method and apparatus for obtaining Web browsing interest of user
CN106503015A (en) * 2015-09-07 2017-03-15 国家计算机网络与信息安全管理中心 A kind of method for building user's portrait
CN107045533A (en) * 2017-01-20 2017-08-15 广东技术师范学院天河学院 Educational resource based on label recommends method and system
CN107437215A (en) * 2017-08-02 2017-12-05 杭州东信北邮信息技术有限公司 A kind of book recommendation method based on label
CN108021929A (en) * 2017-11-16 2018-05-11 华南理工大学 Mobile terminal electric business user based on big data, which draws a portrait, to establish and analysis method and system
CN108280114A (en) * 2017-07-28 2018-07-13 淮阴工学院 A kind of user's literature reading interest analysis method based on deep learning
CN110209875A (en) * 2018-07-03 2019-09-06 腾讯科技(深圳)有限公司 User content portrait determines method, access object recommendation method and relevant apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102760163B (en) * 2012-06-12 2015-04-29 北京奇虎科技有限公司 Personalized recommendation method and device of characteristic information
US20140280241A1 (en) * 2013-03-15 2014-09-18 MediaGraph, LLC Methods and Systems to Organize Media Items According to Similarity

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Exploring the variety of parental talk during shared book reading and its contributions to preschool language and literacy: evidence from the Early Childhood Longitudinal Study-Birth Cohort;Annemarie H. Hindman et al.;《Reading and Writing》;20130417;第27卷;287-313 *
Quest: An Adaptive Framework for User Profile Acquisition from Social Communities of Interest;Nima Dokoohaki et al.;《2010 International Conference on Advances in Social Networks Analysis and Mining》;20100907;360-364 *
基于主题模型的用户兴趣建模及在新闻推荐中的应用;陈铭权;《中国优秀硕士学位论文全文数据库 信息科技辑》;20151215(第12期);I138-946 *
基于用户兴趣变化的数字图书馆知识推荐服务研究;曾子明 等;《图书馆论坛》;20160110;第36卷(第1期);94-99 *

Also Published As

Publication number Publication date
CN110674410A (en) 2020-01-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 100000 Room D529, No. 501, Floor 5, Building 2, Fourth District, Wangjing Dongyuan, Chaoyang District, Beijing

Patentee after: Beijing Wuling Technology Co.,Ltd.

Address before: 100000 room 06, 2163, 13 / F, building 523, Wangjing Dongyuan, Chaoyang District, Beijing

Patentee before: Beijing Wuling Technology Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20230104

Address after: 100000 Room 815, Floor 8, Building 6, Yard 33, Guangshun North Street, Chaoyang District, Beijing

Patentee after: Luka (Beijing) Intelligent Technology Co.,Ltd.

Address before: 100000 Room D529, No. 501, Floor 5, Building 2, Fourth District, Wangjing Dongyuan, Chaoyang District, Beijing

Patentee before: Beijing Wuling Technology Co.,Ltd.