CN114880572B - Intelligent news client recommendation system - Google Patents

Publication number
CN114880572B
Authority
CN
China
Prior art keywords
interest
retrieval
features
point
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210564514.4A
Other languages
Chinese (zh)
Other versions
CN114880572A (en)
Inventor
郑创伟
符捷雯
陈义飞
金勇�
谢志成
王泳
陈少彬
刑谷涛
罗佩珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Creative Intelligence Port Technology Co ltd
Original Assignee
Shenzhen Creative Intelligence Port Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Creative Intelligence Port Technology Co ltd
Priority to CN202210564514.4A
Publication of CN114880572A
Application granted
Publication of CN114880572B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of content recommendation, and in particular to an intelligent news client recommendation system. The system comprises a local end and a server end. The local end comprises a user interest bubble construction unit for constructing interest bubbles of a user based on preset configuration information; each interest bubble corresponds to a primary category, which is a defined user interest category; each primary category comprises a plurality of different secondary categories; and each interest bubble comprises an interest center and a plurality of interest category sets. By establishing interest bubbles, driving the motion of the interest bubbles through the behaviors of the user to update the analysis of the user's interests, and generating the recommended content through retrieval features, the invention makes content recommendation intelligent without depending on the behavior of the user. Based on the interest bubbles, primary and secondary content can be distinguished more clearly during feature retrieval, which improves retrieval efficiency.

Description

Intelligent news client recommendation system
Technical Field
The invention belongs to the technical field of content recommendation, and particularly relates to an intelligent news client recommendation system.
Background
The personalized recommendation system is a tool for helping users to quickly find useful information, and can provide personalized services for different users so as to meet specific interests and requirements of the users. Unlike search engines, recommendation systems do not require users to provide explicit needs, but rather model the interests of users by analyzing their historical behavior and proactively recommend to users information that can satisfy their interests and needs based on this.
Personalized recommendation systems are used on many kinds of internet websites, including e-commerce, film and video, music, and social networks. Sites such as Taobao and Amazon apply personalized recommendation models such as collaborative filtering to predict and recommend the commodities a user may be interested in. Collaborative filtering (CF) recommends items or information of interest to a user by using the preferences of a community with shared interests and common experience.
A personalized news recommendation system recommends news of interest to a user according to the user's interest characteristics and behaviors. Personalized news recommendation is an extension of personalized recommendation to the field of news processing: news is automatically recommended to interested users through a recommendation system, so that both the news website and its users benefit. Such a system helps a user easily acquire interesting news from the massive information on the internet and surfaces content the user may be interested in.
At present, the most widely applied collaborative filtering recommendation technology has two modes: user-based collaborative filtering and item-based collaborative filtering. The former mainly comprises three steps: representing the user behavior data; finding the users most similar to the target user with a user similarity calculation method; and predicting the target user's behavior toward items from the behavior of the similar users toward those items, then recommending accordingly. The latter also comprises three steps: representing the item behavior data; calculating the similarity between items with an item similarity calculation method; and recommending to the user the items most similar to the items on which the user has acted.
These methods always rely on user similarity and item similarity, and errors in the similarity calculation easily affect the final recommendation result. Moreover, building a portrait of the target user requires collecting and calling a large amount of user data, which is inefficient on the one hand and requires acquiring more user permissions on the other.
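The three steps of user-based collaborative filtering described above can be sketched as follows. This is a minimal illustration, not taken from the patent: the `ratings` data, the function names, and the choice of cosine similarity are all assumptions.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse {item: score} dicts."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(target, ratings, k=2, n=3):
    """Score items the target has not seen by the similarity-weighted
    behavior of the k most similar users, and return the top n."""
    neighbours = sorted(
        ((cosine(ratings[target], r), name)
         for name, r in ratings.items() if name != target),
        reverse=True,
    )[:k]
    scores = {}
    for sim, name in neighbours:
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, value in ratings[name].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * value
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]

# Step 1: user behavior data representation (hypothetical data).
ratings = {
    "alice": {"sports": 1.0, "finance": 1.0},
    "bob": {"sports": 1.0, "finance": 1.0, "tech": 1.0},
    "carol": {"travel": 1.0},
}
# Steps 2 and 3: find similar users, then predict and recommend.
print(recommend("alice", ratings))
```

Because every recommendation depends on the similarity scores, any error in `cosine` propagates directly to the output, which is the weakness the background section points out.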
Disclosure of Invention
In view of this, the main object of the present invention is to provide an intelligent news client recommendation system in which interest bubbles are created, the motion of the interest bubbles is driven by the behaviors of the user to update the analysis of the user's interests, and the recommended content is generated through retrieval features. Content recommendation thereby becomes intelligent without relying on the user behavior, primary and secondary content are better distinguished during feature retrieval based on the interest bubbles, and retrieval efficiency is improved.
In order to achieve the purpose, the technical scheme of the invention is realized as follows:
An intelligent news client recommendation system comprises a local end and a server end.
The local end comprises:
a user interest bubble construction unit, configured to construct the user's interest bubbles based on preset configuration information, wherein each interest bubble corresponds to a primary category, the primary category is a defined user interest category, each primary category comprises a plurality of different secondary categories, each interest bubble comprises an interest center and a plurality of interest category sets, the interest category sets float around the interest center, and the Euclidean distances between the interest category sets and the interest center are equal set values;
a user interest path establishing unit, configured to acquire the complete behavior path of the user within a set time range, the complete behavior path being defined as the starting point, intermediate point and end point of the content browsed by the user within the set time range;
a primary user interest map building unit, configured to perform primary classification on the starting point, intermediate point and end point in the complete behavior path, find the starting point, intermediate point and end point whose classification level is primary, find the interest bubbles corresponding to them, count the number of starting points, intermediate points and end points belonging to the same category together with their positions in the path, calculate first weight values of the starting point, intermediate point and end point using a preset first interest weight calculation model, and push the interest bubbles toward the interest center based on the calculated weight values;
a secondary user interest map building unit, configured to perform secondary classification on the starting point, intermediate point and end point in the complete behavior path, find the starting point, intermediate point and end point whose classification level is secondary, assign them to the interest bubbles of the primary categories to which they belong, and use an interest retrieval feature generation model, based on their secondary categories, to generate the retrieval features corresponding to each interest bubble.
The server end comprises:
a content database, configured to store content;
a retrieval unit, configured to call the retrieval features in order from near to far according to the distance between the interest bubble and the interest center, perform feature retrieval in the content database, and find the content matching the retrieval features;
a content presentation unit, configured to send the retrieved and matched content to the client for presentation.
Further, the first interest weight calculation model is represented by the following formula:
[Formula given as an image in the original publication; not recoverable from the text.]
The formula uses the following quantities:
the weight value;
the number of starting points, intermediate points or end points belonging to the same primary category;
the total number of starting points, intermediate points and end points;
the separation distance between a starting point, intermediate point or end point belonging to one category and the starting points, intermediate points or end points of other categories, the separation distance being defined as the number of points lying between that point and the points of different categories;
the initial weight value, which is a set value in the range 100 to 300.
Further, the method for generating the search feature by the interest search feature generation model comprises the following steps: extracting category keywords corresponding to each secondary category; the category keywords are label keywords added during generation of the secondary category; preprocessing each tag keyword in the category keywords and converting the preprocessed tag keywords into word sequences; determining a word vector of each word, and calculating a tag keyword vector of each tag keyword; clustering the label keyword vectors, and dividing the category keywords into a plurality of label keyword subsets; and extracting retrieval features according to the divided tag keyword subsets.
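The clustering step, which divides the category keywords into tag keyword subsets, can be illustrated with a short sketch. The patent does not name a clustering algorithm, so the greedy single-pass scheme, the cosine threshold of 0.8, and the two-dimensional example vectors below are all assumptions made for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster_keywords(vectors, threshold=0.8):
    """Greedy single-pass clustering: a keyword joins the first cluster
    whose seed vector is similar enough, otherwise it starts a new one."""
    clusters = []  # list of (seed_vector, [keywords])
    for keyword, vec in vectors.items():
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(keyword)
                break
        else:
            clusters.append((vec, [keyword]))
    return [members for _, members in clusters]

# Hypothetical tag keyword vectors for two secondary categories.
vectors = {
    "football": [0.9, 0.1],
    "basketball": [0.8, 0.2],
    "stocks": [0.1, 0.9],
    "bonds": [0.2, 0.8],
}
subsets = cluster_keywords(vectors)
print(subsets)
```

Each resulting subset groups keywords that point in a similar direction, and a retrieval feature could then be extracted per subset, for example from its member keywords.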
Further, preprocessing each tag keyword in the category keywords and converting it into a word sequence comprises: for English tag keywords, judging whether a space exists between every two words and, if so, splitting the keyword into words at the spaces and adding the words to the sequence; for Chinese tag keywords, converting the keyword into a word sequence through word segmentation and/or stop-word removal.
Further, determining a word vector of each word and calculating a tag keyword vector of each tag keyword comprises: determining the word vector of each word; and calculating the tag keyword vector of each tag keyword from the word vectors of its words.
Further, the method for the retrieval unit to sequentially call the retrieval features from near to far according to the distance between the interest bubble and the interest center to perform feature retrieval in the content database and find the content matched with the retrieval comprises the following steps: acquiring retrieval characteristics; extracting core features of the retrieval features from the retrieval features by using a convolutional neural network model, wherein the convolutional neural network model is obtained by training based on historical retrieval features and a training set of historical retrieval data; and retrieving target content of which the core features are matched with the core features of the retrieval features based on the extracted core features of the retrieval features.
Further, the retrieving, based on the extracted core features of the retrieval features, target content whose core features match with the core features of the retrieval features includes: determining a hash bucket mapped by the core feature of the retrieval feature through a hash function; determining the content corresponding to each element existing in the hash bucket as the target content; the existing elements in the hash bucket are obtained by mapping the core features of each content through the hash function in advance, and the core features of each content are extracted from each content through the convolutional neural network model.
Further, the extracting core features of the search features from the search features by using a convolutional neural network model includes: and performing dimensionality reduction on the core features extracted from the retrieval features by using the convolutional neural network model, and taking the core features obtained after dimensionality reduction as the core features of the retrieval features.
Further, the retrieving the target content whose core features are matched with the core features of the retrieval features based on the extracted core features of the retrieval features specifically includes: retrieving target content of which the core features are matched with the core features of the retrieval features from a content retrieval database based on the extracted core features of the retrieval features; the content retrieval database establishes indexes for core features in a mode of combining a locality sensitive hashing algorithm and a distributed system.
Further, when extracting the category keywords corresponding to each secondary category, extracting the category keywords according to the order from near to far of the distance from the primary category to which each secondary category belongs to the interest center.
The intelligent news client recommendation system has the following beneficial effects:
When pushing content, the invention does not construct a user portrait for each user; instead, it constructs interest bubbles from a single session of user behavior. Unlike the prior art, this behavior-based construction performs horizontal and vertical classification over one complete behavior chain of the user, so that content is pushed more accurately and both the efficiency and the accuracy of content pushing are improved.
Drawings
Fig. 1 is a schematic system structure diagram of an intelligent news client recommendation system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram illustrating an interest bubble and an interest center of an intelligent news client recommendation system according to an embodiment of the present invention;
Fig. 3 is a diagram of an intelligent news client recommendation system according to an embodiment of the present invention.
Detailed Description
The method of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments of the invention.
Example 1
As shown in Fig. 1, the intelligent news client recommendation system comprises a local end and a server end.
The local end comprises:
a user interest bubble construction unit, configured to construct the user's interest bubbles based on preset configuration information, wherein each interest bubble corresponds to a primary category, the primary category is a defined user interest category, each primary category comprises a plurality of different secondary categories, each interest bubble comprises an interest center and a plurality of interest category sets, the interest category sets float around the interest center, and the Euclidean distances between the interest category sets and the interest center are equal set values;
a user interest path establishing unit, configured to acquire the complete behavior path of the user within a set time range, the complete behavior path being defined as the starting point, intermediate point and end point of the content browsed by the user within the set time range;
a primary user interest map building unit, configured to perform primary classification on the starting point, intermediate point and end point in the complete behavior path, find the starting point, intermediate point and end point whose classification level is primary, find the interest bubbles corresponding to them, count the number of starting points, intermediate points and end points belonging to the same category together with their positions in the path, calculate first weight values of the starting point, intermediate point and end point using a preset first interest weight calculation model, and push the interest bubbles toward the interest center based on the calculated weight values;
a secondary user interest map building unit, configured to perform secondary classification on the starting point, intermediate point and end point in the complete behavior path, find the starting point, intermediate point and end point whose classification level is secondary, assign them to the interest bubbles of the primary categories to which they belong, and use an interest retrieval feature generation model, based on their secondary categories, to generate the retrieval features corresponding to each interest bubble.
The server end comprises:
a content database, configured to store content;
a retrieval unit, configured to call the retrieval features in order from near to far according to the distance between the interest bubble and the interest center, perform feature retrieval in the content database, and find the content matching the retrieval features;
a content presentation unit, configured to send the retrieved and matched content to the client for presentation.
Specifically, taking a recommendation method based on click rate estimation as an example, a deep network model is set in the server. For each pair of 'user-content' combinations in the candidate content set, predicting the clicking probability of the user on the content by the deep network model according to the historical clicking behaviors of the user, the semantic features and the context features of the content; then, for the content to be recommended of a certain user, recommending the content ranked at the top n as an information stream to the user according to the sequence from high click probability to low click probability.
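The final ranking step of the click-rate example can be sketched in a few lines. The deep network itself is out of scope here; the `click_probability` dictionary stands in for its output, and the content identifiers and scores are hypothetical.

```python
def top_n(click_probability, n):
    """Return the n content ids with the highest predicted click
    probability, highest first; ties are broken by content id."""
    return sorted(
        click_probability,
        key=lambda content_id: (-click_probability[content_id], content_id),
    )[:n]

# Hypothetical model output for one user's candidate content set.
click_probability = {"a": 0.12, "b": 0.57, "c": 0.33, "d": 0.57}
feed = top_n(click_probability, 2)
print(feed)
```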
In the related art, a recommendation algorithm usually selects the information to push according to the interests of the target user and judges the user's degree of interest in the information by analyzing its content. Such methods neglect the user's need to follow current hot events, read high-quality niche content, and so on, and usually suffer from low accuracy.
Example 2
On the basis of the above embodiment, the first interest weight calculation model is represented by the following formula:
[Formula given as an image in the original publication; not recoverable from the text.]
The formula uses the following quantities:
the weight value;
the number of starting points, intermediate points or end points belonging to the same primary category;
the total number of starting points, intermediate points and end points;
the separation distance between a starting point, intermediate point or end point belonging to one category and the starting points, intermediate points or end points of other categories, the separation distance being defined as the number of points lying between that point and the points of different categories;
the initial weight value, which is a set value in the range 100 to 300.
Referring to Figs. 2 and 3, the letter symbols in Fig. 2 denote a plurality of interest bubbles. The distance between an interest bubble and the interest center may be positive or negative; when it is negative, its absolute value is taken.
FIG. 3 shows the jump chains at each point in the secondary classification.
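The quantities that enter the first interest weight calculation model can be gathered from a behavior path as sketched below. The formula itself is not evaluated, because it is only available as an image. The function name, the example path, and in particular the reading of "separation distance" as the number of points between a point and its nearest point of a different category are assumptions.

```python
from collections import Counter

def weight_inputs(path):
    """path: ordered primary-category labels of the starting point,
    intermediate points and end point of one behavior path."""
    total = len(path)        # total number of points
    counts = Counter(path)   # number of points per primary category
    # Assumed reading of "separation distance": for each point, the
    # number of points lying strictly between it and the nearest point
    # of a different category (None if no other category occurs).
    separation = []
    for i, label in enumerate(path):
        gaps = [abs(i - j) - 1 for j, other in enumerate(path) if other != label]
        separation.append(min(gaps) if gaps else None)
    return counts, total, separation

# Hypothetical path: two sports articles, one finance article, one sports article.
path = ["sports", "sports", "finance", "sports"]
counts, total, separation = weight_inputs(path)
print(counts, total, separation)
```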
Example 3
On the basis of the previous embodiment, the method for generating the search feature by the interest search feature generation model comprises the following steps: extracting category keywords corresponding to each secondary category; the category keywords are label keywords added during generation of the secondary category; preprocessing each label keyword in the category keywords and converting the preprocessed label keywords into a word sequence; determining a word vector of each word, and calculating a tag keyword vector of each tag keyword; clustering the label keyword vectors, and dividing the category keywords into a plurality of label keyword subsets; and extracting retrieval features according to the divided tag keyword subsets.
In particular, conventional methods typically include the steps of text localization, preprocessing (typically normalization, enhancement and binarization), and OCR character recognition. Each step involves many other complex methods, and each affects the accuracy of the final recognition result. Chen's paper "Automatic detection and recognition of signs from natural scenes" proposes a method for detecting and recognizing signs in images of natural scenes. The method detects text using LoG (Laplacian of Gaussian) edge detection, color modeling, layout analysis and affine correction, then normalizes the text, and finally recognizes it using gray-level-based OCR (optical character recognition). Koga's paper "Camera-based Kanji OCR for mobile-phones: practical issues" proposes a camera-based Chinese character (Kanji) recognition method for mobile phones. The first part of the method comprises four steps: pre-binarization, rough layout analysis, line direction detection and line segmentation. The latter part also comprises four steps: fine binarization, pre-segmentation, Chinese character recognition and post-processing. In such OCR-based methods, recognition accuracy is closely tied to text localization and to the quality of the enhanced image.
Example 4
On the basis of the previous embodiment, preprocessing each tag keyword in the category keywords and converting it into a word sequence comprises: for English tag keywords, judging whether a space exists between every two words and, if so, splitting the keyword into words at the spaces and adding the words to the sequence; for Chinese tag keywords, converting the keyword into a word sequence through word segmentation and/or stop-word removal.
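A minimal sketch of this preprocessing step follows. It is an illustration under stated assumptions: the function name is hypothetical, the Chinese branch uses one token per character as a crude stand-in for a real word segmenter, and the optional stop-word filter is likewise an assumption.

```python
import re

# Basic CJK Unified Ideographs block; enough for this illustration.
CJK = re.compile(r"[\u4e00-\u9fff]")

def to_word_sequence(keyword, stop_words=frozenset()):
    """Convert one tag keyword into a list of words."""
    if CJK.search(keyword):
        # Stand-in for a Chinese word segmenter: one token per character.
        tokens = CJK.findall(keyword)
    else:
        # English: split wherever whitespace separates two words.
        tokens = keyword.split()
    return [t for t in tokens if t not in stop_words]

print(to_word_sequence("machine learning"))
print(to_word_sequence("体育的新闻", stop_words={"的"}))
```

In practice the per-character split would be replaced by a dictionary-based or statistical segmenter, since Chinese words usually span several characters.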
Example 5
On the basis of the previous embodiment, determining a word vector of each word and calculating a tag keyword vector of each tag keyword comprises: determining the word vector of each word; and calculating the tag keyword vector of each tag keyword from the word vectors of its words.
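One simple way to derive a tag keyword vector from word vectors is to average them, as sketched below. The patent does not state how the word vectors are combined, so averaging is an assumption, and the `word_vectors` table is hypothetical (in practice it would come from a trained embedding model).

```python
def keyword_vector(words, word_vectors):
    """Average the vectors of the words that have a known embedding.
    Returns an empty list when none of the words is known."""
    known = [word_vectors[w] for w in words if w in word_vectors]
    if not known:
        return []
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Hypothetical two-dimensional word vectors.
word_vectors = {"machine": [1.0, 0.0], "learning": [0.0, 1.0]}
print(keyword_vector(["machine", "learning"], word_vectors))
```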
Example 6
On the basis of the previous embodiment, the method for the retrieval unit to sequentially call the retrieval features from near to far according to the distance between the interest bubble and the interest center to perform feature retrieval in the content database, and finding the content matched with the retrieval comprises the following steps: acquiring retrieval characteristics; extracting core features of the retrieval features from the retrieval features by using a convolutional neural network model, wherein the convolutional neural network model is obtained by training based on historical retrieval features and a training set of historical retrieval data; and retrieving target content of which the core features are matched with the core features of the retrieval features based on the extracted core features of the retrieval features.
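The near-to-far calling order can be sketched as a sort on the absolute distance between each interest bubble and the interest center. The data layout, names and values below are hypothetical; the convolutional feature extraction is not shown.

```python
def retrieval_order(bubbles):
    """bubbles maps a primary category to a pair
    (signed distance to the interest center, retrieval features).
    Returns (category, features) pairs, nearest bubble first."""
    ranked = sorted(bubbles.items(), key=lambda kv: abs(kv[1][0]))
    return [(name, features) for name, (_, features) in ranked]

# Hypothetical bubbles; a negative distance is compared by absolute value.
bubbles = {
    "finance": (-1.5, ["stocks"]),
    "sports": (0.5, ["football"]),
    "travel": (3.0, ["hotels"]),
}
order = [name for name, _ in retrieval_order(bubbles)]
print(order)
```

The retrieval unit would then query the content database with each bubble's features in this order, so content for the interests closest to the center is found first.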
Example 7
On the basis of the above embodiment, the retrieving, based on the extracted core features of the retrieval features, target content whose core features match with the core features of the retrieval features includes: determining a hash bucket mapped by the core feature of the retrieval feature through a hash function; determining the content corresponding to each element existing in the hash bucket as the target content; the existing elements in the hash bucket are obtained by mapping the core features of each content through the hash function in advance, and the core features of each content are extracted from each content through the convolutional neural network model.
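The hash-bucket lookup described here can be illustrated as follows. The patent does not specify the hash function, so the coordinate quantization in `bucket_key`, the bucket width, and the example feature vectors are assumptions chosen to keep the sketch small.

```python
from collections import defaultdict

def bucket_key(feature, width=0.5):
    """Hypothetical hash function: quantize each coordinate so that
    nearby feature vectors map to the same bucket key."""
    return tuple(int(x // width) for x in feature)

def build_buckets(content_features):
    """Map the core feature of every content item into its bucket in advance."""
    buckets = defaultdict(list)
    for content_id, feature in content_features.items():
        buckets[bucket_key(feature)].append(content_id)
    return buckets

def lookup(buckets, query_feature):
    """Return every content id stored in the bucket the query maps to."""
    return buckets.get(bucket_key(query_feature), [])

# Hypothetical core features, as if produced by the feature extractor.
content_features = {
    "news-1": [0.1, 0.9],
    "news-2": [0.2, 0.8],
    "news-3": [0.9, 0.1],
}
buckets = build_buckets(content_features)
print(lookup(buckets, [0.3, 0.7]))
```

The lookup touches only one bucket, so its cost does not grow with the size of the content database; the trade-off is that similar items that fall just across a bucket boundary are missed.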
Example 8
On the basis of the above embodiment, the extracting core features of the search features from the search features by using a convolutional neural network model includes: and performing dimension reduction on the core features extracted from the retrieval features by using the convolutional neural network model, and taking the core features obtained after the dimension reduction as the core features of the retrieval features.
Example 9
On the basis of the previous embodiment, the retrieving, based on the extracted core feature of the retrieval feature, target content of which the core feature is matched with the core feature of the retrieval feature specifically includes: based on the extracted core features of the retrieval features, retrieving target content of which the core features are matched with the core features of the retrieval features from a content retrieval database; the content retrieval database establishes indexes for core features in a mode of combining a locality sensitive hashing algorithm and a distributed system.
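A sketch of an index that combines locality-sensitive hashing with a distributed layout is given below. It is an assumption-laden illustration: it uses sign-of-projection (random hyperplane) LSH with hyperplanes fixed by hand for reproducibility, and each Python dictionary stands in for one node of a distributed system. None of these choices is stated in the patent.

```python
# Fixed here for reproducibility; normally drawn at random.
HYPERPLANES = [[1.0, -1.0], [1.0, 1.0], [0.0, 1.0]]

def signature(feature):
    """Sign-of-projection LSH: one bit per hyperplane."""
    bits = 0
    for plane in HYPERPLANES:
        dot = sum(p * x for p, x in zip(plane, feature))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

class ShardedIndex:
    """LSH buckets spread over several shards (stand-ins for nodes)."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard(self, sig):
        # Route a signature to a shard by simple modulo placement.
        return self.shards[sig % len(self.shards)]

    def add(self, content_id, feature):
        sig = signature(feature)
        self._shard(sig).setdefault(sig, []).append(content_id)

    def query(self, feature):
        sig = signature(feature)
        return self._shard(sig).get(sig, [])

index = ShardedIndex(num_shards=2)
index.add("news-1", [0.9, 0.1])
index.add("news-2", [0.8, 0.3])
index.add("news-3", [-0.2, 0.9])
print(index.query([0.7, 0.2]))
```

Routing by signature means a query contacts a single shard, which is one plausible reason to pair LSH with a distributed store.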
Example 10
On the basis of the previous embodiment, when extracting the category keywords corresponding to each secondary category, the extraction is performed according to the order from near to far of the distance from the primary category to which each secondary category belongs to the interest center.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
It should be noted that, the system provided in the foregoing embodiment is only illustrated by dividing the functional units, and in practical applications, the functions may be distributed by different functional units according to needs, that is, the units or steps in the embodiments of the present invention are further decomposed or combined, for example, the units in the foregoing embodiment may be combined into one unit, or may be further separated into multiple sub-units, so as to complete the functions of the whole unit or the unit described above. The names of the units and steps involved in the embodiments of the present invention are only for distinguishing the units or steps, and are not to be construed as unduly limiting the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage module and the processing module described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Those skilled in the art will appreciate that the various illustrative units and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both, and that the programs corresponding to the units and method steps may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. To clearly illustrate this interchangeability of electronic hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as electronic hardware or software depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or unit/module that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or unit/module.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but those skilled in the art will readily understand that the scope of the present invention is not limited to these specific embodiments. Those skilled in the art may make equivalent changes or substitutions to the related technical features without departing from the principle of the invention, and the technical solutions resulting from such changes or substitutions fall within the protection scope of the invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (10)

1. An intelligent news client recommendation system, comprising a local end and a server end, characterized in that the local end comprises:
a user interest bubble building unit configured to build interest bubbles of a user based on preset configuration information, wherein each interest bubble corresponds to one primary category, a primary category being a defined user interest category that comprises a plurality of different secondary categories; each interest bubble comprises an interest center and a plurality of interest category sets; the interest category sets surround the interest center in the form of floating interest sets; and the Euclidean distances between the interest centers are equal set values;
a user interest path establishing unit configured to acquire a complete behavior path of the user within a set time range, the complete behavior path of the user being defined as the starting point, intermediate point and end point of the content browsed by the user within the set time range;
a primary user interest map building unit configured to perform primary classification on the starting point, intermediate point and end point in the complete behavior path of the user; find the starting point, intermediate point and end point whose classification level is primary; find the interest bubbles corresponding to the starting point, intermediate point and end point; count the number of starting points, intermediate points and end points belonging to the same category and their positions in the path; calculate first weight values of the starting point, intermediate point and end point using a preset first interest weight calculation model; and push the interest bubbles toward the interest center based on the calculated weight values;
a secondary user interest map building unit configured to perform secondary classification on the starting point, intermediate point and end point in the complete behavior path of the user; find the starting point, intermediate point and end point whose classification level is secondary; assign the secondary starting point, intermediate point and end point to the interest bubbles of the primary categories to which they belong; and, based on the secondary categories of the secondary starting point, intermediate point and end point, use an interest retrieval feature generation model to generate the retrieval features corresponding to each interest bubble;
and the server end comprises: a content database configured to store content; a retrieval unit configured to call the retrieval features in order of the distance between the interest bubble and the interest center, from nearest to farthest, perform feature retrieval in the content database, and find the content matching the retrieval; and a content presentation unit configured to send the retrieved matching content to the client for presentation.
2. The system of claim 1, wherein the first interest weight calculation model is represented by a formula (the formula and its symbols are rendered as images in the original and are not reproduced in this text), in which the terms denote, in order: the weight value; the number of starting points, intermediate points or end points belonging to the same category of a primary classification; the total number of starting points, intermediate points and end points; the separation distance between a starting point, intermediate point or end point belonging to one category and the starting points, intermediate points or end points of other categories, the separation distance being defined as the number of points lying between that starting point, intermediate point or end point and the points of different categories; and the initial weight value, which is a set value in the range 100 to 300.
3. The system of claim 1, wherein the method by which the interest retrieval feature generation model generates the retrieval features comprises: extracting the category keywords corresponding to each secondary category, the category keywords being the tag keywords added when the secondary category was generated; preprocessing each tag keyword in the category keywords and converting it into a word sequence; determining a word vector of each word and calculating a tag keyword vector of each tag keyword; clustering the tag keyword vectors to divide the category keywords into a plurality of tag keyword subsets; and extracting the retrieval features from the divided tag keyword subsets.
4. The system of claim 3, wherein preprocessing each tag keyword in the category keywords and converting it into a word sequence comprises: for an English tag keyword, determining whether spaces exist between its words and, if so, splitting it into words at the spaces and adding the words to the sequence; for a Chinese tag keyword, converting it into a word sequence through word segmentation and/or word pause.
5. The system of claim 4, wherein determining a word vector of each word and calculating a tag keyword vector of each tag keyword comprises: determining a word vector of each word; and calculating the tag keyword vector of each tag keyword from the word vectors of its words.
6. The system of claim 1, wherein the retrieval unit sequentially retrieves the retrieval features from near to far according to the distance between the interest bubble and the interest center for feature retrieval in the content database, and the method for finding the content matching the retrieval comprises: acquiring retrieval characteristics; extracting core features of the retrieval features from the retrieval features by using a convolutional neural network model, wherein the convolutional neural network model is obtained by training based on historical retrieval features and a training set of historical retrieval data; and retrieving target content of which the core features are matched with the core features of the retrieval features based on the extracted core features of the retrieval features.
7. The system of claim 6, wherein retrieving target content whose core features match the core features of the retrieved features based on the extracted core features of the retrieved features comprises: determining a hash bucket mapped by the core feature of the retrieval feature through a hash function; determining the content corresponding to each element existing in the hash bucket as the target content; the existing elements in the hash bucket are obtained by mapping the core features of each content through the hash function in advance, and the core features of each content are extracted from each content through the convolutional neural network model.
8. The system of claim 7, wherein said extracting core features of said search features from said search features using a convolutional neural network model comprises: and performing dimensionality reduction on the core features extracted from the retrieval features by using the convolutional neural network model, and taking the core features obtained after dimensionality reduction as the core features of the retrieval features.
9. The system according to claim 8, wherein the retrieving the target content whose core feature matches the core feature of the retrieval feature based on the extracted core feature of the retrieval feature specifically comprises: based on the extracted core features of the retrieval features, retrieving target content of which the core features are matched with the core features of the retrieval features from a content retrieval database; the content retrieval database establishes indexes for core features in a mode of combining a locality sensitive hashing algorithm and a distributed system.
10. The system of claim 3, wherein in extracting the category keyword corresponding to each secondary category, the extraction is performed in order from near to far of the distance from the interest center to the primary category to which each secondary category belongs.
CN202210564514.4A 2022-05-23 2022-05-23 Intelligent news client recommendation system Active CN114880572B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210564514.4A CN114880572B (en) 2022-05-23 2022-05-23 Intelligent news client recommendation system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210564514.4A CN114880572B (en) 2022-05-23 2022-05-23 Intelligent news client recommendation system

Publications (2)

Publication Number Publication Date
CN114880572A CN114880572A (en) 2022-08-09
CN114880572B true CN114880572B (en) 2023-03-03

Family

ID=82677665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210564514.4A Active CN114880572B (en) 2022-05-23 2022-05-23 Intelligent news client recommendation system

Country Status (1)

Country Link
CN (1) CN114880572B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102611785A (en) * 2011-01-20 2012-07-25 北京邮电大学 Personalized active news recommending service system and method for mobile phone user
CN104216898A (en) * 2013-05-31 2014-12-17 腾讯科技(深圳)有限公司 Browser navigation method and device and terminal equipment
CN106776993A (en) * 2016-12-06 2017-05-31 苏州大学 Recommend method and system in a kind of path based on temporal constraint activity purpose
CN113688164A (en) * 2021-07-28 2021-11-23 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Interest point query method and system based on knowledge graph correlation analysis
CN114187036A (en) * 2021-11-30 2022-03-15 深圳市喂车科技有限公司 Internet advertisement intelligent recommendation management system based on behavior characteristic recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8341267B2 (en) * 2008-09-19 2012-12-25 Core Wireless Licensing S.A.R.L. Memory allocation to store broadcast information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A User Interest Preferences Based On-path Caching Strategy in Named Data Networking; Siyang Shan et al.; 2017 IEEE/CIC International Conference on Communications in China (ICCC); pp. 1-6 *
MIKU: A Multi-layer User Interest Model Incorporating Knowledge Graphs; Duan Wenjing et al.; Journal of Chinese Computer Systems (小型微型计算机系统); pp. 1006-1012 *

Also Published As

Publication number Publication date
CN114880572A (en) 2022-08-09

Similar Documents

Publication Publication Date Title
CN111783419B (en) Address similarity calculation method, device, equipment and storage medium
CN108228915B (en) Video retrieval method based on deep learning
WO2020108608A1 (en) Search result processing method, device, terminal, electronic device, and storage medium
CN102549603B (en) Relevance-based image selection
CN112347244B (en) Yellow-based and gambling-based website detection method based on mixed feature analysis
CN108154425B (en) Offline merchant recommendation method combining social network and location
Unar et al. Detected text‐based image retrieval approach for textual images
CN111125495A (en) Information recommendation method, equipment and storage medium
CN112364204B (en) Video searching method, device, computer equipment and storage medium
US10460174B2 (en) System and methods for analysis of user-associated images to generate non-user generated labels and utilization of the generated labels
CN111507350B (en) Text recognition method and device
CN108734159B (en) Method and system for detecting sensitive information in image
CN109492168B (en) Visual tourism interest recommendation information generation method based on tourism photos
CN111309936A (en) Method for constructing portrait of movie user
CN111538846A (en) Third-party library recommendation method based on mixed collaborative filtering
CN113282754A (en) Public opinion detection method, device, equipment and storage medium for news events
CN115712780A (en) Information pushing method and device based on cloud computing and big data
Sinnott et al. Linking user accounts across social media platforms
CN111259223B (en) News recommendation and text classification method based on emotion analysis model
KR101910424B1 (en) Method for movie ratings prediction using sentiment analysis of movie tags, recording medium and device for performing the method
CN117390299A (en) Interpretable false news detection method based on graph evidence
CN114880572B (en) Intelligent news client recommendation system
CN114282119B (en) Scientific and technological information resource retrieval method and system based on heterogeneous information network
CN113688281B (en) Video recommendation method and system based on deep learning behavior sequence
CN114022233A (en) Novel commodity recommendation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information
Inventor after: Zheng Chuangwei, Fu Jiewen, Chen Yifei, Jin Yong, Xie Zhicheng, Wang Yong, Chen Shaobin, Xing Gutao, Luo Peishan
Inventor before: Zheng Chuangwei, Fu Jiewen, Chen Yifei, Jin Yong, Xie Zhicheng, Wang Yong, Chen Shaobin, Xing Gutao, Luo Peishan