CN110110130B - Personalized tag recommendation method and system based on convolution features and weighted random walk


Info

Publication number
CN110110130B
CN110110130B
Authority
CN
China
Prior art keywords
image
test image
convolution
random walk
visual
Prior art date
Legal status
Active
Application number
CN201910424549.6A
Other languages
Chinese (zh)
Other versions
CN110110130A (en)
Inventor
刘峥
赵天龙
袁韶璟
高珊珊
韩慧健
张彩明
Current Assignee
Shandong University of Finance and Economics
Original Assignee
Shandong University of Finance and Economics
Priority date
Filing date
Publication date
Application filed by Shandong University of Finance and Economics
Priority to CN201910424549.6A
Publication of CN110110130A
Application granted
Publication of CN110110130B

Classifications

    • G  PHYSICS
    • G06  COMPUTING; CALCULATING OR COUNTING
    • G06F  ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00  Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50  Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58  Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583  Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5866  Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information

Abstract

The invention discloses a personalized tag recommendation method and system based on convolution features and weighted random walk, comprising the following steps: inputting a given test image into a pre-trained convolutional neural network and taking the output of a convolutional layer of the network as the visual feature of the image; encoding the visual features and converting the image into a visual feature vector; searching the k neighbor images of the test image to serve as the data set from which labels are recommended to the test image; establishing a weighted image-label bipartite graph model and calculating the relevance of each label relative to the test image through an improved weighted random walk algorithm; and selecting the top N labels with the highest relevance and recommending them to the test image. The beneficial effects of the invention are: an improved weighted random walk algorithm is proposed to calculate the relevance of all labels relative to a designated image, and the recommended labels are ranked by relevance, which effectively improves the accuracy of label recommendation.

Description

Personalized tag recommendation method and system based on convolution features and weighted random walk
Technical Field
The invention belongs to the technical field of label recommendation of multimedia data, and particularly relates to a personalized label recommendation method and system based on convolution characteristics and weighted random walk.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In recent years, with the explosive growth of multimedia data, adding keywords (tags) to multimedia data has become a popular way of managing internet resources, such as internet pages, academic publications, and multimedia objects (audio, images, video). Tags provide meaningful descriptors of the data and allow users to organize and index the corresponding content. Assigning labels or keywords to images, music, or video clips has changed the way users retrieve internet resources.
The inventors have found that, with the rapid growth in the number of social images uploaded to the internet, photo-sharing websites now carry much richer metadata, which lets users organize and access shared media content more conveniently; at the same time, retrieving images that meet a specific user's needs from the huge volume of uploaded photos has become a key difficulty. Image retrieval has been studied extensively in both content-based and label-based forms: the former relies on visual descriptors extracted from images to return the images that best match a user-specified sample image, while the latter mainly returns images carrying the same labels as those assigned to the query.
Obtaining high-quality image annotations, whether manual or automatic, has long been a major obstacle for label-based image retrieval. Nowadays, in image-sharing communities, the annotations attached by users when uploading images have become a valuable source of image labels, so label-based image retrieval technology (TagIR) is becoming increasingly important. A prerequisite for label-based image retrieval, however, is that images have already been labeled with relevant labels. Existing research has shown that many tags attached to uploaded images in picture-sharing communities are inaccurate; in fact, nearly 50% of the tags assigned to pictures are irrelevant. In addition, the order of the current tag list is not linked to tag importance: it simply reflects the order of input and is almost never ranked by importance or relevance. Current mainstream research mainly focuses on establishing the connection between text labels and the visual features of images, and does not consider the potential connection between the metadata attached to images on the internet and the image labels.
Disclosure of Invention
In order to solve the above problems, the invention provides a personalized label recommendation method and system based on convolution features and weighted random walk. The output of a convolutional layer of a convolutional neural network is used as the visual feature of the image, the visual features are encoded to convert the image into a visual feature vector, and visual neighbors are searched using the feature vector together with group metadata information; a weighted random walk algorithm is then executed on the image-label bipartite graph formed by the neighbor images and their labels, and personalized labels are recommended for the image from the neighbors' labels.
In some embodiments, the following technical scheme is adopted:
a personalized tag recommendation method based on convolution features and weighted random walks comprises the following steps:
inputting a set test image into a pre-trained convolutional neural network, and taking the output of a convolutional layer in the convolutional neural network as the visual characteristic of the image;
coding the visual features, and converting the image into a visual feature vector;
searching k adjacent images of the test image as a data set for recommending labels to the test image through the visual feature vector and the group metadata information;
establishing a weighted image-label bipartite model, and calculating the correlation of each label relative to a test image through an improved weighted random walk algorithm;
and selecting the top N labels with the highest correlation degree and recommending the labels to the test image.
Further, inputting a set test image into a pre-trained convolutional neural network, and taking the output of a convolutional layer in the convolutional neural network as the visual characteristic of the image; the method specifically comprises the following steps:
adjusting the size of the test image to n × n as required by the convolutional neural network, and inputting the test image into a pre-trained L-layer convolutional neural network;
propagating forward through the network; at the i-th convolutional layer L_i, after the features of the previous layer pass through the convolution kernels, the result of the convolution is a feature map M_l of size n_l × n_l × d_l, wherein d_l is the number of convolution kernels of L_i;
at each (i, j) position of the feature map M_l, a d_l-dimensional vector is obtained;
finally, the n_l × n_l local feature vectors of the test image at convolutional layer L_i are obtained.
Further, encoding the visual features and converting the image into a visual feature vector specifically comprises: encoding the local feature vectors into a single visual feature vector using VLAD encoding; the VLAD code aggregates the convolution features extracted from the test image at a given layer into k d_l-dimensional vectors, thereby converting the processing of the test image into the processing of k vectors.
Further, k neighbor images of the test image are searched as a data set for recommending labels to the test image, specifically:
computing the feature vectors of all images in the data set at layer L_i to obtain the feature vector table X of all images;
computing the feature vector x_p of the test image at L_i, and then computing the Euclidean distances between x_p and all feature vectors in the feature vector table X as the visual similarity data;
calculating the group co-occurrence coefficient normalization score of the test image and the image in the data set as group similarity data;
carrying out linear weighting on the visual similarity data and the group similarity data, and calculating the correlation of the pictures in the data set relative to the test image;
and sorting the correlation results from small to large, and selecting the first k images as adjacent images.
Further, performing minimum and maximum normalization on the correlation result of each neighboring image, and taking the correlation result as the voting weight of the neighboring image to the test image; and establishing a weighted image-label bipartite graph model according to the weight.
Further, calculating the relevance of each label relative to the test image through an improved weighted random walk algorithm, specifically:
PR(i) = (1-d)*r_i + d*Σ_{j∈In(i)} ω_j*PR(j)/|Out(j)|
where PR(i) is the relevance of node i relative to the test image, PR(j) is the relevance of node j relative to the test image, d is the probability of jump access of the user, Out(j) is the set of web pages hyperlinked from web page j, and In(i) is the set of all web pages hyperlinking to web page i;
r_i = 1 if i = u, and r_i = 0 otherwise
where u denotes the input image node, meaning that every walk starts from the input image node; when j is a label node, ω_j is 1, and when j is a neighbor image node, ω_j is assigned the voting weight of that neighbor image relative to the test image.
Further, the improved weighted random walk algorithm obtains the correlation degree of all labels and adjacent images relative to the input image, and selects a plurality of labels with the highest correlation degree to recommend to the input image.
In other embodiments, the following technical solutions are adopted:
a computer-readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to execute the above-mentioned personalized tag recommendation method based on convolution feature and weighted random walk.
In other embodiments, the following technical solutions are adopted:
a terminal device comprising a processor and a computer-readable storage medium, the processor being configured to implement instructions; the computer readable storage medium is used for storing a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing the personalized tag recommendation method based on the convolution characteristics and the weighted random walk.
Compared with the prior art, the invention has the beneficial effects that:
1. a bipartite graph model composed of labels and images is provided and used for data modeling, a neighbor image and weight thereof are determined through group information and image visual characteristics by combining a neighbor selection method of the group information and the image visual characteristics, a weighted neighbor image-label bipartite graph model is provided, and potential relation between image metadata information and image labels is established.
2. An improved weighted random walk algorithm is provided to calculate the relevance of all labels relative to the designated image, and label recommendation is sequenced according to the relevance, so that the accuracy of label recommendation can be effectively improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a schematic diagram illustrating obtaining visual feature vectors according to a first embodiment;
FIGS. 2(a) - (c) are schematic diagrams illustrating the influence of group information on image neighbor selection according to an embodiment;
FIG. 3 is a diagram illustrating the establishment of a bipartite graph of weighted neighbor labels according to an embodiment;
FIG. 4 is a graph showing a comparison of the properties of different layers in the first embodiment;
FIG. 5 is a diagram illustrating the correlation comparison between neighboring images at different λ values according to the first embodiment.
FIG. 6 is a diagram illustrating the influence of the recommended number of different tags on the NDCG value according to the first embodiment;
FIG. 7 is a diagram illustrating the influence of group information and weighted random walks on tag recommendation according to an embodiment;
FIG. 8 is a diagram illustrating comparison of tag results recommended by different methods according to the first embodiment.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Example one
In one or more embodiments, disclosed is a personalized tag recommendation method based on convolution features and weighted random walks, comprising the following steps:
(1) inputting a set test image into a pre-trained convolutional neural network, and taking the output of a convolutional layer in the convolutional neural network as the visual characteristic of the image;
by training the multilayer convolution filter, the CNN can automatically learn complex features to perform object recognition, and the CNN trained for the image classification task can be used for extracting general features of other visual recognition tasks.
Given a test image, its size is first adjusted to the n × n required by the network, then it is input into a pre-trained L-layer convolutional neural network (CNN) and propagated forward through the network. As can be seen from FIG. 1, at the i-th convolutional layer L_i, after the features of the previous layer pass through the convolution kernels, the result of the convolution is a feature map M_l of size n_l × n_l × d_l, where d_l is the number of convolution kernels of L_i. At each (i, j) position of the feature map M_l, where 0 ≤ i ≤ n_l - 1 and 0 ≤ j ≤ n_l - 1, a d_l-dimensional vector is obtained. In this way, the n_l × n_l local visual features of the input image at convolutional layer L_i are obtained. In this work the convolution results of the different layers need to be examined to study the image feature extraction performance, so the features obtained from the convolution kernels of each layer are kept: {F_1, F_2, ..., F_L}.
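For illustration only, the following minimal sketch shows how such per-position convolutional features can be extracted; it assumes PyTorch and torchvision (the patent does not name a framework), and the function name and the layer index used to reach AlexNet's conv5 (the layer the experiments below end up selecting) are assumptions of this sketch.

```python
# Hedged sketch: per-position conv features with torchvision's AlexNet (assumed framework).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

def extract_conv_features(image_path: str, conv_index: int = 10) -> torch.Tensor:
    """Return the n_l*n_l local feature vectors (each d_l-dimensional) of one conv layer."""
    # weights API requires torchvision >= 0.13; older versions use pretrained=True
    model = models.alexnet(weights="IMAGENET1K_V1").eval()
    layers = model.features[:conv_index + 1]   # features[10] is conv5 in torchvision's layout

    preprocess = T.Compose([
        T.Resize((224, 224)),                  # resize to the n x n the network expects
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)

    with torch.no_grad():
        fmap = layers(x)                       # shape (1, d_l, n_l, n_l)
    d_l = fmap.shape[1]
    # one d_l-dimensional vector for every (i, j) position of the feature map M_l
    return fmap.squeeze(0).permute(1, 2, 0).reshape(-1, d_l)
```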
(2) Coding the visual features, and converting the image into a visual feature vector;
Since CNNs are trained for classification tasks, features from the highest or second-highest layer are typically used for decision making, because they capture the semantic features needed at the classification level. Lower layers capture more local features of the target, while the local features of objects are not well preserved in higher layers of the network, so lower-layer features perform better in instance-level image retrieval than features extracted from higher layers. This indicates that directly applying the last or a higher layer designed for the classification task to instance-level image retrieval is not the best choice, because different objects from similar categories need to be distinguished. Therefore, which layer to extract features from is a question for the instance-level image retrieval task.
Since it is complicated to use extracted features in CNN networks directly for instance-level image retrieval, the features are encoded to achieve efficient retrieval, and since each image contains a set of low-dimensional feature vectors whose structure is similar to dense SIFT, these feature vectors are encoded as a single feature vector using VLAD coding. VLAD coding can efficiently code local features into a single descriptor while achieving a good balance between retrieval accuracy and memory footprint.
VLAD coding is similar to constructing a BoW histogram. The n_l × n_l convolution features of L_i are L2-normalized, and K-Means clustering is performed on the normalized features to obtain a vocabulary of K visual words C = {c_1, c_2, ..., c_K}. Each feature extracted at L_i is assigned to its nearest visual word by computing its distance to every word. For each visual word c_k, the vector residual between c_k and every feature f assigned to it is computed:
r = f - c_k
The VLAD algorithm aggregates the convolution features extracted from the image at a given layer into k d_l-dimensional vectors v_l, thereby converting the processing of the image into the processing of these k d_l-dimensional vectors. The VLAD code of the image at layer L_i is formulated as the concatenation:
v_l = [ Σ_{f:NN(f)=c_1} (f - c_1), ..., Σ_{f:NN(f)=c_K} (f - c_K) ]
where each block Σ_{f:NN(f)=c_k} (f - c_k) is the accumulated residual between the visual word c_k and all convolution features belonging to that word. Expanding v_l gives a long vector x of length k × d_l, which is taken as the feature vector of the input image, i.e., the image is represented by x.
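A hedged sketch of this VLAD step is given below; it assumes numpy and scikit-learn, the helper names fit_vocabulary and vlad_encode are illustrative, and the final L2 normalization of the long vector is a common VLAD convention rather than something stated above. K = 100 follows the experimental setting described later.

```python
# Sketch of VLAD encoding of local conv features (numpy + scikit-learn assumed).
import numpy as np
from sklearn.cluster import KMeans

def fit_vocabulary(all_features: np.ndarray, k: int = 100) -> KMeans:
    """Learn k visual words from the L2-normalized conv features of the whole dataset."""
    feats = all_features / (np.linalg.norm(all_features, axis=1, keepdims=True) + 1e-12)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(feats)

def vlad_encode(features: np.ndarray, vocab: KMeans) -> np.ndarray:
    """Accumulate the residuals f - c_k per visual word and flatten into one k*d_l vector."""
    feats = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    assignments = vocab.predict(feats)                      # nearest visual word per feature
    k, d = vocab.cluster_centers_.shape
    v = np.zeros((k, d))
    for word in range(k):
        assigned = feats[assignments == word]
        if len(assigned):
            v[word] = (assigned - vocab.cluster_centers_[word]).sum(axis=0)
    x = v.reshape(-1)                                       # the k x d_l long vector
    return x / (np.linalg.norm(x) + 1e-12)                  # L2 normalization (assumption)
```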
(3) K adjacent images of the test image are searched to serve as a data set for recommending labels to the test image;
for the personalized recommendation of the image label, the nearest neighbor of the image does not select the most similar image in vision but expresses the related image of the same theme. Therefore, not only the visual similarity of the images but also the information of the metadata of the group to which the images belong should be considered in the selection process of the neighbors. The effect of groups on inter-image correlation is illustrated by fig. 2(a) - (c), where the tag list of fig. 2(a) includes: baikal, ice, lack, winter, frozen; the tag list of fig. 2(b) includes: baikal, shore, ice, lack, winter, rock, winter; the tag list of fig. 2(c) includes: cliff, sea, shore, wave, rock, cloud; it can be seen that although fig. 2(b) is more visually similar to fig. 2(c), fig. 2(b) belongs to the same group as fig. 2 (a). It can be seen from the tag list that the tags between fig. 2(a) and fig. 2(b) are more similar, and it is illustrated from the side that the same group contains more relevant subjects, and the pictures therein are also more relevant.
1) Visual similarity
The feature vectors of all images in the data set (obtained by crawling the Flickr website) at layer L_i are computed to obtain the feature vector table X. For the test image p, its feature vector x_p at L_i is computed first, and then the Euclidean distances between x_p and all feature vectors in X are computed:
ρ = sqrt( Σ_{i=1}^{d} (x_i - p_i)² )
where ρ is the visual distance between the two images, and x_i and p_i are the components of the d-dimensional feature vectors of the two images. The ρ values are mapped to the range 0-1.
2) Group similarity
The normalized group co-occurrence coefficient score of the test image and the visual neighbor images is computed:
J = |G_p ∩ G_q| / |G_p ∪ G_q|
where G_p and G_q are the sets of groups the two images belong to. The groups are known in advance, since the user information includes the group information; the co-occurrence coefficient is the number of groups shared by the two pictures divided by the number of groups in their union. This Jaccard coefficient is used to measure the similarity of two pictures over the group metadata.
3) Neighbor formalized representation
The relevance of the pictures in the data set relative to the test image is calculated by computing the visual similarity and the group similarity between the test image and the images in the data set and linearly weighting the two, with the following formula:
y=λ*(1-ρ)+(1-λ)*J (4)
where y represents the relevance score between the test image and an image in the data set, and λ is the weighting coefficient.
And sorting the relevance scores from small to large, selecting the first k images as adjacent images, and performing minimum and maximum normalization on the relevance scores of all the adjacent images as the voting weight of the adjacent images to the test image.
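This neighbor-selection step can be sketched as follows; numpy is assumed, the function name is illustrative, the combined score y is interpreted here as a similarity (so the k images with the highest y are kept), and the defaults λ = 0.4 and k = 15 simply echo the experimental settings reported below.

```python
# Sketch of neighbor selection combining visual and group similarity, Eq. (4): y = λ*(1-ρ) + (1-λ)*J.
import numpy as np

def neighbor_weights(x_p, X, groups_p, groups, k=15, lam=0.4):
    """Return the indices of the k neighbor images and their min-max-normalized voting weights.

    x_p      -- feature vector of the test image
    X        -- (N, D) feature vector table of the dataset images
    groups_p -- set of group ids of the test image
    groups   -- list of N sets of group ids, one per dataset image
    """
    rho = np.linalg.norm(X - x_p, axis=1)                      # visual distance
    rho = (rho - rho.min()) / (rho.max() - rho.min() + 1e-12)  # map to [0, 1]

    # group similarity: Jaccard coefficient over the group metadata
    J = np.array([len(groups_p & g) / max(len(groups_p | g), 1) for g in groups])

    y = lam * (1.0 - rho) + (1.0 - lam) * J                    # linear weighting

    idx = np.argsort(-y)[:k]                                   # k images with the highest score
    w = y[idx]
    w = (w - w.min()) / (w.max() - w.min() + 1e-12)            # min-max normalized voting weights
    return idx, w
```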
(4) Establishing a weighted image-label bipartite model, and calculating the correlation of each label relative to a test image through an improved weighted random walk algorithm;
(5) and selecting the top N labels with the highest correlation degree and recommending the labels to the test image.
The first k neighbor images are obtained according to the magnitude of the y value, and a weighted neighbor-label bipartite graph is built for them, in which the weight is the y value. Assuming the following set of neighbor images, the process is shown in FIG. 3.
The set photo = [A, B, C, D] is the neighbor image set, the value attached to each image is its relevance score y relative to the input image, and tag = [a, b, c, d, e] is the label set.
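The weighted neighbor-image/label bipartite graph can be represented with plain adjacency sets, as in the sketch below; the node naming, the data layout, and the example values standing in for FIG. 3 are illustrative assumptions, not the patent's own data structures.

```python
# Sketch of the weighted image-label bipartite graph built from the k neighbors.
from collections import defaultdict

def build_bipartite_graph(neighbors, neighbor_tags, weights):
    """neighbors: list of image ids; neighbor_tags: image id -> list of tags;
    weights: image id -> voting weight y. Returns undirected adjacency sets and node weights."""
    adj = defaultdict(set)                 # edges between image nodes and tag nodes
    node_weight = {}                       # ω attached to each neighbor-image node
    for img in neighbors:
        node_weight[("img", img)] = weights[img]
        for tag in neighbor_tags[img]:
            adj[("img", img)].add(("tag", tag))
            adj[("tag", tag)].add(("img", img))
    return adj, node_weight

# toy usage in the spirit of FIG. 3 (tag lists and y values are placeholders)
adj, node_weight = build_bipartite_graph(
    neighbors=["A", "B", "C", "D"],
    neighbor_tags={"A": ["a", "b"], "B": ["b", "c"], "C": ["c", "d"], "D": ["d", "e"]},
    weights={"A": 0.9, "B": 0.7, "C": 0.5, "D": 0.3},
)
```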
The PageRank random walk algorithm is used to compute the visiting popularity of every web page on the internet: it scores the pages and then ranks them. The basic idea is that the pages on the internet are connected to each other by hyperlinks, and a user can jump from one page to another through its hyperlinks, so the pages form the nodes of a graph. When a user visits a page there are two options: stay on the current page, or jump to another page through one of the hyperlinks the current page contains. If the probability of the user jumping is d, the probability of staying on the current page is 1-d. Assuming that the user follows the hyperlinks of the current page uniformly at random, a random walk process is formed; after a large number of users visit every page on the internet many times, the probability of each page being visited converges to a fixed value, by which the pages can be ranked. The random walk process is formulated as:
PR(i) = (1-d)/N + d*Σ_{j∈In(i)} PR(j)/|Out(j)|
where PR(i) is the probability that web page i is visited, d is the probability of jump access, N is the number of web pages on the internet, In(i) is the set of all web pages hyperlinking to web page i, and Out(j) is the set of web pages hyperlinked from web page j. The visiting probability of web page i consists of two parts. The first part is the probability that the user initially visits i and stays there:
(1-d)/N
The second part is the probability that the user reaches i through the hyperlinks of other web pages:
d*Σ_{j∈In(i)} PR(j)/|Out(j)|
These two parts together form the visiting probability of web page i.
In the PageRank algorithm, the relevance of every vertex in the graph relative to every other vertex is computed. In this work, however, what is needed is the relevance of all labels relative to the input image, and the relevance (neighbor weight) of the neighbor images relative to the input image must also be taken into account. Therefore, on the basis of the PageRank algorithm, the following weighted random walk formula is obtained:
PR(i) = (1-d)*r_i + d*Σ_{j∈In(i)} ω_j*PR(j)/|Out(j)|
r_i = 1 if i = u, and r_i = 0 otherwise
Compared with the PageRank algorithm there are two differences. The first is the value of r_i, where u denotes the input image node: every walk starts from the input image node. The second is the value of ω_j: when j is a label node the parameter is ignored (taken as 1), and when j is a neighbor image node, ω_j is assigned the weight y of that neighbor image relative to the input image. From the result of the algorithm, the relevance of all vertices (labels and images) relative to the input image is obtained, and the labels with the highest relevance are recommended to the input image.
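A sketch of this weighted walk is given below; numpy is assumed, the damping value, convergence threshold and function name are illustrative, and linking the input image node u to its k neighbor images (so that the restart mass can flow into the graph) is this sketch's interpretation rather than a detail spelled out above.

```python
# Sketch of the improved weighted random walk (a personalized-PageRank-style iteration).
import numpy as np

def weighted_random_walk(adj, node_weight, u, d=0.85, tol=1e-8, max_iter=1000):
    """adj: node -> set of linked nodes (every node present as a key, edges stored both ways);
    node_weight: node -> ω_j (voting weight y for neighbor-image nodes, 1 otherwise);
    u: input image node. Returns node -> PR(i), the relevance of each node to the input image."""
    nodes = list(adj)
    index = {n: k for k, n in enumerate(nodes)}
    pr = np.full(len(nodes), 1.0 / len(nodes))
    r = np.zeros(len(nodes))
    r[index[u]] = 1.0                                   # every walk restarts at the input image

    for _ in range(max_iter):
        new = (1.0 - d) * r
        for j in nodes:
            out = adj[j]
            if not out:
                continue
            share = d * node_weight.get(j, 1.0) * pr[index[j]] / len(out)
            for i in out:                               # j ∈ In(i) for every linked node i
                new[index[i]] += share
        if np.abs(new - pr).sum() < tol:
            pr = new
            break
        pr = new
    return {n: pr[index[n]] for n in nodes}

# usage sketch: link the input node u to its neighbors (interpretation), run the walk,
# then rank the tag nodes by PR and recommend the top-N labels.
# u = ("img", "input"); adj[u] = {("img", n) for n in "ABCD"}
# for n in "ABCD": adj[("img", n)].add(u)
# pr = weighted_random_walk(adj, node_weight, u)
```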
Experimental verification
The experimental data set contains the required group metadata information, i.e., an image may be shared to several groups, and the experiments are carried out on this expanded data set. First, the images in the data set need to be converted into visual feature vectors: the classic AlexNet convolutional neural network structure is selected, the network comprises five convolutional layers and two fully connected layers, its parameters are trained on the ImageNet data set, and the experiments are conducted on the trained network.
All pictures in the data set are input into the network to extract the features of each convolutional layer, and VLAD coding is performed: K-Means clustering is applied to all features of each convolutional layer with the cluster number K set to 100, which gives 100 visual words per convolutional layer; residuals are computed for each visual word to obtain the VLAD code of every image at every convolutional layer, and the codes are unfolded into long vectors to obtain the feature vector of each image at each convolutional layer.
A test image is input and its convolution features are extracted at each layer; residuals are computed against the visual words of the corresponding layer to perform VLAD coding, and the feature vector of the test image is obtained after expansion. By computing Euclidean distances, the first 15 images with the smallest distance are selected, and the relevance between the input image and the selected images is compared through MAP.
As can be seen from fig. 4, the image features extracted by the fifth convolutional layer (conv5) perform better in the selection of visual neighbors than other layers, so the neighbors are selected by extracting the image features through the AlexNet network fifth convolutional layer.
The parameter of formula (4) can be obtained from training data: for an input image I, the value of λ is increased from 0.1 to 0.9 in steps of 0.2, and the relevance between the obtained neighbor images and the training image is calculated for each value. The remaining images are input in the same way, the relevance of the images of the respective groups under the different parameter values is averaged and compared, and the comparison result is shown in FIG. 5.
As can be seen from fig. 5, when the λ value is set to 0.4, the resulting neighbor image is more correlated with the input image, while in comparison with fig. 4, it is found that the neighbor correlation obtained by combining the visual information and the group information is significantly higher than that obtained based on the visual information alone.
To verify the effect of the method of the present embodiment, the following verification was performed:
(1) the influence of the recommended number of the tags on the recommendation result is compared by setting different recommended numbers of the tags, as shown in fig. 6; as can be seen from fig. 6, when the recommended number of tags is 10, a better tag recommendation effect can be obtained.
(2) Four combinations are compared according to whether the group information is considered in neighbor selection and whether the neighbor weights are considered in the random walk. Considering and not considering the group information are denoted A and a, and considering and not considering the neighbor weights are denoted B and b, giving the four combinations AB, Ab, aB and ab. The effect of the four combinations on the recommended labels is shown in FIG. 7; it can be seen from FIG. 7 that the label recommendation combining the group information and the weights achieves the best effect. Meanwhile, the positive effect of the group information in label recommendation can be seen by comparing Ab with ab, and comparing aB with ab shows the positive role of the neighbor weights in label recommendation.
(3) The comparison between the convolution-feature-based weighted random walk algorithm of this embodiment and other methods is shown in FIG. 8, and it can be seen that the method of this embodiment performs better. After labels are recommended for the images by the personalized label recommendation algorithm, image search is performed, and P (precision), R (recall), F1 (the harmonic mean of precision and recall) and MAP (the mean of average precision) are calculated to evaluate the recommendation effect of each method, as shown in Table 1.
TABLE 1 comparison of image tag recommendation results using P, R, F1 and MAP
As can be seen from Table 1, after images are labeled with the tags recommended by the method of this embodiment, image retrieval performance is superior to that of the other methods on all four evaluation indexes (P, R, F1 and MAP), which demonstrates the superiority of this method: more relevant neighbor images are obtained by combining the group information with the visual features, and the accuracy of the labels is effectively improved by the weighted random walk algorithm.
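For reference, the four evaluation measures named above can be computed as in the sketch below; it treats each query as a ranked list of retrieved items against a set of relevant items, numpy is assumed, and the function names are not taken from the patent.

```python
# Sketch of the P, R, F1 and MAP evaluation measures used in Table 1.
import numpy as np

def precision_recall_f1(ranked, relevant):
    """ranked: ranked list of retrieved items; relevant: set of correct items."""
    hits = len(set(ranked) & relevant)
    p = hits / len(ranked) if ranked else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1

def mean_average_precision(all_ranked, all_relevant):
    """Average precision per query, averaged over all queries."""
    aps = []
    for ranked, relevant in zip(all_ranked, all_relevant):
        hits, precisions = 0, []
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                hits += 1
                precisions.append(hits / rank)
        aps.append(np.mean(precisions) if precisions else 0.0)
    return float(np.mean(aps))
```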
Example two
In one or more embodiments, a terminal device is disclosed that includes a processor and a computer-readable storage medium, the processor to implement instructions; the computer readable storage medium is used for storing a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing the personalized tag recommendation method based on convolution characteristics and weighted random walk in the first embodiment. For brevity, no further description is provided herein.
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The computer readable storage medium may include a read-only memory and a random access memory and provide instructions and data to the processor, and a portion of the memory may also include a non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The steps of a method described in connection with the embodiments may be embodied directly in a hardware processor, or in a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The storage medium is located in a memory, and a processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, it is not described in detail here.
Those of ordinary skill in the art will appreciate that the various illustrative elements, i.e., algorithm steps, described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (8)

1. A personalized tag recommendation method based on convolution features and weighted random walks is characterized by comprising the following steps:
inputting a set test image into a pre-trained convolutional neural network, and taking the output of a convolutional layer in the convolutional neural network as the visual characteristic of the image;
coding the visual features, and converting the image into a visual feature vector;
k adjacent images of the test image are searched through the visual feature vector and the group metadata information to serve as a data set for recommending labels to the test image;
establishing a weighted image-label bipartite model, and calculating the correlation of each label relative to a test image through an improved weighted random walk algorithm; the weighted random walk formula of the improved weighted random walk algorithm is as follows:
PR(i) = (1-d)*r_i + d*Σ_{j∈In(i)} ω_j*PR(j)/|Out(j)|
r_i = 1 if i = u, and r_i = 0 otherwise
wherein PR(i) is the relevance of node i relative to the test image, PR(j) is the relevance of node j relative to the test image, d is the probability of jump access of the user, Out(j) is the set of web pages hyperlinked from web page j, In(i) is the set of all web pages hyperlinking to web page i, and u denotes the input image node, meaning that every walk starts from the input image node; when j is a label node, ω_j is 1, and when j is a neighbor image node, ω_j is assigned the voting weight y of that neighbor image relative to the test image, wherein y represents the relevance score between the test image and an image in the data set;
and selecting the top N labels with the highest correlation degree and recommending the labels to the test image.
2. The personalized tag recommendation method based on the convolution feature and the weighted random walk according to claim 1, characterized in that a set test image is input into a pre-trained convolution neural network, and the output of convolution layers in the convolution neural network is used as the visual feature of the image; the method specifically comprises the following steps:
adjusting the size of the test image to n × n as required by the convolutional neural network, and inputting the test image into a pre-trained L-layer convolutional neural network;
propagating forward through the network; at the i-th convolutional layer L_i, after the features of the previous layer pass through the convolution kernels, the result of the convolution is a feature map M_l of size n_l × n_l × d_l, wherein d_l is the number of convolution kernels of layer L_i;
at each (i, j) position of the feature map M_l, a d_l-dimensional vector is obtained;
finally, the n_l × n_l local feature vectors of the test image at convolutional layer L_i are obtained.
3. The personalized tag recommendation method based on convolution features and weighted random walk according to claim 1, characterized in that the visual features are encoded and the image is converted into a visual feature vector, specifically: encoding the local feature vectors into a single visual feature vector using VLAD encoding; the VLAD code aggregates the convolution features extracted from the test image at a given layer into k d_l-dimensional vectors, thereby converting the processing of the test image into the processing of k vectors.
4. The personalized tag recommendation method based on convolution feature and weighted random walk as claimed in claim 1, wherein k neighbor images of the test image are searched as a data set for recommending a tag to the test image, specifically:
computing the feature vectors of all images in the data set at layer L_i to obtain the feature vector table X of all images, wherein L_i is the i-th convolutional layer;
computing the feature vector x_p of the test image at L_i, and then computing the Euclidean distances between x_p and all feature vectors in the feature vector table X as the visual similarity data;
calculating the group co-occurrence coefficient normalization score of the test image and the image in the data set as group similarity data;
carrying out linear weighting on the visual similarity data and the group similarity data, and calculating the correlation of the pictures in the data set relative to the test image;
and sorting the correlation results from small to large, and selecting the first k images as adjacent images.
5. The personalized tag recommendation method based on the convolution feature and the weighted random walk as claimed in claim 4, wherein the correlation result of each neighboring image is subjected to minimum and maximum normalization to be used as the voting weight given to the test image by the neighboring image; and establishing a weighted image-label bipartite graph model according to the weight.
6. The personalized tag recommendation method based on the convolution characteristic and the weighted random walk as claimed in claim 1, wherein the improved weighted random walk algorithm obtains the correlation degrees of all tags and neighboring images relative to the input image, and selects a plurality of tags with the highest correlation degrees to recommend to the input image.
7. A computer-readable storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor of a terminal device and to perform the method for personalized tag recommendation based on convolution feature and weighted random walk of any of claims 1-6.
8. A terminal device comprising a processor and a computer-readable storage medium, the processor being configured to implement instructions; a computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method for personalized tag recommendation based on convolution features and weighted random walks according to any one of claims 1 to 6.
CN201910424549.6A 2019-05-21 2019-05-21 Personalized tag recommendation method and system based on convolution features and weighted random walk Active CN110110130B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910424549.6A CN110110130B (en) 2019-05-21 2019-05-21 Personalized tag recommendation method and system based on convolution features and weighted random walk

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910424549.6A CN110110130B (en) 2019-05-21 2019-05-21 Personalized tag recommendation method and system based on convolution features and weighted random walk

Publications (2)

Publication Number Publication Date
CN110110130A CN110110130A (en) 2019-08-09
CN110110130B true CN110110130B (en) 2021-03-02

Family

ID=67491408

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910424549.6A Active CN110110130B (en) 2019-05-21 2019-05-21 Personalized tag recommendation method and system based on convolution features and weighted random walk

Country Status (1)

Country Link
CN (1) CN110110130B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116775849B (en) * 2023-08-23 2023-10-24 成都运荔枝科技有限公司 On-line problem processing system and method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804689A (en) * 2018-06-14 2018-11-13 合肥工业大学 The label recommendation method of the fusion hidden connection relation of user towards answer platform
CN108920641A (en) * 2018-07-02 2018-11-30 北京理工大学 A kind of information fusion personalized recommendation method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9965863B2 (en) * 2016-08-26 2018-05-08 Elekta, Inc. System and methods for image segmentation using convolutional neural network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804689A (en) * 2018-06-14 2018-11-13 合肥工业大学 The label recommendation method of the fusion hidden connection relation of user towards answer platform
CN108920641A (en) * 2018-07-02 2018-11-30 北京理工大学 A kind of information fusion personalized recommendation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Personalized image tag recommendation algorithm based on bipartite graph"; Zhao Tianlong et al.; Journal of Nanjing University (Natural Science); 2018-11-30; Vol. 54, No. 6; pp. 1193-1205 *

Also Published As

Publication number Publication date
CN110110130A (en) 2019-08-09

Similar Documents

Publication Publication Date Title
Zhang et al. Query specific rank fusion for image retrieval
CN109829104B (en) Semantic similarity based pseudo-correlation feedback model information retrieval method and system
CN108733766B (en) Data query method and device and readable medium
CN111581510A (en) Shared content processing method and device, computer equipment and storage medium
CN110442777B (en) BERT-based pseudo-correlation feedback model information retrieval method and system
Xiao et al. Convolutional hierarchical attention network for query-focused video summarization
US8891908B2 (en) Semantic-aware co-indexing for near-duplicate image retrieval
CN111159485B (en) Tail entity linking method, device, server and storage medium
JP2013200885A (en) Annotating images
CN112417097B (en) Multi-modal data feature extraction and association method for public opinion analysis
CN110795527B (en) Candidate entity ordering method, training method and related device
CN112307182B (en) Question-answering system-based pseudo-correlation feedback extended query method
Liu et al. Learning socially embedded visual representation from scratch
Wang et al. Aspect-ratio-preserving multi-patch image aesthetics score prediction
Shah et al. Prompt: Personalized user tag recommendation for social media photos leveraging personal and social contexts
CN114756733A (en) Similar document searching method and device, electronic equipment and storage medium
CN110110130B (en) Personalized tag recommendation method and system based on convolution features and weighted random walk
CN114741599A (en) News recommendation method and system based on knowledge enhancement and attention mechanism
Abdel-Nabi et al. Content based image retrieval approach using deep learning
US20220058448A1 (en) Image selection from a database
CN111813888A (en) Training target model
CN113962228A (en) Long document retrieval method based on semantic fusion of memory network
Zheng et al. Personalized tag recommendation based on convolution feature and weighted random walk
Urban et al. Adaptive image retrieval using a graph model for semantic feature integration
CN113516094A (en) System and method for matching document with review experts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant