CN107590741A

CN107590741A - Method and system for predicting picture popularity

Info

Publication number: CN107590741A
Application number: CN201710848290.9A
Authority: CN
Inventors: 刘文印; 黎宇坤; 黄费涛; 林泽航; 杨振国
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2017-09-19
Filing date: 2017-09-19
Publication date: 2018-01-16

Abstract

The present application discloses a method for predicting the popularity of pictures. The method includes: preprocessing received source data to obtain feature samples; performing random inactivation (dropout) on the feature samples at a preset ratio to obtain a first input feature; using a preset number of regression models to perform a prediction operation on the first input feature to obtain an intermediate prediction result, and combining the intermediate prediction result with the first input feature to obtain a second input feature; judging, according to the second input feature, whether the deep stacked regression model has converged; if so, generating the corresponding picture popularity according to the intermediate prediction result; if not, combining the intermediate prediction result with the second input feature to obtain a third input feature, and using the third input feature as the new feature sample. The method can accurately predict picture popularity. The application also discloses a system for predicting the popularity of pictures, a computer-readable storage medium and a server, which have the same beneficial effects.

Description

A method and system for predicting the popularity of pictures

Technical Field

The invention relates to the field of data analysis, and in particular to a method and a system for predicting picture popularity, a computer-readable storage medium, and a server.

Background

The rapid development of information technology has driven the spread of social media, which has changed the way people interact. Users share their lives and experiences on social media platforms mainly by posting pictures, so social media has accumulated massive amounts of picture data. The popularity of these pictures varies, however. Pictures posted by users of differing prominence differ greatly in popularity, and pictures posted by the same user also differ in popularity. Applications in many fields, such as the design of personalized news recommendation systems and the placement of online advertisements, benefit from research on predicting the popularity of social media pictures.

In the prior art, social network message burst detection based on recurrent neural networks classifies the historical messages published and forwarded by users in a social network and predicts whether a message will burst (spread widely). This prior art only covers the prediction of text information on social media and cannot accurately predict the popularity of social pictures.

Therefore, how to accurately predict the popularity of social pictures is a technical problem that those skilled in the art currently need to solve.

Summary of the Invention

The purpose of the present application is to provide a method and a system for predicting picture popularity, a computer-readable storage medium, and a server, which can accurately predict the popularity of social pictures.

To solve the above technical problem, the present application provides a method for predicting the popularity of pictures, the method comprising:

Step 1: Preprocess the received source data to obtain feature samples, where the feature samples include visual features and social features;

Step 2: Perform random inactivation on the feature samples at a preset ratio to obtain a first input feature;

Step 3: Use a preset number of regression models to perform a prediction operation on the first input feature to obtain an intermediate prediction result, and combine the intermediate prediction result with the first input feature to obtain a second input feature;

Step 4: Judge, according to the second input feature, whether the deep stacked regression model has converged; if not, combine the intermediate prediction result with the second input feature to obtain a third input feature, and return to Step 2 with the third input feature as the new feature sample; if so, go to Step 5;

Step 5: Generate the corresponding picture popularity according to the intermediate prediction result.

Optionally, preprocessing the received source data to obtain feature samples includes:

dividing the source data into picture data and social data;

performing feature extraction on the picture data using an artificial neural network to obtain the visual features;

transforming the social data using multi-temporal scaling and Z-score (standard score) standardization to obtain the social features;

splicing the visual features and the social features according to preset rules to obtain the feature samples.

Optionally, performing feature extraction on the picture data using an artificial neural network to obtain the visual features includes:

performing two-level feature extraction on the picture data using an artificial neural network to obtain low-level features and high-level features;

combining the low-level features and the high-level features to obtain the visual features.

Optionally, performing two-level feature extraction on the picture data using an artificial neural network to obtain low-level features and high-level features includes:

training a ResNeXt model, an Xception model and a DenseNet model on the ImageNet dataset and the Place365 dataset, where the ResNeXt model, the Xception model and the DenseNet model are all artificial neural networks;

inputting the picture data into the ResNeXt model, the Xception model and the DenseNet model, extracting the values of the feature maps before the last layer of each model, and applying feature compression and feature selection to obtain the low-level features;

concatenating the scene information and category information that the ResNeXt model, the Xception model and the DenseNet model predict for the picture to obtain the high-level features.

The present application also provides a system for predicting the popularity of pictures, the system comprising:

a preprocessing module, configured to preprocess the received source data to obtain feature samples, where the feature samples include visual features and social features;

a Dropout module, configured to perform random inactivation on the feature samples at a preset ratio to obtain a first input feature;

a Block module, configured to use a preset number of regression models to perform a prediction operation on the first input feature to obtain an intermediate prediction result, and to combine the intermediate prediction result with the first input feature to obtain a second input feature;

a Detector module, configured to judge, according to the second input feature, whether the deep stacked regression model has converged; if not, to combine the intermediate prediction result with the second input feature to obtain a third input feature and pass the third input feature to the next layer of the stacked regression model as the new feature sample; if so, to generate the corresponding picture popularity according to the intermediate prediction result.

Optionally, the preprocessing module includes:

a classification submodule, configured to divide the source data into picture data and social data;

a visual feature extraction submodule, configured to perform feature extraction on the picture data using an artificial neural network to obtain the visual features;

a social feature extraction submodule, configured to transform the social data using multi-temporal scaling and Z-score standardization to obtain the social features;

a splicing submodule, configured to splice the visual features and the social features according to preset rules to obtain the feature samples.

Optionally, the visual feature extraction submodule includes:

a two-level feature extraction unit, configured to perform two-level feature extraction on the picture data using an artificial neural network to obtain low-level features and high-level features;

a combining unit, configured to combine the low-level features and the high-level features to obtain the visual features.

Optionally, the two-level feature extraction unit includes:

a model training subunit, configured to train a ResNeXt model, an Xception model and a DenseNet model on the ImageNet dataset and the Place365 dataset, where the ResNeXt model, the Xception model and the DenseNet model are all artificial neural networks;

a low-level feature extraction subunit, configured to input the picture data into the ResNeXt model, the Xception model and the DenseNet model, extract the values of the feature maps before the last layer of each model, and apply feature compression and feature selection to obtain the low-level features;

a high-level feature extraction subunit, configured to concatenate the scene information and category information that the ResNeXt model, the Xception model and the DenseNet model predict for the picture to obtain the high-level features.

The present application also provides a computer-readable storage medium on which a computer program is stored; when executed, the computer program implements the steps of the method described above.

The present application also provides a server, including a memory and a processor, where a computer program is stored in the memory and the processor implements the steps of the method described above when it invokes the computer program in the memory.

The present invention provides a method for predicting the popularity of pictures. The received source data is preprocessed to obtain feature samples, where the feature samples include visual features and social features. Random inactivation is performed on the feature samples at a preset ratio to obtain a first input feature. A preset number of regression models perform a prediction operation on the first input feature to obtain an intermediate prediction result, and the intermediate prediction result is combined with the first input feature to obtain a second input feature. Whether the deep stacked regression model has converged is judged according to the second input feature. If not, the intermediate prediction result is combined with the second input feature to obtain a third input feature, and the third input feature is used as the new feature sample. If so, the corresponding picture popularity is generated according to the intermediate prediction result.

In this method, the source data is preprocessed into feature samples that the models can process, and random inactivation of the feature samples at a preset ratio increases their diversity, which makes the popularity prediction more accurate. The regression models predict on the first input feature to obtain predicted values, and the method then judges whether the deep stacked regression model has converged. If it has not, random inactivation, regression prediction and the convergence judgment are repeated until the deep stacked regression model converges, which yields the picture popularity. Because there are multiple regression models, multiple predicted values are obtained, and their average is the predicted picture popularity. By stacking multiple layers of regression models, the method can accurately predict the popularity of pictures, which benefits the development of new media. The application also provides a system for predicting the popularity of pictures, a computer-readable storage medium and a server, which have the same beneficial effects and are not described again here.

Description of Drawings

To describe the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a flow chart of a method for predicting picture popularity provided by an embodiment of the present application;

FIG. 2 is a flow chart of another method for predicting picture popularity provided by an embodiment of the present application;

FIG. 3 is a flow chart of yet another method for predicting picture popularity provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a system for predicting picture popularity provided by an embodiment of the present application.

Detailed Description

To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.

Refer to FIG. 1, which is a flow chart of a method for predicting picture popularity provided by an embodiment of the present application.

The specific steps may include:

Step S101: Preprocess the received source data to obtain feature samples, where the feature samples include visual features and social features;

The purpose of this scheme is to predict the popularity of pictures, so the source data received in this step is information related to pictures. The source data may come from social platforms or picture-sharing websites such as Flickr, Weibo or QQ Zone. The source is not limited here; any public, picture-related website will do. Since the purpose of the invention is to predict picture popularity, the acquired source data may be pictures that attract a great deal of attention or pictures that attract almost none. For legal and ethical reasons, however, source data carrying harmful content (such as pornography, violence or politically reactionary material) can be removed during preprocessing and excluded from popularity prediction.

The source data contains not only the visual information of a picture (the information carried by the picture content) but also social information (such as the posting time of the picture, the number of comments it received, whether it has tags, the number of tags, the length of its title, the average view count of the user who posted it, and the number of groups the user has joined). Accordingly, preprocessing the source data requires preprocessing both the visual information of the picture and its social-media-level information. Because the objects being preprocessed differ, the preprocessing methods differ as well. For the visual information, deep learning techniques (such as ResNeXt, Xception or DenseNet) can be used to extract visual features from the source data. Further, autoencoders and random forests can be used for dimensionality reduction and feature selection to obtain visual features with fewer dimensions. Preferably, the time information in the social information can be used to split the posting date of a picture across several time scales, for example quarter, month and hour.
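The multi-scale split of the posting time described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `temporal_features` and the dictionary keys are hypothetical, and only the three scales named in the text (quarter, month, hour) are produced.

```python
from datetime import datetime


def temporal_features(posted_at: datetime) -> dict:
    """Split one posting timestamp into several time scales.

    The function name and the key names are illustrative assumptions;
    the text only names quarter, month and hour as example scales.
    """
    return {
        "quarter": (posted_at.month - 1) // 3 + 1,  # 1..4
        "month": posted_at.month,                   # 1..12
        "hour": posted_at.hour,                     # 0..23
    }


# A picture posted on 19 September 2017 at 14:30.
features = temporal_features(datetime(2017, 9, 19, 14, 30))
print(features)
```

Each scale becomes its own numeric feature, so a downstream regressor can pick up seasonal and time-of-day effects separately.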

Preprocessing the source data into feature samples can involve many other operations. Besides those mentioned above, it may include filtering out duplicate source data (multiple copies of the same picture). It may also include classifying the source data by picture type, such as portraits, landscapes and animals, so that the picture type is identified before the subsequent popularity prediction. Classifying pictures in this way is only a preferred embodiment; those skilled in the art can choose according to the specific scenario in which the scheme is applied.

Step S102: Perform random inactivation on the feature samples at a preset ratio to obtain a first input feature;

The purpose of this step is to increase the diversity of the feature samples. Because this scheme predicts picture popularity by stacking multiple layers of regression models, the prediction tends to be limited by overfitting as the complexity of the regression models grows. To reduce this limitation, the scheme applies random inactivation to all feature samples before predicting picture popularity. For example, if the input feature vector is 4-D and the inactivation ratio is 0.5, each base model receives a 2-D feature vector; if the Block module contains 4 base models, then at this inactivation ratio the input to the next Block module has a feature size of 8-D.
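The dimension arithmetic in this example can be reproduced with a short sketch. It assumes that random inactivation here means dropping feature positions (so a 4-D vector shrinks to 2-D) rather than zeroing them, which is how the example reads; the helper name `drop_features` and the fixed seed are hypothetical.

```python
import random


def drop_features(sample, rate, rng):
    """Keep a random subset of feature positions.

    `rate` is the fraction of positions dropped. Treating inactivation
    as removal (not zeroing) is an assumption drawn from the 4-D -> 2-D
    example in the text.
    """
    keep = len(sample) - int(len(sample) * rate)
    kept_idx = sorted(rng.sample(range(len(sample)), keep))
    return [sample[i] for i in kept_idx]


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
sample = [0.2, 1.5, -0.7, 3.1]         # one 4-D feature sample
# One independently dropped view per base model in the block.
views = [drop_features(sample, 0.5, rng) for _ in range(4)]
print([len(v) for v in views])         # four 2-D views
```

Under this reading the four views together hold 8 values, which is the size quoted in the example; how the views are assembled into the next block's input is not specified in the text.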

The preset ratio in this step is determined by those skilled in the art through experiment and can be chosen according to the characteristics of the source data to improve the accuracy and efficiency of the popularity prediction. In general, when the diversity of the feature samples (that is, the source data) is insufficient, the preset inactivation ratio can be raised; when the diversity of the feature samples is relatively high, the ratio can be lowered. The specific value of the preset ratio is not limited here, and different application environments call for different preset ratios.

Step S103: Use a preset number of regression models to perform a prediction operation on the first input feature to obtain an intermediate prediction result, and combine the intermediate prediction result with the first input feature to obtain a second input feature;

The purpose of this step is to predict on the first input feature obtained in step S102 to obtain preliminary predicted values. Many kinds of regression model can be used in this step, including random forest (RF), extremely randomized trees (EXRT), XGBoost and Lasso. Other regression models are also possible, and those skilled in the art can choose for themselves, so the specific type of regression model used in this step is not limited here.

The preset number mentioned in this step may be 1, or it may be 2 or more. Using only one regression model, however, yields output features with low diversity, so it is preferable to use several different types of regression model to increase the diversity of the output features. For example, the four regression models random forest (RF), extremely randomized trees (EXRT), XGBoost and Lasso can predict simultaneously on the first input feature obtained in step S102 to obtain preliminary predicted values. Because each type of regression model differs in structure and principle, the predicted values computed by the models also differ, and all of the predicted values can be averaged to obtain the intermediate prediction result.
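The averaging of several base regressors can be illustrated with a small sketch. The two regressors below are deliberately trivial stand-ins written in plain Python (a constant predictor and a one-feature least-squares line); they are not the RF, EXRT, XGBoost or Lasso models named above, and the names `fit_constant`, `fit_line` and `block_predict` are hypothetical.

```python
from statistics import mean


def fit_constant(X, y):
    """Stand-in regressor: always predicts the training mean."""
    avg = mean(y)
    return lambda x: avg


def fit_line(X, y):
    """Stand-in regressor: least-squares line on the first feature only."""
    xs = [row[0] for row in X]
    x_bar, y_bar = mean(xs), mean(y)
    var = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (t - y_bar) for x, t in zip(xs, y)) / var
    return lambda x: y_bar + slope * (x[0] - x_bar)


def block_predict(models, x):
    """Return each model's prediction and their average."""
    preds = [m(x) for m in models]
    return preds, mean(preds)


X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
models = [fit_constant(X, y), fit_line(X, y)]
preds, intermediate = block_predict(models, [4.0])
print(preds, intermediate)
```

Because the two stand-ins disagree on the unseen input, the average sits between them, which is the sense in which differing model structures add diversity to the intermediate prediction result.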

As discussed above, this scheme predicts picture popularity by stacking multiple layers of regression models. Combining the intermediate prediction result with the first input feature to obtain the second input feature in this step therefore serves to judge whether another layer of regression models is needed for a further prediction.

Step S104: Judge, according to the second input feature, whether the deep stacked regression model has converged; if not, go to step S105; if so, go to step S106;

The purpose of this step is to use the second input feature obtained in step S103 to judge whether the deep stacked regression model has converged. Depth (the number of times the regression models are stacked) is critical for a stacked model. Within a certain range, greater depth gives a more accurate popularity prediction, but beyond a certain threshold, adding depth no longer improves accuracy and only wastes resources. This step therefore decides whether the stacking depth needs to be increased by judging whether the deep stacked regression model has converged.

If the deep stacked model has not converged, stacking must continue; if it has converged, the prediction of picture popularity is complete. Continuing to stack means using the third input feature obtained in step S105 to repeat steps S102, S103 and S104 until the deep stacked model is judged to have converged.
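The text does not state the convergence criterion. One common choice, shown here purely as an assumption, is to stop adding layers once the validation error has stopped improving; the function name `has_converged` and the parameters `patience` and `min_delta` are hypothetical.

```python
def has_converged(val_errors, patience=2, min_delta=1e-4):
    """Decide whether adding further layers is still worthwhile.

    `val_errors` holds one validation error per stacked layer so far.
    Returns True when none of the last `patience` layers improved on
    the earlier best error by at least `min_delta`. This early-stopping
    rule is an assumed criterion, not one given in the text.
    """
    if len(val_errors) <= patience:
        return False
    best_before = min(val_errors[:-patience])
    return min(val_errors[-patience:]) > best_before - min_delta


print(has_converged([0.9, 0.7, 0.6]))              # still improving
print(has_converged([0.9, 0.7, 0.6, 0.6, 0.61]))   # plateaued
```

A rule of this kind captures the trade-off described above: depth is added only while it still buys accuracy.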

Step S105: Combine the intermediate prediction result with the second input feature to obtain a third input feature, and return to step S102 with the third input feature as the new feature sample.

Step S106: Generate the corresponding picture popularity according to the intermediate prediction result.
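Steps S102 to S106 can be read as the loop sketched below. This is an illustrative skeleton under stated assumptions, not the patented implementation: the function `deep_stack`, the `max_layers` safety cap and the three injected callables are hypothetical, and the convergence test used in the usage example (successive predictions within a tolerance) is an assumed criterion.

```python
def deep_stack(sample, dropout, block, converged, max_layers=10):
    """Repeat dropout -> block -> detector until convergence.

    dropout(features)  -> first input feature (list of floats)
    block(first)       -> one prediction per base regressor
    converged(history) -> True once stacking should stop
    `max_layers` is an assumed cap so the loop always terminates.
    """
    features = list(sample)
    history = []
    for _ in range(max_layers):
        first = dropout(features)                # S102
        preds = block(first)                     # S103: base predictions
        intermediate = sum(preds) / len(preds)   # S103: averaged result
        history.append(intermediate)
        second = first + [intermediate]          # S103: second input feature
        if converged(history):                   # S104
            break
        features = second + [intermediate]       # S105: third input feature
    return intermediate, len(history)            # S106


# Trivial stand-ins so the sketch runs end to end.
prediction, layers = deep_stack(
    [1.0, 3.0],
    dropout=lambda f: f,                         # no inactivation
    block=lambda f: [min(f), max(f)],            # two toy "regressors"
    converged=lambda h: len(h) >= 2 and abs(h[-1] - h[-2]) < 1e-3,
)
print(prediction, layers)
```

Passing the three stages in as callables mirrors the Dropout, Block and Detector split of the system embodiment, so each stage can be swapped without touching the loop.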

Refer to FIG. 2, which is a flow chart of another method for predicting picture popularity provided by an embodiment of the present application.

This embodiment specifies how the source data is preprocessed to obtain feature samples in step S101 of the previous embodiment. The other steps are substantially the same as in the previous embodiment; for the common parts, refer to the corresponding description there, which is not repeated here.

The specific steps may include:

Step S201: Divide the source data into picture data and social data;

The picture data mentioned in this step is the visual information of the picture (the information carried by the picture content), and the social data is the social-media-level information (such as the posting time of the picture, the number of comments it received, whether it has tags, the number of tags, the length of its title, the average view count of the user who posted it, and the number of groups the user has joined).

Step S202: Perform feature extraction on the picture data using an artificial neural network to obtain the visual features;

Step S203: Transform the social data using multi-temporal scaling and Z-score standardization to obtain the social features;

Step S204: Splice the visual features and the social features according to preset rules to obtain the feature samples.
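Steps S203 and S204 can be sketched for a single social attribute as follows. Z-score standardization subtracts the column mean and divides by the column standard deviation. The splicing order (visual features first, social features after) and the sample values are assumptions, since the preset rules are not spelled out.

```python
from statistics import mean, pstdev


def z_score(column):
    """Standardize one numeric column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]   # constant column carries no signal
    return [(v - mu) / sigma for v in column]


# Made-up values: comment counts for three pictures.
comment_counts = [2.0, 4.0, 6.0]
social = z_score(comment_counts)

# Made-up 2-D visual features for the same three pictures.
visual = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5]]

# Splice: visual features followed by the standardized social feature.
samples = [v + [s] for v, s in zip(visual, social)]
print(samples)
```

Standardizing first keeps attributes with large raw ranges, such as view counts, from dominating attributes with small ranges, such as tag counts, once they share one feature vector.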

Refer to FIG. 3, which is a flow chart of yet another method for predicting picture popularity provided by an embodiment of the present application.

This embodiment specifies how an artificial neural network is used to extract the visual features from the picture data in step S202 of the previous embodiment. The other steps are substantially the same as in the previous embodiment; for the common parts, refer to the corresponding description there, which is not repeated here.

Step S301: Train a ResNeXt model, an Xception model and a DenseNet model on the ImageNet dataset and the Place365 dataset, where the ResNeXt model, the Xception model and the DenseNet model are all artificial neural networks;

Step S302: Input the picture data into the ResNeXt model, the Xception model and the DenseNet model, extract the values of the feature maps before the last layer of each model, and apply feature compression and feature selection to obtain the low-level features;

Step S303: Concatenate the scene information and category information that the ResNeXt model, the Xception model and the DenseNet model predict for the picture to obtain the high-level features.

Step S304: Combine the low-level features and the high-level features to obtain the visual features.

请参见图4，图4为本申请实施例所提供的一种预测图片流行度的系统的结构示意图；Please refer to FIG. 4, which is a schematic structural diagram of a system for predicting picture popularity provided by an embodiment of the present application;

该系统可以包括：The system can include:

预处理模块100，用于对接收的源数据进行预处理得到特征样本；其中，所述特征样本包括视觉特征和社交特征；A preprocessing module 100, configured to preprocess the received source data to obtain feature samples; wherein, the feature samples include visual features and social features;

Dropout模块200，用于对所述特征样本按照预设比率进行随机失活处理，得到第一输入特征；The Dropout module 200 is configured to perform random inactivation processing on the feature samples according to a preset ratio to obtain the first input feature;

Block模块300，用于利用预设数个回归模型对所述第一输入特征进行预测操作得到中间预测结果，并将所述中间预测结果与所述输入特征进行组合得到第二输入特征；Block module 300, configured to use several preset regression models to perform prediction operations on the first input features to obtain intermediate prediction results, and combine the intermediate prediction results with the input features to obtain second input features;

a Detector module 400, configured to judge, according to the second input feature, whether the deep stacked regression model has converged; if not, to combine the intermediate prediction result with the second input feature to obtain a third input feature, and to pass the third input result, as the new feature sample, into the next layer of the stacked regression model; if so, to generate the corresponding picture popularity according to the intermediate prediction result.

In FIG. 4, only the modules of the first layer of the stacked regression model are labeled; every layer of the stacked regression model has a Dropout module, a Block module and a Detector module.
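The interplay of the Dropout, Block and Detector modules across layers can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the two toy base regressors, the averaging of intermediate predictions into a final value, and the convergence test (training-error improvement below a tolerance `tol`) are all hypothetical, because this passage does not specify the regression models or the convergence criterion. All function and parameter names are invented for the example.

```python
import random

def dropout(sample, rate, rng):
    """Dropout module: zero each feature independently with probability `rate`."""
    return [0.0 if rng.random() < rate else x for x in sample]

def fit_mean(X, y):
    """Hypothetical base regressor: always predicts the mean of the training targets."""
    mean = sum(y) / len(y)
    return lambda row: mean

def fit_line(X, y, col):
    """Hypothetical base regressor: least-squares line on one feature column."""
    xs = [row[col] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / var if var else 0.0
    return lambda row: my + slope * (row[col] - mx)

def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

def deep_stacked_regression(X, y, rate=0.1, tol=1e-3, max_layers=5, seed=0):
    rng = random.Random(seed)
    samples, prev_err = X, float("inf")
    for layer in range(1, max_layers + 1):
        # Dropout module: feature samples -> first input feature.
        first = [dropout(row, rate, rng) for row in samples]
        # Block module: several regressors produce the intermediate prediction results.
        models = [fit_mean(first, y), fit_line(first, y, 0), fit_line(first, y, -1)]
        inter = [[m(row) for m in models] for row in first]
        second = [row + p for row, p in zip(first, inter)]  # second input feature
        pred = [sum(p) / len(p) for p in inter]  # assumed: average the regressors
        # Detector module (assumed criterion): stop once the error stops improving.
        err = mse(pred, y)
        if prev_err - err < tol:
            return pred, layer
        prev_err = err
        # Not converged: the third input feature becomes the next layer's feature sample.
        samples = [row + p for row, p in zip(second, inter)]
    return pred, max_layers

X = [[float(i), float(i % 3)] for i in range(10)]
y = [2.0 * i + 1.0 for i in range(10)]
pred, layers = deep_stacked_regression(X, y)
print(layers, len(pred))
```

Each layer widens the feature vector with the previous layer's predictions, which is what lets later regressors build on earlier ones; `max_layers` is an added safeguard so the loop always terminates.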

In another embodiment of the system for predicting picture popularity provided by the present application, the preprocessing module 100 includes:

Further, the visual feature extraction submodule includes:

Further, the two-level feature extraction includes:

Since the embodiments of the system part correspond to the embodiments of the method part, please refer to the description of the method embodiments for the system embodiments; details are not repeated here.

The present application also provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed, the steps provided in the above embodiments can be implemented. The storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

The present application also provides a server, which may include a memory and a processor. A computer program is stored in the memory, and when the processor invokes the computer program in the memory, the steps provided in the above embodiments can be implemented. The server may of course also include components such as various network interfaces and a power supply.

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method part. It should be pointed out that those of ordinary skill in the art can make a number of improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present application.

It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Furthermore, the terms "comprise", "include" and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent in such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device comprising that element.

Claims

1. A method for predicting picture popularity, comprising:

Step 1: Preprocess the received source data to obtain feature samples, wherein the feature samples include visual features and social features;

Step 2: Perform random inactivation processing on the feature samples according to a preset ratio to obtain the first input feature;

Step 3: Use a preset number of regression models to perform a prediction operation on the input feature to obtain an intermediate prediction result, and combine the intermediate prediction result with the first input feature to obtain a second input feature;

Step 4: Judge, according to the second input feature, whether the deep stacked regression model has converged; if not, combine the intermediate prediction result with the second input feature to obtain a third input feature, and return to Step 2 with the third input result as the new feature sample; if so, proceed to Step 5;

Step 5: Generate the corresponding picture popularity according to the intermediate prediction result.

2. The method according to claim 1, wherein said obtaining the feature samples by preprocessing the received source data comprises:

dividing the source data into picture data and social data;

performing feature extraction on the picture data using an artificial neural network to obtain the visual features;

converting the social data using multiple time scales and Z-score standardization to obtain the social features;

splicing the visual features and the social features according to preset rules to obtain the feature samples.

3. The method according to claim 2, wherein said performing feature extraction on the picture data using an artificial neural network to obtain the visual features comprises:

performing two-level feature extraction on the picture data using an artificial neural network to obtain low-level features and high-level features;

combining the low-level features and the high-level features to obtain the visual features.

4. The method according to claim 3, wherein said performing two-level feature extraction on the picture data using an artificial neural network to obtain low-level features and high-level features comprises:

training a ResNeXt model, an Xception model and a DenseNet model on the ImageNet dataset and the Place365 dataset, wherein the ResNeXt model, the Xception model and the DenseNet model are all artificial neural networks;

inputting the picture data into the ResNeXt model, the Xception model and the DenseNet model, extracting the values of the feature map before the last layer of each model, and applying feature compression and feature selection to obtain the low-level features;

concatenating the scene information and category information that the ResNeXt model, the Xception model and the DenseNet model predict for the picture to obtain the high-level features.

5. A system for predicting picture popularity, comprising:

a preprocessing module, configured to preprocess the received source data to obtain feature samples, wherein the feature samples include visual features and social features;

a Dropout module, configured to perform random inactivation processing on the feature samples according to a preset ratio to obtain a first input feature;

a Block module, configured to perform a prediction operation on the first input feature using a preset number of regression models to obtain an intermediate prediction result, and to combine the intermediate prediction result with the input feature to obtain a second input feature;

a Detector module, configured to judge, according to the second input feature, whether the deep stacked regression model has converged; if not, to combine the intermediate prediction result with the second input feature to obtain a third input feature, and to pass the third input result, as the new feature sample, into the next layer of the stacked regression model; if so, to generate the corresponding picture popularity according to the intermediate prediction result.

6. The system according to claim 5, wherein the preprocessing module comprises:

a classification submodule, configured to divide the source data into picture data and social data;

a visual feature extraction submodule, configured to perform feature extraction on the picture data using an artificial neural network to obtain the visual features;

a social feature extraction submodule, configured to convert the social data using multiple time scales and Z-score standardization to obtain the social features;

a splicing submodule, configured to splice the visual features and the social features according to preset rules to obtain the feature samples.

7. The system according to claim 6, wherein the visual feature extraction submodule comprises:

a two-level feature extraction unit, configured to perform two-level feature extraction on the picture data using an artificial neural network to obtain low-level features and high-level features;

a combining unit, configured to combine the low-level features and the high-level features to obtain the visual features.

8. The system according to claim 7, wherein the two-level feature extraction unit comprises:

a model training subunit, configured to train a ResNeXt model, an Xception model and a DenseNet model on the ImageNet dataset and the Place365 dataset, wherein the ResNeXt model, the Xception model and the DenseNet model are all artificial neural networks;

a low-level feature extraction subunit, configured to input the picture data into the ResNeXt model, the Xception model and the DenseNet model, extract the values of the feature map before the last layer of each model, and apply feature compression and feature selection to obtain the low-level features;

a high-level feature extraction subunit, configured to concatenate the scene information and category information predicted by the ResNeXt model, the Xception model and the DenseNet model to obtain the high-level features.

9. A computer-readable storage medium, on which a computer program is stored, wherein the method according to any one of claims 1 to 4 is implemented when the computer program is executed.

10. A server, comprising a memory and a processor, wherein a computer program is stored in the memory, and when the processor invokes the computer program in the memory, the method according to any one of claims 1 to 4 is implemented.