US12112463B2 - Method, device and computer readable medium for intrinsic popularity evaluation and content compression based thereon - Google Patents


Info

Publication number
US12112463B2
Authority
US
United States
Prior art keywords
popularity
intrinsic
image
score
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US17/447,447
Other versions
US20220084187A1 (en)
Inventor
Shiqi Wang
Kede Ma
Keyan Ding
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Roko Labs LLC
City University of Hong Kong CityU
Original Assignee
City University of Hong Kong CityU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by City University of Hong Kong CityU filed Critical City University of Hong Kong CityU
Priority to US17/447,447
Assigned to CITY UNIVERSITY OF HONG KONG reassignment CITY UNIVERSITY OF HONG KONG ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DING, KEYAN, MA, KEDE, WANG, SHIQI
Assigned to ROKO LABS LLC reassignment ROKO LABS LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: RAKOVITSKY, Dmitry, SELIN, SERGEI
Publication of US20220084187A1
Application granted
Publication of US12112463B2
Legal status: Active; adjusted expiration


Classifications

    • G06T 7/0002 Image analysis; Inspection of images, e.g. flaw detection
    • G06N 3/045 Neural networks; Combinations of networks
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/0495 Quantised networks; Sparse networks; Compressed networks
    • G06N 3/08 Learning methods
    • G06N 3/09 Supervised learning
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06V 10/82 Image or video recognition or understanding using neural networks
    • H04N 19/115 Selection of the code volume for a coding unit prior to coding
    • H04N 19/134 Adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • G06T 2207/20076 Probabilistic image processing
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30168 Image quality inspection

Definitions

  • the present specification relates broadly, but not exclusively, to methods, devices, and computer readable media for intrinsic popularity evaluation and content compression based thereon.
  • IPA absolute image popularity assessment
  • computational and/or storage resources of the various social network platforms cannot be allocated effectively, as it is difficult to provide more popular content (e.g., images/photos and videos) with more computational and storage resources without an accurate popularity evaluation/prediction.
  • a method of intrinsic popularity evaluation comprising: receiving an image from a social network; and determining an intrinsic popularity score for the image using a deep neural network (DNN) based intrinsic popularity assessment model.
  • DNN deep neural network
  • a method of content compression comprising: compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
  • a device for intrinsic popularity evaluation comprising: at least one processor; and a memory including computer program code for execution by the at least one processor, the computer program code instructs the at least one processor to: receive an image from a social network; and determine an intrinsic popularity score for the image using a deep neural network (DNN) based intrinsic popularity assessment model.
  • a device for content compression comprising: at least one processor configurable to allocate resources; and a memory including computer program code for execution by the at least one processor, the computer program code instructs the at least one processor to allocate resources by: compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
  • a device for content compression comprising: a core autoencoder including an analysis transformation and a synthesis transformation to learn a quantized latent representation of a content of information; and a hyper autoencoder arranged to learn a probabilistic model over the quantized latent representation of the content learned in the core autoencoder; wherein the content is processed by both the core autoencoder and the hyper autoencoder to generate a compressed content of the information, and wherein the core autoencoder and the hyper autoencoder are composed of multiple layers of conditional convolution, generalized divisive normalization (GDN), and inverse GDN, wherein the multiple layers of conditional convolution are conditioned on an intrinsic popularity score of the content of the information determined using a deep neural network (DNN) based intrinsic popularity assessment model.
  • a non-transitory computer readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform one or more steps in a method of intrinsic popularity evaluation as described herein.
  • FIG. 1 is a schematic diagram of a device 100 .
  • the device can be implemented as a device for intrinsic popularity evaluation.
  • the device can be implemented as a device for content compression.
  • FIG. 2 is a flow chart illustrating a method 200 for intrinsic popularity evaluation, according to an embodiment.
  • FIG. 3 shows a diagram 300 depicting an embodiment of content on a social network.
  • the content 300 is an image posted by a user.
  • the image 300 comprises a visual content 302 and a plurality of attributes.
  • the attributes include user statistics 304 , upload/download time 306 , and caption content 308 .
  • FIG. 4 shows a diagram 400 depicting an embodiment of popularity-discriminable image pairs (PDIPs).
  • FIG. 5 shows a diagram 500 depicting an embodiment of a deep neural network (DNN) based intrinsic popularity assessment model trained with a plurality of popularity-discriminable image pairs (PDIPs).
  • PDIPs popularity-discriminable image pairs
  • FIG. 6 is a flow chart illustrating a method 600 for content compression, according to an embodiment.
  • FIG. 7 is a schematic diagram of a device 700 for content compression, according to an embodiment.
  • FIG. 8 shows a block diagram of a computer system 800 suitable for use as a device 100 as exemplified in FIG. 1 , which in some embodiments can be implemented as a device for intrinsic popularity evaluation and in some other embodiments can be implemented as a device for content compression as described herein.
  • FIGS. 9 to 12 depict experimental data that demonstrates the effectiveness of the methods and devices for intrinsic popularity evaluation and for content compression as described herein.
  • the experimental data shows that the embodiments described herein produce a more accurate popularity evaluation which facilitates a more efficient computational and/or storage resource management for various social networks based on the intrinsic popularity evaluation. Details of the experimental data are as follows.
  • FIG. 9 shows a diagram 900 depicting accuracy of popularity evaluation conducted on a testing set of popularity-discriminable image pairs (PDIPs) based on various popularity evaluation methods. It is shown that the intrinsic popularity evaluation 914 as described in the present application achieved the highest accuracy, 76.65%, among the seven (7) popularity evaluation methods.
  • FIG. 10 shows a diagram 1000 depicting a normalised histogram of intrinsic popularity scores for a testing set of 5000 popularity-discriminable image pairs (PDIPs) based on the method of intrinsic popularity evaluation according to an embodiment. It is shown that the normalised histogram of the intrinsic popularity scores fits into a Gaussian curve 1002 .
  • FIG. 11 shows examples of images with different intrinsic popularity levels.
  • the respective intrinsic popularity scores of these images determined by the methods of intrinsic popularity evaluation as described herein are classified into five (5) intrinsic popularity levels: (a) excellent, (b) good, (c) fair, (d) bad, and (e) poor.
  • the excellent level may cover an intrinsic popularity score range of 6 and above.
  • the good level may cover an intrinsic popularity score range of 4 to 6.
  • the fair level may cover an intrinsic popularity score range of 2 to 4.
  • the bad level may cover an intrinsic popularity score range of 0 to 2.
  • the poor level may cover an intrinsic popularity score range of 0 and below.
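The five-level classification above can be sketched as a simple threshold function. Note that the treatment of boundary scores (e.g. exactly 4 or 6) is an assumption here, since the specification does not state which level a boundary value falls into.

```python
def popularity_level(score: float) -> str:
    # Thresholds follow the five levels described above; assigning a
    # boundary score (e.g. exactly 4 or 6) to the upper level is an
    # assumption, as the specification leaves the boundaries unspecified.
    if score >= 6:
        return "excellent"
    if score >= 4:
        return "good"
    if score >= 2:
        return "fair"
    if score >= 0:
        return "bad"
    return "poor"
```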
  • FIG. 12 shows a diagram 1200 depicting heatmaps of sample images generated by Grad-CAM.
  • a first row 1202 shows images of high intrinsic popularity, e.g. with intrinsic popularity scores in a range of 4 and above.
  • a second row 1204 shows images of low intrinsic popularity, e.g. with intrinsic popularity scores in a range of 2 and below. It is shown from the diagram 1200 that a warmer region in a visual content of an image contributes more to the image's intrinsic popularity.
  • the present specification also discloses apparatus for performing the operations of the methods.
  • Such apparatus may be specially constructed for the required purposes, or may comprise a computer or other device selectively activated or reconfigured by a computer program stored in the computer.
  • the algorithms and displays presented herein are not inherently related to any particular computer or other apparatus.
  • Various machines may be used with programs in accordance with the teachings herein.
  • the construction of more specialized apparatus to perform the required method steps may be appropriate.
  • the structure of a computer suitable for executing the various methods/processes described herein will appear from the description below.
  • the present specification also implicitly discloses a computer program, in that it would be apparent to the person skilled in the art that the individual steps of the method described herein may be put into effect by computer code.
  • the computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and coding thereof may be used to implement the teachings of the specification contained herein.
  • the computer program is not intended to be limited to any particular control flow. There are many other variants of the computer program, which can use different control flows without departing from the spirit or scope of the invention.
  • Such a computer program may be stored on any computer readable medium.
  • the computer readable medium may include storage devices such as magnetic or optical disks, memory chips, or other storage devices suitable for interfacing with a computer.
  • the computer readable medium may also include a hard-wired medium such as exemplified in the Internet system, or wireless medium such as exemplified in the GSM mobile telephone system.
  • the computer program when loaded and executed on such a computer effectively results in an apparatus that implements the steps of the preferred method.
  • Embodiments of the present application provide approaches that emphasize the visual content of images to evaluate intrinsic popularity and thereby provide a more accurate popularity evaluation for the images and a more efficient computational and/or storage resource management for various social network platforms based on the intrinsic popularity evaluation.
  • intrinsic popularity evaluation aims to provide accurate popularity predictions for one or more images on a social network by determining intrinsic popularity scores for the one or more images.
  • intrinsic popularity evaluation can be interchangeably referred to as intrinsic popularity prediction in the present application.
  • FIG. 1 illustrates a schematic diagram of a device 100 .
  • the device can be implemented as a device for intrinsic popularity evaluation.
  • the device can be implemented as a device for content compression.
  • the device 100 includes at least one processor 102 and a memory 104 .
  • the at least one processor 102 and the memory 104 are interconnected.
  • the memory 104 includes computer program code (not shown in FIG. 1 ) for execution by the at least one processor 102 .
  • the computer program code instructs the at least one processor 102 to perform the steps for intrinsic popularity evaluation as shown in FIG. 2 and described in the present application.
  • the computer program code instructs the at least one processor 102 to receive an image from a social network.
  • An example of the image is shown in FIG. 3 .
  • a diagram 300 depicts an embodiment of content on a social network.
  • the content 300 is an image posted by a user.
  • the image 300 comprises a visual content 302 and a plurality of attributes.
  • the attributes of the image 300 indicate social and textual information of the image 300 .
  • the attributes include user statistics 304 , upload/download time 306 , and caption content 308 .
  • the user statistics 304 may include user ID, number of posts of the user, number of follower accounts of the user, number of following accounts of the user, etc.
  • the attributes of the image 300 may include other information such as a Uniform Resource Locator (URL) web address of the image 300 (interchangeably referred to as “post URL” in the present application for the sake of simplicity), a type of the visual content 302 (for example, a building, a street view, etc.), number of comments, number of hashtags, number of @ signs, etc. It is appreciable to those skilled in the art that the attributes of the image 300 may include further information.
  • URL Uniform Resource Locator
  • the computer program code instructs the at least one processor 102 to determine an intrinsic popularity score for the image using a deep neural network (DNN) based intrinsic popularity assessment model.
  • the intrinsic popularity score for the image is associated with a visual content of the image.
  • the visual content 302 of the image 300 can be relied on solely for intrinsic popularity evaluation 310 , which arrives at an intrinsic popularity score of 32 (with a re-scaled maximum score of 100).
  • a conventional absolute image popularity assessment takes both the visual factor (e.g. visual content 302 of the image 300 ) and non-visual factors (e.g. the attributes 304 , 306 , 308 of the image 300 ) into consideration to predict the number of received likes for the image 300 to be 341 . This shows that the intrinsic popularity evaluation/assessment in the present application, which treats the visual factor (e.g. visual content 302 of the image 300 ) with more importance, can advantageously achieve a more direct intrinsic popularity score and a more accurate popularity prediction.
  • the intrinsic popularity score for the image can be further associated with one or more of the attributes as described above based on practical needs and/or requirements.
  • the DNN based intrinsic popularity assessment model is trained.
  • the computer program code instructs the at least one processor 102 to retrieve a plurality of historical images.
  • the plurality of historical images can be retrieved from the same social network as the image 300 . It is appreciable to those skilled in the art that the plurality of historical images used in the training may be retrieved from one or more different social networks. For example, the plurality of historical images can be retrieved from over 200 million distinctive posts crawled from Instagram. In some embodiments, one historical image can only be involved in one PDIP to ensure diversity of the training.
  • a plurality of popularity-discriminable image pairs can be constructed.
  • the construction of the plurality of PDIPs is in a manner that each PDIP in the plurality of PDIPs comprises a first image and a second image and that the first image has a probability of higher intrinsic popularity than the second image.
  • the construction of the plurality of PDIPs can be performed in the following manner. Considering a log-scaled number of likes S received for an image as a ground truth for absolute image popularity of the image, the following two assumptions are made.
  • First, S is assumed to follow a Gaussian distribution with mean Q and variance σ², where Q is a random variable, which can be viewed as the average number of likes received by an image in the log scale.
  • Second, the prior p(Q) is assumed to be flat with a finite positive support. To simplify the derivation, σ is treated as a positive constant to be determined.
  • Under these assumptions, the probability that Image A is intrinsically more popular than Image B, given the observed log-scaled likes S_A and S_B, is P(Q_A ≥ Q_B | S_A, S_B) = Φ((S_A − S_B)/(√2σ)), (5) where Φ(·) is the standard normal cumulative distribution function.
  • a large threshold T is chosen to ensure the popularity discriminability of the plurality of PDIPs, i.e., P(Q_A ≥ Q_B | S_A, S_B) ≥ T.
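Per equation (5), the posterior probability that Image A is intrinsically more popular than Image B is the standard normal CDF evaluated at the normalized difference of log-scaled likes, and a pair is kept as a PDIP only when this probability exceeds the threshold T. A minimal sketch of that selection rule follows; the values of σ and T are illustrative only.

```python
import math

def prob_a_more_popular(s_a: float, s_b: float, sigma: float) -> float:
    # Eq. (5): P(Q_A >= Q_B | S_A, S_B) = Phi((S_A - S_B) / (sqrt(2) * sigma)),
    # with Phi the standard normal CDF (computed here via the error function).
    z = (s_a - s_b) / (math.sqrt(2.0) * sigma)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def is_pdip(s_a: float, s_b: float, sigma: float, threshold: float = 0.99) -> bool:
    # Keep the pair only if its popularity discriminability exceeds the
    # large threshold T; sigma and the default T are illustrative values.
    return prob_a_more_popular(s_a, s_b, sigma) >= threshold
```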
  • FIG. 4 shows a diagram 400 depicting an embodiment of popularity-discriminable image pairs (PDIPs) as constructed above.
  • this embodiment 400 shows six (6) popularity-discriminable image pairs (PDIPs) 402 , 404 , 406 , 408 , 410 , 412 .
  • the left image in each pair is expected to be intrinsically more popular than the right image based on the above equation (5).
  • Such an expectation is confirmed by a psychophysical experiment conducted to prove effectiveness of the methods for intrinsic popularity evaluation. It is understandable that the embodiment in FIG. 4 only shows a portion of the popularity-discriminable image pairs (PDIPs) for the sake of simplicity.
  • the construction of the plurality of PDIPs may take non-visual factors into consideration in addition to the requirement that the first image in each PDIP has a probability of higher intrinsic popularity than the second image, e.g., P(Q_A ≥ Q_B | S_A, S_B) ≥ T. For example, it may be desirable for the first image and the second image in a PDIP to have similar textual and social contexts.
  • Such non-visual factors can be fulfilled by selecting two images that are associated with the same one or more attributes, in addition to requiring one of the two images to be intrinsically more popular than the other, when constructing a PDIP.
  • the non-visual factors are described above as one or more attributes of an image, which include user statistics, upload/download time, caption content, post URL, a type of visual content of the image (for example, a beach, a dog, a building, a street view, etc), number of comments, number of hashtags, and/or number of @ signs.
  • the user statistics may include user ID, number of posts of the user, number of follower accounts of the user, number of following accounts of the user, etc.
  • the construction of the plurality of PDIPs may further require images used for PDIP construction to be from a same user.
  • the construction of the plurality of PDIPs may further require the post time difference of two images in a PDIP to be limited to a maximum of ten days.
  • it is preferable to disregard images just uploaded to a social network, as their numbers of likes have not yet reached saturation values.
  • the construction of the plurality of PDIPs may further require excluding images posted within one month.
  • Captions of images have a noticeable influence on image popularity, especially those containing hashtags and @ signs.
  • a hot hashtag contributes significantly to image popularity because of the extensive exposure to viewers beyond followers.
  • the construction of the plurality of PDIPs may further require the hashtag and @ sign of the images in a PDIP to be the same in terms of both content and number.
  • the construction of the plurality of PDIPs may further require the length of the caption (excluding the hashtag and @ sign) to be restricted to a maximum of six words.
  • the construction of the plurality of PDIPs requires the first image in each PDIP to have a probability of higher intrinsic popularity than the second image, e.g., P(Q_A ≥ Q_B | S_A, S_B) ≥ T.
  • the one or more additional requirements may be optional in some embodiments or essential in some other embodiments.
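The non-visual construction requirements listed above (same user, post-time difference of at most ten days, posts older than one month, identical hashtags and @ signs in content and number, captions of at most six words) can be sketched as a metadata filter. All field names below are assumed for illustration and do not come from the specification.

```python
from datetime import datetime, timedelta

def may_form_pdip(post_a: dict, post_b: dict, now: datetime) -> bool:
    """Illustrative filter for the non-visual PDIP constraints; the field
    names (user_id, post_time, hashtags, ats, caption) are hypothetical."""
    month = timedelta(days=30)
    # Exclude posts from within the last month, whose likes may not have saturated.
    if now - post_a["post_time"] < month or now - post_b["post_time"] < month:
        return False
    # Both images must come from the same user.
    if post_a["user_id"] != post_b["user_id"]:
        return False
    # Post time difference limited to a maximum of ten days.
    if abs(post_a["post_time"] - post_b["post_time"]) > timedelta(days=10):
        return False
    # Hashtags and @ signs must match in both content and number.
    if sorted(post_a["hashtags"]) != sorted(post_b["hashtags"]):
        return False
    if sorted(post_a["ats"]) != sorted(post_b["ats"]):
        return False
    # Captions (hashtags/@ signs excluded upstream) limited to six words.
    if len(post_a["caption"].split()) > 6 or len(post_b["caption"].split()) > 6:
        return False
    return True
```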
  • the computer program code instructs the at least one processor 102 to train the DNN based intrinsic popularity assessment model 500 with the plurality of PDIPs.
  • An embodiment of the DNN based intrinsic popularity assessment model 500 trained with the plurality of PDIPs is shown in FIG. 5 .
  • the computer program code instructs the at least one processor 102 to determine a first intrinsic popularity score 510 for the first image 502 and a second intrinsic popularity score 512 for the second image 504 using the DNN based intrinsic popularity assessment model 500 , and optimise the DNN based intrinsic popularity assessment model 500 by minimizing a binary cross entropy loss between an estimated probability P̂_AB, computed from a score difference between the first intrinsic popularity score 510 and the second intrinsic popularity score 512 , and a ground truth binary label P_AB denoting whether the first image 502 is intrinsically more popular than the second image 504 .
  • the DNN based intrinsic popularity assessment model 500 is configured in a Siamese architecture, which includes two identical DNNs 506 , 508 , each denoted as f.
  • each of the two DNNs 506 , 508 is implemented by a 50-layer residual network, for example, ResNet-50. It is appreciable to those skilled in the art that the two DNNs can be implemented by other deep neural networks.
  • the two DNNs 506 , 508 may share the same weights during training and testing.
  • the ground truth binary label P AB of the PDIP denotes whether the first image 502 is intrinsically more popular than the second image 504 .
  • the DNN based intrinsic popularity assessment model 500 is optimised.
  • the DNNs 506 , 508 , denoted as f, are optimised as f* in the intrinsic popularity assessment model 500 .
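The pairwise training objective described above can be sketched as follows. Mapping the score difference to a probability via the standard normal CDF is an assumed choice mirroring equation (5) (with σ absorbed into the learned scores); the full training loop, with batching and backpropagation through the shared ResNet-50 weights, is omitted.

```python
import math

def predicted_prob(score_a: float, score_b: float) -> float:
    # Estimated probability that the first image is intrinsically more
    # popular, obtained by passing the score difference through the
    # standard normal CDF (an assumed choice mirroring Eq. (5)).
    z = (score_a - score_b) / math.sqrt(2.0)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pairwise_bce(score_a: float, score_b: float, label: int) -> float:
    # Binary cross entropy between the estimated probability and the
    # ground-truth binary label of the PDIP (1 if A is more popular).
    p = min(max(predicted_prob(score_a, score_b), 1e-12), 1.0 - 1e-12)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
```

Minimizing this loss drives the shared network to assign the higher score to the intrinsically more popular image of each pair.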
  • the above described embodiments of intrinsic popularity evaluation advantageously provide a more accurate popularity prediction for images on various social networks as compared to conventional absolute popularity evaluation methods, as described with reference to experimental data shown in FIGS. 9 to 12 .
  • Such a more accurate popularity prediction in turn can facilitate a more efficient computational and/or storage resource management for various social networks.
  • the intrinsic popularity scores determined by the intrinsic popularity evaluation further serve as a guidance to optimize content compression from the perspective of ultimate utility.
  • the present application allocates more resources to information having contents with high popularity scores, such that contents (e.g., images) that are potentially viewed more frequently are allocated with more coding bits. That is, contents of higher popularity are compressed with lower compression rates.
  • the computer program code instructs the at least one processor 102 to allocate resources by performing the steps in the exemplified method 600 for content compression as shown in FIG. 6 and described in the present application.
  • the computer program code instructs the at least one processor 102 to compress information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
  • the content of the information comprises an image.
  • the intrinsic popularity score of the image is determined using a deep neural network (DNN) based intrinsic popularity assessment model as described above with reference to the methods of intrinsic popularity evaluation.
  • the step of compressing the information comprises: classifying the intrinsic popularity score into a popularity level and compressing the image with the variable compression rate, which corresponds to the popularity level.
  • the variable compression rate is in a negative correlation to the corresponding intrinsic popularity score.
  • the excellent level may cover an intrinsic popularity score range of 6 and above.
  • the good level may cover an intrinsic popularity score range of 4 to 6.
  • the fair level may cover an intrinsic popularity score range of 2 to 4.
  • the bad level may cover an intrinsic popularity score range of 0 to 2.
  • the poor level may cover an intrinsic popularity score range of 0 and below. For example, if the image has an intrinsic popularity score of 5, the intrinsic popularity score is classified into the good level, and the image is compressed with a variable compression rate corresponding to the good level. Such a variable compression rate is in a negative correlation to the corresponding intrinsic popularity score.
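The negative correlation between intrinsic popularity and compression rate can be sketched as a mapping from score to an encoder quality setting (more popular content is compressed less aggressively). The concrete quality values below are assumptions for illustration, not taken from the specification.

```python
def target_quality(score: float) -> int:
    # Map the intrinsic popularity level to an encoder quality setting,
    # so that the compression rate is negatively correlated with the
    # intrinsic popularity score. Quality values are illustrative only.
    levels = [(6.0, 95), (4.0, 85), (2.0, 70), (0.0, 55)]  # excellent..bad
    for threshold, quality in levels:
        if score >= threshold:
            return quality
    return 40  # poor
```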
  • Another embodiment of a device 700 for content compression is depicted in FIG. 7 .
  • the device 700 is a DNN-based autoencoder for variable rate compression.
  • the device 700 comprises a core autoencoder 702 including an analysis transformation g a and a synthesis transformation g s to learn a quantized latent representation of a content x of information and a hyper autoencoder 704 arranged to learn a probabilistic model over the quantized latent representation of the content learned in the core autoencoder 702 .
  • the hyper autoencoder 704 can also include an analysis transformation h a and a synthesis transformation h s .
  • the content x is processed by both the core autoencoder 702 and the hyper autoencoder 704 to generate a compressed content ⁇ tilde over (x) ⁇ of the information.
  • the core autoencoder 702 and the hyper autoencoder 704 are trained as one or more deep neural networks (DNNs) that include one or more layers of conditional convolution, wherein the one or more layers of conditional convolution are conditioned on an intrinsic popularity score of the content x of the information determined using a DNN based intrinsic popularity assessment model.
  • Q box represents quantization.
  • AE box and AD box represent an arithmetic encoder and an arithmetic decoder, respectively.
  • CConv denotes the conditional convolution as described above that is conditioned on an intrinsic popularity score of the content x of the information.
  • Convolution parameters in the one or more layers of conditional convolution of the one or more DNNs are denoted as: kernel support height × kernel support width × number N or M of filters/down- or up-sampling stride, where ↑ indicates upsampling and ↓ indicates downsampling.
  • GDN indicates a generalized divisive normalization
  • IGDN is an inverse GDN.
  • the content of the information comprises an image x.
  • the variable compression rate of the image is achieved as described above based on a popularity based conditional convolution CConv according to equation (8).
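A minimal sketch of the popularity-based conditional convolution of equation (8) is given below: a plain convolution whose output channels are scaled by W(s) and shifted by b(s), both selected by the one-hot popularity level s. The 1×1 kernels and the randomly initialized per-level scale/bias tables are simplifying assumptions for illustration; in the actual model the tables are learned jointly with larger-support kernels.

```python
import numpy as np

rng = np.random.default_rng(0)

def cconv(x, kernels, W_table, b_table, s):
    """Conditional convolution: CConv = W(s) * Conv + b(s).

    x        : input feature map, shape (C_in, H, W)
    kernels  : conv filters, shape (C_out, C_in, 1, 1); a 1x1
               convolution is used here for brevity
    W_table, b_table : per-level scale/bias tables, shape
               (num_levels, C_out); learned in the real model,
               random stand-ins here
    s        : one-hot vector encoding the popularity level
    """
    # A 1x1 convolution is just a channel-mixing matrix multiply.
    conv = np.einsum('oi,ihw->ohw', kernels[:, :, 0, 0], x)
    # Select the level-specific channel-wise scale and bias.
    w = s @ W_table          # (C_out,)
    b = s @ b_table          # (C_out,)
    return w[:, None, None] * conv + b[:, None, None]

x = rng.standard_normal((3, 8, 8))
kernels = rng.standard_normal((4, 3, 1, 1))
W_table = rng.standard_normal((5, 4))
b_table = rng.standard_normal((5, 4))
s = np.eye(5)[1]             # one-hot code for one of five levels
y = cconv(x, kernels, W_table, b_table, s)
print(y.shape)               # (4, 8, 8)
```

Because only W(s) and b(s) depend on the popularity level, a single set of convolution kernels serves all levels, which is what makes variable-rate compression possible in one network.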
  • Rate loss representing the bit consumption of the autoencoder 700 is defined by
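The rate-loss equation itself is not reproduced in this excerpt. In hyperprior-style compression models such as the one in FIG. 7, bit consumption is commonly estimated as the negative log-likelihood (in bits) of the quantized core and hyper latents under the learned entropy models; the following toy sketch assumes that standard formulation rather than the patent's exact equation.

```python
import numpy as np

def rate_bits(p_y, p_z):
    """Estimate bit consumption of the quantized latents.

    p_y : probabilities the entropy model assigns to the quantized
          core-latent symbols
    p_z : probabilities assigned to the quantized hyper-latent symbols
    This is the usual hyperprior rate term (sum of -log2 p); the
    patent's own rate-loss equation is not reproduced in this excerpt.
    """
    return float(-(np.log2(p_y).sum() + np.log2(p_z).sum()))

# Toy probabilities for 4 core-latent and 2 hyper-latent symbols.
p_y = np.array([0.5, 0.25, 0.25, 0.5])
p_z = np.array([0.5, 0.5])
print(rate_bits(p_y, p_z))  # 8.0 bits
```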
  • FIG. 8 shows a block diagram of a computer system 800 suitable for use as a device 100 as exemplified in FIG. 1 , which in some embodiments can be implemented as a device for intrinsic popularity evaluation and in some other embodiments can be implemented as a device for content compression as described herein.
  • the example computing device 800 includes a processor 804 for executing software routines. Although a single processor is shown for the sake of clarity, the computing device 800 may also include a multi-processor system.
  • the processor 804 is connected to a communication infrastructure 806 for communication with other components of the computing device 800 .
  • the communication infrastructure 806 may include, for example, a communications bus, cross-bar, or network.
  • the computing device 800 further includes a main memory 808 , such as a random access memory (RAM), and a secondary memory 810 .
  • the secondary memory 810 may include, for example, a hard disk drive 812 and/or a removable storage drive 814 , which may include a magnetic tape drive, an optical disk drive, or the like.
  • the removable storage drive 814 reads from and/or writes to a removable storage unit 818 in a well-known manner.
  • the removable storage unit 818 may include a magnetic tape, optical disk, or the like, which is read by and written to by removable storage drive 814 .
  • the removable storage unit 818 includes a computer readable storage medium having stored therein computer executable program code instructions and/or data.
  • the secondary memory 810 may additionally or alternatively include other similar means for allowing computer programs or other instructions to be loaded into the computing device 800 .
  • Such means can include, for example, a removable storage unit 822 and an interface 820 .
  • Examples of such a removable storage unit 822 and interface 820 include a removable memory chip (such as an EPROM or PROM) and an associated socket, and other removable storage units 822 and interfaces 820 which allow software and data to be transferred from the removable storage unit 822 to the computer system 800 .
  • the computing device 800 also includes at least one communication interface 824 .
  • the communication interface 824 allows software and data to be transferred between computing device 800 and external devices via a communication path 826 .
  • the communication interface 824 permits data to be transferred between the computing device 800 and a data communication network, such as a public data or private data communication network.
  • the communication interface 824 may be used to exchange data between different computing devices 800 where such computing devices 800 form part of an interconnected computer network. Examples of a communication interface 824 can include a modem, a network interface (such as an Ethernet card), a communication port, an antenna with associated circuitry and the like.
  • the communication interface 824 may be wired or may be wireless.
  • Software and data transferred via the communication interface 824 are in the form of signals which can be electronic, electromagnetic, optical or other signals capable of being received by communication interface 824 . These signals are provided to the communication interface via the communication path 826 .
  • the computing device 800 further includes a display interface 802 which performs operations for rendering images to an associated display 830 and an audio interface 832 for performing operations for playing audio content via associated speaker(s) 834 .
  • Computer program product may refer, in part, to removable storage unit 818 , removable storage unit 822 , a hard disk installed in hard disk drive 812 , or a carrier wave carrying software over communication path 826 (wireless link or cable) to communication interface 824 .
  • Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computing device 800 for execution and/or processing.
  • Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray™ Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether such devices are internal or external to the computing device 800 .
  • Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computing device 800 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
  • the computer programs are stored in main memory 808 and/or secondary memory 810 . Computer programs can also be received via the communication interface 824 . Such computer programs, when executed, enable the computing device 800 to perform one or more features of embodiments discussed herein. In various embodiments, the computer programs, when executed, enable the processor 804 to perform features of the above-described embodiments. Accordingly, such computer programs represent controllers of the computer system 800 .
  • Software may be stored in a computer program product and loaded into the computing device 800 using the removable storage drive 814 , the hard disk drive 812 , or the interface 820 .
  • the computer program product may be downloaded to the computer system 800 over the communications path 826 .
  • the software when executed by the processor 804 , causes the computing device 800 to perform functions of embodiments described herein.
  • FIG. 8 is presented merely by way of example. Therefore, in some embodiments one or more features of the computing device 800 may be omitted. Also, in some embodiments, one or more features of the computing device 800 may be combined together. Additionally, in some embodiments, one or more features of the computing device 800 may be split into one or more component parts.
  • embodiments of the present application provide approaches that emphasize the visual content of images to evaluate intrinsic popularity and thereby advantageously provide a more accurate popularity prediction for images.
  • Such an accurate intrinsic popularity evaluation in turn advantageously serves as guidance to optimize content compression, thereby facilitating more efficient computational and/or storage resource management for various social networks.
  • FIGS. 9 to 12 depict experimental data that demonstrates the effectiveness of the methods and devices for intrinsic popularity evaluation and for content compression as described herein.
  • the experimental data shows that the embodiments described herein produce a more accurate popularity evaluation which facilitates a more efficient computational and/or storage resource management for various social networks based on the intrinsic popularity evaluation. Details of the experimental data are as follows.
  • FIG. 9 shows a diagram 900 depicting accuracy of popularity evaluation conducted on a testing set of popularity-discriminable image pairs (PDIPs) based on various popularity evaluation methods.
  • the various popularity evaluation methods include random evaluation 902 , virality detection 904 , Khosla14 906 , Hessel17 908 , LikelyAI 910 , human evaluation 912 , and the intrinsic popularity evaluation 914 of the present application.
  • the intrinsic popularity evaluation 914 as described in the present application achieved the highest accuracy, 76.65%, while the random evaluation 902 achieved 50%, the virality detection 904 53.21%, the Khosla14 906 54.39%, the Hessel17 908 65.54%, the LikelyAI 910 68.87%, and the human evaluation 912 72.40%.
  • FIG. 10 shows a diagram 1000 depicting a normalised histogram of intrinsic popularity scores for a testing set of 5000 popularity-discriminable image pairs (PDIPs) based on the method of intrinsic popularity evaluation according to an embodiment. It is shown that the normalised histogram of the intrinsic popularity scores fits into a Gaussian curve 1002 .
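The moment-matched Gaussian fit illustrated by curve 1002 can be reproduced in miniature as follows; the scores here are synthetic stand-ins drawn from a normal distribution, not the patent's experimental data, and serve only to show the histogram-versus-curve comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(loc=3.0, scale=1.5, size=10_000)  # synthetic stand-in

# Normalised histogram of the intrinsic popularity scores.
density, edges = np.histogram(scores, bins=40, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Moment-matched Gaussian curve, analogous to curve 1002.
mu, sigma = scores.mean(), scores.std()
gauss = np.exp(-0.5 * ((centers - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# The normalised histogram should track the fitted curve closely.
print(float(np.abs(density - gauss).max()))
```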
  • FIG. 11 shows examples of images with different intrinsic popularity levels.
  • the respective intrinsic popularity scores of these images determined by the methods of intrinsic popularity evaluation as described herein are classified into five (5) intrinsic popularity levels: (a) excellent, (b) good, (c) fair, (d) bad, and (e) poor.
  • the excellent level may cover an intrinsic popularity score range of 6 and above.
  • the good level may cover an intrinsic popularity score range of 4 to 6.
  • the fair level may cover an intrinsic popularity score range of 2-4.
  • the bad level may cover an intrinsic popularity score range of 0 to 2.
  • the poor level may cover an intrinsic popularity score range of 0 and below. It is appreciable to those skilled in the art that the intrinsic popularity score ranges described above are for exemplary purposes.
  • the intrinsic popularity score ranges may vary based on practical needs and requirements.
  • FIG. 12 shows a diagram 1200 depicting heatmaps of sample images generated by Grad-CAM.
  • a first row 1202 shows images of high intrinsic popularity, e.g. with intrinsic popularity scores in a range of 4 and above.
  • a second row 1204 shows images of low intrinsic popularity, e.g. with intrinsic popularity scores in a range of 2 and below. It is shown from the diagram 1200 that a warmer region in a visual content of an image contributes more to the image's intrinsic popularity.
  • the present disclosure provides the following.
  • a first aspect of the present disclosure provides a method of intrinsic popularity evaluation.
  • the method comprises: receiving an image from a social network; and determining an intrinsic popularity score for the image using a DNN based intrinsic popularity assessment model.
  • the method further comprises training the DNN based intrinsic popularity assessment model.
  • the training comprises: retrieving a plurality of historical images; constructing a plurality of PDIPs based on the plurality of historical images, wherein each PDIP comprises a first image and a second image, the first image having a probability of higher intrinsic popularity than the second image; and training the DNN based intrinsic popularity assessment model with the plurality of PDIPs.
  • the training of the DNN based intrinsic popularity assessment model with the plurality of PDIPs comprises: for each PDIP of the plurality of PDIPs, determining a first intrinsic popularity score for the first image and a second intrinsic popularity score for the second image using the DNN based intrinsic popularity assessment model, and optimising the DNN based intrinsic popularity assessment model by minimizing a binary cross entropy loss between a score difference between the first intrinsic popularity score and the second intrinsic popularity score and a ground truth binary label denoting whether the first image is intrinsically more popular than the second image.
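The pairwise training step above can be sketched as follows. Mapping the score difference to a probability through a sigmoid (a RankNet/Bradley-Terry-style choice) is an assumption about the exact formulation, since the description only states that a binary cross-entropy loss is computed between the score difference and the ground-truth binary label.

```python
import numpy as np

def pairwise_bce(score1, score2, label):
    """Binary cross-entropy ranking loss for one PDIP.

    score1, score2 : intrinsic popularity scores predicted by the DNN
                     for the first and second image of the pair
    label          : ground-truth binary label (1 if the first image
                     is intrinsically more popular, else 0)
    The sigmoid squashing of the score difference is an assumed
    RankNet-style formulation, not quoted from the patent.
    """
    p = 1.0 / (1.0 + np.exp(-(score1 - score2)))
    return float(-(label * np.log(p) + (1 - label) * np.log(1 - p)))

# A correctly ordered pair incurs a smaller loss than a mis-ordered one,
# so minimizing the loss pushes the first image's score above the second's.
low = pairwise_bce(score1=4.0, score2=1.0, label=1)
high = pairwise_bce(score1=1.0, score2=4.0, label=1)
print(low < high)  # True
```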
  • the intrinsic popularity score for the image is associated with a visual content of the image.
  • the intrinsic popularity score for the image is further associated with one or more attributes of the image, the one or more attributes including: upload/download time; post URL; user ID; type of the visual content; caption content, and/or number of comments.
  • a second aspect of the present disclosure provides a method of content compression.
  • the method comprises compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
  • the content of the information comprises an image.
  • the method further comprises determining the intrinsic popularity score of the image using a deep neural network (DNN) based intrinsic popularity assessment model.
  • the compressing of the information comprises: classifying the intrinsic popularity score into a popularity level; and compressing the image with the variable compression rate, the variable compression rate corresponding to the popularity level, wherein the variable compression rate is in a negative correlation to the corresponding intrinsic popularity score.
  • CConv = W(s) · Conv + b(s)
  • s is the popularity level of the image encoded by a one-hot vector
  • W(s) is a channel-wise scaling factor performed on a general convolution (Conv)
  • b(s) is a bias term
  • a third aspect of the present disclosure provides a device for intrinsic popularity evaluation.
  • the device comprises: at least one processor; and a memory including computer program code for execution by the at least one processor.
  • the computer program code instructs the at least one processor to: receive an image from a social network; and determine an intrinsic popularity score for the image using a DNN based intrinsic popularity assessment model.
  • the computer program code further instructs the at least one processor to: train the DNN based intrinsic popularity assessment model.
  • the computer program code further instructs the at least one processor to: retrieve a plurality of historical images; construct a plurality of popularity-discriminable image pairs (PDIPs) based on the plurality of historical images, wherein each PDIP comprises a first image and a second image, the first image having a probability of higher intrinsic popularity than the second image; and train the DNN based intrinsic popularity assessment model with the plurality of PDIPs.
  • the computer program code instructs the at least one processor to: for each PDIP of the plurality of PDIPs, determine a first intrinsic popularity score for the first image and a second intrinsic popularity score for the second image using the DNN based intrinsic popularity assessment model, and optimise the DNN based intrinsic popularity assessment model by minimizing a binary cross entropy loss between a score difference between the first intrinsic popularity score and the second intrinsic popularity score and a ground truth binary label denoting whether the first image is intrinsically more popular than the second image.
  • the intrinsic popularity score for the image is associated with a visual content of the image.
  • the intrinsic popularity score for the image is further associated with one or more attributes of the image.
  • the one or more attributes include: upload/download time; post URL; user ID; type of the visual content; caption content; and/or number of comments.
  • a fourth aspect of the present disclosure provides a device for content compression.
  • the device comprises: at least one processor configurable to allocate resources; and a memory including computer program code for execution by the at least one processor.
  • the computer program code instructs the at least one processor to allocate resources by compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
  • the allocated resources include storage space.
  • the content of the information comprises an image.
  • the computer program code further instructs the at least one processor to determine the intrinsic popularity score of the image using a DNN based intrinsic popularity assessment model.
  • the computer program code further instructs the at least one processor to: classify the intrinsic popularity score into a popularity level; and compress the image with the variable compression rate, the variable compression rate corresponding to the popularity level, wherein the variable compression rate is in a negative correlation to the corresponding intrinsic popularity score.
  • CConv = W(s) · Conv + b(s)
  • s is the popularity level of the image encoded by a one-hot vector
  • W(s) is a channel-wise scaling factor performed on a general convolution (Conv)
  • b(s) is a bias term
  • a fifth aspect of the present disclosure provides a device for content compression.
  • the device comprises: a core autoencoder including an analysis transformation and a synthesis transformation to learn a quantized latent representation of a content of information; and a hyper autoencoder arranged to learn a probabilistic model over the quantized latent representation of the content learned in the core autoencoder; wherein the content is processed by both the core autoencoder and the hyper autoencoder to generate a compressed content of the information, and wherein the core autoencoder and the hyper autoencoder are composed of multiple layers of conditional convolution, generalized divisive normalization (GDN), and inverse GDN, wherein the multiple layers of conditional convolution are conditioned on an intrinsic popularity score of the content of the information determined using a deep neural network (DNN) based intrinsic popularity assessment model.
  • the content of the information comprises an image.
  • a sixth aspect of the present disclosure provides a non-transitory computer readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform one or more steps in a method of intrinsic popularity evaluation according to any one of the embodiments disclosed in the first aspect of the present disclosure.
  • a seventh aspect of the present disclosure provides a non-transitory computer readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform one or more steps in a method of intrinsic popularity evaluation according to any one of the embodiments disclosed in the second aspect of the present disclosure.


Abstract

The present application provides methods, devices and computer readable media for intrinsic popularity evaluation and content compression based thereon. In an embodiment, there is provided a method of intrinsic popularity evaluation. The method comprises: receiving an image from a social network; and determining an intrinsic popularity score for the image using a deep neural network (DNN) based intrinsic popularity assessment model.

Description

CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Patent Application No. 63/077,854 filed on Sep. 14, 2020, the disclosure of which is incorporated by reference herein in its entirety.
TECHNICAL FIELD
The present specification relates broadly, but not exclusively, to methods, devices, and computer readable media for intrinsic popularity evaluation and content compression based thereon.
BACKGROUND
Recent years have witnessed an accelerated proliferation of images/photos and videos being uploaded to various social network platforms (interchangeably referred to as social networks in the present application) such as Instagram, Flickr, and Reddit. Some images turn out to be extremely popular, gaining millions of likes and comments, while others are completely ignored.
Conventional methods evaluate popularity of images by a combination of visual factors (e.g. visual content) and non-visual factors (e.g. user's statistics, upload time, caption, etc.). For example, absolute image popularity assessment (IPA) attempts to predict the number of received likes/comments of an image by combining all visual and non-visual factors.
However, visual factors and non-visual factors contribute differently to an image's popularity. It is noted that even for images uploaded by the same user at the same time, the popularities of the images may be substantially different. Conventional methods that treat visual and non-visual factors with the same or similar importance may result in an inaccurate popularity evaluation/prediction.
In addition, without an accurate popularity evaluation/prediction, computational and/or storage resources of the various social network platforms cannot be allocated effectively, as it is difficult to allot more computational and storage resources to more popular content (e.g. images/photos and videos).
A need therefore exists to provide methods and devices that seek to overcome or at least minimize the above-mentioned problems so as to provide a more accurate popularity evaluation and a more efficient computational and/or storage resource management.
SUMMARY
According to an embodiment, there is provided a method of intrinsic popularity evaluation, the method comprising: receiving an image from a social network; and determining an intrinsic popularity score for the image using a deep neural network (DNN) based intrinsic popularity assessment model.
According to another embodiment, there is provided a method of content compression, the method comprising: compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
According to another embodiment, there is provided a device for intrinsic popularity evaluation, the device comprising: at least one processor; and a memory including computer program code for execution by the at least one processor, the computer program code instructs the at least one processor to: receive an image from a social network; and determine an intrinsic popularity score for the image using a deep neural network (DNN) based intrinsic popularity assessment model.
According to another embodiment, there is provided a device for content compression, the device comprising: at least one processor configurable to allocate resources; and a memory including computer program code for execution by the at least one processor, the computer program code instructs the at least one processor to allocate resources by: compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
According to another embodiment, there is provided a device for content compression, comprising: a core autoencoder including an analysis transformation and a synthesis transformation to learn a quantized latent representation of a content of information; and a hyper autoencoder arranged to learn a probabilistic model over the quantized latent representation of the content learned in the core autoencoder; wherein the content is processed by both the core autoencoder and the hyper autoencoder to generate a compressed content of the information, and wherein the core autoencoder and the hyper autoencoder are composed of multiple layers of conditional convolution, generalized divisive normalization (GDN), and inverse GDN, wherein the multiple layers of conditional convolution are conditioned on an intrinsic popularity score of the content of the information determined using a deep neural network (DNN) based intrinsic popularity assessment model.
According to yet another embodiment, there is provided a non-transitory computer readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform one or more steps in a method of intrinsic popularity evaluation as described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments and implementations are provided by way of example only, and will be better understood and readily apparent to one of ordinary skill in the art from the following written description, read in conjunction with the drawings, in which:
FIG. 1 is a schematic diagram of a device 100. In an embodiment, the device can be implemented as a device for intrinsic popularity evaluation. In another embodiment, the device can be implemented as a device for content compression.
FIG. 2 is a flow chart illustrating a method 200 for intrinsic popularity evaluation, according to an embodiment.
FIG. 3 shows a diagram 300 depicting an embodiment of content on a social network. In this embodiment, the content 300 is an image posted by a user. The image 300 comprises a visual content 302 and a plurality of attributes. In the present embodiment, the attributes include user statistics 304, upload/download time 306, and caption content 308.
FIG. 4 shows a diagram 400 depicting an embodiment of popularity-discriminable image pairs (PDIPs).
FIG. 5 shows a diagram 500 depicting an embodiment of a deep neural network (DNN) based intrinsic popularity assessment model trained with a plurality of popularity-discriminable image pairs (PDIPs).
FIG. 6 is a flow chart illustrating a method 600 for content compression, according to an embodiment.
FIG. 7 is a schematic diagram of a device 700 for content compression, according to an embodiment.
FIG. 8 shows a block diagram of a computer system 800 suitable for use as a device 100 as exemplified in FIG. 1 , which in some embodiments can be implemented as a device for intrinsic popularity evaluation and in some other embodiments can be implemented as a device for content compression as described herein.
FIGS. 9 to 12 depict experimental data that demonstrates the effectiveness of the methods and devices for intrinsic popularity evaluation and for content compression as described herein. The experimental data shows that the embodiments described herein produce a more accurate popularity evaluation which facilitates a more efficient computational and/or storage resource management for various social networks based on the intrinsic popularity evaluation. Details of the experimental data are as follows.
FIG. 9 shows a diagram 900 depicting accuracy of popularity evaluation conducted on a testing set of popularity-discriminable image pairs (PDIPs) based on various popularity evaluation methods. It is shown that the intrinsic popularity evaluation 914 as described in the present application achieved the highest accuracy, 76.65%, among seven (7) popularity evaluation methods.
FIG. 10 shows a diagram 1000 depicting a normalised histogram of intrinsic popularity scores for a testing set of 5000 popularity-discriminable image pairs (PDIPs) based on the method of intrinsic popularity evaluation according to an embodiment. It is shown that the normalised histogram of the intrinsic popularity scores fits into a Gaussian curve 1002.
FIG. 11 shows examples of images with different intrinsic popularity levels. For the sake of simplicity, some images are resized without keeping aspect ratios for neat display. In an embodiment, the respective intrinsic popularity scores of these images determined by the methods of intrinsic popularity evaluation as described herein are classified into five (5) intrinsic popularity levels: (a) excellent, (b) good, (c) fair, (d) bad, and (e) poor. For example, the excellent level may cover an intrinsic popularity score range of 6 and above. The good level may cover an intrinsic popularity score range of 4 to 6. The fair level may cover an intrinsic popularity score range of 2-4. The bad level may cover an intrinsic popularity score range of 0 to 2. The poor level may cover an intrinsic popularity score range of 0 and below.
FIG. 12 shows a diagram 1200 depicting heatmaps of sample images generated by Grad-CAM. In the diagram 1200, a first row 1202 shows images of high intrinsic popularity, e.g. with intrinsic popularity scores in a range of 4 and above. A second row 1204 shows images of low intrinsic popularity, e.g. with intrinsic popularity scores in a range of 2 and below. It is shown from the diagram 1200 that a warmer region in a visual content of an image contributes more to the image's intrinsic popularity.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been depicted to scale. For example, the dimensions of some of the elements in the illustrations, block diagrams or flowcharts may be exaggerated in respect to other elements to help to improve understanding of the present embodiments.
DETAILED DESCRIPTION
Embodiments will be described, by way of example only, with reference to the drawings. Like reference numerals and characters in the drawings refer to like elements or equivalents.
Some portions of the description which follows are explicitly or implicitly presented in terms of algorithms and functional or symbolic representations of operations on data within a computer memory. These algorithmic descriptions and functional or symbolic representations are the means used by those skilled in the data processing arts to convey most effectively the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities, such as electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated.
Unless specifically stated otherwise, and as apparent from the following, it will be appreciated that throughout the present specification, discussions utilizing terms such as “receiving”, “determining”, “training”, “retrieving”, “constructing”, “determining”, “optimising”, “compressing”, “classifying”, or the like, refer to the action and processes of a computer system, or similar electronic device, that manipulates and transforms data represented as physical quantities within the computer system into other data similarly represented as physical quantities within the computer system or other information storage, transmission or display devices.
The present specification also discloses apparatus for performing the operations of the methods. Such apparatus may be specially constructed for the required purposes, or may comprise a computer or other device selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various machines may be used with programs in accordance with the teachings herein. Alternatively, the construction of more specialized apparatus to perform the required method steps may be appropriate. The structure of a computer suitable for executing the various methods/processes described herein will appear from the description below.
In addition, the present specification also implicitly discloses a computer program, in that it would be apparent to the person skilled in the art that the individual steps of the method described herein may be put into effect by computer code. The computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and coding thereof may be used to implement the teachings of the specification contained herein. Moreover, the computer program is not intended to be limited to any particular control flow. There are many other variants of the computer program, which can use different control flows without departing from the spirit or scope of the invention.
Furthermore, one or more of the steps of the computer program may be performed in parallel rather than sequentially. Such a computer program may be stored on any computer readable medium. The computer readable medium may include storage devices such as magnetic or optical disks, memory chips, or other storage devices suitable for interfacing with a computer. The computer readable medium may also include a hard-wired medium such as exemplified in the Internet system, or wireless medium such as exemplified in the GSM mobile telephone system. The computer program when loaded and executed on such a computer effectively results in an apparatus that implements the steps of the preferred method.
This specification uses the term “configured to” in connection with systems, devices, and computer program components. For a system of one or more computers to be configured to perform particular operations or actions means that the system has installed on it software, firmware, hardware, or a combination of them that in operation cause the system to perform the operations or actions. For one or more computer programs to be configured to perform particular operations or actions means that the one or more programs include instructions that, when executed by data processing apparatus, cause the apparatus to perform the operations or actions. For special-purpose logic circuitry to be configured to perform particular operations or actions means that the circuitry has electronic logic that performs the operations or actions.
Embodiments of the present application provide approaches that emphasize on visual content of images to evaluate intrinsic popularity and thereby provide a more accurate popularity evaluation for the images and a more efficient computational and/or storage resource management for various social network platforms based on the intrinsic popularity evaluation.
It is understandable to those skilled in the art that the intrinsic popularity evaluation in the present application aims to provide accurate popularity predictions for one or more images on a social network by determining intrinsic popularity scores for the one or more images. In this regard, intrinsic popularity evaluation can be interchangeably referred to as intrinsic popularity prediction in the present application.
FIG. 1 illustrates a schematic diagram of a device 100. In an embodiment, the device can be implemented as a device for intrinsic popularity evaluation. In another embodiment, the device can be implemented as a device for content compression.
The device 100 at least includes one or more processors 102 and a memory 104. The at least one processor 102 and the memory 104 are interconnected. The memory 104 includes computer program code (not shown in FIG. 1 ) for execution by the at least one processor 102.
In an embodiment where the device 100 is implemented as a device for intrinsic popularity evaluation, the computer program code instructs the at least one processor 102 to perform the steps for intrinsic popularity evaluation as shown in FIG. 2 and described in the present application.
At step 202, the computer program code instructs the at least one processor 102 to receive an image from a social network. An example of the image is shown in FIG. 3 .
In FIG. 3 , a diagram 300 depicts an embodiment of content on a social network. In this embodiment, the content 300 is an image posted by a user. The image 300 comprises a visual content 302 and a plurality of attributes. The attributes of the image 300 indicate social and textual information of the image 300. In this embodiment, the attributes include user statistics 304, upload/download time 306, and caption content 308. The user statistics 304 may include user ID, number of posts of the user, number of follower accounts of the user, number of following accounts of the user, etc. Additionally or alternatively, the attributes of the image 300 may include other information such as a Uniform Resource Locator (URL) web address of the image 300 (interchangeably referred to as "post URL" in the present application for the sake of simplicity), a type of the visual content 302 (for example, a building, a street view, etc.), number of comments, number of hashtags, and number of @ signs. It is appreciable to those skilled in the art that the attributes of the image 300 may include further information.
At step 204, the computer program code instructs the at least one processor 102 to determine an intrinsic popularity score for the image using a deep neural network (DNN) based intrinsic popularity assessment model. An embodiment of the DNN based intrinsic popularity assessment model is depicted in FIG. 5 and described in the corresponding paragraphs.
In some embodiments, the intrinsic popularity score for the image is associated with a visual content of the image. As shown in FIG. 3 , in this manner, the visual content 302 of the image 300 can be relied upon solely for intrinsic popularity evaluation 310, which arrives at an intrinsic popularity score of 32 (with a re-scaled maximum score of 100). On the other hand, a conventional absolute image popularity assessment takes both the visual factor (e.g. the visual content 302 of the image 300) and non-visual factors (e.g. the attributes 304, 306, 308 of the image 300) into consideration to predict the number of received likes for the image 300 to be 341. This shows that the intrinsic popularity evaluation/assessment in the present application, which treats the visual factor (e.g. the visual content 302 of the image 300) with more importance, can advantageously achieve a more direct intrinsic popularity score and a more accurate popularity prediction.
In some other embodiments, in addition to the visual content of the image, the intrinsic popularity score for the image can be further associated with one or more of the attributes as described above based on practical needs and/or requirements.
Prior to step 204, the DNN based intrinsic popularity assessment model is trained. During the training, the computer program code instructs the at least one processor 102 to retrieve a plurality of historical images. In some embodiments, the plurality of historical images can be retrieved from the same social network as the image 300. It is appreciable to those skilled in the art that the plurality of historical images used in the training may be retrieved from one or more different social networks. For example, the plurality of historical images can be retrieved from over 200 million distinctive posts crawled from Instagram. In some embodiments, one historical image can only be involved in one PDIP to ensure diversity of the training.
Based on the plurality of historical images, a plurality of popularity-discriminable image pairs (PDIPs) can be constructed. The construction of the plurality of PDIPs is in a manner that each PDIP in the plurality of PDIPs comprises a first image and a second image and that the first image has a probability of higher intrinsic popularity than the second image.
In an embodiment, the construction of the plurality of PDIPs can be performed in the following manner. Considering a log-scaled number of likes S received for an image as a ground truth for absolute image popularity of the image, the following two assumptions are made.
First assumption: S obeys a normal distribution
p(S \mid \mu) \propto \exp\left(-\frac{(S-\mu)^2}{2\sigma^2}\right)  (1)
with mean μ and standard deviation σ. In equation (1), μ is a random variable, which can be viewed as the average number of likes received by an image in the log scale. p(μ) is assumed to be flat with a finite positive support. To simplify the derivation, σ is treated as a positive constant to be determined.
Second assumption: Intrinsic popularity Q is a monotonically increasing function of μ. Using Bayes' theorem,
p(\mu \mid S) \propto p(S \mid \mu)\,p(\mu) \propto p(S \mid \mu),  (2)
where the second proportion follows from the first assumption that p(μ) is flat. That is, conditioning on S, μ is Gaussian with mean S and standard deviation σ.
To ensure that the first image A is intrinsically more popular than the second image B in a PDIP, the following probability is computed:
P(Q_A \ge Q_B \mid S_A, S_B) = P(\mu_A \ge \mu_B \mid S_A, S_B) = P(\mu_A - \mu_B \ge 0 \mid S_A, S_B),  (3)
where equation (3) follows from the second assumption that intrinsic popularity Q is a monotonically increasing function of μ. Assuming the variability of intrinsic popularities across images is uncorrelated, and conditioning on SA and SB, the difference μAB=μA−μB is also Gaussian:
p(\mu_{AB} \mid S_A, S_B) \propto \exp\left(-\frac{(\mu_{AB} - (S_A - S_B))^2}{4\sigma^2}\right).  (4)
Combining equation (3) with equation (4),
P(Q_A \ge Q_B \mid S_A, S_B) = \Phi\left(\frac{S_A - S_B}{\sqrt{2}\,\sigma}\right),  (5)
where Φ(·) is the standard normal cumulative distribution function. P(QA≥QB|SA,SB) indicates a probability that Image A is intrinsically more popular than Image B. In some embodiments, a large threshold T is chosen to ensure the popularity discriminability of the plurality of PDIPs, i.e., P(QA≥QB|SA,SB)≥T.
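For illustration only, the discriminability test of equation (5) can be sketched in Python. The standard normal CDF Φ is computed via the error function; the values of σ and the threshold T are assumptions chosen for demonstration, not values specified in the present application.

```python
import math

def pdip_probability(s_a, s_b, sigma=1.0):
    """P(Q_A >= Q_B | S_A, S_B) = Phi((S_A - S_B) / (sqrt(2) * sigma)),
    where S_A, S_B are log-scaled like counts (equation (5))."""
    z = (s_a - s_b) / (math.sqrt(2.0) * sigma)
    # Standard normal CDF expressed via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def is_discriminable(s_a, s_b, sigma=1.0, threshold=0.99):
    """Keep the pair (A, B) only if A is intrinsically more popular
    than B with probability at least `threshold` (the large T)."""
    return pdip_probability(s_a, s_b, sigma) >= threshold
```

With σ=1 and T=0.99, a pair qualifies only when the first image received substantially more log-scaled likes than the second, which is exactly the popularity discriminability the construction requires.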
FIG. 4 shows a diagram 400 depicting an embodiment of popularity-discriminable image pairs (PDIPs) as constructed above. The diagram 400 shows six (6) popularity-discriminable image pairs (PDIPs) 402, 404, 406, 408, 410, 412. In each PDIP 402, 404, 406, 408, 410, 412, the left image is expected to be intrinsically more popular than the right image based on the above equation (5). Such an expectation is confirmed by a psychophysical experiment conducted to prove the effectiveness of the methods for intrinsic popularity evaluation. It is understandable that the embodiment in FIG. 4 only shows a portion of the popularity-discriminable image pairs (PDIPs) for the sake of simplicity.
In some embodiments, the construction of the plurality of PDIPs may take non-visual factors into consideration in addition to the requirement that the first image in each PDIP has a probability of higher intrinsic popularity than the second image, e.g., P(QA≥QB|SA,SB)≥T. For example, it may be desirable for the first image and the second image in a PDIP to have similar textual and social contexts. Such non-visual factors can be fulfilled by selecting two images that are associated with the same one or more attributes in addition to requiring one of the two images intrinsically more popular than the other image of the two images when constructing a PDIP.
The non-visual factors are described above as one or more attributes of an image, which include user statistics, upload/download time, caption content, post URL, a type of visual content of the image (for example, a beach, a dog, a building, a street view, etc), number of comments, number of hashtags, and/or number of @ signs. The user statistics may include user ID, number of posts of the user, number of follower accounts of the user, number of following accounts of the user, etc.
Different users have different numbers of followers. As such, images posted by the users with more followers have higher chances of receiving more likes. Considering such influence of user statistics, the construction of the plurality of PDIPs may further require images used for PDIP construction to be from a same user.
A user often has a different number of followers at different times. To reduce the influence of uploading time, the construction of the plurality of PDIPs may further require the post time difference of the two images in a PDIP to be at most ten days. In addition, images just uploaded to a social network are preferably disregarded, as their numbers of likes have not reached a saturation value. As such, the construction of the plurality of PDIPs may further require excluding images posted within the last month.
Captions of images have a noticeable influence on image popularity, especially those containing hashtags and @ signs. A hot hashtag contributes significantly to image popularity because of the extensive exposure to viewers beyond followers. To remove the textual bias, the construction of the plurality of PDIPs may further require the hashtag and @ sign of the images in a PDIP to be the same in terms of both content and number. Moreover, the construction of the plurality of PDIPs may further require the length of the caption (excluding the hashtag and @ sign) to be restricted to a maximum of six words.
In view of the above, the construction of the plurality of PDIPs requires the first image in each PDIP to have a probability of higher intrinsic popularity than the second image, e.g., P(QA≥QB|SA,SB)≥T, with one or more additional requirements that the two images in each PDIP have to be from the same user, posted more than one month ago and within ten days of each other, and/or have captions with a maximum of six words and the same hashtag(s)/@ sign(s). Based on practical requirements, the one or more additional requirements may be optional in some embodiments or essential in some other embodiments.
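The pairing constraints above can be sketched as a simple filter. The post schema here (field names such as 'user_id', 'caption_words') is hypothetical and purely illustrative; the present application does not prescribe a particular data format.

```python
from datetime import datetime, timedelta

def eligible_pdip(post_a, post_b, now):
    """Check the non-visual pairing constraints: same user, posts at
    most 10 days apart, both older than one month, captions of at most
    six words (hashtags/@-mentions excluded), and identical hashtag
    and @-mention sets. Each post is a dict with keys 'user_id',
    'time' (datetime), 'caption_words', 'hashtags', 'mentions'."""
    same_user = post_a['user_id'] == post_b['user_id']
    close_in_time = abs(post_a['time'] - post_b['time']) <= timedelta(days=10)
    # Likes should have saturated: exclude posts newer than one month.
    mature = all(now - p['time'] > timedelta(days=30) for p in (post_a, post_b))
    short_captions = all(len(p['caption_words']) <= 6 for p in (post_a, post_b))
    same_tags = (post_a['hashtags'] == post_b['hashtags']
                 and post_a['mentions'] == post_b['mentions'])
    return same_user and close_in_time and mature and short_captions and same_tags
```

A candidate pair would then become a PDIP only if it additionally passes the discriminability test of equation (5).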
Thereafter, the computer program code instructs the at least one processor 102 to train the DNN based intrinsic popularity assessment model 500 with the plurality of PDIPs. An embodiment of the DNN based intrinsic popularity assessment model 500 trained with the plurality of PDIPs is shown in FIG. 5 .
As shown in FIG. 5 , for each PDIP 502, 504 of the plurality of PDIPs, the computer program code instructs the at least one processor 102 to determine a first intrinsic popularity score 510 for the first image 502 and a second intrinsic popularity score 512 for the second image 504 using the DNN based intrinsic popularity assessment model 500, and to optimise the DNN based intrinsic popularity assessment model 500 by minimizing a binary cross entropy loss between a probability P_AB, derived from a score difference between the first intrinsic popularity score 510 and the second intrinsic popularity score 512, and a ground truth binary label P̄_AB denoting whether the first image 502 is intrinsically more popular than the second image 504.
As shown in FIG. 5 , the DNN based intrinsic popularity assessment model 500 is configured in a Siamese architecture, which includes two identical DNNs 506, 508, each denoted as ƒ. In an embodiment, an RGB image is provided to a first DNN 506 as the first image 502 (input image A) of a PDIP, which produces a first intrinsic popularity score QA=ƒ(A) as an output of the first DNN 506. Analogously, another RGB image is provided to a second DNN 508 as the second image 504 (input image B) of the PDIP, which produces a second intrinsic popularity score QB=ƒ(B) as an output of the second DNN 508.
In some embodiments, each of the two DNNs 506, 508 is implemented by a 50-layer residual network, for example, ResNet-50. It is appreciable to those skilled in the art that the two DNNs can be implemented by other deep neural networks. The two DNNs 506, 508 may share the same weights during training and testing. A score difference between the first intrinsic popularity score QA and the second intrinsic popularity score QB is calculated as OAB=ƒ(A)−ƒ(B), and converted to a probability PAB using a logistic function
P_{AB} = \frac{\exp(O_{AB})}{1 + \exp(O_{AB})}.  (6)
The ground truth binary label P̄_AB of the PDIP denotes whether the first image 502 is intrinsically more popular than the second image 504: P̄_AB=1 indicates that the first image 502 is intrinsically more popular than the second image 504, and otherwise P̄_AB=0.
The binary cross entropy loss between the probability P_AB and the ground truth binary label P̄_AB is defined as the following loss function:
\ell = -\bar{P}_{AB} \log P_{AB} - (1 - \bar{P}_{AB}) \log(1 - P_{AB}) = -\bar{P}_{AB} O_{AB} + \log(1 + \exp(O_{AB})).  (7)
By minimizing the binary cross entropy loss l, the DNN based intrinsic popularity assessment model 500 is optimised.
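The pairwise objective of equations (6) and (7) can be sketched as a stand-alone computation; this is an illustrative numerical check, not the full Siamese training loop, and the score values are arbitrary examples.

```python
import numpy as np

def pairwise_probability(q_a, q_b):
    """Equation (6): P_AB = exp(O_AB) / (1 + exp(O_AB)),
    with O_AB = Q_A - Q_B the score difference of the Siamese pair."""
    o_ab = q_a - q_b
    return 1.0 / (1.0 + np.exp(-o_ab))

def pairwise_bce_loss(q_a, q_b, label):
    """Equation (7) in its numerically stable form:
    l = -label * O_AB + log(1 + exp(O_AB)),
    where label is 1 if image A is intrinsically more popular, else 0."""
    o_ab = q_a - q_b
    # logaddexp(0, o) computes log(1 + exp(o)) without overflow.
    return -label * o_ab + np.logaddexp(0.0, o_ab)
```

Minimizing this loss pushes the score of the more popular image in each PDIP above that of the less popular one, which is exactly the optimisation described above.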
After training, the DNN 506, 508 ƒ is optimised as ƒ* in the intrinsic popularity assessment model 500. Given an image X, an intrinsic popularity score of the image X can be determined at either DNN 506, 508 of the DNN based intrinsic popularity assessment model 500 as QX=ƒ*(X).
The above described embodiments of intrinsic popularity evaluation advantageously provide a more accurate popularity prediction for images on various social networks as compared to conventional absolute popularity evaluation methods, as described with reference to experimental data shown in FIGS. 9 to 12 . Such a more accurate popularity prediction in turn can facilitate a more efficient computational and/or storage resource management for various social networks.
For example, efficient compression of information is crucial for either local or cloud data storage. In the present application, the intrinsic popularity scores determined by the intrinsic popularity evaluation further serve as a guidance to optimize content compression from the perspective of ultimate utility. In other words, the present application allocates more resources to information having contents with high popularity scores, such that contents (e.g., images) that are potentially viewed more frequently are allocated with more coding bits. That is, contents of higher popularity are compressed with lower compression rates.
In an embodiment where the device 100 is implemented as a device for content compression, the computer program code instructs the at least one processor 102 to allocate resources by performing the steps in the exemplified method 600 for content compression as shown in FIG. 6 and described in the present application.
At step 602, the computer program code instructs the at least one processor 102 to compress information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
In some embodiments, the content of the information comprises an image. In some embodiments, the intrinsic popularity score of the image is determined using a deep neural network (DNN) based intrinsic popularity assessment model as described above with reference to the methods of intrinsic popularity evaluation. In these embodiments, the step of compressing the information comprises: classifying the intrinsic popularity score into a popularity level and compressing the image with the variable compression rate corresponding to the popularity level. In these embodiments, the variable compression rate is negatively correlated with the corresponding intrinsic popularity score.
In some embodiments, there may be five (5) intrinsic popularity levels: (a) excellent, (b) good, (c) fair, (d) bad, and (e) poor. The excellent level may cover an intrinsic popularity score range of 6 and above. The good level may cover an intrinsic popularity score range of 4 to 6. The fair level may cover an intrinsic popularity score range of 2 to 4. The bad level may cover an intrinsic popularity score range of 0 to 2. The poor level may cover an intrinsic popularity score range of 0 and below. For example, if the image has an intrinsic popularity score of 5, the intrinsic popularity score is classified into the good level and the image is compressed with a variable compression rate corresponding to the intrinsic popularity score and in turn the good level. Such a variable compression rate is negatively correlated with the corresponding intrinsic popularity score. In this manner, different images of different popularities can have different compression rates. In other words, images with higher intrinsic popularity scores are compressed with lower compression rates so that images that are more popular are compressed with less loss. It is appreciable to those skilled in the art that the intrinsic popularity score ranges described above are for exemplary purposes. The intrinsic popularity score ranges may vary based on practical needs and requirements.
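The classification-then-allocation step can be sketched as follows, using the example score ranges above; the mapping from level to an encoder quality setting is a hypothetical illustration (the exact rate control is an implementation choice not fixed by the present application).

```python
def popularity_level(score):
    """Map an intrinsic popularity score to one of the five example
    levels; the threshold values are illustrative, and boundary scores
    are assigned to the higher level by assumption."""
    if score >= 6:
        return 'excellent'
    if score >= 4:
        return 'good'
    if score >= 2:
        return 'fair'
    if score >= 0:
        return 'bad'
    return 'poor'

def compression_quality(score, q_min=30, q_max=95):
    """Toy mapping from level to an encoder quality setting: higher
    popularity gives higher quality, i.e. a lower compression rate."""
    levels = ['poor', 'bad', 'fair', 'good', 'excellent']
    rank = levels.index(popularity_level(score))
    return q_min + (q_max - q_min) * rank // (len(levels) - 1)
```

An image scoring 5 thus lands in the good level and receives a higher quality setting (fewer compression artifacts) than an image scoring below 0.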
In some embodiments, the variable compression rate of the image is achieved based on a popularity based conditional convolution CConv:
CConv=W(s)×Conv+b(s),  (8)
wherein W(s)=softplus(u×s) and b(s)=v×s, wherein s is the popularity level of the image encoded by a one-hot vector, W(s) is a channel-wise scaling factor performed on a general convolution (Conv), b(s) is a bias term performed on the general convolution (Conv), and u and v are learnable weights of fully connected layers in a DNN based autoencoder.
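Equation (8) can be sketched as a channel-wise modulation, with u and v as stand-ins for the learnable fully connected weights and the popularity level encoded as a one-hot vector; the shapes chosen here are illustrative assumptions.

```python
import numpy as np

def softplus(x):
    # softplus(x) = log(1 + exp(x)), computed without overflow
    return np.logaddexp(0.0, x)

def cconv(conv_out, s_onehot, u, v):
    """Popularity-conditioned modulation of equation (8):
    CConv = W(s) * Conv + b(s), with W(s) = softplus(u @ s) and
    b(s) = v @ s. `conv_out` is the output of a general convolution,
    of shape (channels, H, W); u and v are (channels, num_levels)
    matrices standing in for the fully connected layers."""
    w = softplus(u @ s_onehot)          # channel-wise scale, always > 0
    b = v @ s_onehot                    # channel-wise bias
    return w[:, None, None] * conv_out + b[:, None, None]
```

Because softplus is strictly positive, the conditioning rescales each channel without ever zeroing it out, while the bias shifts it per popularity level.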
Another embodiment of a device 700 for content compression is depicted in FIG. 7 . As shown in FIG. 7 , the device 700 is a DNN-based autoencoder for variable rate compression. The device 700 comprises a core autoencoder 702 including an analysis transformation ga and a synthesis transformation gs to learn a quantized latent representation of a content x of information, and a hyper autoencoder 704 arranged to learn a probabilistic model over the quantized latent representation learned in the core autoencoder 702. Similar to the core autoencoder 702, the hyper autoencoder 704 can also include an analysis transformation ha and a synthesis transformation hs. The content x is processed by both the core autoencoder 702 and the hyper autoencoder 704 to generate a compressed content x̃ of the information. The core autoencoder 702 and the hyper autoencoder 704 are implemented by one or more deep neural networks (DNNs) that include one or more layers of conditional convolution, wherein the one or more layers of conditional convolution are conditioned on an intrinsic popularity score of the content x of the information determined using a deep neural network (DNN) based intrinsic popularity assessment model.
In the embodiment of device 700 shown in FIG. 7 , the Q box represents quantization. The AE box and AD box represent an arithmetic encoder and an arithmetic decoder, respectively. CConv denotes the conditional convolution described above, which is conditioned on an intrinsic popularity score of the content x of the information. Convolution parameters in the one or more layers of conditional convolution of the one or more DNNs are denoted as: kernel support height×kernel support width×number N or M of filters/down- or up-sampling stride, where ↑ indicates upsampling and ↓ indicates downsampling. In the embodiment, each of the kernel support height and kernel support width is set as 5, N=128 and M=192. GDN indicates a generalized divisive normalization, and IGDN is an inverse GDN.
As described above, in some embodiments, the content of the information comprises an image x. The variable compression rate of the image is achieved as described above based on a popularity based conditional convolution CConv according to equation (8).
The objective function is a weighted sum of two terms that quantify a coding cost and a reconstruction error, respectively:
\mathcal{L} = \mathcal{L}_r + \lambda\,\mathcal{L}_d,  (9)
wherein the Lagrange multiplier λ controls the rate-distortion trade-off, which is known as rate-distortion optimization (RDO) in conventional codecs. The rate loss representing the bit consumption of the autoencoder 700 is defined by
\mathcal{L}_r = \mathbb{E}_{x \sim p_x}[-\log_2 p_y(y)] + \mathbb{E}_{x \sim p_x}[-\log_2 p_z(z)].  (10)
The distortion loss is the mean square error between the input image x and the reconstructed image x̃:

\mathcal{L}_d = \mathbb{E}_{x \sim p_x} \|x - \tilde{x}\|_2^2.  (11)
In general, the Lagrange multiplier λ is a constant that represents the slope of the rate-distortion (RD) curve, but in equation (9), λ is a monotonically increasing function of the popularity level s:
λ=α·sigmoid(s)  (12)
where α is a constant. In this manner, images of higher popularity levels will have a larger λ, resulting in more bits being allocated to popular images, thereby advantageously achieving a more efficient computational and/or storage resource management.
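The popularity-dependent weighting of equations (9) and (12) can be sketched as follows; the value of α and the scalar encoding of the level s are assumptions for illustration.

```python
import math

def lagrange_multiplier(s, alpha=0.02):
    """Equation (12): lambda = alpha * sigmoid(s), where `s` is a
    scalar encoding of the popularity level and `alpha` is an assumed
    constant controlling the overall distortion weight."""
    return alpha / (1.0 + math.exp(-s))

def rd_objective(rate_loss, distortion_loss, s, alpha=0.02):
    """Equation (9): L = L_r + lambda * L_d. A higher popularity level
    yields a larger lambda, so distortion on popular images is
    penalized more and they receive more coding bits."""
    return rate_loss + lagrange_multiplier(s, alpha) * distortion_loss
```

Because sigmoid is monotonically increasing, the distortion weight grows with the popularity level, reproducing the bit-allocation behaviour described above.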
FIG. 8 shows a block diagram of a computer system 800 suitable for use as a device 100 as exemplified in FIG. 1 , which in some embodiments can be implemented as a device for intrinsic popularity evaluation and in some other embodiments can be implemented as a device for content compression as described herein.
The following description of the computer system/computing device 800 is provided by way of example only and is not intended to be limiting.
As shown in FIG. 8 , the example computing device 800 includes a processor 804 for executing software routines. Although a single processor is shown for the sake of clarity, the computing device 800 may also include a multi-processor system. The processor 804 is connected to a communication infrastructure 806 for communication with other components of the computing device 800. The communication infrastructure 806 may include, for example, a communications bus, cross-bar, or network.
The computing device 800 further includes a main memory 808, such as a random access memory (RAM), and a secondary memory 810. The secondary memory 810 may include, for example, a hard disk drive 812 and/or a removable storage drive 814, which may include a magnetic tape drive, an optical disk drive, or the like. The removable storage drive 814 reads from and/or writes to a removable storage unit 818 in a well-known manner. The removable storage unit 818 may include a magnetic tape, optical disk, or the like, which is read by and written to by removable storage drive 814. As will be appreciated by persons skilled in the relevant art(s), the removable storage unit 818 includes a computer readable storage medium having stored therein computer executable program code instructions and/or data.
In an alternative implementation, the secondary memory 810 may additionally or alternatively include other similar means for allowing computer programs or other instructions to be loaded into the computing device 800. Such means can include, for example, a removable storage unit 822 and an interface 820. Examples of a removable storage unit 822 and interface 820 include a removable memory chip (such as an EPROM or PROM) and associated socket, and other removable storage units 822 and interfaces 820 which allow software and data to be transferred from the removable storage unit 822 to the computer system 800.
The computing device 800 also includes at least one communication interface 824. The communication interface 824 allows software and data to be transferred between the computing device 800 and external devices via a communication path 826. In various embodiments, the communication interface 824 permits data to be transferred between the computing device 800 and a data communication network, such as a public data or private data communication network. The communication interface 824 may be used to exchange data between different computing devices 800 where such computing devices 800 form part of an interconnected computer network. Examples of a communication interface 824 can include a modem, a network interface (such as an Ethernet card), a communication port, an antenna with associated circuitry and the like. The communication interface 824 may be wired or may be wireless. Software and data transferred via the communication interface 824 are in the form of signals which can be electronic, electromagnetic, optical or other signals capable of being received by the communication interface 824. These signals are provided to the communication interface via the communication path 826.
Optionally, the computing device 800 further includes a display interface 802 which performs operations for rendering images to an associated display 830 and an audio interface 832 for performing operations for playing audio content via associated speaker(s) 834.
As used herein, the term “computer program product” may refer, in part, to removable storage unit 818, removable storage unit 822, a hard disk installed in hard disk drive 812, or a carrier wave carrying software over communication path 826 (wireless link or cable) to communication interface 824. Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computing device 800 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray™ Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computing device 800. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computing device 800 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
The computer programs (also called computer program code) are stored in main memory 808 and/or secondary memory 810. Computer programs can also be received via the communication interface 824. Such computer programs, when executed, enable the computing device 800 to perform one or more features of embodiments discussed herein. In various embodiments, the computer programs, when executed, enable the processor 804 to perform features of the above-described embodiments. Accordingly, such computer programs represent controllers of the computer system 800.
Software may be stored in a computer program product and loaded into the computing device 800 using the removable storage drive 814, the hard disk drive 812, or the interface 820. Alternatively, the computer program product may be downloaded to the computer system 800 over the communications path 826. The software, when executed by the processor 804, causes the computing device 800 to perform functions of embodiments described herein.
It is to be understood that the embodiment of FIG. 8 is presented merely by way of example. Therefore, in some embodiments one or more features of the computing device 800 may be omitted. Also, in some embodiments, one or more features of the computing device 800 may be combined together. Additionally, in some embodiments, one or more features of the computing device 800 may be split into one or more component parts.
The techniques described in this specification produce one or more technical effects. As mentioned above, embodiments of the present application provide approaches that emphasize on visual content of images to evaluate intrinsic popularity and thereby advantageously provide a more accurate popularity prediction for images.
Furthermore, such an accurate intrinsic popularity evaluation in turn advantageously serves as a guidance to optimize content compression, thereby facilitates a more efficient computational and/or storage resource management for various social networks.
FIGS. 9 to 12 depict experimental data that proves effectiveness of the methods and devices for intrinsic popularity evaluation and for content compression as described herein. The experimental data shows that the embodiments described herein produce a more accurate popularity evaluation which facilitates a more efficient computational and/or storage resource management for various social networks based on the intrinsic popularity evaluation. Details of the experimental data are as follows.
FIG. 9 shows a diagram 900 depicting accuracy of popularity evaluation conducted on a testing set of popularity-discriminable image pairs (PDIPs) based on various popularity evaluation methods. The various popularity evaluation methods include random evaluation 902, virality detection 904, Khosla14 906, Hessel17 908, LikelyAI 910, human evaluation 912, and the intrinsic popularity evaluation 914 of the present application. It is shown that the intrinsic popularity evaluation 914 as described in the present application has achieved the highest accuracy at 76.65%, while the random evaluation 902 achieves an accuracy of 50%, the virality detection 904 of 53.21%, the Khosla14 906 of 54.39%, the Hessel17 908 of 65.54%, the LikelyAI 910 of 68.87%, and the human evaluation 912 of 72.40%.
FIG. 10 shows a diagram 1000 depicting a normalised histogram of intrinsic popularity scores for a testing set of 5000 popularity-discriminable image pairs (PDIPs) based on the method of intrinsic popularity evaluation according to an embodiment. It is shown that the normalised histogram of the intrinsic popularity scores is well fitted by a Gaussian curve 1002.
FIG. 11 shows examples of images with different intrinsic popularity levels. For the sake of simplicity, some images are resized without keeping aspect ratios for neat display. In an embodiment, the respective intrinsic popularity scores of these images determined by the methods of intrinsic popularity evaluation as described herein are classified into five (5) intrinsic popularity levels: (a) excellent, (b) good, (c) fair, (d) bad, and (e) poor. For example, the excellent level may cover an intrinsic popularity score range of 6 and above; the good level, a range of 4 to 6; the fair level, a range of 2 to 4; the bad level, a range of 0 to 2; and the poor level, a range below 0. It will be appreciated by those skilled in the art that the intrinsic popularity score ranges described above are for exemplary purposes; the score ranges may vary based on practical needs and requirements.
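As an illustrative sketch, the five-level classification may be expressed as follows; the thresholds simply mirror the exemplary ranges above and are not prescribed by the embodiments:

```python
def popularity_level(score: float) -> str:
    """Map an intrinsic popularity score to one of the five exemplary
    levels; thresholds follow the illustrative ranges in the text."""
    if score >= 6:
        return "excellent"
    if score >= 4:
        return "good"
    if score >= 2:
        return "fair"
    if score >= 0:
        return "bad"
    return "poor"
```

Such a mapping is also what the content compression embodiments below rely on when they classify an intrinsic popularity score into a popularity level.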
FIG. 12 shows a diagram 1200 depicting heatmaps of sample images generated by Grad-CAM. In the diagram 1200, a first row 1202 shows images of high intrinsic popularity, e.g. with intrinsic popularity scores of 4 and above. A second row 1204 shows images of low intrinsic popularity, e.g. with intrinsic popularity scores of 2 and below. The diagram 1200 shows that warmer regions in the visual content of an image contribute more to the image's intrinsic popularity.
It will be appreciated by a person skilled in the art that numerous variations and/or modifications may be made to the present invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.
Based on the embodiments disclosed above, the present disclosure provides the following.
A first aspect of the present disclosure provides a method of intrinsic popularity evaluation. The method comprises: receiving an image from a social network; and determining an intrinsic popularity score for the image using a DNN based intrinsic popularity assessment model.
In certain embodiments, the method further comprises training the DNN based intrinsic popularity assessment model. In particular, the training comprises: retrieving a plurality of historical images; constructing a plurality of PDIPs based on the plurality of historical images, wherein each PDIP comprises a first image and a second image, the first image having a probability of higher intrinsic popularity than the second image; and training the DNN based intrinsic popularity assessment model with the plurality of PDIPs.
In certain embodiments, the training of the DNN based intrinsic popularity assessment model with the plurality of PDIPs comprises: for each PDIP of the plurality of PDIPs, determining a first intrinsic popularity score for the first image and a second intrinsic popularity score for the second image using the DNN based intrinsic popularity assessment model, and optimising the DNN based intrinsic popularity assessment model by minimizing a binary cross entropy loss between a score difference between the first intrinsic popularity score and the second intrinsic popularity score and a ground truth binary label denoting whether the first image is intrinsically more popular than the second image.
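The pairwise training objective described above can be sketched as a RankNet-style loss. The following is a minimal NumPy illustration, assuming the DNN outputs a scalar intrinsic popularity score per image (the network architecture itself is not reproduced here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pairwise_bce_loss(score_a, score_b, label):
    """Binary cross-entropy between sigmoid(score_a - score_b) and the
    ground-truth binary label (1.0 if the first image of the PDIP is
    intrinsically more popular than the second, else 0.0)."""
    p = sigmoid(score_a - score_b)  # probability that image A wins
    return -(label * np.log(p) + (1.0 - label) * np.log(1.0 - p))
```

Minimising this loss pushes the score of the more popular image above that of the less popular one; when the two scores are equal, the loss equals -log(0.5) ≈ 0.693 regardless of the label.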
In certain embodiments, the intrinsic popularity score for the image is associated with a visual content of the image.
In certain embodiments, the intrinsic popularity score for the image is further associated with one or more attributes of the image, the one or more attributes including: upload/download time; post URL; user ID; type of the visual content; caption content; and/or number of comments.
A second aspect of the present disclosure provides a method of content compression. The method comprises compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
In certain embodiments, the content of the information comprises an image.
In certain embodiments, the method further comprises determining the intrinsic popularity score of the image using a deep neural network (DNN) based intrinsic popularity assessment model.
In certain embodiments, the compressing of the information comprises: classifying the intrinsic popularity score into a popularity level; and compressing the image with the variable compression rate, the variable compression rate corresponding to the popularity level, wherein the variable compression rate is in a negative correlation to the corresponding intrinsic popularity score.
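One hypothetical way to realize this negative correlation, consistent with the rate-distortion formulation in the claims, is to make the Lagrange multiplier λ that weights the reconstruction error a monotonically increasing function of the popularity level, so that higher-popularity images receive more bits (i.e., a lower compression rate). A sketch follows; the concrete λ values are illustrative assumptions, not taken from the embodiments:

```python
# Hypothetical, monotonically increasing lambda per popularity level.
LAMBDA_BY_LEVEL = {"poor": 0.01, "bad": 0.02, "fair": 0.04,
                   "good": 0.08, "excellent": 0.16}

def rd_objective(coding_cost, reconstruction_error, level):
    """Rate-distortion objective: coding cost plus lambda-weighted
    reconstruction error. A larger lambda penalises distortion more,
    so the optimiser spends more bits on higher-popularity images."""
    return coding_cost + LAMBDA_BY_LEVEL[level] * reconstruction_error
```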
In certain embodiments, the variable compression rate of the image is achieved based on a popularity based conditional convolution CConv: CConv=W(s)×Conv+b(s), wherein W(s)=softplus(u×s) and b(s)=v×s, and wherein s is the popularity level of the image encoded by a one-hot vector, W(s) is a channel-wise scaling factor performed on a general convolution (Conv), b(s) is a bias term performed on the general convolution (Conv), and u and v are learnable weights of fully connected layers in a DNN based autoencoder.
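A minimal NumPy sketch of this popularity based conditional convolution follows, assuming the output of the ordinary convolution has already been computed; the weight matrices `u` and `v` here stand in for the learnable fully connected layers:

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def cconv(conv_out, s_onehot, u, v):
    """Popularity-conditioned convolution: CConv = W(s) * Conv + b(s),
    with W(s) = softplus(u @ s) a channel-wise scaling factor and
    b(s) = v @ s a channel-wise bias, both driven by the one-hot
    popularity level s.

    conv_out: (channels, H, W) output of a general convolution
    s_onehot: (levels,) one-hot encoding of the popularity level
    u, v:     (channels, levels) stand-ins for learnable FC weights
    """
    W = softplus(u @ s_onehot)            # (channels,)
    b = v @ s_onehot                      # (channels,)
    return W[:, None, None] * conv_out + b[:, None, None]
```

The softplus keeps the scaling factor strictly positive, so conditioning on the popularity level can modulate, but never zero out, the convolution response.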
A third aspect of the present disclosure provides a device for intrinsic popularity evaluation. The device comprises: at least one processor; and a memory including computer program code for execution by the at least one processor. The computer program code instructs the at least one processor to: receive an image from a social network; and determine an intrinsic popularity score for the image using a DNN based intrinsic popularity assessment model.
In certain embodiments, the computer program code further instructs the at least one processor to: train the DNN based intrinsic popularity assessment model. During the training the computer program code further instructs the at least one processor to: retrieve a plurality of historical images; construct a plurality of popularity-discriminable image pairs (PDIPs) based on the plurality of historical images, wherein each PDIP comprises a first image and a second image, the first image having a probability of higher intrinsic popularity than the second image; and train the DNN based intrinsic popularity assessment model with the plurality of PDIPs.
In certain embodiments, during the training of the DNN based intrinsic popularity assessment model with the plurality of PDIPs, the computer program code instructs the at least one processor to: for each PDIP of the plurality of PDIPs, determine a first intrinsic popularity score for the first image and a second intrinsic popularity score for the second image using the DNN based intrinsic popularity assessment model, and optimise the DNN based intrinsic popularity assessment model by minimizing a binary cross entropy loss between a score difference between the first intrinsic popularity score and the second intrinsic popularity score and a ground truth binary label denoting whether the first image is intrinsically more popular than the second image.
In certain embodiments, the intrinsic popularity score for the image is associated with a visual content of the image.
In certain embodiments, the intrinsic popularity score for the image is further associated with one or more attributes of the image. The one or more attributes include: upload/download time; post URL; user ID; type of the visual content; caption content; and/or number of comments.
A fourth aspect of the present disclosure provides a device for content compression. The device comprises: at least one processor configurable to allocate resources; and a memory including computer program code for execution by the at least one processor. The computer program code instructs the at least one processor to allocate resources by compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information.
In certain embodiments, the allocated resources include storage space.
In certain embodiments, the content of the information comprises an image.
In certain embodiments, the computer program code further instructs the at least one processor to determine the intrinsic popularity score of the image using a DNN based intrinsic popularity assessment model.
In certain embodiments, during the compressing of the information, the computer program code further instructs the at least one processor to: classify the intrinsic popularity score into a popularity level; and compress the image with the variable compression rate, the variable compression rate corresponding to the popularity level, wherein the variable compression rate is in a negative correlation to the corresponding intrinsic popularity score.
In certain embodiments, the computer program code further instructs the at least one processor to achieve the variable compression rate of the image based on a popularity based conditional convolution CConv: CConv=W(s)×Conv+b(s), wherein W(s)=softplus(u×s) and b(s)=v×s, and wherein s is the popularity level of the image encoded by a one-hot vector, W(s) is a channel-wise scaling factor performed on a general convolution (Conv), b(s) is a bias term performed on the general convolution (Conv), and u and v are learnable weights of fully connected layers in a DNN based autoencoder.
A fifth aspect of the present disclosure provides a device for content compression. The device comprises: a core autoencoder including an analysis transformation and a synthesis transformation to learn a quantized latent representation of a content of information; and a hyper autoencoder arranged to learn a probabilistic model over the quantized latent representation of the content learned in the core autoencoder; wherein the content is processed by both the core autoencoder and the hyper autoencoder to generate a compressed content of the information, and wherein the core autoencoder and the hyper autoencoder are composed of multiple layers of conditional convolution, generalized divisive normalization (GDN), and inverse GDN, wherein the multiple layers of conditional convolution are conditioned on an intrinsic popularity score of the content of the information determined using a deep neural network (DNN) based intrinsic popularity assessment model.
In certain embodiments, the content of the information comprises an image.
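As a highly simplified, hypothetical round trip of the core autoencoder, the following sketch uses stand-in linear transforms and hard rounding in place of the learned conditional-convolution/GDN layers; nothing here reproduces the actual transforms, and the hyper autoencoder's probabilistic model over the latent is omitted:

```python
import numpy as np

def quantize(y):
    return np.round(y)  # hard rounding stands in for learned quantization

def analysis(x, A):      # stand-in for the analysis transformation
    return x @ A

def synthesis(y_hat, S): # stand-in for the synthesis transformation
    return y_hat @ S

# Toy round trip: scaling before quantization controls fidelity, loosely
# analogous to spending more bits on the quantized latent representation.
x = np.linspace(0.0, 1.0, 8)
A = 4.0 * np.eye(8)      # finer quantization grid -> more "bits"
S = np.eye(8) / 4.0
y_hat = quantize(analysis(x, A))   # quantized latent representation
x_hat = synthesis(y_hat, S)        # reconstruction
```

In the actual device, the hyper autoencoder additionally learns a probabilistic model over the quantized latent so that it can be entropy-coded efficiently; that stage is not sketched here.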
A sixth aspect of the present disclosure provides a non-transitory computer readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform one or more steps in a method of intrinsic popularity evaluation according to any one of the embodiments disclosed in the first aspect of the present disclosure.
A seventh aspect of the present disclosure provides a non-transitory computer readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform one or more steps in a method of content compression according to any one of the embodiments disclosed in the second aspect of the present disclosure.

Claims (9)

What is claimed is:
1. A method of content compression, the method comprising:
compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information, wherein the content of the information comprises an image; and
before the information is compressed, determining the intrinsic popularity score of the image using a deep neural network (DNN) based intrinsic popularity assessment model;
wherein the compressing of the information comprises:
classifying the intrinsic popularity score into a popularity level; and
compressing the image with the variable compression rate, the variable compression rate corresponding to the popularity level, the variable compression rate being in a negative correlation to the corresponding intrinsic popularity score, wherein the image is compressed by using a DNN based autoencoder, the DNN based autoencoder comprising learnable weights, the learnable weights being learnt according to an objective function ℓ given by ℓ=ℓ_r+λℓ_d, where ℓ_r is a coding cost, ℓ_d is a reconstruction error, and λ is a Lagrange multiplier for controlling rate-distortion trade-off, and wherein λ is a monotonic increasing function of the popularity level such that more bits are allocated for more popular images to thereby enable the image to be compressed with the variable compression rate corresponding to the popularity level and in the negative correlation to the corresponding intrinsic popularity score.
2. The method according to claim 1, further comprising:
training the DNN based intrinsic popularity assessment model, wherein the training comprises:
retrieving a plurality of historical images;
constructing a plurality of popularity-discriminable image pairs (PDIPs) based on the plurality of historical images, wherein each PDIP comprises a first image and a second image, the first image having a probability of higher intrinsic popularity than the second image; and
training the DNN based intrinsic popularity assessment model with the plurality of PDIPs.
3. The method according to claim 2, wherein the training of the DNN based intrinsic popularity assessment model with the plurality of PDIPs comprises:
for each PDIP of the plurality of PDIPs,
determining a first intrinsic popularity score for the first image and a second intrinsic popularity score for the second image using the DNN based intrinsic popularity assessment model, and
optimising the DNN based intrinsic popularity assessment model by minimizing a binary cross entropy loss between a score difference between the first intrinsic popularity score and the second intrinsic popularity score and a ground truth binary label denoting whether the first image is intrinsically more popular than the second image.
4. The method according to claim 1, wherein the intrinsic popularity score for the image is associated with a visual content of the image.
5. The method according to claim 4, wherein the intrinsic popularity score for the image is further associated with one or more attributes of the image, the one or more attributes including:
upload/download time,
post URL,
user ID,
type of the visual content,
caption content, and/or
number of comments.
6. A device for content compression, the device comprising:
at least one processor configurable to allocate resources; and
a memory including computer program code for execution by the at least one processor, wherein the computer program code instructs the at least one processor to allocate resources by:
compressing information with a variable compression rate corresponding to an intrinsic popularity score of a content of the information, wherein the content of the information comprises an image; and
before the information is compressed, determining the intrinsic popularity score of the image using a deep neural network (DNN) based intrinsic popularity assessment model;
wherein the compressing of the information comprises:
classifying the intrinsic popularity score into a popularity level; and
compressing the image with the variable compression rate, the variable compression rate corresponding to the popularity level, the variable compression rate being in a negative correlation to the corresponding intrinsic popularity score, wherein the image is compressed by using a DNN based autoencoder, the DNN based autoencoder comprising learnable weights, the learnable weights being learnt according to an objective function ℓ given by ℓ=ℓ_r+λℓ_d, where ℓ_r is a coding cost, ℓ_d is a reconstruction error, and λ is a Lagrange multiplier for controlling rate-distortion trade-off, and wherein λ is a monotonic increasing function of the popularity level such that more bits are allocated for more popular images to thereby enable the image to be compressed with the variable compression rate corresponding to the popularity level and in the negative correlation to the corresponding intrinsic popularity score.
7. The device according to claim 6, wherein the computer program code further instructs the at least one processor to:
train the DNN based intrinsic popularity assessment model, wherein during the training the computer program code further instructs the at least one processor to:
retrieve a plurality of historical images;
construct a plurality of popularity-discriminable image pairs (PDIPs) based on the plurality of historical images, wherein each PDIP comprises a first image and a second image, the first image having a probability of higher intrinsic popularity than the second image; and
train the DNN based intrinsic popularity assessment model with the plurality of PDIPs.
8. The device according to claim 7, wherein during the training of the DNN based intrinsic popularity assessment model with the plurality of PDIPs, the computer program code instructs the at least one processor to:
for each PDIP of the plurality of PDIPs,
determine a first intrinsic popularity score for the first image and a second intrinsic popularity score for the second image using the DNN based intrinsic popularity assessment model, and
optimise the DNN based intrinsic popularity assessment model by minimizing a binary cross entropy loss between a score difference between the first intrinsic popularity score and the second intrinsic popularity score and a ground truth binary label denoting whether the first image is intrinsically more popular than the second image.
9. The device according to claim 6, wherein the DNN based autoencoder comprises:
a core autoencoder including an analysis transformation and a synthesis transformation to learn a quantized latent representation of a content of information; and
a hyper autoencoder arranged to learn a probabilistic model over the quantized latent representation of the content learned in the core autoencoder;
wherein the content is processed by both the core autoencoder and the hyper autoencoder to generate a compressed content of the information, and
wherein the core autoencoder and the hyper autoencoder are composed of multiple layers of conditional convolution, generalized divisive normalization (GDN), and inverse GDN, wherein the multiple layers of conditional convolution are conditioned on an intrinsic popularity score of the content of the information determined using a deep neural network (DNN) based intrinsic popularity assessment model.
US17/447,447 2020-09-14 2021-09-13 Method, device and computer readable medium for intrinsic popularity evaluation and content compression based thereon Active 2042-08-31 US12112463B2 (en)


Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063077854P 2020-09-14 2020-09-14
US17/447,447 US12112463B2 (en) 2020-09-14 2021-09-13 Method, device and computer readable medium for intrinsic popularity evaluation and content compression based thereon

Publications (2)

Publication Number Publication Date
US20220084187A1 US20220084187A1 (en) 2022-03-17
US12112463B2 true US12112463B2 (en) 2024-10-08


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230162228A1 (en) * 2021-11-22 2023-05-25 Northwestern University Method and system to assess image advertisements


Patent Citations (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040105585A1 (en) 1997-06-04 2004-06-03 Nikon Corporation Image compression apparatus, method and recording medium storing an image compression program
WO1999052522A1 (en) 1998-04-15 1999-10-21 King's College, London University Protection of the nervous system using sodium channel blockers
US20080267501A1 (en) * 2001-03-29 2008-10-30 Keeney Richard A Apparatus and methods for digital image compression
US9084753B2 (en) 2006-11-15 2015-07-21 Steven A. Rich Uses for quaternary ammonium anticholinergic muscarinic receptor antagonists in patients being treated for cognitive impairment or acute delirium
US8097633B2 (en) 2006-11-15 2012-01-17 Rich Steven A Uses for quaternary ammonium anticholinergic muscarinic receptor antagonists in patients being treated for cognitive impairment or acute delirium
WO2009002935A1 (en) 2007-06-22 2008-12-31 Sciele Pharma, Inc. Transdermal delivery system comprising glycopyrrolate to treat sialorrhea
US20110302103A1 (en) 2010-06-08 2011-12-08 International Business Machines Corporation Popularity prediction of user-generated content
US20120102410A1 (en) * 2010-10-25 2012-04-26 Thomas Gewecke Media management for multi-user group
US8922400B2 (en) 2010-11-02 2014-12-30 I-CES (Innovative Compression Engineering Solutions) Method for compressing digital values of image, audio and/or video files
US20190095807A1 (en) * 2011-05-24 2019-03-28 Ebay Inc. Image-based popularity prediction
US8892591B1 (en) * 2011-09-30 2014-11-18 Google Inc. Presenting search results
US9491479B2 (en) 2012-03-31 2016-11-08 Baidu Online Network Technology (Beijing) Co., Ltd. Image compression method, apparatus and device
US9561218B2 (en) 2012-09-05 2017-02-07 Chase Pharmaceuticals Corporation Anticholinergic neuroprotective composition and methods
US20150215625A1 (en) 2012-10-11 2015-07-30 Tencent Technology (Shenzhen) Company Limited Image compression method and system
US20150024839A1 (en) * 2013-07-16 2015-01-22 Microsoft Corporation Game Clip Popularity Based Control
US20150067331A1 (en) * 2013-08-30 2015-03-05 International Business Machines Corporation Remote data storage
US20150161517A1 (en) 2013-12-10 2015-06-11 Electronics And Telecommunications Research Institute Device and method for predicting popularity of social data
US20160371293A1 (en) * 2015-06-19 2016-12-22 Lenovo (Singapore) Pte, Ltd. Managing storage of digital content
GB2540470A (en) * 2015-06-19 2017-01-18 Lenovo Singapore Pte Ltd Managing storage of digital content
US20170017652A1 (en) * 2015-07-16 2017-01-19 Vizio lnscape Technologies, LLC Prediction of Future Views of Video Segments to Optimize System Resource Utilization
US20170323210A1 (en) 2016-05-06 2017-11-09 Wp Company Llc Techniques for prediction of popularity of media
US20180191800A1 (en) * 2016-12-30 2018-07-05 Facebook, Inc. Decision engine for dynamically selecting media streams
CN107563394A (en) * 2017-09-19 2018-01-09 广东工业大学 A kind of method and system of predicted pictures popularity
US20190362367A1 (en) 2018-05-22 2019-11-28 Wp Company Llc Techniques for prediction of long-term popularity of digital media
US20200351437A1 (en) * 2019-05-02 2020-11-05 International Business Machines Corporation Generating image capture configurations and compositions
CN110222231A (en) * 2019-06-11 2019-09-10 成都澳海川科技有限公司 A kind of temperature prediction technique of video clip
US20220254279A1 (en) * 2019-07-30 2022-08-11 Ntt Docomo, Inc. Popularity evaluation system and geographical feature generation model
CN111339404A (en) * 2020-02-14 2020-06-26 腾讯科技(深圳)有限公司 Content popularity prediction method and device based on artificial intelligence and computer equipment

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
Almgren et al.; Predicting the future popularity of images on social networks. Multidisciplinary International Social Networks Conference on Social Informatics, Data Science. pp. 1-6, 2016.
Ballé et al.; Variational image compression with a scale hyperprior. International Conference on Learning Representations, pp. 1-14, 2018.
Choi et al.; Variable rate deep image compression with a conditional autoencoder. International Conference on Computer Vision, pp. 3146-3154, 2019.
Khosla et al.; What makes an image popular? International Conference on World Wide Web, pp. 867-876, 2014.
Liu et al.; Gated context model with embedded priors for deep image compression. arXiv preprint arXiv:1902.10480, 2019. *
Mazloom et al.; Multimodal popularity prediction of brand-related social media posts. ACM International Conference on Multimedia. pp. 197-201, 2016.
Minnen et al.; Joint autoregressive and hierarchical priors for learned image compression. Advances in Neural Information Processing Systems, pp. 10771-10780, 2018.


