CN112598438A - Outdoor advertisement recommendation system and method based on large-scale user portrait - Google Patents

Outdoor advertisement recommendation system and method based on large-scale user portrait

Info

Publication number
CN112598438A
Authority
CN
China
Prior art keywords
advertisement
user
module
real
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011513373.0A
Other languages
Chinese (zh)
Inventor
袁晓晔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou 15 Billion Intelligent Technology Co ltd
Original Assignee
Suzhou 15 Billion Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou 15 Billion Intelligent Technology Co ltd filed Critical Suzhou 15 Billion Intelligent Technology Co ltd
Priority to CN202011513373.0A priority Critical patent/CN112598438A/en
Publication of CN112598438A publication Critical patent/CN112598438A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model

Abstract

The invention discloses an outdoor advertisement recommendation method based on a large-scale user portrait, and belongs to the technical field of computational advertising. In the method, the advertisement scheduling module is connected with the user side, the advertisement management module and the real-time accurate crowd analysis module. The real-time accurate crowd analysis module exchanges parameters with the advertisement retrieval module, the user portrait module and the user behavior module. The user data mining module is connected with the advertisement management module, the user behavior module and the user portrait module, and the advertisement management module is connected with the advertisement retrieval module. When a user logs in to the platform, the method queries the user's interests and historical behavior records according to the user portrait, generates a personalized advertisement list, and finally pushes the advertisements to the outdoor advertisement terminal. The method has good autonomous learning ability, can effectively improve the reach rate of outdoor intelligent advertisements, and is suitable for outdoor advertisement recommendation in the context of the Internet of Things and big data.

Description

Outdoor advertisement recommendation system and method based on large-scale user portrait
Technical Field
The invention relates to an advertisement recommendation method, in particular to an outdoor advertisement recommendation method based on a large-scale user portrait, and belongs to the technical field of computational advertising.
Background
The rapid development of 5G and Internet of Things technology provides technical support and development opportunities for bringing outdoor media advertising online and making it digital and intelligent, so that outdoor advertising can reinvent itself in the Internet of Things era and fully enter the data-driven era. Meanwhile, with the construction of modern smart cities, Internet of Things technology has given outdoor LED programmatic advertising new vitality: offline commercial display screens not only show the commercial vitality of a city, but also allow urban operators to monetize traffic.
Against the background of the big data era, combining big data thinking with advertising is a future trend. Big data thinking has three important dimensions: quantitative thinking, relational thinking and experimental thinking. In short, everything is measurable (it can be described by data), everything is associable (different things are inherently related), and everything is testable (every solution can be verified); combined with advertising, this thinking will bring substantial change to the advertising industry. Combining online and offline advertising works better than either alone. Although online advertising is developing rapidly, offline advertising feels warmer and often leaves a deeper impression, so combining the two is the better scheme, and the trend toward combined online and offline advertising will become more obvious as "Internet Plus" continues to deepen and develop.
Currently, more and more advertisers are no longer satisfied with merely inexpensive advertisements, and have begun to require advertising agencies to measure foot traffic, audience activity, dissemination effect and so on. Despite these new requirements, the industry still relies on traditional methods that lack a data basis, such as estimating display capacity (CPM) from claimed pedestrian and vehicle traffic figures and estimating outdoor media opportunities to see (OTS) by subjective judgment. The traditional outdoor advertising mode is therefore not suitable for outdoor advertisement recommendation in the context of big data.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides the outdoor advertisement recommendation method based on the large-scale user portrait, which has the characteristics of good autonomous learning capability and capability of effectively improving the intelligent level of outdoor advertisement recommendation, and is suitable for outdoor advertisement recommendation under the current Internet of things and big data background.
The invention provides an outdoor advertisement recommendation system based on a large-scale user portrait, which comprises an advertisement scheduling module, a real-time accurate crowd analysis module, an advertisement retrieval module, a user behavior module, a user portrait module, an advertisement management module and a user data mining module, wherein the advertisement scheduling module is connected with the user side, the real-time accurate crowd analysis module and the advertisement management module; the real-time accurate crowd analysis module exchanges parameters with the advertisement retrieval module, the user behavior module and the user portrait module; the user data mining module is connected with the user behavior module, the user portrait module and the advertisement management module; the advertisement retrieval module is connected with the advertisement management module;
the advertisement scheduling module performs real-time traffic analysis according to the real-time accurate crowd analysis module and is used for completing the environmental guidance of the whole advertisement scheduling execution;
the real-time accurate crowd analysis module is used for completing the advertisement ranking, numbering the ranked advertisements and, after scoring, returning the numbered advertisements to the advertisement scheduling module;
the advertisement retrieval module indexes advertisement data by acquiring the label and characteristic parameter information transmitted by the real-time accurate crowd analysis module and the data of the advertisement management module, and returns the hit advertisement list parameters to the real-time accurate crowd analysis module;
the user behavior module acquires the category label provided by the user data mining module and the user portrait parameter transmitted by the real-time accurate crowd analysis module, completes the query of user behavior information, and returns the user information and the strategy to the real-time accurate crowd analysis module;
the user portrait module completes user portrait matching through the real-time information transmitted by the real-time accurate crowd analysis module and returns the user information to the real-time accurate crowd analysis module;
the advertisement management module is used for storing the latest advertisement putting strategy set and providing the data set to the advertisement retrieval module, the user data mining module and the advertisement scheduling module for use;
the user data mining module can acquire the data of the advertisement management module in real time, match the user portrait, and analyze and predict the user behavior.
In a further limited technical scheme of the present invention, the user data mining module of the outdoor advertisement recommendation system based on a large-scale user portrait includes a portrait updating module, a strategy updating module and a behavior flow detection module. The portrait updating module and the strategy updating module build an HBase dynamic data storage area online so that portraits and strategies can be updated in real time and used concurrently.
In the outdoor advertisement recommendation system based on the large-scale user portrait, the behavior flow detection module receives real-time data through a Flume distributed log collection system, and performs advertisement recommendation by establishing a ranking model based on CTR, viewing duration, positive feedback and negative feedback.
An outdoor advertisement recommendation method based on a large-scale user portrait comprises the following steps:
S1, acquiring user portrait information and user behavior information through log data and service data of the APP terminal;
S2, preprocessing the acquired data, wherein the input data must be in libSVM format, namely:
y index_1:value_1 index_2:value_2 … index_n:value_n,
the feature vectors are therefore converted into libSVM format to meet the input requirement of the FM model;
S3, considering that features such as the user portrait information, the user behavior information and the advertisement portrait information are not always continuous but are in many cases category values, the collected data is One-hot encoded to conform to the training data format of the model, and the One-hot encoding can be expressed as:
Figure BDA0002845879100000031
wherein u represents a user, a represents an advertisement, and h represents historical behavior information;
S4, adopting an FM model that models all pairwise combinations of features so as to estimate a personalized advertisement recommendation list for the user, wherein the second-order polynomial FM model can be expressed as:
Figure BDA0002845879100000035
S5, decomposing the matrix W in S4 as W = V^T V, where V = (V_1, V_2, …, V_n)^T and V_i = (v_i1, v_i2, …, v_ik). Each row of the matrix V^T represents the relevance of a certain user to different features, and each row of the matrix V represents the relevance of a certain feature to different advertisements; the model can be expressed as:
Figure BDA0002845879100000032
wherein
Figure BDA0002845879100000033
Therefore, there are:
Figure BDA0002845879100000034
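The formula images for S4 and S5 are not reproduced in this text. Assuming they follow the standard factorization machine formulation, which is consistent with the definitions of W, V, V_i and k given above, they would read roughly as follows. The symbols w_0 (global bias), w_i (first-order weight) and x_i (the i-th component of the encoded feature vector) are assumptions that the text does not name.

```latex
% Second-order polynomial model (S4), with one weight w_{ij} per feature pair
\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i
           + \sum_{i=1}^{n} \sum_{j=i+1}^{n} w_{ij}\, x_i x_j

% Factorized model (S5): each pair weight is an inner product of latent vectors
\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i
           + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle V_i, V_j \rangle\, x_i x_j,
\qquad
\langle V_i, V_j \rangle = \sum_{f=1}^{k} v_{if}\, v_{jf}

% Linear-time rewriting of the pairwise term
\sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle V_i, V_j \rangle\, x_i x_j
  = \frac{1}{2} \sum_{f=1}^{k}
    \left[ \Big( \sum_{i=1}^{n} v_{if}\, x_i \Big)^{2}
         - \sum_{i=1}^{n} v_{if}^{2}\, x_i^{2} \right]
```

The last identity reduces the cost of one prediction from O(kn^2) to O(kn), which matters when the One-hot encoded vector x is very long but sparse.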
S6, in order to verify the recommendation model offline, the following metrics are defined:
Precision: the proportion of correctly classified positive samples among the samples judged positive by the classifier, where R(u) corresponds to the positive samples of the model; it is expressed as:
Figure BDA0002845879100000041
wherein R(u) is the recommendation list obtained through the recommendation model, and T(u) is the behavior list of the user in the actual scene;
Recall: the proportion of correctly classified positive samples among the true positive samples, where T(u) corresponds to the true positive sample set; it is expressed as:
Figure BDA0002845879100000042
Coverage: the proportion of all items that the recommendation system actually recommends; for the same coverage, the distribution of recommendation counts (popularity) across advertisements can differ; in order to better describe the ability of the recommendation system to mine the long tail, the number of times each advertisement is recommended is counted, and coverage is defined by information entropy:
Figure BDA0002845879100000043
where p(i) is the popularity of advertisement i divided by the sum of the popularity of all advertisements;
Diversity: a measure of the dissimilarity among the advertisements in a recommendation list; the similarity of advertisements in the list can be measured with different similarity functions, such as content-based similarity and collaborative-filtering-based similarity, giving diversity from different angles; it is expressed as:
Figure BDA0002845879100000044
wherein s(i, j) is the similarity of advertisement i and advertisement j;
the overall diversity of the recommendation system can be defined as the average of the diversity of all user recommendation lists:
Figure BDA0002845879100000045
Novelty: the simplest approach is to recommend advertisements that users have not seen before, but the number of advertisements each user has not seen is huge, so the average popularity of the recommended advertisements is calculated instead, since less popular advertisements are more likely to be novel to users; therefore, the lower the average popularity of the items in the recommendation result, the more novel the result;
S7, screening out the model that performs best in offline verification, carrying out an online A/B test, and evaluating the online advertisement recommendation effect.
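The formula images for the offline metrics are likewise not reproduced here. A plausible reading, assuming the definitions commonly used for top-N recommendation and the symbols R(u), T(u), p(i) and s(i, j) defined in the text, is sketched below. U (the set of users) and the summation over unordered pairs i < j are assumptions.

```latex
% Precision and recall over all users u in U
\mathrm{Precision} = \frac{\sum_{u \in U} |R(u) \cap T(u)|}{\sum_{u \in U} |R(u)|},
\qquad
\mathrm{Recall} = \frac{\sum_{u \in U} |R(u) \cap T(u)|}{\sum_{u \in U} |T(u)|}

% Coverage expressed as information entropy over the n advertisements
H = -\sum_{i=1}^{n} p(i) \log p(i)

% Diversity of one recommendation list and of the whole system
\mathrm{Diversity}\big(R(u)\big)
  = 1 - \frac{\sum_{i, j \in R(u),\; i < j} s(i, j)}
             {\tfrac{1}{2}\, |R(u)| \big( |R(u)| - 1 \big)},
\qquad
\mathrm{Diversity} = \frac{1}{|U|} \sum_{u \in U} \mathrm{Diversity}\big(R(u)\big)
```

A higher entropy H means recommendations are spread more evenly across advertisements, which is the sense in which the text ties coverage to long-tail mining.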
Further, in the aforementioned outdoor advertisement recommendation method based on a large-scale user portrait, the user portrait information comprises user basic information, user social data, user consumption characteristics and user historical purchase data; the user behavior information comprises mobile-end user behavior information and device-end user behavior information; the mobile-end user behavior information comprises online viewing completion, skipping, sharing, commenting on advertisements, online advertisement viewing duration, online advertisement marking state and the like; the device-end user behavior information comprises viewing completion on the offline device, skipping of advertisements, the advertisement marking state on the offline device and the like.
Further, in the aforementioned method for recommending outdoor advertisements based on large-scale user portraits, the FM model uses the Adadelta algorithm to adjust the learning rate adaptively: it applies larger updates to infrequently updated parameters and smaller updates to frequently updated ones, accumulates past gradients over a fixed-size window without storing them directly, and does not require a default learning rate to be set.
The invention has the beneficial effects that: compared with the prior art, the invention has the following advantages:
1. the method maps the outdoor advertisement putting problem based on the large-scale user portrait into the behavior prediction problem of the large-scale online data through historical behavior information. By utilizing the integrated model with higher accuracy, the category prediction is carried out on the expected behavior of the user, and the reach capability and the intelligent level of the outdoor advertisement recommendation system are improved. And moreover, a special data storage structure and a corresponding prediction algorithm of the system integration model are designed, and the system can be applied to outdoor advertisement recommendation under the background of the Internet of things and big data.
2. Based on the established ranking model, the method adopts a sub-linear online prediction method. In contrast to the traditional linear prediction method, the trained Embedding layer of the ranking model is extracted, each user feature is vectorized in turn, and all feature vectors are then averaged with equal weights to form the user's final Embedding; the same operation is performed for advertisements. The vector retrieval tool Faiss is then used to build an index and construct the advertisement recommendation list, which greatly increases prediction speed: the prediction time is only 3% of that of the traditional method, so the challenge of large-scale data processing is better met.
3. The method separates the services with low real-time requirements in the real-time service system, solves the problem of conflict between deep mining of user data and real-time online service requirements by means of offline data analysis, relieves the pressure of the real-time service system, and ensures the accuracy of data analysis in the process of pushing services to users in real time.
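The embedding-based retrieval described in advantage 2 can be illustrated with a small sketch. It assumes a hypothetical lookup table from feature name to trained embedding vector, averages feature vectors with equal weights, and ranks advertisements by inner product. The search here is a plain linear scan used only as a stand-in: the text attributes the sub-linear speed-up to a Faiss index, which would replace the scan in `top_k_ads`. All function and feature names are illustrative.

```python
def mean_embedding(feature_ids, table):
    """Average the embedding vectors of the given features with equal weights."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for fid in feature_ids:
        for d, value in enumerate(table[fid]):
            total[d] += value
    return [value / len(feature_ids) for value in total]


def top_k_ads(user_vec, ad_vecs, k):
    """Rank ads by inner product with the user vector (brute-force scan).

    A vector index (Faiss in the text) would replace this loop at scale.
    """
    scored = [
        (sum(u * a for u, a in zip(user_vec, vec)), ad)
        for ad, vec in ad_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ad for _, ad in scored[:k]]


# Hypothetical embedding table: feature name -> 2-dimensional vector.
table = {
    "likes_sport": [0.0, 1.0],
    "running": [0.2, 0.8],
    "sneakers": [0.0, 1.0],
    "luxury": [1.0, 0.0],
    "watch": [0.8, 0.2],
}
user_vec = mean_embedding(["likes_sport", "running"], table)
ad_vecs = {
    "ad_shoes": mean_embedding(["sneakers", "running"], table),
    "ad_watch": mean_embedding(["luxury", "watch"], table),
}
print(top_k_ads(user_vec, ad_vecs, k=1))
```

Because user and advertisement vectors live in the same space, the advertisement vectors can be indexed once offline and only the user vector has to be computed at request time.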
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, it is obvious that the drawings in the following description are only some embodiments of the present invention, the embodiments in the drawings do not constitute any limitation to the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of an outdoor advertisement recommendation method according to the present invention.
Fig. 2 is a schematic diagram of a data structure of the FM model according to this embodiment.
Fig. 3 is a schematic diagram of matrix conversion according to the present embodiment.
Fig. 4 is a flowchart of the outdoor advertising method according to this embodiment.
In the figure: 1. advertisement scheduling module; 2. real-time accurate crowd analysis module; 3. advertisement retrieval module; 4. user behavior module; 5. user portrait module; 6. advertisement management module; 7. user data mining module.
Detailed Description
The technical solution of the present invention will be further described in detail with reference to the accompanying drawings and embodiments, which are preferred embodiments of the present invention. It is to be understood that the described embodiments are merely a subset of the embodiments of the invention, and not all embodiments; it should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to better understand the technical solutions, the technical solutions will be described in detail below with reference to the drawings and specific embodiments.
The embodiment of the invention provides an outdoor advertisement recommendation system based on a large-scale user portrait, which comprises an advertisement scheduling module 1, a real-time accurate crowd analysis module 2, an advertisement retrieval module 3, a user behavior module 4, a user portrait module 5, an advertisement management module 6 and a user data mining module 7, as shown in figure 1. The advertisement scheduling module 1 is connected with the real-time accurate crowd analysis module 2, the user side and the advertisement management module 6. The real-time accurate crowd analysis module 2 exchanges parameters with the advertisement retrieval module 3, the user behavior module 4 and the user portrait module 5. The user data mining module 7 is connected with the user behavior module 4, the user portrait module 5 and the advertisement management module 6. The advertisement retrieval module 3 is connected with the advertisement management module 6.
The advertisement scheduling module 1 performs real-time traffic analysis according to the real-time accurate crowd analysis module 2, and is used for completing environmental guidance of the whole advertisement scheduling execution. The real-time accurate crowd analysis module 2 is used for completing the advertisement ranking, numbering the ranked advertisements and, after scoring, returning the numbered advertisements to the advertisement scheduling module 1. The advertisement retrieval module 3 indexes advertisement data by acquiring the label and characteristic parameter information transmitted by the real-time accurate crowd analysis module 2 and the data of the advertisement management module 6, and returns the hit advertisement list parameters to the real-time accurate crowd analysis module 2. The user behavior module 4 acquires the category label provided by the user data mining module 7 and the user portrait parameter transmitted by the real-time accurate crowd analysis module 2, completes the query of the user behavior information, and returns the user information and the strategy to the real-time accurate crowd analysis module 2. The user portrait module 5 completes user portrait matching through the real-time information transmitted by the real-time accurate crowd analysis module 2 and returns the user information to the real-time accurate crowd analysis module 2. The advertisement management module 6 is used for storing the latest advertisement putting strategy set and providing the data set to the advertisement retrieval module 3, the user data mining module 7 and the advertisement scheduling module 1 for use.
The user data mining module 7 acquires data of the advertisement management module 6 in real time, matches the user portrait, and analyzes and predicts user behavior; specifically, it comprises a portrait updating module, a strategy updating module and a behavior flow detection module. The portrait updating module and the strategy updating module build an HBase dynamic data storage area online so that portraits and strategies can be updated in real time and used concurrently. The behavior flow detection module receives real-time data through a Flume distributed log collection system and performs advertisement recommendation by establishing a ranking model based on CTR, viewing duration, positive feedback and negative feedback. Each user thus obtains a candidate advertisement set, which is stored in Redis as a high-performance key-value database, where the key represents a user and the value represents that user's list of advertisements to be recommended. As shown in fig. 4, the present embodiment discloses an outdoor advertisement recommendation method based on a large-scale user portrait, comprising the following steps:
S1, acquiring user portrait information and user behavior information through the log data and the service data of the APP terminal.
S2, the user portrait information includes user basic information, user social data, user consumption characteristics and user historical purchase data; the user behavior information comprises mobile end user behavior information and equipment end user behavior information; the mobile terminal user behavior information comprises information such as online watching, skipping, sharing, commenting on advertisements, online advertisement watching duration, online advertisement state marking and the like; the device end user behavior information comprises information such as the offline device finishes watching, skips advertisements, and the offline device marks advertisement states.
S3, for data preprocessing, the input data format must be libSVM, namely expressed as follows:
y index_1:value_1 index_2:value_2 … index_n:value_n
Therefore, the data format of the feature vector is converted into libSVM format to meet the input requirement of the FM model.
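The libSVM conversion in this step can be sketched as follows. This is a minimal illustration, assuming each sample is held as a label plus a dictionary from 1-based feature index to value; the function names are hypothetical and not part of the patent.

```python
def to_libsvm_line(label, features):
    """Format one sample as 'y index_1:value_1 index_2:value_2 ...'.

    `features` maps a 1-based feature index to its value. Zero values are
    skipped because the libSVM format is sparse.
    """
    parts = [str(label)]
    for index in sorted(features):
        value = features[index]
        if value != 0:
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


def parse_libsvm_line(line):
    """Inverse of to_libsvm_line: return (label, {index: value})."""
    tokens = line.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features


line = to_libsvm_line(1, {3: 1, 1: 0.5, 7: 0})
print(line)  # 1 1:0.5 3:1
```

Indices are written in ascending order, and the entry with value 0 (index 7) is omitted from the output line.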
S4, considering that features such as the user portrait information, the user behavior information and the advertisement portrait information are not always continuous but are in many cases category values, One-hot encoding is needed to conform to the training data format of the model, as shown in FIG. 2. The One-hot code can be expressed as:
Figure BDA0002845879100000071
where u represents a user, a represents an advertisement, and h represents historical behavior information.
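Since the encoding figure is not reproduced here, the following sketch shows one plausible layout, assuming the feature vector is the concatenation of three blocks [u | a | h]: a One-hot block for the user, a One-hot block for the candidate advertisement, and a multi-hot block marking advertisements in the user's history. The block layout, the multi-hot history and all names are assumptions for illustration only.

```python
def build_vocab(values):
    """Assign each distinct category value a stable column offset."""
    return {value: i for i, value in enumerate(sorted(set(values)))}


def one_hot_sample(user, ad, history, user_vocab, ad_vocab):
    """Return {1-based index: 1.0} for the concatenated blocks [u | a | h].

    Block u marks the user, block a marks the candidate advertisement, and
    block h marks every advertisement in the user's behavior history.
    """
    n_users, n_ads = len(user_vocab), len(ad_vocab)
    features = {1 + user_vocab[user]: 1.0}
    features[1 + n_users + ad_vocab[ad]] = 1.0
    for past_ad in history:
        features[1 + n_users + n_ads + ad_vocab[past_ad]] = 1.0
    return features


user_vocab = build_vocab(["u1", "u2"])
ad_vocab = build_vocab(["a1", "a2", "a3"])
sample = one_hot_sample("u2", "a1", ["a3"], user_vocab, ad_vocab)
print(sample)  # {2: 1.0, 3: 1.0, 8: 1.0}
```

The resulting sparse dictionary uses 1-based indices, so it can be written directly as a libSVM line.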
S5, the FM model estimates the user's personalized advertisement recommendation list by modeling all pairwise combinations of features. The second-order polynomial FM model can be expressed as:
Figure BDA0002845879100000072
S6, decomposing the matrix W in S5 as W = V^T V, where V = (V_1, V_2, …, V_n)^T and V_i = (v_i1, v_i2, …, v_ik). Each row of the matrix V^T represents the relevance of a certain user to different features, and each row of the matrix V represents the relevance of a certain feature to different advertisements, as shown in fig. 3; the model can be represented as:
Figure BDA0002845879100000073
wherein
Figure BDA0002845879100000074
Therefore, there are:
Figure BDA0002845879100000081
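The factorized model can be illustrated with a short sketch. It assumes the standard factorization machine form, in which the weight of a feature pair (i, j) is the inner product of the latent vectors V_i and V_j, and compares a direct loop over all pairs with the equivalent linear-time form. The names `w0` (global bias) and `w` (first-order weights) are assumptions; the text only names W and V.

```python
from itertools import combinations


def fm_predict_naive(x, w0, w, V):
    """Second-order FM score with an explicit loop over all feature pairs."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i, j in combinations(range(n), 2):
        dot = sum(vi * vj for vi, vj in zip(V[i], V[j]))
        score += dot * x[i] * x[j]
    return score


def fm_predict_fast(x, w0, w, V):
    """Same score in O(k*n) using the sum-of-squares identity."""
    n, k = len(x), len(V[0])
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        score += 0.5 * (s * s - s_sq)
    return score


# Toy example: n = 4 features, k = 2 latent factors.
x = [1, 0, 1, 1]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.0]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.1]]
print(fm_predict_naive(x, w0, w, V), fm_predict_fast(x, w0, w, V))
```

Both functions return the same score up to floating-point rounding; the fast form is the one that makes FM practical on long One-hot encoded inputs.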
S7, the FM model is usually trained with the SGD algorithm. Compared with SGD, however, the Adadelta algorithm adaptively adjusts the learning rate: it applies larger updates to infrequently updated parameters and smaller updates to frequently updated ones, it accumulates past gradients over a fixed-size window without storing them directly, and it does not require a default learning rate to be set.
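The Adadelta update mentioned in this step can be sketched as follows. This is a minimal illustration of the published Adadelta rule (Zeiler, 2012) on a flat list of parameters, not the patent's training code. The decay rate `rho` and the constant `eps` are conventional values chosen here as assumptions, and the quadratic toy loss is purely illustrative.

```python
import math


class Adadelta:
    """Per-parameter Adadelta state for a flat list of parameters."""

    def __init__(self, n_params, rho=0.95, eps=1e-6):
        self.rho, self.eps = rho, eps
        # Decaying averages stand in for a fixed-size window of past values,
        # so no history of gradients has to be stored.
        self.avg_sq_grad = [0.0] * n_params  # running average of g^2
        self.avg_sq_step = [0.0] * n_params  # running average of step^2

    def update(self, params, grads):
        """Apply one in-place update; no global learning rate is needed."""
        rho, eps = self.rho, self.eps
        for i, g in enumerate(grads):
            self.avg_sq_grad[i] = rho * self.avg_sq_grad[i] + (1 - rho) * g * g
            step = -(
                math.sqrt(self.avg_sq_step[i] + eps)
                / math.sqrt(self.avg_sq_grad[i] + eps)
            ) * g
            self.avg_sq_step[i] = rho * self.avg_sq_step[i] + (1 - rho) * step * step
            params[i] += step
        return params


# Toy problem: minimise (p - 3)^2 starting from p = 0.
params = [0.0]
optimizer = Adadelta(n_params=1)
for _ in range(200):
    grad = [2.0 * (params[0] - 3.0)]
    optimizer.update(params, grad)
print(params[0])
```

Each parameter's step is scaled by the ratio of its own recent step size to its own recent gradient magnitude, which is what lets rarely updated parameters take relatively larger steps.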
S8, in order to carry out off-line verification on the recommended model, the following definitions are carried out:
Precision: the proportion of correctly classified positive samples among the samples judged positive by the classifier, expressed as follows (here R(u) corresponds to the positive samples of the model):
Figure BDA0002845879100000082
wherein, r (u) is a recommendation list obtained by the recommendation model, and t (u) is a behavior list of the user in the actual scene.
Recall ratio of correctly classified positive samples to true positive samples, expressed as (where t (u) corresponds to true positive sample set):
Figure BDA0002845879100000083
Coverage: the proportion of all items that the recommendation system actually recommends. For the same coverage, the distribution of recommendation counts (popularity) across advertisements may differ. In order to better describe the ability of the recommendation system to mine the long tail, the number of times each advertisement is recommended is counted, and coverage is defined by information entropy:
Figure BDA0002845879100000084
where p (i) is the popularity of ad i divided by the sum of the popularity of all ads.
Diversity, namely measuring dissimilarity among all advertisements in the recommendation list, and measuring the similarity of the advertisements in the recommendation list through different similarity functions, such as similarity based on content and similarity based on collaborative filtering, so that diversity from different angles can be obtained. The concrete expression is as follows:
Figure BDA0002845879100000085
wherein s (i, j) is the similarity of advertisement i and advertisement j;
the overall diversity of the recommendation system can be defined as the average of the diversity of all user recommendation lists:
Figure BDA0002845879100000091
Novelty: the simplest approach is to recommend advertisements that users have not seen before, but the number of advertisements each user has not seen is enormous, so the average popularity of the recommended advertisements is generally calculated instead, since less popular advertisements are more likely to be novel to users. Therefore, the lower the average popularity of the items in the recommendation result, the more novel the result.
S9, screening out the model that performs best in offline verification, carrying out an online A/B test, and evaluating the online advertisement recommendation effect.
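The offline metrics defined in the verification step (the accuracy rate, which corresponds to precision in the usual terminology, plus recall, entropy-based coverage, diversity and novelty) can be computed with a short sketch. It assumes the commonly used top-N definitions, with recommendation lists and observed behavior lists held as dictionaries keyed by user; the function names, the toy data and the constant similarity function are hypothetical.

```python
import math
from itertools import combinations


def precision_recall(recs, truth):
    """recs/truth map user -> list of ads; returns (precision, recall)."""
    hits = sum(len(set(recs[u]) & set(truth.get(u, ()))) for u in recs)
    n_rec = sum(len(set(recs[u])) for u in recs)
    n_true = sum(len(set(truth.get(u, ()))) for u in recs)
    return hits / n_rec, hits / n_true


def coverage_entropy(recs):
    """Entropy H = -sum p(i) log p(i) of how often each ad is recommended."""
    counts = {}
    for ads in recs.values():
        for ad in ads:
            counts[ad] = counts.get(ad, 0) + 1
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


def list_diversity(ads, sim):
    """1 minus the mean pairwise similarity s(i, j) within one list."""
    pairs = list(combinations(ads, 2))
    if not pairs:  # a list with fewer than two ads has no pairs to compare
        return 0.0
    return 1.0 - sum(sim(i, j) for i, j in pairs) / len(pairs)


def overall_diversity(recs, sim):
    """Average of the per-user list diversities."""
    return sum(list_diversity(ads, sim) for ads in recs.values()) / len(recs)


def mean_popularity(recs, popularity):
    """Average popularity of recommended ads; lower suggests more novelty."""
    shown = [popularity[ad] for ads in recs.values() for ad in ads]
    return sum(shown) / len(shown)


recs = {"u1": ["a", "b"], "u2": ["a", "c"]}
truth = {"u1": ["a"], "u2": ["c", "d", "e"]}
print(precision_recall(recs, truth))  # (0.5, 0.5)
print(coverage_entropy(recs))
print(overall_diversity(recs, lambda i, j: 0.5))  # 0.5
print(mean_popularity(recs, {"a": 10, "b": 2, "c": 4}))  # 6.5
```

In this toy data two of the four recommendations are hits and four true behaviors exist, so precision and recall are both 0.5.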
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. An outdoor advertisement recommendation system based on a large-scale user portrait, comprising an advertisement scheduling module (1), a real-time accurate crowd analysis module (2), an advertisement retrieval module (3), a user behavior module (4), a user portrait module (5), an advertisement management module (6) and a user data mining module (7), characterized in that: the advertisement scheduling module (1) is connected with the user side, the real-time accurate crowd analysis module (2) and the advertisement management module (6); the real-time accurate crowd analysis module (2) exchanges parameters with the advertisement retrieval module (3), the user behavior module (4) and the user portrait module (5); the user data mining module (7) is connected with the user behavior module (4), the user portrait module (5) and the advertisement management module (6); the advertisement retrieval module (3) is connected with the advertisement management module (6);
the advertisement scheduling module (1) performs real-time flow analysis based on the real-time accurate crowd analysis module (2) and provides the environmental guidance for the entire advertisement scheduling execution;
the real-time accurate crowd analysis module (2) completes the advertisement ranking, numbers the ranked advertisements, and returns the numbered advertisements to the advertisement scheduling module (1);
the advertisement retrieval module (3) indexes advertisement data using the label and feature parameter information transmitted by the real-time accurate crowd analysis module (2) and the data of the advertisement management module (6), and returns the parameters of the hit advertisement list to the real-time accurate crowd analysis module (2);
the user behavior module (4) acquires the category labels provided by the user data mining module (7) and the user portrait parameters transmitted by the real-time accurate crowd analysis module (2), completes the query of user behavior information, and returns the user information and the strategy to the real-time accurate crowd analysis module (2);
the user portrait module (5) completes user portrait matching using the real-time information transmitted by the real-time accurate crowd analysis module (2), and returns the user information to the real-time accurate crowd analysis module (2);
the advertisement management module (6) stores the latest set of advertisement delivery strategies and provides this data set to the advertisement retrieval module (3), and for use by the user data mining module (7) and the advertisement scheduling module (1);
the user data mining module (7) acquires data from the advertisement management module (6) in real time, matches users to user portraits, and analyzes and predicts user behavior.
2. The outdoor advertisement recommendation system based on a large-scale user portrait according to claim 1, wherein: the user data mining module (7) comprises a portrait updating module, a strategy updating module and a behavior flow detection module.
3. The outdoor advertisement recommendation system based on a large-scale user portrait according to claim 2, wherein: the portrait updating module and the strategy updating module achieve real-time updating and concurrent use of the portraits and strategies by building an HBase dynamic data storage area online.
4. The outdoor advertisement recommendation system based on a large-scale user portrait according to claim 2, wherein: the behavior flow detection module receives real-time data through a Flume distributed log collection system, and performs advertisement recommendation by building a ranking model based on click-through rate (CTR), viewing duration, positive feedback and negative feedback.
5. An outdoor advertisement recommendation method based on a large-scale user portrait is characterized by comprising the following steps:
S1, acquiring user portrait information and user behavior information from the log data and service data of the APP terminal;
S2, preprocessing the acquired data, wherein the input data format is libSVM, expressed as follows:
y index_1:value_1 index_2:value_2 … index_n:value_n,
so the feature vectors are converted into the libSVM format to meet the input requirement of the FM model;
S3, since the features of the user portrait information, the user behavior information and the advertisement portrait information are all categorical rather than continuous values, the collected data is One-hot encoded to conform to the training data format of the model, and the One-hot encoding can be expressed as:
Figure FDA0002845879090000021
wherein u represents a user, a represents an advertisement, and h represents historical behavior information;
S4, using an FM (factorization machine) model to model all pairwise interactions between features so as to estimate a personalized advertisement recommendation list for the user, wherein the second-order polynomial FM model can be expressed as:
Figure FDA0002845879090000022
S5, decomposing the matrix W in S4 as $W = V^{T}V$, where $V = (V_1, V_2, \dots, V_n)^{T}$ and $V_i = (v_{i1}, v_{i2}, \dots, v_{ik})$; each row of the matrix $V^{T}$ represents the relevance of a certain user to different features, and each row of the matrix $V$ represents the relevance of a certain feature to different advertisements, so that the model can be expressed as:
Figure FDA0002845879090000023
wherein
Figure FDA0002845879090000024
Therefore, we have:
Figure FDA0002845879090000031
S6, in order to verify the recommendation model offline, the following metrics are defined:
accuracy rate (precision): the proportion of the samples judged positive by the classifier that are correctly classified positive samples, where R(u) corresponds to the samples the model judges positive, expressed as:
Figure FDA0002845879090000032
where R(u) is the recommendation list produced by the recommendation model and T(u) is the user's behavior list in the actual scenario;
recall: the proportion of the true positive samples that are correctly classified as positive, where T(u) corresponds to the set of true positive samples, expressed as:
Figure FDA0002845879090000033
coverage: the proportion of the total advertisement set that appears in the recommendations of the recommendation system; for the same coverage, the quantity distribution or popularity of different advertisements may differ. To better describe the ability of the recommendation system to mine the long tail, the number of occurrences of each advertisement is counted and the coverage is defined by information entropy:
Figure FDA0002845879090000034
where p (i) is the popularity of ad i divided by the sum of the popularity of all ads;
diversity: measures the dissimilarity among the advertisements in a recommendation list; the similarity of the advertisements in the list can be measured by different similarity functions, such as content-based similarity and collaborative-filtering-based similarity, which yields diversity from different angles; it is expressed as:
Figure FDA0002845879090000035
wherein s (i, j) is the similarity of advertisement i and advertisement j;
the overall diversity of the recommendation system can be defined as the average of the diversity of all user recommendation lists:
Figure FDA0002845879090000036
novelty: the simplest method is to recommend advertisements that the user has not seen before, but because the number of advertisements each user has not seen is enormous, the average popularity of the recommended advertisements is calculated instead, since less popular advertisements are more likely to be new to the user; therefore, the lower the average popularity of the advertisements in the recommendation result, the more novel the result;
S7, selecting the model that performs best in the offline verification, running an online A/B test on it, and evaluating the online advertisement recommendation effect.
6. The outdoor advertisement recommendation method based on a large-scale user portrait according to claim 5, wherein: the user portrait information comprises basic user information, user social data, user consumption characteristics and user historical purchase data; the user behavior information comprises mobile-end user behavior information and device-end user behavior information; the mobile-end user behavior information comprises completing online viewing, skipping, sharing, commenting on advertisements, online advertisement viewing duration and online advertisement marking state; the device-end user behavior information comprises completing viewing on the offline device, skipping advertisements, and the advertisement marking state on the offline device.
7. The outdoor advertisement recommendation method based on a large-scale user portrait according to claim 5, wherein: the FM model adopts the Adadelta algorithm to adjust the learning rate adaptively, applying relatively large updates to infrequently used parameters and relatively small updates to frequently used parameters; it accumulates past gradient terms over a fixed-size window without storing those terms directly, and requires no default learning rate to be set.
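Steps S2, S4 and S5 of claim 5 can be illustrated with a short sketch. Because the formula images are not reproduced in this text, the code assumes the standard second-order factorization machine, in which the pairwise weight of features i and j is the inner product of their k-dimensional factor vectors, and it checks that the explicit sum over pairs agrees with the well-known O(kn) reformulation. The function names, the 1-based libSVM indices and all numeric values are illustrative assumptions, not taken from the patent.

```python
def to_libsvm(label, x):
    """Format a dense feature vector as 'y index:value ...' (1-based, non-zeros only)."""
    pairs = " ".join(f"{i + 1}:{v:g}" for i, v in enumerate(x) if v != 0)
    return f"{label} {pairs}"


def fm_predict_naive(x, w0, w, V):
    """Second-order FM score with an explicit loop over all feature pairs."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))  # <V_i, V_j>
            score += dot * x[i] * x[j]
    return score


def fm_predict_fast(x, w0, w, V):
    """Same score using 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(V[0])
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        score += 0.5 * (s * s - s_sq)
    return score


# Hypothetical one-hot vector: [user block | advertisement block | history block].
x = [1, 0, 0, 1, 0, 1]
w0 = 0.1
w = [0.2, -0.1, 0.0, 0.3, 0.5, -0.4]
V = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5],
     [-0.2, 0.4], [0.6, 0.1], [0.3, 0.3]]  # n = 6 features, k = 2 factors

print(to_libsvm(1, x))
print(fm_predict_naive(x, w0, w, V), fm_predict_fast(x, w0, w, V))
```

The fast form matters for one-hot inputs of the kind described in S3: the vectors are very sparse, so only the non-zero features contribute to each inner sum.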
CN202011513373.0A 2020-12-18 2020-12-18 Outdoor advertisement recommendation system and method based on large-scale user portrait Pending CN112598438A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011513373.0A CN112598438A (en) 2020-12-18 2020-12-18 Outdoor advertisement recommendation system and method based on large-scale user portrait

Publications (1)

Publication Number Publication Date
CN112598438A true CN112598438A (en) 2021-04-02

Family

ID=75200134

Country Status (1)

Country Link
CN (1) CN112598438A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837811A (en) * 2021-09-30 2021-12-24 成都新潮传媒集团有限公司 Elevator advertisement point recommendation method and device, computer equipment and storage medium
CN113837811B (en) * 2021-09-30 2023-10-10 成都屏盟科技有限公司 Elevator advertisement point position recommending method and device, computer equipment and storage medium
CN114565407A (en) * 2022-03-01 2022-05-31 北京派瑞威行互联技术有限公司 Advertisement delivery data analysis method and system
CN114565407B (en) * 2022-03-01 2022-10-11 北京派瑞威行互联技术有限公司 Advertisement delivery data analysis method and system
CN114912948A (en) * 2022-04-24 2022-08-16 深圳船奇科技有限公司 Cloud service-based cross-border e-commerce big data intelligent processing method, device and equipment
CN114912948B (en) * 2022-04-24 2023-03-24 深圳船奇科技有限公司 Cloud service-based cross-border e-commerce big data intelligent processing method, device and equipment
CN114780855A (en) * 2022-05-05 2022-07-22 穗保(广州)科技有限公司 Information sharing system based on Internet security
CN114780855B (en) * 2022-05-05 2022-11-25 穗保(广州)科技有限公司 Information sharing system based on Internet security
CN116485475A (en) * 2023-05-06 2023-07-25 湖北巨字传媒有限公司 Internet of things advertisement system, method and device based on edge calculation
CN116797282A (en) * 2023-08-28 2023-09-22 成都一心航科技有限公司 Real-time monitoring system and monitoring method for advertisement delivery
CN116797282B (en) * 2023-08-28 2023-10-27 成都一心航科技有限公司 Real-time monitoring system and monitoring method for advertisement delivery

Similar Documents

Publication Publication Date Title
CN112598438A (en) Outdoor advertisement recommendation system and method based on large-scale user portrait
CN110245981B (en) Crowd type identification method based on mobile phone signaling data
CN103714139B (en) Parallel data mining method for identifying a mass of mobile client bases
CN110222267A (en) A kind of gaming platform information-pushing method, system, storage medium and equipment
CN101634996A (en) Individualized video sequencing method based on comprehensive consideration
CN111523055B (en) Collaborative recommendation method and system based on agricultural product characteristic attribute comment tendency
CN110415071B (en) Automobile competitive product comparison method based on viewpoint mining analysis
CN106610970A (en) Collaborative filtering-based content recommendation system and method
CN107391670A (en) A kind of mixing recommendation method for merging collaborative filtering and user property filtering
CN113505204A (en) Recall model training method, search recall device and computer equipment
CN106528812A (en) USDR model based cloud recommendation method
CN111178721A (en) Intelligent tourism system
CN104537028A (en) Webpage information processing method and device
CN111143689A (en) Method for constructing recommendation engine according to user requirements and user portrait
CN101986301B (en) Inverse neighbor analysis-based collaborative filtering recommendation system and method
CN113407729B (en) Judicial-oriented personalized case recommendation method and system
CN104572915A (en) User event relevance calculation method based on content environment enhancement
Sun et al. Tourism demand forecasting of multi-attractions with spatiotemporal grid: a convolutional block attention module model
CN110489665B (en) Microblog personalized recommendation method based on scene modeling and convolutional neural network
CN115408618B (en) Point-of-interest recommendation method based on social relation fusion position dynamic popularity and geographic features
CN113239159A (en) Cross-modal retrieval method of videos and texts based on relational inference network
CN111506813A (en) Remote sensing information accurate recommendation method based on user portrait
CN113688281B (en) Video recommendation method and system based on deep learning behavior sequence
CN111143688B (en) Evaluation method and system based on mobile news client
CN112118486B (en) Content item delivery method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination