CN113032551A - Delivery progress calculation method and system based on combination of big data and article title - Google Patents

Info

Publication number
CN113032551A
Authority
CN
China
Prior art keywords
article
advertisement
component
progress
delivery
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110562046.2A
Other languages
Chinese (zh)
Other versions
CN113032551B (en)
Inventor
段小霞
赵郑
刘德恒
于言言
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zeqiao Medical Technology Co ltd
Original Assignee
Beijing Zeqiao Media Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zeqiao Media Technology Co ltd
Priority to CN202110562046.2A
Publication of CN113032551A
Application granted
Publication of CN113032551B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0254 Targeted advertisements based on statistics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Databases & Information Systems (AREA)
  • Strategic Management (AREA)
  • Evolutionary Biology (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and a system for calculating a delivery progress based on the combination of big data and article titles. The system comprises an article acquisition and processing component, a feature association component, a strong association component, a central aggregation component, an article library component, an advertisement input component, an advertisement processing component, a delivery rule setting component and a delivery progress calculation component. The advertisement delivery strategy can be adjusted in time according to the delivery progress of each period, which improves the accuracy of the delivery strategy and the return obtained by the advertiser. Network articles are acquired based on big data, association rules between article title features are mined, and clustering centers are computed from the approximate intersection number obtained from the strong association rules and from the full set of association rules, so that the clustering is more accurate, which facilitates accurate delivery of the articles.

Description

Delivery progress calculation method and system based on combination of big data and article title
Technical Field
The invention relates to the field of computers, in particular to a method and a system for calculating a delivery progress based on combination of big data and an article title.
Background
With the development of the network era, embedding advertisements in articles has become an effective means of advertisement promotion. The existing approach to embedding advertisements in web page articles is generally to connect the advertisement resources of the web page first and to embed the advertisement content into the articles when the web page is developed. Embedding advertisements in web page articles not only subdivides the web page articles by reading mode, but also lets the content be shared through functions such as reposting and forwarding, so that the advertisement spreads more widely.
There are multiple methods for calculating advertising progress. The advertising progress inquiry device proposed by Pandang et al. monitors the playback of an advertising device through its main body over a wireless connection and allows the advertising progress to be queried in real time on a display screen, making it easy to learn the advertising progress and the feedback effect. It addresses the problem that, although prior-art billboards do display advertisements, the advertising progress cannot be queried in real time, so the advertiser cannot learn the advertising progress and the feedback effect. However, the device can only report the advertisement playing progress and cannot report the delivery effect, so the delivery strategy cannot be adjusted in time after the progress is known in order to improve the advertising return.
In view of the above, it is desirable to provide a method and a system for calculating a delivery progress based on a combination of big data and an article title, which can solve the above problems.
Disclosure of Invention
The technical problem to be solved by this application is as follows: the prior art cannot obtain the advertisement delivery effect and therefore cannot adjust the delivery strategy in time after learning the delivery progress so as to improve the advertising return. To address this, a method and a system for calculating a delivery progress based on the combination of big data and article titles are provided.
The technical scheme of the invention is as follows:
the delivery system based on the combination of big data and article titles comprises the following parts:
an article acquisition and processing component, a feature association component, a strong association component, a central aggregation component, an article library component, an advertisement input component, an advertisement processing component, a delivery rule setting component and a delivery progress calculation component, wherein the article acquisition and processing component is connected with the feature association component, the feature association component is connected with the central aggregation component and the strong association component respectively, the strong association component is connected with the central aggregation component, the central aggregation component is connected with the article library component, the article library component is connected with the advertisement processing component, the advertisement input component is also connected with the advertisement processing component, the advertisement processing component is connected with the delivery rule setting component, and the delivery rule setting component is connected with the delivery progress calculation component;
the strong association component is used for screening out strong association rules in the association rules, setting judgment conditions and defining an approximate intersection number according to the judgment conditions;
the central aggregation component is used for merging all association rules of each feature, further calculating the central aggregation degree of each feature according to the association rule complete set and the approximate intersection number of each feature, and selecting a clustering center and the category of each feature;
the delivery rule setting component is used for setting advertisement delivery rules;
the delivery progress calculation component is used for receiving data of the article promotion background, setting settlement periods, calculating the click rate of each settlement period, obtaining the dynamic delivery progress of the advertisement according to the advertisement click rates of different settlement periods, and judging whether the dynamic delivery progress of the current settlement period meets the progress point expected value or not;
the article library component records the types of the articles which are delivered, sends the article title characteristics of the rest types to the advertisement processing component, recalculates the types and the similar characteristics of the article pool to be delivered corresponding to the advertisement, and delivers the articles again according to the advertisement delivery rule.
Preferably, the method for calculating the delivery progress based on the combination of big data and article titles comprises the following steps:
A. capturing historical articles in a network based on big data, extracting the title features of the articles, and clustering the title features based on association rules to form an article library component;
B. determining an advertisement topic, calculating the similarity between the advertisement topic features and the article title features in the article library component to obtain the total relevance of the advertisement, selecting articles to be promoted for advertisement delivery, and counting the reading amount of the promoted articles and the click amount of the corresponding advertisements to obtain the dynamic advertisement delivery progress of each settlement period.
Preferably, the step a includes:
selecting the strong association rules among all feature association rules and defining an approximate intersection number, wherein the approximate intersection number is the sum of the confidences of all strong association rules that satisfy the judgment condition:
Figure DEST_PATH_IMAGE001
wherein
Figure 401943DEST_PATH_IMAGE002
d is the number of randomly selected strong association rules,
Figure 522346DEST_PATH_IMAGE003
and D is the number of all association rules.
Preferably, the confidence degree judgment condition is as follows:
(1) randomly selecting one strong association rule from all the strong association rules, traversing d association rules backwards starting from it, and selecting the maximum confidence among these d association rules;
(2) randomly selecting d strong association rules and obtaining d maximum confidences by the method in (1), wherein any one of the maximum confidences is
Figure 302083DEST_PATH_IMAGE002
Preferably, the step a further comprises:
further calculating the central aggregation degree of each feature according to the full set of association rules and the approximate intersection number of each feature:
Figure 867932DEST_PATH_IMAGE004
wherein
Figure DEST_PATH_IMAGE005
and
Figure 885566DEST_PATH_IMAGE006
are balancing parameters that set the weight of the discrete features according to the data characteristics and actual requirements,
Figure 380133DEST_PATH_IMAGE007
is any one of the features associated with feature
Figure 647166DEST_PATH_IMAGE008
,
Figure 282284DEST_PATH_IMAGE009
,
Figure 951163DEST_PATH_IMAGE010
is the number of features strongly associated with feature
Figure 147789DEST_PATH_IMAGE008
,
Figure 574222DEST_PATH_IMAGE011
is the number of features not strongly associated with feature
Figure 311234DEST_PATH_IMAGE008
, and
Figure 037882DEST_PATH_IMAGE012
Figure 467726DEST_PATH_IMAGE013
; the central aggregation degree indicates how central a feature is;
Figure 647035DEST_PATH_IMAGE014
Figure 92797DEST_PATH_IMAGE015
is the number of title features of article a;
Figure 736268DEST_PATH_IMAGE016
represents the distance between feature
Figure 274697DEST_PATH_IMAGE017
and any one of the features associated with it,
Figure 675722DEST_PATH_IMAGE018
;
Figure 754537DEST_PATH_IMAGE019
represents the central aggregation degree;
Figure 455777DEST_PATH_IMAGE020
represents the approximate intersection number.
Preferably, the step B includes:
the correlation coefficient between the current advertisement topic features and the different article pool categories is:
Figure 227424DEST_PATH_IMAGE021
the distance between the current advertisement topic and the different article pool categories is:
Figure 614280DEST_PATH_IMAGE022
wherein
Figure 496786DEST_PATH_IMAGE023
denotes the distance between the advertisement topic and article pool category y,
Figure 521373DEST_PATH_IMAGE024
is the distance, obtained by the calculation method in step A, between any advertisement topic feature x and any article title feature
Figure 667184DEST_PATH_IMAGE025
, X is the number of advertisement topic features,
Figure 105119DEST_PATH_IMAGE025
represents any article title feature in article pool category y, and
Figure 463419DEST_PATH_IMAGE026
represents the number of article title features in article pool category y;
the correlation coefficient and the distance between each advertisement topic feature and each article pool category are calculated by traversal to obtain the total relevance of the current advertisement:
Figure 935988DEST_PATH_IMAGE027
wherein Y is the number of article pool categories; the m article pool categories with the largest total relevance
Figure 485656DEST_PATH_IMAGE028
are selected, and the features whose distance is larger than the threshold are removed from these m article pools; the remaining features are the similar features of the advertisement;
Figure 410887DEST_PATH_IMAGE029
denotes the probability that category y contains feature x,
Figure 572878DEST_PATH_IMAGE030
denotes the probability that categories other than y contain feature x,
Figure 306479DEST_PATH_IMAGE031
denotes the probability that category y does not contain feature x, and
Figure 590829DEST_PATH_IMAGE032
denotes the probability that categories other than y do not contain feature x.
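The selection of the m article pool categories and the filtering of their features by distance can be sketched as follows. This is a minimal illustration under stated assumptions: the total relevance per category and the per-feature distances are taken as precomputed inputs, because the patent's formulas for them appear only as figures. The function name `select_pools` and the sample values are hypothetical.

```python
def select_pools(total_relevance, feature_distance, m, distance_threshold):
    """Pick the m categories with the largest total relevance and keep
    only the features whose distance does not exceed the threshold.

    total_relevance:  dict category -> total relevance (assumed precomputed)
    feature_distance: dict category -> dict feature -> distance to the ad topic
    """
    top = sorted(total_relevance, key=total_relevance.get, reverse=True)[:m]
    return {
        y: {f for f, dist in feature_distance[y].items() if dist <= distance_threshold}
        for y in top
    }


# Hypothetical sample data for illustration only.
relevance = {"health": 0.8, "sports": 0.5, "finance": 0.1}
distances = {
    "health": {"diet": 0.2, "sleep": 0.7},
    "sports": {"exercise": 0.3},
    "finance": {"stocks": 0.9},
}
similar = select_pools(relevance, distances, m=2, distance_threshold=0.5)
```

The features that survive the filter play the role of the "similar features" of the advertisement in the text above.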
Preferably, the step B includes:
setting an advertisement delivery rule: the user determines the number n of articles to be delivered; one category is randomly selected from the m article pools as the delivery category, and the preset number of articles to be promoted that have the most similar features within that category are selected as the delivered articles of the current advertisement; after the articles are promoted, the reading amount Ar of the articles and the click amount Ah of the corresponding advertisement within one settlement period are counted;
the delivery progress calculation component acquires the article reading amount Ar and the click amount Ah of the corresponding advertisement from the article promotion backend, and calculates the advertisement click rate of the first settlement period:
Figure DEST_PATH_IMAGE033
the settlement period is decided by the user;
obtaining the dynamic advertisement delivery progress according to the advertisement click rates of the different settlement periods:
Figure 206619DEST_PATH_IMAGE034
wherein
Figure DEST_PATH_IMAGE035
is the delivered progress of the advertisement in the t-th settlement period,
Figure 670836DEST_PATH_IMAGE036
is the advertisement click rate of the (t-1)-th settlement period,
Figure DEST_PATH_IMAGE037
is the number of articles in which the advertisement was delivered in the (t-1)-th settlement period, and
Figure 790101DEST_PATH_IMAGE038
is the average number of articles delivered over the first t-1 settlement periods.
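The click rate that feeds the progress calculation can be sketched as follows. This is only an illustration: the patent's click-rate formula appears as a figure, so the ratio of click amount Ah to reading amount Ar is an assumption, as are the function name `click_rate` and the sample counts. The dynamic progress formula itself is not reproduced here.

```python
def click_rate(reads, clicks):
    """Assumed click rate for one settlement period: Ah / Ar.

    Returns 0.0 when the articles had no reads, to avoid division by zero.
    """
    return clicks / reads if reads else 0.0


# Hypothetical (Ar, Ah) pairs, one per settlement period.
periods = [(1000, 50), (800, 20), (0, 0)]
rates = [click_rate(ar, ah) for ar, ah in periods]
```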
Preferably, the step B further comprises:
the user sets an expected progress point value for each settlement period
Figure DEST_PATH_IMAGE039
Figure 917457DEST_PATH_IMAGE035
if the dynamic advertisement delivery progress of the current settlement period reaches the user's expected progress point value, that is, the delivered progress of the advertisement in the t-th settlement period satisfies
Figure 754963DEST_PATH_IMAGE040
, then in the next period articles to be promoted continue to be selected from the article pool category used in the previous period for advertisement delivery; otherwise, the delivery progress calculation component sends the article pool category used in the previous period to the article library component, the article library component records the delivered category and sends the article title features of the remaining categories to the advertisement processing component, the article pool category to be delivered and the similar features corresponding to the advertisement are recalculated, and delivery is performed again according to the advertisement delivery rule; this is iterated until the sum of the dynamic delivery progress reaches the preset total delivery progress.
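The period-by-period adjustment described above can be sketched as a simple control loop. This is a simplified illustration, not the patent's implementation: the recalculation performed by the advertisement processing component is reduced to "take the next remaining category", and `progress_of`, `expected`, `run_campaign` and `max_periods` are hypothetical names introduced here.

```python
def run_campaign(categories, progress_of, expected, total_target, max_periods=100):
    """Iterate settlement periods until the summed progress reaches the target.

    categories:  candidate article pool categories, in preference order
    progress_of: callable (category, t) -> dynamic progress measured in period t
    expected:    callable t -> user's expected progress point value for period t
    """
    remaining = list(categories)
    current = remaining[0]
    delivered = []      # categories recorded as already delivered
    cumulative = 0.0
    history = []
    for t in range(1, max_periods + 1):
        progress = progress_of(current, t)
        cumulative += progress
        history.append((t, current, progress))
        if cumulative >= total_target:
            break       # preset total delivery progress reached
        if progress < expected(t):
            # Expectation missed: record the category and switch to another one.
            delivered.append(current)
            remaining = [c for c in remaining if c not in delivered]
            if not remaining:
                break   # no untried category is left
            current = remaining[0]
    return history, cumulative


# Hypothetical measurements: category "A" underperforms, "B" meets expectations.
history, cumulative = run_campaign(
    ["A", "B"],
    progress_of=lambda category, t: {"A": 0.1, "B": 0.4}[category],
    expected=lambda t: 0.3,
    total_target=1.0,
)
```

In this run the loop abandons category "A" after the first period and stays with "B" until the cumulative progress passes the target.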
The invention has the following beneficial effects:
(1) the delivery progress calculation method allows the advertisement delivery strategy to be adjusted in time according to the delivery progress of each period, which improves the accuracy of the delivery strategy and in turn the advertiser's return;
(2) network articles are acquired based on big data, association rules between article title features are mined, and clustering centers are computed from the approximate intersection number obtained from the strong association rules and from the full set of association rules, so that the clustering is more accurate and articles can be delivered accurately;
(3) advertisements are delivered according to the relevance between the advertisement topic and the article pools, several settlement periods are set during delivery, and the delivered articles are adjusted in real time according to the dynamic delivery progress of each settlement period, so that the advertiser can track the advertising progress and effect independently.
Drawings
FIG. 1 is a diagram of a delivery system architecture based on the combination of big data and article titles according to the present invention;
FIG. 2 is a flowchart of the delivery progress calculation method based on the combination of big data and article titles according to the present invention.
Detailed Description
Embodiments of the invention are described in detail below with reference to the drawings, so that the way the technical means are applied to solve the technical problem and achieve the technical effect can be fully understood and implemented. Provided there is no conflict, the features in the embodiments of the present invention may be combined with one another, and the resulting technical solutions fall within the scope of the present invention.
Referring to fig. 1, the delivery system based on big data and article title combination according to the present invention includes the following components:
the article acquisition and processing component 10, the feature association component 20, the strong association component 30, the central aggregation component 40, the article library component 50, the advertisement input component 60, the advertisement processing component 70, the delivery rule setting component 80, and the delivery progress calculation component 90.
The article acquisition and processing component 10 is configured to capture historical articles from the most recent fixed time period in the network, perform preprocessing such as denoising on the articles, and extract the title features of the articles using existing techniques. The article acquisition and processing component 10 sends the article title features to the feature association component 20 via data transmission;
the feature association component 20 is configured to calculate the distance between article title features to obtain the association rules between features. The feature association component 20 sends the association rules to the strong association component 30 and the central aggregation component 40 via data transmission;
the strong association component 30 is configured to screen out the strong association rules among the association rules, set a judgment condition, and define an approximate intersection number according to the judgment condition. The strong association component 30 sends the screened strong association rules and the approximate intersection number to the central aggregation component 40 via data transmission;
the central aggregation component 40 is configured to merge all association rules of each feature, calculate the central aggregation degree of each feature according to the full set of association rules and the approximate intersection number of each feature, and select the cluster centers and the category to which each feature belongs. The central aggregation component 40 sends the clustered features to the article library component 50 via data transmission;
the article library component 50 includes a plurality of article pools, each of which stores the article title features of a different category together with their association rules, and labels the categories of the article pools that have already been delivered. The article library component 50 sends the stored article title features and their association rules to the advertisement processing component 70 via data transmission;
the advertisement input component 60 inputs the advertisement topic provided by the advertiser, which includes information describing the advantages and features of the current advertisement. The advertisement input component 60 sends the advertisement topic to the advertisement processing component 70 via data transmission;
the advertisement processing component 70 is configured to perform feature extraction on the advertisement topic to obtain advertisement topic features, calculate the correlation coefficient and the distance between each advertisement topic feature and each article pool category to obtain the total relevance of the current advertisement, and obtain the article pool categories to be delivered and the similar features according to the total relevance. The advertisement processing component 70 sends the article pool categories to be delivered and the similar features to the delivery rule setting component 80 via data transmission;
the delivery rule setting component 80 is configured to set an advertisement delivery rule: the number n of articles to be delivered is determined by a user, one category is randomly selected from the m article pools as the delivery category, and the preset number of articles to be promoted that have the most similar features in that article pool category are selected as the delivered articles of the current advertisement. The delivery rule setting component 80 sends the delivery article pool category and the number of delivered articles to the delivery progress calculation component 90 via data transmission;
the delivery progress calculation component 90 is configured to receive data from the article promotion backend, including the article reading amount and the advertisement click amount, set settlement periods, calculate the click rate of each settlement period, obtain the dynamic delivery progress of the advertisement from the advertisement click rates of the different settlement periods, and judge whether the dynamic delivery progress of the current settlement period meets the expected progress point value; if not, the delivery progress calculation component 90 sends the article pool category used in the previous period to the article library component 50; the article library component 50 records the delivered categories and sends the article title features of the remaining categories to the advertisement processing component 70, the article pool category to be delivered and the similar features corresponding to the advertisement are recalculated, and delivery is performed again according to the advertisement delivery rule.
The invention discloses a delivery progress calculation method based on combination of big data and article titles, which comprises the following steps:
A. capturing historical articles in a network based on big data, extracting the title features of the articles, and clustering the title features of the articles based on association rules to form an article library component 50;
A1. The article acquisition and processing component 10 captures historical articles from the most recent fixed time period in the network based on big data, extracts article title features using existing techniques, and denotes any article title feature as
Figure DEST_PATH_IMAGE041
, wherein
Figure 780730DEST_PATH_IMAGE042
and A is the number of article titles,
Figure 82398DEST_PATH_IMAGE014
Figure 380655DEST_PATH_IMAGE015
is the number of title features of article a.
The feature association component 20 performs association clustering on the article title features: it calculates the distance between any two features to obtain their similarity, and judges whether the two features are similar by means of a similarity threshold, thereby obtaining the association rules. First, let
Figure DEST_PATH_IMAGE043
be any two of all the title features; the distance between the two is defined as:
Figure 705458DEST_PATH_IMAGE044
wherein
Figure 075259DEST_PATH_IMAGE045
represents the number of times feature
Figure 903538DEST_PATH_IMAGE008
and feature
Figure 700592DEST_PATH_IMAGE046
occur together in all articles,
Figure 745647DEST_PATH_IMAGE047
represents the number of times feature
Figure 856822DEST_PATH_IMAGE008
occurs in all articles, and
Figure 867503DEST_PATH_IMAGE048
represents the number of times feature
Figure 773142DEST_PATH_IMAGE046
occurs in all articles. If
Figure 869274DEST_PATH_IMAGE049
, then feature
Figure 518562DEST_PATH_IMAGE008
and feature
Figure 819968DEST_PATH_IMAGE046
are associated with each other, where
Figure 958825DEST_PATH_IMAGE050
is the similarity threshold. This forms the association rule:
Figure 479936DEST_PATH_IMAGE051
Figure 729652DEST_PATH_IMAGE008
is the antecedent of the association rule, and
Figure 652609DEST_PATH_IMAGE046
is the consequent of the association rule.
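The rule-mining step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the patent's distance formula is given only as a figure, so `feature_distance` uses a Jaccard-style distance as a stand-in assumption, and two features are assumed to be associated when their distance does not exceed the threshold. The function names and sample titles are hypothetical.

```python
from itertools import combinations


def feature_distance(co_count, count_i, count_j):
    """Assumed Jaccard-style distance built from the three counts in the text."""
    union = count_i + count_j - co_count
    return 1.0 if union == 0 else 1.0 - co_count / union


def mine_association_rules(articles, max_distance):
    """articles: list of sets of title features.

    Returns (antecedent, consequent, distance) triples for associated pairs.
    """
    counts = {}
    co_counts = {}
    for features in articles:
        for f in features:
            counts[f] = counts.get(f, 0) + 1
        for a, b in combinations(sorted(features), 2):
            co_counts[(a, b)] = co_counts.get((a, b), 0) + 1
    rules = []
    for (a, b), co in co_counts.items():
        dist = feature_distance(co, counts[a], counts[b])
        if dist <= max_distance:
            # An associated pair yields a rule in each direction.
            rules.append((a, b, dist))
            rules.append((b, a, dist))
    return rules


# Hypothetical title features for three articles.
articles = [
    {"diabetes", "diet"},
    {"diabetes", "diet", "exercise"},
    {"exercise", "sleep"},
]
rules = mine_association_rules(articles, max_distance=0.5)
```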
A2. The strong association component 30 screens out the strong association rules among the association rules. Each association rule has a support and a confidence: the probability that feature
Figure 900050DEST_PATH_IMAGE008
and feature
Figure 970774DEST_PATH_IMAGE046
occur together in all articles,
Figure 460399DEST_PATH_IMAGE052
, is the support; the probability that an article containing feature
Figure 300179DEST_PATH_IMAGE008
also contains feature
Figure 718522DEST_PATH_IMAGE046
,
Figure 948647DEST_PATH_IMAGE053
, is the confidence. If the support satisfies
Figure 805744DEST_PATH_IMAGE054
and the confidence satisfies
Figure 437714DEST_PATH_IMAGE055
, then feature
Figure 089275DEST_PATH_IMAGE008
and feature
Figure 570810DEST_PATH_IMAGE046
are strongly associated, where
Figure 903702DEST_PATH_IMAGE056
is the minimum support and
Figure 452495DEST_PATH_IMAGE057
is the minimum confidence.
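The support and confidence defined above can be sketched as follows. This is a minimal illustration: the comparisons against the minimum support and minimum confidence are assumed to be non-strict, since the patent shows them only as figures, and the function names and sample data are hypothetical.

```python
def support_and_confidence(articles, antecedent, consequent):
    """articles: list of sets of title features.

    support    = share of articles containing both features
    confidence = share of articles containing the antecedent that
                 also contain the consequent
    """
    total = len(articles)
    with_antecedent = sum(1 for f in articles if antecedent in f)
    with_both = sum(1 for f in articles if antecedent in f and consequent in f)
    support = with_both / total if total else 0.0
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


def is_strong(articles, antecedent, consequent, min_support, min_confidence):
    """A rule is strong when both thresholds are met (non-strict, assumed)."""
    support, confidence = support_and_confidence(articles, antecedent, consequent)
    return support >= min_support and confidence >= min_confidence


# Hypothetical sample: feature "a" appears in three articles, twice with "b".
sample = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}]
support, confidence = support_and_confidence(sample, "a", "b")
```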
The strong association rules are selected from all feature association rules, and an approximate intersection number is defined as the sum of the confidences of all strong association rules that satisfy the judgment condition:
Figure 478220DEST_PATH_IMAGE001
wherein
Figure 745253DEST_PATH_IMAGE002
d is the number of randomly selected strong association rules,
Figure 881836DEST_PATH_IMAGE003
and D is the number of all association rules. The judgment conditions are as follows:
(1) one strong association rule is randomly selected from all the strong association rules, d association rules are traversed backwards starting from it, and the maximum confidence among these d association rules is selected;
(2) d strong association rules are randomly selected, and d maximum confidences are obtained by the method in (1), wherein any one of the maximum confidences is
Figure 721354DEST_PATH_IMAGE002
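One possible reading of the two judgment conditions can be sketched as follows. This is an assumption-heavy illustration: the patent's formula is given only as a figure, "traversing d association rules backwards" is interpreted here as taking the window of up to d rules that starts at the chosen rule in a fixed ordering, and the function name, the ordering and the sample confidences are all hypothetical.

```python
import random


def approximate_intersection_number(confidences, strong_positions, d, rng):
    """Sum of d window maxima, one window per randomly chosen strong rule.

    confidences:      confidence of every association rule, in a fixed order
    strong_positions: indices of the strong rules within that order
    d:                number of strong rules to sample (d <= len(strong_positions))
    rng:              random.Random instance, passed in for reproducibility
    """
    total = 0.0
    for start in rng.sample(strong_positions, d):
        window = confidences[start:start + d]  # up to d rules from the chosen one
        total += max(window)
    return total


# Hypothetical confidences; rules 0 and 2 are the strong ones.
confidences = [0.9, 0.2, 0.8, 0.4]
strong_positions = [0, 2]
value = approximate_intersection_number(
    confidences, strong_positions, d=2, rng=random.Random(0)
)
```

Because d equals the number of strong rules in this sample, both are always drawn and the result does not depend on the random seed.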
A3. The central aggregation component 40 merges the association rules described above; each feature is associated with at least one other feature. Therefore, for feature
Figure 980297DEST_PATH_IMAGE008
there is an association rule:
Figure 672310DEST_PATH_IMAGE058
that is,
Figure 347005DEST_PATH_IMAGE059
any non-empty subset of
Figure 870390DEST_PATH_IMAGE060
Figure 237917DEST_PATH_IMAGE061
is the number of features associated with feature
Figure 213964DEST_PATH_IMAGE008
. For discrete features, the central aggregation degree of each feature is further calculated from the full set of association rules and the approximate intersection number of each feature:
[formula image 004]
where [formula image 005] and [formula image 006] are tuning parameters that set the weight of the discrete features according to the data characteristics and actual requirements, [formula image 007] is any feature associated with feature [formula image 008], [formula image 009], [formula image 010] is the number of features strongly associated with feature [formula image 008], [formula image 011] is the number of features not strongly associated with feature [formula image 008], and [formula image 012] [formula image 013]. The central aggregation degree indicates how central a feature is.
The number of cluster categories is set, the central aggregation degrees are sorted in descending order, the preset number of largest central aggregation degrees are selected, and the features corresponding to them are taken as cluster centers. According to the distance [formula image 062] defined in step A1, features whose distance to a cluster center is smaller than a distance threshold are added to that center's category; the same feature may be added to several categories. Each category is placed into a different article pool, and the article pools together form the article library component 50.
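The center selection and category assignment just described can be illustrated with the following sketch. It assumes the central aggregation degree of every feature has already been computed and that a distance function in the sense of step A1 is supplied by the caller; the names are hypothetical.

```python
def cluster_by_aggregation(aggregation, distance, num_clusters, threshold):
    """Build overlapping feature categories around the most central features.

    aggregation: dict mapping feature -> central aggregation degree
                 (assumed to be precomputed).
    distance:    callable(feature_a, feature_b) -> float, standing in for the
                 distance defined in step A1.
    """
    # Features with the largest aggregation degrees become cluster centers.
    centers = sorted(aggregation, key=aggregation.get, reverse=True)[:num_clusters]
    clusters = {center: {center} for center in centers}
    for feature in aggregation:
        for center in centers:
            # A feature may join several categories if it is close to each center.
            if feature != center and distance(feature, center) < threshold:
                clusters[center].add(feature)
    return clusters


positions = {"a": 0, "b": 10, "c": 4, "d": 6}   # toy 1-D embedding for the demo
degrees = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
pools = cluster_by_aggregation(
    degrees, lambda x, y: abs(positions[x] - positions[y]), num_clusters=2, threshold=7
)
print(pools)
```

Each resulting category would then be stored as one article pool; in the toy data, features "c" and "d" fall into both pools.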
The beneficial effects of the article title feature clustering method are as follows: network articles are obtained based on big data, the association rules of article title features are mined, and cluster centers are obtained from the approximate intersection number derived from the strong association rules and from the full set of association rules, so that the clustering is more accurate, which facilitates accurate delivery of the articles.
B. Determine the advertisement theme, calculate the similarity between the advertisement theme features and the article title features in the article library component 50 to obtain the total relevance of the advertisement, select articles to be promoted for advertisement delivery, and count the reading amount of the promoted articles and the click amount of the corresponding advertisements to obtain the dynamic advertisement delivery progress of each settlement period.
B1. The advertisement input component 60 receives the advantages and characteristics of the advertised goods provided by the advertiser and composes an advertisement theme for the advertisement; the advertisement theme contains description information of the advantages and characteristics of the current advertisement. The advertisement processing component 70 segments the description information using Chinese word segmentation and extracts its features using the vector space model and the TF-IDF weighting method to obtain the advertisement theme features; these methods are prior art and are not described further here. The feature distribution relationship between the advertisement theme features and the article pool categories in the article library component 50 is then counted: one article pool category and one advertisement theme feature are randomly selected as a sample. For any feature [formula image 063] in the advertisement theme feature set, [formula image 064] denotes containing feature [formula image 063] and [formula image 065] denotes not containing feature [formula image 063]. For any article pool category [formula image 066], [formula image 067] denotes belonging to category [formula image 066] and [formula image 068] denotes not belonging to category [formula image 066]. The sample feature distribution table is:
                    | [formula image 069] | [formula image 070] | Total
[formula image 071] | [formula image 029] | [formula image 030] | [formula image 072]
[formula image 073] | [formula image 031] | [formula image 032] | [formula image 074]
Total               | [formula image 075] | [formula image 076] | [formula image 077]
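A table of this shape (feature present or absent, crossed with inside or outside one category, plus totals) can be tallied as in the sketch below. The cell formulas themselves are images that are not reproduced in this text, so the sketch uses raw counts and assumes each article pool is a list of per-article title feature sets; the function name and dictionary keys are hypothetical.

```python
def feature_distribution(pools, feature, category):
    """Tally a 2x2 distribution table with row, column and grand totals.

    pools: dict mapping category name -> list of feature sets, one set per
    article title (assumed representation).
    """
    in_with = in_without = out_with = out_without = 0
    for name, titles in pools.items():
        for feats in titles:
            has_feature = feature in feats
            if name == category:
                if has_feature:
                    in_with += 1
                else:
                    in_without += 1
            elif has_feature:
                out_with += 1
            else:
                out_without += 1
    return {
        "contains": {"in": in_with, "out": out_with, "total": in_with + out_with},
        "lacks": {"in": in_without, "out": out_without, "total": in_without + out_without},
        "total": {
            "in": in_with + in_without,
            "out": out_with + out_without,
            "total": in_with + in_without + out_with + out_without,
        },
    }
```

Dividing each count by the grand total would turn the cells into empirical probabilities, which may be closer to what the original table holds.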
The correlation coefficient between the current advertisement theme feature and the category is:
[formula image 021]
The distance between the current advertisement theme and the different article pool categories is:
[formula image 078]
where [formula image 023] denotes the distance between the advertisement theme and article pool category y, [formula image 024] is the distance, obtained by the calculation method of step A1, between any advertisement theme feature x and any article title category [formula image 025], X is the number of advertisement theme features, [formula image 025] denotes any article title category in article pool category y, and [formula image 026] denotes the number of article title features in article pool category y. The correlation coefficient and the distance between each advertisement theme feature and each article pool category are calculated by traversal to obtain the total relevance of the current advertisement:
[formula image 027]
where Y is the number of article pool categories. The m article pool categories with the largest total relevance [formula image 028] are selected, the features whose distance is larger than the threshold are removed from these m article pools, and the remaining features are the similar features of the advertisement.
B2. The delivery rule setting component 80 sets the advertisement delivery rule: the number n of articles to be delivered is determined by the user, one category is randomly selected from the m article pools as the delivery category, and the preset number of articles to be promoted that have the most similar features in that article pool category are selected as the delivered articles of the current advertisement. After the articles are promoted, the reading amount Ar of the articles and the click amount Ah of the corresponding advertisement within one settlement period are counted. The delivery progress calculation component 90 obtains the article reading amount Ar and the click amount Ah of the corresponding advertisement from the article promotion backend, and calculates the advertisement click rate of the first settlement period:
[formula image 033]
The settlement period is determined by the user.
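As a worked illustration of the click rate for one settlement period, the sketch below assumes the conventional definition of clicks divided by reads. The patent's own formula is an image that is not reproduced in this text, so this ratio is an assumption, as is the function name.

```python
def click_rate(reading_amount, click_amount):
    """Advertisement click rate for one settlement period.

    Assumption: click rate = Ah / Ar (clicks divided by article reads).
    Returns 0.0 when the articles have no reads, to avoid division by zero.
    """
    if reading_amount <= 0:
        return 0.0
    return click_amount / reading_amount


# Example: 2000 article reads (Ar) and 50 advertisement clicks (Ah).
print(click_rate(2000, 50))
```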
The dynamic advertisement delivery progress is obtained from the advertisement click rates of the different settlement periods:
[formula image 034]
where [formula image 035] is the delivered progress of the advertisement in the t-th settlement period, [formula image 036] is the advertisement click rate of the (t-1)-th settlement period, [formula image 037] is the number of articles delivered with the advertisement in the (t-1)-th settlement period, and [formula image 038] is the average number of articles delivered over the first t-1 settlement periods.
The user sets progress point expected values [formula image 039] for the different settlement periods. If the dynamic delivery progress of the current settlement period reaches the user's progress point expected value, that is, [formula image 040], then in the next period articles to be promoted continue to be selected from the article pool category used in the previous period for advertisement delivery. Otherwise, the delivery progress calculation component 90 sends the article pool category used in the previous period to the article library component 50; the article library component 50 records the delivered category and sends the article title features of the remaining categories to the advertisement processing component 70, which recalculates the article pool categories to be delivered and the similar features corresponding to the advertisement; delivery is then performed again according to the advertisement delivery rule. This is iterated until the sum of the dynamic delivery progress reaches the preset total delivery progress, as shown in fig. 2.
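The period-by-period control loop described above can be sketched as follows. The progress formula is not reproduced in this text, so the sketch takes the measured progress from a caller-supplied function. It also simplifies the re-selection step to "take the next category in a precomputed relevance ranking", whereas the description recalculates relevance over the remaining categories and picks randomly among the top m. All names, and the max_periods safety limit, are hypothetical.

```python
def run_delivery(ranked_categories, measure_progress, expected, total_target, max_periods=50):
    """Iterate settlement periods until the summed progress reaches the target.

    ranked_categories: article pool categories, most relevant first (assumed).
    measure_progress:  callable(category, t) -> dynamic progress in period t.
    expected:          callable(t) -> user's progress point expected value.
    Returns (history, total), where history holds (t, category, progress).
    """
    candidates = list(ranked_categories)
    if not candidates:
        return [], 0.0
    delivered = []                    # categories recorded as already used
    current = candidates.pop(0)
    history, total = [], 0.0
    for t in range(1, max_periods + 1):
        progress = measure_progress(current, t)
        history.append((t, current, progress))
        total += progress
        if total >= total_target:     # preset total delivery progress reached
            break
        if progress < expected(t):    # expectation missed: switch category
            delivered.append(current)
            if not candidates:
                break                 # nothing left to try
            current = candidates.pop(0)
    return history, total


rates = {"A": 0.1, "B": 0.4}
log, reached = run_delivery(["A", "B"], lambda c, t: rates[c], lambda t: 0.3, 1.0)
print([c for _, c, _ in log], round(reached, 2))
```

In the toy run, category A misses the expected value in the first period, so delivery switches to B and continues there until the summed progress passes the target.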
The beneficial effects of the dynamic advertisement delivery are as follows: advertisements are delivered according to the relevance between the advertisement theme and the article pools, several settlement periods are set during delivery, the delivered articles are adjusted in real time according to the dynamic delivery progress of each settlement period, and the advertiser can independently keep track of the delivery progress and effect.
This completes the delivery progress calculation method and system based on the combination of big data and article titles.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A delivery system based on the combination of big data and article titles, characterized by comprising the following parts:
an article acquisition and processing component, a feature association component, a strong association component, a central aggregation component, an article library component, an advertisement input component, an advertisement processing component, a delivery rule setting component and a delivery progress calculation component, wherein the article acquisition and processing component is connected with the feature association component, the feature association component is connected with the central aggregation component and the strong association component respectively, the strong association component is connected with the central aggregation component, the central aggregation component is connected with the article library component, the article library component is connected with the advertisement processing component, the advertisement input component is also connected with the advertisement processing component, the advertisement processing component is connected with the delivery rule setting component, the delivery rule setting component is connected with the delivery progress calculation component, and the delivery progress calculation component is connected with the article library component;
the strong association component is used for screening out strong association rules in the association rules, setting judgment conditions and defining an approximate intersection number according to the judgment conditions;
the central aggregation component is used for merging all association rules of each feature, further calculating the central aggregation degree of each feature according to the association rule complete set and the approximate intersection number of each feature, and selecting a clustering center and the category of each feature;
the delivery rule setting component is used for setting advertisement delivery rules;
the delivery progress calculation component is used for receiving data of the article promotion background, setting settlement periods, calculating the click rate of each settlement period, obtaining the dynamic delivery progress of the advertisement according to the advertisement click rates of different settlement periods, and judging whether the dynamic delivery progress of the current settlement period meets the progress point expected value or not;
the article library component records the types of the articles which are delivered, sends the article title characteristics of the rest types to the advertisement processing component, recalculates the types and the similar characteristics of the article pool to be delivered corresponding to the advertisement, and delivers the articles again according to the advertisement delivery rule.
2. The delivery progress calculation method based on the combination of big data and article titles is characterized by comprising the following steps of:
A. capturing historical articles in a network based on big data, extracting the title features of the articles, and clustering the title features of the articles based on association rules to form an article library component;
B. determining an advertisement theme, calculating the similarity between the advertisement theme features and the article title features in the article library component to obtain the total relevance of the advertisement, selecting articles to be promoted for advertisement delivery, and counting the reading amount of the promoted articles and the click amount of the corresponding advertisements to obtain the dynamic advertisement delivery progress of each settlement period.
3. The method for calculating the delivery progress based on the combination of the big data and the article title according to claim 2, wherein the step A comprises:
selecting strong association rules from all feature association rules and defining an approximate intersection number as the sum of the confidences of all strong association rules that satisfy the judgment conditions, the approximate intersection number being:
[formula image 002]
wherein [formula image 003] is any one of the maximum confidences, d is the number of randomly selected strong association rules, [formula image 004], and D is the number of all association rules.
4. The method for calculating the delivery progress based on the combination of the big data and the article title as claimed in claim 3, wherein the confidence degree is determined by the following conditions:
(1) randomly selecting one strong association rule from all the strong association rules, traversing the d association rules that follow it, and taking the maximum confidence among those d association rules;
(2) randomly selecting d strong association rules and obtaining d maximum confidences according to the method in (1), any one of which is [formula image 003].
5. The method for calculating the delivery progress based on the combination of the big data and the article title as claimed in claim 3, wherein the step A further comprises:
calculating the central aggregation degree of each feature from the full set of association rules of each feature and the approximate intersection number:
[formula image 006]
wherein [formula image 007] and [formula image 008] are tuning parameters that set the weight of the discrete features according to the data characteristics and actual requirements, [formula image 009] is any feature associated with feature [formula image 010], [formula image 011], [formula image 012] is the number of features strongly associated with feature [formula image 010], [formula image 013] is the number of features not strongly associated with feature [formula image 010], and [formula image 014] [formula image 015]; the central aggregation degree indicates how central a feature is;
[formula image 016]
[formula image 017] is the number of title features of article a;
[formula image 018] denotes the distance between feature [formula image 019] and any feature [formula image 020] associated with it;
[formula image 021] denotes the central aggregation degree;
[formula image 022] denotes the approximate intersection number.
6. The method for calculating the delivery progress based on the combination of the big data and the article title according to claim 2, wherein the step B comprises:
the correlation coefficient between the current advertisement theme feature and the different article pool categories is:
[formula image 023]
the distance between the current advertisement theme and the different article pool categories is:
[formula image 025]
wherein [formula image 026] denotes the distance between the advertisement theme and article pool category y, [formula image 027] is the distance, obtained by the calculation method in step A, between any advertisement theme feature x and any article title category [formula image 028], X is the number of advertisement theme features, [formula image 028] denotes any article title category in article pool category y, and [formula image 029] denotes the number of article title features in article pool category y;
the correlation coefficient and the distance between each advertisement theme feature and each article pool category are calculated by traversal to obtain the total relevance of the current advertisement:
[formula image 031]
wherein Y is the number of article pool categories; the m article pool categories with the largest total relevance [formula image 032] are selected, the features whose distance is larger than the threshold are removed from these m article pools, and the remaining features are the similar features of the advertisement;
[formula image 034] denotes the probability that category y contains feature x,
[formula image 036] denotes the probability that categories other than category y contain feature x,
[formula image 038] denotes the probability that category y does not contain feature x, and
[formula image 040] denotes the probability that categories other than category y do not contain feature x.
7. The method for calculating the delivery progress based on the combination of the big data and the article title according to claim 2, wherein the step B comprises:
setting an advertisement delivery rule: the number n of articles to be delivered is determined by a user, one category is randomly selected from the m article pools as the article pool delivery category, and the preset number of articles to be promoted that have the most similar features in the article pool delivery category are selected as the delivered articles of the current advertisement; after the articles are promoted, the reading amount Ar of the articles and the click amount Ah of the corresponding advertisement within one settlement period are counted;
the delivery progress calculation component obtains the article reading amount Ar and the click amount Ah of the corresponding advertisement from the article promotion backend, and calculates the advertisement click rate of the first settlement period:
[formula image 042]
the settlement period is determined by the user;
the dynamic advertisement delivery progress is obtained from the advertisement click rates of the different settlement periods:
[formula image 044]
wherein [formula image 045] is the delivered progress of the advertisement in the t-th settlement period, [formula image 046] is the advertisement click rate of the (t-1)-th settlement period, [formula image 047] is the number of articles delivered with the advertisement in the (t-1)-th settlement period, and [formula image 048] is the average number of articles delivered over the first t-1 settlement periods.
8. The method for calculating the delivery progress based on the combination of the big data and the article title according to claim 2, wherein the step B further comprises:
a user sets progress point expected values [formula image 049] for the different settlement periods, [formula image 045] being the delivered progress of the advertisement in the t-th settlement period; if the dynamic advertisement delivery progress of the current settlement period reaches the user's progress point expected value, that is, [formula image 050], then in the next period articles to be promoted continue to be selected from the article pool category used in the previous period for advertisement delivery; otherwise, the delivery progress calculation component sends the article pool category used in the previous period to the article library component, the article library component records the delivered category and sends the article title features of the remaining categories to the advertisement processing component, the article pool categories to be delivered and the similar features corresponding to the advertisement are recalculated, and delivery is performed again according to the advertisement delivery rule; this is iterated until the sum of the dynamic delivery progress reaches the preset total delivery progress.
CN202110562046.2A 2021-05-24 2021-05-24 Delivery progress calculation method and system based on combination of big data and article title Active CN113032551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110562046.2A CN113032551B (en) 2021-05-24 2021-05-24 Delivery progress calculation method and system based on combination of big data and article title

Publications (2)

Publication Number Publication Date
CN113032551A true CN113032551A (en) 2021-06-25
CN113032551B CN113032551B (en) 2021-09-10

Family

ID=76455536

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110562046.2A Active CN113032551B (en) 2021-05-24 2021-05-24 Delivery progress calculation method and system based on combination of big data and article title

Country Status (1)

Country Link
CN (1) CN113032551B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120114184A1 (en) * 2009-07-21 2012-05-10 Thomson Licensing Trajectory-based method to detect and enhance a moving object in a video sequence
CN106326379A (en) * 2016-08-16 2017-01-11 廖文广 Management system and method for embedded advertisement in webpage article
CN108132927A (en) * 2017-12-07 2018-06-08 西北师范大学 A kind of fusion graph structure and the associated keyword extracting method of node
CN109919641A (en) * 2017-12-12 2019-06-21 优视科技有限公司 A kind of advertisement placement method and platform

Also Published As

Publication number Publication date
CN113032551B (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CN110111128B (en) Apartment elevator advertisement playing method, device and equipment
CN104915392B (en) A kind of microblogging forwarding behavior prediction method and device
WO2017202336A1 (en) Method and device for preventing fraudulent behavior with respect to advertisement, and storage medium
CN103729785B (en) Video user gender classification method and device for method
CN105975581A (en) Media information display method, client and server
WO2015085967A1 (en) User behavior data analysis method and device
CN107103485B (en) Automatic advertisement recommendation method and system according to cinema visitor information
CN105302887A (en) Information pushing method and pushing apparatus
CN107526810B (en) Method and device for establishing click rate estimation model and display method and device
CN105491444B (en) A kind of data identifying processing method and device
CN110515904B (en) Quality prediction model training method, quality prediction method and device for media file
CN105608125B (en) Information processing method and server
CN105590240A (en) Discrete calculating method of brand advertisement effect optimization
CN108076387A (en) Business object method for pushing and device, electronic equipment
CN105512916A (en) Advertisement accurate delivery method and advertisement accurate delivery system
CN110264268B (en) Advertisement putting device, method, equipment and storage medium thereof
CN108900924A (en) The method and apparatus of commending friends in direct broadcasting room
CN106446149B (en) Notification information filtering method and device
WO2015124024A1 (en) Method and device for promoting exposure rate of information, method and device for determining value of search word
CN112150191B (en) Advertisement putting method and system
CN112969079B (en) Anchor resource allocation method and device, computer equipment and storage medium
CN106570020A (en) Method and apparatus used for providing recommended information
CN105956086B (en) Multimedia resource recommendation method and device
CN113032551B (en) Delivery progress calculation method and system based on combination of big data and article title
CN106204163B (en) Method and device for determining user attribute characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100176 3203, 32nd floor, building 2, yard 1, Ronghua South Road, economic and Technological Development Zone, Daxing District, Beijing

Patentee after: Beijing Zeqiao Medical Technology Co.,Ltd.

Address before: 100176 3203, 32nd floor, building 2, yard 1, Ronghua South Road, economic and Technological Development Zone, Daxing District, Beijing

Patentee before: Beijing Zeqiao Media Technology Co.,Ltd.
