CN111583363B - Visual automatic generation method and system for graphic news - Google Patents


Info

Publication number
CN111583363B
CN111583363B (application CN202010392691.XA)
Authority
CN
China
Prior art keywords
news
emotion
graphic
visual
basic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010392691.XA
Other languages
Chinese (zh)
Other versions
CN111583363A (en
Inventor
胡溢
杨成
张亚娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Communication University of China
Original Assignee
Communication University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Communication University of China filed Critical Communication University of China
Priority to CN202010392691.XA priority Critical patent/CN111583363B/en
Publication of CN111583363A publication Critical patent/CN111583363A/en
Application granted granted Critical
Publication of CN111583363B publication Critical patent/CN111583363B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00: 2D [Two Dimensional] image generation
    • G06T11/60: Editing figures and text; Combining figures or text
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/186: Templates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a visual automatic generation method and system for graphic (image-text) news, belongs to the technical field of graphic news generation, and solves the problems of low production efficiency and high labor cost in the existing visual generation process of graphic news. The method comprises the following steps: receiving the news document and news pictures of a piece of graphic news; determining the news type of the graphic news based on the news document; acquiring the basic visual feature set corresponding to the news type from a feature optimization individual library and taking it as the basic style of the graphic news; rendering the news document and the news pictures with the basic style to obtain the basic visual design result of the graphic news; acquiring the emotion of the news document and selecting, from an emotion pattern template library, an emotion pattern template matching that emotion; and rendering the basic visual design result with the emotion pattern template to obtain the visual generation result of the graphic news. Automatic generation of graphic news is thereby realized, and the production efficiency of graphic news is effectively improved.

Description

Visual automatic generation method and system for graphic news
Technical Field
The invention relates to the technical field of image-text news generation, in particular to a visual automatic generation method and system of image-text news.
Background
"Automated news" has long become one of the representative applications of artificial intelligence in the media industry. Most research has been focused on the generation of news videos and news text content, such as the EIGENNEWS system by m.daneshi et al and data-driven news generation by indonesian city selection, which focus on researching news services on text or video-based aspects. Today people often choose to access information by browsing various network platforms, such as microblogging, micro-signaling public numbers, etc. The information issued by the network platforms mainly adopts templates to combine pictures and texts to form a text. We call this type of text news. The advantage of the graphic news is that the typesetting mode, the color style and the appearance image can be selected according to the news type and the emotion of the news expression, so that the automatic generation of the visual sense (typesetting and style) of the graphic news is particularly important. .
Referring to similar work, Zhang Cunjun et al. proposed AI Painting, which focuses on the intelligent generation of drawings: keywords entered by a user drive a style transfer that produces new drawings. Generation Meets Recommendation by T. V. Vo and H. Soh (Best Long Paper Runner-up) aims to create a new set of items (each item defined by a set of features) that can be liked by all users. Most of this work relies on machine learning or deep learning and on existing databases for computation and iteration.
Following the traditional intelligent news generation approach, machine learning or deep learning would have to be trained on front-end markup code. Because related data sets are lacking and front-end markup code is highly complex, this differs greatly from the automatic generation of pictures or text, so the approach remains at a theoretical level and lacks practicality. In addition, on the news production side the number of WeChat official accounts exceeded 10 million by 2017, of which 3.5 million were active. Since every article has its own layout and style, news workers invest a great deal of labor in the visual generation of graphic news in order to attract traffic. No automatic visual generation method for graphic news exists in the prior art, and the problems of low production efficiency and high labor cost in the existing visual generation process of graphic news remain unsolved.
Disclosure of Invention
In view of the above analysis, the invention aims to provide a visual automatic generation method and a visual automatic generation system for graphic news, which are used for solving the problems of low production efficiency and high labor cost in the conventional visual generation process of graphic news.
The aim of the invention is mainly realized by the following technical scheme:
In one aspect, a visual automatic generation method of graphic news is provided, and the method comprises the following steps:
receiving news documents and news pictures based on graphic news;
Determining the news type of the graphic news based on the news document;
Acquiring a basic visual feature group corresponding to the news type in a feature optimization individual library, and taking the basic visual feature group as a basic style of the graphic news; rendering the news document and the news picture by using the basic style to obtain a basic visual design result of the graphic news;
Acquiring emotion of the news document, and selecting an emotion pattern template matched with the emotion of the news document from an emotion pattern template library; and rendering the basic visual design result by using the emotion pattern template to obtain a visual generation result of the graphic news.
Based on the scheme, the invention also makes the following improvements:
further, the feature optimization individual library is established by:
Obtaining news document samples of all news types, and generating a news document sample set;
determining user satisfaction corresponding to each news document sample in the news document sample set;
establishing a mathematical model between a basic visual characteristic set of a news document sample and user satisfaction;
and inputting the established mathematical models between the basic visual feature sets under the various news types and user satisfaction into a genetic algorithm as fitness functions, and obtaining the optimal basic visual feature set corresponding to each news type after the genetic algorithm has been trained to convergence, to form the feature optimization individual library.
Further, a mathematical model between the basic visual feature set of a news document sample and user satisfaction is established by multiple linear regression analysis:
y = Σ(i=1..m) Σ(j=1..n) a_ij × x_ij
wherein m represents the number of basic visual feature types in the basic visual feature set; n represents the number of categories under each basic visual feature type; x_ij represents the feature value of the j-th category under the i-th basic visual feature type, where 0 indicates that the feature is absent and 1 indicates that it is present; a_ij represents the coefficient of the j-th category under the current i-th basic visual feature type; y represents the user satisfaction of the current news document sample;
the value of a_ij is estimated by a multiple linear regression algorithm, thereby establishing a mathematical model between user satisfaction and the basic visual feature set.
Further, the emotion pattern template matched with the emotion of the news document is selected from the emotion pattern template library, and the following operations are executed:
acquiring emotion tendency scores of current news documents;
And calculating cosine similarity between the emotion tendency score of the news document and the emotion tendency scores of all emotion pattern templates in the emotion pattern template library, and randomly selecting a certain emotion pattern template from the emotion pattern templates with similarity higher than a similarity threshold as the emotion pattern template matched with the emotion of the news document at the time.
Further, the emotion pattern template at least comprises styles, colors, image-text typesetting, background types, text information and emotion tendencies; wherein the emotion pattern template has an emotion tendency score [ pos, neg ];
pos=α×tpos+(1-α)×cpos
neg=1-pos
wherein tpos represents the text forward emotion bias in the text emotion score of the text information; cpos represents the color forward emotion bias in the color emotion score of the color; and α represents the weight coefficient corresponding to the text emotion score tpos, with 0 ≤ α ≤ 1.
Further, the following operations are performed to obtain tpos:
extracting word segmentation in the text information in the emotion pattern template, and removing stop words contained in the word segmentation;
Matching the segmented words with the removed stop words with a corpus to obtain emotion words matched with the corpus, and calculating word frequency of the emotion words;
Ordering emotion words from high to low according to word frequency, and selecting a certain number of emotion words with top ranking to form a hotness word stock;
For forward emotion words in the hotness word stock:
positive emotion word score = (whether a negative word appears in the preceding l words: -1 if yes, 1 if no) × the word's emotion intensity (1, 3, 5, 7 or 9)
For negative emotion words in the hotness word stock:
negative emotion word score = (-1) × (whether a negative word appears in the preceding l words: -1 if yes, 1 if no) × the word's emotion intensity (1, 3, 5, 7 or 9)
The text forward emotion bias is then: tpos = S_pos / (S_pos + |S_neg|),
wherein S_pos and S_neg denote the sums of the positive and negative emotion word scores over the hotness word stock, and l represents the preset number of preceding words checked for negation.
Further, the following operations are performed to obtain cpos:
Inputting the colors in the emotion pattern template to a color emotion model, and classifying the color emotion model to obtain color emotion scores corresponding to the colors in the emotion pattern template;
And obtaining the color forward emotion bias cpos of the emotion pattern template based on the color emotion scores corresponding to the colors.
Further, the color emotion model is established by performing the following operations:
Determining emotion representative colors of emotion words in a corpus;
using emotion representative colors of emotion words and scores of the corresponding emotion words as a data set of the color emotion model;
Training the data set of the color emotion model through an SVM to obtain the relation between emotion representative colors of emotion words in the color emotion model and scores of the corresponding emotion words.
Further, the news type of the teletext news is determined by performing the following operations:
Extracting word segmentation vector feature values of the news documents, inputting the word segmentation vector feature values of the news documents into a news classification model, and classifying the news classification model to obtain the news types of the graphic news.
Further, the news classification model is built by performing the following operations:
taking a news document sample containing different news types as a classification data set;
Assigning a tag value of a corresponding news type to each news document sample;
extracting word segmentation vector feature values of all the news document samples, inputting the word segmentation vector feature values and corresponding tag values of all the news document samples into a news classification model, performing Bayesian classification, and training to obtain a final news classification model.
In another aspect, a visual automatic generation system of graphic news is provided, the system comprising:
the image-text news content receiving module is used for receiving news documents and news pictures based on image-text news;
The news type determining module is used for determining the news type of the graphic news based on the news document;
The basic visual layer processing module is used for acquiring a basic visual feature group corresponding to the news type in a feature optimization individual library, and taking the basic visual feature group as a basic style of the graphic news; rendering the news document and the news picture by using the basic style to obtain a basic visual design result of the graphic news;
the detailed visual layer processing module is used for acquiring emotion of the news document and selecting an emotion pattern template matched with the emotion of the news document from an emotion pattern template library; and rendering the basic visual design result by using the emotion pattern template to obtain a visual generation result of the graphic news.
The invention has the following beneficial effects:
According to the visual automatic generation method and system of graphic news, the received news document and news pictures are processed to obtain the basic style and the emotion pattern template of the graphic news, and the construction and rendering of the graphic news are completed with them, so that automatic generation of graphic news is realized, the production efficiency of graphic news is effectively improved, the burden on news workers is reduced, and the requirement for automated visual generation of graphic news is met.
In the invention, the technical schemes can be mutually combined to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, like reference numerals being used to refer to like parts throughout the several views.
FIG. 1 is a flowchart of a visual automatic generation method of graphic news according to embodiment 1 of the present invention;
FIG. 2 is a flowchart of another visual automatic generation method of graphic news in embodiment 1 of the present invention;
FIG. 3 is a plot of emotion distribution for school celebration news in example 1 of the present invention;
FIG. 4 is a pie chart of emotion distribution of school celebration news in example 1 of the present invention;
FIG. 5 is an effect diagram of a visual automatic generation result of graphic news according to embodiment 1 of the present invention;
FIG. 6 is an effect diagram of another visual automatic generation result of graphic news in embodiment 1 of the present invention;
fig. 7 is a schematic structural diagram of a visual generation system of graphic news in embodiment 2 of the present invention.
Detailed Description
The following detailed description of preferred embodiments of the application is made in connection with the accompanying drawings, which form a part hereof, and together with the description of the embodiments of the application, are used to explain the principles of the application and are not intended to limit the scope of the application.
Example 1
In embodiment 1 of the present invention, a visual automatic generation method of graphic news is provided, and a flowchart is shown in fig. 1, where the method includes the following steps:
step S1: receiving news documents and news pictures based on graphic news;
It should be noted that, the method is directed to a producer (i.e. "user" here) rather than a consumer, where "graphic news" refers to news materials to be subjected to visual generation processing provided by the user, including news documents and news pictures, and the news materials are processed by a subsequent method to finally obtain visual generation results of the graphic news; the news document and the news picture provided by the user have certain correlation so as to ensure the effect of the finally generated graphic news. The news pictures may be one or more.
Because different news types have great differences in typesetting modes, color styles, appearance images and the like, in the actual processing process, the news type of the current image-text news needs to be determined first, and then the news document and the news picture are processed in a targeted manner according to the relevant characteristics of the news type. The method for determining the news type refers to step S2, and the process of using the news type for subsequent processing refers to step S3;
step S2: determining the news type of the graphic news based on the news document;
Specifically, extracting word segmentation vector feature values of the news document, inputting the word segmentation vector feature values of the news document into a news classification model, and classifying the news classification model to obtain the news type of the graphic news;
Preferably, the word-segmentation vector feature values of the news document may be obtained as follows: a corpus is constructed, and the TF-IDF word-segmentation vector feature values of the news document are extracted to form its word-segmentation vector feature values; illustratively, the maximum number of word-segmentation vector features may be set to 4000;
Preferably, the news classification model may be established by:
(1) Taking news document samples containing different news types as a data set, and cleaning and arranging the data set; the data set may be a published news data resource;
(2) Assigning a tag value of a corresponding news type to each news document sample;
Illustratively, the news types in this embodiment may include: automobile, finance, science and technology, health, sports, education, culture, military and entertainment; the corresponding tag values may be set as follows: "automobile": 0; "finance": 1; "science and technology": 2; "health": 3; "sports": 4; "education": 5; "culture": 6; "military": 7; "entertainment": 8.
(3) Extracting word segmentation vector feature values of all the news document samples, inputting the word segmentation vector feature values and corresponding tag values of all the news document samples into a news classification model, performing Bayesian classification, and training to obtain a final news classification model. The way of extracting the characteristic value of the word segmentation vector of each news document sample is the same as the way of obtaining the characteristic value of the word segmentation vector of the news document.
Determining the news type in this way gives higher classification accuracy and stronger adaptive capability; the trained news classification model can be saved, so that the news type can be judged quickly when classification is performed in real time.
Because Bayesian classification returns the tag value with the highest probability for a given feature vector, in this embodiment each news document corresponds to exactly one news type; a minimal sketch of this classification pipeline is given below.
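The sketch below illustrates this classification step under stated assumptions: jieba is used as a stand-in word segmenter and scikit-learn provides the TF-IDF extraction and naive Bayes classifier, since the patent does not name specific libraries; the names `samples` and `labels` and the 4000-feature cap mirror the training data and setting described above, and the nine tag values follow the mapping in step (2).

```python
# Minimal sketch of the news-classification step (assumptions: jieba for word
# segmentation, scikit-learn for TF-IDF and naive Bayes; the patent does not
# name specific libraries).
import jieba
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

LABELS = {0: "automobile", 1: "finance", 2: "science and technology", 3: "health",
          4: "sports", 5: "education", 6: "culture", 7: "military", 8: "entertainment"}

def segment(text):
    # Word segmentation; stop-word removal would be added here in practice.
    return " ".join(jieba.cut(text))

def train_news_classifier(samples, labels):
    """samples: list of raw news documents; labels: list of tag values 0-8."""
    vectorizer = TfidfVectorizer(max_features=4000)      # cap matches the example above
    features = vectorizer.fit_transform(segment(s) for s in samples)
    model = MultinomialNB()                              # Bayesian classification
    model.fit(features, labels)
    return vectorizer, model

def classify_news(vectorizer, model, document):
    vec = vectorizer.transform([segment(document)])
    return LABELS[int(model.predict(vec)[0])]            # one news type per document
```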
Step S3: acquiring a basic visual feature group corresponding to the news type in a feature optimization individual library, and taking the basic visual feature group as a basic style of the graphic news; rendering the news document and the news picture by using the basic pattern to obtain a basic visual design result of the graphic news;
Extensive investigation of graphic news shows that the elements composing it are numerous. In this embodiment several element types that are complete, independent and diverse are selected as the basic visual feature types of the basic visual feature set, including: cool/warm tone, structural style, font, font size, link style, title style and main-content style. A corresponding set of categories is defined for the base element of each basic feature type, as shown in Table 1.
TABLE 1 basic visual feature types in basic visual feature set and corresponding categories
In this step, a feature optimization individual library storing correspondence between a news type and an optimal basic visual feature group under the news type may be acquired.
In this embodiment, the genetic algorithm is used to obtain the feature optimization individual library; in order to obtain the feature optimization individual library, related samples need to be trained by using a genetic algorithm, and the selection of the samples and the training process are described as follows:
1. Collecting news document samples of the news type and, according to the principles of orthogonal experimental design, determining the most representative news document samples of that news type to form the news document sample set; through the orthogonal experiment, more and more comprehensive information can be obtained with less manpower and material consumption; illustratively, an L18(3^7) orthogonal table may be selected to implement the orthogonal experiment;
2. determining user satisfaction corresponding to each news document sample in the news document sample set;
Wherein, the user satisfaction corresponding to each news document sample may be determined through a questionnaire or other means, and may refer to the level of psychological and emotional experience. Psychologists hold that emotional experience can be divided into several levels, and user satisfaction can accordingly be divided into seven or five levels. A seven-point satisfaction scale is used here: 1 (intolerably dissatisfied); 2 (barely tolerable); 3 (somewhat dissatisfied, regretful); 4 (no obvious positive or negative feeling); 5 (fairly satisfied, positive); 6 (pleased and satisfied); 7 (very satisfied).
3. Establishing the association between the basic visual feature set of a news document sample and user satisfaction. Preferably the association is determined by multiple linear regression analysis, whose formula is as follows:
y = Σ(i=1..m) Σ(j=1..n) a_ij × x_ij
where m represents the number of basic visual feature types of the news type (when the elements in Table 1 are selected as the basic visual feature set, m is 7). Following the column order of Table 1, i = 1 corresponds to "cool/warm tone", i = 2 to "structural style", and so on; n represents the number of categories under each basic visual feature type, and following the row order of Table 1, i = 1, j = 1 corresponds to "warm tone", and so on. x_ij represents the feature value of the j-th category under the i-th basic visual feature type, where 0 indicates that the feature is absent and 1 indicates that it is present; for each basic visual feature type, the feature values are determined by analysing the content of the news document sample (for example, for a warm-toned sample, the feature value of "warm tone" is 1 and the feature values of "cool tone" and "middle tone" are 0). a_ij represents the coefficient of the j-th category under the current i-th basic visual feature type, and y represents the user satisfaction of the current news document sample. In the formula the coefficients a_ij are unknown; their values are estimated by a multiple linear regression algorithm, thereby establishing a mathematical model between user satisfaction and the basic visual feature set.
Because the relationship between the basic visual feature set and user satisfaction differs across news types, when multiple linear regression analysis is used to analyse this relationship, a mathematical model between the basic visual feature set and user satisfaction is established separately for each news type; a minimal sketch of fitting such a model is given below.
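The sketch below is one way this regression step could be realised; it assumes scikit-learn's ordinary least squares (the patent only specifies "multiple linear regression analysis"), and the variable names `feature_matrix` and `satisfaction` are illustrative rather than taken from the patent.

```python
# Sketch: estimate the coefficients a_ij of the satisfaction model for one news
# type by multiple linear regression (assumption: scikit-learn; the patent only
# specifies "multiple linear regression analysis").
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_satisfaction_model(feature_matrix, satisfaction):
    """feature_matrix: (n_samples, n_features) 0/1 matrix, each row a flattened
    basic visual feature set x_ij; satisfaction: 1-7 user satisfaction scores y."""
    model = LinearRegression()
    model.fit(np.asarray(feature_matrix), np.asarray(satisfaction))
    return model                     # model.coef_ holds the estimated a_ij values

def predicted_satisfaction(model, feature_vector):
    # This prediction is later reused as the genetic-algorithm fitness function.
    return float(model.predict(np.asarray(feature_vector).reshape(1, -1))[0])
```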
4. Inputting the established mathematical models between the basic visual feature sets and user satisfaction under the various news types into a genetic algorithm as fitness functions, and setting the initial population size, the crossover probability and the mutation probability, the termination condition being that the number of iterations reaches a set value. After training finishes and the fitness function has converged, according to the seven-level user satisfaction evaluation standard the user can be considered satisfied when the user satisfaction score exceeds 5 points.
Acquiring, for each news type, the basic visual feature sets whose user satisfaction reaches its maximum above 5 points, and forming the feature optimization individual library. Preferably the basic visual feature sets are stored in coded form in the feature optimization individual library: a basic visual feature set is selected from the library and then decompiled to obtain its specific contents. For example, the feature optimization individual library may take the form of Table 2, in which the individual DNA represents the values of the feature values in the basic visual feature set, and the fitness score is the comprehensive user satisfaction score after training.
Table 2 feature optimized individual library example
Through 1-4 above, a feature optimized individual library was constructed. Thus, in step S3, the corresponding set of basic visual features in the feature optimized individual library may be obtained based on the news type.
By training on samples in advance and establishing the feature optimization individual library, the corresponding basic visual feature set can be determined quickly from the news type of the news material when real-time news needs to be generated, saving the time a complex algorithm would take at run time. Taking entertainment news as an example, after the fitness function converges, an individual DNA with the optimal objective function value is taken out, for example [1,0,0,0,0,1,0,0,1,0,0,1,1,0,0,1,0,0,0,1,0]. It decompiles to [warm tone, graphic-context-level structure, bold, 14px, text link, line-frame title, background], so this basic visual feature set is the most suitable for entertainment news. A minimal sketch of the genetic-algorithm search over such 21-bit individuals is given below.
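The genetic-algorithm search referenced above could look like the following hand-rolled sketch. The 21-bit DNA length matches the decompiled entertainment-news example; the population size, crossover probability, mutation probability and generation count are assumed placeholder values (the patent leaves them as settings), and `fitness` would be a callable such as `lambda dna: predicted_satisfaction(model, dna)` built from the regression model above.

```python
# Sketch: genetic algorithm that searches for the basic visual feature set with
# the highest predicted user satisfaction (all hyper-parameters are assumptions).
import random

DNA_LENGTH = 21        # bits in one individual, as in the decompiled example above

def run_ga(fitness, pop_size=50, crossover_p=0.8, mutation_p=0.02, generations=200):
    population = [[random.randint(0, 1) for _ in range(DNA_LENGTH)]
                  for _ in range(pop_size)]
    for _ in range(generations):                    # termination: fixed iteration count
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]            # selection by fitness
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            if random.random() < crossover_p:       # single-point crossover
                point = random.randint(1, DNA_LENGTH - 1)
                child = a[:point] + b[point:]
            else:
                child = a[:]
            child = [bit ^ 1 if random.random() < mutation_p else bit
                     for bit in child]              # bit-flip mutation
            children.append(child)
        population = children
    # A full implementation would also repair individuals so that each basic
    # visual feature type keeps exactly one active category.
    best = max(population, key=fitness)
    return best, fitness(best)                      # store in the library if score > 5
```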
The process of rendering the news document and the news picture by using the basic style to obtain the basic visual design result of the graphic news can be realized by adopting the existing rendering mode, and the description is omitted here.
Step S4: acquiring emotion of the news document, and selecting an emotion pattern template matched with the emotion of the news document from an emotion pattern template library; and rendering the basic visual design result by using the emotion pattern template to obtain a visual generation result of the graphic news.
In recent years emotion has been a topic of wide concern. Most works carry an emotional color: an article, a film or a song each has an emotional atmosphere, and the same is true of graphic news. This step assigns the most appropriate emotion pattern template to the news by emotion matching.
For the emotion pattern templates selected in this embodiment, the data of a template can be divided into text information and color information. The emotion of each type of emotion pattern template is obtained by fusing text emotion and color emotion, and the emotion pattern template library is then established.
Illustratively, the emotion pattern templates can be obtained with a web crawler: several editing websites for graphic news exist at present, so the source code of each editing platform's pattern templates and the templates' text information (such as the title, labels and internal text description of the template) are crawled. The crawled visual templates are cleaned and classified, and the emotion pattern template library is established. The emotion pattern template library describes each emotion pattern template from different angles, including style, color, graphic layout, background type, text information, emotion bias and other information. The purpose of constructing this resource is to select a suitable emotion pattern template for the news document to be laid out. The general format of the emotion pattern template library is shown in Table 3:
Table 3 example of an emotion pattern template library format
Various information in the emotion pattern template library is briefly described as follows:
Style refers to the overall representative appearance of an emotion pattern template and can be divided into the following 6 categories: concise (brief), classic (classical), nostalgic (reminiscence), cartoon (cartoon), fashion (fashion), graffiti (graffiti).
Color refers to the color that occupies the largest proportion of the emotion pattern template; when the dominant color cannot be determined satisfactorily while the template library is being built, the primary color, among the six primary colors red, orange, yellow, green, blue and purple, that is closest to the current template's color may be used as the color of the current emotion pattern template.
Graphic layout is classified into the following 9 types according to how pictures and text are arranged in the emotion pattern template: body text (main body), background text (background), text above/below the picture (up-down), text left/right of the picture (left-right), two pictures (double), three pictures (three), four pictures (four), five pictures and above (five and more), and others (other).
Background types are illustratively classified into 3 types according to the background style characteristics of the emotion pattern template, namely frame background (border), background color (underlay color) and pattern background (pattern).
The text information refers to the information such as the title, the label and the internal text description of the template in the emotion pattern template, and the emotion bias of each visual template in the emotion pattern template library can be calculated by using the text information.
Emotion bias refers to the emotional tendency of the current emotion pattern template, which is calculated from the text information described above. When the emotion pattern template library is constructed, the emotion bias column is reserved; after the emotion bias of all pattern templates has been calculated, the column is filled in. The emotion bias provided in this embodiment is given in vector form, i.e. [pos, neg], where pos represents the positive emotion bias score of the template and neg the negative emotion bias score, and the two scores sum to 1.
Preferably, in this embodiment, the emotion bias of each emotion pattern template is calculated by means of fusion of text emotion and color emotion.
1. Text emotion score calculation mode
In the text emotion analysis a dictionary matching method is used. Illustratively, the dictionary may be a Chinese emotion-vocabulary ontology resource organized and annotated by a university information retrieval laboratory. The dictionary contains features such as part of speech, emotion type, emotion intensity and polarity, which describe the emotional characteristics of a word from different angles. The emotions in this Chinese ontology resource are divided into 7 major classes and 21 minor classes. Emotion intensity takes the values 1, 3, 5, 7 and 9, where 9 represents the greatest intensity and 1 the least. Each word has a polarity under each emotion type, where 0 represents neutral, 1 commendatory, 2 derogatory, and 3 both commendatory and derogatory.
In addition, to calculate the scores of emotion words in sentences, the emotion classes are divided into two main groups: joy (PA, PE) and liking (PD, PH, PG, PB, PK) belong to the positive emotion classes, with an emotion bias score of 1; anger (NA), sadness (NB, NJ, NH, PF), fear (NI, NC, NG), disgust (NE, ND, NN, NK, NL) and surprise (PC) are negative emotion classes, with an emotion bias score of -1. P stands for Positive and the following letter denotes the finer subclass; N stands for Negative likewise. The emotion dictionary selected in this embodiment therefore supports classification of the various emotion words and the corresponding subdivision of intensity. Table 4 gives an example of the improved emotion vocabulary ontology, which adds the emotion bias score to the existing format, as shown in the last column of Table 4.
Table 4 improved emotion vocabulary ontology examples
When the emotion bias of each emotion pattern template is calculated, the following operations are executed:
(1) Extracting the word segmentation in the text, and removing the stop words contained in the word segmentation; wherein, when calculating the emotion bias of each emotion pattern template, the text refers to the text information in the emotion pattern template; text herein refers to the entire news document when calculating the emotional bias of the news document;
(2) Matching the segmented words with the removed stop words with a corpus to obtain emotion words matched with the corpus, and calculating word frequency of the emotion words;
(3) Ordering emotion words from high to low according to word frequency, and selecting a certain number of emotion words with top ranking to form a hotness word stock; illustratively, top 60 emotion words may be selected. The 60 words are the most suitable word quantities obtained in the actual calculation process. Too much results in emotional redundancy, and too little is not representative. Based on emotion bias of each emotion word in the hotness word library, obtaining emotion bias scores of the text; the process may be based on the following emotion calculation formula:
For forward (positive) emotion words:
positive emotion word score = (whether a negative word appears in the preceding l words: -1 if yes, 1 if no) × the word's emotion intensity (1, 3, 5, 7 or 9)
For negative emotion words:
negative emotion word score = (-1) × (whether a negative word appears in the preceding l words: -1 if yes, 1 if no) × the word's emotion intensity (1, 3, 5, 7 or 9)
where l represents the preset number of preceding words checked for negation; verification on a large number of examples shows that taking l = 5 meets the accuracy requirement of the emotion bias result while keeping the computation timely.
Text forward emotion bias: tpos = S_pos / (S_pos + |S_neg|)
Text negative emotion bias: tneg = |S_neg| / (S_pos + |S_neg|)
where S_pos and S_neg denote the sums of the positive and negative emotion word scores over the hotness word stock.
the text emotion bias score may be expressed as [ tpos, tneg ].
In the above formulas the emotion score of a word in a sentence is obtained by multiplying the word's positive or negative polarity by its intensity; in the process the emotion bias of the word is reversed when a negating adverb or adjective precedes it. Because the words preceding an emotion word after segmentation do not necessarily carry negation, and in order not to miss negation information, a large number of experiments show that checking the five words preceding the emotion word for negation gives accurate judgments. A minimal sketch of this text-emotion scoring step is given below.
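As one possible rendering of the scoring rules above, the sketch below assumes the document is already segmented and stop-word filtered, that `emotion_lexicon` maps a word to its (polarity, intensity) pair from the annotated ontology, and that `negation_words` is a negation-word set; none of these names come from the patent, and the final normalisation follows the tpos/tneg formulas reconstructed above.

```python
# Sketch of the text-emotion scoring step (all identifiers are illustrative
# assumptions; the patent only defines the scoring rules, not an implementation).
from collections import Counter

L_WINDOW = 5      # number of preceding words checked for negation
TOP_K = 60        # size of the hotness word stock

def text_emotion_bias(tokens, emotion_lexicon, negation_words):
    """tokens: segmented, stop-word-filtered document; returns [tpos, tneg]."""
    freq = Counter(t for t in tokens if t in emotion_lexicon)
    hot_words = {w for w, _ in freq.most_common(TOP_K)}   # top-60 hotness word stock
    pos_sum, neg_sum = 0.0, 0.0
    for i, word in enumerate(tokens):
        if word not in hot_words:
            continue
        polarity, intensity = emotion_lexicon[word]        # polarity is +1 or -1
        window = tokens[max(0, i - L_WINDOW):i]
        negated = -1 if any(w in negation_words for w in window) else 1
        score = polarity * negated * intensity              # sign flips under negation
        if score >= 0:
            pos_sum += score
        else:
            neg_sum += -score
    total = pos_sum + neg_sum
    tpos = pos_sum / total if total else 0.5                 # neutral fallback
    return [tpos, 1.0 - tpos]
```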
For verifying the accuracy of the algorithm, we used a hotel comment dataset that had been labeled with positive and negative emotions, and a total of 12000 (6000 pos+6000 neg) hotel comments were tested, with the following results.
Table 5 hotel comments test results
In general a model reaching 80% accuracy has production value. The accuracy of this model reaches 0.8037; its true positive rate is relatively high and its true negative rate relatively low. The reason is that the dictionary consists mostly of general negative idioms and words but lacks negative review vocabulary. The result nevertheless shows that the model can judge sentence emotion with reasonable reliability, so it can be applied in the news template generation system to compute the emotion bias of the news text and of the text in the visual templates.
2. Calculation mode of color emotion score
In the color emotion analysis the same Chinese ontology emotion classification is used (i.e. the same Chinese ontology resource as for text emotion analysis). According to the dictionary's description it contains 7 major classes and 21 minor classes of emotion, each minor class with 5 levels of intensity, giving 21 × 5 emotion types in total once the emotion words are converted to vectors. Color emotion analysis is carried out on this basis, and its specific process is as follows:
(1) Selecting each emotion word in the dictionary respectively, and searching the emotion word in a search engine to obtain three pictures which can most represent emotion represented by the emotion word;
(2) Extracting the color with the largest proportion in each picture as the emotion representative color of the emotion word, and establishing an association between the emotion word and its emotion representative color; the association can be presented in the form of a color emotion database, as shown in Table 6;
TABLE 6 color emotion database
(3) Using the emotion representative color of the emotion word and the score of the corresponding emotion word as a data set of a color emotion model, and obtaining the relation between the emotion representative color of the emotion word and the score of the corresponding emotion word through SVM training;
To improve the fault tolerance of color emotion, the 7 emotion classes are likewise divided into two groups: joy and liking are positive emotions with a score of 1, while anger, sadness, fear, disgust and surprise are negative emotions with a score of 0. The processed data (i.e. the emotion representative colors of the emotion words and the scores of the corresponding emotion words) are split into a training set and a test set at a 7:3 ratio, and the training set is fed into a support vector machine (SVM) for training. The accuracy on the test set is 0.9894; the accuracy is this high because the color emotion data set contains only 315 samples (21 × 5 × 3), a small-scale data set.
(4) For ease of subsequent computation, the color emotion score of the emotion pattern template may be expressed as [cpos, cneg]. The color of the current emotion pattern template is input to the trained color emotion model to obtain the color emotion score corresponding to that color: if the classifier outputs 0, the color expresses negative emotion and the color emotion score of the current emotion pattern template is [0, 1]; if it outputs 1, the color expresses positive emotion and the score is [1, 0];
(5) After obtaining the text emotion score and the color emotion score of the emotion pattern template, obtaining the emotion bias score of the whole emotion pattern template through the following formula:
pos=α×tpos+(1-α)×cpos
neg=1-pos
the emotion bias score for the entire emotion pattern template is denoted as [ pos, neg ];
where α is the weight coefficient corresponding to the text emotion score tpos, with 0 ≤ α ≤ 1. The weight coefficient can be obtained by analysing the relative contribution of the text emotion score and the color emotion score; illustratively, the 3000 crawled templates are labelled subjectively, the accuracy is calculated, and the weight coefficient giving the highest accuracy is selected. Taking the news of the 60th anniversary of the founding of Communication University of China as an example, a line chart and a pie chart of the emotion analysis results are shown in Figs. 3 and 4; as the example shows, the readers' emotions are concentrated almost entirely between liking and joy. A compact sketch of the color-emotion classification and score fusion is given below.
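A compact illustration of the color-emotion classifier and the score fusion might look like the sketch below; it assumes colors are given as RGB triples, scikit-learn's SVC stands in for the SVM, and ALPHA = 0.5 is only a placeholder for the tuned weight coefficient.

```python
# Sketch: color-emotion classifier and score fusion (assumptions: colors are RGB
# triples, labels are 1 = positive / 0 = negative emotion, scikit-learn SVC is a
# stand-in for the SVM, and ALPHA is a placeholder for the tuned weight).
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

ALPHA = 0.5       # weight of the text emotion score; the patent tunes this value

def train_color_emotion_model(colors, labels):
    """colors: list of (r, g, b) representative colors; labels: 0/1 emotion class."""
    x_train, x_test, y_train, y_test = train_test_split(colors, labels, test_size=0.3)
    model = SVC()
    model.fit(x_train, y_train)
    print("test accuracy:", model.score(x_test, y_test))
    return model

def template_emotion_bias(tpos, dominant_color, color_model):
    cpos = float(color_model.predict([dominant_color])[0])   # 1 -> positive, 0 -> negative
    pos = ALPHA * tpos + (1 - ALPHA) * cpos                   # pos = a*tpos + (1-a)*cpos
    return [pos, 1.0 - pos]                                    # [pos, neg]
```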
Backfilling the emotion bias scores of the emotion pattern templates obtained through calculation to the corresponding positions of the emotion pattern template library, so that a complete emotion pattern template library is formed.
When a news document is received, the emotion bias score of the current news document is calculated according to the text emotion score calculation described above; the cosine similarity between the news document's emotion bias score and the emotion bias scores of all templates in the emotion pattern template library is then computed, and one emotion pattern template is selected at random from those whose similarity exceeds the similarity threshold as the emotion pattern template matching the emotion of this news document. In this embodiment every emotion pattern template meeting the similarity threshold is considered to match the emotion and to have high user satisfaction, so the random selection introduces some variation and ensures diversity in the generated graphic news. A minimal sketch of this matching step is given below.
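The template-matching step could be sketched as follows; the 2-dimensional [pos, neg] vectors are compared by cosine similarity, and the 0.95 threshold is an assumed illustrative value since the patent leaves the similarity threshold configurable.

```python
# Sketch: select an emotion pattern template whose [pos, neg] score is close to the
# news document's score (the threshold value is an assumption, not from the patent).
import math
import random

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def pick_emotion_template(news_score, template_library, threshold=0.95):
    """news_score: [pos, neg] of the document; template_library: dict name -> [pos, neg]."""
    candidates = [name for name, score in template_library.items()
                  if cosine_similarity(news_score, score) >= threshold]
    # Random choice among qualifying templates keeps the generated news varied.
    return random.choice(candidates) if candidates else None
```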
And rendering the basic visual design result by using the emotion pattern template to obtain a visual generation result of the graphic news. The rendering process can be implemented by adopting an existing rendering mode, and is not described in detail herein.
According to the visual automatic generation method of graphic news provided by this embodiment, the received news document and news pictures are processed to obtain the basic style and the emotion pattern template of the graphic news, and the construction and rendering of the graphic news are completed with them, so that automatic generation of graphic news is realized, the production efficiency of graphic news is effectively improved, the burden on news workers is reduced, and the requirement for automated visual generation of graphic news is met. The method is a visual generation approach based on image-text content; the final generation time is between 8 s and 13 s, and effect diagrams of visual automatic generation results obtained with this embodiment are shown in Figs. 5 and 6.
Example 2
In embodiment 2 of the present invention, a visual automatic generation system of graphic news is provided, a schematic structural diagram of which is shown in fig. 7; the system includes: the graphic news content receiving module, used for receiving news documents and news pictures of graphic news; the news type determining module, used for determining the news type of the graphic news based on the news document; the basic visual layer processing module, used for acquiring a basic visual feature set corresponding to the news type in a feature optimization individual library, taking the basic visual feature set as the basic style of the graphic news, and rendering the news document and the news pictures by using the basic style to obtain a basic visual design result of the graphic news; and the detailed visual layer processing module, used for acquiring the emotion of the news document, selecting an emotion pattern template matched with the emotion of the news document from an emotion pattern template library, and rendering the basic visual design result by using the emotion pattern template to obtain a visual generation result of the graphic news.
The specific implementation process of each module in the embodiment of the present system (for example, the process of determining the news type by the news type determining module, the process of obtaining the basic visual feature set by the basic visual layer processing module, and obtaining the basic visual design result) may be referred to the above embodiment of the method, which is not described herein again.
Since the principle of the embodiment is the same as that of the embodiment of the method, the system also has the corresponding technical effects of the embodiment of the method.
Those skilled in the art will appreciate that all or part of the flow of the methods of the embodiments described above may be accomplished by way of a computer program to instruct associated hardware, where the program may be stored on a computer readable storage medium. Wherein the computer readable storage medium is a magnetic disk, an optical disk, a read-only memory or a random access memory, etc.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims (8)

1. The visual automatic generation method of the graphic news is characterized by comprising the following steps:
Receiving news documents and news pictures based on graphic news; the graphic news refers to news materials provided by a user and to be subjected to visual generation processing;
Determining the news type of the graphic news based on the news document;
Acquiring a basic visual feature group corresponding to the news type in a feature optimization individual library, and taking the basic visual feature group as a basic style of the graphic news; rendering the news document and the news picture by using the basic style to obtain a basic visual design result of the graphic news; the basic visual feature types in the basic visual feature set include: cool and warm tone, structural style, font size, linking style, title style and main content style;
Acquiring emotion of the news document, and selecting an emotion pattern template matched with the emotion of the news document from an emotion pattern template library; the emotion pattern template at least comprises styles, colors, graphic typesetting, background types, text information and emotion tendencies; rendering the basic visual design result by using the emotion pattern template to obtain a visual generation result of the graphic news;
the feature optimization individual library is established by the following modes:
Obtaining news document samples of all news types, and generating a news document sample set;
determining user satisfaction corresponding to each news document sample in the news document sample set;
establishing a mathematical model between a basic visual characteristic set of a news document sample and user satisfaction;
Inputting the established mathematical models between the basic visual feature sets under the various news types and user satisfaction into a genetic algorithm as fitness functions, obtaining the optimal basic visual feature set corresponding to each news type after the genetic algorithm has been trained to convergence, and forming the feature optimization individual library;
The emotion pattern template matched with the emotion of the news document is selected from the emotion pattern template library, and the following operations are executed:
acquiring emotion tendency scores of current news documents;
Calculating cosine similarity between the emotion tendency score of the news document and the emotion tendency scores of all emotion pattern templates in the emotion pattern template library, and randomly selecting a certain emotion pattern template from emotion pattern templates with similarity higher than a similarity threshold as an emotion pattern template matched with emotion of the news document at the time;
the emotion tendency score [pos, neg] of the emotion pattern template is given by:
pos=α×tpos+(1-α)×cpos
neg=1-pos
Wherein tpos represents the text forward emotion bias in the text emotion score of the text information; cpos represents the color forward emotion bias in the color emotion score of the color; α represents the weight coefficient corresponding to the text emotion score tpos, and 0 ≤ α ≤ 1.
2. The visual automatic generation method of graphic news according to claim 1, wherein a mathematical model between the basic visual feature set of a news document sample and user satisfaction is established by multiple linear regression analysis:
y = Σ(i=1..m) Σ(j=1..n) a_ij × x_ij
wherein m represents the number of basic visual feature types in the basic visual feature set; n represents the number of categories under each basic visual feature type; x_ij represents the feature value of the j-th category under the i-th basic visual feature type, where 0 indicates that the feature is absent and 1 indicates that it is present; a_ij represents the coefficient of the j-th category under the current i-th basic visual feature type; y represents the user satisfaction of the current news document sample;
the value of a_ij is estimated by a multiple linear regression algorithm, thereby establishing a mathematical model between user satisfaction and the basic visual feature set.
3. The visual automatic generation method of graphic news according to claim 2, wherein tpos is acquired by performing the following operations:
extracting word segmentation in the text information in the emotion pattern template, and removing stop words contained in the word segmentation;
Matching the segmented words with the removed stop words with a corpus to obtain emotion words matched with the corpus, and calculating word frequency of the emotion words;
Ordering emotion words from high to low according to word frequency, and selecting a certain number of emotion words with top ranking to form a hotness word stock;
For forward emotion words in the hotness word stock:
positive emotion word score = (whether a negative word appears in the preceding l words: -1 if yes, 1 if no) × the word's emotion intensity (1, 3, 5, 7 or 9)
For negative emotion words in the hotness word stock:
negative emotion word score = (-1) × (whether a negative word appears in the preceding l words: -1 if yes, 1 if no) × the word's emotion intensity (1, 3, 5, 7 or 9)
Text forward emotion bias: tpos = S_pos / (S_pos + |S_neg|),
wherein S_pos and S_neg denote the sums of the positive and negative emotion word scores over the hotness word stock, and l represents the preset number of preceding words checked for negation.
4. A visual automatic generation method of graphic news according to claim 3, characterized in that the following operations are performed to obtain cpos:
Inputting the colors in the emotion pattern template to a color emotion model, and classifying the color emotion model to obtain color emotion scores corresponding to the colors in the emotion pattern template;
And obtaining the color forward emotion bias cpos of the emotion pattern template based on the color emotion scores corresponding to the colors.
5. The method for visual automatic generation of teletext according to claim 4, wherein the color emotion model is created by performing the following operations:
Determining emotion representative colors of emotion words in a corpus;
using emotion representative colors of emotion words and scores of the corresponding emotion words as a data set of the color emotion model;
Training the data set of the color emotion model through an SVM to obtain the relation between emotion representative colors of emotion words in the color emotion model and scores of the corresponding emotion words.
6. The method for visual automatic generation of teletext according to any one of claims 1-5, wherein the news type of the teletext is determined by performing the following operations:
Extracting word segmentation vector feature values of the news documents, inputting the word segmentation vector feature values of the news documents into a news classification model, and classifying the news classification model to obtain the news types of the graphic news.
7. The method for visual automatic generation of teletext news according to claim 6, wherein the news classification model is created by performing the following operations:
taking a news document sample containing different news types as a classification data set;
Assigning a tag value of a corresponding news type to each news document sample;
extracting word segmentation vector feature values of all the news document samples, inputting the word segmentation vector feature values and corresponding tag values of all the news document samples into a news classification model, performing Bayesian classification, and training to obtain a final news classification model.
8. A visual automatic generation system of graphic news, characterized in that the system is obtained based on the visual automatic generation method of graphic news according to any one of claims 1-7, the system comprising:
the image-text news content receiving module is used for receiving news documents and news pictures based on image-text news;
The news type determining module is used for determining the news type of the graphic news based on the news document;
The basic visual layer processing module is used for acquiring a basic visual feature group corresponding to the news type in a feature optimization individual library, and taking the basic visual feature group as a basic style of the graphic news; rendering the news document and the news picture by using the basic style to obtain a basic visual design result of the graphic news; the basic visual feature types in the basic visual feature set include: cool and warm tone, structural style, font size, linking style, title style and main content style;
The detailed visual layer processing module is used for acquiring emotion of the news document and selecting an emotion pattern template matched with the emotion of the news document from an emotion pattern template library; the emotion pattern template at least comprises styles, colors, graphic typesetting, background types, text information and emotion tendencies; and rendering the basic visual design result by using the emotion pattern template to obtain a visual generation result of the graphic news.
CN202010392691.XA 2020-05-11 2020-05-11 Visual automatic generation method and system for graphic news Active CN111583363B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010392691.XA CN111583363B (en) 2020-05-11 2020-05-11 Visual automatic generation method and system for graphic news

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010392691.XA CN111583363B (en) 2020-05-11 2020-05-11 Visual automatic generation method and system for graphic news

Publications (2)

Publication Number Publication Date
CN111583363A CN111583363A (en) 2020-08-25
CN111583363B true CN111583363B (en) 2024-05-03

Family

ID=72122995

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010392691.XA Active CN111583363B (en) 2020-05-11 2020-05-11 Visual automatic generation method and system for graphic news

Country Status (1)

Country Link
CN (1) CN111583363B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113553812A (en) * 2021-06-22 2021-10-26 北京来也网络科技有限公司 News processing method and device combining RPA and AI
CN116776228B (en) * 2023-08-17 2023-10-20 合肥工业大学 Power grid time sequence data decoupling self-supervision pre-training method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776523A (en) * 2017-01-22 2017-05-31 百度在线网络技术(北京)有限公司 News speed report generation method and device based on artificial intelligence
CN108052507A (en) * 2017-12-29 2018-05-18 浙江大学城市学院 A kind of city management information the analysis of public opinion system and method
CN109978631A (en) * 2019-04-03 2019-07-05 包永祥 A collective advertising distribution platform
CN110309415A (en) * 2018-03-16 2019-10-08 广东神马搜索科技有限公司 News information generation method, device and electronic device-readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090235312A1 (en) * 2008-03-11 2009-09-17 Amir Morad Targeted content with broadcast material


Also Published As

Publication number Publication date
CN111583363A (en) 2020-08-25

Similar Documents

Publication Publication Date Title
CN111737495B (en) Middle-high-end talent intelligent recommendation system and method based on domain self-classification
CN110633373B (en) Automobile public opinion analysis method based on knowledge graph and deep learning
CN111008278B (en) Content recommendation method and device
CN110175227B (en) Dialogue auxiliary system based on team learning and hierarchical reasoning
CN105608477B (en) Method and system for matching portrait with job position
CN107943784B (en) Relationship extraction method based on generation of countermeasure network
CN110532379B (en) Electronic information recommendation method based on LSTM (least Square TM) user comment sentiment analysis
CN112395410B (en) Entity extraction-based industry public opinion recommendation method and device and electronic equipment
CN111858896B (en) Knowledge base question-answering method based on deep learning
Sinar Data visualization
CN111583363B (en) Visual automatic generation method and system for graphic news
CN115564393A (en) Recruitment requirement similarity-based job recommendation method
CN112231593B (en) Financial information intelligent recommendation system
CN117592489B (en) Method and system for realizing electronic commerce commodity information interaction by using large language model
CN110781300A (en) Tourism resource culture characteristic scoring algorithm based on Baidu encyclopedia knowledge graph
CN114443846A (en) Classification method and device based on multi-level text abnormal composition and electronic equipment
Lazarevic et al. Machine learning driven course recommendation system
CN117235253A (en) Truck user implicit demand mining method based on natural language processing technology
Zhu A book recommendation algorithm based on collaborative filtering
CN109254993B (en) Text-based character data analysis method and system
CN112733021A (en) Knowledge and interest personalized tracing system for internet users
CN111967251A (en) Intelligent customer sound insight system
CN111339428A (en) Interactive personalized search method based on limited Boltzmann machine drive
Li et al. Deep recommendation based on dual attention mechanism
Yang Exploring External Knowledge for Accurate modeling of Visual and Language Problems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant