CN116306506A - Intelligent mail template method based on content identification - Google Patents

Info

Publication number
CN116306506A
CN116306506A CN202211103395.9A CN202211103395A CN116306506A CN 116306506 A CN116306506 A CN 116306506A CN 202211103395 A CN202211103395 A CN 202211103395A CN 116306506 A CN116306506 A CN 116306506A
Authority
CN
China
Prior art keywords
mail
feature
text
information
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211103395.9A
Other languages
Chinese (zh)
Inventor
严峻
孟祥磊
侯颖
张威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Best Information Technology Co ltd
Original Assignee
Wuhan Best Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Best Information Technology Co ltd
Priority to CN202211103395.9A
Publication of CN116306506A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/107 Computer-aided management of electronic mailing [e-mailing]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an intelligent mail template method based on content identification and relates to the technical field of computer application. The method comprises the following steps: collecting the basic structure of an HTML mail of a Web page and extracting the text content of the Web page; obtaining a labeling training result and a feature training result, and establishing a corresponding mail feature model; creating a mail template file to be output so as to generate a preset mail template file; identifying the text content of a user and inputting it into the mail feature model; outputting a mail template file from the mail feature model; and performing program format conversion on the mail template file to generate a target mail template. By training the mail feature model on collected HTML mails of Web pages, creating the mail template file to be output to generate the preset mail template file, and inputting the identified text content of the user into the mail feature model to generate the target mail template, the method improves the efficiency and accuracy of mail generation.

Description

Intelligent mail template method based on content identification
Technical Field
The invention belongs to the technical field of computer application, and particularly relates to an intelligent mail template method based on content identification.
Background
As computer application service scenarios become more diverse and fine-grained, each scenario needs to monitor a large number of service indexes. The monitored key indexes are read and summarized systematically, and the resulting daily data analysis reports are pushed to a mail terminal by mail.
At present, a personalized mail may contain elements such as a picture, a table and a title. The size and brightness of the picture, the attributes of the table together with the thickness and color of its border and shading, and the font size, font color and line spacing of the title all have to be adjusted according to how the mail is to be presented. A developer therefore has to write different Java code for each case and finally splice the pieces together in order to form the required mail and meet the user's service requirement. In other words, traditional mail sending is implemented by splicing HTML mail code by means of Jmail.
When HTML mail code is spliced by means of Jmail, different Java code has to be developed, so code redundancy is high and the code is not modularized and is difficult to manage. When a mail with a different display mode has to be added, the whole HTML code has to be rewritten, so mail generation efficiency is low.
Disclosure of Invention
The invention aims to provide an intelligent mail template method based on content identification. A mail feature model is trained on collected HTML mails of Web pages, a mail template file to be output is created to generate a preset mail template file, and the identified text content of a user is input into the mail feature model to generate a target mail template, thereby solving the problems of low mail generation efficiency and inaccurate mail generation in the prior art.
In order to solve the technical problems, the invention is realized by the following technical scheme:
the invention relates to an intelligent mail template method based on content identification, which comprises the following steps:
step S1: collecting the basic structure of an HTML mail of a Web page, and extracting the text content of the Web page;
step S2: preprocessing the extracted mail information;
step S3: marking, training and identifying the processed mail information through a deep learning algorithm;
step S4: the deep learning algorithm performs labeling training and feature training to obtain labeling training results and feature training results;
step S5: establishing a corresponding mail feature model according to the labeling training result and the feature training result;
step S6: creating a mail template file to be output so as to generate a preset mail template file;
step S7: identifying text content of a user and inputting a mail characteristic model;
step S8: outputting a mail template file by the mail characteristic model;
step S9: and performing program format conversion on the mail template file to generate a target mail template.
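Step S1 could be sketched as follows. This is only a minimal illustration that assumes Python's standard `html.parser` module; the class `MailStructureParser` and the function `collect_mail` are hypothetical names, and the method itself does not prescribe how the structure and text are collected.

```python
from html.parser import HTMLParser


class MailStructureParser(HTMLParser):
    """Collects the tag skeleton and the visible text of an HTML mail (hypothetical helper)."""

    SKIPPED = {"script", "style"}  # content that is not visible mail text

    def __init__(self):
        super().__init__()
        self.tags = []        # opening tags in document order (the basic structure)
        self.text_parts = []  # visible text fragments
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        stripped = data.strip()
        if stripped and self._skip_depth == 0:
            self.text_parts.append(stripped)


def collect_mail(html):
    """Return (tag skeleton, extracted text) for one HTML mail."""
    parser = MailStructureParser()
    parser.feed(html)
    parser.close()
    return parser.tags, " ".join(parser.text_parts)


sample = (
    "<html><body><h1>Daily report</h1>"
    "<table><tr><td>Orders</td><td>42</td></tr></table></body></html>"
)
tags, text = collect_mail(sample)
```

Here `tags` holds the element skeleton of the mail and `text` holds its text content, which are the two outputs step S1 names.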
As a preferable technical solution, in the step S2, preprocessing the mail information includes the following steps:
step S21: collecting and training mail information;
step S22: discretizing the mail text and establishing a mail element library;
step S23: text feature extraction and text vectorization representation;
step S24: and carrying out weighted representation on words and elements corresponding to the text, and representing the text in a vector form.
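As a rough sketch of steps S22 and S24, the snippet below splits mail text into tokens and represents each text as a count vector over a shared vocabulary. The regular-expression tokenizer is an assumption that only suits space-delimited text (Chinese text would need a word segmenter), and all function names are hypothetical.

```python
import re
from collections import Counter


def discretize(text):
    """Split mail text into lowercase tokens (a stand-in for word segmentation)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(token_lists):
    """Map each distinct token to a fixed vector position, in sorted order."""
    vocab = sorted({tok for toks in token_lists for tok in toks})
    return {tok: idx for idx, tok in enumerate(vocab)}


def to_count_vector(tokens, vocabulary):
    """Represent one text as raw term counts, one entry per vocabulary token."""
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocabulary]


mails = ["Daily sales report", "Weekly sales summary report report"]
token_lists = [discretize(m) for m in mails]
vocabulary = build_vocabulary(token_lists)
vectors = [to_count_vector(toks, vocabulary) for toks in token_lists]
```

The raw counts are only a starting point; a weighting scheme such as TF-IDF would replace them with weights before training.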
As a preferable technical solution, in the step S21, the mail information needs to be cleaned when it is collected and trained; during cleaning, feature items are selected according to the magnitude of the statistical value χ², which is expressed as follows:
\chi^2(t, c) = \frac{N (AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)}
wherein t is a feature item, c is a category of text, N is the total number of texts in the training set, A represents the frequency of texts that contain the feature item t and belong to the category c, B represents the frequency of texts that contain the feature item t and do not belong to the category c, C represents the frequency of texts that do not contain the feature item t and belong to the category c, and D represents the frequency of texts that do not contain the feature item t and do not belong to the category c.
As a preferred embodiment, in the step S22, the mail element includes one or more of a picture, a table, and a title; when the mail element is a picture, the attribute parameters comprise the pixels, resolution, size, color, tone, saturation, brightness and gray value of the picture; when the mail element is a table, the attribute parameters comprise the margin, border and shading of the table and the colors and line thicknesses of the border and shading; when the mail element is a title, the attribute parameters include the title font, font size, line spacing and font color.
As a preferable technical solution, in the step S23, a TF-IDF method is adopted for text feature extraction and text vectorization; the weighting function of the TF-IDF is expressed as:
w_{ij} = tf_{ij} \cdot idf_{i}
The above formula is expressed in vectorized form as:
w_{ij} = tf_{ij} \times \log(N / n_i);
wherein tf represents the word frequency of the feature word, idf represents the inverse document frequency of the feature word, N represents the number of texts in the training set, and n_i represents the total number of texts in which the feature item t appears.
In step S6, the mail feature model performs corresponding matching in a preset information recommendation library, determines element information corresponding to the information element library in the information recommendation library, and sends the recommendation information to a preset mail template file.
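The text does not say how the matching in the information recommendation library is carried out. Purely as an illustration, one common choice is to compare feature vectors by cosine similarity and pick the closest library entry; the library contents and the names `cosine_similarity` and `best_match` below are assumptions, not part of the described method.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query_vector, library):
    """Return the name of the library entry most similar to the query vector."""
    return max(library, key=lambda name: cosine_similarity(query_vector, library[name]))


# Hypothetical recommendation library: element name -> feature vector.
recommendation_library = {
    "table_report": [1.0, 0.0, 2.0],
    "picture_banner": [0.0, 3.0, 0.0],
}
match = best_match([0.9, 0.1, 1.8], recommendation_library)
```

The matched entry would then determine which element information is sent to the preset mail template file.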
As a preferable technical solution, in step S9, the element data input by the user is processed in the mail generating process to generate the element display information.
The invention has the following beneficial effects:
the method trains a mail feature model on collected HTML mails of Web pages, creates the mail template file to be output to generate the preset mail template file, and inputs the identified text content of the user into the mail feature model to generate the target mail template, thereby improving the efficiency and accuracy of mail generation.
Of course, it is not necessary for any one product to practice the invention to achieve all of the advantages set forth above at the same time.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for intelligent mail templates based on content identification according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the invention discloses an intelligent mail template method based on content identification, which comprises the following steps:
step S1: collecting the basic structure of an HTML mail of a Web page, and extracting the text content of the Web page;
step S2: preprocessing the extracted mail information;
step S3: marking, training and identifying the processed mail information through a deep learning algorithm;
step S4: the deep learning algorithm performs labeling training and feature training to obtain labeling training results and feature training results;
step S5: establishing a corresponding mail feature model according to the labeling training result and the feature training result;
step S6: creating a mail template file to be output so as to generate a preset mail template file; the mail feature model carries out corresponding matching in a preset information recommendation library, determines element information corresponding to an information element library in the information recommendation library, and sends the recommendation information to a preset mail template file;
step S7: identifying text content of a user and inputting a mail characteristic model;
step S8: outputting a mail template file by the mail characteristic model;
step S9: and performing program format conversion on the mail template file to generate a target mail template, processing element data input by a user in the mail generation process to generate element display information, generating the mail template at a terminal, and enabling the user to briefly modify the generated template according to actual needs until the mail effect required by the user is met.
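To illustrate how a preset template file (step S6) can be filled with element data supplied by the user (step S9) instead of splicing HTML in code, here is a minimal sketch that assumes Python's standard `string.Template`. The template text, the placeholder names and the function `render_mail` are hypothetical.

```python
from html import escape
from string import Template

# Hypothetical preset template: placeholders stand for element attributes and content.
PRESET_TEMPLATE = Template(
    '<h1 style="font-size:${title_size}px;color:${title_color}">${title}</h1>\n'
    '<table border="${border}">${rows}</table>'
)


def render_rows(rows):
    """Turn a list of rows into <tr>/<td> markup, escaping every cell."""
    return "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )


def render_mail(title, rows, title_size=20, title_color="#333333", border=1):
    """Fill the preset template with user-supplied element data."""
    return PRESET_TEMPLATE.substitute(
        title=escape(title),
        title_size=title_size,
        title_color=title_color,
        border=border,
        rows=render_rows(rows),
    )


html_mail = render_mail("Daily report", [["Orders", 42], ["Refunds", 3]])
```

Changing the presentation then means editing the template or the attribute values rather than rewriting the code that builds the HTML.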
In step S2, preprocessing the mail information includes the steps of:
step S21: collecting and training mail information;
step S22: discretizing the mail text and establishing a mail element library;
step S23: text feature extraction and text vectorization representation;
step S24: and carrying out weighted representation on words and elements corresponding to the text, and representing the text in a vector form.
In step S21, the mail information needs to be cleaned when it is collected and trained; during cleaning, feature items are selected according to the magnitude of the statistical value χ², which is expressed as follows:
\chi^2(t, c) = \frac{N (AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)}
where t is a feature item, c is a category of text, N is the total number of texts in the training set, A represents the frequency of texts that contain the feature item t and belong to the category c, B represents the frequency of texts that contain the feature item t and do not belong to the category c, C represents the frequency of texts that do not contain the feature item t and belong to the category c, and D represents the frequency of texts that do not contain the feature item t and do not belong to the category c; therefore N = A + B + C + D;
the degree of association between a feature item and a category depends on the magnitude of the statistical value χ²: the higher the χ² value of a feature item and a category, the greater their relevance and the more category-distinguishing information the feature item contains, and vice versa; the statistic χ² fully takes into account and describes the relation between feature items and text categories, greatly improves the precision of feature extraction, and the algorithm is simple and easy to implement.
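The χ² selection can be sketched in a few lines. The snippet assumes the standard chi-square feature-selection statistic, N(AD − BC)² / ((A + C)(B + D)(A + B)(C + D)), computed from the four counts A, B, C and D defined above; the function names and the sample counts are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of a feature item and a category from the counts A, B, C, D."""
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # degenerate table: the term or the category never varies
    return n * (a * d - b * c) ** 2 / denominator


def select_features(counts, k):
    """Keep the k feature items with the highest chi-square value."""
    scores = {term: chi_square(*abcd) for term, abcd in counts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical (A, B, C, D) counts for three feature items and one category.
counts = {
    "invoice": (40, 10, 10, 40),
    "the": (25, 25, 25, 25),
    "meeting": (30, 20, 20, 30),
}
selected = select_features(counts, 2)
```

A term that is spread evenly over both categories, such as "the" in this sample, scores 0 and is dropped, which matches the statement that a higher value means more category-distinguishing information.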
In step S22, the mail element includes one or more of a picture, a table, and a title; when the mail element is a picture, the attribute parameters comprise the pixels, resolution, size, color, tone, saturation, brightness and gray value of the picture; when the mail element is a table, the attribute parameters comprise the margin, border and shading of the table and the colors and line thicknesses of the border and shading; when the mail element is a title, the attribute parameters include the title font, font size, line spacing, and font color.
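One possible shape for an entry in the mail element library is sketched below: each element has a kind and a set of attribute parameters restricted to the ones listed for that kind. The class `MailElement`, the attribute key spellings and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Attribute parameters permitted per element kind, following the list in step S22.
ALLOWED_ATTRIBUTES = {
    "picture": {"pixels", "resolution", "size", "color", "tone",
                "saturation", "brightness", "gray_value"},
    "table": {"margin", "border", "shading", "border_color",
              "border_thickness", "shading_color", "shading_thickness"},
    "title": {"font", "font_size", "line_spacing", "font_color"},
}


@dataclass
class MailElement:
    """One entry of the mail element library (hypothetical structure)."""
    kind: str
    attributes: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.kind not in ALLOWED_ATTRIBUTES:
            raise ValueError(f"unknown element kind: {self.kind}")
        unknown = set(self.attributes) - ALLOWED_ATTRIBUTES[self.kind]
        if unknown:
            raise ValueError(f"unsupported attributes for {self.kind}: {sorted(unknown)}")


element_library = [
    MailElement("title", {"font": "Arial", "font_size": 18, "font_color": "#000000"}),
    MailElement("table", {"margin": 4, "border_thickness": 1, "shading_color": "#f2f2f2"}),
    MailElement("picture", {"size": (600, 200), "brightness": 0.8}),
]
```

Validating attributes when an element is created keeps a title from silently carrying, say, a brightness value that only applies to pictures.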
In step S23, a TF-IDF method is adopted for text feature extraction and text vectorization; the weighting function of TF-IDF is expressed as:
w_{ij} = tf_{ij} \cdot idf_{i}
The above formula is expressed in vectorized form as:
w_{ij} = tf_{ij} \times \log(N / n_i);
where tf represents the word frequency of the feature word, idf represents the inverse document frequency of the feature word, N represents the number of texts in the training set, and n_i represents the total number of texts in which the feature item t appears;
the weighting rule of the TF-IDF method is as follows: if a word appears many times in one text, it also tends to appear many times in other texts of the same kind; however, the words with a strong ability to distinguish between text categories are often those that appear in only a few texts. The weighting function is therefore called the term frequency-inverse document frequency function.
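The weighting w = tf × log(N / n) can be computed directly, as in the following sketch. It uses raw counts for tf and the natural logarithm, both of which are assumptions since the text fixes neither; the function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, with weight = tf * log(N / n)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        tf = Counter(tokens)
        weights.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in tf.items()
        })
    return weights


docs = [
    ["sales", "report", "report"],
    ["sales", "meeting"],
    ["sales", "invoice"],
]
weights = tf_idf(docs)
```

In this sample, "sales" occurs in every document, so its weight is 0 everywhere, while "report" occurs twice in a single document and receives the largest weight. That is the behaviour the weighting rule describes.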
It should be noted that, in the above system embodiment, each unit included is only divided according to the functional logic, but not limited to the above division, so long as the corresponding function can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
In addition, those skilled in the art will appreciate that all or part of the steps in implementing the methods of the embodiments described above may be implemented by a program to instruct related hardware, and the corresponding program may be stored in a computer readable storage medium.
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. The preferred embodiments are not exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.

Claims (7)

1. The intelligent mail template method based on the content identification is characterized by comprising the following steps:
step S1: collecting the basic structure of an HTML mail of a Web page, and extracting the text content of the Web page;
step S2: preprocessing the extracted mail information;
step S3: marking, training and identifying the processed mail information through a deep learning algorithm;
step S4: the deep learning algorithm performs labeling training and feature training to obtain labeling training results and feature training results;
step S5: establishing a corresponding mail feature model according to the labeling training result and the feature training result;
step S6: creating a mail template file to be output so as to generate a preset mail template file;
step S7: identifying text content of a user and inputting a mail characteristic model;
step S8: outputting a mail template file by the mail characteristic model;
step S9: and performing program format conversion on the mail template file to generate a target mail template.
2. The intelligent mail template method based on content recognition according to claim 1, wherein the step S2 of preprocessing the mail message comprises the steps of:
step S21: collecting and training mail information;
step S22: discretizing the mail text and establishing a mail element library;
step S23: text feature extraction and text vectorization representation;
step S24: and carrying out weighted representation on words and elements corresponding to the text, and representing the text in a vector form.
3. The intelligent mail template method based on content recognition according to claim 2, wherein in the step S21, the mail information needs to be cleaned when it is collected and trained; during cleaning, feature items are selected according to the magnitude of the statistical value χ², which is expressed as follows:
\chi^2(t, c) = \frac{N (AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)}
wherein t is a feature item, c is a category of text, N is the total number of texts in the training set, A represents the frequency of texts that contain the feature item t and belong to the category c, B represents the frequency of texts that contain the feature item t and do not belong to the category c, C represents the frequency of texts that do not contain the feature item t and belong to the category c, and D represents the frequency of texts that do not contain the feature item t and do not belong to the category c.
4. The intelligent mail template method based on content recognition according to claim 2, wherein in step S22, the mail elements include one or more of a picture, a table, and a title; when the mail element is a picture, the attribute parameters comprise the pixels, resolution, size, color, tone, saturation, brightness and gray value of the picture; when the mail element is a table, the attribute parameters comprise the margin, border and shading of the table and the colors and line thicknesses of the border and shading; when the mail element is a title, the attribute parameters include the title font, font size, line spacing and font color.
5. The intelligent mail template method based on content recognition according to claim 2, wherein in step S23, the TF-IDF method is used for text feature extraction and text vectorization; the weighting function of the TF-IDF is expressed as:
w_{ij} = tf_{ij} \cdot idf_{i}
The above formula is expressed in vectorized form as:
w_{ij} = tf_{ij} \times \log(N / n_i);
wherein tf represents the word frequency of the feature word, idf represents the inverse document frequency of the feature word, N represents the number of texts in the training set, and n_i represents the total number of texts in which the feature item t appears.
6. The intelligent mail template method based on content recognition according to claim 1, wherein in step S6, the mail feature model performs corresponding matching in a preset information recommendation library, determines element information corresponding to the information element library in the information recommendation library, and sends the recommendation information to a preset mail template file.
7. The intelligent mail template method based on content recognition according to claim 1, wherein in step S9, the element data input by the user is processed in the mail generation process to generate element presentation information.
CN202211103395.9A 2022-09-09 2022-09-09 Intelligent mail template method based on content identification Pending CN116306506A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211103395.9A CN116306506A (en) 2022-09-09 2022-09-09 Intelligent mail template method based on content identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211103395.9A CN116306506A (en) 2022-09-09 2022-09-09 Intelligent mail template method based on content identification

Publications (1)

Publication Number Publication Date
CN116306506A true CN116306506A (en) 2023-06-23

Family

ID=86783881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211103395.9A Pending CN116306506A (en) 2022-09-09 2022-09-09 Intelligent mail template method based on content identification

Country Status (1)

Country Link
CN (1) CN116306506A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117240819A (en) * 2023-11-10 2023-12-15 天津异乡好居网络科技股份有限公司 Mail configuration method, device, equipment and computer readable storage medium
CN117240819B (en) * 2023-11-10 2024-02-09 天津异乡好居网络科技股份有限公司 Mail configuration method, device, equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN107766371B (en) Text information classification method and device
CN107807968B (en) Question answering device and method based on Bayesian network and storage medium
CN109685056A (en) Obtain the method and device of document information
CN106383875A (en) Artificial intelligence-based man-machine interaction method and device
CN112052414A (en) Data processing method and device and readable storage medium
CN104881428B (en) A kind of hum pattern extraction, search method and the device of hum pattern webpage
CN113627797B (en) Method, device, computer equipment and storage medium for generating staff member portrait
CN113742592A (en) Public opinion information pushing method, device, equipment and storage medium
CN113255331B (en) Text error correction method, device and storage medium
CN116306506A (en) Intelligent mail template method based on content identification
CN108595466B (en) Internet information filtering and internet user information and network card structure analysis method
CN117034948B (en) Paragraph identification method, system and storage medium based on multi-feature self-adaptive fusion
CN112818693A (en) Automatic extraction method and system for electronic component model words
CN113312924A (en) Risk rule classification method and device based on NLP high-precision analysis label
CN115017271B (en) Method and system for intelligently generating RPA flow component block
CN110297965B (en) Courseware page display and page set construction method, device, equipment and medium
CN114842982B (en) Knowledge expression method, device and system for medical information system
CN115130437B (en) Intelligent document filling method and device and storage medium
CN114331932A (en) Target image generation method and device, computing equipment and computer storage medium
Cho et al. Design of image generation system for DCGAN-based kids' book text
CN114021004A (en) Method, device and equipment for recommending science similar questions and readable storage medium
CN113569741A (en) Answer generation method and device for image test questions, electronic equipment and readable medium
US20240086452A1 (en) Tracking concepts within content in content management systems and adaptive learning systems
CN115376153B (en) Contract comparison method, device and storage medium
CN115358186B (en) Generating method and device of slot label and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination