CN113935306A - Method and device for processing advertising copy template

Info

Publication number: CN113935306A
Application number: CN202111072170.7A
Authority: CN
Inventors: 黄于晏; 王秋文; 孔晓晴; 陈莹莹
Original assignee: Youmi Technology Co ltd
Current assignee: Youmi Technology Co ltd
Priority date: 2021-09-14
Filing date: 2021-09-14
Publication date: 2022-01-14

Abstract

The invention discloses a method and a device for processing an advertisement copy template. The method comprises the following steps: acquiring a plurality of advertisement copies; screening out at least one similar advertisement copy group from the plurality of advertisement copies based on similarity information between any two advertisement copies; and performing character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template. The invention can therefore determine similar copy groups based on the similarity between advertisement copies and determine the advertisement copy template corresponding to each group of similar copies based on character matching. Determining the template from the correlation between copies helps improve the reliability and accuracy of the generated template and provides an accurate and reliable template for subsequent generation or prediction of advertisement copy.

Description

Method and device for processing advertising copy template

Technical Field

The invention relates to the technical field of natural language processing, and in particular to a method and a device for processing an advertisement copy template.

Background

With the development of internet technology, online advertisements have become increasingly complex and diverse. To improve the efficiency of advertisement content generation, the prior art has begun to introduce natural language processing and neural network technology into algorithms for automatically generating advertisements. However, when determining a template for generating advertisement copy, the prior art generally relies on manually selecting representative forms of advertisement copy and does not consider the correlation between advertisement copies; this manual approach is inefficient and inaccurate. The prior art is therefore deficient, and a solution is urgently needed.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a method and a device for processing an advertisement copy template, which can determine the advertisement copy template based on the correlation between copies, help improve the reliability and accuracy of the generated template, and provide an accurate and reliable template for subsequent generation or prediction of advertisement copy.

In order to solve the above technical problem, a first aspect of the present invention discloses an advertisement copy template processing method, including:

acquiring a plurality of advertisement copies;

screening out at least one similar advertisement copy group from the plurality of advertisement copies based on similarity information between any two advertisement copies;

and performing character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template.

As an optional implementation, in the first aspect of the present invention, the screening out of at least one similar advertisement copy group from the plurality of advertisement copies based on the similarity information between any two advertisement copies includes:

determining, for any two advertisement copies in the plurality of advertisement copies, similarity information between the two advertisement copies;

judging whether the similarity information is greater than a first similarity threshold;

and when the judgment result is yes, classifying the two advertisement copies into the same similar advertisement copy group.

As an optional implementation, in the first aspect of the present invention, the similarity information includes at least one of cosine similarity information, Euclidean distance similarity information, Jaccard distance similarity information, edit distance similarity information, Chebyshev distance similarity information, Hamming distance similarity information, Mahalanobis distance similarity information, Manhattan distance similarity information, and Minkowski distance similarity information.

As an optional implementation, in the first aspect of the present invention, the performing of character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template, includes:

matching all characters of at least two in-group advertisement copies in the similar advertisement copy group, and determining the distinguishing character segments that cannot be matched in each in-group advertisement copy;

and replacing the distinguishing character segments in the in-group advertisement copies with filling slots to generate the advertisement copy template corresponding to the similar advertisement copy group.

As an optional implementation, in the first aspect of the present invention, the determining of the distinguishing character segments that cannot be matched in each in-group advertisement copy includes:

determining all character segments that cannot be matched in each in-group advertisement copy;

and determining, among all the character segments that cannot be matched, the character segments whose segment length is greater than a preset length threshold as the distinguishing character segments.

As an optional implementation, in the first aspect of the present invention, the method further includes:

acquiring all processed advertisement copies in the similar advertisement copy group, where a processed advertisement copy is an in-group advertisement copy in which the unmatched character segments have been replaced with slots;

calculating a text vector corresponding to each processed advertisement copy;

and determining a group vector of the similar advertisement copy group according to the text vectors corresponding to all the processed advertisement copies.

As an optional implementation, in the first aspect of the present invention, the method further includes:

acquiring an advertisement copy to be processed;

calculating the similarity between the advertisement copy to be processed and each similar advertisement copy group;

judging whether the similarity between the advertisement copy to be processed and every similar advertisement copy group is less than a second similarity threshold;

and if so, determining a new similar advertisement copy group according to the advertisement copy to be processed.

A second aspect of the present invention discloses an advertisement copy template processing device, which comprises:

an acquisition module, used for acquiring a plurality of advertisement copies;

a screening module, used for screening out at least one similar advertisement copy group from the plurality of advertisement copies based on similarity information between any two advertisement copies;

and a generating module, used for performing character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template.

As an optional implementation, in the second aspect of the present invention, the screening module includes:

a determining unit, configured to determine, for any two advertisement copies in the plurality of advertisement copies, similarity information between the two advertisement copies;

a judging unit, configured to judge whether the similarity information is greater than a first similarity threshold;

and a grouping unit, configured to classify the two advertisement copies into the same similar advertisement copy group when the judgment result of the judging unit is yes.

As an optional implementation, in the second aspect of the present invention, the similarity information includes at least one of cosine similarity information, Euclidean distance similarity information, Jaccard distance similarity information, edit distance similarity information, Chebyshev distance similarity information, Hamming distance similarity information, Mahalanobis distance similarity information, Manhattan distance similarity information, and Minkowski distance similarity information.

As an optional implementation, in the second aspect of the present invention, a specific manner in which the generating module performs character matching on all in-group advertisement copies in the similar advertisement copy group, and replaces unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template, includes:

As an optional implementation, in the second aspect of the present invention, a specific manner in which the generating module determines the distinguishing character segments that cannot be matched in each in-group advertisement copy includes:

As an optional implementation, in the second aspect of the present invention, the device further includes a group vector determining module, configured to perform the following steps:

calculating a text vector corresponding to each processed advertisement copy;

As an optional implementation, in the second aspect of the present invention, the device further includes a copy group determining module, configured to perform the following steps:

acquiring an advertisement copy to be processed;

A third aspect of the present invention discloses another advertisement copy template processing device, which comprises:

a memory storing executable program code;

a processor coupled with the memory;

the processor calls the executable program code stored in the memory to execute some or all of the steps of the advertisement copy template processing method disclosed in the first aspect of the embodiments of the present invention.

A fourth aspect of the present invention discloses a computer storage medium storing computer instructions which, when called, are used to execute some or all of the steps of the advertisement copy template processing method disclosed in the first aspect of the present invention.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

In the embodiment of the invention, a plurality of advertisement copies are acquired; at least one similar advertisement copy group is screened out from the plurality of advertisement copies based on similarity information between any two advertisement copies; and character matching is performed on all in-group advertisement copies in the similar advertisement copy group, and unmatched character segments in the in-group advertisement copies are replaced with slots to generate a corresponding advertisement copy template. The invention can therefore determine similar copy groups based on the similarity between advertisement copies and determine the advertisement copy template corresponding to each group of similar copies based on character matching. Determining the template from the correlation between copies helps improve the reliability and accuracy of the generated template and provides an accurate and reliable template for subsequent generation or prediction of advertisement copy.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic flow chart of an advertisement copy template processing method disclosed in an embodiment of the present invention;

FIG. 2 is a schematic flow chart of another advertisement copy template processing method disclosed in an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an advertisement copy template processing device disclosed in an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of another advertisement copy template processing device disclosed in an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of yet another advertisement copy template processing device disclosed in an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The terms "first," "second," and the like in the description and claims of the present invention and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to only those steps or elements listed, but may optionally include other steps or elements not listed, or inherent to such process, method, product, or device.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

The invention discloses an advertisement copy template processing method and device, which can determine similar copy groups based on the similarity between advertisement copies and determine the advertisement copy template corresponding to each group of similar copies based on character matching. Determining the template from the correlation between copies helps improve the reliability and accuracy of the generated template and provides an accurate and reliable template for subsequent generation or prediction of advertisement copy. Details are given below.

Embodiment One

Referring to FIG. 1, FIG. 1 is a schematic flow chart of an advertisement copy template processing method disclosed in an embodiment of the present invention. The method described in FIG. 1 is applied to an automatic copy content generating device, where the generating device may be a corresponding generating terminal, generating equipment, or generating server, and the server may be a local server or a cloud server; the embodiment of the present invention is not limited in this respect. As shown in FIG. 1, the advertisement copy template processing method may include the following operations:

101. Acquiring a plurality of advertisement copies.

Optionally, the advertisement copies may be acquired by crawling text from specific network addresses, by receiving advertisement copies entered after manual screening, or by obtaining advertisement text through speech recognition or image recognition; the present invention is not limited in this respect.

Optionally, the advertisement copy may be promotional copy for different categories of goods or services, such as a description page of the goods or services or promotional copy used in a business activity, or may be copy that leads users to goods or services, such as the title or lead-in copy of an article written to attract users on a specific medium such as a WeChat official account; the present invention is not limited in this respect.

Optionally, the language of the advertisement copy is not limited to Chinese and may be another language with a well-defined grammar, such as English or French; the present invention is not limited in this respect.

102. Screening out at least one similar advertisement copy group from the plurality of advertisement copies based on similarity information between any two advertisement copies.

Specifically, the copies within each similar advertisement copy group are highly similar to one another, and each group represents one class of similar advertisement copies, for example copies promoting the same category of goods or copies with a similar writing style.

103. Performing character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template.

Optionally, the generated advertisement copy template may subsequently be used as training material for a neural network, for example a neural network for advertisement copy style recognition or pattern recognition. The slots in the advertisement copy template mark positions that may be filled with different nouns of the same type, which carry no useful information for style recognition or pattern recognition; the template can therefore serve as effectively cleaned training material for the neural network.

Optionally, the generated advertisement copy template may be used as an auxiliary template for generating advertisement copy. For example, if a copy generation requirement calls for advertising of a specific style for a specific category of product, the advertisement copy template corresponding to that style may be selected, and character segments generated for that category of product may be filled into the slots of the template to generate the advertisement copy.
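As an illustration of using a template as an auxiliary template, the following Python sketch fills the slots of a template with supplied character segments. The function name `fill_template`, the sample template, and the assumption that a slot is written as a name in curly braces are hypothetical and are not specified by this disclosure.

```python
import re

# Assumed slot syntax: a slot name enclosed in curly braces, e.g. "{place}".
SLOT_PATTERN = re.compile(r"\{([^{}]+)\}")


def fill_template(template, values):
    """Replace each '{name}' slot with values[name]; unknown slots are left unchanged."""
    return SLOT_PATTERN.sub(
        lambda match: values.get(match.group(1), match.group(0)), template
    )


if __name__ == "__main__":
    template = "Visit {place} in {time}"
    print(fill_template(template, {"place": "Kyoto", "time": "April"}))
```

Leaving unknown slots unchanged is a design choice of this sketch; it makes partially filled templates easy to detect in later processing.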

Therefore, the method described in this embodiment of the present invention can determine similar copy groups based on the similarity between advertisement copies and determine the advertisement copy template corresponding to each group of similar copies based on character matching. Determining the template from the correlation between copies helps improve the reliability and accuracy of the generated template and provides an accurate and reliable template for subsequent generation or prediction of advertisement copy.

As an optional implementation, in step 102, the screening out of at least one similar advertisement copy group from the plurality of advertisement copies based on the similarity information between any two advertisement copies includes:

determining, for any two advertisement copies in the plurality of advertisement copies, similarity information between the two advertisement copies;

judging whether the similarity information is greater than a first similarity threshold;

and if so, classifying the two advertisement copies into the same similar advertisement copy group.

Optionally, the expression "the similarity information is greater than the first similarity threshold" is intended to require that the degree of similarity between the two advertisement copies exceeds a preset degree, and is not limited to a literal numerical comparison. In some cases a higher value of the similarity information indicates a lower similarity (for example, when the value is a distance), and the expression then means that the value of the similarity information is smaller than the first similarity threshold. In most cases a higher value indicates a higher similarity, and the expression then means that the value of the similarity information is higher than the first similarity threshold.
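The direction-dependent reading of the threshold can be sketched in Python as follows. The function name `exceeds_similarity_threshold` and the flag `higher_is_more_similar` are hypothetical names introduced only for this illustration.

```python
def exceeds_similarity_threshold(value, threshold, higher_is_more_similar=True):
    """Return True when the degree of similarity exceeds the preset degree.

    For similarity-type values (e.g. cosine similarity) a larger value means
    more similar; for distance-type values (e.g. edit distance) a smaller
    value means more similar, so the numerical comparison is reversed.
    """
    if higher_is_more_similar:
        return value > threshold
    return value < threshold


if __name__ == "__main__":
    print(exceeds_similarity_threshold(0.9, 0.8))                                # similarity-type
    print(exceeds_similarity_threshold(0.3, 0.5, higher_is_more_similar=False))  # distance-type
```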

Optionally, the similarity information may include at least one of cosine similarity information, Euclidean distance similarity information, Jaccard distance similarity information, edit distance similarity information, Chebyshev distance similarity information, Hamming distance similarity information, Mahalanobis distance similarity information, Manhattan distance similarity information, and Minkowski distance similarity information. Optionally, when the similarity information combines at least two of the above similarity information types, it may be a weighted combination of those types, where different similarity information types may have different weights.
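A minimal Python sketch of such a weighted combination is given below. The choice of cosine similarity and character-set Jaccard similarity, the function names, and the weights are assumptions made for illustration; the disclosure does not fix how the individual scores are computed or weighted.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_similarity(a, b):
    """Jaccard similarity of the character sets of two strings."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def weighted_similarity(scores, weights):
    """Weighted average of similarity scores (all scores: higher = more similar)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total


if __name__ == "__main__":
    scores = [cosine_similarity([1, 0], [1, 0]), jaccard_similarity("abc", "abd")]
    print(weighted_similarity(scores, weights=[3, 1]))  # example weights only
```

All scores passed to `weighted_similarity` are assumed to point in the same direction (higher means more similar); distance-type values would need to be converted first.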

In a preferred embodiment, the similarity information is edit distance similarity information. The edit distance characterizes the minimum number of edit operations required to transform one character string into another; generally, the smaller the edit distance, the greater the similarity of the two texts. In this case, the edit distance (that is, the Levenshtein distance) between the character strings corresponding to the two advertisement copies may be calculated to obtain an edit distance score between the two advertisement copies, and the first similarity threshold may be set as an edit distance score threshold, for example 0.5. Since the edit distance score is generally between 0 and 1, two advertisement copies whose edit distance score is smaller than the edit distance score threshold may be classified into the same similar advertisement copy group.
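The edit distance score described above can be sketched in Python as follows. The disclosure does not state how the score is normalized to the range 0 to 1; dividing by the length of the longer string is an assumption of this sketch, and the names `levenshtein`, `edit_distance_score`, and `is_similar` are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def edit_distance_score(a, b):
    """Edit distance divided by the longer length (assumed normalization to [0, 1])."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


EDIT_SCORE_THRESHOLD = 0.5  # example value mentioned in the text


def is_similar(a, b, threshold=EDIT_SCORE_THRESHOLD):
    """Two copies are similar when their edit distance score is below the threshold."""
    return edit_distance_score(a, b) < threshold


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(is_similar("abcd", "abcf"))
```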

Optionally, in this embodiment, the first similarity threshold may be set according to experimental or empirical values. Preferably, the first similarity threshold may be raised (that is, made stricter) so that the advertisement copies within the same similar advertisement copy group are more similar to one another; the copies in a group then tend to have substantially the same length and differ only in some character segments, which facilitates the subsequent replacement processing.

Therefore, by implementing this optional implementation, the similarity information between any two of the plurality of advertisement copies can be evaluated, and when the similarity information is judged to be greater than the threshold, the two advertisement copies are classified into the same similar advertisement copy group. Two or more similar advertisement copies among the plurality of advertisement copies can thus be grouped by similarity to determine at least one group of mutually similar advertisement copies, providing a data basis for the subsequent replacement operation.
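The pairwise grouping can be sketched in Python as follows. The disclosure only states that two copies judged similar are placed in the same group; merging groups transitively with a union-find structure, and discarding copies that are similar to no other copy, are assumptions of this sketch. The function name `group_similar` and the sample predicate are hypothetical.

```python
from itertools import combinations


def group_similar(copies, is_similar):
    """Group copies so that any pair judged similar ends up in the same group."""
    parent = list(range(len(copies)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(copies)), 2):
        if is_similar(copies[i], copies[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, text in enumerate(copies):
        groups.setdefault(find(i), []).append(text)
    # Keep only groups that contain at least two copies.
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    copies = ["Visit Paris today", "Visit Tokyo today", "Buy one get one free"]
    # Toy predicate for demonstration only: same first word.
    same_first_word = lambda a, b: a.split()[0] == b.split()[0]
    print(group_similar(copies, same_first_word))
```

In practice the predicate would wrap whichever similarity information and first similarity threshold are chosen.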

As an optional implementation, in step 103, the performing of character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template, includes:

matching all characters of at least two in-group advertisement copies in the similar advertisement copy group, and determining the distinguishing character segments that cannot be matched in each in-group advertisement copy;

and replacing the distinguishing character segments in the in-group advertisement copies with filling slots to generate the advertisement copy template corresponding to the similar advertisement copy group.

Optionally, as noted in the above embodiment, the in-group advertisement copies in the same similar advertisement copy group preferably tend to have the same character length and differ only in some character segments. Matching all characters of at least two in-group advertisement copies may therefore be performed by aligning the copies with each other, comparing them character by character, extracting the characters that differ, and treating consecutive differing characters as a single distinguishing character segment.

Optionally, it may first be determined that the at least two in-group advertisement copies in the same similar advertisement copy group have the same length; the copies are then aligned and compared character by character, the characters that differ are extracted, and consecutive differing characters are treated as a single distinguishing character segment.
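The aligned character-by-character comparison of two equal-length copies can be sketched in Python as follows. The function name `differing_segments` and the sample strings are hypothetical; the sketch handles two copies only and raises an error when the lengths differ.

```python
def differing_segments(a, b):
    """Return (start, end) index spans where two equal-length strings differ.

    Consecutive differing positions are merged into a single span; `end` is exclusive.
    """
    if len(a) != len(b):
        raise ValueError("copies must have the same length")
    spans = []
    start = None
    for i, (ch_a, ch_b) in enumerate(zip(a, b)):
        if ch_a != ch_b:
            if start is None:
                start = i  # a new run of differing characters begins
        elif start is not None:
            spans.append((start, i))  # the run ended at the previous character
            start = None
    if start is not None:
        spans.append((start, len(a)))  # a run reaches the end of the string
    return spans


if __name__ == "__main__":
    first = "Visit Paris in May"
    second = "Visit Tokyo in Jun"
    for start, end in differing_segments(first, second):
        print(first[start:end], "/", second[start:end])
```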

Optionally, when replacing the distinguishing character segments in any in-group advertisement copy with filling slots, the parts of speech of the distinguishing character segments may be labeled using an existing part-of-speech tagging library, and each distinguishing character segment may be replaced with the corresponding part-of-speech representation. For example, the Jieba part-of-speech tagging tool may be used to tag the unmatched text segments, and the noun-type segments among them, such as place names, person names, and constellation names, are replaced with filling slots in the format { 'part of speech corresponding to the distinguishing character segment' }, thereby generating the template.
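The slot replacement can be sketched in Python as follows. To keep the example self-contained, a small lookup table stands in for a real part-of-speech tagger such as Jieba; the table `TOY_TAGS`, the functions `toy_tagger` and `fill_slots`, the tag names, and the exact slot syntax (a tag in curly braces, without quotation marks) are all assumptions of this sketch.

```python
# Hypothetical stand-in for a real part-of-speech tagger; tags are illustrative only.
TOY_TAGS = {"Paris": "place", "Tokyo": "place", "May": "time", "Jun": "time"}


def toy_tagger(segment):
    """Return an illustrative part-of-speech tag for a segment."""
    return TOY_TAGS.get(segment, "unknown")


def fill_slots(text, spans, tag_part_of_speech):
    """Replace each (start, end) span of `text` with a '{tag}' filling slot."""
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        pieces.append(text[cursor:start])  # unchanged text before the span
        pieces.append("{" + tag_part_of_speech(text[start:end]) + "}")
        cursor = end
    pieces.append(text[cursor:])  # unchanged text after the last span
    return "".join(pieces)


if __name__ == "__main__":
    print(fill_slots("Visit Paris in May", [(6, 11), (15, 18)], toy_tagger))
```

Passing the tagger as a parameter keeps the replacement logic independent of the tagging library that is actually used.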

Therefore, by implementing this optional implementation, the distinguishing character segments that cannot be matched in each in-group advertisement copy can be determined and replaced with filling slots to generate the advertisement copy template corresponding to the similar advertisement copy group. The advertisement copy template is thus determined from the characters that differ between similar copies, which helps improve the reliability and accuracy of the generated template and provides an accurate and reliable template for subsequent generation or prediction of advertisement copy.

As an optional implementation, in the above step, the determining of the distinguishing character segments that cannot be matched in each in-group advertisement copy includes:

determining all character segments that cannot be matched in each in-group advertisement copy;

and determining, among all the character segments that cannot be matched, the character segments whose segment length is greater than a preset length threshold as the distinguishing character segments.

Optionally, the length threshold may be 2.

Therefore, by implementing this optional implementation, the character segments whose length is greater than the length threshold among all the unmatched character segments are determined as the distinguishing character segments, so that character segments that are too short to carry a specific part-of-speech meaning are filtered out, which improves the representativeness and reliability of the generated template.
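The length filter can be sketched in Python as follows. The function name `distinguishing_segments` and the sample segments are hypothetical; the default threshold of 2 is the example value given above.

```python
LENGTH_THRESHOLD = 2  # example value mentioned in the text


def distinguishing_segments(unmatched, threshold=LENGTH_THRESHOLD):
    """Keep only the unmatched segments whose length is strictly greater than the threshold."""
    return [segment for segment in unmatched if len(segment) > threshold]


if __name__ == "__main__":
    print(distinguishing_segments(["a", "of", "Paris", "May"]))
```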

Embodiment Two

Referring to FIG. 2, FIG. 2 is a schematic flow chart of another advertisement copy template processing method disclosed in an embodiment of the present invention. The method described in FIG. 2 is applied to an automatic copy content generating device, where the generating device may be a corresponding generating terminal, generating equipment, or generating server, and the server may be a local server or a cloud server; the embodiment of the present invention is not limited in this respect. As shown in FIG. 2, the advertisement copy template processing method may include the following operations:

201. Acquiring a plurality of advertisement copies.

202. Screening out at least one similar advertisement copy group from the plurality of advertisement copies based on similarity information between any two advertisement copies.

203. Performing character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template.

For the technical details and term explanations of steps 201-203, reference may be made to the description of steps 101-103 in the first embodiment, which is not repeated here.

204. Acquiring all processed advertisement copies in the similar advertisement copy group.

A processed advertisement copy is an in-group advertisement copy in which the unmatched character segments have been replaced with slots.

205. Calculating a text vector corresponding to each processed advertisement copy.

Optionally, a word vector calculation model, such as a Tencent word vector model, may be used to calculate the text vector corresponding to each processed advertisement copy.

206. Determining a group vector of the similar advertisement copy group according to the text vectors corresponding to all the processed advertisement copies.

Optionally, the sum of the text vectors corresponding to all the processed advertisement copies is determined as the group vector of the similar advertisement copy group. Optionally, the group vector of the similar advertisement copy group may be used for subsequently matching new advertisement copies to groups, may be used to represent the features of the class of advertisement copies in the similar advertisement copy group, and may further be used as a calculation parameter for that class of advertisement copies in neural network training or other algorithmic processing.
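Determining the group vector as the sum of the text vectors can be sketched in Python as follows. Text vectors are represented as plain lists of numbers of equal dimension; the function name `group_vector` and the sample values are hypothetical, and how each text vector is produced by the word vector model is outside the scope of this sketch.

```python
def group_vector(text_vectors):
    """Element-wise sum of equal-length text vectors."""
    if not text_vectors:
        raise ValueError("at least one text vector is required")
    dimension = len(text_vectors[0])
    if any(len(vector) != dimension for vector in text_vectors):
        raise ValueError("all text vectors must have the same dimension")
    # zip(*text_vectors) yields the i-th components of all vectors together.
    return [sum(components) for components in zip(*text_vectors)]


if __name__ == "__main__":
    print(group_vector([[1.0, 2.0], [3.0, 4.0], [0.5, 0.5]]))
```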

Therefore, the method described in this embodiment of the present invention can calculate the text vector corresponding to each processed advertisement copy in the similar advertisement copy group and determine the group vector of the similar advertisement copy group according to those text vectors, so that the group vector can represent the features of the advertisement copies in the group and can be used for subsequently matching new advertisement copies or as a calculation parameter in later processing.

As an optional implementation, the method may further include:

acquiring an advertisement copy to be processed;

judging whether the similarity between the advertisement copy to be processed and every similar advertisement copy group is smaller than a second similarity threshold;

if so, determining a new similar advertisement copy group according to the advertisement copy to be processed.

Optionally, the similarity between the advertisement copy to be processed and any similar advertisement copy group may be obtained by calculating the similarity between the text vector of the advertisement copy to be processed and the group vector of that group. Optionally, the expression "the similarity is smaller than the second similarity threshold" means that the degree of similarity between the advertisement copy to be processed and the group is below a preset degree, and it is not limited to a literal numerical comparison. For some measures, such as distance-based ones, a larger value indicates a lower degree of similarity, in which case the expression means that the value is greater than the second similarity threshold. For most measures, a larger value indicates a higher degree of similarity, in which case the expression means that the value is smaller than the second similarity threshold.
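The direction-dependent reading of "smaller than the second similarity threshold" can be sketched as follows. The function names and the higher_is_more_similar flag are hypothetical and introduced only for illustration; the Euclidean distance stands in for any measure in which a larger value means less similarity.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between a text vector and a group vector (larger = less similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_less_similar_than_threshold(
    value: float, threshold: float, higher_is_more_similar: bool
) -> bool:
    """Return True when `value` expresses less similarity than `threshold`."""
    if higher_is_more_similar:
        return value < threshold  # e.g. cosine similarity
    return value > threshold  # e.g. a distance


text_vec = [1.0, 0.0]
group_vec = [0.0, 1.0]
distance = euclidean_distance(text_vec, group_vec)
below = is_less_similar_than_threshold(distance, 1.0, higher_is_more_similar=False)
```

Here the distance exceeds the threshold of 1.0, so the copy counts as less similar than the threshold even though its numerical value is larger.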

Optionally, the similarity between the advertisement copy to be processed and any similar advertisement copy group may include at least one of cosine similarity, Euclidean distance similarity, Jaccard distance similarity, edit distance similarity, Chebyshev distance similarity, Hamming distance similarity, Mahalanobis distance similarity, Manhattan distance similarity, and Minkowski distance similarity. Optionally, when the similarity combines at least two of the above types of similarity information, it may be a weighted combination of those types, and different types may have different weights.
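A weighted combination of several similarity types might look like the sketch below. It assumes that every score has already been brought to a common scale in which a larger value means more similar, and that the weights are normalized by their sum; neither convention is stated in the embodiment, and the function names and weight values are hypothetical.

```python
from typing import Dict


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity over the character sets of two copies."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def weighted_similarity(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-type scores using per-type weights, normalized by total weight."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("the weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


scores = {
    "cosine": 0.8,  # assumed to have been computed elsewhere
    "jaccard": jaccard_similarity("abc", "bcd"),
}
weights = {"cosine": 3.0, "jaccard": 1.0}  # hypothetical weights
combined = weighted_similarity(scores, weights)
```

Distance-type measures would first need to be converted into this "larger is more similar" scale before being mixed with similarity-type measures.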

Optionally, in this embodiment, the second similarity threshold may be set according to an experimental value or an empirical value. Optionally, if the similarity between the advertisement copy to be processed and each of several similar advertisement copy groups is determined to be higher than the second similarity threshold, the advertisement copy to be processed may be assigned to the group with the highest similarity, so that it can subsequently be processed within that group. This assignment may also serve as a type identification of the advertisement copy to be processed, for example, to indicate that it belongs to the advertisement style or advertisement type corresponding to that group.

Therefore, by implementing this optional implementation, when the similarity between the advertisement copy to be processed and every similar advertisement copy group is judged to be smaller than the second similarity threshold, a new similar advertisement copy group can be determined according to the advertisement copy to be processed. In this way, a new group that is not similar to the existing groups can be identified, which provides a data basis for subsequent replacement operations.
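The assign-or-create decision described in this optional implementation can be sketched as below. Cosine similarity is chosen only as one of the listed measures; the group identifiers, the threshold value, and the treatment of a score exactly equal to the threshold (here it counts as a match) are assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_to_group(
    text_vec: Sequence[float],
    group_vectors: Dict[str, List[float]],
    threshold: float,
) -> Optional[str]:
    """Return the most similar group's id, or None if all groups fall below the threshold."""
    best_id: Optional[str] = None
    best_score = -math.inf
    for group_id, group_vec in group_vectors.items():
        score = cosine_similarity(text_vec, group_vec)
        if score > best_score:
            best_id, best_score = group_id, score
    if best_id is None or best_score < threshold:
        return None
    return best_id


groups = {"g1": [1.0, 0.0], "g2": [0.0, 1.0]}  # hypothetical group vectors
matched = assign_to_group([0.9, 0.1], groups, threshold=0.8)
unmatched = assign_to_group([1.0, 1.0], groups, threshold=0.8)
if unmatched is None:
    # The copy is dissimilar to every group, so it seeds a new group.
    groups["g3"] = [1.0, 1.0]
```

The first copy lands in g1, while the second is equally far from both groups and therefore starts a new one.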

EXAMPLE III

Referring to fig. 3, fig. 3 is a schematic structural diagram of an advertisement copy template processing apparatus according to an embodiment of the present invention. The apparatus described in fig. 3 may be applied to a corresponding automatic copy content generating apparatus, which may be a generating terminal, a generating device, or a generating server; the server may be a local server or a cloud server, which is not limited in the embodiments of the present invention. As shown in fig. 3, the apparatus may include:

An obtaining module 301, configured to obtain a plurality of advertisement copies.

A screening module 302, configured to screen at least one similar advertisement copy group from the plurality of advertisement copies based on similarity information between any two advertisement copies.

A generating module 303, configured to perform character matching on all in-group advertisement copies in the similar advertisement copy group, and to replace the unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template.

Therefore, the apparatus described in this embodiment of the invention can determine similar advertisement copy groups based on the similarity between advertisement copies, and determine the advertisement copy template corresponding to each group based on character matching. The template is thus determined from the correlation between copies, which helps improve the reliability and accuracy of the generated template and provides an accurate and reliable template for the subsequent generation or prediction of advertisement copies.
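The character matching and slot replacement performed by the generating module 303 can be illustrated for the simplest case of two in-group copies. The sketch uses Python's difflib.SequenceMatcher as a stand-in for the character matching and a hypothetical "[SLOT]" marker; the embodiment does not prescribe either, and extending the idea to more than two copies is left out.

```python
from difflib import SequenceMatcher

SLOT = "[SLOT]"  # hypothetical slot marker


def make_template(copy_a: str, copy_b: str) -> str:
    """Keep character runs shared by both copies and replace the rest with a slot."""
    matcher = SequenceMatcher(None, copy_a, copy_b, autojunk=False)
    parts = []
    for tag, a_start, a_end, _b_start, _b_end in matcher.get_opcodes():
        if tag == "equal":
            parts.append(copy_a[a_start:a_end])
        elif not parts or parts[-1] != SLOT:
            # Any replace/insert/delete region becomes a single slot.
            parts.append(SLOT)
    return "".join(parts)


template = make_template("Save 20% on shoes", "Save 35% on shoes")
# template == "Save [SLOT]% on shoes"
```

With copies that differ in several short, scattered places, this naive version can produce many tiny slots, which is one reason a minimum segment length may be useful in practice.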

As an optional implementation, as shown in fig. 4, the screening module 302 includes:

a determining unit 3021, configured to determine, for any two advertisement copies among the plurality of advertisement copies, similarity information between the two advertisement copies;

a judging unit 3022, configured to judge whether the similarity information is greater than a first similarity threshold;

a grouping unit 3023, configured to group the two advertisement copies into the same similar advertisement copy group when the judgment result of the judging unit 3022 is yes.
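The cooperation of units 3021 to 3023 can be sketched as a pairwise comparison followed by merging. The sketch assumes a character-level ratio from difflib.SequenceMatcher as the similarity information, a union-find structure so that grouping is transitive, and the exclusion of copies that match nothing; these are illustrative choices, not details given in the embodiment.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List


def similarity(a: str, b: str) -> float:
    """Stand-in similarity information in [0, 1]; larger means more similar."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def group_similar(copies: List[str], threshold: float) -> List[List[str]]:
    """Put every pair whose similarity exceeds the threshold into the same group."""
    parent = list(range(len(copies)))

    def find(i: int) -> int:
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(copies)), 2):
        if similarity(copies[i], copies[j]) > threshold:
            parent[find(j)] = find(i)

    groups: Dict[int, List[str]] = {}
    for index, text in enumerate(copies):
        groups.setdefault(find(index), []).append(text)
    # Copies similar to no other copy are not reported as a group (assumption).
    return [members for members in groups.values() if len(members) > 1]


copies = ["Save 20% on shoes", "Save 35% on shoes", "Free shipping worldwide"]
similar_groups = group_similar(copies, threshold=0.8)
```

Comparing every pair costs a number of comparisons quadratic in the number of copies, so a large corpus would likely need some pre-filtering before this step.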

As an optional implementation, the specific manner in which the generating module 303 performs character matching on all in-group advertisement copies in the similar advertisement copy group, and replaces the unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template, includes:

As an optional implementation, the specific manner in which the generating module 303 determines the distinguishing character segments that cannot be matched in each in-group advertisement copy includes:

Optionally, the length threshold may be 2.

As an optional implementation, as shown in fig. 4, the apparatus further includes a group vector determining module 304, configured to perform the following steps:

acquiring all processed advertisement copies in the similar advertisement copy group;

calculating a text vector corresponding to each processed advertisement copy;

Here, a processed advertisement copy is an in-group advertisement copy in which the unmatched character segments have been replaced with slots. Optionally, a word vector calculation model, such as a Tencent word vector model, may be used to calculate the text vector corresponding to each processed advertisement copy.

Therefore, by implementing this optional implementation, the text vector corresponding to each processed advertisement copy in the similar advertisement copy group can be calculated, and the group vector of the group can be determined according to those text vectors, so that the group vector can subsequently be used to match new advertisement copies against the group and to represent the features of the advertisement copies of the same type in the group.

As an optional implementation, as shown in fig. 4, the apparatus further includes a copy group determining module 305, configured to perform the following steps:

acquiring an advertisement copy to be processed;

EXAMPLE IV

Referring to fig. 5, fig. 5 is a schematic structural diagram of another advertisement copy template processing apparatus according to an embodiment of the present invention. As shown in fig. 5, the apparatus may include:

a memory 401 storing executable program code;

a processor 402 coupled with the memory 401;

the processor 402 calls the executable program code stored in the memory 401 to execute part or all of the steps of the method for processing the advertising copy template disclosed in the first embodiment or the second embodiment of the present invention.

EXAMPLE V

The embodiment of the invention discloses a computer storage medium, which stores computer instructions, and when the computer instructions are called, the computer instructions are used for executing part or all of the steps of the advertising copy template processing method disclosed by the first embodiment or the second embodiment of the invention.

The above-described embodiments of the apparatus are merely illustrative, and the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

From the above detailed description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware. Based on this understanding, the above technical solutions may be embodied in the form of a software product stored in a computer-readable storage medium, where the storage medium includes a Read-Only Memory (ROM), a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage, magnetic disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.

Finally, it should be noted that the method and apparatus for processing an advertisement copy template disclosed in the embodiments of the present invention are only preferred embodiments of the present invention and are used only to illustrate its technical solutions, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. An advertising copy template processing method, characterized in that the method comprises:

acquiring a plurality of advertisement copies;

2. The method of claim 1, wherein the step of screening at least one similar advertisement copy group from the plurality of advertisement copies based on the similarity information between any two of the advertisement copies comprises:

3. The method as claimed in claim 2, wherein the similarity information includes at least one of cosine similarity information, Euclidean distance similarity information, Jaccard distance similarity information, edit distance similarity information, Chebyshev distance similarity information, Hamming distance similarity information, Mahalanobis distance similarity information, Manhattan distance similarity information, and Minkowski distance similarity information.

4. The method of claim 1, wherein the performing character matching on all in-group advertisement copies in the similar advertisement copy group, and replacing unmatched character segments in the in-group advertisement copies with slots to generate a corresponding advertisement copy template comprises:

5. The method of claim 4, wherein the determining the distinguishing character segments that cannot be matched in each in-group advertisement copy comprises:

6. The advertising copy template processing method of claim 1, further comprising:

calculating a text vector corresponding to each processed advertisement copy;

7. The advertising copy template processing method of claim 1, further comprising:

acquiring an advertisement copy to be processed;

8. An advertising copy template processing apparatus, comprising:

9. An advertising copy template processing apparatus, comprising:

a memory storing executable program code;

a processor coupled with the memory;

the processor calls the executable program code stored in the memory to execute the advertising copy template processing method according to any one of claims 1 to 7.

10. A computer storage medium having stored thereon computer instructions which, when invoked, perform the method of advertising copy template processing according to any one of claims 1 to 7.