CN114818641A - Method for generating document template, device, equipment, medium and product thereof - Google Patents


Publication number
CN114818641A
Authority
CN
China
Prior art keywords
tag
advertisement
word
label
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210540521.0A
Other languages
Chinese (zh)
Inventor
葛莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huanju Shidai Information Technology Co Ltd
Original Assignee
Guangzhou Huanju Shidai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huanju Shidai Information Technology Co Ltd filed Critical Guangzhou Huanju Shidai Information Technology Co Ltd
Priority to CN202210540521.0A priority Critical patent/CN114818641A/en
Publication of CN114818641A publication Critical patent/CN114818641A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/186: Templates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/268: Morphological analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

The application relates to a document template generation method and a corresponding apparatus, device, medium and product. The method comprises the following steps: acquiring an advertisement text of a commodity; deconstructing the advertisement text into a tag sequence according to a preset tag structure, wherein the tag structure comprises a plurality of tag groups and each tag group comprises at least one word tag; extracting the material text corresponding to each tag group from the advertisement text based on the word tags in the tag sequence; and constructing a document template according to the tag sequence, wherein some tag groups are represented in the document template as corresponding replaceable group tags, and the other tag groups are retained and represented as their corresponding material texts. Guided by the hierarchy and granularity of the tag structure, the generated document template is more accurate and effective and can be used to produce advertisement copy in batches, which improves the efficiency of generating the advertisement copy that merchants need and strengthens the online advertising service capability of the e-commerce platform.

Description

Method for generating document template, device, equipment, medium and product thereof
Technical Field
The present application relates to the field of e-commerce information technologies, and in particular, to a document template generation method, and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product.
Background
In e-commerce scenarios, the advertisement copy of a commodity is an important part of the advertisement material. When creating an advertisement, a merchant has to fill in text such as the title and body, which costs manual effort. Writing good advertisement copy also requires a certain level of skill, and writing it in an unfamiliar language is especially hard for people without that skill. Because the quality of the advertisement copy affects the effectiveness of advertisement delivery, a service that generates advertisement copy for merchants is a firm requirement for an e-commerce platform.
Conventional techniques for automatically generating commodity advertisement copy mostly rely on template matching. The templates themselves, however, are usually written by hand in advance and published to a template library for merchants to call, which is clearly inefficient.
Schemes for automatically generating templates also exist in the industry, but they differ in the algorithms and processing logic they adopt, and their results differ accordingly. In actual measurements, none of the existing schemes achieves good results; in particular, the copy they finally generate reads poorly, so further exploration is needed.
Disclosure of Invention
The present application aims to solve the above problems by providing a document template generation method and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product.
To serve the purposes of the application, the following technical solutions are adopted:
In one aspect, a document template generation method is provided, which comprises the following steps:
acquiring an advertisement text of a commodity;
deconstructing the advertisement text into a tag sequence according to a preset tag structure, wherein the tag structure comprises a plurality of tag groups, and each tag group comprises at least one word tag;
extracting material texts corresponding to all the label groups from the advertisement texts based on the word labels in the label sequence;
and constructing a document template according to the label sequence, wherein part of label groups are represented as corresponding replaceable group labels in the document template, and other label groups are reserved and represented as corresponding material texts.
On the basis of any of the above embodiments, deconstructing the advertisement text into a tag sequence according to a preset tag structure includes the following steps:
obtaining semantic vectors of all participles in the advertisement text;
acquiring a part-of-speech vector of each participle in the advertisement text;
constructing the semantic vector and the part-of-speech vector of each participle into a comprehensive vector;
and determining corresponding word labels of the word segments in a preset label structure based on the comprehensive vectors of the word segments to obtain a label sequence corresponding to the advertisement text.
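The vector-construction step above can be sketched as follows. This is a minimal illustration that assumes the comprehensive vector is formed by concatenation; the text only says the two vectors are "constructed into" a comprehensive vector, so the operation, the function name `build_comprehensive_vectors`, and the toy dimensions are all assumptions.

```python
from typing import List, Sequence


def build_comprehensive_vectors(
    semantic_vectors: Sequence[Sequence[float]],
    pos_vectors: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Join each participle's semantic vector with its part-of-speech vector.

    Concatenation is an assumption; the source does not name the operation.
    """
    if len(semantic_vectors) != len(pos_vectors):
        raise ValueError("each participle needs one semantic and one part-of-speech vector")
    return [list(sem) + list(pos) for sem, pos in zip(semantic_vectors, pos_vectors)]


# Two participles with toy 3-dimensional semantic vectors and
# 2-dimensional one-hot part-of-speech vectors.
semantic = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
pos = [[1.0, 0.0], [0.0, 1.0]]
comprehensive = build_comprehensive_vectors(semantic, pos)
```

Each comprehensive vector would then be passed to whatever classifier assigns the word tags; that classifier is not sketched here.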
On the basis of any of the above embodiments, obtaining part-of-speech vectors of each participle in the advertisement text includes the following steps:
inputting the advertisement text into a part-of-speech extractor to obtain part-of-speech identifiers of all the participles;
and querying a preset part-of-speech coding table to determine the part-of-speech vector corresponding to each participle, wherein the part-of-speech coding table orders the parts of speech by their statistical frequency in a reference advertisement text set, and the part-of-speech vector of each part of speech is determined by applying a one-hot coding algorithm.
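A part-of-speech coding table of this kind might be built as in the sketch below. The part-of-speech identifiers (`NOUN`, `VERB`, `ADJ`), the descending order, and the alphabetical tie-break are assumptions made for illustration; the text only states that the table is ordered by statistical frequency and encoded one-hot.

```python
from collections import Counter
from typing import Dict, List, Sequence


def build_pos_coding_table(reference_pos_tags: Sequence[str]) -> Dict[str, List[int]]:
    """Order parts of speech by frequency, then assign each a one-hot vector."""
    counts = Counter(reference_pos_tags)
    # Most frequent first; ties broken alphabetically so the table is deterministic.
    ordered = sorted(counts, key=lambda tag: (-counts[tag], tag))
    size = len(ordered)
    return {
        tag: [1 if position == rank else 0 for position in range(size)]
        for rank, tag in enumerate(ordered)
    }


# Part-of-speech identifiers collected from a (toy) reference advertisement text set.
reference = ["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "VERB"]
table = build_pos_coding_table(reference)

# Looking up the part-of-speech vectors of two participles tagged ADJ and NOUN.
pos_vectors = [table[tag] for tag in ["ADJ", "NOUN"]]
```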
On the basis of any of the above embodiments, the tag structure includes a tag group indicating that the following types of information correspond to: advertisement sentence pattern, core word, reserved word, punctuation mark, wherein,
the label group of the advertisement sentence pattern comprises label groups corresponding to an opening sentence pattern, a discount information sentence pattern and a closing sentence pattern, wherein each label group comprises a word label used for indicating that a corresponding participle belongs to a starting position or a non-starting position;
the tag groups of the core words comprise tag groups corresponding to product words and brand words, wherein each tag group comprises a word tag used for indicating that a corresponding word belongs to a starting position or a non-starting position;
the label group corresponding to the reserved words and punctuation marks comprises single corresponding word labels.
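One way to picture this tag structure is as a nested mapping from information type to tag group to word tags. The identifiers below (`B-OPENING`, `KEEP`, `PUNCT` and so on) are hypothetical; the text fixes only the grouping and the start / non-start distinction, not the names.

```python
from typing import Dict, List

# Hypothetical tag names. "B-" marks a participle at the start position of a
# span and "I-" marks a non-start position.
TAG_STRUCTURE: Dict[str, Dict[str, List[str]]] = {
    "ad_sentence_pattern": {
        "OPENING": ["B-OPENING", "I-OPENING"],
        "DISCOUNT": ["B-DISCOUNT", "I-DISCOUNT"],
        "CLOSING": ["B-CLOSING", "I-CLOSING"],
    },
    "core_word": {
        "PRODUCT": ["B-PRODUCT", "I-PRODUCT"],
        "BRAND": ["B-BRAND", "I-BRAND"],
    },
    # Reserved words and punctuation marks each have a single word tag.
    "reserved_word": {"KEEP": ["KEEP"]},
    "punctuation": {"PUNCT": ["PUNCT"]},
}


def group_of(word_tag: str) -> str:
    """Return the tag group that a word tag belongs to."""
    for groups in TAG_STRUCTURE.values():
        for group, word_tags in groups.items():
            if word_tag in word_tags:
                return group
    raise KeyError(word_tag)


all_word_tags = [
    tag
    for groups in TAG_STRUCTURE.values()
    for word_tags in groups.values()
    for tag in word_tags
]
```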
On the basis of any of the above embodiments, constructing a document template according to the tag sequence includes the following steps:
retaining, in the advertisement text, the participles corresponding to word tags in tag groups of the reserved type;
and replacing each set of participles in the advertisement text that corresponds to consecutive word tags of a tag group of the replaceable type with the corresponding group tag.
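The two steps above can be sketched as a single pass over the tagged participles. The tag names, the choice of which groups are replaceable, and the bracketed `[GROUP]` placeholder format are assumptions for illustration only.

```python
from typing import List, Sequence

# Assumption: sentence-pattern and core-word groups are replaceable, while
# reserved words (KEEP) and punctuation (PUNCT) are retained.
REPLACEABLE_GROUPS = {"OPENING", "DISCOUNT", "CLOSING", "PRODUCT", "BRAND"}


def tag_group(word_tag: str) -> str:
    """'B-PRODUCT' -> 'PRODUCT'; single-tag groups such as 'KEEP' map to themselves."""
    return word_tag.split("-", 1)[1] if word_tag[:2] in ("B-", "I-") else word_tag


def build_template(tokens: Sequence[str], word_tags: Sequence[str]) -> List[str]:
    """Keep reserved participles; collapse each replaceable span into one group tag."""
    template: List[str] = []
    previous_group = None
    for token, word_tag in zip(tokens, word_tags):
        group = tag_group(word_tag)
        if group in REPLACEABLE_GROUPS:
            # A span starts at a "B-" tag or whenever the group changes.
            if word_tag.startswith("B-") or group != previous_group:
                template.append(f"[{group}]")
            previous_group = group
        else:
            template.append(token)
            previous_group = None
    return template


tokens = ["Big", "sale", "!", "Acme", "running", "shoes", "for", "you", "."]
word_tags = ["B-OPENING", "I-OPENING", "PUNCT", "B-BRAND",
             "B-PRODUCT", "I-PRODUCT", "KEEP", "KEEP", "PUNCT"]
template = build_template(tokens, word_tags)
```

In this toy example the opening phrase, brand word and product word become placeholders, while the reserved words and punctuation survive unchanged.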
On the basis of any of the above embodiments, after the step of extracting the material text corresponding to each tag group from the advertisement text based on the word tags in the tag sequence, the method includes the following steps:
and storing the mapping relation data of the group tags of the tag group and the material texts thereof in a material library corresponding to the attribution type of the commodity described by the advertisement text, wherein the group tags comprise group tags for indicating the material texts corresponding to the tag group as advertisement sentence patterns and group tags for indicating the material texts corresponding to the tag group as core words, and the core words are product words and/or brand words of the commodity.
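As a rough sketch of this storage step, the material library can be modelled as a mapping from commodity type to group tag to material texts. The in-memory dictionary, the `B-`/`I-` tag names, the `footwear` category, and joining participles with spaces are all assumptions; a real system would presumably use a persistent store.

```python
from typing import Dict, List, Sequence, Tuple

MaterialLibrary = Dict[str, Dict[str, List[str]]]


def extract_materials(tokens: Sequence[str], word_tags: Sequence[str]) -> List[Tuple[str, str]]:
    """Collect (group tag, material text) pairs from B-/I- tagged spans.

    An "I-" tag that does not continue an open span of the same group is ignored.
    """
    materials: List[Tuple[str, str]] = []
    current_group, current_tokens = None, []
    for token, word_tag in zip(tokens, word_tags):
        prefix, _, group = word_tag.partition("-")
        if prefix == "I" and group == current_group:
            current_tokens.append(token)
            continue
        if current_group is not None:
            materials.append((current_group, " ".join(current_tokens)))
        if prefix == "B":
            current_group, current_tokens = group, [token]
        else:
            current_group, current_tokens = None, []
    if current_group is not None:
        materials.append((current_group, " ".join(current_tokens)))
    return materials


def store_materials(library: MaterialLibrary, category: str,
                    materials: Sequence[Tuple[str, str]]) -> None:
    """File each material text under its commodity type and group tag, without duplicates."""
    shelf = library.setdefault(category, {})
    for group, text in materials:
        texts = shelf.setdefault(group, [])
        if text not in texts:
            texts.append(text)


tokens = ["Big", "sale", "!", "Acme", "running", "shoes", "for", "you", "."]
word_tags = ["B-OPENING", "I-OPENING", "PUNCT", "B-BRAND",
             "B-PRODUCT", "I-PRODUCT", "KEEP", "KEEP", "PUNCT"]
materials = extract_materials(tokens, word_tags)
library: MaterialLibrary = {}
store_materials(library, "footwear", materials)
```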
On the basis of any of the above embodiments, after the step of constructing the document template according to the tag sequence, the method includes the following steps:
responding to an advertisement copy generation request submitted by a terminal device, and acquiring the commodity core word specified by the request, wherein the commodity core word comprises a product word and/or a brand word of the target commodity corresponding to the request;
replacing the group tags that indicate core words in the document template with the commodity core word;
replacing each group tag that indicates an advertisement sentence pattern in the document template with any material text corresponding to that group tag in the material library corresponding to the attribution type of the target commodity;
and pushing the advertisement copy obtained by performing the replacements on the document template to the terminal device.
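The replacement steps can be sketched as follows, leaving out the request handling and the push to the terminal device. The bracketed placeholder format, the group names, and the random choice among material texts (one reading of "any material text") are assumptions.

```python
import random
from typing import Dict, List, Optional, Sequence

CORE_GROUPS = {"PRODUCT", "BRAND"}  # hypothetical names of the core-word group tags


def fill_template(
    template: Sequence[str],
    core_words: Dict[str, str],
    shelf: Dict[str, List[str]],
    rng: Optional[random.Random] = None,
) -> str:
    """Replace group tags with requested core words or stored material texts."""
    rng = rng or random.Random()
    filled: List[str] = []
    for item in template:
        if item.startswith("[") and item.endswith("]"):
            group = item[1:-1]
            if group in CORE_GROUPS:
                # Core-word group tags take the words specified in the request.
                filled.append(core_words[group])
            else:
                # Sentence-pattern group tags take any material text of that group.
                filled.append(rng.choice(shelf[group]))
        else:
            filled.append(item)
    return " ".join(filled)


template = ["[OPENING]", "!", "[BRAND]", "[PRODUCT]", "for", "you", "."]
requested_core_words = {"BRAND": "Zenith", "PRODUCT": "hiking boots"}
footwear_shelf = {"OPENING": ["Big sale"]}  # materials filed for the commodity's type
advertisement_copy = fill_template(template, requested_core_words, footwear_shelf)
```

Joining with spaces keeps the sketch short; real copy would need language-aware detokenization.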
In another aspect, a document template generation apparatus is provided, which includes a text acquisition module, a tag deconstruction module, a material extraction module, and a template construction module. The text acquisition module is configured to acquire an advertisement text of a commodity; the tag deconstruction module is configured to deconstruct the advertisement text into a tag sequence according to a preset tag structure, wherein the tag structure comprises a plurality of tag groups and each tag group comprises at least one word tag; the material extraction module is configured to extract the material text corresponding to each tag group from the advertisement text based on the word tags in the tag sequence; and the template construction module is configured to construct a document template according to the tag sequence, wherein some tag groups are represented in the document template as corresponding replaceable group tags, and the other tag groups are retained and represented as their corresponding material texts.
On the basis of any of the above embodiments, the tag deconstruction module includes: the semantic processing submodule is used for acquiring semantic vectors of all the participles in the advertisement text; the part-of-speech processing submodule is used for acquiring part-of-speech vectors of all the participles in the advertisement text; the vector synthesis submodule is used for constructing the semantic vector and the part-of-speech vector of each participle into a synthesis vector; and the part-of-speech determining submodule is used for determining corresponding word labels of the participles in a preset label structure based on the comprehensive vectors of the participles to obtain a label sequence corresponding to the advertisement text.
On the basis of any of the above embodiments, the part-of-speech processing sub-module includes: a part-of-speech extraction unit, which is used for inputting the advertisement text into a part-of-speech extractor to obtain part-of-speech marks of each participle; and the vector embedding unit is used for inquiring a preset part-of-speech coding table, determining part-of-speech vectors corresponding to the participles, orderly arranging the part-of-speech coding table according to the statistical frequency of different parts-of-speech in the reference advertisement text set, and determining the part-of-speech vectors corresponding to the parts-of-speech by applying a one-hot coding algorithm.
On the basis of any of the above embodiments, the tag structure includes a tag group indicating that the following types of information correspond to: advertisement sentence pattern, core word, reserved word, punctuation mark, wherein,
the label group of the advertisement sentence pattern comprises label groups corresponding to an opening sentence pattern, a discount information sentence pattern and a closing sentence pattern, wherein each label group comprises a word label used for indicating that a corresponding participle belongs to a starting position or a non-starting position;
the tag groups of the core words comprise tag groups corresponding to product words and brand words, wherein each tag group comprises a word tag used for indicating that a corresponding word belongs to a starting position or a non-starting position;
the label group corresponding to the reserved words and punctuation marks comprises single corresponding word labels.
On the basis of any of the above embodiments, the template construction module includes: the reservation processing submodule is used for reserving word segmentation corresponding to the word tags in the tag group belonging to the reservation type in the advertisement text; and the replacement processing submodule is used for replacing and representing the word segmentation set corresponding to each continuously-occurring word label in the label group belonging to the replaceable type in the advertisement text as the corresponding group label.
On the basis of any of the above embodiments, the material extraction module includes: and the material filing module is used for storing the mapping relation data of the group tags of the tag group and the material texts thereof in a material library corresponding to the attribution type of the commodity described by the advertisement text, wherein the group tags comprise a group tag used for indicating the material text corresponding to the tag group as an advertisement sentence pattern and a group tag used for indicating the material text corresponding to the tag group as a core word, and the core word is a product word and/or a brand word of the commodity.
On the basis of any of the above embodiments, the template construction module includes: a request response module, configured to respond to an advertisement copy generation request submitted by a terminal device and acquire the commodity core word specified by the request, wherein the commodity core word comprises a product word and/or a brand word of the target commodity corresponding to the request; a core word replacing module, configured to replace the group tags that indicate core words in the document template with the commodity core word; an advertisement replacing module, configured to replace each group tag that indicates an advertisement sentence pattern in the document template with any material text corresponding to that group tag in the material library corresponding to the attribution type of the target commodity; and a copy pushing module, configured to push the advertisement copy obtained by performing the replacements on the document template to the terminal device.
In yet another aspect, a computer device is provided, which includes a central processing unit and a memory, the central processing unit being configured to invoke a computer program stored in the memory to perform the steps of the document template generation method described in the present application.
In another aspect, a computer-readable storage medium is provided, which stores, in the form of computer-readable instructions, a computer program implemented according to the document template generation method; when the computer program is invoked by a computer, it executes the steps included in the method.
In a further aspect, a computer program product is provided, which comprises computer program/instructions, which when executed by a processor, implement the steps of the method as described in any one of the embodiments of the present application.
Compared with the prior art, the application has several advantages, including at least the following. When a document template is constructed from a given advertisement text, a preset tag structure is used that embodies the hierarchy and granularity relationship between tag groups and the word tags under them, so the material text corresponding to each tag group is accurately extracted from the advertisement text. When the corresponding document template is then constructed, some tag groups are treated as replaceable and represented as group tags, which act as masks that can be replaced with other variable content; the remaining tag groups are represented as their corresponding material texts, which retains the original content of the advertisement text. The text content of the advertisement text is thereby selectively kept, and the generated document template both contains the highlights of the advertisement text and makes it easy to replace the group tags of the replaceable tag groups with the corresponding information of the commodity for which advertisement copy is to be generated. Guided by the hierarchy and granularity of the tag structure, the document template is more accurate and effective and can be used to produce advertisement copy in batches, which improves the efficiency of generating the advertisement copy that merchants need and strengthens the online advertising service capability of the e-commerce platform.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flow chart of an exemplary embodiment of the document template generation method of the present application.
Fig. 2 is a schematic diagram of an exemplary tag structure of the present application.
Fig. 3 is a schematic network structure diagram of an exemplary deep learning model of the present application.
Fig. 4 is a flowchart illustrating a process of deconstructing advertisement text into a tag sequence in an embodiment of the present application.
Fig. 5 is a flowchart illustrating a process of obtaining a part-of-speech vector in an embodiment of the present application.
Fig. 6 is a flowchart illustrating a process of converting advertisement text into a document template in an embodiment of the present application.
Fig. 7 is a flowchart illustrating a process of generating advertisement copy from a document template in an embodiment of the present application.
Fig. 8 is a schematic block diagram of the document template generation apparatus of the present application.
Fig. 9 is a schematic structural diagram of a computer device used in the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, "client," "terminal," and "terminal device" as used herein include both devices that have only a wireless signal receiver with no transmit capability and devices that have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device, such as a personal computer or tablet, with a single-line display, a multi-line display, or no multi-line display; a PCS (Personal Communications Service) device, which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; or a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client" or "terminal device" can be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. The "client" or "terminal device" used herein may also be a communication terminal, an Internet access terminal, or a music/video playing terminal, for example a PDA, an MID (Mobile Internet Device), and/or a mobile phone with a music/video playing function, and may also be a smart television, a set-top box, or a similar device.
The hardware referred to by names such as "server", "client", and "service node" is essentially an electronic device with the capabilities of a personal computer: a hardware device having the components required by the von Neumann architecture, such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, and an output device. A computer program is stored in the memory; the central processing unit loads the program from external memory into internal memory, executes its instructions, and interacts with the input and output devices to complete a specific function.
It should be noted that the concept of "server" as referred to in this application can be extended to the case of a server cluster. According to the network deployment principles understood by those skilled in the art, the servers are divided logically; physically, they may be independent of each other but callable through an interface, or may be integrated into one physical computer or one computer cluster. Those skilled in the art will appreciate such variations, which should not be taken to restrict how the network deployment of the present application is implemented.
Unless expressly specified otherwise, one or more technical features of the present application may be deployed on a server and accessed by a client that remotely invokes an online service interface provided by the server, or may be deployed and run directly on the client.
Unless expressly specified otherwise, a neural network model referred to, or possibly referred to, in the application may be deployed on a remote server and called remotely by a client, or may be deployed on a client with sufficient device capability and called directly.
Unless expressly specified otherwise, the various data referred to in the present application may be stored remotely on a server or locally on a terminal device, as long as the data can be called by the technical solution of the present application.
Those skilled in the art will understand that, although the various methods of the present application are described on the basis of the same concept and therefore share common features, they may be performed independently unless otherwise specified. Likewise, every embodiment disclosed in the present application is proposed on the basis of the same inventive concept, so identically worded concepts, and concepts whose wording differs only for convenience of description, should be understood in the same way.
Unless a mutually exclusive relationship between the relevant technical features is expressly stated, the embodiments disclosed herein may be flexibly constructed by combining technical features across embodiments, as long as the combination does not depart from the inventive spirit of the present application and can meet a need of the prior art or remedy a deficiency of the prior art. Those skilled in the art will appreciate such variations.
The document template generation method of the present application can be implemented as a computer program product and deployed to run on a client or a server. For example, in an exemplary application scenario, it is deployed on a server of an e-commerce platform, so that the method can be executed by accessing the interface that the computer program product exposes once it is running and interacting with its process through a graphical user interface.
Referring to fig. 1, the method for generating a document template of the present application, in an exemplary embodiment thereof, includes the following steps:
Step S1100, acquiring an advertisement text of a commodity;
An exemplary application scenario is to run a computer program implemented according to the present application on an e-commerce platform in order to create document templates.
The advertisement text may adopt a text with relatively long content, and may include content corresponding to any one or more of an opening sentence, a discount information sentence, a description sentence, an ending sentence, and the like, and may include a core word of a commodity corresponding to the advertisement text in the description sentence, where the core word may be a product word representing a name of the commodity, a brand word representing a brand to which the commodity belongs, or both.
The advertisement text can be commodity propaganda content collected from historically delivered commodity advertisements, can be obtained by manual pre-collection, can be obtained by identifying the text content in a commodity advertisement picture containing an image of the commodity propaganda content by means of a text identification model, and can also be obtained by manual independent writing.
In one embodiment, previously delivered advertisement copy can be retrieved directly from the advertisement copy database of the advertisement delivery system of an e-commerce platform, and the advertisement texts in that copy can be extracted for use. In a further embodiment, the advertisement copy may be screened by its advertisement performance index data so that only better-performing copy is used; for example, one or more advertisement texts from copy whose performance index data is higher than a preset threshold are used for making a document template. The advertisement performance index data may be the conversion rate of the corresponding commodity after the advertisement is delivered, or another similar index.
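The threshold screening in the further embodiment amounts to a simple filter, sketched below. The `HistoricalCopy` record, its field names, and the sample conversion rates are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class HistoricalCopy(NamedTuple):
    """A previously delivered advertisement copy with one performance index."""
    text: str
    conversion_rate: float


def select_high_performing_texts(copies: Sequence[HistoricalCopy], threshold: float) -> List[str]:
    """Keep the advertisement texts whose index is strictly above the threshold."""
    return [copy.text for copy in copies if copy.conversion_rate > threshold]


historical_copies = [
    HistoricalCopy("Big sale! Acme running shoes for you.", 0.042),
    HistoricalCopy("Shoes available.", 0.008),
]
selected = select_high_performing_texts(historical_copies, threshold=0.02)
```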
In another embodiment, given one or more keywords from a merchant user, the semantic vector of the keywords can be matched for similarity against the semantic vectors of the advertisement texts in the historical advertisement copy, and the one or more advertisement texts with the highest similarity are selected for making the document template. Using semantic similarity to select the advertisement texts that match the merchant user's keywords yields reference content closer to the merchant user's intent, so the finally generated document template better matches the user's needs.
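A minimal sketch of this similarity-based selection follows. It assumes cosine similarity as the matching measure and uses toy 2-dimensional vectors; the text does not name the measure or the model that produces the semantic vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar_texts(
    keyword_vector: Sequence[float],
    candidates: Sequence[Tuple[str, Sequence[float]]],
    top_k: int = 1,
) -> List[str]:
    """Return the top_k advertisement texts ranked by similarity to the keyword vector."""
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(keyword_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Toy semantic vectors standing in for the output of a real embedding model.
keyword_vector = [1.0, 0.0]
candidates = [
    ("Shoes on sale", [0.9, 0.1]),
    ("New phone launch", [0.1, 0.9]),
    ("Boots clearance", [0.7, 0.7]),
]
best_matches = most_similar_texts(keyword_vector, candidates, top_k=2)
```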
In still another embodiment, one or more historical advertisement documents of the same type of goods can be obtained according to a goods given by a merchant user, and advertisement texts in the historical advertisement documents are used for making text templates. Specifically, after the merchant user selects a commodity needing to generate the advertisement text, according to the category to which the commodity belongs, the historical advertisement copy of one or more commodities corresponding to the category can be selected from the historical advertisement copy, and the advertisement text in the historical advertisement copy is used for making the text template.
As such, one skilled in the art may obtain one or more existing advertisement texts associated with the goods for use in making the text template.
Step S1200, deconstructing the advertisement text into a label sequence according to a preset label structure, wherein the label structure comprises a plurality of label groups, and each label group comprises at least one word label;
For any of the advertisement texts, one or more document templates may be generated accordingly. First, the advertisement text is deconstructed into a tag sequence according to a preset tag structure, combining part-of-speech and semantic information.
The tag structure, in one embodiment, as shown in fig. 2, includes tag groups corresponding to the following types of information: advertisement sentence patterns, core words, reserved words, and punctuation marks. Each tag group may include one or more word tags.
The advertisement sentence pattern mainly serves to convey the advertising message to readers and may comprise one or more tag groups as needed. It can be subdivided into several specific sentence patterns according to the discourse structure of the advertisement text, for example any one or more of an opening sentence pattern, a discount information sentence pattern, and an ending sentence pattern; those skilled in the art can flexibly add other sentence patterns not listed here. Each specific sentence pattern has its own tag group, for example separate tag groups for the opening, discount information, and ending sentence patterns. Each tag group of an advertisement sentence pattern may include, for example, two word tags, indicating respectively that the corresponding participle is at the start position or at a non-start position of the sentence pattern.
For example, the tag group of the opening sentence pattern may include an opening start-position word tag {B-Startwords} and an opening non-start-position word tag {I-Startwords}; the tag group of the ending sentence pattern may include an ending start-position word tag {B-Endwords} and an ending non-start-position word tag {I-Endwords}; the tag group of the discount information sentence pattern may include an information start-position word tag {B-Sale} and an information non-start-position word tag {I-Sale}. It will be appreciated that when a word tag beginning with B is followed in a tag sequence by one or more word tags of the same type beginning with I (that is, with the same suffix Startwords, Endwords, or Sale), the participles carrying these tags can be treated together as the text corresponding to one tag group.
In terms of function, the opening sentence pattern generally plays a guiding role in the promotion, attracting the user's attention as much as possible with eye-catching wording; the discount information sentence pattern is generally the text that promotes the commodity's discount so as to draw customers; the ending sentence pattern is generally the text that strengthens the user's confidence in purchasing the commodity. These functions are given for reference only; those skilled in the art can define other specific contents and functions, as long as corresponding word tags are set following the structure of the advertisement sentence pattern tag groups.
The core word indicates the brand and/or name of a commodity. Accordingly, a single tag group may be defined to cover both brand words and product words; alternatively, the core word may be subdivided into brand words and product words, with a separate tag group for each. In either case, each tag group includes word tags indicating whether the corresponding participle is at the start position or at a non-start position.
Similarly, the tag group of the brand words may include a brand word start-position word tag {B-Brand} and a brand word non-start-position word tag {I-Brand}; the tag group of the product words may include a product word start-position word tag {B-Product} and a product word non-start-position word tag {I-Product}. It will be appreciated that when a word tag beginning with B is followed in a tag sequence by one or more word tags of the same type beginning with I (that is, with the same suffix Product or Brand), the participles carrying these tags can be treated together as the text corresponding to one tag group.
In addition to the tag groups described above and their word tags, the tag groups corresponding to reserved words and punctuation marks may each include a single word tag. For example, the single word tag of the reserved-word tag group may be denoted {O}, meaning that the participle it marks is independent of the other content and may subsequently be retained in the document template; the single word tag of the punctuation tag group may be denoted {Punc}, meaning that the corresponding participle is a punctuation mark, which may likewise be retained in the document template.
According to the above tag structure, the advertisement text may be deconstructed into a corresponding tag sequence. Deconstruction maps the advertisement text, participle by participle, to word tags, and the word tags are arranged in the order in which their participles appear in the advertisement text to form the tag sequence.
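One possible way to hold such a tag structure in code is sketched below. The group names follow the tags used in this description; the ordering of the labels, and therefore the index assigned to each, is an assumption. With five positional tag groups plus the two single-tag groups there are twelve word tags, which agrees with the label index range [0, 11] stated for the loss function.

```python
# Tag groups whose word tags carry a B- (start) or I- (non-start) position prefix.
POSITIONAL_GROUPS = ["Startwords", "Sale", "Endwords", "Brand", "Product"]
# Tag groups that consist of a single word tag.
SINGLE_TAG_GROUPS = ["O", "Punc"]  # reserved word, punctuation mark


def build_word_labels():
    """Enumerate every word tag of the tag structure in a fixed (assumed) order."""
    labels = []
    for group in POSITIONAL_GROUPS:
        labels.append("B-" + group)
        labels.append("I-" + group)
    labels.extend(SINGLE_TAG_GROUPS)
    return labels


WORD_LABELS = build_word_labels()
LABEL_TO_INDEX = {label: index for index, label in enumerate(WORD_LABELS)}


def group_of(word_label):
    """Return the group tag of a word tag, e.g. 'I-Sale' -> 'Sale', 'Punc' -> 'Punc'."""
    if word_label[:2] in ("B-", "I-"):
        return word_label[2:]
    return word_label
```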
The advertisement text may be deconstructed in a number of ways, such as:
in one embodiment, the advertisement text is subjected to rule matching based on a preset dictionary, and corresponding word labels are determined according to part-of-speech marked by matched participles in the dictionary, so that the label sequence is obtained;
in another embodiment, a deep learning model suited to sequence labeling tasks and trained in advance to convergence according to the tag structure labels the advertisement text and directly yields the tag sequence, which contains the word tag assigned to each participle of the advertisement text at the corresponding position according to the tag structure. The deep learning model may be implemented with a neural network model such as LSTM + CRF or BERT; those skilled in the art can choose flexibly according to the principle disclosed in this embodiment, provided that a sufficient number of training samples is used to train the model to convergence so that it can generate the tag sequence for an advertisement text.
It should be noted that the participle indicated by each word tag may differ according to the language of the advertisement text. For languages written in the Latin alphabet, such as English, a participle may be a word; for languages such as Chinese, a participle may be a single character or a multi-character word obtained by segmenting the advertisement text.
Step S1300, extracting material texts corresponding to all tag groups from the advertisement text based on the word tags in the tag sequence;
After the tag sequence is obtained, it contains one word tag for each participle position in the advertisement text. By mapping the tag sequence against the advertisement text, each participle can be assigned to a tag group of the preset tag structure, and the advertisement text can thus be decomposed into a plurality of material texts.
By way of example, for the advertisement text "NEW LISTING IN OUR SHOP. Choosing The High-quality T-shirt Will Make You More Comfortable! Get yours here!" (reference translation: "New products on the shelf in our shop. A high-quality T-shirt makes you more comfortable! Buy one quickly!"), the tag sequence eg_label_sequence obtained after deconstruction is as follows:
{B-Startwords;I-Startwords;I-Startwords;I-Startwords;I-Startwords;Punc;O;O;O;B-Product;O;O;O;O;O;Punc;B-Endwords;I-Endwords;I-Endwords;Punc}
Mapping the tag sequence against the advertisement text gives the word tag of each participle in the advertisement text, as follows:
the opening sentence pattern "NEW LISTING IN OUR SHOP." corresponds to the word tags:
“NEW”{B-Startwords}
“LISTING”{I-Startwords}
“IN”{I-Startwords}
“OUR”{I-Startwords}
“SHOP”{I-Startwords}
“.”{Punc}
the middle sentence, made up of reserved words and a core word, "Choosing The High-quality T-shirt Will Make You More Comfortable!" corresponds to the word tags:
“Choosing”{O}
“The”{O}
“High-quality”{O}
“T-shirt”{B-Product}
“Will”{O}
“Make”{O}
“You”{O}
“More”{O}
“Comfortable”{O}
“!”{Punc}
the ending sentence pattern "Get yours here!" corresponds to the word tags:
“Get”{B-Endwords}
“yours”{I-Endwords}
“here”{I-Endwords}
“!”{Punc}
From the above mapping results, the material text "NEW LISTING IN OUR SHOP" of the opening sentence pattern is obtained by joining into a character string the participles whose word tags {B-Startwords} and {I-Startwords} belong to the tag group of the opening sentence pattern. Similarly, joining the participles whose word tags {B-Endwords} and {I-Endwords} belong to the tag group of the ending sentence pattern gives the material text "Get yours here" of the ending sentence pattern. The participle tagged {B-Product} likewise yields a material text, and, if necessary, the participles tagged {O} can also be assembled into a material text. It is understood that each material text corresponds to the tag group to which the word tags of its participles belong.
In an embodiment improved on the above basis, after the material texts are determined, mapping relationship data between the group tag of each tag group and its material text may further be stored in the material library corresponding to the category of the commodity described in the advertisement text. The group tags include group tags indicating that the material text of the tag group is an advertisement sentence pattern, and group tags indicating that it is a core word, the core word being a product word and/or a brand word of the commodity.
For example, a group tag may be preset for each tag group in the tag structure, such as the group tag {Startwords} for the opening sentence pattern, {Sale} for the discount information sentence pattern, {Endwords} for the ending sentence pattern, {Brand} for brand words, and {Product} for product words; group tags of other tag groups may be set according to actual needs.
According to the preset group tags, for the above example, a mapping relationship may be established between each material text and its group tag to obtain mapping relationship data. For example, mapping relationship data may be constructed at least for the opening sentence pattern and the ending sentence pattern as follows:
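The grouping of participles into material texts can be sketched as a single pass over the tag sequence. This is an illustrative reading of the B-/I- convention described above rather than the application's own implementation; `extract_materials` is a hypothetical helper, and joining participles with a space is an assumption that suits English text.

```python
def extract_materials(tokens, labels):
    """Collect (group tag, material text) pairs from B-/I- tagged participles."""
    materials = []
    current_group, current_tokens = None, []

    def flush():
        nonlocal current_group, current_tokens
        if current_group is not None:
            materials.append((current_group, " ".join(current_tokens)))
        current_group, current_tokens = None, []

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()  # a B- tag always opens a new span
            current_group, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_group == label[2:]:
            current_tokens.append(token)  # continue the open span
        else:
            flush()  # O, Punc, or a stray I- tag closes any open span
    flush()
    return materials


tokens = ("NEW LISTING IN OUR SHOP . Choosing The High-quality T-shirt "
          "Will Make You More Comfortable ! Get yours here !").split()
labels = ("B-Startwords;I-Startwords;I-Startwords;I-Startwords;I-Startwords;Punc;"
          "O;O;O;B-Product;O;O;O;O;O;Punc;"
          "B-Endwords;I-Endwords;I-Endwords;Punc").split(";")
materials = extract_materials(tokens, labels)
```

For the example advertisement text this produces three material texts, one each for the opening sentence pattern, the product word, and the ending sentence pattern.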
{Startwords}“NEW LISTING IN OUR SHOP”
{Endwords}“Get yours here”
The mapping relationship data is then stored in a preset material library so that it can later be retrieved as basic material for generating new advertisement copy.
The material library can be constructed according to the category of the commodity described by the advertisement text; that is, a material library is provided for each category in the commodity category system of the e-commerce platform. When material texts are extracted from the advertisement text of a commodity and the mapping relationship data is obtained, the data is stored in the material library of that commodity's category; later, when advertisement copy is to be generated for a commodity, material texts are retrieved from the material library of its category. Organizing material libraries by commodity category makes it easy to obtain semantically more suitable material texts, and because the material texts are finer-grained, automatically generated advertisement copy can match the commodity more accurately in meaning.
Step S1400, constructing a document template according to the tag sequence, wherein some tag groups in the document template are represented as their corresponding replaceable group tags, and the other tag groups are retained and represented as their corresponding material texts.
Further, a document template can be constructed from the tag sequence. The document template is divided into two parts according to whether its content can be replaced. The first part acts as a mask and is represented by the group tags of the tag groups of replaceable type; the second part directly quotes the original advertisement text, so that the original text of the tag groups to be retained is preserved.
Following the above example, the tag groups of the opening sentence pattern, the ending sentence pattern, the discount information sentence pattern, and the core words (product words and/or brand words) may be designated in advance as replaceable, so that in the document template they are represented by their group tags at the positions where they occur in the advertisement text. The tag groups of reserved words and punctuation marks may be designated in advance as tag groups to be retained, so that the original text of the advertisement text is kept at those positions. Thus, for the specific advertisement text above, the following document template is obtained:
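A per-category material library of this kind might be organized as below. The class name `MaterialLibrary`, its methods, and the category name "apparel" are hypothetical; an in-memory dictionary stands in for whatever storage the platform actually uses.

```python
from collections import defaultdict


class MaterialLibrary:
    """In-memory store: commodity category -> group tag -> list of material texts."""

    def __init__(self):
        self._store = defaultdict(lambda: defaultdict(list))

    def add(self, category, group_tag, material_text):
        """Record one (group tag, material text) mapping, skipping duplicates."""
        texts = self._store[category][group_tag]
        if material_text not in texts:
            texts.append(material_text)

    def get(self, category, group_tag):
        """Return a copy of the material texts stored for a category and group tag."""
        return list(self._store.get(category, {}).get(group_tag, []))


library = MaterialLibrary()
# "apparel" is an assumed category name for the T-shirt example.
library.add("apparel", "Startwords", "NEW LISTING IN OUR SHOP")
library.add("apparel", "Endwords", "Get yours here")
library.add("apparel", "Endwords", "Get yours here")  # duplicate, ignored
```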
{Startwords}. Choosing The High-quality {Product} Will Make You More Comfortable! {Endwords}!
It should be noted that, when generating the document template, the punctuation tag group may alternatively be treated as a replaceable tag group and represented in the document template by its group tag, so that punctuation marks can be adjusted flexibly to suit the intended tone.
After the document template is obtained, it can be reused subsequently: at each position represented by a group tag, the group tag is replaced with a material text of that group tag retrieved from the material library, thereby generating the corresponding advertisement copy.
According to the above embodiments, the present application has various advantages, including at least the following. When a document template is constructed from a given advertisement text, the preset tag structure, which embodies the hierarchy and granularity relationship between tag groups and the word tags under them, is used to accurately extract the material text corresponding to each tag group from the advertisement text. In constructing the document template, some tag groups are treated as replaceable and represented by their group tags, which act as masks that can be replaced with other variable content; the remaining tag groups are represented by their material texts, so that the original content of the advertisement text is retained. The textual content of the advertisement text is thereby selectively kept: the generated document template both contains the highlight expressions of the advertisement text and makes it easy to replace the group tags of the replaceable tag groups with the corresponding information of the commodity for which advertisement copy is to be generated. Guided by the hierarchy and granularity of the tag structure, the document template is more accurate and effective, can be used to produce advertisement copy in batches, improves the efficiency of generating the advertisement copy merchants need, and improves the online advertising technical service capability of the e-commerce platform.
On the basis of any of the above embodiments, a deep learning model may be built to deconstruct the advertisement text into its tag sequence. The network architecture of an exemplary deep learning model is shown in fig. 3. A text feature extraction model extracts a semantic vector vector1 for each participle in the advertisement text, and a part-of-speech extractor extracts a part-of-speech vector vector2 for each participle. A concatenation layer then joins the semantic vector and the part-of-speech vector of each participle into a comprehensive vector [vector1, vector2], combining the participle's semantic information and part-of-speech information. The comprehensive vector is passed through the fully connected layers of a classification network to perform classification mapping: the classification probability of each participle for each word tag in the preset tag structure is calculated, and the word tag with the maximum classification probability is taken as the word tag of that participle.
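The template construction for the example can be sketched as follows, assuming the same whitespace tokenization as the tag-mapping example. `build_template` and `render` are hypothetical helpers; which groups are replaceable is passed in, so the punctuation group can optionally be masked as well. The rendering inserts spaces between parts, so the result matches the template above up to spacing.

```python
REPLACEABLE_GROUPS = frozenset({"Startwords", "Sale", "Endwords", "Brand", "Product"})
PUNCTUATION = {".", ",", "!", "?", ";", ":"}


def build_template(tokens, labels, replaceable=REPLACEABLE_GROUPS):
    """Mask replaceable tag groups with their group tags; keep other participles."""
    parts = []
    for token, label in zip(tokens, labels):
        group = label[2:] if label[:2] in ("B-", "I-") else label
        if group in replaceable:
            placeholder = "{" + group + "}"
            # Adjacent participles of the same group share one group tag.
            if not parts or parts[-1] != placeholder:
                parts.append(placeholder)
        else:
            parts.append(token)
    return parts


def render(parts):
    """Join parts with spaces, attaching punctuation marks to the preceding part."""
    text = ""
    for part in parts:
        if text and part not in PUNCTUATION:
            text += " "
        text += part
    return text


tokens = ("NEW LISTING IN OUR SHOP . Choosing The High-quality T-shirt "
          "Will Make You More Comfortable ! Get yours here !").split()
labels = ("B-Startwords;I-Startwords;I-Startwords;I-Startwords;I-Startwords;Punc;"
          "O;O;O;B-Product;O;O;O;O;O;Punc;"
          "B-Endwords;I-Endwords;I-Endwords;Punc").split(";")
template = render(build_template(tokens, labels))
# Variant in which punctuation marks are also treated as replaceable.
template_with_punc = render(build_template(tokens, labels, REPLACEABLE_GROUPS | {"Punc"}))
```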
The text feature extraction model may be implemented with a neural network model suited to sequence modeling, including but not limited to mature models such as LSTM, Transformer, and BERT. The part-of-speech extractor may be implemented with algorithms of the generalized Markov model family, including the hidden Markov model (HMM), the maximum entropy Markov model (MEMM), and conditional random fields (CRFs), or with deep learning algorithms represented by the recurrent neural network (RNN). In addition, some traditional machine learning classifiers, such as support vector machines (SVMs), may also be used for part-of-speech tagging after suitable adaptation.
In an alternative embodiment, using a combination found to perform well in testing, the text feature extraction model is implemented by a BERT model and the part-of-speech extractor is implemented by TextBlob, a text processing library for Python. TextBlob can perform many natural language processing tasks, such as part-of-speech tagging, noun phrase extraction, sentiment analysis, and text translation.
The deep learning model may be trained in advance to convergence by those skilled in the art using a sufficient number of training samples, so that it learns to generate the corresponding tag sequence from a given advertisement text. Because training is carried out through the classification network, the loss function of the deep learning model can be constructed using the following formula:
Figure BDA0003648032990000151
Figure BDA0003648032990000152
wherein c is the label index corresponding to each specific word label in the preset label structure, with a value range of [0, 11]; "·" denotes the dot product operation; [Figure BDA0003648032990000153] is the label with the highest classification probability predicted by the model; and [Figure BDA0003648032990000154] is the corresponding weight.
According to the above exemplary deep learning model, referring to fig. 4, the step S1200 of deconstructing the advertisement text into a tag sequence according to a preset tag structure includes the following steps:
step S1210, obtaining semantic vectors of each participle in the advertisement text;
The advertisement text is input into the text feature extraction model, which embeds each participle to obtain its embedding vector and then performs semantic extraction to obtain the semantic vector vector1 of each participle. The semantic vector is a vector representation of the deep semantics of the corresponding participle.
Step S1220, obtaining part-of-speech vectors of each participle in the advertisement text;
and inputting the advertisement text into the part of speech extractor, and labeling the part of speech of each participle in the advertisement text by the part of speech extractor according to the word label in the label structure so as to obtain part of speech marks corresponding to each participle in the advertisement text.
According to the part-of-speech identifier of each participle, each participle of the advertisement text can be encoded by part of speech to obtain the corresponding part-of-speech vector vector2. The specific encoding can be set flexibly; for example, the part-of-speech vector can be obtained by one-hot encoding.
Step S1230, constructing the semantic vector and the part-of-speech vector of each participle into a comprehensive vector;
To combine the semantic vector and the part-of-speech vector of each participle, as shown in fig. 3, the two vectors are directly concatenated by a concatenation layer to obtain the corresponding comprehensive vector. The comprehensive vector thus expresses both the semantics and the part of speech of the participle.
Step S1240, determining corresponding word labels of the participles in a preset label structure based on the comprehensive vectors of the participles, and obtaining a label sequence corresponding to the advertisement text.
The comprehensive vector of each participle enters the classification network. After deep feature interaction through one or more fully connected layers, it is passed to an activation layer built on the Softmax function, which calculates the classification probability of each word tag in the preset tag structure. For each participle, the word tag with the highest classification probability is taken as its word tag. Arranging the word tags of all participles in the order in which the participles occur in the advertisement text yields the tag sequence of the advertisement text.
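The concatenation and classification steps can be illustrated with a deliberately small sketch in plain Python. It is not the trained model: a single linear layer with random, untrained weights stands in for the classification network, a 4-dimensional vector stands in for the 768-dimensional semantic vector, and the label order in `WORD_LABELS` is assumed. The function name `classify_participle` is hypothetical.

```python
import math
import random

WORD_LABELS = [
    "B-Startwords", "I-Startwords", "B-Sale", "I-Sale", "B-Endwords", "I-Endwords",
    "B-Brand", "I-Brand", "B-Product", "I-Product", "O", "Punc",
]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_participle(semantic_vec, pos_vec, weights, bias, word_labels):
    """Concatenate the two vectors, apply one linear layer, then softmax and argmax."""
    combined = list(semantic_vec) + list(pos_vec)  # comprehensive vector [vector1, vector2]
    scores = [
        sum(w * x for w, x in zip(row, combined)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return word_labels[best], probs


rng = random.Random(0)
semantic_vec = [rng.uniform(-1, 1) for _ in range(4)]  # toy stand-in for a BERT vector
pos_vec = [1, 0, 0, 0, 0, 0, 0, 0]                     # one-hot part-of-speech vector
dim = len(semantic_vec) + len(pos_vec)
weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in WORD_LABELS]  # untrained
bias = [0.0] * len(WORD_LABELS)
label, probs = classify_participle(semantic_vec, pos_vec, weights, bias, WORD_LABELS)
```

Because the weights are random, the predicted label here is meaningless; the sketch only shows the data flow from the two input vectors to a probability for each of the twelve word tags.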
In one embodiment, an activation layer or a Dropout layer may be added to the classification network to improve the convergence speed and avoid overfitting, so that the deep learning model is easier to be trained to the convergence state in the training stage, and the training efficiency is improved.
According to the above embodiment, when the advertisement text is sequence-labeled according to the tag structure, the present application refers both to the semantic information and to the part-of-speech information of each participle, performs classification mapping on the comprehensive vector that combines the two, and determines the word tag of each participle in the tag structure. Sequence labeling thus combines the semantics and the part of speech of each participle rather than relying on a single kind of information, which makes the labeling result more accurate.
On the basis of any of the above embodiments, referring to fig. 5, the step S1220 of obtaining a part-of-speech vector of each participle in the advertisement text includes the following steps:
step S1221, inputting the advertisement text into a part-of-speech extractor to obtain part-of-speech identifiers of each participle;
The parts of speech can be divided in advance into several types according to the grammatical structure of natural language and the writing habits used to express commodity information in the e-commerce field. For example, the parts of speech may be defined to include nouns, verbs, adjectives, adverbs, prepositions, conjunctions, quantifiers, and others. A part-of-speech identifier is set for each type, and the part-of-speech extractor outputs these identifiers to indicate the part of speech of each participle.
Step S1222, querying a preset part-of-speech coding table, determining part-of-speech vectors corresponding to the respective participles, wherein the part-of-speech coding table is ordered according to the statistical frequency of different parts-of-speech in the reference advertisement text set, and determining the part-of-speech vectors corresponding to the respective parts-of-speech by applying a one-hot coding algorithm.
In order to obtain the part-of-speech vector corresponding to each participle according to the part-of-speech identifier determined by the part-of-speech extractor for the participle, a part-of-speech coding table can be compiled by applying a one-hot coding algorithm, so that each part-of-speech has the one-hot coding vector corresponding to the part-of-speech, and the one-hot coding vector can be used as the part-of-speech vector of the participle of the corresponding part-of-speech.
When the part-of-speech coding table is compiled, frequency statistics can be performed on the participles of each advertisement text in a reference advertisement text set to obtain the statistical frequency of each part of speech. The parts of speech are then arranged in order of statistical frequency, so that one-hot encoding effectively represents even the parts of speech with low statistical frequency.
Following the above frequency-based encoding principle, the dimension of the part-of-speech vector is 8, whereas the output vector of the BERT-based text feature extraction model has 768 dimensions. Few additional training parameters are therefore introduced, while the features extracted by BERT are well supplemented.
Accordingly, applicants provide an exemplary part-of-speech encoding table as follows:
Figure BDA0003648032990000171
Figure BDA0003648032990000181
as can be seen from the exemplary part-of-speech encoding table, each part-of-speech has its corresponding part-of-speech vector, for example, where the part-of-speech vector for nouns is 10000000 and the vector for adjectives is 00100000.
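A frequency-ordered one-hot coding table of this kind can be built as sketched below. The counts are invented for illustration, and sorting from most to least frequent is an assumption; they are chosen only so that nouns and adjectives land on the two vectors quoted above, and the positions of the other parts of speech are not taken from the application's table. `build_pos_table` is a hypothetical helper.

```python
def build_pos_table(pos_counts):
    """Order parts of speech by descending frequency and assign one-hot vectors."""
    ordered = sorted(pos_counts, key=pos_counts.get, reverse=True)
    size = len(ordered)
    return {
        pos: [1 if column == row else 0 for column in range(size)]
        for row, pos in enumerate(ordered)
    }


# Hypothetical part-of-speech counts over a reference advertisement text set.
reference_counts = {
    "noun": 800, "verb": 700, "adjective": 600, "adverb": 500,
    "preposition": 400, "conjunction": 300, "quantifier": 200, "other": 100,
}
pos_table = build_pos_table(reference_counts)
noun_code = "".join(str(bit) for bit in pos_table["noun"])
adjective_code = "".join(str(bit) for bit in pos_table["adjective"])
```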
According to the above embodiments, the parts of speech are encoded by one-hot encoding and a corresponding part-of-speech coding table is provided. After the part-of-speech extractor determines the part-of-speech identifier of each participle, the corresponding part-of-speech vector can be determined by a quick table lookup. Because the part-of-speech vectors are encoded according to the statistical frequency of the different parts of speech in a reference advertisement text set, they provide effective part-of-speech reference information for subsequently determining the comprehensive vector of each participle, and can guide the deep learning model to produce an accurately labeled tag sequence.
On the basis of any of the above embodiments, referring to fig. 6, the step S1400 of constructing the document template according to the tag sequence includes the following steps:
step S1410, reserving word segmentation corresponding to the word labels in the label group belonging to the reserved type in the advertisement text;
in the process of constructing the document template, for the tag group belonging to the type to be reserved in the preset tag structure in the tag sequence, for example, the tag group corresponding to the reserved word and the punctuation mark described in the foregoing example, on the basis of the advertisement text, the corresponding participle is reserved in the original text, that is, the participle does not need to be replaced.
Step S1420, replacing the word segmentation sets corresponding to each continuously appearing word label in the label groups belonging to the replaceable types in the advertisement text with their corresponding group labels.
In the process of constructing the document template, for the tag groups in the tag sequence that belong to the replaceable types of the preset tag structure, such as the tag groups of the opening sentence pattern, the ending sentence pattern, the discount information sentence pattern, and the core words (product words and/or brand words), the corresponding participles in the advertisement text are replaced with the group tags of their respective tag groups. Since the material text of each tag group has already been determined in the previous step, this can also be understood as replacing the material text of each tag group with its group tag. When two identical tag groups occur in succession, they may be merged into a single tag group and represented by a single group tag. The expression thus obtained constitutes the document template, in which each group tag effectively acts as a mask indicating that it can be replaced.
For punctuation in the advertisement text, as described earlier in this application, in some embodiments, word tags corresponding to punctuation can also be treated as alternative types and represented as their group tags.
According to the above embodiments, in the process of generating the document template, the original text is correspondingly retained or the original text is represented as the group tags by corresponding to the different types of tag groups, so that the structure of the document template is completed, the standardization of the document template structure is realized, and the document template can be flexibly transplanted and applied.
On the basis of any of the above embodiments, referring to fig. 7, after the step S1400 of constructing the document template according to the tag sequence, the method includes the following steps:
Step S1500, responding to an advertisement document generation request submitted by a terminal device, and acquiring a commodity core word specified by the request, wherein the commodity core word comprises a product word and/or a brand word of a target commodity corresponding to the request;
When any user of the e-commerce platform, for example a merchant user, needs to call the document template generated by this application to automatically generate an advertisement document, the user can call a corresponding interface and submit an advertisement document generation request through the terminal device. The request can carry the target commodity, specified by the merchant user, for which the advertisement document is to be generated. The target commodity can be a commodity that is on the shelf, or about to be put on the shelf, in the online shop of the merchant user, and its commodity information can be stored in a corresponding commodity database in advance.
The commodity core word of the target commodity may be determined in advance, or may be extracted from the commodity information of the target commodity by calling a preset core word extraction model. The commodity core word may include only a product word or only a brand word of the target commodity, or may include both.
In another embodiment, the user may directly provide the commodity core word in text form, in which case the commodity core word is carried directly in the request to be parsed and used.
Step S1600, correspondingly replacing the group labels used for indicating the core words in the document template with the commodity core words;
According to the embodiments disclosed above in the present application, the document template includes group tags of replaceable types, which generally include a group tag corresponding to a product word and/or a brand word. The group tag in the document template can therefore be replaced by the product word and/or the brand word of the target commodity, which completes the first stage of processing.
Step S1700, replacing a group label used for indicating an advertisement sentence pattern in the document template with any material text corresponding to the group label in a material library corresponding to the attribution type of the target commodity;
Similarly, the remaining group tags in the document template that belong to replaceable types, such as the group tags corresponding to the opening sentence pattern and the closing sentence pattern, may be replaced with prepared material texts.
As described above, the material texts may be retrieved from a material library. The material library is obtained by labeling advertisement texts in advance according to the principle of the present application and storing the extracted material texts in the material library of the category to which the commodity described by each advertisement text belongs. Each material library therefore stores various material texts usable for commodities of the same category, and each material text is associated with its group tag, so that any one or more material texts carrying a given group tag can be retrieved and filled into the document template according to that group tag.
To generate a single advertisement document, for each group tag in the document template that belongs to an advertisement sentence pattern, for example the group tags of the opening sentence pattern, the closing sentence pattern, and the discount information sentence pattern, a material text carrying that group tag is retrieved from the corresponding material library and substituted for the group tag in the document template.
When a plurality of advertisement documents need to be generated from one document template, a plurality of corresponding material texts can be retrieved from the corresponding material library for each group tag and filled into the document template respectively; those skilled in the art can adapt this flexibly.
It is understood that after the group tags of the advertisement sentence patterns have been replaced by the corresponding material texts, the corresponding advertisement document is obtained.
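The generation of several advertisement documents from one document template can be sketched as follows. This is only an illustrative sketch: the function name `generate_copies`, the `limit` parameter, and the choice to enumerate every combination of material texts are assumptions, not details given in the application.

```python
import itertools
import re
from typing import Dict, List


def generate_copies(template: str,
                    materials: Dict[str, List[str]],
                    limit: int = 10) -> List[str]:
    """Fill a template with every combination of candidate texts, up to limit.

    `materials` maps each group tag name to its candidate texts; a group tag
    that is missing from `materials` raises KeyError. Candidate texts are
    assumed not to contain braces themselves.
    """
    # Distinct group tags in order of first appearance.
    tags = list(dict.fromkeys(re.findall(r"\{(\w+)\}", template)))
    results: List[str] = []
    for combo in itertools.product(*(materials[tag] for tag in tags)):
        text = template
        for tag, value in zip(tags, combo):
            text = text.replace("{" + tag + "}", value)
        results.append(text)
        if len(results) >= limit:
            break
    return results


copies = generate_copies(
    "{Startwords}. Buy {Product}! {Endwords}!",
    {"Startwords": ["A", "B"], "Product": ["P"], "Endwords": ["X", "Y"]},
)
```

Enumerating all combinations grows quickly with the number of candidate texts, which is why the sketch caps the output; random sampling would be an equally valid choice.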
Because the material libraries are organized by commodity category, the material library corresponding to the target commodity can be determined from the category of the target commodity labeled in advance in its commodity information. Determining the material library by commodity category yields material texts from commodities of the same category as the target commodity, which are better suited to generating advertisement documents for the target commodity.
Taking the above exemplary generated document template as an example, the document template:
{Startwords}.Choosing The High-quality{Product}Will Make You More Comfortable!{Endwords}!
Assume that the material library corresponding to clothing contains the material text "HOT SUMMER IS COMING" corresponding to {Startwords} and the material text "Try it" corresponding to {Endwords}, and that the product word {Product} of the target commodity specified by the user is "Chiffon Shirt Skirt". The resulting advertisement document is then:
HOT SUMMER IS COMING.Choosing The High-quality Chiffon Shirt Skirt Will Make You More Comfortable!Try it!
(reference translation: Midsummer is coming. Choosing a high-quality chiffon shirt skirt will make you more comfortable! Give it a try!)
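The worked example above can be reproduced with a short sketch of steps S1600 and S1700. The dictionary layout of the material library, the category key `"clothing"`, and the function name `fill_template` are assumptions for illustration, and spaces have been added around the group tags in the template for readability.

```python
import random
import re
from typing import Dict, List

# Hypothetical material libraries keyed by commodity category (assumption).
MATERIAL_LIBRARIES: Dict[str, Dict[str, List[str]]] = {
    "clothing": {
        "Startwords": ["HOT SUMMER IS COMING"],
        "Endwords": ["Try it"],
    },
}


def fill_template(template: str, category: str,
                  core_words: Dict[str, str], rng=random) -> str:
    """Replace core-word group tags first, then sentence-pattern group tags."""
    library = MATERIAL_LIBRARIES[category]

    def substitute(match: "re.Match[str]") -> str:
        tag = match.group(1)
        if tag in core_words:      # step S1600: commodity core words
            return core_words[tag]
        if tag in library:         # step S1700: any material text for the tag
            return rng.choice(library[tag])
        return match.group(0)      # unknown group tags are left untouched

    return re.sub(r"\{(\w+)\}", substitute, template)


template = ("{Startwords}. Choosing The High-quality {Product} "
            "Will Make You More Comfortable! {Endwords}!")
ad_copy = fill_template(template, "clothing", {"Product": "Chiffon Shirt Skirt"})
```

With a single material text per group tag the result is deterministic; with several candidates, `rng.choice` picks one of them, matching the "any material text" wording of step S1700.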
Step S1440, pushing the advertisement document obtained after the replacements in the document template to the terminal device.
After one or more advertisement documents are obtained from the document template, they can be pushed to the terminal device for further processing by the user, for example selecting one of the advertisement documents, editing it, and then publishing it, which completes the automatic generation of advertisement documents for the terminal user.
According to this embodiment, the document template produced by the method is general-purpose and helps a user generate advertisement documents quickly. The user does not need to study in depth the text content of the advertisement documents generated automatically by the system; the user only needs to designate a target commodity and submit a corresponding advertisement document generation request to obtain an effective advertisement document whose content can be trusted. The advertisement document can then be used for commodity advertisement placement, which improves advertisement placement efficiency and improves the service functions of the e-commerce platform.
Please refer to fig. 8, which is a functional embodiment of the document template generating apparatus adapted to one of the purposes of the present application, and the apparatus includes a text obtaining module 1100, a tag deconstruction module 1200, a material extracting module 1300, and a template construction module 1400, wherein the text obtaining module 1100 is configured to obtain an advertisement text of a commodity; the tag deconstruction module 1200 is configured to deconstruct the advertisement text into a tag sequence according to a preset tag structure, where the tag structure includes a plurality of tag groups, and each tag group includes at least one word tag; the material extracting module 1300 is configured to extract a material text corresponding to each tag group from the advertisement text based on the word tags in the tag sequence; the template construction module 1400 is configured to construct a document template according to the tag sequence, where a part of tag groups in the document template are represented as corresponding replaceable group tags, and other tag groups are retained and represented as corresponding material texts.
On the basis of any of the above embodiments, the tag deconstruction module 1200 includes: the semantic processing submodule, used for acquiring semantic vectors of all the participles in the advertisement text; the part-of-speech processing submodule, used for acquiring part-of-speech vectors of all the participles in the advertisement text; the vector synthesis submodule, used for constructing the semantic vector and the part-of-speech vector of each participle into a comprehensive vector; and the part-of-speech determining submodule, used for determining the corresponding word labels of the participles in a preset label structure based on the comprehensive vectors of the participles, to obtain a label sequence corresponding to the advertisement text.
On the basis of any of the above embodiments, the part-of-speech processing submodule includes: a part-of-speech extraction unit, used for inputting the advertisement text into a part-of-speech extractor to obtain part-of-speech identifiers of each participle; and a vector embedding unit, used for querying a preset part-of-speech coding table to determine the part-of-speech vector corresponding to each participle, wherein the part-of-speech coding table is ordered by the statistical frequency of the different parts of speech in a reference advertisement text set, and the part-of-speech vector corresponding to each part of speech is determined by applying a one-hot coding algorithm.
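A frequency-ordered one-hot part-of-speech coding table, and its combination with a semantic vector, might look like the following sketch. The part-of-speech names, the function names, and in particular the use of simple concatenation to form the comprehensive vector are assumptions; the application only states that the two vectors are constructed into a comprehensive vector.

```python
from collections import Counter
from typing import Dict, List


def build_pos_table(reference_pos_tags: List[str]) -> Dict[str, List[int]]:
    """Build a one-hot table whose index order follows descending frequency."""
    counts = Counter(reference_pos_tags)
    # most_common() sorts by count; ties keep first-seen order.
    ordered = [pos for pos, _ in counts.most_common()]
    size = len(ordered)
    return {
        pos: [1 if i == index else 0 for i in range(size)]
        for index, pos in enumerate(ordered)
    }


def comprehensive_vector(semantic: List[float], pos: str,
                         table: Dict[str, List[int]]) -> List[float]:
    """Concatenate the semantic vector with the one-hot part-of-speech vector."""
    return list(semantic) + [float(x) for x in table[pos]]


# Part-of-speech identifiers observed in a hypothetical reference text set.
pos_table = build_pos_table(["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "VERB"])
vector = comprehensive_vector([0.5, 0.25], "VERB", pos_table)
```

Ordering the table by frequency only fixes which index each part of speech receives; the one-hot vectors themselves remain mutually orthogonal regardless of the order.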
On the basis of any of the above embodiments, the tag structure includes tag groups corresponding to the following types of information: advertisement sentence pattern, core word, reserved word, punctuation mark, wherein,
the label group of the advertisement sentence pattern comprises label groups corresponding to an opening sentence pattern, a discount information sentence pattern and a closing sentence pattern, wherein each label group comprises a word label used for indicating that a corresponding participle belongs to a starting position or a non-starting position;
the tag groups of the core words comprise tag groups corresponding to product words and brand words, wherein each tag group comprises a word tag used for indicating that a corresponding word belongs to a starting position or a non-starting position;
the label group corresponding to the reserved words and punctuation marks comprises single corresponding word labels.
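The tag structure described above can be written down as plain data. The sketch below is one possible encoding: the group names, the `B-`/`I-` prefixes for starting and non-starting positions, and the `replaceable` flags are assumptions chosen for illustration.

```python
from typing import Dict, List

# Hypothetical encoding of the preset tag structure (assumption).
TAG_STRUCTURE: Dict[str, Dict[str, object]] = {
    # Advertisement sentence patterns
    "Startwords": {"kind": "sentence_pattern", "positional": True, "replaceable": True},
    "Discount": {"kind": "sentence_pattern", "positional": True, "replaceable": True},
    "Endwords": {"kind": "sentence_pattern", "positional": True, "replaceable": True},
    # Core words
    "Product": {"kind": "core_word", "positional": True, "replaceable": True},
    "Brand": {"kind": "core_word", "positional": True, "replaceable": True},
    # Reserved words and punctuation marks use a single word tag each.
    # Punctuation may be treated as replaceable in some embodiments.
    "Reserved": {"kind": "reserved_word", "positional": False, "replaceable": False},
    "Punct": {"kind": "punctuation", "positional": False, "replaceable": False},
}


def word_tags(structure: Dict[str, Dict[str, object]]) -> Dict[str, List[str]]:
    """List the word tags of each tag group.

    Positional groups get a starting-position tag (B-) and a
    non-starting-position tag (I-); other groups get one tag.
    """
    tags: Dict[str, List[str]] = {}
    for group, spec in structure.items():
        if spec["positional"]:
            tags[group] = [f"B-{group}", f"I-{group}"]
        else:
            tags[group] = [group]
    return tags


ALL_WORD_TAGS = word_tags(TAG_STRUCTURE)
```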
On the basis of any of the above embodiments, the template construction module 1400 includes: the reservation processing submodule is used for reserving word segmentation corresponding to the word tags in the tag group belonging to the reservation type in the advertisement text; and the replacement processing submodule is used for replacing and representing the word segmentation set corresponding to each continuously-occurring word label in the label group belonging to the replaceable type in the advertisement text as the corresponding group label.
On the basis of any of the above embodiments, the material extracting module 1300 includes: and the material filing module is used for storing the mapping relation data of the group tags of the tag group and the material texts thereof in a material library corresponding to the attribution type of the commodity described by the advertisement text, wherein the group tags comprise a group tag used for indicating the material text corresponding to the tag group as an advertisement sentence pattern and a group tag used for indicating the material text corresponding to the tag group as a core word, and the core word is a product word and/or a brand word of the commodity.
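Archiving the extracted material texts per commodity category can be sketched as below. The nested-dictionary library, the `B-`/`I-` word tag convention, and the function names are illustrative assumptions; a production system would more likely persist the mapping relation data in a database.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Set, Tuple

Library = DefaultDict[str, DefaultDict[str, Set[str]]]


def new_library() -> Library:
    """category -> group tag -> set of material texts."""
    return defaultdict(lambda: defaultdict(set))


def extract_materials(tagged: Iterable[Tuple[str, str]],
                      stored_groups: Set[str]) -> List[Tuple[str, str]]:
    """Collect (group tag, material text) spans from a tagged advertisement text."""
    spans: List[Tuple[str, str]] = []
    current_group, current_tokens = None, []
    for token, word_tag in tagged:
        prefix, _, group = word_tag.partition("-")
        starts_new = prefix == "B" or group != current_group
        if current_group is not None and starts_new:
            # Close the span that was being collected.
            spans.append((current_group, " ".join(current_tokens)))
            current_group, current_tokens = None, []
        if group in stored_groups:
            current_group = group
            current_tokens.append(token)
    if current_group is not None:
        spans.append((current_group, " ".join(current_tokens)))
    return spans


def archive(library: Library, category: str,
            tagged: Iterable[Tuple[str, str]], stored_groups: Set[str]) -> None:
    """Store the mapping between group tags and material texts by category."""
    for group, text in extract_materials(tagged, stored_groups):
        library[category][group].add(text)


library = new_library()
archive(
    library, "clothing",
    [("HOT", "B-Startwords"), ("SUMMER", "I-Startwords"), (".", "PUNCT"),
     ("Chiffon", "B-Product"), ("Skirt", "I-Product"),
     ("Try", "B-Endwords"), ("it", "I-Endwords")],
    {"Startwords", "Endwords", "Product"},
)
```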
On the basis of any of the above embodiments, the template construction module 1400 includes: the request response module, used for responding to an advertisement document generation request submitted by the terminal device and acquiring a commodity core word specified by the request, wherein the commodity core word comprises a product word and/or a brand word of a target commodity corresponding to the request; the core word replacing module, used for correspondingly replacing the group labels used for indicating the core words in the document template with the commodity core words; the advertisement replacing module, used for replacing a group label used for indicating an advertisement sentence pattern in the document template with any material text corresponding to the group label in the material library corresponding to the attribution type of the target commodity; and the document pushing module, used for pushing the advertisement document obtained after the replacements in the document template to the terminal device.
In order to solve the technical problem, an embodiment of the present application further provides a computer device, whose internal structure is schematically illustrated in fig. 9. The computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer-readable storage medium of the computer device stores an operating system, a database, and computer-readable instructions; the database can store control information sequences, and the computer-readable instructions, when executed by the processor, can cause the processor to implement the document template generation method. The processor of the computer device provides computing and control capability and supports the operation of the whole computer device. The memory of the computer device may have stored therein computer-readable instructions that, when executed by the processor, may cause the processor to perform the document template generation method of the present application. The network interface of the computer device is used for connecting to and communicating with the terminal. Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In this embodiment, the processor is configured to execute the specific functions of each module and its submodules in fig. 8, and the memory stores the program codes and the various data required for executing these modules or submodules. The network interface is used for data transmission to and from a user terminal or a server. The memory in this embodiment stores the program codes and data necessary for executing all modules/submodules of the document template generation device of the present application, and the server can call these program codes and data to execute the functions of all the submodules.
The present application also provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the document template generation method of any of the embodiments of the present application.
The present application also provides a computer program product comprising computer programs/instructions which, when executed by one or more processors, implement the steps of the method as described in any of the embodiments of the present application.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments of the present application can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when the computer program is executed, the processes of the embodiments of the methods can be included. The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
In summary, the document template generated from the advertisement text in this application not only retains the compelling expressions of the advertisement text, but also makes it easy to replace the group tags of its replaceable tag groups with the corresponding information of the commodity for which an advertisement document is to be generated. Guided by the hierarchy and granularity of the tag structure, the document template is more accurate and effective and can be used to produce advertisement documents in batches, which improves the efficiency of generating the advertisement documents that merchants need and improves the online advertising technical service capability of the e-commerce platform.
Those of skill in the art will appreciate that the various operations, methods, steps in the processes, acts, or solutions discussed in this application can be interchanged, modified, combined, or eliminated. Further, other steps, measures, or schemes in various operations, methods, or flows that have been discussed in this application can be alternated, altered, rearranged, broken down, combined, or deleted. Further, steps, measures, schemes in the prior art having various operations, methods, procedures disclosed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A method for generating a document template is characterized by comprising the following steps:
acquiring an advertisement text of a commodity;
deconstructing the advertisement text into a tag sequence according to a preset tag structure, wherein the tag structure comprises a plurality of tag groups, and each tag group comprises at least one word tag;
extracting material texts corresponding to all the label groups from the advertisement texts based on the word labels in the label sequence;
and constructing a document template according to the label sequence, wherein part of label groups are represented as corresponding replaceable group labels in the document template, and other label groups are reserved and represented as corresponding material texts.
2. The method of claim 1, wherein deconstructing the advertisement text into a sequence of tags according to a predetermined tag structure comprises:
obtaining semantic vectors of all participles in the advertisement text;
acquiring a part-of-speech vector of each participle in the advertisement text;
constructing the semantic vector and the part-of-speech vector of each participle into a comprehensive vector;
and determining corresponding word labels of the word segments in a preset label structure based on the comprehensive vectors of the word segments to obtain a label sequence corresponding to the advertisement text.
3. The method of claim 2, wherein obtaining part-of-speech vectors of each participle in the advertisement text comprises:
inputting the advertisement text into a part-of-speech extractor to obtain part-of-speech identifiers of all the participles;
and querying a preset part-of-speech coding table to determine the part-of-speech vector corresponding to each participle, wherein the part-of-speech coding table is ordered by the statistical frequency of the different parts of speech in a reference advertisement text set, and the part-of-speech vector corresponding to each part of speech is determined by applying a one-hot coding algorithm.
4. The method of claim 1, wherein:
the tag structure includes tag groups corresponding to the following types of information: advertisement sentence pattern, core word, reserved word, punctuation mark, wherein,
the label group of the advertisement sentence pattern comprises label groups corresponding to an opening sentence pattern, a discount information sentence pattern and a closing sentence pattern, wherein each label group comprises a word label used for indicating that a corresponding participle belongs to a starting position or a non-starting position;
the tag groups of the core words comprise tag groups corresponding to product words and brand words, wherein each tag group comprises a word tag used for indicating that a corresponding word belongs to a starting position or a non-starting position;
the label group corresponding to the reserved words and punctuation marks comprises single corresponding word labels.
5. The method of claim 1, wherein constructing a document template from the sequence of tags comprises:
reserving word segmentation corresponding to word labels in a label group belonging to a reserved type in the advertisement text;
and replacing and representing word segmentation sets corresponding to all continuously-occurring word labels in the label groups belonging to the replaceable types in the advertisement texts as corresponding group labels.
6. The method of claim 1, wherein the step of extracting material text corresponding to each tag group from the advertisement text based on the word tags in the tag sequence comprises the following steps:
and storing the mapping relation data of the group tags of the tag group and the material texts thereof in a material library corresponding to the attribution type of the commodity described by the advertisement text, wherein the group tags comprise group tags for indicating the material texts corresponding to the tag group as advertisement sentence patterns and group tags for indicating the material texts corresponding to the tag group as core words, and the core words are product words and/or brand words of the commodity.
7. The method of claim 6, wherein the step of constructing a document template from the sequence of tags is followed by the steps of:
responding to an advertisement document generation request submitted by terminal equipment, and acquiring a commodity core word specified by the request, wherein the commodity core word comprises a product word and/or a brand word of a target commodity corresponding to the request;
correspondingly replacing the group labels used for indicating the core words in the document template with the commodity core words;
replacing a group label used for indicating an advertisement sentence pattern in the document template with any material text corresponding to the group label in a material library corresponding to the attribution type of the target commodity;
and pushing the advertisement document obtained after the replacements in the document template to the terminal equipment.
8. A document template generating apparatus, comprising:
the text acquisition module is used for acquiring advertisement texts of commodities;
the tag deconstruction module is used for deconstructing the advertisement text into a tag sequence according to a preset tag structure, wherein the tag structure comprises a plurality of tag groups, and each tag group comprises at least one word tag;
the material extraction module is used for extracting material texts corresponding to all the label groups from the advertisement texts on the basis of the word labels in the label sequence;
and the template construction module is used for constructing a document template according to the label sequence, wherein part of label groups in the document template are represented as corresponding replaceable group labels, and other label groups are reserved and represented as corresponding material texts.
9. A computer device comprising a central processor and a memory, characterized in that the central processor is adapted to invoke execution of a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it stores, in the form of computer-readable instructions, a computer program implemented according to the method of any one of claims 1 to 7, which, when invoked by a computer, performs the steps comprised by the corresponding method.
CN202210540521.0A 2022-05-17 2022-05-17 Method for generating document template, device, equipment, medium and product thereof Pending CN114818641A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210540521.0A CN114818641A (en) 2022-05-17 2022-05-17 Method for generating document template, device, equipment, medium and product thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210540521.0A CN114818641A (en) 2022-05-17 2022-05-17 Method for generating document template, device, equipment, medium and product thereof

Publications (1)

Publication Number Publication Date
CN114818641A true CN114818641A (en) 2022-07-29

Family

ID=82514528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210540521.0A Pending CN114818641A (en) 2022-05-17 2022-05-17 Method for generating document template, device, equipment, medium and product thereof

Country Status (1)

Country Link
CN (1) CN114818641A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination