CN117057873A - Commodity description content generation method, device, equipment and medium thereof - Google Patents

Commodity description content generation method, device, equipment and medium thereof

Info

Publication number
CN117057873A
Authority
CN
China
Prior art keywords
commodity
description
model
training
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311016697.7A
Other languages
Chinese (zh)
Inventor
郭志伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shangyan Network Technology Co ltd
Original Assignee
Guangzhou Shangyan Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shangyan Network Technology Co ltd filed Critical Guangzhou Shangyan Network Technology Co ltd
Priority to CN202311016697.7A priority Critical patent/CN117057873A/en
Publication of CN117057873A publication Critical patent/CN117057873A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/06 - Buying, selling or leasing transactions
    • G06Q30/0601 - Electronic shopping [e-shopping]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/06 - Buying, selling or leasing transactions
    • G06Q30/0601 - Electronic shopping [e-shopping]
    • G06Q30/0621 - Item configuration or customization

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a commodity description content generation method and a device, equipment, medium and product thereof. The method comprises the following steps: acquiring a commodity prompt text, wherein the commodity prompt text comprises a plurality of keywords describing selling-point information of a target commodity; inputting the commodity prompt text into a preset description generation model to obtain a commodity description text generated by the description generation model, wherein the description generation model is an instance of a large language model obtained through reinforcement learning training, and the commodity description text comprises text content that expands on the meaning of the selling-point information in the commodity prompt text; and constructing commodity description content of the target commodity according to the commodity description text, and storing the commodity description content into a commodity database. The application can accurately understand the selling-point information in the commodity prompt text and generate, end to end, text content that expands on that information, thereby improving the efficiency of commodity information generation and reducing the operating cost of online stores.

Description

Commodity description content generation method, device, equipment and medium thereof
Technical Field
The application relates to the technical field of electronic commerce information, in particular to a commodity description content generation method, a device, equipment and a medium thereof.
Background
Online stores in an e-commerce platform need to release commodity information frequently: each time a new commodity is put on the shelf, the information corresponding to that commodity must be entered, and sometimes commodity information for many commodities must be released in batches, particularly when a site is being initialized.
Entering commodity information is evidently very cumbersome. Besides basic data such as the commodity title, commodity pictures and commodity attributes, the commodity description text is one of the most important items of commodity information. The commodity description text, as the name implies, takes the form of natural-language text and describes, as required, various aspects of the commodity such as its functions, characteristics, purposes, effects, advantages and precautions, thereby introducing the commodity to consumer users and promoting transactions.
However, writing a commodity description text is not easy. A high-quality commodity description text can better prompt consumer users to place orders, so merchant users naturally want to create one for each commodity. Yet precisely because such writing is difficult, the overall efficiency with which merchant users enter commodity information is low.
With the development of artificial intelligence technology, various neural-network-based technical schemes exist in the industry for generating commodity titles. For the creation of commodity description texts, however, these schemes fall short: commodity titles are relatively short and have no strict word order, and are therefore easier to generate, whereas commodity description texts demand much tighter semantics. Simply reusing a model originally designed for title generation amounts to forcing a tool onto a task it does not fit; it cannot effectively solve the practical problem in industrial practice and cannot generate effective commodity description texts for commodity objects.
Disclosure of Invention
The application aims to provide a commodity description content generation method, together with a corresponding device, equipment and a non-volatile readable storage medium.
According to an aspect of the present application, there is provided a commodity description content generation method comprising the following steps:
acquiring a commodity prompt text, wherein the commodity prompt text comprises a plurality of keywords describing selling-point information of a target commodity;
inputting the commodity prompt text into a preset description generation model to obtain a commodity description text generated by the description generation model, wherein the description generation model is an instance of a large language model obtained through reinforcement learning training, and the commodity description text comprises text content that expands on the meaning of the selling-point information in the commodity prompt text;
and constructing commodity description content of the target commodity according to the commodity description text, and storing the commodity description content into a commodity database.
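The three claimed steps can be sketched as a minimal pipeline. The function names, the stubbed model behavior and the in-memory database below are illustrative assumptions for demonstration, not the patent's implementation:

```python
# Minimal sketch of the three claimed steps with a stubbed model.

def acquire_prompt(keywords):
    """Step 1: build a commodity prompt text from selling-point keywords."""
    return " ".join(keywords)

def generate_description(prompt_text):
    """Step 2: stand-in for the description generation model (an RL-trained LLM)."""
    return f"This product features: {prompt_text}. (expanded description)"

def store_description(database, product_id, description):
    """Step 3: construct the description content and store it in a commodity database."""
    database[product_id] = {"description": description}
    return database[product_id]

db = {}
prompt = acquire_prompt(["season-trend", "coat"])
text = generate_description(prompt)
record = store_description(db, "sku-001", text)
```

In a real deployment, `generate_description` would call the reinforcement-learning-trained large language model instance described below, and `db` would be the online store's commodity database.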
According to another aspect of the present application, there is provided a commodity description content generating apparatus comprising:
a prompt acquisition module, used for acquiring a commodity prompt text, wherein the commodity prompt text comprises a plurality of keywords describing selling-point information of a target commodity;
a description generation module, used for inputting the commodity prompt text into a preset description generation model to obtain a commodity description text generated by the description generation model, wherein the description generation model is an instance of a large language model obtained through reinforcement learning training, and the commodity description text comprises text content that expands on the meaning of the selling-point information in the commodity prompt text;
and a description release module, used for constructing commodity description content of the target commodity according to the commodity description text and storing the commodity description content into a commodity database.
According to another aspect of the present application, there is provided a commodity description content generating apparatus comprising a central processor and a memory, the central processor being configured to invoke and execute a computer program stored in the memory so as to perform the steps of the commodity description content generation method of the present application.
According to another aspect of the present application, there is provided a non-transitory readable storage medium storing, in the form of computer-readable instructions, a computer program implementing the commodity description content generation method; when executed by a computer, the program performs the steps comprised in the method.
The present application has various technical advantages over the prior art, including but not limited to the following:
Firstly, the application obtains the description generation model by performing reinforcement learning training on a large language model, so that the description generation model acquires the capability of generating commodity description texts. When a user gives a commodity prompt text, the description generation model can, according to this learned capability, understand the selling-point information of the given prompt and generate, end to end, a commodity description text that expands on that information; commodity description content is then constructed from the commodity description text and stored in a commodity database for use. This spares the user the trouble of writing the commodity description text himself, improves the editing efficiency of commodity information, and reduces the operating cost of online stores.
Secondly, because the description generation model has undergone reinforcement learning training, the commodity description text it generates for a commodity prompt text is guaranteed to comprise text content that expands on the meaning of the selling-point information in the given prompt. The obtained text content is not limited to short, discrete fragments; it carries a large and complete amount of information, meets the requirements of commodity information construction in an e-commerce platform, accurately reflects the commodity information semantically, and is therefore more practical.
In addition, for the e-commerce platform as a whole, opening the technical scheme of the application to all merchant users on the platform can comprehensively and significantly improve the commodity data entry efficiency of online stores. In particular, when the platform faces a daily demand for entering massive amounts of commodity data, the efficiency of commodity information generation across the whole platform is improved and overall costs are saved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a network architecture schematic of an exemplary application environment of the present application;
FIG. 2 is a flow chart of an embodiment of a method for generating merchandise descriptions of the present application;
FIG. 3 is a schematic flow chart of training the description generation model in an embodiment of the application;
FIG. 4 is a flow chart of constructing the data sets in an embodiment of the present application;
FIG. 5 is a schematic flow chart of training the generation instance model in an embodiment of the application;
FIG. 6 is a schematic flow chart of training the scoring instance model in an embodiment of the application;
FIG. 7 is a flow chart of reinforcement learning training of the generation instance model by means of the scoring instance model in an embodiment of the application;
FIG. 8 is a flow chart of constructing commodity description content including illustrations from the commodity description text in an embodiment of the present application;
fig. 9 is a schematic block diagram of the commodity descriptive content generating apparatus of the present application;
fig. 10 is a schematic diagram of a commodity descriptive content generating apparatus according to the present application.
Detailed Description
In the network architecture shown in fig. 1, the e-commerce platform 82 is deployed in the internet to provide corresponding services to its users, and the merchant user's device 80 and the consumer user's device 81 of the e-commerce platform 82 are similarly connected to the internet to use the services provided by the e-commerce platform. For example, a merchant user may establish an online store through an e-commerce platform and enter merchandise information, a consumer user may access a certain online store in the e-commerce platform to view merchandise information of various merchandise therein, and so on.
The exemplary e-commerce platform 82 matches supply and demand for products and/or services to the public by means of an internet infrastructure. In the e-commerce platform 82, products and/or services are presented as commodity information; for simplicity of description, the concepts of commodity, product and the like are used in the present application to refer to the products and/or services in the e-commerce platform 82, which may specifically be physical products, digital products, tickets, service subscriptions, other offline-fulfilled services, and so on.
In practice, the various entities involved can access the e-commerce platform 82 under a user identity and use the various online services it provides to participate in the business activities it enables. These entities may be natural persons, legal persons, social organizations and the like. Corresponding to the merchant and consumer sides of commerce, the e-commerce platform 82 distinguishes two broad categories of users: merchant users and consumer users. An entity may use online services on the platform in the identity of a merchant user, or in the identity of a consumer, whether a real or a potential consumer of some merchant. In actual business activities the same entity may act both as a merchant user and as a consumer user, and the two identities should therefore be understood flexibly.
The infrastructure on which the e-commerce platform 82 is deployed mainly comprises a background architecture and front-end equipment. The background architecture runs various online services through a service cluster, and its service functions are enriched and completed by middleware and by platform-facing, consumer-facing and merchant-facing services; the front-end equipment primarily covers the terminal devices that users employ as clients to access the e-commerce platform 82, including but not limited to various mobile terminals, personal computers, point-of-sale devices and the like. For example, a merchant user may enter the commodity information of his online store through his terminal device 80, or generate commodity information using an open interface of the e-commerce platform; a consumer user may access the web page of an online store realized by the e-commerce platform 82 through his terminal device 81, trigger the shopping flow through shopping buttons provided on the page, and invoke the various online services of the e-commerce platform 82 during that flow, thereby placing a purchase order.
In some embodiments, the e-commerce platform 82 may be implemented by a processing facility including a processor and a memory that stores a set of instructions which, when executed, cause the e-commerce platform 82 to perform the e-commerce and support functions described in the present application. The processing facility may be part of a server, client, network infrastructure, mobile computing platform, cloud computing platform, fixed computing platform or other computing platform, and may serve the electronic components of the e-commerce platform 82, merchant devices, payment gateways, application developers, marketing channels, transport providers, client devices, point-of-sale devices and the like.
The e-commerce platform 82 may be implemented as online services such as cloud computing services, software as a service (SaaS), infrastructure as a service (IaaS), platform as a service (PaaS), desktop as a service (DaaS), hosted software as a service, mobile back end as a service (MBaaS), information technology management as a service (ITMaaS), and the like. In some embodiments, the various features of the e-commerce platform 82 may be implemented to operate on a variety of platforms and operating systems; for an online store, for example, an administrator user may enjoy the same or similar functionality whether on iOS, Android, HarmonyOS, the web, and so on.
The e-commerce platform 82 may implement a separate independent station for each merchant to run his online stores, providing each merchant with an instance of the commerce management engine with which to establish, maintain and run one or more online stores in one or more independent stations. A commerce management engine instance can be used for content management, task automation and data management of one or more online stores, and the specific business processes of an online store can be configured through interfaces, built-in components and the like to support business activities. The independent station is infrastructure of the e-commerce platform 82 with cross-border service capability, based on which merchants can maintain their online stores in a more centralized and autonomous way. Independent stations typically have merchant-specific domain names and storage space and are relatively independent of one another; the e-commerce platform 82 may provide standardized or personalized technical support for a vast number of independent stations, so that merchant users can customize adaptive commerce management engine instances and use them to maintain the online stores they own.
An online store's background configuration and maintenance may be carried out by the merchant user logging in to his commerce management engine instance with an administrator identity. With the support of the various online services provided by the infrastructure of the e-commerce platform 82, the merchant user may configure the functions of his online store and review its data; for example, he may manage various aspects of the store, such as viewing its recent activities, updating inventory, and managing orders, recent access activity, total order activity and the like. The merchant user may also obtain more detailed information about the business and about visitors to the store by retrieving reports or metrics, such as sales summaries showing the merchant's overall business, or specific sales and engagement data for active sales marketing channels.
The e-commerce platform 82 may provide a communications facility and an associated merchant interface for electronic communications and marketing, for example using an electronic message aggregation facility to collect and analyze communication interactions between merchants, consumers, merchant devices, customer devices, point-of-sale devices and the like, so as to aggregate and analyze the communications, for instance to increase the potential for product sales. For example, a consumer may have a question about a product, which may create a dialogue between the consumer and the merchant (or an automated, processor-based agent acting on behalf of the merchant); the communication facility handles the interaction and provides the merchant with an analysis of how to increase the probability of a sale.
In some embodiments, an application program suitable for installation on a terminal device may be provided to serve the access requirements of different users, so that users can access the e-commerce platform 82, for example the merchant background module of an online store, by running the application on their terminal devices. To support the business activities realized through these functions, the e-commerce platform 82 may implement the various supporting functions as middleware or online services, expose corresponding interfaces, and embed tool kits for accessing those interfaces into the application to achieve function expansion and task implementation. The commerce management engine may include a series of basic functions and expose them through APIs to online services and/or applications, which use the corresponding functions by remotely calling those APIs.
With the support of the various components of a commerce management engine instance, the e-commerce platform 82 may provide online shopping functionality, enabling merchants to establish contact with customers in a flexible and transparent manner: consumer users may purchase commodities online, create commodity orders, provide delivery addresses for the commodities in those orders, and complete payment confirmation. The merchant may then review and fulfill, or cancel, the order. An audit component carried by the commerce management engine instance may enforce compliant use of the business process, ensuring that an order is suitable for fulfillment before it is actually fulfilled. Orders can sometimes be fraudulent and require verification (e.g. an identity card check); a payment method that requires the merchant to wait until funds are confirmed received can help prevent such risk. Order risk assessments may be produced by third-party fraud detection tools submitted through an order risk API or the like. Before fulfillment, the merchant may need to acquire payment information, or wait to receive it, in order to mark the order as paid before preparing to deliver the product. Each such check can be made accordingly, and the audit flow may be implemented by a fulfillment component.
Merchants can review and adjust jobs and trigger related fulfillment services by way of fulfillment components, for example: a manual fulfillment service, used when a merchant picks and packages a product in a box, purchases a shipping label and enters its tracking number, or simply marks an item as fulfilled; a custom fulfillment service, which may be defined to send notification emails; an API fulfillment service, which may trigger a third-party application to create a fulfillment record at the third party; a legacy fulfillment service, which may trigger custom API calls from the commerce management engine to a third party; and a gift-card fulfillment service, which may generate a number and activate the gift card. Merchants may print shipping slips using an order printer application. The fulfillment process may be performed when the items are packaged in boxes and ready for shipment, covering tracking, delivery, verification by the consumer, and the like.
It can be seen that the services provided by the e-commerce platform are developed around the commodity as their core; commodity information is therefore basic information of the e-commerce platform, and the commodity information of the various commodities is high-frequency data that needs to be generated and maintained efficiently.
Referring to fig. 2, the method for generating commodity descriptive contents according to the present application includes the following steps:
Step S5100, acquiring a commodity prompt text, wherein the commodity prompt text comprises a plurality of keywords describing selling-point information of a target commodity;
When a merchant user needs to generate commodity information for a commodity object, the merchant user can enter a commodity information editing page, enter various items such as the commodity title, commodity attributes and commodity pictures, and can also give a commodity prompt text.
The commodity prompt text used to generate the commodity description text typically includes one or more keywords describing the selling-point information of the target commodity whose information is being entered. These keywords may be entered by the merchant user himself, or existing information may be reused; for example, in some embodiments the commodity title entered by the merchant user for the target commodity may be used directly as the commodity prompt text.
The keywords given in the commodity prompt text to represent selling-point information carry corresponding connotations that indicate the corresponding selling points. For example, a text such as "current season popular coat" can be understood as containing two keywords, "current season popular" and "coat": "current season popular" indicates that the target commodity is a fashion product, while "coat" indicates that it is clothing with a specific function. These keywords thus actually correspond to certain selling-point information of the target commodity.
In the information display structure of the commodity information, the selling-point information expressed by a keyword may relate to various aspects, including but not limited to the type, function, feature, action, effect, advantage, use and precautions of the commodity, and may be given according to actual needs.
In some embodiments, a keyword may represent selling-point information corresponding to one or more selling points; for example, "sports shoe" conveys both the selling-point information of the function "shoe" and the selling-point information of the "sports" type and use.
In some embodiments, when the commodity prompt text is a commodity title, considering that the title may contain emoticons or other punctuation that carries no semantic meaning, the commodity title may first be format-preprocessed to remove stray interfering characters, yielding a cleaner title for subsequent processing and improving the accuracy of the commodity description text generated later.
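As an illustrative sketch of this preprocessing step (the character classes kept here, and the choice of a regular-expression filter, are assumptions for demonstration rather than the patent's specification):

```python
import re

def clean_title(title: str) -> str:
    """Strip emoji and non-semantic punctuation from a commodity title."""
    # Keep word characters (including CJK ideographs) and whitespace;
    # replace everything else (emoji, brackets, punctuation) with spaces.
    cleaned = re.sub(r"[^\w\s\u4e00-\u9fff]", " ", title)
    # Collapse runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", cleaned).strip()

print(clean_title("🔥【Hot】Season trend coat!!! ✨"))
```

The cleaned title can then be used directly as the commodity prompt text in the following steps.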
In some embodiments, the commodity title is first segmented against a preset selling-point information dictionary; for each keyword obtained after segmentation, near-synonyms in the dictionary whose semantic similarity to the keyword reaches a preset threshold are matched and merged into the commodity title. This word expansion enriches the semantic expression of the commodity prompt text, making it easier for the model to understand the semantics and accurately generate the commodity description text.
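The segmentation-and-expansion step might be sketched as follows. The dictionary entries, the similarity measure (a character-overlap stand-in for a real semantic model) and the 0.6 threshold are all illustrative assumptions:

```python
from difflib import SequenceMatcher

# Hypothetical selling-point information dictionary mapping keywords
# to candidate near-synonyms; entries are illustrative only.
SELLING_POINT_DICT = {
    "coat": ["overcoat", "jacket"],
    "trendy": ["fashionable", "stylish"],
}

def similarity(a: str, b: str) -> float:
    # Stand-in for a semantic similarity model; a real system would
    # compare embeddings rather than character sequences.
    return SequenceMatcher(None, a, b).ratio()

def expand_title(keywords, threshold=0.6):
    """Append dictionary synonyms whose similarity to a keyword meets the threshold."""
    expanded = list(keywords)
    for kw in keywords:
        for candidate in SELLING_POINT_DICT.get(kw, []):
            if similarity(kw, candidate) >= threshold and candidate not in expanded:
                expanded.append(candidate)
    return expanded

print(expand_title(["coat", "trendy"]))
```

With these stand-ins, only candidates sufficiently similar to the original keyword survive the threshold; swapping in an embedding-based similarity would admit semantically close but lexically distant synonyms.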
Step S5200, inputting the commodity prompt text into a preset description generation model to obtain a commodity description text generated by the description generation model, wherein the description generation model is an instance of a large language model obtained through reinforcement learning training, and the commodity description text comprises text content that expands on the meaning of the selling-point information in the commodity prompt text;
In order to quickly generate the commodity description text of the target commodity for the merchant user on the basis of the commodity prompt text, a description generation model is trained in advance so that it acquires the capability of generating the corresponding commodity description text from a given commodity prompt text. The description generation model can ensure that the text content of the generated commodity description text forms an expanded description of the connotations of the selling-point information given in the prompt; that is, compared with the commodity prompt text, the commodity description text reveals the commodity information of the target commodity more thoroughly and introduces its various aspects to readers in a fuller way, including but not limited to the type, function, characteristics, action, effect, advantages, purposes and precautions of the target commodity.
Of course, considering that a large language model inherently has strong reasoning capability, when the description generation model of the present application generates the commodity description text from the commodity prompt text, the text content may contain expanded information richer than the selling point information provided in the commodity prompt text. However, because the description generation model of the present application has undergone reinforcement training in advance, such expanded information remains closely related to the actual situation of the target commodity.
Because a large language model has powerful reasoning capability, the description generation model is trained from a pre-existing open-source large language model so as to inherit the reasoning capability already acquired by that model and quickly achieve the aim of the present application. Various known large language models, such as LLaMA, ChatGPT, and the like, can be used to prepare the description generation model of the present application. With the efficient training method disclosed in other embodiments of the present application, a low-parameter version, such as the LLaMA 7B version, can be selected from these large language models to prepare the description generation model, achieving a lightweight effect, so that the finally prepared description generation model can be deployed on small devices and the deployment cost is reduced.
Step S5300, constructing commodity description content of the target commodity according to the commodity description text, and storing the commodity description content into a commodity database.
After the description generation model obtains the commodity description text from the commodity prompt text, the commodity description text can be used directly as the commodity description content of the target commodity, becoming one item of resource data in the commodity information of the target commodity, and stored directly in a commodity database, which is usually the commodity database corresponding to the online store of the merchant user.
In some embodiments, the commodity description text may be further formatted into formatted data for the commodity detail page of the target commodity to form the commodity description content, which is then stored as one item of resource data in the commodity information in the commodity database corresponding to the online store of the merchant user.
When the commodity description content is stored in the corresponding commodity database, it can be stored together with other commodity information entered by the user, and the commodity description contents of all commodities in the commodity database can also be generated and stored automatically in batches.
After the commodity description content is stored in the corresponding commodity database, the commodity information of the target commodity is in effect published to the e-commerce platform, specifically to the online store of the merchant user. According to the business logic of the online store, when the commodity information of the target commodity needs to be displayed, the online store invokes the corresponding data in the commodity database to construct the corresponding commodity detail page and pushes it to the terminal device of a browsing user, usually a consumer user or a merchant user, for display.
From the above embodiments, the present application has various technical advantages including, but not limited to:
Firstly, the present application uses reinforcement learning training on a large language model to obtain the description generation model, giving it the capability of generating commodity description texts. When a user supplies a commodity prompt text, the description generation model can understand the selling point information in that text according to its learned capability and generate, end to end, a commodity description text that expands on the selling point information. The commodity description text is then used to construct commodity description content, which is stored in a commodity database for use. This spares the user the trouble of writing commodity description texts, improves the editing efficiency of commodity information, and reduces the operating cost of online stores.
Secondly, after the description generation model undergoes reinforcement learning training, the commodity description text it generates for a commodity prompt text is ensured to include text content that expands on the meaning of the selling point information in the given commodity prompt text. The resulting text content is not limited to short, discrete forms; it carries a large and complete amount of information, meets the requirements of commodity information construction in an e-commerce platform, reflects the commodity information accurately in its semantics, and is therefore more practical.
In addition, for the e-commerce platform as a whole, opening the technical scheme of the present application to all merchant users in the platform can comprehensively and significantly improve the commodity data entry efficiency of online stores in the platform. In particular, when the e-commerce platform has a daily demand for entering massive commodity data, the commodity information generation efficiency across the whole platform can be improved and the overall cost reduced.
On the basis of any embodiment of the method of the present application, referring to fig. 3, before acquiring the commodity prompt text, the method includes:
Step S4100, inputting sentence pairs, each composed of a commodity title in a first data set and its corresponding commodity description text, into a first prototype instance of the large language model, and performing self-supervised training to a convergence state to obtain a generated instance model;
The first data set is used for training to obtain the generated instance model. It comprises pre-collected commodity titles and their corresponding commodity description texts, wherein each commodity title and its corresponding commodity description text form one sentence pair in a one-to-one mapping relationship.
In some embodiments, the data advantage of the e-commerce platform may be exploited by obtaining commodity titles and corresponding commodity description texts from the commodity databases of multiple online stores in the platform to construct the first data set. In this way, each sentence pair in the first data set is a commodity title and its corresponding commodity description text taken from actual use, so the correspondence is highly accurate, high-quality training samples are obtained without any manual labeling process, and the labeling cost is zero.
In other embodiments, commodity titles collected from the commodity databases actually used by online stores may be input into a large language model with strong reasoning capability that has already been successfully deployed, such as ChatGPT 3.0 or above, which generates the commodity description text corresponding to each commodity title; each commodity title and its commodity description text then form a sentence pair, constructing the first data set while saving labeling cost.
To obtain the generated instance model, one of the known large language models, for example a low-parameter version of LLaMA such as the LLaMA 7B version, is taken as the first prototype instance, and this instance is fine-tuned using the first data set to obtain the generated instance model.
Because the selected large language model is a generative model, the first prototype instance can undergo self-supervised training on the given training samples; unlike supervised training, self-supervised training requires no labels and therefore significantly improves training efficiency.
In the process of self-supervised training on the first prototype instance, each sentence pair is simply used as a training sample and fed in turn into the first prototype instance, which completes the self-supervised training process according to its self-supervision mechanism until a preset training target is reached, for example, when the loss values of the training samples indicate convergence.
After the first prototype instance is trained, the generated instance model of the present application is obtained. The generated instance model is in effect the product of fine-tuning a low-parameter prototype instance of the large language model; compared with the prototype instance, it has the capability of generating the corresponding commodity description text from a given input text.
Step S4200, inputting parallel corpora in a second data set into a second prototype instance of the large language model, and performing contrastive learning training to a convergence state to obtain a scoring instance model, wherein the second prototype instance is a network architecture obtained by replacing the output layer of the large language model with a terminal linear layer, the terminal linear layer maps the reasoning result of the large language model into a single-dimensional vector, and each parallel corpus comprises the same commodity title and two different corresponding commodity description texts;
The second data set is used for training to obtain the scoring instance model. It comprises a large number of parallel corpora, each containing two items of mapping relation data that map the same commodity title to two different commodity description texts; the storage format may vary. It will be understood that the commodity title and the commodity description text in one item of mapping relation data form one sentence pair.
Within one parallel corpus, the two sentence pairs can be divided into a first sentence pair and a second sentence pair according to how accurately each commodity description text explains the selling point information in the commodity title. To guide the model in computing the loss value during training, and to quickly determine which of the first and second sentence pairs is better so as to guide the contrastive learning process, the first and second sentence pairs can be identified according to the sources of their commodity description texts.
In some embodiments, the first sentence pair of each parallel corpus in the second data set may take its commodity title and commodity description text from the commodity information of a commodity already on the shelves in a commodity sample library provided by the e-commerce platform, while the commodity description text of the corresponding second sentence pair may be a randomly generated text or the commodity description text of a different commodity. That is, in the first sentence pair the commodity description text accurately and effectively explains the selling point information of the commodity title, whereas the meaning of the commodity description text in the second sentence pair does not correspond effectively and accurately to that selling point information. In this way, the second data set is generated from the existing commodity data of the e-commerce platform at zero cost and with greater efficiency.
In other embodiments, the two sentence pairs of each parallel corpus in the second data set may follow the construction of sentence pairs in the first data set, with the first and second sentence pairs generated by two different large language models that are already in mature operation. For example, a batch of commodity titles is collected from a commodity sample library provided by the e-commerce platform; a first large language model such as ChatGPT is invoked to generate the first commodity description text for each title, and a second large language model such as Vicuna is invoked to generate the second commodity description text. Since, for the same commodity title, the first commodity description text generated by ChatGPT generally explains the title better than the second commodity description text generated by Vicuna, the sentence pair formed by the commodity title and the first commodity description text is set as the higher-quality first sentence pair, the sentence pair formed by the commodity title and the second commodity description text is set as the lower-quality second sentence pair, and the first and second sentence pairs sharing the same commodity title are assembled into a parallel corpus and stored in the second data set.
In addition to the ways exemplified above, a person skilled in the art may flexibly combine them to construct the second data set in a less costly and more efficient manner.
To obtain the scoring instance model, a second prototype instance of the same large language model, for example the aforementioned LLaMA 7B version, is constructed. Relative to the first prototype instance it is a modified network architecture: on the basis of the exemplary LLaMA 7B network architecture, the final output layer is replaced with a terminal linear layer, so that the reasoning result produced by the hidden layers of the second prototype instance is mapped by the terminal linear layer into a one-dimensional vector representing a score.
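The head swap can be illustrated with a minimal sketch. This is not the patent's implementation: the hidden size, random weights, and the stand-in hidden state are assumptions; in a real system the terminal linear layer would sit on top of the LLaMA backbone (hidden size 4096 for the 7B version) in a deep-learning framework.

```python
import numpy as np

# Sketch of the second prototype instance's terminal linear layer:
# the LM's vocabulary-sized output layer is replaced by a linear map
# from the final hidden state to a single scalar score.
HIDDEN_SIZE = 8  # toy value; LLaMA-7B uses 4096

rng = np.random.default_rng(0)

class ScoringHead:
    """Terminal linear layer: hidden state (HIDDEN_SIZE,) -> one scalar."""
    def __init__(self, hidden_size: int):
        self.w = rng.normal(size=(hidden_size,)) * 0.02
        self.b = 0.0

    def __call__(self, last_hidden_state: np.ndarray) -> float:
        # the "single-dimensional vector": one score per input sequence
        return float(last_hidden_state @ self.w + self.b)

head = ScoringHead(HIDDEN_SIZE)
fake_hidden = rng.normal(size=(HIDDEN_SIZE,))  # stand-in for the LM's last hidden state
score = head(fake_hidden)
```

The backbone stays identical to the first prototype instance; only this small head differs, which is what lets the two instances specialize in generation and scoring respectively.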
It should be noted that the scoring instance model is based not on the trained first prototype instance but on a second prototype instance obtained by modifying the same underlying large language model. The advantage is that the scoring instance model remains simple and focused on the scoring task, its scoring results are more objective and independent, and it provides a more effective measure in the subsequent reinforcement learning training. Conversely, if the fine-tuned first prototype instance were modified into the scoring instance model at this stage, its existing capability of generating commodity description texts would inevitably compromise the objectivity of the scoring results, much like a person evaluating their own work, which is clearly undesirable.
After the second data set and the second prototype instance are obtained, contrastive learning training can be performed on the second prototype instance using the second data set. During contrastive learning, the two sentence pairs of each parallel corpus in the second data set are input into the second prototype instance, which performs inference on both; the features of the final hidden layer are mapped into a score by the terminal linear layer. The scores of the two sentence pairs are then compared under the setting that the first sentence pair is of higher quality than the second, a corresponding loss value is determined, and the weights of the second prototype instance are updated according to that loss value, thereby realizing contrastive learning training.
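The comparison of the two scores can be driven by a pairwise ranking loss. The patent does not specify the exact loss function, so the choice below is an assumption: the negative log-sigmoid of the score margin, a common choice for reward-model training in RLHF pipelines, which is small when the first sentence pair already scores higher and large when the ranking is violated.

```python
import math

def pairwise_loss(score_first: float, score_second: float) -> float:
    """-log(sigmoid(s_first - s_second)): penalizes the model when the
    score of the better first sentence pair does not exceed the second's."""
    margin = score_first - score_second
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

low = pairwise_loss(2.0, -1.0)   # ranking respected -> small loss
high = pairwise_loss(-1.0, 2.0)  # ranking violated  -> large loss
```

Minimizing this loss pushes the terminal linear layer to score the higher-quality pair above the lower-quality one, with no labels beyond the first/second ordering itself.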
It is easy to see that performing contrastive learning training on the second prototype instance with a sufficient amount of parallel corpora requires no labeling of training samples, avoiding any labeling cost. In the specific scenario of an e-commerce platform that must respond quickly to update services, waiting for manual labeling of training samples is highly impractical; obtaining the second prototype instance by modifying the network architecture of a large language model, training it with the parallel corpora of the second data set until convergence, and using it as the scoring instance model therefore offers advantages in both efficiency and cost.
Step S4300, building the generated instance model and the scoring instance model into a reinforcement learning architecture, and performing reinforcement learning training on the reinforcement learning architecture by adopting commodity titles in a third data set;
After the first prototype instance is trained into the generated instance model and the second prototype instance is trained into the scoring instance model, the two models can each be initialized and then built into a reinforcement learning architecture, and reinforcement learning training is performed on the architecture as a whole.
In the reinforcement learning training process, the generated instance model performs inference on a given commodity title according to the capability obtained in the earlier fine-tuning and generates the corresponding commodity description text; the scoring instance model then scores that commodity description text against the commodity title according to its own trained capability. The score serves as the reward for the generated instance model and guides it to correct its predictive reasoning capability. When the generated instance model meets a preset convergence condition, or another preset condition after a finite number of training iterations, the reinforcement learning training process can be terminated, completing the further training of the generated instance model.
In the reinforcement learning training process, since the generated instance model already has the capability of generating commodity description texts and the scoring instance model provides the reward information, only commodity titles need to be supplied to the generated instance model to drive the generation of commodity description texts. Therefore, a third data set is prepared in advance, containing only commodity titles, which can be collected directly from the commodity information of on-shelf commodities in a commodity sample library provided by the e-commerce platform.
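The reward-driven loop can be sketched in highly simplified form. Everything here is a stub standing in for the real models: the generator, the reward function, and the policy-update callback are assumptions, and a real implementation would use a policy-gradient algorithm such as PPO rather than this bare loop.

```python
# Toy sketch of the reinforcement learning loop: for each title in the
# third data set, the generated instance model produces a description,
# the scoring instance model returns a reward, and the reward drives
# the policy update.  All components are stubs.

def generate_description(title: str) -> str:  # stub for the generated instance model
    return f"High-quality {title} with excellent features."

def score_description(title: str, description: str) -> float:  # stub reward model
    # toy reward: 1.0 if the description mentions the first title word
    return 1.0 if title.split()[0].lower() in description.lower() else 0.0

def rl_training_step(titles, policy_update):
    rewards = []
    for title in titles:                       # titles drawn from the third data set
        desc = generate_description(title)
        reward = score_description(title, desc)
        policy_update(reward)                  # e.g. a PPO update in a real system
        rewards.append(reward)
    return sum(rewards) / len(rewards)

updates = []
mean_reward = rl_training_step(["wireless earbuds", "ceramic mug"], updates.append)
```

The mean reward over a batch is the kind of signal used to decide when the preset convergence condition is met and training can stop.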
Under the supervision of the scoring instance model, reinforcement learning training further improves the generated instance model's ability to generate commodity description texts accurately, so that it can be put into use in the inference stage.
Step S4400, deploying the generated instance model subjected to reinforcement learning training as the description generation model.
The generated instance model that has undergone reinforcement learning training can be deployed into a service cluster of the e-commerce platform to run as the description generation model, becoming an online service with an open interface that other online services, applications, and third-party programs across the platform can call. For example, after a merchant user enters a commodity title in a commodity information entry page provided by their business management engine, the page code can trigger a call to the interface of the generated instance model, passing in the commodity title; the commodity description text generated from the title is then returned and displayed in the entry page for further editing and use by the merchant user.
It can be seen from the above embodiments that the present application in effect provides the training method used for its description generation model, and this training method achieves the following further technical advantages:
Firstly, in the training process of the description generation model, a first prototype instance of a large language model undergoes self-supervised fine-tuning to become the generated instance model; a second prototype instance of the large language model is then modified and trained by contrastive learning into the scoring instance model; finally the two are built into a reinforcement learning architecture for reinforcement learning training, after which the generated instance model can serve as the description generation model. Across the whole training process, three training modes are applied in turn, namely self-supervised training, contrastive learning training, and reinforcement learning training, none of which requires additional labeling of the training samples, so the cost is low and the efficiency is high. Moreover, building on the innate advantages of the large language model, these stages progressively strengthen the reasoning capability adapted to commodity description text generation, so that the final generated instance model can accurately and effectively generate the commodity description text corresponding to a given commodity title, with the text well explaining the connotation of the selling point information in the title; this offers very significant advantages for improving the entry and editing efficiency of commodity information in an e-commerce platform.
Secondly, the e-commerce platform has a big-data advantage in commodity information, and the large language model adopted in the training process is a generative model based on the autoregressive principle. Therefore, when constructing the data sets for each training stage, materials can be taken from the e-commerce platform, for example commodity titles from the commodity databases of online stores, and the various sentence pairs required for training can be organized quickly. The whole process requires no manual intervention and no manual labeling, so it is highly efficient and incurs zero additional cost; and since only simple database operations are involved, the cost in computer resources is extremely low.
In addition, in the training process, the generated instance model and the scoring instance model used in the reinforcement learning stage are obtained by training different prototype instances of the same large language model, so the reasoning capability of each can focus on its own task: the generated instance model on generating commodity description texts, and the scoring instance model on scoring them. In this way, the state-to-action coordination in the reinforcement learning process becomes more effective, the efficiency of reinforcement training is improved, and the reasoning capability of the generated instance model is naturally further improved, so that it outputs more effective commodity description texts for given commodity titles.
On the basis of any embodiment of the method of the present application, referring to fig. 4, before the sentence pairs composed of commodity titles in the first data set and their corresponding commodity description texts are input into the first prototype instance of the large language model for self-supervised training to a convergence state, the method includes:
Step S3100, acquiring commodity titles corresponding to a plurality of commodity objects from a commodity sample library to form a commodity title set, and dividing the commodity title set into a plurality of parts, each comprising a plurality of commodity titles;
In this embodiment, the massive commodity databases in the e-commerce platform can serve as commodity sample libraries, or be combined into a single commodity sample library, to prepare the various data sets required for the training of the present application. To improve subsequent processing efficiency, commodity titles corresponding to a plurality of commodity objects are acquired from these commodity sample libraries, with the scope of acquisition chosen as required: for example, the full set of titles of on-shelf commodities of the e-commerce platform, the titles of one or more commodity categories in the platform, or the full set of commodity titles of a single online store.
The large number of commodity titles collected from the commodity sample library is stored as a commodity title set, and all titles in the set are then divided into three parts for different purposes, each part containing a large number of commodity titles. The first part is used to construct the first data set, the second part the second data set, and the third part the third data set, the three data sets serving as training data for the different training stages.
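The three-way split can be sketched as follows. The split ratios are an assumption for illustration; the embodiment only requires three disjoint parts of the title set.

```python
# Sketch of step S3100: divide the commodity title set into three parts,
# one per data set.  Ratios are illustrative, not from the patent.
def split_titles(titles, ratios=(0.4, 0.3, 0.3)):
    n = len(titles)
    cut1 = int(n * ratios[0])
    cut2 = cut1 + int(n * ratios[1])
    return titles[:cut1], titles[cut1:cut2], titles[cut2:]

titles = [f"commodity title {i}" for i in range(10)]
part1, part2, part3 = split_titles(titles)  # -> first/second/third data sets
```

Because the parts are disjoint slices, no title is shared between the fine-tuning, contrastive, and reinforcement learning stages.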
Step S3200, inputting each commodity title belonging to the first part in the commodity title set into a first reference model to obtain commodity description text corresponding to each commodity title, and storing sentence pairs formed by the commodity title and the corresponding commodity description text into a first data set;
A large language model with strong reasoning capability that is already running online, such as ChatGPT, is selected as the first reference model. All commodity titles of the first part of the commodity title set are input into the first reference model, which generates the commodity description text corresponding to each title; each commodity title and its commodity description text form one sentence pair, and all sentence pairs corresponding to the first part are stored in the first data set, completing its construction.
Step S3300, inputting each commodity title belonging to the second part in the commodity title set into a first reference model and a second reference model respectively to obtain commodity description texts output by each reference model corresponding to each commodity title, forming parallel corpus by the commodity titles and commodity description texts obtained by the commodity titles corresponding to different reference models, and storing the parallel corpus into a second data set;
The second data set requires parallel corpora. Each parallel corpus is a mapping from the same commodity title to two different commodity description texts; that is, each parallel corpus actually contains two sentence pairs, a first sentence pair and a second sentence pair, where the commodity description text in the first sentence pair is by default better than that in the second.
Following this construction principle for the parallel corpora, the commodity titles of the second part of the commodity title set are input into the first reference model to obtain the corresponding commodity description texts, and each title and its description text are built into a sentence pair serving as the first sentence pair of a parallel corpus.
Then, another large language model with relatively strong reasoning capability, such as Vicuna, is adopted as the second reference model; the commodity titles of the second part are input into the second reference model to obtain the corresponding commodity description texts, and in the same way each title and its description text are built into a sentence pair serving as the second sentence pair of the parallel corpus.
The first and second sentence pairs generated for the commodity titles of the second part are combined into parallel corpora according to the identity of their commodity titles and stored in the second data set, completing its construction.
Step S3400, storing the third part of the commodity titles in the commodity title set separately into a third data set.
Because the third data set is used for reinforcement learning training, it needs to provide only bare commodity titles; therefore, the third part of the commodity titles in the commodity title set is simply transferred into the third data set.
According to this embodiment, all data sets are constructed using the existing commodity titles in the e-commerce platform and existing large language models. The whole process can be carried out automatically without manual intervention, with low preparation cost, high cost efficiency, and reliable data quality.
On the basis of any embodiment of the method of the present application, referring to fig. 5, inputting sentence pairs composed of commodity titles in the first data set and their corresponding commodity description texts into the first prototype instance of the large language model for self-supervised training to a convergence state includes:
Step S4110, calling a single sentence pair in the first data set, and splicing the commodity title and the corresponding commodity description text thereof into a training sample;
When the first prototype instance needs to be trained, sentence pairs are iteratively called from the previously prepared first data set as training samples and input into the first prototype instance. Each sentence pair comprises a commodity title and its corresponding commodity description text, which are spliced together in order according to the input format requirements of the first prototype instance to serve as a training sample.
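The splicing of step S4110 can be sketched as below. The prompt template and separator are illustrative assumptions about the first prototype instance's input format; the patent only requires that title and description be concatenated in order.

```python
# Sketch of step S4110: splice a commodity title and its commodity
# description text into one training sample string.
SEP = "\n### Description:\n"  # hypothetical separator token/template

def build_training_sample(title: str, description: str) -> str:
    return f"### Title:\n{title}{SEP}{description}"

sample = build_training_sample(
    "Stainless steel insulated bottle 500ml",
    "Keeps drinks hot for 12 hours ...")
```

At inference time the same template is used with the description left empty, so the model continues from the separator to generate the description.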
Step S4120, inputting the training sample into the first prototype instance, and performing self-supervised training on it;
The first prototype instance, as a large language model, is essentially an autoregressive model capable of processing serialized information: the encoder encodes the serialized input to extract deep semantic information, and the decoder then decodes it step by step, producing a corresponding output at each time step.
Accordingly, after a single training sample is input into the first prototype instance, the text content corresponding to the commodity title is inferred step by step, and the loss over that text content is supervised using the commodity description text provided in the training sample, thereby realizing self-supervised learning.
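The self-supervision mechanism can be made concrete: in an autoregressive model the labels are just the input tokens shifted by one position, which is why no extra annotation is needed. The toy token IDs below are assumptions; a real pipeline would tokenize the spliced sample and minimize cross-entropy over the shifted labels.

```python
# Sketch of next-token self-supervision: inputs predict the following
# token, so labels are the inputs shifted left by one position.
def shifted_labels(token_ids):
    """Return (inputs, labels) for autoregressive training."""
    return token_ids[:-1], token_ids[1:]

tokens = [5, 9, 2, 7]  # toy token ids for a spliced title+description sample
inputs, labels = shifted_labels(tokens)
# training then minimizes cross-entropy of the model's prediction at
# each position against the corresponding label token
```

The commodity description portion of the sample thus supervises itself: the model is penalized whenever its step-by-step continuation of the title deviates from the provided description.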
Step S4130, judging, according to the loss value obtained for the training sample, whether the first prototype instance has reached convergence; when convergence is not reached, updating the weight parameters of the first prototype instance and constructing the next training sample from the first data set to continue iterative training.
After calculating the loss value for a single training sample, or a total loss value over a batch of training samples, whether the first prototype instance has converged can be judged by whether the loss value meets a preset condition, which may be a preset threshold. When the first prototype instance has converged, its training may be stopped, yielding the generated instance model for the current stage. When it has not converged, a gradient update may be applied to the weight parameters of each link of the first prototype instance, after which the process returns to step S4110 to call the next training sample from the first data set and continue iterative training, and so on, until the first prototype instance is trained to convergence and becomes the generated instance model.
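The iterate-until-convergence control flow of steps S4110–S4130 can be skeletonized as below. The geometric loss decay is a toy stand-in for a real forward pass and gradient update, and the threshold value is illustrative.

```python
def train_to_convergence(samples, step_fn, threshold=0.05, max_steps=10_000):
    """Cycle through training samples, applying step_fn (forward pass plus
    weight update, returning the loss) until the loss meets the preset
    condition, i.e. falls to or below the threshold."""
    loss = float("inf")
    for step in range(max_steps):
        sample = samples[step % len(samples)]   # call the next training sample
        loss = step_fn(sample)
        if loss <= threshold:                   # convergence reached: stop
            return step + 1, loss
    return max_steps, loss

# Toy stand-in for the model update: loss decays geometrically per step.
state = {"loss": 1.0}
def fake_step(_sample):
    state["loss"] *= 0.9
    return state["loss"]

steps, final_loss = train_to_convergence(["title+desc 1", "title+desc 2"], fake_step)
```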
According to this embodiment, the training process of the generated instance model is based on self-supervised training, and no supervision labels need to be provided separately, so training efficiency is high and training cost is low.
On the basis of any method embodiment of the present application, referring to fig. 6, inputting the parallel corpora in the second data set into a second prototype instance of the large language model to implement contrastive learning training to a convergence state includes:
step S4210, calling a single parallel corpus in a second data set, and constructing two training samples corresponding to the same commodity title according to the commodity title and two commodity description texts, wherein each training sample comprises the commodity title and a corresponding single commodity description text thereof;
When the scoring instance model requires training, a parallel corpus is called from the second data set to serve as training samples. Since each parallel corpus contains two sentence pairs, it yields two training samples. Each training sample is in effect mapping-relationship data between a commodity title and one corresponding commodity description text: the commodity title and that single commodity description text are spliced in order to obtain the training sample.
Step S4220, respectively inputting the two training samples corresponding to the same commodity title into the second prototype instance for training, to obtain the score output by the second prototype instance at its terminal linear layer;
Because the second prototype instance must undergo contrastive learning training, one parallel corpus is called from the second data set each time to obtain two training samples, both of which are supplied to the second prototype instance for training. After the two training samples yield their respective prediction results, the training outcomes of both are combined to determine the corresponding loss value.
Except for its terminal output layer, the front-end components of the second prototype instance are those of the corresponding large language model prototype, so it shares the same inference logic as the first prototype instance: given a training sample, it predicts a feature representation of the text content based on the sequential autoregressive decoding mechanism. The second prototype instance, however, then feeds this feature representation into its terminal linear layer, which maps it to a score via a fully connected projection. It will be appreciated that each training sample thus receives its corresponding score.
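The terminal linear layer's role — mapping the model's final feature representation to a single-dimensional score through a fully connected projection — reduces to a dot product plus bias. A minimal sketch with illustrative dimensions and weights:

```python
def linear_score(feature, weight, bias):
    """Fully connected projection to one dimension: score = w . h + b."""
    return sum(w * h for w, h in zip(weight, feature)) + bias

hidden = [0.5, -1.0, 2.0]        # stand-in for the model's last hidden state
w, b = [0.2, 0.1, 0.3], 0.05     # the terminal linear layer's parameters
score = linear_score(hidden, w, b)
```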
Step S4230, supervising the contrastive learning training of the second prototype instance according to the scores corresponding to the two training samples and the preset relative superiority of the source reference models of the commodity description texts in those two training samples, to obtain a loss value between the two training samples;
When determining the loss value for the second prototype instance, to meet the contrastive learning requirement, the two training samples derived from the same parallel corpus are treated as one group, and the loss value is computed for that group. In the second data set, the two sentence pairs of a parallel corpus are distinguished as a first sentence pair and a second sentence pair. During contrastive learning training, the commodity description text produced by the first reference model is assumed to be better than that produced by the second reference model, i.e., the first sentence pair is by default of higher quality than the second. Accordingly, following this preset, the difference between the two scores within the same group of training samples is obtained and quantized into a corresponding loss value, by which it can be measured whether the training objective has been achieved.
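One common way to quantize the score difference into a loss — a hedged assumption here, since the source only requires that the difference be quantized somehow — is the standard pairwise ranking loss used for reward models: squash the difference through a sigmoid and take the negative log-likelihood, so the loss shrinks as the assumed-better first sentence pair outscores the second.

```python
import math

def pairwise_ranking_loss(score_better: float, score_worse: float) -> float:
    """-log(sigmoid(s_better - s_worse)): near 0 when the assumed-better
    sample scores much higher, large when the ranking is inverted."""
    diff = score_better - score_worse
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

loss_correct = pairwise_ranking_loss(2.0, 0.5)   # ranking matches the preset
loss_wrong = pairwise_ranking_loss(0.5, 2.0)     # ranking inverted
```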
Step S4240, judging whether the loss value obtained by the second prototype example is characterized by reaching convergence, and updating the weight parameters of the second prototype example when the convergence is not reached, and continuing constructing the next pair of training samples from the second data set to continue iterative training.
After obtaining the loss value for a single parallel corpus, or a total loss value over a batch of parallel corpora, the finally determined loss value may be compared with a preset threshold serving as the decision criterion; for example, the preset threshold may represent that the loss derived from the two training samples in the same group is 0 or infinitely close to 0. When the finally determined loss value reaches the preset threshold, the second prototype instance may be judged to have reached convergence, and training may be terminated to obtain the scoring instance model. When it does not reach the preset threshold, i.e., has not converged, the weight parameters of each link of the second prototype instance may be updated according to the finally determined loss value, after which the process returns to step S4210 to call the next parallel corpus from the second data set and continue iterative training, and so on, until the second prototype instance is trained to convergence and the scoring instance model is obtained.
According to the embodiment, the scoring example model is obtained based on comparison learning training, the model automatically determines the loss value by using the scores of two training samples related to the same commodity title in the same parallel corpus, and then the weight is updated according to the loss value, so that the process is full-automatic, the training efficiency is high, and the training cost is low.
On the basis of any embodiment of the method of the present application, referring to fig. 7, reinforcement learning training is performed on the reinforcement learning architecture by using the commodity titles in the third data set, including:
step S4310, invoking commodity titles in a third data set to be independently used as training samples to be input into a generated instance model in the reinforcement learning architecture to start iteration, and reasoning to obtain corresponding commodity description text by the generated instance model;
The third data set contains bare commodity titles only, so in the reinforcement learning process a single commodity title from the third data set is called as the training sample for each training iteration.
As described above, to implement reinforcement learning training, the present application initializes the trained generated instance model and scoring instance model, then feeds the output of the generated instance model into the scoring instance model as input, forming the reinforcement learning architecture; on this basis, retraining of the generated instance model is carried out.
In the reinforcement learning architecture, the scoring instance model needs to play a role in giving objective and effective scores for output results of the generated instance model by utilizing the learned reasoning capability, so that the scoring instance model does not participate in weight update in the reinforcement learning architecture, that is, the weights of the scoring instance model are frozen before reinforcement learning is started.
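Freezing the scorer before reinforcement learning begins amounts to excluding its parameters from gradient updates; a framework-agnostic sketch (the Param class is an illustrative stand-in for framework tensors):

```python
class Param:
    """Illustrative stand-in for a framework tensor with a trainable flag."""
    def __init__(self, value):
        self.value = value
        self.requires_grad = True

def freeze(params):
    """Exclude parameters from weight updates, as done for the scorer."""
    for p in params:
        p.requires_grad = False

scorer_params = [Param(0.1), Param(-0.3)]      # scoring instance model
generator_params = [Param(0.7), Param(0.2)]    # generated instance model
freeze(scorer_params)                          # frozen before RL starts

# Only the generator's weights remain subject to updates.
trainable = [p for p in scorer_params + generator_params if p.requires_grad]
```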
Accordingly, after reinforcement learning training begins, when a single commodity title is called from the third data set and input into the generated instance model as a training sample, the generated instance model infers a piece of text content: the commodity description text it predicts. Since the generated instance model has already been fine-tuned with the capability of generating relatively high-quality commodity description text, this commodity description text is in theory already a fairly good result.
Step S4320, determining, by the scoring instance model in the reinforcement learning architecture, the score corresponding to the sentence pair formed by the commodity title serving as the training sample and the commodity description text correspondingly output by the generated instance model;
Matching the input format used by the scoring instance model in its training stage, the commodity description text produced by the generated instance model for the training sample is combined in order with the commodity title of that training sample to form a sentence pair, which is then input to the scoring instance model. Drawing on its learned capability, the scoring instance model outputs, at its terminal linear layer, a score corresponding to the input sentence pair, and this score serves as an objective quantitative evaluation of the sentence pair.
Step S4330, determining expected scores corresponding to the training samples according to a preset cost function, determining model loss values according to differences between the expected scores and scores output by the scoring instance models, and updating weights of the generated instance models;
Based on the principle of reinforcement learning training, the score produced by the scoring instance model can serve as a reward for the generated instance model, enabling more effective weight updates and coordinating the association between states and actions.
According to this, the expected score corresponding to the training sample of the current iteration can be calculated from the preset cost function; the difference between this expected score and the score actually obtained for the training sample measures the capability gap of the generated instance model and can be regarded as the model loss value, according to which the generated instance model is weight-updated.
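A minimal sketch of this step, under the assumption (not specified in the source) that the preset cost function is a running-mean baseline over past rewards and that the gap is squared to form the model loss:

```python
def expected_score(reward_history):
    """Assumed cost function: running mean of past rewards as the expectation."""
    return sum(reward_history) / len(reward_history) if reward_history else 0.0

def model_loss(expected, reward):
    """Assumed loss form: squared gap between expectation and actual reward."""
    return (expected - reward) ** 2

history = [0.4, 0.6, 0.8]            # rewards from earlier iterations
exp_score = expected_score(history)
loss = model_loss(exp_score, reward=0.9)   # reward from the scoring model
```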
Step S4340, judging whether the loss value corresponding to the iteration at the time meets a preset condition, when the loss value meets the preset condition, determining the highest value of the scores obtained by each iteration, acquiring a generated instance model corresponding to the highest value, and determining the generated instance model as the description generated model, otherwise, calling the next commodity title to continue iterative training.
Each time an iteration completes, the current weight parameters of the generated instance model in the reinforcement learning architecture may be cached. As the iterations advance, the loss value of the generated instance model decreases until it enters a relatively smooth interval; however, fluctuation of the model loss value is still unavoidable while iteration continues within that smooth interval. Accordingly, a policy may be applied to find an optimal state and thereby obtain a relatively optimal generated instance model.
Specifically, after each iteration determines the loss value of one training sample, it is judged whether that loss value meets a preset condition, which includes whether the loss value reaches a preset threshold characterizing entry into the smooth interval, and whether it has reached that threshold N (N > 2) consecutive times. If not, the process returns to step S4310 to input the next training sample; if so, the highest value is determined among the scores of all historically input training samples. The state of the generated instance model corresponding to that highest value is regarded as the relatively optimal state, and the weight parameters of the generated instance model at that optimal state are retrieved and saved to obtain the description generation model. It will be appreciated that this description generation model is the optimal model instance obtained during the reinforcement learning training process.
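The plateau-then-pick-best policy of step S4340 can be sketched as follows; the threshold, the patience N, and the iteration data are all illustrative values.

```python
def select_best_checkpoint(iterations, threshold=0.1, patience=3):
    """iterations: list of (loss, score, checkpoint_id) per training step.
    Once the loss has stayed at or below the threshold for `patience`
    consecutive steps (the smooth interval), return the checkpoint whose
    score was the historical highest."""
    streak, best = 0, None
    for loss, score, ckpt in iterations:
        if best is None or score > best[0]:
            best = (score, ckpt)              # cache the highest-scoring state
        streak = streak + 1 if loss <= threshold else 0
        if streak >= patience:                # loss stable in the plateau
            return best[1]
    return None                               # never reached the plateau

runs = [(0.5, 1.0, "c1"), (0.2, 3.0, "c2"), (0.08, 2.5, "c3"),
        (0.09, 2.8, "c4"), (0.07, 2.6, "c5")]
best = select_best_checkpoint(runs)
```

Note that the returned checkpoint ("c2" in this toy trace) need not be the last one: the plateau condition only decides *when* to stop, while the score history decides *which* weights to keep.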
It is easy to understand that by applying reinforcement learning training to the generated instance model, its performance under each training sample is tested iteratively, and under the guidance of the preset condition the optimal state among all states reached during training is selected; the weight parameters of that optimal state are saved to obtain the optimal generated instance model as the description generation model. This ensures that the description generation model is the best available result, suitable for effectively serving the generation of commodity description texts.
On the basis of any method embodiment of the present application, referring to fig. 8, constructing the commodity description content of the target commodity according to the commodity description text and storing it in a commodity database includes:
step S5410, acquiring a plurality of commodity pictures provided corresponding to the commodity prompt text and explanation labels of the commodity pictures;
In the process of entering commodity information for a target commodity, a merchant user submits the commodity title as the commodity prompt text to guide the description generation model of the present application to output a corresponding commodity description text. Generally, the merchant user also provides several commodity pictures of the target commodity to describe its selling point information from different aspects, and may set an explanation label for each commodity picture. An explanation label may be a keyword, a simple sentence, or another type of class label; in short, it labels the descriptive role played by the corresponding commodity picture. For example, the explanation label of one picture may be "product internal structure diagram", that of another may be "rainy day use state", and so on. It is easy to understand that an explanation label carries a definite meaning, so a user can quickly grasp the function of the commodity picture through it. Of course, these explanation labels may be intermediate data and need not be displayed in the graphical user interface as part of the commodity information.
Step S5420, dividing the commodity description text into a plurality of commodity sentences by clause, determining the semantic similarity between each explanation label and each commodity sentence, and determining, for each explanation label, the commodity sentence with the highest semantic similarity as its center commodity sentence;
On the other hand, the commodity description text generated by the description generation model from the merchant user's commodity prompt text can naturally be split into sentences according to punctuation marks, yielding a plurality of commodity sentences, each of which may carry a distinct idea or be associated with the others. Several commodity sentences may jointly describe one aspect of the target commodity's selling point information, or each may describe one piece of selling point information individually. In either case, each commodity sentence can be given a feature representation, i.e., a sentence vector, which is then matched for semantic similarity against the sentence vectors of the explanation labels of the commodity pictures.
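Sentence splitting on punctuation, as described, might look like the sketch below; the punctuation set, covering both CJK and Western sentence enders, is an assumption.

```python
import re

def split_into_commodity_sentences(text: str):
    """Split description text on sentence-ending punctuation and drop blanks."""
    parts = re.split(r"[。！？；!?;.]+", text)
    return [p.strip() for p in parts if p.strip()]

sentences = split_into_commodity_sentences(
    "Durable aluminium shell. Fully waterproof seal! Ideal for outdoor use.")
```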
Therefore, a text feature extraction model can be trained in advance, so that the text feature extraction model can perform feature representation on text contents, including the commodity description text and text contents in the description tags, and therefore sentence vectors corresponding to all the description tags and all commodity sentences in the commodity description text, which are given by a merchant user, can be obtained by means of the text feature extraction model.
After the sentence vectors of each explanation label and each commodity sentence are obtained, the data distance between each explanation label and all commodity sentences can be computed using any data distance algorithm, such as cosine similarity, the Minkowski distance, the Pearson correlation coefficient, or the Jaccard coefficient, and normalized into a semantic similarity. The commodity sentence with the highest semantic similarity to each explanation label is then selected as that label's center commodity sentence, so that every explanation label obtains a corresponding center commodity sentence; of course, two explanation labels are allowed to obtain the same center commodity sentence.
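The label-to-sentence matching with cosine similarity can be sketched as below; the 2-dimensional vectors are toy stand-ins for the sentence vectors produced by the text feature extraction model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def center_sentences(label_vecs: dict, sentence_vecs: dict) -> dict:
    """Map each explanation label to the commodity sentence whose vector
    has the highest cosine similarity (its center commodity sentence)."""
    return {label: max(sentence_vecs, key=lambda s: cosine(v, sentence_vecs[s]))
            for label, v in label_vecs.items()}

labels = {"internal structure": [1.0, 0.1], "rainy-day use": [0.1, 1.0]}
sentences = {"s1 durable shell": [0.9, 0.2], "s2 waterproof seal": [0.0, 0.8]}
centers = center_sentences(labels, sentences)
```

Two labels may legitimately map to the same sentence, matching the behavior described above.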
Step S5430, based on the corresponding relation of the explanatory labels, establishing corresponding relation between each commodity picture and each central commodity sentence, so that each commodity picture is used as an illustration of the corresponding central commodity sentence in the commodity description text, and commodity description contents are obtained;
Since the explanation labels can be associated with both the center commodity sentences and the commodity pictures, an association can be established between each commodity picture and the center commodity sentence sharing the same explanation label. This association information is combined with the commodity sentences to jointly form the commodity description content, thereby formatting the commodity description text. Subsequently, when the commodity description text needs to be loaded from the commodity description content, the corresponding commodity pictures can be inserted at the center commodity sentences as illustrations, realizing automatic illustration within the commodity description text.
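A hedged sketch of assembling the formatted commodity description content: pictures and center commodity sentences sharing an explanation label are associated, so each picture becomes the illustration of its matching sentence. The dict layout is an illustrative serialization, not the patent's storage format.

```python
def build_description_content(sentences, picture_by_label, center_by_label):
    """Attach each picture to the center commodity sentence that shares its
    explanation label, then emit one entry per sentence in reading order."""
    illustrations = {}
    for label, pic in picture_by_label.items():
        center = center_by_label.get(label)
        if center is not None:
            illustrations.setdefault(center, []).append(pic)
    return [{"sentence": s, "pictures": illustrations.get(s, [])}
            for s in sentences]

content = build_description_content(
    ["Durable aluminium shell.", "Fully waterproof seal."],
    {"internal structure": "img_01.jpg", "rainy-day use": "img_02.jpg"},
    {"internal structure": "Durable aluminium shell.",
     "rainy-day use": "Fully waterproof seal."},
)
```

A terminal loading this structure can then render each picture immediately beside its center commodity sentence.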
And step S5440, releasing and storing the commodity description content to a commodity database of the electronic commerce platform.
After the processing of the commodity descriptive contents is completed, the commodity descriptive contents can be stored in a commodity database of an electronic commerce platform, and the commodity descriptive contents can be specifically stored in a commodity database of an online store corresponding to a merchant user of the target commodity, so that the release of the online commodity is realized.
According to this embodiment, on the basis of the commodity description text obtained from the commodity prompt text, commodity sentences are obtained by clause splitting, semantic matching is performed between the commodity sentences and the explanation labels of the commodity pictures, the center commodity sentences and their corresponding commodity pictures are determined, and finally the commodity description content is formatted and published to the e-commerce platform. The whole process is realized automatically and is therefore highly efficient. When a terminal device loads the corresponding commodity description content, automatic illustration and typesetting can be realized according to the correspondence between the center commodity sentences and the commodity pictures, achieving a coordinated display based on the selling-point association between the commodity description text and the corresponding commodity pictures.
Referring to fig. 9, a commodity description content generating apparatus according to an aspect of the present application includes a prompt acquiring module 5100, a description generating module 5200, and a description publishing module 5300, where the prompt acquiring module 5100 is configured to acquire a commodity prompt text, and the commodity prompt text includes a plurality of keywords for describing sales point information of a target commodity; the description generation module 5200 is configured to input the commodity prompt text into a preset description generation model to obtain a commodity description text generated by the description generation model, where the description generation model belongs to an example of a large language model obtained through reinforcement learning training, and the commodity description text includes text content corresponding to an connotation description of the selling point information of the commodity prompt text; the description release module 5300 is configured to construct a commodity description content of the target commodity according to the commodity description text, and store the commodity description content in a commodity database.
On the basis of any embodiment of the apparatus of the present application, the commodity descriptive content generating apparatus of the present application further includes: the first training module is set to adopt sentence pairs formed by commodity titles and corresponding commodity description texts in a first data set, and inputs a first prototype example of the large language model to implement self-supervision training to a convergence state so as to obtain a generated example model; the second training module is configured to input a second prototype instance of the large language model into a convergence state by adopting parallel corpus in a second data set to implement comparison learning training to obtain a scoring instance model, the second prototype instance is a network architecture obtained by replacing an output layer of the second prototype instance with a terminal linear layer on the basis of the large language model, the terminal linear layer maps an reasoning result of the large language model into a single-dimensional vector, and each parallel corpus comprises the same commodity title and two corresponding different commodity description texts; the third training module is used for building the generated instance model and the scoring instance model into a reinforcement learning framework, and performing reinforcement learning training on the reinforcement learning framework by adopting commodity titles in a third data set; and the deployment online module is used for deploying the generated instance model subjected to reinforcement learning training as the description generated model.
On the basis of any embodiment of the apparatus of the present application, the commodity descriptive content generating apparatus of the present application further includes: the sample collection module is used for collecting commodity titles corresponding to a plurality of commodity objects from a commodity sample library to form a commodity title set, dividing the commodity title set into a plurality of parts, wherein each part comprises a plurality of commodity titles; the first construction module is used for inputting each commodity title belonging to the first part in the commodity title set into the first reference model, obtaining commodity description text corresponding to each commodity title, and storing sentence pairs formed by the commodity title and the corresponding commodity description text into the first data set; the second construction module is used for inputting each commodity title belonging to the second part in the commodity title set into the first reference model and the second reference model respectively to obtain commodity description texts output by each reference model corresponding to each commodity title, forming parallel corpus by the commodity titles and commodity description texts obtained by the commodity titles corresponding to different reference models, and storing the parallel corpus into the second data set; and a third construction module configured to store a third portion of the commodity titles in the commodity title set separately to a third data set.
On the basis of any embodiment of the device of the present application, the first training module includes: the first sample calling unit is used for calling a single sentence pair in the first data set, and splicing the commodity title and the corresponding commodity description text thereof into a training sample; a first training execution unit configured to input the training sample into a first prototype instance, and to implement self-supervision training thereon; the first iteration decision unit is used for judging whether the loss value obtained by the first prototype example according to the training sample represents that the convergence is achieved, and updating the weight parameter of the first prototype example when the convergence is not achieved, and continuously constructing the next training sample from the first data set to continue the iterative training.
On the basis of any embodiment of the device of the present application, the second training module includes: the second sample calling unit is used for calling single parallel corpus in a second data set, two training samples corresponding to the same commodity title are constructed according to the commodity title and the two commodity description texts, and each training sample comprises the commodity title and the corresponding single commodity description text; the second training execution unit is used for respectively inputting two training samples corresponding to the same commodity title into a second prototype example to perform training to obtain the score output by the second prototype example at the tail end linear layer of the second prototype example; the second loss calculation unit is set to monitor the second prototype example for comparison learning training according to the corresponding scores of the two training samples and the preset relative advantages and disadvantages of the source reference model of the commodity description text in the two training samples, so as to obtain a loss value between the two training samples; and the second iteration decision unit is used for judging whether the loss value obtained by the second prototype example represents convergence or not, and updating the weight parameters of the second prototype example when the convergence is not reached, and continuing constructing the next pair of training samples from the second data set to continue iterative training.
On the basis of any embodiment of the apparatus of the present application, the third training module is configured to include: the third sample calling unit is used for calling commodity titles in a third data set to be independently used as training samples to be input into a generated instance model in the reinforcement learning architecture to start iteration, and corresponding commodity description texts are obtained by reasoning the generated instance model; a third sample scoring unit configured to determine, from a scoring instance model in the reinforcement learning architecture, a score corresponding to a sentence pair formed by a commodity title as the training sample and a commodity description text correspondingly output by the generated instance model; a third weight updating unit, configured to determine a desired score corresponding to the training sample according to a preset cost function, determine a model loss value according to a difference between the desired score and a score output by the scoring instance model, and implement weight updating on the generated instance model; and the third model optimizing unit is used for judging whether the loss value corresponding to the current iteration meets the preset condition, determining the highest value of the score obtained by each iteration when the loss value meets the preset condition, acquiring a generated example model corresponding to the highest value, and determining the generated example model as the description generated model, otherwise, calling the next commodity title to continue iterative training.
On the basis of any embodiment of the apparatus of the present application, the description publishing module 5300 includes: the resource acquisition unit is used for acquiring a plurality of commodity pictures and explanation labels of the commodity pictures provided by the commodity prompt text; the mapping determining unit is used for dividing the commodity description text into a plurality of commodity sentences, determining the semantic similarity between each description label and each commodity sentence, and determining the commodity sentence with the highest semantic similarity for each description label as the commodity sentence in the center; the content construction unit is used for establishing corresponding association between each commodity picture and each center commodity sentence based on the corresponding relation of the explanatory labels, so that each commodity picture is used as an illustration of the corresponding center commodity sentence in the commodity description text, and commodity description content is obtained; and the content release unit is used for releasing and storing the commodity description content to a commodity database of the electronic commerce platform.
Another embodiment of the present application further provides a commodity description content generating apparatus, whose internal structure is schematically shown in fig. 10. The commodity description content generating apparatus includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The non-volatile readable storage medium of the apparatus stores an operating system, a database, and computer readable instructions; the database may store information sequences, and the computer readable instructions, when executed by the processor, cause the processor to implement a commodity description content generating method.
The processor of the commodity description content generating device provides the computing and control capabilities that support the operation of the entire device. The memory may store computer-readable instructions that, when executed by the processor, cause the processor to perform the commodity description content generating method of the present application. The network interface is used for communicating with connected terminals.
It will be appreciated by those skilled in the art that the structure shown in fig. 10 is merely a block diagram of the portion of the structure relevant to the present application and does not limit the commodity description content generating device to which the present application is applied; a particular device may include more or fewer components than shown in the drawings, combine certain components, or arrange the components differently.
The processor in this embodiment performs the specific functions of each module in fig. 9, and the memory stores the program code and the various types of data required to execute the above modules or sub-modules. The network interface is used for data transmission with user terminals or servers. The non-volatile readable storage medium in this embodiment stores the program code and data necessary for executing all the modules of the commodity description content generating device of the present application, and the server can call that program code and data to execute the functions of all the modules.
The application also provides a non-transitory readable storage medium storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the commodity description content generating method of any of the embodiments of the present application.
The application also provides a computer program product comprising computer programs/instructions which when executed by one or more processors implement the steps of the method of any of the embodiments of the application.
It will be appreciated by those skilled in the art that all or part of the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program, which may be stored in a non-volatile readable storage medium; when executed, the program may include the steps of the method embodiments described above. The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
In summary, the present application uses a large language model trained by reinforcement learning to understand the selling-point information in a given commodity prompt text and to generate text content that expands on that information, which can improve the efficiency of commodity information generation and reduce the operating costs of online shops.
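The scoring (reward) model that drives the reinforcement-learning stage described above is supervised by pairwise comparison of two candidate descriptions for the same commodity title. The patent does not state the loss formula; the sketch below assumes the pairwise log-sigmoid ranking loss commonly used for reward models in reinforcement learning from human feedback, and the formula and function name are assumptions rather than quotations from the patent.

```python
import math

def pairwise_ranking_loss(score_preferred: float, score_rejected: float) -> float:
    """Contrastive loss for the scoring model: the description produced by
    the reference model preset as 'better' should receive the higher score.
    Assumed form: -log(sigmoid(score_preferred - score_rejected))."""
    diff = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# The loss shrinks when the preferred description already outscores the
# rejected one, and grows when the ordering is violated.
print(pairwise_ranking_loss(2.0, 0.5) < pairwise_ranking_loss(0.5, 2.0))  # True
```

Minimizing such a loss pushes the terminal linear layer to assign the higher scalar score to the better description, yielding a reward signal usable for training the generated instance model.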

Claims (10)

1. A commodity descriptive content generating method, comprising:
acquiring a commodity prompt text, wherein the commodity prompt text comprises a plurality of keywords describing selling-point information of a target commodity;
inputting the commodity prompt text into a preset description generation model to obtain a commodity description text generated by the description generation model, wherein the description generation model is an instance of a large language model obtained through reinforcement learning training, and the commodity description text comprises text content that expands on the meaning of the selling-point information of the commodity prompt text;
and constructing commodity description contents of the target commodity according to the commodity description text, and storing the commodity description contents into a commodity database.
2. The commodity description content generating method according to claim 1, wherein before acquiring the commodity prompt text, the method comprises:
inputting sentence pairs, each consisting of a commodity title and its corresponding commodity description text in a first data set, into a first prototype instance of the large language model, and implementing self-supervised training to a convergence state to obtain a generated instance model;
inputting parallel corpora in a second data set into a second prototype instance of the large language model, and implementing contrastive learning training to a convergence state to obtain a scoring instance model, wherein the second prototype instance is a network architecture obtained from the large language model by replacing its output layer with a terminal linear layer, the terminal linear layer maps the inference result of the large language model to a single-dimensional vector, and each parallel corpus comprises the same commodity title and two different corresponding commodity description texts;
building the generated instance model and the scoring instance model into a reinforcement learning architecture, and performing reinforcement learning training on the reinforcement learning architecture using commodity titles in a third data set; and
deploying the generated instance model subjected to reinforcement learning training as the description generation model.
3. The method of claim 2, wherein before inputting the sentence pairs consisting of the commodity titles and their corresponding commodity description texts in the first data set into the first prototype instance of the large language model to implement self-supervised training to a convergence state, the method comprises:
acquiring commodity titles corresponding to a plurality of commodity objects from a commodity sample library to form a commodity title set, and dividing the commodity title set into a plurality of parts, each part comprising a plurality of commodity titles;
inputting each commodity title belonging to the first part in the commodity title set into a first reference model to obtain commodity description text corresponding to each commodity title, and storing sentence pairs formed by the commodity title and the corresponding commodity description text into a first data set;
inputting each commodity title belonging to a second part of the commodity title set into the first reference model and a second reference model respectively to obtain the commodity description texts output by each reference model for each commodity title, forming a parallel corpus from each commodity title and the commodity description texts it obtained from the different reference models, and storing the parallel corpora into a second data set; and
storing a third part of the commodity titles in the commodity title set separately into a third data set.
4. The method of claim 3, wherein inputting the sentence pairs of the commodity titles in the first data set and their corresponding commodity description texts into the first prototype instance of the large language model to implement self-supervised training to a convergence state comprises:
invoking a single sentence pair in the first data set, and splicing the commodity title and the corresponding commodity description text thereof into a training sample;
inputting the training sample into the first prototype instance and performing self-supervised training on it;
and judging, according to the loss value obtained for the training sample, whether the first prototype instance has converged, and when convergence is not reached, updating the weight parameters of the first prototype instance and constructing the next training sample from the first data set to continue iterative training.
5. The commodity description content generating method according to claim 3, wherein inputting the parallel corpora in the second data set into the second prototype instance of the large language model and implementing contrastive learning training to a convergence state comprises:
invoking a single parallel corpus in the second data set, and constructing two training samples corresponding to the same commodity title from the commodity title and the two commodity description texts, wherein each training sample comprises the commodity title and a corresponding single commodity description text;
inputting the two training samples corresponding to the same commodity title into the second prototype instance respectively for training, to obtain the scores output by the second prototype instance at its terminal linear layer;
supervising contrastive learning training of the second prototype instance according to the scores corresponding to the two training samples and the preset relative superiority of the source reference models of the commodity description texts in the two training samples, to obtain a loss value between the two training samples;
and judging whether the loss value obtained by the second prototype instance indicates that convergence has been reached, and when convergence is not reached, updating the weight parameters of the second prototype instance and constructing the next pair of training samples from the second data set to continue iterative training.
6. The commodity description content generating method according to claim 3, wherein performing reinforcement learning training on the reinforcement learning architecture using the commodity titles in the third data set comprises:
invoking a commodity title in the third data set alone as a training sample and inputting it into the generated instance model in the reinforcement learning architecture to start an iteration, the generated instance model inferring the corresponding commodity description text;
determining, by the scoring instance model in the reinforcement learning architecture, a score corresponding to the sentence pair formed by the commodity title serving as the training sample and the commodity description text correspondingly output by the generated instance model;
determining an expected score corresponding to the training sample according to a preset cost function, determining a model loss value according to the difference between the expected score and the score output by the scoring instance model, and updating the weights of the generated instance model;
and judging whether the loss value of the current iteration meets a preset condition; when it does, determining the highest of the scores obtained across the iterations, acquiring the generated instance model corresponding to that highest score, and determining it as the description generation model; otherwise, invoking the next commodity title to continue iterative training.
7. The commodity description content generating method according to any one of claims 1 to 6, wherein constructing the commodity description content of the target commodity from the commodity description text and storing the commodity description content into a commodity database comprises:
acquiring a plurality of commodity pictures provided corresponding to the commodity prompt text and explanatory labels of the commodity pictures;
dividing the commodity description text into a plurality of commodity sentences, determining the semantic similarity between each explanatory label and each commodity sentence, and determining, for each explanatory label, the commodity sentence with the highest semantic similarity as its central commodity sentence;
associating each commodity picture with its central commodity sentence based on the correspondence of the explanatory labels, so that each commodity picture serves as an illustration of the corresponding central commodity sentence in the commodity description text, to obtain the commodity description content;
and publishing and storing the commodity description content to a commodity database of an electronic commerce platform.
8. A commodity descriptive content generating apparatus, comprising:
the prompt acquisition module is used for acquiring a commodity prompt text, wherein the commodity prompt text comprises a plurality of keywords describing the selling-point information of a target commodity;
the description generation module is used for inputting the commodity prompt text into a preset description generation model to obtain a commodity description text generated by the description generation model, wherein the description generation model is an instance of a large language model obtained through reinforcement learning training, and the commodity description text comprises text content that expands on the meaning of the selling-point information of the commodity prompt text;
and the description release module is used for constructing commodity description contents of the target commodity according to the commodity description text and storing the commodity description contents into a commodity database.
9. A commodity description content generating device comprising a central processor and a memory, wherein the central processor invokes a computer program stored in the memory to perform the steps of the method of any one of claims 1 to 7.
10. A non-transitory readable storage medium storing, in the form of computer-readable instructions, a computer program implementing the method of any one of claims 1 to 7, the computer program, when invoked by a computer, performing the steps of the corresponding method.
CN202311016697.7A 2023-08-11 2023-08-11 Commodity description content generation method, device, equipment and medium thereof Pending CN117057873A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311016697.7A CN117057873A (en) 2023-08-11 2023-08-11 Commodity description content generation method, device, equipment and medium thereof


Publications (1)

Publication Number Publication Date
CN117057873A true CN117057873A (en) 2023-11-14

Family

ID=88667162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311016697.7A Pending CN117057873A (en) 2023-08-11 2023-08-11 Commodity description content generation method, device, equipment and medium thereof

Country Status (1)

Country Link
CN (1) CN117057873A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117521604A * 2023-12-04 2024-02-06 北京知呱呱科技有限公司 Trademark generation method and system based on large model
CN117473072A * 2023-12-28 2024-01-30 杭州同花顺数据开发有限公司 Financial research report generation method, device, equipment and storage medium
CN117473072B * 2023-12-28 2024-03-15 杭州同花顺数据开发有限公司 Financial research report generation method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US11694257B2 (en) Utilizing artificial intelligence to make a prediction about an entity based on user sentiment and transaction history
CN117057873A (en) Commodity description content generation method, device, equipment and medium thereof
US20140019236A1 (en) Selecting advertisement for presentation using purchase data of pending transaction
US20220012768A1 (en) Iteratively improving an advertisement response model
EP4242955A1 (en) User profile-based object recommendation method and device
CN111008335B (en) Information processing method, device, equipment and storage medium
US20140019256A1 (en) Selecting advertisement for presentation using previously stored data corresponding to identified customer
US20230089850A1 (en) Real-time product environmental impact scoring
US20230206304A1 (en) Systems and methods for using keywords extracted from reviews
US11657107B2 (en) Systems and methods for using keywords extracted from reviews
CN117251547A (en) User question response method and device, equipment and medium thereof
US20210232971A1 (en) Data meta-model based feature vector set generation for training machine learning models
CN113792134A (en) User service method and system based on digital twin technology
CN117132328A (en) Advertisement putting control method and device, equipment and medium thereof
US20230385887A1 (en) Techniques for automatic filling of an input form to generate a listing
CN114418609A (en) Deal probability estimation method, storage medium, and program product
CN113688222A (en) Insurance sales task conversational recommendation method, system and equipment based on context semantic understanding
US11836170B1 (en) Machine learning-based systems and methods for synthesizing digital correspondences and transactional artifacts
US10445787B2 (en) Predicting merchant behavior using merchant website terms
CN118246411A (en) Search article generation method, device, equipment and medium thereof
KR102653483B1 (en) Method of predicting price of artwork based on artificial intelligence
US11715145B2 (en) Segment-based recommendation engine
US20240054356A1 (en) Systems and methods for generating multipurpose graph node embeddings for machine learning
US20230206254A1 (en) Computer-Based Systems Including A Machine-Learning Engine That Provide Probabilistic Output Regarding Computer-Implemented Services And Methods Of Use Thereof
CN117743516A (en) Language model training method and device, equipment and medium thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination