CN117807278A - Resource retrieval method, training method and device based on large language model

Resource retrieval method, training method and device based on large language model

Info

Publication number
CN117807278A
Authority
CN
China
Prior art keywords
sample
semantic
resource
language model
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311842314.1A
Other languages
Chinese (zh)
Inventor
王超
黄飞
贺登武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu.com Times Technology (Beijing) Co., Ltd.
Original Assignee
Baidu.com Times Technology (Beijing) Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu.com Times Technology (Beijing) Co., Ltd.
Priority to CN202311842314.1A
Publication of CN117807278A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a resource retrieval method, a training method and a device based on a large language model, and relates to the technical field of artificial intelligence, in particular to the technical fields of intelligent search, big data, deep learning, large language models and the like; it can be used in scenarios such as information retrieval and human-computer interaction. The resource retrieval method based on the large language model includes: in response to receiving search information, processing the search information using a large language model to obtain a semantic coding sequence, where the semantic coding sequence includes at least one semantic code identifier and the semantic code identifier characterizes at least one semantic attribute; determining intermediate resources associated with the semantic coding sequence from candidate resources of a preset resource library; determining target resources matching the search information according to the intermediate resources; and displaying the target resources.

Description

Resource retrieval method, training method and device based on large language model
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to the technical fields of intelligent search, big data, deep learning, large language models and the like, and can be applied to scenarios such as information retrieval and human-computer interaction.
Background
With the rapid development of Internet technology, users can quickly browse resource information such as news through terminal devices such as smartphones, and can input search information such as text and images into the terminal device as needed to retrieve resource information, so that resource information matching actual requirements can be found and resource acquisition efficiency improved.
Disclosure of Invention
The disclosure provides a resource retrieval method based on a large language model, a training method, a device, an electronic device and a storage medium.
According to an aspect of the present disclosure, there is provided a resource retrieval method based on a large language model, including: in response to receiving the search information, processing the search information by using a large language model to obtain a semantic coding sequence, wherein the semantic coding sequence comprises at least one semantic coding identifier, and the semantic coding identifier characterizes at least one semantic attribute; determining intermediate resources associated with the semantic coding sequence from candidate resources of a preset resource library; determining target resources matched with the search information according to the intermediate resources; and displaying the target resource.
According to another aspect of the present disclosure, there is provided a training method of a large language model, including: obtaining a training sample, wherein the training sample comprises sample search information and a sample semantic coding sequence corresponding to sample resources, the sample search information is associated with the sample resources, the sample semantic coding sequence comprises at least one sample semantic coding identifier, and the sample semantic coding identifier characterizes at least one sample semantic attribute; and training an initial large language model based on the sample search information and the sample semantic coding sequence to obtain a trained large language model.
According to another aspect of the present disclosure, there is provided a resource retrieval device based on a large language model, including: the semantic coding sequence obtaining module is used for responding to the received search information, processing the search information by utilizing a large language model to obtain a semantic coding sequence, wherein the semantic coding sequence comprises at least one semantic coding identifier, and the semantic coding identifier characterizes at least one semantic attribute; the intermediate resource determining module is used for determining intermediate resources associated with the semantic coding sequence from candidate resources of a preset resource library; the target resource determining module is used for determining target resources matched with the search information according to the intermediate resources; and the first display module is used for displaying the target resource.
According to another aspect of the present disclosure, there is provided a training apparatus of a large language model, including: the acquisition module is used for acquiring a training sample, the training sample comprises sample search information and a sample semantic coding sequence corresponding to sample resources, the sample search information is associated with the sample resources, the sample semantic coding sequence comprises at least one sample semantic coding identifier, and the sample semantic coding identifier characterizes at least one sample semantic attribute; and the training module is used for training the initial large language model based on the sample search information and the sample semantic coding sequence to obtain a trained large language model.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with an embodiment of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform a method provided according to an embodiment of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method provided according to embodiments of the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 schematically illustrates an exemplary system architecture to which large language model based resource retrieval methods and apparatus may be applied, in accordance with embodiments of the present disclosure;
FIG. 2 schematically illustrates a flow diagram of a large language model based resource retrieval method in accordance with an embodiment of the present disclosure;
FIG. 3 schematically illustrates an application scenario diagram of a large language model-based resource retrieval method according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates an application scenario diagram of a large language model-based resource retrieval method according to another embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow chart of a training method of a large language model according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a schematic diagram of training a large language model according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a schematic diagram of training a large language model according to another embodiment of the present disclosure;
FIG. 8 schematically illustrates a block diagram of a large language model based resource retrieval device in accordance with an embodiment of the present disclosure;
FIG. 9 schematically illustrates a block diagram of a training apparatus of a large language model according to an embodiment of the present disclosure; and
Fig. 10 schematically illustrates a block diagram of an electronic device suitable for implementing the large language model-based resource retrieval method and training method, in accordance with an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the technical solution of the disclosure, the acquisition, storage and application of the personal information of users involved all conform to the provisions of relevant laws and regulations, necessary security measures are taken, and public order and good customs are not violated.
In search scenarios for resource information such as news resources and advertisement resources, it is usually difficult to accurately retrieve resources matching the search requirements of a target object from the relatively simple search terms the target object inputs, so the target object has to search through multiple rounds to find resources meeting those requirements. This increases the time required for retrieval and, for target objects that cannot retrieve resources matching their requirements, easily degrades the user experience.
The embodiment of the disclosure provides a large language model-based resource retrieval method, a training method, a device, an electronic device and a storage medium, wherein the large language model-based resource retrieval method comprises the following steps: in response to receiving the search information, processing the search information by using a large language model to obtain a semantic coding sequence, wherein the semantic coding sequence comprises at least one semantic coding identifier, and the semantic coding identifier characterizes at least one semantic attribute; determining intermediate resources associated with the semantic coding sequence from candidate resources of a preset resource library; determining target resources matched with the search information according to the intermediate resources; and displaying the target resource.
According to the embodiments of the disclosure, the large language model processes the search information to obtain a semantic coding sequence related to the resource to be retrieved, and each semantic code identifier can characterize at least one semantic attribute. Based on the semantic understanding and text prediction capabilities of the large language model, the output semantic coding sequence can therefore determine the semantic attributes matching the search information while fully understanding the search intent of the search information. At least one intermediate resource matching the search information can then be queried from the preset resource library according to the semantic attributes characterized by the semantic coding sequence, and the target resource is determined from the intermediate resources and displayed. This improves the degree of match between the displayed target resource and the search intent of the search information, thereby improving resource retrieval precision and retrieval efficiency.
FIG. 1 schematically illustrates an exemplary system architecture to which large language model based resource retrieval methods and apparatus may be applied, in accordance with embodiments of the present disclosure.
It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios. For example, in another embodiment, an exemplary system architecture to which the large language model-based resource retrieval method and apparatus may be applied may include a terminal device, but the terminal device may implement the large language model-based resource retrieval method and apparatus provided by the embodiments of the present disclosure without interacting with a server.
As shown in fig. 1, a system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired and/or wireless communication links, and the like.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as a knowledge reading class application, a web browser application, a search class application, an instant messaging tool, a mailbox client and/or social platform software, etc. (as examples only).
The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for content browsed by the user using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the resource retrieval method based on the large language model provided in the embodiments of the present disclosure may generally be executed by the server 105. Accordingly, the large language model-based resource retrieval device provided by the embodiments of the present disclosure may generally be provided in the server 105. The large language model-based resource retrieval method provided by the embodiments of the present disclosure may also be performed by a server or server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the large language model-based resource retrieval device provided by the embodiments of the present disclosure may also be provided in a server or server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
FIG. 2 schematically illustrates a flow chart of a large language model based resource retrieval method in accordance with an embodiment of the present disclosure.
As shown in fig. 2, the large language model-based resource retrieval method includes operations S210 to S240.
In operation S210, in response to receiving search information, the search information is processed using the large language model to obtain a semantic coding sequence, where the semantic coding sequence includes at least one semantic code identifier that characterizes at least one semantic attribute.
In operation S220, an intermediate resource associated with the semantic coding sequence is determined from among candidate resources of a preset resource library.
In operation S230, a target resource that matches the search information is determined according to the intermediate resource.
In operation S240, the target resource is displayed.
According to an embodiment of the present disclosure, the search information may include text information generated based on an input operation of the target object, for example, the text information may be obtained based on various types of information such as text, image, audio, etc. input by the target object in a search information input box of the page. The search information may be characterized based on a natural language manner, or may be further characterized based on other types of text such as numerals, characters, etc., and the specific type of the search information is not limited by the embodiments of the present disclosure.
According to embodiments of the present disclosure, a large language model (LLM) may include a deep learning model trained using a large amount of text data; it may, for example, be constructed based on a neural network such as a Transformer, and is used to understand the meaning of language text and to generate natural language text. A large language model can handle natural language tasks. Since large language models typically contain billions of parameters, the large-scale parameters help them learn complex patterns in natural language data, so they perform well in natural language processing (NLP) tasks.
According to embodiments of the present disclosure, a semantic code identifier in a semantic coding sequence may characterize a semantic attribute of resources matching the demand intent of the search information; for example, the semantic attribute "plant cultivation" may be characterized by the semantic code identifier "1". As another example, the semantic attribute "flower cultivation" may be characterized by the semantic code identifier "2".
According to the embodiments of the disclosure, semantic code identifiers can be characterized by characters or character strings and can correspond one-to-one with semantic attributes. The semantic code identifiers in a semantic coding sequence are arranged in order and can be used to represent semantic attributes, so the retrieval range of resource information can be determined according to the semantic coding sequence, improving retrieval efficiency and retrieval accuracy.
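For illustration only, a minimal Python sketch of such a one-to-one mapping is given below; the table contents and function name are assumptions, not part of the disclosure.

```python
# Hypothetical mapping between semantic code identifiers and semantic
# attributes, following the "1" -> "plant cultivation" example above.
SEMANTIC_ATTRIBUTES = {
    "1": "plant cultivation",
    "2": "flower cultivation",
}

def decode_sequence(semantic_coding_sequence):
    """Resolve each semantic code identifier in the sequence to its attribute."""
    return [SEMANTIC_ATTRIBUTES[code] for code in semantic_coding_sequence]

print(decode_sequence(["1", "2"]))  # ['plant cultivation', 'flower cultivation']
```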
According to an embodiment of the present disclosure, the preset resource library may include one or more candidate resources, which may be associated with respective candidate semantic coding sequences, which may characterize a plurality of semantic attributes that the candidate resources characterize.
According to the embodiment of the present disclosure, the candidate resources may include any type of resources that can be browsed based on a terminal device such as a smart phone, including news resources, advertisement resources, video resources, and the like, and the embodiment of the present disclosure does not limit the specific type of the candidate resources, and a person skilled in the art may select according to actual needs.
According to the embodiments of the disclosure, determining the intermediate resources associated with the semantic coding sequence from the candidate resources of the preset resource library may include determining candidate semantic coding sequences that at least partially match the semantic coding sequence, and then determining, from the candidate resources, the intermediate resources matching the semantic coding sequence according to the association between the candidate resources and their candidate semantic coding sequences.
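As a toy illustration of this matching step (the prefix-match rule and the in-memory library below are assumptions; the disclosure only requires an at least partial match):

```python
# Toy preset resource library: candidate resource id -> candidate semantic
# coding sequence (contents are illustrative assumptions).
RESOURCE_LIBRARY = {
    "ad_resource_1": ["1", "4", "5"],
    "ad_resource_2": ["1", "4", "9"],
    "ad_resource_3": ["2", "7", "3"],
}

def recall_intermediate_resources(semantic_coding_sequence, match_len=2):
    """Recall candidates whose sequences share a prefix with the model output."""
    prefix = semantic_coding_sequence[:match_len]
    return [rid for rid, seq in RESOURCE_LIBRARY.items() if seq[:match_len] == prefix]

print(recall_intermediate_resources(["1", "4", "5"]))  # ['ad_resource_1', 'ad_resource_2']
```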
According to the embodiments of the disclosure, when the large language model outputs a plurality of initial semantic coding sequences, the semantic coding sequence can be obtained from the plurality of initial semantic coding sequences using a search strategy such as beam search or probabilistic random sampling, so as to achieve accurate retrieval of intermediate resources.
According to an embodiment of the present disclosure, determining the target resource matching the search information from the intermediate resources may include taking the intermediate resources as the target resources to improve the recall rate of resource information. It is not limited thereto: a plurality of intermediate resources may be screened based on preset configuration parameters to obtain the target resources, or the target resources may be generated based on a selection operation of the target object on the intermediate resources. The embodiments of the present disclosure do not limit the specific manner of determining the target resource, and those skilled in the art may choose according to actual requirements.
Fig. 3 schematically illustrates an application scenario diagram of a large language model-based resource retrieval method according to an embodiment of the present disclosure.
As shown in fig. 3, the application scenario may include an interactive interface 300. The target object may input search information 311 "flower" in a search box 310 of the interactive interface 300. In response to obtaining the search information 311, the resource retrieval method provided by the embodiments of the present disclosure may be executed: the search information 311 is processed using the large language model to obtain a semantic coding sequence including a plurality of semantic code identifiers, and an intermediate resource is determined from the preset resource library using the semantic coding sequence. The large language model may also output feedback information 321 about the semantic attributes of the intermediate resources, e.g., "OK, advertisement resources related to 'flowers' and 'potted plants' have been displayed for you, as follows:", and the intermediate resources may be determined as the target resources 330. By presenting the feedback information 321 and the target resources 330 in the interactive interface 300, the target object is prompted to browse advertisement resource 1, advertisement resource 2, ..., advertisement resource N, improving advertisement resource recall for the search information 311 "flower".
According to an embodiment of the present disclosure, the large language model-based resource retrieval method further includes: and displaying the intermediate resources.
According to an embodiment of the present disclosure, determining a target resource that matches the search information from the intermediate resource may include: in response to a validation operation for a target resource in the intermediary resources, the target resource is determined according to the validation operation.
According to the embodiments of the disclosure, the intermediate resources can be rendered in the interactive interface to facilitate selection by the target object. From the plurality of retrieved intermediate resources, the target object can determine one or more target resources to browse according to actual requirements. In this way, the target object can retrieve reasonably accurate intermediate resources by inputting search information in an open form of expression, and the target resources are then determined by the target object's autonomous selection operation, so that the target resources more accurately match the target object's search requirements.
According to an embodiment of the present disclosure, the large language model-based resource retrieval method may further include: displaying a code identification object sequence corresponding to the semantic coding sequence; and in response to an editing operation for the code identification object sequence, updating the semantic coding sequence according to the editing operation to obtain a new semantic coding sequence.
According to the embodiments of the disclosure, the code identification object sequence can represent the one or more semantic code identifiers arranged in the semantic coding sequence. The target object can update the semantic code identifiers in the semantic coding sequence through editing operations on the code identification object sequence, such as deleting, adding, replacing and confirming, thereby obtaining a new semantic coding sequence. A new intermediate resource can then be generated according to the new semantic coding sequence, and a target resource matching the retrieval intent of the target object determined from it. By updating the recalled intermediate resources in time, the target resource can be generated promptly to meet the immediate needs of the target object.
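A toy sketch of applying such editing operations to the sequence follows; the operation encoding is an assumption made for illustration.

```python
def apply_edit(semantic_coding_sequence, op, index=None, code=None):
    """Apply a delete/add/replace/confirm editing operation to the sequence."""
    updated = list(semantic_coding_sequence)
    if op == "delete":
        updated.pop(index)
    elif op == "add":
        updated.insert(index, code)
    elif op == "replace":
        updated[index] = code
    return updated  # "confirm" leaves the sequence unchanged

# Replace the third semantic code identifier to obtain a new sequence.
print(apply_edit(["1", "4", "5"], op="replace", index=2, code="9"))  # ['1', '4', '9']
```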
Fig. 4 schematically illustrates an application scenario diagram of a large language model-based resource retrieval method according to another embodiment of the present disclosure.
As shown in fig. 4, the application scenario may include an interactive interface 400. The target object may input search information 411 "flower cultivation" in a search box 410 of the interactive interface 400. In response to obtaining the search information 411, the resource retrieval method provided by the embodiments of the present disclosure may be executed: the search information 411 is processed using the large language model to obtain a semantic coding sequence including a plurality of semantic code identifiers, and feedback information 421 related to the search information 411 may also be obtained, e.g., "OK, the required advertisement semantic classifications have been displayed for you." The feedback information 421 and a code identification object sequence 430 corresponding to the semantic coding sequence may be presented in the interactive interface 400. The code identification object sequence 430 may include a plurality of code identification objects such as "flower cultivation", "indoor flower cultivation" and "orchid cultivation"; the code identification objects can represent the plurality of semantic code identifiers arranged in the semantic coding sequence, so the target object can quickly and conveniently learn the semantic attributes of the recalled resources, and can edit the code identification object sequence 430 to generate a new semantic coding sequence, thereby meeting the target object's ad hoc retrieval requirements.
According to an embodiment of the present disclosure, the large language model-based resource retrieval method may further include: displaying a code identification object sequence corresponding to the semantic coding sequence; in response to acquiring update requirement information for the code identification object sequence, processing the update requirement information using the large language model to obtain a new semantic coding sequence; and displaying a code identification object sequence representing the new semantic coding sequence.
According to the embodiments of the disclosure, the update requirement information can be characterized by text input by the target object and expressed in natural language, so the target object can update the semantic coding sequence in time in an open, natural-language manner, and target resources better matching the target object's ad hoc search requirements can then be retrieved according to the new semantic coding sequence.
According to an embodiment of the present disclosure, the candidate resource is associated with a candidate semantic coding sequence, the candidate semantic coding sequence being determined based on: acquiring candidate resource related information corresponding to the candidate resource; extracting semantic features of the candidate resource related information to obtain candidate resource related semantic features; and determining candidate semantic coding sequences according to the candidate resource related semantic features.
According to embodiments of the present disclosure, the candidate resource related information may include various types of information, such as text and images, characterizing the candidate resource. For example, where the candidate resource is an advertisement resource, the candidate resource related information may include the advertisement title, advertisement publicity text, advertisement icon information, advertisement description information and the like of the advertisement resource. It is not limited thereto and may also include other types of information related to the candidate resource; for example, where the candidate resource is an advertisement resource, it may include interaction information related to the candidate resource, such as jump page information of the advertisement resource, the click-through rate of the advertisement resource, and historical search terms corresponding to the advertisement resource.
In addition, when the advertisement related information contains information of types such as image and audio, the advertisement related information can be processed based on a text recognition algorithm to obtain text information, and semantic feature extraction can then be performed on the recognized text information to obtain the candidate resource related semantic features.
According to the embodiments of the disclosure, the candidate resource related information can be processed based on a pre-trained encoder network layer, so as to extract semantic features from the candidate resource related information. The candidate resource related semantic features can characterize the semantic attributes of the candidate resource related information; by processing the candidate resource related semantic features with a deep learning algorithm, the candidate semantic code identifiers in the resulting candidate semantic coding sequence can characterize those semantic attributes accurately. Resource retrieval can then be performed according to the semantic coding sequence output by the large language model and the candidate semantic coding sequences, achieving accurate retrieval from semantic code identifiers that characterize semantic attributes while fully preserving the plurality of semantic attributes of the candidate resource related information. This improves the degree of match between the recalled intermediate resources and the search information, and thus the retrieval precision and retrieval effect for target resources.
In one embodiment of the present disclosure, the candidate resource related information may be processed based on a pre-trained encoder to obtain the candidate semantic coding sequences corresponding to the candidate resources. The pre-trained encoder can include an encoder network layer and a semantic code identification embedding layer, and the semantic code identification embedding layer can be used to process the candidate resource related semantic features to obtain the candidate semantic coding sequences.
According to an embodiment of the present disclosure, determining the candidate semantic coding sequence according to the candidate resource-related semantic features may further comprise: processing the related semantic features of the candidate resources based on a clustering algorithm to obtain a plurality of candidate semantic code identifiers; and determining a candidate semantic code sequence according to the plurality of candidate semantic code identifications.
According to the embodiments of the disclosure, the candidate resource related semantic features may be processed based on any type of clustering algorithm; for example, they may be processed based on a hierarchical clustering algorithm to obtain a plurality of clusters, and the code identifiers corresponding to the clusters may be determined as the plurality of candidate semantic code identifiers corresponding to the candidate resource. The plurality of candidate semantic code identifiers can be arranged based on the hierarchical relationship among the clusters to obtain a candidate semantic coding sequence, as in the sketch below.
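The sketch assumes Ward-linkage hierarchical clustering over random stand-in features, with one identifier per hierarchy level (coarse to fine); none of these concrete choices are fixed by the disclosure.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 32))  # stand-in candidate resource related semantic features
Z = linkage(features, method="ward")   # hierarchical clustering

def candidate_code_sequence(Z, index, levels=(4, 16, 64)):
    """One candidate semantic code identifier per hierarchy level, coarse to fine."""
    return [str(fcluster(Z, t=k, criterion="maxclust")[index]) for k in levels]

print(candidate_code_sequence(Z, index=0))  # e.g. ['3', '11', '42']
```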
According to the embodiments of the disclosure, the candidate semantic coding sequence is used to characterize the plurality of semantic attributes of the candidate resource, so the plurality of candidate semantic code identifiers arranged in the sequence can mirror the "hierarchical abstraction" mode of human thinking, managing the plurality of semantic attributes of the candidate resource hierarchically and precisely. By using the semantic understanding capability of the large language model to fully learn the mapping relationship between semantic coding sequences and semantic attributes, the semantic coding sequence output by the large language model can accurately query the intermediate resources matching the semantic attributes required by the search information, improving the accuracy of the recalled resource information.
In one embodiment of the disclosure, the industry semantic attributes of advertisement resources may first be divided according to the industry of the advertisement provider to obtain a subset of advertisement resources for each industry, and the advertisement resources in each subset are used as candidate resources for which candidate semantic coding sequences are determined. For example, a candidate semantic coding sequence may be "[1][4][5][1][9]", where the semantic attributes characterized by its candidate semantic code identifiers are, respectively: "[legal consultation]", "[civil consultation]", "[divorce property division consultation]" and "[group 19 advertisement]". Candidate advertisement resources can thus be retrieved accurately according to the semantic coding sequence output by the large language model, avoiding the low retrieval precision caused by recalling resources through semantic similarity calculation. At the same time, the semantic attributes related to a candidate resource can be characterized precisely by semantic code identifiers, avoiding the loss of feature information from the candidate resource related information during encoding, which further improves retrieval accuracy.
According to an embodiment of the present disclosure, processing search information using a large language model, deriving a semantic coding sequence may include: updating a preset search prompt template based on the search information to obtain search prompt information; and processing the search prompt information by using the large language model to obtain a semantic coding sequence.
According to embodiments of the present disclosure, a prompt template may be information used to help the large language model understand the task of generating a semantic coding sequence. The prompt template may be determined based on a prompt token sequence that guides the large language model to predict accurately, and the prompt token sequence may include any type of prompt token, such as characters, fields and words. The search prompt template may be populated based on the search information to obtain the search prompt information.
In one example of the present disclosure, the search prompt information may be characterized by the paragraph enclosed in "//" below:
// The search information of the user is "flower", and the matched advertisement semantic coding sequence is: //
By inputting the search prompt information into the large language model, the large language model can output a semantic coding sequence matching the search information; a minimal sketch of filling the template follows.
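In the sketch, the template wording mirrors the "//" example above, and large_language_model is a hypothetical callable rather than a named API.

```python
SEARCH_PROMPT_TEMPLATE = (
    'The search information of the user is "{search_information}", '
    "and the matched advertisement semantic coding sequence is:"
)

def build_search_prompt(search_information):
    """Update the preset search prompt template with the search information."""
    return SEARCH_PROMPT_TEMPLATE.format(search_information=search_information)

prompt = build_search_prompt("flower")
# semantic_coding_sequence = large_language_model(prompt)  # hypothetical model call
print(prompt)
```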
According to an embodiment of the present disclosure, processing search information using a large language model, obtaining a semantic coding sequence includes: information inquiry is carried out based on the search information, and associated search information is obtained; and processing the search information and the associated search information by using the large language model to obtain a semantic coding sequence.
According to embodiments of the present disclosure, the associated search information may characterize background knowledge related to the search information; knowledge information associated with the search information may be queried from a preset knowledge base, and the associated search information generated from that knowledge information.
According to the embodiments of the disclosure, the preset search prompt template can be updated using the search information and the associated search information to obtain the search prompt information. Inputting this search prompt information into the large language model enables the model to understand the background knowledge related to the search information, execute the retrieval task more accurately, and output a semantic coding sequence matching the search information.
In one example of the present disclosure, the search prompt information may be characterized by the paragraph enclosed in "//" below:
// The search information of the user is "peony", and the background knowledge related to "peony" includes "Paeoniaceae, Paeonia plant ...". The advertisement semantic coding sequence that can be matched for it is: //
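A sketch under assumptions (a toy knowledge base and hypothetical wording) of producing such knowledge-augmented prompt information:

```python
# Toy preset knowledge base mapping search information to background knowledge.
KNOWLEDGE_BASE = {
    "peony": "Paeoniaceae, Paeonia plant ...",
}

def build_augmented_prompt(search_information):
    """Query associated search information and fill the prompt template with it."""
    background = KNOWLEDGE_BASE.get(search_information, "")
    return (
        f'The search information of the user is "{search_information}", and the '
        f'background knowledge related to it includes "{background}". '
        "The advertisement semantic coding sequence that can be matched for it is:"
    )

print(build_augmented_prompt("peony"))
```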
FIG. 5 schematically illustrates a flow chart of a training method for a large language model according to an embodiment of the present disclosure.
As shown in FIG. 5, the training method of the large language model includes operations S510 to S520.
In operation S510, a training sample is acquired, which includes sample search information and a sample semantic coding sequence corresponding to a sample resource.
In operation S520, an initial large language model is trained based on the sample search information and the sample semantic coding sequence, resulting in a trained large language model.
According to an embodiment of the present disclosure, the sample search information is associated with a sample resource, the sample semantic coding sequence comprising at least one sample semantic coding identifier, the sample semantic coding identifier characterizing at least one sample semantic attribute.
According to an embodiment of the present disclosure, the sample search information may include search information generated in a history period, and the sample resource may include a resource obtained based on a selection operation corresponding to the search information in the history period. The sample semantic coding sequence may be obtained by processing sample resource related information corresponding to the sample resource based on a pre-trained encoder.
According to the embodiments of the disclosure, training the initial large language model according to the sample search information and the sample semantic coding sequence can include fine-tuning the model parameters of the pre-trained initial large language model, so that the trained large language model, relying on its strong semantic understanding and autoregressive generation capabilities, fully learns the mapping relationship between sample semantic coding sequences and sample semantic attributes. The semantic coding sequence output by the large language model can then accurately query the intermediate resources matching the semantic attributes required by the search information, improving the accuracy of the recalled resource information. A heavily simplified training sketch follows.
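In the sketch, a tiny GRU network stands in for the pre-trained large language model; the vocabulary, sizes and data are illustrative assumptions, not the disclosed training setup.

```python
import torch
import torch.nn as nn

VOCAB = 128  # token ids covering prompt text and semantic code identifiers

class TinyCausalLM(nn.Module):
    """Stand-in for the pre-trained large language model being fine-tuned."""
    def __init__(self, vocab=VOCAB, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):
        hidden, _ = self.rnn(self.embed(tokens))
        return self.head(hidden)  # next-token logits

model = TinyCausalLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# One toy tokenized pair: sample search information followed by the sample
# semantic coding sequence; labels are the inputs shifted left by one step.
tokens = torch.randint(0, VOCAB, (1, 16))
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
print(float(loss))
```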
It should be noted that the technical terms mentioned in the training method of the large language model provided in the embodiments of the present disclosure (including but not limited to sample resources, sample search information and sample semantic coding sequences) and the technical terms mentioned in the resource retrieval method based on the large language model provided in the above embodiments (including but not limited to candidate resources, search information and candidate semantic coding sequences) have the same or corresponding properties, which are not repeated here.
The large language model obtained by training the large language model training method provided by the embodiment of the present disclosure may be applied to the large language model-based resource retrieval method provided in the foregoing embodiment, and the embodiments of the present disclosure are not described herein again.
According to an embodiment of the present disclosure, a preset resource library may be constructed by using the sample resources provided by the embodiments of the present disclosure as candidate resources, and the sample semantic coding sequences corresponding to the sample resources may serve as the candidate semantic coding sequences corresponding to the candidate resources.
According to an embodiment of the present disclosure, the training sample further comprises sample resource related information corresponding to the sample resource.
According to embodiments of the present disclosure, the sample resource related information may include various types of information, such as text and images, characterizing the sample resource. For example, where the sample resource is an advertisement resource, the sample resource related information may include the advertisement title, advertisement publicity text, advertisement icon information, advertisement description information and the like of the advertisement resource. It is not limited thereto and may also include other types of information related to the sample resource; for example, where the sample resource is an advertisement resource, it may include interaction information related to the sample resource, such as jump page information of the advertisement resource, the click-through rate of the advertisement resource, and historical search terms corresponding to the advertisement resource.
According to embodiments of the present disclosure, training an initial large language model based on sample search information and sample semantic coding sequences may include: training an initial large language model according to the sample resource related information to obtain an intermediate large language model; determining sample search prompt information according to the sample search information and a preset sample search prompt template; and training the intermediate large language model by using the sample search prompt information and the sample semantic coding sequence.
According to an embodiment of the present disclosure, training the initial large language model according to the sample resource-related information may include inputting the sample resource-related information into the initial large language model to obtain a first semantic coding sequence, and adjusting model parameters of the initial large language model by using a loss value between the first semantic coding sequence and the sample semantic coding sequence to obtain a trained intermediate large language model.
According to an embodiment of the present disclosure, training the initial large language model based on the sample resource-related information may include: determining sample semantic prompt information according to the sample resource related information and a preset sample semantic prompt template; and training the initial large language model by using the sample semantic prompt information and the sample semantic coding sequence.
FIG. 6 schematically illustrates a schematic diagram of training a large language model according to an embodiment of the present disclosure.
As shown in fig. 6, training the initial large language model according to the sample resource related information to obtain the intermediate large language model may include: and updating a preset sample semantic prompt template according to the sample resource related information to obtain sample semantic prompt information 601. The sample semantic prompt 601 is processed using the initial large language model 610 to obtain a first semantic coding sequence 603. Using the sample semantic coding sequence 605 as a tag, an initial large language model 610 is trained from the first semantic coding sequence 603 and the sample semantic coding sequence 605 to obtain an intermediate large language model 620.
As shown in fig. 6, training the intermediate large language model with sample search hints and sample semantic coding sequences may include: the sample search hint information 602 is input into the intermediate large language model 620 and the second semantic coding sequence 604 is output. And training the middle large language model 620 according to the second semantic coding sequence 604 and the sample semantic coding sequence 605 by taking the sample semantic coding sequence 605 as a label to obtain a trained large language model.
For example, the sample semantic prompt information may be characterized by the paragraph enclosed in "//" below:
// The title of the advertisement is XXXX, the page content text of the advertisement is yyyy, and the corresponding advertisement semantic coding sequence is: //
For example, the sample search hint information may be characterized by the paragraph enclosed in "//" below:
// The search information of the user is "flower", and the matched advertisement semantic coding sequence is: //
According to embodiments of the present disclosure, a first-stage training task may be generated from the sample semantic prompt information, and a second-stage training task from the sample search hint information. The first-stage task lets the trained intermediate large language model fully learn the mapping relationship between the semantic attributes of the sample resource related information and the sample semantic coding sequence. The second-stage task lets the trained large language model fully learn the mapping relationship between the search information, the sample resource related information and the sample semantic coding sequence. The large language model thus acquires multi-task learning capability: it can accurately identify the retrieval intent of a target object from the search information the target object inputs, find the sample resource semantic attributes matching that intent, and accurately characterize the retrieved sample resources (candidate resources) through the output semantic coding sequence. The trained large language model therefore realizes deep interaction between search information and the semantic attributes related to sample resources, captures the cross features between them, and unifies model training with resource indexing, reducing dependence on external index information. In this way the large language model itself models the index of the resources, which improves resource retrieval accuracy.
According to an embodiment of the present disclosure, training the initial large language model according to the sample resource-related information, obtaining the intermediate large language model may further include: inputting the sample resource related information into a preset encoder, and outputting sample resource related semantic features and at least one sample semantic coding identification feature; training an initial encoder according to the sample resource related semantic features and at least one sample semantic coding identification feature to obtain a pre-trained encoder; and updating the initial large language model according to the pre-trained encoder to obtain an intermediate large language model.
According to the embodiments of the disclosure, the initial encoder can be used to process the sample resource related information to obtain sample resource related semantic features characterizing semantic attributes and sample semantic code identification features corresponding to the sample semantic code identifications that characterize those semantic attributes. The sample resource related semantic features and the sample semantic code identification features are processed in a contrastive-learning manner, and the initial encoder is trained accordingly to obtain the pre-trained encoder, so that the encoder fully learns the mapping relationship between sample semantic code identifications and semantic attributes.
According to the embodiments of the disclosure, the initial large language model is updated with the pre-trained encoder; through transfer learning, the learning ability of the pre-trained encoder can be migrated into the initial large language model, so that the resulting intermediate large language model learns the mapping relationship between sample semantic code identifications and semantic attributes. This reduces how often the model parameters of the large language model must be adjusted, lowering training cost.
According to the embodiments of the disclosure, the pre-trained encoder can be used to process the sample resource related information to obtain sample semantic code identification features, and the sample semantic code identification features can be processed based on any encoding scheme to obtain the sample semantic code identifications, so that the sample semantic coding sequence can be determined according to the positional relationship among the sample semantic code identifications.
According to embodiments of the present disclosure, the training sample may also include tag semantic features corresponding to sample resource-related semantic features.
According to the embodiment of the disclosure, when the sample resource related information includes a plurality of sample resource related information, the sample resource related semantic features corresponding to the two different sample resource related information can be used as the label semantic features of each other, so that contrast loss learning is realized.
According to an embodiment of the present disclosure, training an initial encoder based on sample resource-related semantic features and at least one sample semantic coding identification feature may include: processing the sample resource related semantic features and the label semantic features according to the loss function to obtain a first loss value; determining a second loss value according to the degree of difference between the sample resource related semantic features and the at least one sample semantic coding identification feature; and training the initial encoder based on the first loss value and the second loss value.
According to an embodiment of the present disclosure, processing sample resource-related semantic features and tag semantic features according to a penalty function may include: and processing the sample resource related semantic features corresponding to the different sample resource related information according to the loss function.
According to embodiments of the present disclosure, the degree of difference between the sample resource-related semantic features and the at least one sample semantic code identification feature may be calculated based on a similarity algorithm, and the degree of difference may be determined based on the calculated degree of similarity.
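The following PyTorch sketch illustrates the two loss terms under assumptions: an InfoNCE-style contrastive form for the first loss and one minus cosine similarity for the second. The disclosure only requires a loss function and a similarity algorithm, so both concrete choices are assumptions.

```python
import torch
import torch.nn.functional as F

batch = 8
sample_feat = torch.randn(batch, 64)  # sample resource related semantic features
label_feat = torch.randn(batch, 64)   # label semantic features (paired view)
code_feat = torch.randn(batch, 64)    # sample semantic code identification features

# First loss: contrastive loss pulling paired features together and pushing
# mismatched pairs in the batch apart (InfoNCE-style, assumed form).
logits = F.normalize(sample_feat, dim=-1) @ F.normalize(label_feat, dim=-1).T
first_loss = F.cross_entropy(logits / 0.07, torch.arange(batch))

# Second loss: degree of difference as one minus cosine similarity (assumed).
second_loss = (1.0 - F.cosine_similarity(sample_feat, code_feat, dim=-1)).mean()

total_loss = first_loss + second_loss
print(float(total_loss))
```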
FIG. 7 schematically illustrates a schematic diagram of training a large language model according to another embodiment of the present disclosure.
As shown in fig. 7, the initial encoder 700 may include an initial encoder network layer 710, a quantization layer 720 and a semantic code identification embedding layer 730. The initial encoder network layer 710 may be built based on neural network algorithms; for example, it may be built based on a convolutional neural network to perform text feature extraction. The quantization layer 720 may be constructed based on a quantization algorithm.
As shown in fig. 7, different sample resource related information 701 and 702 may be input into the initial encoder network layer 710 to extract semantic features from the sample resource related information 701 and 702, yielding sample resource related semantic features 7011 and 7021. The sample resource related semantic features 7011 and 7021 are each input into the quantization layer 720 to obtain the sample semantic code identifications corresponding to the sample resource related information 701 and 702 respectively. The sample semantic code identifications corresponding to the sample resource related information 701 and 702 are input into the semantic code identification embedding layer 730, which maps the identifications back into feature space, yielding the sample semantic code identification features 7012 and 7022 corresponding to the sample resource related information 701 and 702.
As shown in fig. 7, the sample resource related semantic features 7011 and 7021 may be processed based on a loss function to obtain a first loss value, a second loss value may be determined according to the degree of difference between the sample resource related semantic feature 7011 and the sample semantic code identification feature 7012, and a third loss value may be determined according to the sample semantic code identification features 7012 and 7022. The initial encoder may be jointly trained based on the first, second and third loss values to obtain the pre-trained encoder; a simplified sketch of this encoder structure follows.
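In the sketch, nearest-codebook quantization is one plausible realization of the quantization layer; the disclosure does not fix a specific quantization algorithm, and the dimensions and codebook size below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class QuantizingEncoder(nn.Module):
    """Encoder network layer + quantization layer + code identification embedding layer."""
    def __init__(self, dim=64, codebook_size=256):
        super().__init__()
        self.encoder_layer = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        self.code_embedding = nn.Embedding(codebook_size, dim)

    def forward(self, resource_related_info):
        semantic_features = self.encoder_layer(resource_related_info)
        # Quantization layer: the nearest codebook entry gives the code identification.
        distances = torch.cdist(semantic_features, self.code_embedding.weight)
        code_ids = distances.argmin(dim=-1)
        code_features = self.code_embedding(code_ids)  # map identifications back to features
        return semantic_features, code_ids, code_features

encoder = QuantizingEncoder()
feats, ids, id_feats = encoder(torch.randn(4, 64))
print(ids)  # sample semantic code identifications for four stand-in resources
```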
According to the embodiments of the disclosure, the third loss value can characterize the degree of difference between the sample semantic code identifications corresponding to different sample resource related information. Introducing the third loss value when training the initial encoder therefore keeps the information loss between the sample semantic code identifications output by the encoder from drifting against the training objective during training, which improves the robustness of the pre-trained encoder and its recognition precision for semantic code identifications. Updating the initial large language model with the pre-trained encoder in turn improves the prediction precision of the large language model for semantic coding sequences.
According to an embodiment of the present disclosure, a sample semantic coding sequence corresponding to a sample resource is determined based on: sample related information corresponding to the sample resources is input into an encoder, and a sample semantic coding identifier is output.
According to an embodiment of the present disclosure, the encoder may output a plurality of sample semantic coding identifications, which may be arranged according to the positional relationship between them to obtain the sample semantic coding sequence.
In one embodiment of the present disclosure, the sample semantic coding sequence may be determined based on the sample semantic coding identifications output by an encoder network layer included in the encoder.
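For instance, once the encoder has emitted the ordered identifications, the sequence can be rendered as a string of special tokens for the language model to predict. A trivial sketch follows; the "<c..>" token format and the function name are invented for illustration.

def to_semantic_coding_sequence(code_ids):
    # Arrange identifications by position into one token string, e.g.
    # [12, 7, 3] -> "<c12> <c7> <c3>".
    return " ".join(f"<c{int(i)}>" for i in code_ids)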
Fig. 8 schematically illustrates a block diagram of a large language model based resource retrieval device in accordance with an embodiment of the present disclosure.
As shown in fig. 8, the large language model-based resource retrieval device 800 includes: a semantic coding sequence obtaining module 810, an intermediate resource determining module 820, a target resource determining module 830, and a first display module 840.
The semantic coding sequence obtaining module 810 is configured to, in response to receiving the search information, process the search information by using the large language model to obtain a semantic coding sequence, where the semantic coding sequence includes at least one semantic code identifier, and the semantic code identifier characterizes at least one semantic attribute.
The intermediate resource determining module 820 is configured to determine an intermediate resource associated with the semantic coding sequence from candidate resources in a preset resource library.
The target resource determining module 830 is configured to determine, according to the intermediate resource, a target resource that matches the search information.
The first display module 840 is configured to display the target resource.
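Taken together, the four modules amount to a small pipeline, sketched below purely as an illustration: llm_generate stands in for the trained large language model, repository for the preset resource library, and an exact-match lookup for the association step; all three are assumptions, not the disclosed implementation.

from typing import Callable, Dict, List

def retrieve(search_info: str,
             llm_generate: Callable[[str], str],
             repository: Dict[str, str]) -> List[str]:
    # Module 810: predict the semantic coding sequence for the query.
    sequence = llm_generate(search_info)
    # Module 820: intermediate resources are candidates whose candidate
    # semantic coding sequence is associated with the predicted one.
    intermediate = [res for res, cand in repository.items() if cand == sequence]
    # Module 830: pick the target resources from the intermediate ones; here
    # we simply keep them all, and module 840 would display the result.
    return intermediate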
According to an embodiment of the present disclosure, the candidate resource is associated with a candidate semantic coding sequence, the candidate semantic coding sequence being determined based on: acquiring candidate resource related information corresponding to the candidate resource; extracting semantic features of the candidate resource related information to obtain candidate resource related semantic features; and determining candidate semantic coding sequences according to the candidate resource related semantic features.
According to an embodiment of the present disclosure, determining candidate semantic coding sequences from candidate resource-related semantic features includes: processing the related semantic features of the candidate resources based on a clustering algorithm to obtain a plurality of candidate semantic code identifiers; and determining a candidate semantic code sequence according to the plurality of candidate semantic code identifications.
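One way to realize the clustering step is sketched below with scikit-learn's k-means, purely as an example: the disclosure only says "a clustering algorithm", so the choice of k-means, the code count, and all names are assumptions.

import numpy as np
from sklearn.cluster import KMeans

def candidate_code_identifiers(candidate_features: np.ndarray,
                               num_codes: int = 512) -> np.ndarray:
    # Each row of candidate_features is one candidate-resource-related
    # semantic feature; its cluster index serves as a candidate semantic
    # code identifier.
    kmeans = KMeans(n_clusters=num_codes, n_init=10, random_state=0)
    return kmeans.fit_predict(candidate_features)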
According to an embodiment of the present disclosure, the semantic coding sequence obtaining module includes: an information query sub-module and a first semantic coding sequence obtaining sub-module.
The information query sub-module is used for performing an information query based on the search information to obtain the associated search information.
The first semantic coding sequence obtaining sub-module is used for processing the search information and the associated search information by using the large language model to obtain the semantic coding sequence.
According to an embodiment of the present disclosure, the semantic coding sequence obtaining module includes: a search prompt information obtaining sub-module and a second semantic coding sequence obtaining sub-module.
The search prompt information obtaining sub-module is used for updating a preset search prompt template based on the search information to obtain the search prompt information.
The second semantic coding sequence obtaining sub-module is used for processing the search prompt information by using the large language model to obtain the semantic coding sequence.
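Updating the preset template reduces to a string fill. A minimal sketch follows; the template wording is invented here, since the disclosure does not reveal the preset search prompt template.

SEARCH_PROMPT_TEMPLATE = ("Generate the semantic coding sequence for this query.\n"
                          "Query: {search_info}\n"
                          "Semantic coding sequence:")

def build_search_prompt(search_info: str) -> str:
    # The filled prompt is what the large language model actually consumes.
    return SEARCH_PROMPT_TEMPLATE.format(search_info=search_info)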
According to an embodiment of the present disclosure, the large language model-based resource retrieval device further includes a second display module.
The second display module is used for displaying the intermediate resources.
According to an embodiment of the present disclosure, the target resource determination module includes a target resource determination sub-module.
The target resource determining sub-module is used for, in response to a confirmation operation for a target resource in the intermediate resources, determining the target resource according to the confirmation operation.
According to an embodiment of the present disclosure, the large language model-based resource retrieval device further includes a third display module and an updating module.
The third display module is used for displaying the code identification object sequence corresponding to the semantic coding sequence.
The updating module is used for, in response to an editing operation for the code identification object sequence, updating the semantic coding sequence according to the editing operation to obtain a new semantic coding sequence.
FIG. 9 schematically illustrates a block diagram of a training apparatus for a large language model according to an embodiment of the present disclosure.
As shown in fig. 9, the training apparatus 900 of the large language model may include: an acquisition module 910 and a training module 920.
The acquisition module 910 is configured to acquire a training sample, where the training sample includes sample search information and a sample semantic coding sequence corresponding to a sample resource, the sample search information is associated with the sample resource, the sample semantic coding sequence includes at least one sample semantic code identifier, and the sample semantic code identifier characterizes at least one sample semantic attribute.
The training module 920 is configured to train the initial large language model based on the sample search information and the sample semantic coding sequence, and obtain a trained large language model.
According to an embodiment of the present disclosure, the training sample further includes sample resource related information corresponding to the sample resource.
The training module comprises: a first training sub-module, a sample search prompt information determining sub-module, and a second training sub-module.
The first training sub-module is used for training the initial large language model according to the sample resource related information to obtain the intermediate large language model.
The sample search prompt information determining sub-module is used for determining sample search prompt information according to the sample search information and a preset sample search prompt template.
The second training sub-module is used for training the intermediate large language model by using the sample search prompt information and the sample semantic coding sequence.
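At a high level, the two training stages compose as shown in the sketch below. It treats the training steps as opaque callables; all names and the callable interfaces are hypothetical stand-ins for whatever model and optimizer classes an implementation would use.

def train_in_two_stages(initial_llm, pretrain_step, finetune_step,
                        resource_infos, samples, template):
    # Stage 1 (first training sub-module): adapt the initial model with
    # sample resource-related information to get the intermediate model.
    intermediate_llm = pretrain_step(initial_llm, resource_infos)
    # Stage 2 (second training sub-module): fine-tune on pairs of
    # (sample search prompt, sample semantic coding sequence).
    for search_info, target_sequence in samples:
        prompt = template.format(search_info=search_info)
        finetune_step(intermediate_llm, prompt, target_sequence)
    return intermediate_llm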
According to an embodiment of the present disclosure, the first training sub-module includes: a first coding unit, a pre-training unit, and an updating unit.
The first coding unit is used for inputting the sample resource related information into a preset encoder and outputting sample resource related semantic features and at least one sample semantic coding identification feature.
The pre-training unit is used for training the initial encoder according to the sample resource related semantic features and the at least one sample semantic coding identification feature to obtain a pre-trained encoder.
The updating unit is used for updating the initial large language model according to the pre-trained encoder to obtain the intermediate large language model.
According to an embodiment of the present disclosure, the training sample further comprises tag semantic features corresponding to sample resource-related semantic features.
According to an embodiment of the present disclosure, the pre-training unit comprises: a first loss value obtaining subunit, a second loss value obtaining subunit, and a training subunit.
The first loss value obtaining subunit is used for processing the sample resource related semantic features and the label semantic features according to the loss function to obtain a first loss value.
A second loss value obtaining subunit, configured to determine a second loss value according to a degree of difference between the sample resource related semantic feature and the at least one sample semantic code identification feature.
The training subunit is used for training the initial encoder according to the first loss value and the second loss value.
According to an embodiment of the present disclosure, a sample semantic coding sequence corresponding to a sample resource is determined based on: sample related information corresponding to the sample resources is input into an encoder, and a sample semantic coding identifier is output.
According to an embodiment of the present disclosure, the first training sub-module includes: a sample semantic prompt information determining unit and a training unit.
The sample semantic prompt information determining unit is used for determining sample semantic prompt information according to the sample resource related information and a preset sample semantic prompt template.
The training unit is used for training the initial large language model by using the sample semantic prompt information and the sample semantic coding sequence.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
According to an embodiment of the present disclosure, an electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method described above.
According to an embodiment of the present disclosure, a non-transitory computer-readable storage medium stores computer instructions for causing a computer to perform the method described above.
According to an embodiment of the present disclosure, a computer program product includes a computer program which, when executed by a processor, implements the method described above.
Fig. 10 schematically illustrates a block diagram of an electronic device suitable for implementing the large language model-based resource retrieval method and training method according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 10, the apparatus 1000 includes a computing unit 1001 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1002 or a computer program loaded from a storage unit 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data required for the operation of the device 1000 can also be stored. The computing unit 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
Various components in device 1000 are connected to I/O interface 1005, including: an input unit 1006 such as a keyboard, a mouse, and the like; an output unit 1007 such as various types of displays, speakers, and the like; a storage unit 1008 such as a magnetic disk, an optical disk, or the like; and communication unit 1009 such as a network card, modem, wireless communication transceiver, etc. Communication unit 1009 allows device 1000 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1001 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1001 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1001 performs the respective methods and processes described above, for example, the large language model-based resource retrieval method and the training method. For example, in some embodiments, the large language model-based resource retrieval method and the training method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1008. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1000 via the ROM 1002 and/or the communication unit 1009. When the computer program is loaded into the RAM 1003 and executed by the computing unit 1001, one or more steps of the large language model-based resource retrieval method and the training method described above may be performed. Alternatively, in other embodiments, the computing unit 1001 may be configured to perform the large language model-based resource retrieval method and the training method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (29)

1. A resource retrieval method based on a large language model comprises the following steps:
in response to receiving search information, processing the search information by using a large language model to obtain a semantic coding sequence, wherein the semantic coding sequence comprises at least one semantic coding identifier, and the semantic coding identifier characterizes at least one semantic attribute;
determining intermediate resources associated with the semantic coding sequence from candidate resources of a preset resource library;
determining target resources matched with the search information according to the intermediate resources; and
displaying the target resource.
2. The method of claim 1, wherein the candidate resource is associated with a candidate semantic coding sequence, the candidate semantic coding sequence determined based on:
acquiring candidate resource related information corresponding to the candidate resource;
extracting semantic features of the candidate resource related information to obtain candidate resource related semantic features; and
determining the candidate semantic coding sequence according to the candidate resource related semantic features.
3. The method of claim 2, wherein the determining the candidate semantic coding sequence according to the candidate resource-related semantic features comprises:
processing the candidate resource related semantic features based on a clustering algorithm to obtain a plurality of candidate semantic code identifiers; and
determining the candidate semantic code sequence according to the plurality of candidate semantic code identifiers.
4. The method of claim 1, wherein the processing the search information using a large language model to obtain a semantic coding sequence comprises:
performing information inquiry based on the search information to obtain associated search information;
and processing the search information and the associated search information by using the large language model to obtain the semantic coding sequence.
5. The method of claim 1, wherein the processing the search information using a large language model to obtain a semantic coding sequence comprises:
updating a preset search prompt template based on the search information to obtain search prompt information;
and processing the search prompt information by using the large language model to obtain the semantic coding sequence.
6. The method of any one of claims 1 to 5, further comprising:
displaying the intermediate resource;
wherein the determining, according to the intermediate resource, the target resource matched with the search information includes:
in response to a confirmation operation for a target resource in the intermediate resources, determining the target resource according to the confirmation operation.
7. The method of any one of claims 1 to 5, further comprising:
displaying a code identification object sequence corresponding to the semantic coding sequence; and
in response to an editing operation for the code identification object sequence, updating the semantic coding sequence according to the editing operation to obtain a new semantic coding sequence.
8. A method of training a large language model, comprising:
obtaining a training sample, wherein the training sample comprises sample search information and a sample semantic coding sequence corresponding to a sample resource, the sample search information is associated with the sample resource, the sample semantic coding sequence comprises at least one sample semantic coding identifier, and the sample semantic coding identifier characterizes at least one sample semantic attribute;
and training an initial large language model based on the sample search information and the sample semantic coding sequence to obtain a trained large language model.
9. The method of claim 8, wherein the training sample further comprises sample resource related information corresponding to the sample resource,
the training an initial large language model based on the sample search information and the sample semantic coding sequence comprises:
training the initial large language model according to the sample resource related information to obtain an intermediate large language model;
determining sample searching prompt information according to the sample searching information and a preset sample searching prompt template; and
training the intermediate large language model by using the sample search prompt information and the sample semantic coding sequence.
10. The method of claim 9, wherein the training the initial large language model based on the sample resource-related information to obtain an intermediate large language model comprises:
inputting the sample resource related information into a preset encoder, and outputting sample resource related semantic features and at least one sample semantic coding identification feature;
training the initial encoder according to the sample resource related semantic features and the at least one sample semantic coding identification feature to obtain a pre-trained encoder; and
updating the initial large language model according to the pre-trained encoder to obtain the intermediate large language model.
11. The method of claim 10, wherein the training sample further comprises tag semantic features corresponding to the sample resource-related semantic features;
wherein said training said initial encoder based on said sample resource-related semantic features and said at least one sample semantic coding identification feature comprises:
processing the sample resource related semantic features and the label semantic features according to a loss function to obtain a first loss value;
determining a second loss value according to the degree of difference between the sample resource related semantic features and the at least one sample semantic coding identification feature; and
training the initial encoder according to the first loss value and the second loss value.
12. The method of claim 10, wherein the sample semantic coding sequence corresponding to a sample resource is determined based on:
inputting sample related information corresponding to the sample resources into the encoder, and outputting the sample semantic coding identification.
13. The method of claim 9, wherein the training the initial large language model from the sample resource-related information comprises:
determining sample semantic prompt information according to the sample resource related information and a preset sample semantic prompt template; and
training the initial large language model by using the sample semantic prompt information and the sample semantic coding sequence.
14. A large language model-based resource retrieval device, comprising:
the semantic coding sequence obtaining module is used for responding to the received search information, processing the search information by utilizing a large language model to obtain a semantic coding sequence, wherein the semantic coding sequence comprises at least one semantic coding identifier, and the semantic coding identifier characterizes at least one semantic attribute;
the intermediate resource determining module is used for determining intermediate resources associated with the semantic coding sequence from candidate resources of a preset resource library;
the target resource determining module is used for determining target resources matched with the search information according to the intermediate resources; and
the first display module is used for displaying the target resource.
15. The apparatus of claim 14, wherein the candidate resource is associated with a candidate semantic coding sequence determined based on:
acquiring candidate resource related information corresponding to the candidate resource;
extracting semantic features of the candidate resource related information to obtain candidate resource related semantic features; and
determining the candidate semantic coding sequence according to the candidate resource related semantic features.
16. The apparatus of claim 15, wherein the determining the candidate semantic coding sequence according to the candidate resource-related semantic features comprises:
processing the candidate resource related semantic features based on a clustering algorithm to obtain a plurality of candidate semantic code identifiers; and
determining the candidate semantic code sequence according to the plurality of candidate semantic code identifiers.
17. The apparatus of claim 14, wherein the semantic coding sequence obtaining module comprises:
the information inquiry sub-module is used for inquiring information based on the search information to obtain associated search information;
and the first semantic coding sequence obtaining submodule is used for processing the search information and the associated search information by using the large language model to obtain the semantic coding sequence.
18. The apparatus of claim 14, wherein the semantic coding sequence obtaining module comprises:
the search prompt information obtaining sub-module is used for updating a preset search prompt template based on the search information to obtain search prompt information;
and the second semantic coding sequence obtaining sub-module is used for processing the search prompt information by using the large language model to obtain the semantic coding sequence.
19. The apparatus of any of claims 14 to 18, further comprising:
the second display module is used for displaying the intermediate resources;
wherein the target resource determination module comprises:
and the target resource determining submodule is used for responding to the confirming operation for the target resource in the intermediate resources and determining the target resource according to the confirming operation.
20. The apparatus of any of claims 14 to 18, further comprising:
a third display module for displaying the code identification object sequence corresponding to the semantic code sequence,
and the updating module is used for, in response to an editing operation for the code identification object sequence, updating the semantic coding sequence according to the editing operation to obtain a new semantic coding sequence.
21. A training apparatus for a large language model, comprising:
an acquisition module configured to acquire a training sample, where the training sample includes sample search information and a sample semantic coding sequence corresponding to a sample resource, where the sample search information is associated with the sample resource, the sample semantic coding sequence includes at least one sample semantic coding identifier, and the sample semantic coding identifier characterizes at least one sample semantic attribute;
and the training module is used for training the initial large language model based on the sample search information and the sample semantic coding sequence to obtain a trained large language model.
22. The apparatus of claim 21, wherein the training sample further comprises sample resource related information corresponding to the sample resource,
The training module comprises:
the first training sub-module is used for training the initial large language model according to the sample resource related information to obtain an intermediate large language model;
the sample search prompt information determining submodule is used for determining sample search prompt information according to the sample search information and a preset sample search prompt template; and
the second training submodule is used for training the intermediate large language model by using the sample search prompt information and the sample semantic coding sequence.
23. The apparatus of claim 22, wherein the first training submodule comprises:
the first coding unit is used for inputting the sample resource related information into a preset coder and outputting sample resource related semantic features and at least one sample semantic coding identification feature;
the pre-training unit is used for training the initial encoder according to the sample resource related semantic features and the at least one sample semantic coding identification feature to obtain a pre-trained encoder; and
the updating unit is used for updating the initial large language model according to the pre-trained encoder to obtain the intermediate large language model.
24. The apparatus of claim 23, wherein the training sample further comprises tag semantic features corresponding to the sample resource-related semantic features;
Wherein the pre-training unit comprises:
the first loss value obtaining subunit is used for processing the sample resource related semantic features and the label semantic features according to a loss function to obtain a first loss value;
a second loss value obtaining subunit, configured to determine a second loss value according to a degree of difference between the sample resource related semantic feature and the at least one sample semantic coding identifier feature; and
a training subunit for training the initial encoder according to the first loss value and the second loss value.
25. The apparatus of claim 23, wherein the sample semantic coding sequence corresponding to a sample resource is determined based on:
inputting sample related information corresponding to the sample resources into the encoder, and outputting the sample semantic coding identification.
26. The apparatus of claim 22, wherein the first training submodule comprises:
the sample semantic prompt information determining unit is used for determining sample semantic prompt information according to the sample resource related information and a preset sample semantic prompt template; and
the training unit is used for training the initial large language model by using the sample semantic prompt information and the sample semantic coding sequence.
27. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 13.
28. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 13.
29. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 13.
CN202311842314.1A 2023-12-28 2023-12-28 Resource retrieval method, training method and device based on large language model Pending CN117807278A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311842314.1A CN117807278A (en) 2023-12-28 2023-12-28 Resource retrieval method, training method and device based on large language model


Publications (1)

Publication Number Publication Date
CN117807278A (en) 2024-04-02

Family

ID=90427223


Country Status (1)

Country Link
CN (1) CN117807278A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination