CN110955755A - Method and system for determining target standard information - Google Patents


Publication number
CN110955755A
CN110955755A CN201911210000.3A CN201911210000A CN110955755A CN 110955755 A CN110955755 A CN 110955755A CN 201911210000 A CN201911210000 A CN 201911210000A CN 110955755 A CN110955755 A CN 110955755A
Authority
CN
China
Prior art keywords
information
candidate
machine learning
standard
historical
Legal status
Pending
Application number
CN201911210000.3A
Other languages
Chinese (zh)
Inventor
陈晓军
崔恒斌
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN201911210000.3A
Publication of CN110955755A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One or more embodiments of the present specification relate to a method and system for determining target standard information. The method comprises the following steps: acquiring a user question and one or more pieces of candidate standard information corresponding to the user question; determining candidate text information based on the user question and the one or more pieces of candidate standard information; and determining target standard information from the one or more pieces of candidate standard information based at least on a machine learning model and the candidate text information.

Description

Method and system for determining target standard information
Technical Field
One or more embodiments of the present disclosure relate to the field of information processing, and more particularly, to a method and system for determining target standard information.
Background
With the rapid development of computer technology, intelligent question answering is widely applied in many fields. Faced with a large volume of varied user questions, an intelligent question-answering system needs to standardize those questions, converting them into a form that the machine can process uniformly, so that it can accurately return the knowledge information corresponding to each question to the user.
Therefore, there is a need for a more efficient method of determining target standard information, so that a standardized question description, or its associated knowledge information, can be determined quickly from a user question.
Disclosure of Invention
One of the embodiments of the present specification provides a method of determining target standard information, the method being performed by at least one processor and comprising: acquiring a user question and one or more pieces of candidate standard information corresponding to the user question; determining candidate text information based on the user question and the one or more pieces of candidate standard information; and determining target standard information from the one or more pieces of candidate standard information based at least on a machine learning model and the candidate text information.
One of the embodiments of the present specification provides a system for determining target standard information, the system comprising: a candidate standard information acquisition module configured to acquire a user question and one or more pieces of candidate standard information corresponding to the user question; a candidate text information determination module configured to determine candidate text information based on the user question and the one or more pieces of candidate standard information; and a target standard information determination module configured to determine target standard information from the one or more pieces of candidate standard information based at least on a machine learning model and the candidate text information.
One of the embodiments of the present specification provides an apparatus for determining target standard information, including a processor and a memory; the memory is used for storing instructions, and the processor is used for executing the instructions to realize the corresponding operation of the target standard information determining method.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of a method for determining target standard information according to some embodiments of the present description;
FIG. 2 is a block diagram of a system for determining target standard information according to some embodiments of the present description;
FIG. 3 is an exemplary flow diagram of a method of determining target standard information according to some embodiments of the present description;
FIG. 4 is a schematic diagram of model training for a method of determining target standard information according to some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used herein are terms for distinguishing different components, elements, parts, portions or assemblies at different levels. However, these terms may be replaced by other expressions that accomplish the same purpose.
As used in one or more embodiments of the present specification and in the claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may also include the plural, unless the context clearly indicates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to the processes, or one or more steps may be removed from them.
In some application scenarios of intelligent customer service, a user may input a user question through a user terminal, and a server may obtain the user question from the user terminal and recall one or more candidate standard information corresponding to the user question from a storage device based on the question posed by the user and return the candidate standard information to the user. In some embodiments, the candidate standard information corresponding to the user question may be understood as a standard question corresponding to the user question, and may also be understood as a standard question corresponding to the user question and content information for solving the standard question.
In some embodiments, the server may also find the one or more most accurate answers among the recalled candidate standard information and return them to the user terminal. In some embodiments, the server first calculates the semantic similarity between each piece of recalled candidate standard information and the user question, and then sends the candidate standard information with the higher similarity to the user. In such embodiments, the server calculates the similarity for each piece of recalled candidate standard information in turn, so the calculation is slow and may not meet the user's needs in some scenarios.
In some embodiments, the server can calculate the similarity between the user question and all of the recalled candidate standard information simultaneously, which can improve calculation efficiency to a certain extent, meet customer requirements, save software and hardware resources, and optimize service efficiency.
Fig. 1 is a schematic diagram of an application scenario of a method for determining target standard information according to some embodiments of the present disclosure.
In some embodiments, system 100 for determining target criteria information includes server 110, storage device 120, user terminal 130, and network 140.
The server 110 may process data and/or information for at least one other component in the system 100. For example, the server 110 may obtain a user question from the user terminal 130 and determine one or more pieces of candidate standard information based on that question. As another example, the server 110 may determine candidate text information based on the user question and the candidate standard information. As another example, the server 110 may determine the target standard information from the candidate standard information and push it to the user terminal 130.
In some embodiments, the server 110 may be a single processing device or a group of processing devices. In some embodiments, server 110 may access information and/or data stored in user terminal 130 and/or storage device 120 via network 140. For example, the server may obtain the user question through the user terminal 130. In some embodiments, the server 110 may be local or remote with respect to the terminal 130.
The terminal 130 may be a device with data acquisition, storage, and/or transmission capabilities, such as a smart phone. Through the terminal 130, the user can ask a question and can also acquire standard information corresponding to the question. In some embodiments, the terminal 130 may include, but is not limited to, a mobile device 130-1, a tablet 130-2, a laptop 130-3, a desktop 130-4, and the like, or any combination thereof. In some embodiments, the terminal 130 may send the retrieved data to one or more devices in the target standard information system 100. For example, the terminal 130 may transmit the acquired data to the server 110 or the storage device 120.
Storage device 120 may store data and/or instructions. The storage device 120 may store data retrieved by the server. For example, the storage device 120 may store candidate text information, target criterion information, and the like. In some embodiments, storage device 120 may store data and/or instructions for execution or use by server 110, which may be executed or used by server 110 to implement the example methods of this specification. In some embodiments, storage device 120 may be connected to network 140 to enable communication with one or more components (e.g., server 110, terminal 130, etc.) in system 100 that determine target criteria information. In some embodiments, the storage device 120 may be directly connected to or in communication with one or more components of the system 100 (e.g., the server 110, the terminal 130, etc.) that determine the target criteria information. In some embodiments, storage device 120 may be part of server 110.
Network 140 may facilitate the exchange of information and/or data. In some embodiments, at least one component of the target criteria information system 100 (e.g., server 110, storage 120, user terminal 130) may send information and/or data to other components of the target criteria information system 100 via the network 140. For example, the user terminal 130 may transmit the user question to the server 110 through the network 140. As another example, server 110 may send the determined target criteria information to the user via network 140. As another example, server 110 may recall one or more candidate criteria information from storage 120 based on a predetermined algorithm. In some embodiments, the network 140 may be any form of wired or wireless network, or any combination thereof.
It should be noted that the description of the application scenario 100 of the method of determining target standard information described above is for illustration and explanation only, and does not limit the scope of applicability of the present description. Various modifications and changes may be made to the application scenario 100 by those skilled in the art, guided by the present description. However, such modifications and variations are intended to be within the scope of the present description.
FIG. 2 is a block diagram of a system for determining target criteria information, shown in accordance with some embodiments of the present description.
As shown in FIG. 2, the system 200 for determining target standard information may include a candidate standard information acquisition module 210, a candidate text information determination module 220, a target standard information determination module 230, and a model training module 240.
The candidate standard information acquisition module 210 is configured to obtain a user question and one or more pieces of candidate standard information corresponding to the user question.
The candidate text information determination module 220 is configured to determine candidate text information based on the user question and the one or more pieces of candidate standard information. In some embodiments, the candidate text information determination module 220 is configured to concatenate the user question and the one or more pieces of candidate standard information according to a preset rule to determine the candidate text information.
The target standard information determination module 230 is configured to determine target standard information from the one or more pieces of candidate standard information based at least on the machine learning model and the candidate text information. In some embodiments, the module is further configured to determine the location in the candidate text information that corresponds to the target standard information, based on the machine learning model and the candidate text information. In some embodiments, the module is further configured to determine a probability value corresponding to each piece of candidate standard information, based on the machine learning model and the candidate text information. In some embodiments, the module is further configured to: determine the vector representations corresponding to the user question and to each of the one or more pieces of candidate standard information, based on the machine learning model and the candidate text information; and determine the target standard information based on the distance between the vector representation of the user question and the vector representation of each piece of candidate standard information.
The model training module 240 is configured to: acquire a training sample set, which comprises historical questions together with the historical candidate standard information and historical target standard information corresponding to them; splice each historical question with its corresponding historical candidate standard information to form historical candidate text information, which is used as input data; take the historical target standard information corresponding to the historical question as the reference standard; and train an initial machine learning model using the input data and the corresponding reference standard to obtain the trained machine learning model.
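The construction of training samples described for the model training module can be sketched as follows. This is a minimal illustration only: the `HistoricalRecord` type, the `[CLS]`/`[Q1]` marker layout, the 0-based reference index, and the choice to skip records whose target is not among the candidates are all assumptions made for the sketch, and the sample strings are illustrative renderings of the repayment example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HistoricalRecord:
    """Hypothetical container for one historical sample."""
    question: str            # historical user question
    candidates: List[str]    # historical candidate standard information
    target: str              # historical target standard information


def splice(question: str, candidates: List[str]) -> str:
    """Assumed rule: question first, then each candidate behind its own [Qi] marker."""
    parts = ["[CLS] " + question]
    parts += [f"[Q{i}] {c}" for i, c in enumerate(candidates, start=1)]
    return " ".join(parts)


def build_training_set(records: List[HistoricalRecord]) -> List[Tuple[str, int]]:
    """Return (input text, reference index) pairs; the index is 0-based."""
    samples = []
    for record in records:
        if record.target not in record.candidates:
            continue  # assumption: drop records whose reference is not a candidate
        text = splice(record.question, record.candidates)
        samples.append((text, record.candidates.index(record.target)))
    return samples


records = [
    HistoricalRecord(
        question="Huabei repayment failed",
        candidates=["Jiebei repayment not possible", "Huabei repayment not possible"],
        target="Huabei repayment not possible",
    )
]
training_set = build_training_set(records)
```

Each pair could then be fed to whatever initial model is being trained, with the spliced text as input and the index as the reference standard.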
It should be understood that the system and its modules shown in FIG. 2 may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. Wherein the hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD-or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules of one or more embodiments in this specification may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., but also by software executed by various types of processors, for example, or by a combination of hardware circuits and software (e.g., firmware).
It should be noted that the above description of the system for determining target standard information and its modules is merely for convenience of description and does not limit the present disclosure to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the present system, the modules may be combined arbitrarily, or a sub-system may be configured to connect to other modules, without departing from such teachings. In some embodiments, one or more of the modules in FIG. 2 may be omitted, e.g., the model training module 240 may be omitted. In some embodiments, the candidate standard information acquisition module 210, the candidate text information determination module 220, and the target standard information determination module 230 in FIG. 2 may be different modules in one system, or a single module may implement the functions of two or more of them. In some embodiments, the candidate standard information acquisition module 210 and the candidate text information determination module 220 may be two modules, or one module may have both the function of acquiring candidate standard information and the function of determining candidate text information. In some embodiments, the modules may share one memory module, or each module may have its own memory module. Such variations are intended to be within the scope of one or more embodiments of the present disclosure.
FIG. 3 is an exemplary flow chart of a method of determining target standard information, shown in accordance with some embodiments of the present description.
Step 310, a user question and one or more pieces of candidate standard information corresponding to the user question are obtained. In some embodiments, this step may be performed by the candidate standard information acquisition module 210.
In some embodiments, a user question may be understood as a question that a user raises in a certain application scenario when the user encounters a difficulty and needs an answer; the question is typically entered on the user terminal 130. In some embodiments, the form of the user question includes, but is not limited to, one or any combination of text, voice, image, and the like.
In some embodiments, the standard information may be understood as a standardized question, i.e., a standard question. In some embodiments, knowledge information related to questions may be stored in a database, such as the storage device 120, indexed by standard question, in a form including, but not limited to, one or more of text, speech, images, and the like. Through the standard question, the knowledge information related to a question can be located quickly. In some embodiments, the standard information may further include the answer content corresponding to the standard question. In some embodiments, the candidate standard information may be understood as one or more standardized questions and/or corresponding answer content, obtained by preliminary filtering, that are relatively closely related to the subject and/or meaning of the user question.
In some embodiments, the user inputs the question to the user terminal 130 in any one or more of text, voice, and image form, and the server 110 may retrieve the user question from the user terminal 130 via the network 140. In some embodiments, the server may recall, from the storage device, one or more pieces of candidate standard information corresponding to the retrieved user question. Here, "corresponding" may be understood as semantically relatively close. In some embodiments, the ways of recalling candidate standard information include matching candidate standard information by keyword, obtaining candidate standard information according to the user's historical operation behavior, and the like.
In some embodiments, after the server obtains the user question, one or more pieces of candidate standard information corresponding to the user question may be obtained from a standard information base in the storage device based on a keyword of the question. For example, after the server acquires the question "Huabei repayment failed" from the user terminal, it determines the keyword "repayment" and then searches the standard information base, by text matching, for candidate standard information containing the keyword, such as "Huabei repayment not possible", "Jiebei repayment not possible", "credit card repayment not possible", and "no money for repayment". The standard information base can be understood as a database containing a large amount of standard information.
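Keyword-based recall as described above can be sketched with simple substring matching. This is only an illustration under stated assumptions: the in-memory list `STANDARD_INFO_BASE`, the function name `recall_by_keyword`, the `limit` parameter, and the unrelated password entry are hypothetical, and a production system would more likely query an index than scan a list.

```python
from typing import List

# Hypothetical in-memory stand-in for the standard information base.
STANDARD_INFO_BASE = [
    "Huabei repayment not possible",
    "Jiebei repayment not possible",
    "credit card repayment not possible",
    "no money for repayment",
    "how to change the login password",
]


def recall_by_keyword(keyword: str, base: List[str], limit: int = 10) -> List[str]:
    """Return up to `limit` entries whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [entry for entry in base if needle in entry.lower()][:limit]


candidates = recall_by_keyword("repayment", STANDARD_INFO_BASE)
```

The entry about the login password is filtered out because it does not contain the keyword; the other four are recalled in their stored order.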
In some embodiments, the user's historical operation behavior may include operations the user performed before posing the question. Such behavior may be correlated with the question posed, and candidate standard information may be recalled based on that correlation. For example, if the user has just activated Huabei and then asks "how to repay", the server can determine from the historical operation behavior and the question that the user is asking "how to repay Huabei", and that standard question is recalled as candidate standard information.
In some embodiments, the user question may also be processed by a pre-trained classification machine learning model, which outputs one or more pieces of candidate standard information.
This specification does not limit the method of recalling candidate standard information; any means of recalling relevant information based on search elements can be used.
Step 320, determining candidate text information based on the user question and the one or more pieces of candidate standard information. In some embodiments, this step may be performed by the candidate text information determination module 220.
In some embodiments, the candidate text information may be understood as text information obtained by combining the user question with one or more pieces of candidate standard information. For example, the candidate text information may include the text of the user question "Huabei repayment failed" and the text of a plurality of pieces of candidate standard information, such as "Huabei repayment not possible", "credit card repayment not possible", and "no money for repayment". In some embodiments, if the user question and/or the candidate standard information includes information in the form of speech, it may be converted to the corresponding text information before the candidate text information is determined.
In some embodiments, the user question may be concatenated with one or more pieces of candidate standard information according to a preset rule to determine the candidate text information. In some embodiments, the preset rule may be to splice in the order of user question first and candidate standard information after; to splice in the order of candidate standard information first and user question after; or another preset rule, such as placing the user question between two different pieces of candidate standard information.
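The three splicing orders mentioned above can be sketched as a single function with a selectable rule. The rule names, the `" | "` separator, and the choice to insert the question after the first candidate for the "between" rule are assumptions made for illustration; the text does not fix any of them.

```python
from typing import List


def splice(question: str, candidates: List[str], rule: str = "question_first") -> str:
    """Concatenate a user question and candidates in the order given by `rule`."""
    if rule == "question_first":
        parts = [question] + candidates
    elif rule == "question_last":
        parts = candidates + [question]
    elif rule == "question_between":
        # Assumed variant: place the question after the first candidate,
        # so that it sits between two different candidates.
        parts = candidates[:1] + [question] + candidates[1:]
    else:
        raise ValueError(f"unknown splicing rule: {rule}")
    return " | ".join(parts)  # hypothetical separator


first = splice("Q", ["A", "B"])
last = splice("Q", ["A", "B"], rule="question_last")
between = splice("Q", ["A", "B"], rule="question_between")
```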
In some embodiments, the candidate text information may further include symbol information indicating the location and/or type of the user question and the candidate standard information in the candidate text. The symbol information includes one or any combination of words, letters, numbers, characters, punctuation, and the like.
In some embodiments, at least one different symbol may be used to represent each type of information, so as to distinguish the user question from the candidate standard information. For example, candidate standard information may be represented using the symbol [Q], with the individual pieces represented by [Q1], [Q2], [Q3], ..., [Qn], respectively; the user question may be represented using the symbol [CLS]. In some embodiments, the different symbol information may also indicate the location of the candidate standard information or the user question in the candidate text. In some embodiments, the position and type of information in the candidate text may be indicated by a single piece of symbol information; in other embodiments, position and type may be indicated by different symbol information. In some embodiments, the symbol information indicating location and/or type may be placed before the text information it indicates, e.g., "[Q1] Huabei repayment not possible"; it may also be placed after the text information, e.g., "Huabei repayment not possible [Q1]". By way of example only, the left side of FIG. 4 shows candidate text information including a user question, a plurality of pieces of candidate standard information, and symbol information.
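The use of symbol information can be illustrated with a small helper that attaches [CLS] to the user question and [Q1] ... [Qn] to the candidates, either before or after each text segment. The function name `mark` and the `marker_first` flag are hypothetical, and the sample strings are illustrative renderings of the repayment example.

```python
from typing import List


def mark(question: str, candidates: List[str], marker_first: bool = True) -> str:
    """Attach [CLS] to the question and [Q1]..[Qn] to the candidates."""
    segments = [("[CLS]", question)]
    segments += [(f"[Q{i}]", text) for i, text in enumerate(candidates, start=1)]
    if marker_first:
        # Marker placed before the text it indicates.
        return " ".join(f"{marker} {text}" for marker, text in segments)
    # Marker placed after the text it indicates.
    return " ".join(f"{text} {marker}" for marker, text in segments)


before = mark("Huabei repayment failed", ["Huabei repayment not possible"])
after = mark("Huabei repayment failed", ["Huabei repayment not possible"], marker_first=False)
```

Because each marker carries both a type ([CLS] versus [Q]) and an index, one symbol here indicates type and position at once, which corresponds to the single-symbol variant described above.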
Step 330, determining target standard information from the one or more pieces of candidate standard information based at least on a machine learning model and the candidate text information. In some embodiments, this step may be performed by the target standard information determination module 230.
In some embodiments, the target standard information may be understood as the one or more pieces of candidate standard information that are most relevant to the subject of the user question and/or semantically closest to it.
In some embodiments, the candidate text information may be processed by a machine learning model to determine the target standard information from the one or more pieces of candidate standard information. In some embodiments, the vector representations corresponding to the user question and to each piece of candidate standard information may be determined based on the machine learning model and the candidate text information; the target standard information is then determined based on the distance between the vector representation of the user question and the vector representation of each piece of candidate standard information. For example, the candidate text information "[CLS] Huabei repayment failed [Q1] Jiebei repayment not possible [Q2] Huabei repayment not possible [Q3] credit card repayment not possible" is input into the machine learning model, which converts it into a plurality of vector representations. After the conversion, the distance between the vector representation of each piece of candidate standard information and the vector representation of the user question may be calculated, and the one or more pieces of candidate standard information with the smallest distance may be used as the target standard information. The machine learning model converts the candidate text information in a single batch, so it can process the user question and the one or more pieces of candidate standard information at the same time, which saves resources and improves processing efficiency.
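The distance-based selection can be sketched as follows, assuming the model has already produced one vector per marker. The two-dimensional vectors are made-up stand-ins for model output, Euclidean distance is just one possible choice of metric, and the names `euclidean` and `nearest_candidates` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_candidates(vectors: Dict[str, Sequence[float]], top_k: int = 1) -> List[str]:
    """Rank candidate markers by distance to the [CLS] (user question) vector."""
    question_vec = vectors["[CLS]"]
    distances = {
        marker: euclidean(question_vec, vec)
        for marker, vec in vectors.items()
        if marker != "[CLS]"
    }
    return sorted(distances, key=distances.get)[:top_k]


# Hypothetical 2-d vectors standing in for the model's vector representations.
vectors = {
    "[CLS]": [1.0, 0.0],
    "[Q1]": [0.0, 1.0],
    "[Q2]": [0.9, 0.1],
    "[Q3]": [-1.0, 0.0],
}
target = nearest_candidates(vectors)
top_two = nearest_candidates(vectors, top_k=2)
```

All candidate vectors come from one pass over the spliced text, so the ranking step needs no further model calls.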
In some embodiments, methods of determining the distance between the vector representation of the user question and the vector representation of each piece of candidate standard information include, but are not limited to, calculation methods based on word frequency statistics, calculation methods based on ontology, calculation methods based on geometric metric space, and the like. In some embodiments, the vector distance between the user question and each piece of candidate standard information may be calculated by a preset algorithm, which in some embodiments may be implemented within the machine learning model.
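As one possible reading of a calculation "based on word frequency statistics", the sketch below builds term-frequency vectors from whitespace tokens and compares them with cosine distance. The tokenization, the choice of cosine distance, and the function names are assumptions; the text does not specify the actual algorithm.

```python
import math
from collections import Counter


def term_frequency(text: str) -> Counter:
    """Count lower-cased, whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 minus the cosine similarity of two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm


question = term_frequency("Huabei repayment failed")
d_huabei = cosine_distance(question, term_frequency("Huabei repayment not possible"))
d_jiebei = cosine_distance(question, term_frequency("Jiebei repayment not possible"))
```

With these sample strings the Huabei candidate shares two tokens with the question and the Jiebei candidate only one, so the former ends up at the smaller distance.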
In some embodiments, the location in the candidate text information that corresponds to the target standard information may be determined based on a machine learning model. In particular, the machine learning model may determine the target standard information based on the distance between vector representations (e.g., the distance between the vector representation of the user question and the vector representation of each candidate standard information) and output one or more pieces of location information, from which the location of the target standard information in the input candidate text information may be determined. In some embodiments, the location information may include the symbol information indicating the candidate standard information. In some embodiments, the location information may also include an indication of the position of the sequence vector in the text information input to the model. For example, after spliced candidate text information of the form "[CLS] (user question) [Q1] (candidate 1) [Q2] (candidate 2) ... [Q10] (candidate 10)" is input to the machine learning model and processed, the model may output "[Q2]", and the standard information that follows the symbol "[Q2]" in the candidate text information (in this example, an entry stating that a payment cannot be made) can be used as the target standard information.
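To illustrate how an output symbol such as "[Q2]" can be mapped back to the text it marks, the sketch below splits a spliced string on its "[Qi]" markers. It assumes the marker format shown in the examples; the helper name `split_candidates` and the sample string are illustrative only.

```python
import re

def split_candidates(candidate_text):
    """Map each [Qi] marker to the text that follows it in the spliced input."""
    # Capture a marker such as "[Q2]" and the text up to the next marker or the end.
    pattern = re.compile(r"(\[Q\d+\])\s*(.*?)\s*(?=\[Q\d+\]|$)", re.S)
    return {marker: text for marker, text in pattern.findall(candidate_text)}

# Sample spliced text (assumed wording, following the pattern in the description).
spliced = ("[CLS] cannot be paid [Q1] payment failure "
           "[Q2] cannot be paid [Q3] credit card cannot be paid")
by_marker = split_candidates(spliced)
target = by_marker["[Q2]"]  # text located by the model's output symbol
```

The user question after "[CLS]" is deliberately left out of the mapping, since only candidate positions are looked up.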
In still other embodiments, probability values corresponding to the candidate standard information may be determined based on a machine learning model. In some embodiments, the probability value corresponding to each candidate standard information may be understood as a value characterizing the semantic similarity between that candidate standard information and the user question. For example, after the spliced candidate text information "[CLS] cannot be paid [Q1] payment failure [Q2] cannot be paid [Q3] credit card cannot be paid ... [Q10] no money for payment" is input into the machine learning model and processed, the model can output probability values p1, p2, p3, ..., p10 corresponding to the 10 candidate standard information in the candidate text information, and the text information "[Q2] cannot be paid" corresponding to p2, the largest probability value, can then be used as the target standard information.
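The probability-based selection amounts to taking the candidate with the largest probability value. A minimal sketch, assuming made-up probability values and a hypothetical helper `select_target`:

```python
def select_target(candidates, probabilities):
    """Return the candidate with the largest probability, together with that value."""
    if len(candidates) != len(probabilities):
        raise ValueError("one probability is required per candidate")
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return candidates[best_index], probabilities[best_index]

# Assumed example values; a real model would produce the probabilities.
candidate_texts = ["payment failure", "cannot be paid", "credit card cannot be paid"]
model_probabilities = [0.15, 0.70, 0.15]
chosen_text, chosen_probability = select_target(candidate_texts, model_probabilities)
```

If several candidates share the maximum, this sketch returns the first; the specification does not say how ties are handled.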
It should be noted that the above description of the flow (determining candidate text information based on the user question and the candidate standard information, and determining target standard information based on the machine learning model and the candidate text information) is merely for illustration and does not limit the scope of application of the present specification. Various modifications and alterations to the flow may occur to those skilled in the art, given the benefit of this description. Such modifications and alterations remain within the scope of the present description. For example, the target standard information may be determined by means other than the machine learning model.
FIG. 4 is a machine learning model training diagram of a method of determining target criteria information according to some embodiments described herein.
The machine learning model in one or more embodiments of the present specification can be obtained by: acquiring a training sample set, where the training sample set comprises historical questions together with the historical candidate standard information and historical target standard information corresponding to the historical questions; splicing each historical question with its corresponding historical candidate standard information to form historical candidate text information, which is used as input data; taking the historical target standard information corresponding to the historical question as a reference standard; and training the initial machine learning model with the input data and the corresponding reference standard to obtain the trained machine learning model.
In some embodiments, the historical questions and the historical candidate standard information and historical target standard information corresponding to them may be stored in a storage device. A historical question may be understood as a question posed by a historical user. Historical candidate standard information may be understood as candidate standard information having a certain semantic similarity to the posed question. Historical target standard information may be understood as the standard information that the historical user selected from the historical candidate standard information, or the standard information on which the historical user gave positive feedback; positive feedback may include approval, a reading time greater than a set threshold, and the like.
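As a sketch of how historical target standard information might be derived from user feedback, the example below treats selection, approval, or a reading time above a threshold as positive feedback. The `Feedback` record, its field names, and the 10-second threshold are all assumptions made for illustration; the specification only names the kinds of feedback.

```python
from dataclasses import dataclass

READING_THRESHOLD_SECONDS = 10.0  # hypothetical threshold, not from the source

@dataclass
class Feedback:
    candidate_index: int          # position of the candidate shown to the user
    selected: bool = False        # the user picked this candidate
    approved: bool = False        # the user approved (e.g. "liked") this candidate
    reading_seconds: float = 0.0  # time the user spent reading it

def is_positive(feedback, threshold=READING_THRESHOLD_SECONDS):
    """Return True if the feedback counts as positive under the assumed rules."""
    return (feedback.selected or feedback.approved
            or feedback.reading_seconds > threshold)

def positive_candidates(feedback_log):
    """Return sorted indices of candidates that received positive feedback."""
    return sorted({f.candidate_index for f in feedback_log if is_positive(f)})

log = [Feedback(0, reading_seconds=3.0),
       Feedback(1, approved=True),
       Feedback(2, reading_seconds=25.0)]
targets = positive_candidates(log)
```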
In some embodiments, the training sample set may be obtained by server 110 from storage device 140 via network 120. In some embodiments, for the manner in which the historical question and the historical candidate standard information are spliced in the historical candidate text, reference may be made to the related description of step 320. Referring to the example shown in FIG. 4, the input data includes historical candidate text information "[CLS] cannot be paid [Q1] payment failure [Q2] cannot be paid [Q3] credit card cannot be paid", spliced from the historical question and the historical candidate standard information, and the initial machine learning model is trained using the input data and the corresponding reference standard.
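The splicing of a historical question with its candidates into one training input can be sketched as below. The "[CLS]" and "[Qi]" markers follow the examples in the description; the function names, the single-space separator, and the dictionary layout of a sample are assumptions.

```python
def splice(question, candidates):
    """Join a question and its candidates into one marked-up string."""
    parts = ["[CLS] " + question]
    for i, candidate in enumerate(candidates, start=1):
        parts.append(f"[Q{i}] {candidate}")  # markers are numbered from 1
    return " ".join(parts)

def build_sample(question, candidates, target):
    """Pair the spliced input with the 1-based index of the reference standard."""
    return {
        "input": splice(question, candidates),
        "reference_index": candidates.index(target) + 1,
    }

# Assumed sample wording; when a text occurs twice, index() picks the first match.
sample = build_sample(
    "cannot be paid",
    ["payment failure", "cannot be paid now", "credit card cannot be paid"],
    "cannot be paid now",
)
```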
Referring to FIG. 4, in some embodiments, the initial machine learning model may include a BERT model and a probability conversion layer. In some embodiments, the BERT model may be replaced with another natural language processing model, such as the ELMo model. The description below takes the BERT model as an example.
In some embodiments, the BERT model processes the user question and the one or more candidate standard information in the candidate text information into one or more corresponding vector representations, and the probability conversion layer converts the one or more vector representations output by the BERT model into one or more probability values. In some embodiments, the one or more probability values output by the probability conversion layer may correspond one-to-one to the candidate standard information, each reflecting the distance between the vector representation of the corresponding candidate standard information and the vector representation of the user question. For example, a large probability value indicates that the vector representation of that candidate standard information is close to the vector representation of the user question. In the description of model training in this part, historical data are used as training samples; to simplify the description, the candidate standard information and target standard information mentioned in this part refer to historical candidate standard information and historical target standard information.
In some embodiments, one or more vector representations may be obtained after the text information in the input data is processed by the neural network layers inside the BERT model. In some embodiments, each neural network layer may be a multi-head self-attention layer. The model may comprise N such layers, where N is greater than 1 and may be set to different values according to actual needs. For example, N may equal 12.
In some embodiments, the probability conversion layer may convert vectors into probabilities by using a multi-layer perceptron (MLP), Softmax, or other algorithms. A multi-layer perceptron is a type of feedforward artificial neural network; it produces a result by combining the feature values output by the BERT model through linear and nonlinear operations. In some embodiments, an MLP may be used to convert the plurality of vector representations output by the BERT model into corresponding probability values. Softmax can also convert vectors into probability values, and in some other embodiments Softmax may be used instead of an MLP.
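A minimal sketch of a probability conversion step follows. It assumes the simplest possible arrangement, a single linear unit that scores each candidate's vector followed by a softmax over the scores; the specification presents MLP and Softmax as alternatives, so this combination, the weights, and the vectors are illustrative assumptions only.

```python
import math

def linear_score(vector, weights, bias):
    """Project one vector to a scalar score (a single linear unit)."""
    return sum(w * x for w, x in zip(weights, vector)) + bias

def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vectors at the [Q1], [Q2], [Q3] positions and made-up weights.
candidate_vectors = [[0.2, 0.1], [0.9, 0.8], [0.4, 0.3]]
weights, bias = [1.5, -0.5], 0.0
scores = [linear_score(v, weights, bias) for v in candidate_vectors]
probabilities = softmax(scores)
```

In a trained model the weights would be learned jointly with the encoder rather than fixed by hand.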
In some embodiments, during model training, a loss function may be used to evaluate how well the machine learning model is performing, and thus to determine whether training should be terminated, so as to obtain the best training result for the machine learning model.
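In some embodiments, when training the initial machine learning model, the loss function may be expressed by the following formula:
[Formula image BDA0002296063230000141: loss function expressed in terms of p[i], p[2], and margin; image not reproduced in this text.]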
In the loss function, i denotes the i-th candidate standard information in the candidate text information and corresponds to the i in [Qi]. p[i] denotes the probability value corresponding to the vector at the position of the i-th candidate standard information, and p[2] denotes the probability value corresponding to the vector at the position of the second candidate standard information, which in this example is also the probability value corresponding to the reference standard. The probability value reflects the probability that the candidate standard information at that position is the target standard information, or equivalently the degree of correlation between the candidate standard information at that position and the user question. In other embodiments, if the given reference standard is the third candidate standard information, the probability value corresponding to the reference standard is p[3]. margin can be understood as a magnitude parameter that represents the required gap between the probability value corresponding to the reference standard and the other probability values. Accordingly, the training goal of the initial machine learning model is to minimize the value of the loss function, that is, to make the probability value corresponding to the reference standard larger than the other probability values, with the difference reaching or exceeding margin wherever possible.
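The formula itself is carried by an image that is not available in this text. One form consistent with the symbols and behaviour described here (zero exactly when p[2] exceeds every other p[i] by at least margin) is a summed hinge loss; this is a reconstruction under that assumption, not necessarily the exact expression in the original figure, which might for instance use a maximum instead of a sum.

```latex
\mathrm{loss} \;=\; \sum_{i \neq 2} \max\bigl(0,\; \mathrm{margin} - (p[2] - p[i])\bigr)
```

If the reference standard were the third candidate instead, p[2] would be replaced by p[3] and the sum would run over i ≠ 3.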
Specifically, referring to FIG. 4, in this embodiment the input data "[CLS] cannot be paid [Q1] payment failure [Q2] cannot be paid [Q3] credit card cannot be paid ... [Q10] no money for payment" and the reference standard "[Q2] cannot be paid" are input to the initial machine learning model. After passing through the BERT model and the probability conversion layer, the initial machine learning model outputs probability values p[1], p[2], p[3], ..., p[10] corresponding to the positions of the candidate standard information. These values are input into the loss function. If the value of the loss function is 0, the probability value corresponding to the reference standard is larger than the probability value of every other candidate standard information, and each difference is greater than or equal to the preset margin; the training target is met, and positive feedback can be given to the model. If the value of the loss function is not 0, one or more of the other probability values is not smaller than the probability value corresponding to the reference standard, or is smaller by less than the preset margin; the training target is not met, and the model is penalized or its parameters are adjusted. These steps are repeated until the outputs in the model training stage meet the training target well, at which point the overall value of the loss function is close to 0.
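The behaviour described above can be checked with a small sketch. It assumes a summed hinge form of the loss (the exact formula is in a figure not reproduced here), uses 0-based list indices so that index 1 stands for p[2], and uses made-up probability values.

```python
def margin_loss(probabilities, reference_index, margin):
    """Sum the shortfall of each non-reference probability against the margin."""
    p_ref = probabilities[reference_index]
    return sum(
        max(0.0, margin - (p_ref - p))  # zero once the gap reaches the margin
        for i, p in enumerate(probabilities)
        if i != reference_index
    )

# Reference standard is the second candidate, i.e. list index 1.
met = margin_loss([0.1, 0.7, 0.2], reference_index=1, margin=0.3)
not_met = margin_loss([0.3, 0.4, 0.3], reference_index=1, margin=0.3)
```

In the first call every gap is at least the margin, so the loss is zero; in the second the gaps are too small, so the loss is positive and training would continue.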
In some embodiments, other forms of loss function may be employed. Common loss functions include the 0-1 loss function, the absolute loss function, the logarithmic loss function, the square loss function, and the exponential loss function. The skilled person can select among them according to specific needs.
The beneficial effects that may be brought by the embodiments of the present description include, but are not limited to: (1) by using the machine learning model to process the user question and the plurality of candidate standard information simultaneously, calculation time is saved and efficiency is improved; (2) by using a machine learning model comprising a BERT model and a probability conversion layer, the probability of each candidate being the target standard information can be obtained accurately, which improves the processing result. It is to be noted that different embodiments may produce different advantages; in different embodiments, any one or a combination of the above advantages, or any other advantage, may be obtained.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed within the present specification and are intended to be within the spirit and scope of one or more exemplary embodiments of the present specification.
Also, specific words are used in this specification to describe the embodiments herein. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is included in at least one embodiment of the specification. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification do not necessarily all refer to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present description may be illustrated and described in terms of several patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of this description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present description may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, and the like, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, a dynamic programming language such as Python, Ruby, and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Furthermore, unless explicitly stated in the claims, the order of processing elements and sequences, the use of numerical letters, or the use of other names in the present specification are not intended to limit the order of processes and methods in one or more embodiments of the present specification. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof in order to streamline the disclosure and aid the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to imply that the subject matter of this specification requires more features than are expressly recited in the claims. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
For each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, documents, etc., cited in this specification, the entire contents are hereby incorporated by reference into this specification, except for application history documents that are inconsistent with or contrary to the present specification, and except for documents (whether currently or later appended to the specification) that limit the broadest scope of the claims of the application. It is to be understood that if the descriptions, definitions, and/or uses of terms in the materials accompanying this specification are inconsistent with or contrary to those in this specification, the descriptions, definitions, and/or uses of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (15)

1. A method of determining target standard information, the method being performed by at least one processor, the method comprising:
acquiring a user question and one or more candidate standard information corresponding to the user question;
determining candidate text information based on the user question and the one or more candidate standard information;
determining target standard information from the one or more candidate standard information based at least on a machine learning model and the candidate text information.
2. The method of claim 1, the determining candidate text information based on the user question and the one or more candidate standard information comprising:
splicing the user question and the one or more candidate standard information according to a preset rule to determine the candidate text information.
3. The method of claim 2, wherein the candidate text information further comprises symbol information indicating a location and/or a type of the user question and the candidate standard information.
4. The method of claim 1, wherein
the determining target standard information based at least on a machine learning model and the candidate text information comprises:
determining, based on the machine learning model and the candidate text information, the location in the candidate text information corresponding to the target standard information;
or determining, based on the machine learning model and the candidate text information, probability values corresponding to the candidate standard information.
5. The method according to claim 1 or 4, wherein
the determining target standard information based at least on a machine learning model and the candidate text information comprises:
determining, based on the machine learning model and the candidate text information, a vector representation corresponding to each of the user question and the one or more candidate standard information;
determining the target standard information based on a distance between the vector representation of the user question and the vector representation of each candidate standard information.
6. The method of claim 1, wherein the machine learning model is obtained by:
acquiring a training sample set; the training sample set comprises historical questions, and historical candidate standard information and historical target standard information corresponding to the historical questions;
splicing each historical question and its corresponding historical candidate standard information to form historical candidate text information, which is used as input data; taking the historical target standard information corresponding to the historical question as a reference standard;
and training the initial machine learning model by using the input data and the corresponding reference standard to obtain the trained machine learning model.
7. The method of claim 1 or 6, the machine learning model comprising a BERT model and a probability conversion layer; the probability conversion layer is configured to convert one or more vector representations output by the BERT model into one or more probability values.
8. A system for determining target criteria information, the system comprising:
a candidate standard information acquisition module to acquire a user question and one or more candidate standard information corresponding to the user question;
a candidate text information determination module to determine candidate text information based on the user question and the one or more candidate standard information;
a target standard information determination module to determine target standard information from the one or more candidate standard information based at least on a machine learning model and the candidate text information.
9. The system of claim 8, the candidate text information determination module further configured to splice the user question and the one or more candidate standard information according to a preset rule to determine the candidate text information.
10. The system of claim 9, the candidate text information further comprising symbol information indicating the location and/or type of the user question and the candidate standard information.
11. The system of claim 8, the target standard information determination module further to determine, based on the machine learning model and the candidate text information, the location in the candidate text information corresponding to the target standard information;
or to determine, based on the machine learning model and the candidate text information, probability values corresponding to the candidate standard information.
12. The system of claim 8 or 11, the target standard information determination module further to:
determine, based on the machine learning model and the candidate text information, a vector representation corresponding to each of the user question and the one or more candidate standard information;
determine the target standard information based on a distance between the vector representation of the user question and the vector representation of each candidate standard information.
13. The system of claim 8, further comprising a model training module to:
acquire a training sample set; the training sample set comprises historical questions, and historical candidate standard information and historical target standard information corresponding to the historical questions;
splice each historical question and its corresponding historical candidate standard information to form historical candidate text information, which is used as input data; take the historical target standard information corresponding to the historical question as a reference standard;
and train the initial machine learning model by using the input data and the corresponding reference standard to obtain the trained machine learning model.
14. The system of claim 8 or 13, the machine learning model comprising a BERT model and a probability conversion layer; the probability conversion layer is configured to convert one or more vector representations output by the BERT model into one or more probability values.
15. An apparatus for determining target criteria information, the apparatus comprising a processor and a memory; the memory is used for storing instructions, and the processor is used for executing the instructions to realize the corresponding operation of the method for determining the target standard information according to any one of claims 1 to 7.
CN201911210000.3A 2019-11-29 2019-11-29 Method and system for determining target standard information Pending CN110955755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911210000.3A CN110955755A (en) 2019-11-29 2019-11-29 Method and system for determining target standard information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911210000.3A CN110955755A (en) 2019-11-29 2019-11-29 Method and system for determining target standard information

Publications (1)

Publication Number Publication Date
CN110955755A true CN110955755A (en) 2020-04-03

Family

ID=69979167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911210000.3A Pending CN110955755A (en) 2019-11-29 2019-11-29 Method and system for determining target standard information

Country Status (1)

Country Link
CN (1) CN110955755A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898678A (en) * 2020-07-30 2020-11-06 北京嘀嘀无限科技发展有限公司 Method and system for classifying samples

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992659A (en) * 2019-02-12 2019-07-09 阿里巴巴集团控股有限公司 Method and apparatus for text sequence
CN110032623A (en) * 2018-12-12 2019-07-19 阿里巴巴集团控股有限公司 The matching process and device of user's question sentence and knowledge dot leader
US20190243900A1 (en) * 2017-03-03 2019-08-08 Tencent Technology (Shenzhen) Company Limited Automatic questioning and answering processing method and automatic questioning and answering system
CN110210021A (en) * 2019-05-22 2019-09-06 北京百度网讯科技有限公司 Read understanding method and device
CN110222167A (en) * 2019-07-03 2019-09-10 阿里巴巴集团控股有限公司 A kind of method and system obtaining target criteria information

Similar Documents

Publication Publication Date Title
US11610064B2 (en) Clarification of natural language requests using neural networks
CN109960734A (en) It is answered for the problem of data visualization
CN111353033B (en) Method and system for training text similarity model
US10922342B2 (en) Schemaless systems and methods for automatically building and utilizing a chatbot knowledge base or the like
US11238050B2 (en) Method and apparatus for determining response for user input data, and medium
CN110377733B (en) Text-based emotion recognition method, terminal equipment and medium
CN115082920B (en) Deep learning model training method, image processing method and device
CN110704586A (en) Information processing method and system
CN111582500A (en) Method and system for improving model training effect
CN111324738B (en) Method and system for determining text label
US20200364216A1 (en) Method, apparatus and storage medium for updating model parameter
CN112507095A (en) Information identification method based on weak supervised learning and related equipment
CN114385694A (en) Data processing method and device, computer equipment and storage medium
CN117520497A (en) Large model interaction processing method, system, terminal, equipment and medium
CN110955755A (en) Method and system for determining target standard information
CN111858899B (en) Statement processing method, device, system and medium
CN111324722B (en) Method and system for training word weight model
CN111400413B (en) Method and system for determining category of knowledge points in knowledge base
CN111382246B (en) Text matching method, matching device, terminal and computer readable storage medium
CN112347320A (en) Associated field recommendation method and device for data table field
CN112597208A (en) Enterprise name retrieval method, enterprise name retrieval device and terminal equipment
CN115249017B (en) Text labeling method, training method of intention recognition model and related equipment
CN112785415B (en) Method, device and equipment for constructing scoring card model and computer readable storage medium
CN116226478B (en) Information processing method, model training method, device, equipment and storage medium
CN117009532B (en) Semantic type recognition method and device, computer readable medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200403)