CN112732877B - Data processing method, device and system - Google Patents

Data processing method, device and system

Info

Publication number
CN112732877B
CN112732877B (application CN201910973268.6A)
Authority
CN
China
Prior art keywords
shorthand
model
query
sequence
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910973268.6A
Other languages
Chinese (zh)
Other versions
CN112732877A (en)
Inventor
赵鹏
徐光伟
李辰
包祖贻
刘恒友
李林琳
张佶
杜河禄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910973268.6A priority Critical patent/CN112732877B/en
Publication of CN112732877A publication Critical patent/CN112732877A/en
Application granted granted Critical
Publication of CN112732877B publication Critical patent/CN112732877B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method, device and system. The method comprises the following steps: acquiring a long query sentence input by a user; obtaining word vectors of the long query sentence through a language model; and rewriting the long query sentence through a shorthand model according to the word vectors to obtain a shorthand question. The invention solves the technical problem in the prior art of low efficiency when matching questions against a question-answer library.

Description

Data processing method, device and system
Technical Field
The present invention relates to the field of internet technologies, and in particular to a data processing method, apparatus, and system.
Background
With the growth of the internet, e-commerce platforms have gradually built intelligent customer-service response systems, combining internet and computer technology, to reduce the workload of human agents in online shopping and consultation services. Following ordinary human communication habits, users frequently put their questions to such a system in the form of a long query, and the large amount of extra information carried by a long query easily reduces accuracy in the question-answer matching process.
In the related art, the length of the query a user may input is not limited, so some user queries are very long and highly redundant, which makes them difficult to match against a question-answer library.
One related technique averages the static vectors of all words in the query to form the query vector and computes its similarity with the question vectors in the question-answer library to perform matching. This scheme has three main disadvantages:
First, static word vectors cannot use contextual information and therefore cannot resolve polysemous words.
Second, averaging word vectors into a sentence vector is too coarse: every word is treated with the same weight.
Third, the original language model does not use the sentence-membership information of the question-answer library (i.e., which sentences belong to the same question).
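The averaging baseline criticized above can be sketched as follows. This is an illustrative sketch only; the embedding table and its values are invented for demonstration and are not part of the invention:

```python
import numpy as np

# Toy static embedding table: every occurrence of a word gets the same
# vector regardless of context (the word2vec/GloVe behaviour described above).
# The vocabulary and vector values are made up for illustration.
EMBED = {
    "refund":   np.array([0.9, 0.1, 0.0]),
    "shipping": np.array([0.1, 0.9, 0.0]),
    "when":     np.array([0.2, 0.2, 0.2]),
    "will":     np.array([0.2, 0.2, 0.2]),
    "arrive":   np.array([0.1, 0.8, 0.1]),
}

def sentence_vector(words):
    """Unweighted mean of static word vectors: every word counts equally."""
    return np.mean([EMBED[w] for w in words], axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = ["when", "will", "shipping", "arrive"]
candidate = ["shipping", "arrive"]
sim = cosine(sentence_vector(query), sentence_vector(candidate))
```

Note how filler words ("when", "will") dilute the sentence vector exactly as much as the content words, which is the second disadvantage listed above.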
No effective solution has yet been proposed for the problem of low question-answer-library matching efficiency in the prior art.
Disclosure of Invention
Embodiments of the present invention provide a data processing method, device, and system that at least solve the technical problem of low efficiency in matching a question-answer library in the prior art.
According to one aspect of an embodiment of the present invention, a data processing system is provided, including: a prediction subsystem, configured to acquire a long query sentence input by a user; and an offline training subsystem, configured to match the long query sentence input by the user against the questions in a language model obtained by offline training to obtain a shorthand question, and to return the shorthand question to the prediction subsystem.
Optionally, the system comprises offline training, which optimizes the language model by optimizing a loss function according to user-input questions and the question-answer-library corpus, trains the shorthand model in a sequence-to-sequence manner based on the optimized language model, obtains the word vectors of the long query sentence, and rewrites the long query sentence according to the word vectors to obtain the shorthand question.
Optionally, the prediction subsystem includes: an online prediction subsystem.
According to another aspect of an embodiment of the present invention, a data processing method is also provided, including: acquiring a long query sentence input by a user; obtaining word vectors of the long query sentence through a language model; and rewriting the long query sentence through a shorthand model according to the word vectors to obtain a shorthand question.
Optionally, the method further comprises: optimizing the language model by optimizing a loss function according to the user-input questions and the question-answer-library corpus to obtain an optimized language model.
Further, optionally, optimizing the language model by optimizing the loss function according to the user-input questions and the question-answer-library corpus includes: according to the corpus in the question-answer library, adding the information that sentences in the corpus belong to the same question, and optimizing the language model by optimizing the loss function.
Optionally, rewriting the long query sentence through the shorthand model according to the word vectors to obtain the shorthand question includes: acquiring, through the shorthand model, the words that satisfy a preset condition from the word vectors, where the preset condition is that a word's weight in the word vectors meets a preset threshold; and rewriting the long query sentence using the words that satisfy the preset condition to obtain the shorthand question.
Further, optionally, the shorthand model includes: a sequence-to-sequence shorthand model.
Optionally, acquiring the words satisfying the preset condition from the word vectors through the shorthand model includes: acquiring the words satisfying the preset condition from the word vectors through the self-attention mechanism in the sequence-to-sequence shorthand model.
Optionally, acquiring the long query sentence input by the user includes: acquiring the long query sentence through a client device, where the client device includes an intelligent terminal such as a desktop computer, a smart wearable device, a smartphone, a tablet computer, a notebook computer, or a handheld computer.
Optionally, the data processing method is applied to an e-commerce online customer service system.
Further, optionally, rewriting the long query sentence through the shorthand model according to the word vectors includes: determining the rewriting accuracy of the long query sentence according to a set rewriting mode; and rewriting the long query sentence through the shorthand model according to that rewriting accuracy.
According to still another aspect of an embodiment of the present invention, a data processing apparatus is also provided, including: a first acquisition module, configured to acquire a long query sentence input by a user; a second acquisition module, configured to obtain word vectors of the long query sentence through a language model; and a shorthand module, configured to rewrite the long query sentence through a shorthand model according to the word vectors to obtain a shorthand question.
According to an aspect of another embodiment of the present invention, a storage medium is also provided, comprising a stored program, where the program controls the device on which the storage medium resides to execute the following steps: acquiring a long query sentence input by a user; obtaining word vectors of the long query sentence through a language model; and rewriting the long query sentence through a shorthand model according to the word vectors to obtain a shorthand question.
According to an aspect of another embodiment of the present invention, a processor is also provided for running a program, where the program, when running, performs the following steps: acquiring a long query sentence input by a user; obtaining word vectors of the long query sentence through a language model; and rewriting the long query sentence through a shorthand model according to the word vectors to obtain a shorthand question.
In the embodiments of the invention, a long query sentence input by a user is acquired; word vectors of the long query sentence are obtained through a language model; and the long query sentence is rewritten through a shorthand model according to the word vectors to obtain a shorthand question. This abbreviates the lengthy query input by the user, raises the proportion of user questions that are matched in the question-answer library and answered automatically, and thereby solves the technical problem of low question-answer-library matching efficiency in the prior art.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart of the execution of a system for data processing according to a first embodiment of the present invention;
FIG. 2 is a block diagram showing the hardware structure of a computer terminal of a data processing method according to an embodiment of the present invention;
FIG. 3 is a flow chart of a method of data processing according to a second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for data processing according to a third embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Technical terms used in the present application:
Static word vector: a word-vector representation that depends only on the word itself; the vector is identical wherever the word appears, regardless of context. Typical static word vectors include word2vec and GloVe.
Dynamic word vector: the vector of a word is computed from its context, i.e., the same word receives different vectors in different contexts. Dynamic vectors mainly address polysemy, word-sense shift, and similar problems; typical examples include ELMo and BERT.
Sequence2sequence: a method that takes one sequence as input and outputs another, transformed sequence, generally realized with an encoder-decoder architecture and commonly used for machine translation.
Transformer: a neural network based on self-attention and fully connected layers.
LSTM: Long Short-Term Memory network, a commonly used recurrent neural network.
GRU: Gated Recurrent Unit, a commonly used recurrent neural network.
MaskLM: a language-model training method in which words are replaced with a mask token or randomly substituted, and the model is trained to predict the original words.
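As an illustration of the MaskLM definition above, the following sketch corrupts a token sequence in the BERT style (80% mask token, 10% random replacement, 10% unchanged). The function and its parameters are our own illustrative assumptions, not part of the patent:

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """MaskLM-style corruption: each position selected with probability
    mask_prob becomes [MASK] (80%), a random vocabulary word (10%), or is
    kept unchanged (10%). The model is then trained to predict the original
    token at every selected position."""
    rng = random.Random(seed)
    corrupted, labels = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok                       # prediction target
            roll = rng.random()
            if roll < 0.8:
                corrupted[i] = MASK
            elif roll < 0.9:
                corrupted[i] = rng.choice(vocab)  # random replacement
            # else: keep the original token unchanged
    return corrupted, labels
```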
Example 1
The present application provides a data processing system as shown in Fig. 1, which is a flowchart of the execution of a data processing system according to the first embodiment of the present application. The system comprises:
a prediction subsystem 12, configured to acquire a long query sentence input by a user; and an offline training subsystem 14, configured to match the long query sentence input by the user against the questions in a language model obtained by offline training to obtain a shorthand question, and to return the shorthand question to the prediction subsystem.
Optionally, the system comprises offline training, which optimizes the language model by optimizing a loss function according to user-input questions and the question-answer-library corpus, trains the shorthand model in a sequence-to-sequence manner based on the optimized language model, obtains the word vectors of the long query sentence, and rewrites the long query sentence according to the word vectors to obtain the shorthand question.
Optionally, the prediction subsystem 12 includes: an online prediction subsystem.
Specifically, as shown in Fig. 1, the data processing system provided in this embodiment of the present application is based on natural language processing (NLP). An online customer-service consultation system is described as the preferred example scenario; the system can likewise be applied to online human-machine question answering and to human question answering (questions and answers between a user and a customer-service agent).
In this embodiment, the data processing system is divided into a prediction subsystem 12 and an offline training subsystem 14, as shown in Fig. 1. The offline training subsystem 14 contains a language model, different from prior-art language models, for analyzing the long query sentences fed back by users.
On top of this language model, the present embodiment trains a shorthand model with the sequence2sequence method. In prior-art (offline) shorthand-model training, the sentence vector was obtained by averaging all word vectors, so every word of a long query was processed identically and weighted equally. With such equal weights on the segmented words, further abbreviation loses the emphasis of the question: the resulting shorthand question may no longer express the user's original question, and its accuracy is low.
In this embodiment, keywords are extracted through the self-attention mechanism in sequence2sequence, so the redundant query input by the user is rewritten accurately and an accurate shorthand question is obtained.
The sequence2sequence shorthand model may be realized with a Transformer, an LSTM, or a GRU; in each case the keywords are extracted through the self-attention mechanism so that the user's redundant query is rewritten accurately into a shorthand question. The concrete realization is not limited here.
Based on the language model obtained by the offline training subsystem 14, the prediction subsystem 12 acquires the long query sentence input by the user: the lengthy query is fed into the language model in the offline training subsystem 14, the model segments the query into words, a word vector is computed for each segmented word, and finally the shorthand model rewrites the query from the obtained word vectors into a shorthand question.
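The prediction pipeline described above (segment the query, compute a word vector per word, weight the words by self-attention, keep the salient ones) can be sketched as follows. All names, the pseudo-embeddings, and the keep-3 heuristic are illustrative assumptions, not the trained models of the invention:

```python
import numpy as np

def toy_word_vectors(tokens, dim=4, seed=42):
    # Stand-in for the trained language model: deterministic pseudo-embeddings.
    rng = np.random.default_rng(seed)
    table = {}
    return np.stack([table.setdefault(t, rng.normal(size=dim)) for t in tokens])

def self_attention_weights(vectors):
    # Average attention received per token under scaled dot-product attention.
    scores = vectors @ vectors.T / np.sqrt(vectors.shape[1])
    probs = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.mean(axis=0)

def shorthand(tokens, keep=3):
    # Keep the `keep` most-attended tokens, preserving original word order.
    weights = self_attention_weights(toy_word_vectors(tokens))
    top = sorted(np.argsort(weights)[-keep:])
    return [tokens[i] for i in top]

long_query = ["hello", "please", "tell", "me", "when", "refund", "arrives"]
short_query = shorthand(long_query)
```

In the actual system the attention weights would come from the trained sequence2sequence model rather than random embeddings; the sketch only shows the shape of the computation.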
Example 2
According to an embodiment of the present invention, a method embodiment of data processing is also provided. It should be noted that the steps shown in the flowchart of the figures may be performed in a computer system, such as by a set of computer-executable instructions, and although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from that given herein.
The method embodiment provided in the second embodiment of the present application may be executed on a mobile terminal, a computer terminal, or a similar computing device. Taking a computer terminal as an example, fig. 2 is a block diagram of the hardware structure of a computer terminal for the data processing method according to an embodiment of the present application. As shown in fig. 2, the computer terminal 20 may include one or more processors 202 (only one is shown in the figure; the processor 202 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA), a memory 204 for storing data, and a transmission module 206 for communication functions. It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 2 is merely illustrative and is not intended to limit the configuration of the electronic device described above. For example, the computer terminal 20 may include more or fewer components than shown in fig. 2, or have a different configuration than shown in fig. 2.
The memory 204 may be used to store software programs and modules of application software, such as program instructions/modules corresponding to the data processing method in the embodiment of the present invention, and the processor 202 executes the software programs and modules stored in the memory 204 to perform various functional applications and data processing, that is, to implement the data processing method of the application program. Memory 204 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 204 may further include memory located remotely from the processor 202, which may be connected to the computer terminal 20 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 206 is used to receive or transmit data via a network. The specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 20. In one example, the transmission module 206 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission module 206 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In the above-described operating environment, the present application provides a method of data processing as shown in FIG. 3. Fig. 3 is a flowchart of a method of data processing according to a second embodiment of the present application. The data processing method provided by the embodiment of the application comprises the following steps:
Step S302: acquiring a long query sentence input by a user.
In step S302, the long query sentence input by the user is acquired through a client device, where the client device includes an intelligent terminal such as a desktop computer, a smart wearable device, a smartphone, a tablet computer, a notebook computer, or a handheld computer.
Taking an online customer-service consultation system as an example: when a user asks questions through a web page or a client installed on a terminal (such as a desktop computer, notebook computer, smartphone, or smart wearable device), very long questions are common; they typically contain many fixed phrases and complex content. To improve the efficiency of automatic online replies, such a question needs to be rewritten into a shorthand question so that the server back end can accurately match an answer. For example, if the long question input by the client is "Is the model-BBB shirt of brand AAA in store XXX good?", the long query sentence needs to be rewritten, as described in steps S304 and S306.
Step S304: obtaining word vectors of the long query sentence through a language model.
In step S304, based on the long query sentence acquired in step S302 (e.g., the lengthy query input by the user), the query is fed into the language model, the language model segments it into words, and a word vector is computed for each segmented word.
Step S306: rewriting the long query sentence through a shorthand model according to the word vectors to obtain a shorthand question.
In step S306, the shorthand question is obtained by rewriting with the shorthand model based on the word vectors obtained in step S304.
The data processing method provided in this embodiment of the present application is applied to an e-commerce online customer-service system and is based on natural language processing (NLP) technology. An online customer-service consultation system is described as the preferred example scenario; the method can likewise be applied to online human-machine question answering and to human question answering (questions and answers between a user and a customer-service agent). The method can be applied to the data processing system of Embodiment 1, which is divided into a prediction subsystem and an offline training subsystem; the offline training subsystem contains a language model, different from prior-art language models, for analyzing the long query sentences fed back by users.
On top of this language model, the present embodiment trains a shorthand model with the sequence2sequence method. In prior-art (offline) shorthand-model training, the sentence vector was obtained by averaging all word vectors, so every word of a long query was processed identically and weighted equally. With such equal weights on the segmented words, further abbreviation loses the emphasis of the question: the resulting shorthand question may no longer express the user's original question, and its accuracy is low.
In this embodiment, keywords are extracted through the self-attention mechanism in sequence2sequence, so the redundant query input by the user is rewritten accurately and an accurate shorthand question is obtained.
The sequence2sequence shorthand model may be realized with a Transformer, an LSTM, or a GRU; in each case the keywords are extracted through the self-attention mechanism so that the user's redundant query is rewritten accurately into a shorthand question. The concrete realization is not limited here.
Based on the language model obtained by the offline training subsystem, the prediction subsystem acquires the long query sentence input by the user: the lengthy query is fed into the language model in the offline training subsystem, the model segments the query into words, a word vector is computed for each segmented word, and finally the shorthand model rewrites the query from the obtained word vectors into a shorthand question.
Further, optionally, rewriting the long query sentence through the shorthand model according to the word vectors includes: determining the rewriting accuracy of the long query sentence according to a set rewriting mode; and rewriting the long query sentence through the shorthand model according to that rewriting accuracy.
Specifically, in this embodiment different rewriting accuracies can be set for rewriting long query sentences to suit different application scenarios. For example, when rewriting long query sentences in an interview report, the rewriting accuracy can be set high so as to guarantee the professionalism of the text when the report is published externally. In an e-commerce platform environment, an ordinary rewriting accuracy suffices: if the feedback the user receives does not match the meaning of the original long query sentence, the user can adjust the sentence and query again, and the corrected long query sentence is analyzed and rewritten anew through further learning of the shorthand model, improving the accuracy of the answers fed back to the user.
Here, the rewriting mode in this embodiment may be a mode in which the user selects the rewriting accuracy for different scenes, or a mode in which a rewriting accuracy is preset for each application scenario.
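The choice between preset and user-selected rewriting accuracies could be configured as in the following sketch; the scenario names and threshold values are hypothetical, since the patent does not fix concrete values:

```python
# Hypothetical per-scenario accuracy settings (illustrative values only).
REWRITE_MODES = {
    "interview_report": {"min_confidence": 0.9},  # high accuracy for publication
    "ecommerce_chat":   {"min_confidence": 0.6},  # ordinary accuracy; user can retry
}

DEFAULT_MODE = {"min_confidence": 0.6}

def should_accept_rewrite(scenario, model_confidence):
    """Accept a shorthand rewrite only if the model's confidence meets the
    rewriting accuracy configured for the application scenario."""
    mode = REWRITE_MODES.get(scenario, DEFAULT_MODE)
    return model_confidence >= mode["min_confidence"]
```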
In the embodiments of the invention, a long query sentence input by a user is acquired; word vectors of the long query sentence are obtained through a language model; and the long query sentence is rewritten through a shorthand model according to the word vectors to obtain a shorthand question. This abbreviates the lengthy query input by the user, raises the proportion of user questions that are matched in the question-answer library and answered automatically, and thereby solves the technical problem of low question-answer-library matching efficiency in the prior art.
Optionally, the method for processing data provided by the embodiment of the present application further includes:
Step S300: optimizing the language model by optimizing a loss function according to the user-input questions and the question-answer-library corpus to obtain an optimized language model.
Further optionally, optimizing the language model by optimizing the loss function according to the user-input questions and the question-answer-library corpus in step S300 includes:
Step S3001: according to the corpus in the question-answer library, adding the information that sentences in the corpus belong to the same question, and optimizing the language model by optimizing the loss function.
Specifically, in implementing the data processing method provided by this embodiment, the language model used in steps S302 to S306 is obtained through offline training, that is, by the optimization performed in the offline training subsystem of Embodiment 1. The optimization proceeds as follows:
In the offline training subsystem, the language model uses the question-answer-library corpus and adds the information that certain sentences in the corpus belong to the same question, thereby optimizing the language model and using contextual information to resolve polysemous words.
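One way such sentence-membership information can enter a loss function is a contrastive-style auxiliary term, sketched below under assumptions of our own (cosine distance, a fixed margin); the patent does not specify this concrete form:

```python
import numpy as np

def membership_loss(embeddings, question_ids, margin=0.5):
    """Illustrative auxiliary loss: pull sentence embeddings that belong to
    the same question-answer-library question together, and push embeddings
    of different questions apart by at least `margin` in cosine distance."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    loss, pairs = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - cos(embeddings[i], embeddings[j])
            if question_ids[i] == question_ids[j]:
                loss += d                      # same question: minimize distance
            else:
                loss += max(0.0, margin - d)   # different question: enforce margin
            pairs += 1
    return loss / pairs
```

In practice this term would be added to the language model's main (e.g., MaskLM) objective and minimized jointly; the pairwise loop is written for clarity, not efficiency.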
Optionally, in step S306, rewriting the long query sentence through the shorthand model according to the word vectors to obtain the shorthand question includes:
Step S3061: acquiring, through the shorthand model, the words that satisfy a preset condition from the word vectors, where the preset condition is that a word's weight in the word vectors meets a preset threshold;
Step S3062: rewriting the long query sentence using the words that satisfy the preset condition to obtain the shorthand question.
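Steps S3061 and S3062 can be sketched as a simple threshold filter over per-word weights; the words, weights, and threshold below are invented for illustration:

```python
def select_keywords(words, weights, threshold=0.15):
    """Keep only the segmented words whose weight meets the preset threshold,
    preserving the original word order. The threshold value is an assumption
    for demonstration."""
    return [w for w, wt in zip(words, weights) if wt >= threshold]

# Illustrative segmented query with hypothetical attention weights.
words = ["hello", "store", "refund", "please", "policy"]
weights = [0.05, 0.20, 0.40, 0.05, 0.30]
short_question = " ".join(select_keywords(words, weights))
```

Joining the surviving words yields the shorthand question "store refund policy" in this toy case; a real system would additionally smooth the result into a fluent sentence.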
Further, optionally, the shorthand model includes: sequence-to-sequence shorthand model.
Optionally, the obtaining, in step S3062, of the words satisfying the preset condition from the word vector through the shorthand model includes:
Step S30621, obtaining the words satisfying the preset condition from the word vector through a self-attention mechanism in the sequence-to-sequence shorthand model.
Specifically, in the embodiment of the present application, the shorthand model is trained using a sequence-to-sequence (sequence2sequence) method. In the prior art, the shorthand model obtains a sentence vector by averaging all word vectors, so when a user inputs a lengthy query and the query is segmented, every segmented word carries the same weight. When the query is then abbreviated, the emphasis of the abbreviation is lost, the resulting shorthand problem no longer reflects the original intent of the question input by the user, and the accuracy of the shorthand problem is low.
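The contrast drawn above — prior-art averaging gives every segmented word the same weight, whereas a weighted combination lets key words dominate — can be sketched as follows. Both helper names and the toy vectors are assumptions made for illustration.

```python
import numpy as np

def average_sentence_vector(word_vecs):
    """Prior-art baseline: every segmented word contributes equally,
    so the emphasis of the long query is lost."""
    return np.mean(word_vecs, axis=0)

def weighted_sentence_vector(word_vecs, weights):
    """Weighted alternative: key words dominate the sentence vector."""
    w = np.asarray(weights, dtype=float)[:, None]
    return (w * word_vecs).sum(axis=0) / w.sum()

vecs = np.array([[1.0, 0.0],   # key word
                 [0.0, 1.0]])  # filler word
print(average_sentence_vector(vecs))           # [0.5 0.5] -- both words equal
print(weighted_sentence_vector(vecs, [3, 1]))  # [0.75 0.25] -- key word dominates
```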
In the embodiment of the application, keywords are acquired through the self-attention mechanism in sequence2sequence, so that the lengthy query input by the user is rewritten accurately and an accurate shorthand problem is obtained.
In the embodiment of the application, the sequence2sequence shorthand model may be implemented with a Transformer, an LSTM, or a GRU; in each case the lengthy query input by the user is rewritten accurately by acquiring keywords through the self-attention mechanism, so that an accurate shorthand problem is obtained. The specific implementation is not limited here.
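One way the per-word weights of steps S3061–S3062 could come out of a self-attention mechanism is sketched below, using the standard scaled dot-product form. The projection matrices of a real Transformer are omitted for brevity, and treating the column mean of the attention matrix as a word's importance is an assumption of this example, not a detail taken from the patent.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_word_weights(X):
    """Scaled dot-product self-attention over token vectors X (n x d).
    Each row of A sums to 1; the column mean is the average attention
    a token receives from all positions, used here as its weight."""
    n, d = X.shape
    A = softmax(X @ X.T / np.sqrt(d), axis=-1)  # (n, n) attention matrix
    return A.mean(axis=0)                       # one weight per token, sums to 1

# With orthonormal token vectors every token attends mostly to itself,
# so by symmetry the weights come out uniform.
w = self_attention_word_weights(np.eye(3))
print(np.round(w, 3))  # [0.333 0.333 0.333]
```

Whether the underlying encoder is a Transformer, an LSTM, or a GRU, weights extracted this way could feed the thresholding of step S3061.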
In summary, the method for data processing provided in the embodiment of the present application includes three parts: optimizing the language model using the question-answer library, training the sequence2sequence shorthand model offline on long queries and their matched questions, and performing query shorthand online using the trained shorthand model. The language model is optimized using the corpus of the massive question-answer library and the membership of sentences to questions, and the lengthy query is rewritten by the sequence2sequence method while its key semantics are preserved, so that the proportion of matched questions is improved and the user experience is improved.
The data processing provided by the embodiment of the application automatically abbreviates lengthy queries by pre-training dynamic word vectors with the question-answer library and the language model and applying the sequence2sequence method. The loss function of the language model is adapted to better suit the service scenario: on the basis of BERT's original masked-LM (maskLM) and sentence-relationship objectives, a loss term for sentences belonging to the same question-answer pair is added, which further improves the effect.
The data processing provided by the embodiment of the application uses the information that sentences in the question-answer library belong to the same question in the language model to further optimize the effect.
The data processing provided by the embodiment of the application optimizes the pre-training method for dynamic word vectors using massive question-answer library information and innovatively trains a sequence2sequence query shorthand model. The lengthy query input by the user is thereby abbreviated, the proportion of user questions matched to the question-answer library and answered automatically is improved, and the answering efficiency of the online customer-service question-answering system is optimized.
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present invention is not limited by the order of the acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required by the present invention.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method of data processing according to the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the method according to the embodiments of the present invention.
Example 3
According to an embodiment of the present invention, there is further provided an apparatus for implementing the above-mentioned data processing method, and fig. 4 is a schematic structural diagram of an apparatus for data processing according to a third embodiment of the present invention, as shown in fig. 4, where the apparatus includes: a first obtaining module 42, configured to obtain a query long sentence input by a user; a second obtaining module 44, configured to obtain a word vector of the query long sentence through the language model; the shorthand module 46 is configured to rewrite the query long sentence according to the word vector through the shorthand model, so as to obtain a shorthand problem.
Example 4
According to an aspect of another embodiment of the present invention, there is also provided a storage medium including a stored program, wherein the program controls a device in which the storage medium is located to execute the following steps: acquiring a query long sentence input by a user; acquiring word vectors of the query long sentences through a language model; and rewriting the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem.
Example 5
According to an aspect of another embodiment of the present invention, there is also provided a processor for running a program, wherein the program performs the following steps when running: acquiring a query long sentence input by a user; acquiring word vectors of the query long sentences through a language model; and rewriting the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem.
Example 6
The embodiment of the invention also provides a storage medium. Alternatively, in this embodiment, the storage medium may be used to store program codes executed by the method for data processing provided in the first embodiment.
Alternatively, in this embodiment, the storage medium may be located in any one of the computer terminals in the computer terminal group in the computer network, or in any one of the mobile terminals in the mobile terminal group.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: acquiring a query long sentence input by a user; acquiring word vectors of the query long sentences through a language model; and rewriting the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: and optimizing the language model through an optimization loss function according to the questions and the question-answering library corpus input by the user to obtain an optimized language model.
Further optionally, in the present embodiment, the storage medium is configured to store program code for performing the steps of: the optimizing of the language model through the optimization loss function according to the questions input by the user and the question-answer corpus comprises: adding, according to the corpus in the question-answer library, a loss term for sentences in the corpus that belong to the same question, and optimizing the language model through the optimization loss function.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: the rewriting of the query long sentence through the shorthand model according to the word vector to obtain the shorthand problem comprises: acquiring words satisfying a preset condition from the word vector through the shorthand model, wherein the preset condition is that a word's weight in the word vector meets a preset threshold; and rewriting the query long sentence using the words satisfying the preset condition to obtain the shorthand problem.
Further optionally, in the present embodiment, the storage medium is configured to store program code for performing the steps of: the shorthand model includes: sequence-to-sequence shorthand model.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: the obtaining of the words satisfying the preset condition from the word vector through the shorthand model comprises: acquiring the words satisfying the preset condition from the word vector through a self-attention mechanism in the sequence-to-sequence shorthand model.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present invention, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The above-described apparatus embodiments are merely exemplary; for example, the division into units is merely a logical functional division, and another division may be used in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, units, or modules, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.

Claims (9)

1. A system for data processing, comprising:
The prediction subsystem is used for acquiring a query long sentence input by a user;
The offline training subsystem is used for matching the query long sentence input by the user against the questions in the language model obtained by offline training to obtain a shorthand problem, and returning the shorthand problem to the prediction subsystem;
The offline training subsystem is used for optimizing the language model through an optimization loss function according to the questions input by the user and the question-answer corpus, training a shorthand model in a sequence-to-sequence manner according to the optimized language model to obtain the word vector of the query long sentence, and rewriting the query long sentence according to the word vector to obtain the shorthand problem.
2. The system of claim 1, wherein the prediction subsystem comprises: an online prediction subsystem.
3. A method of data processing, comprising:
Acquiring a query long sentence input by a user;
Acquiring word vectors of the query long sentences through a language model;
rewriting the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem;
Optimizing the language model through an optimization loss function according to the questions input by the user and the question-answer library corpus, to obtain an optimized language model;
according to the corpus in the question-answer library, adding information indicating that sentences in the corpus belong to the same question, and optimizing the language model through the optimization loss function;
wherein the rewriting of the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem includes:
Acquiring words satisfying a preset condition from the word vector through a self-attention mechanism in a sequence-to-sequence shorthand model, wherein the preset condition is that a word's weight in the word vector meets a preset threshold, and the shorthand model comprises: a sequence-to-sequence shorthand model;
and rewriting the query long sentence through the words meeting the preset conditions, so as to obtain a shorthand problem.
4. The method of claim 3, wherein the obtaining the query long sentence entered by the user comprises:
Obtaining, by a client device, the query long sentence input by the user, where the client device comprises: an intelligent mobile terminal, the intelligent mobile terminal comprising: a desktop computer, a smart wearable device, a smartphone, a tablet computer, a notebook computer, or a palmtop computer.
5. A method according to claim 3, wherein the method of data processing is applied to an e-commerce online customer service system.
6. The method of claim 5, wherein the rewriting of the query long sentence through a shorthand model according to the word vector comprises:
determining the rewriting precision of the query long sentence according to the set rewriting mode;
And rewriting the query long sentence through a shorthand model according to the rewriting precision.
7. An apparatus for data processing, comprising:
the first acquisition module is used for acquiring a query long sentence input by a user;
The second acquisition module is used for acquiring word vectors of the query long sentence through a language model;
The shorthand module is used for rewriting the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem;
The device is further configured to optimize the language model through an optimization loss function according to the questions input by the user and the question-answer corpus, to obtain an optimized language model; and, according to the corpus in the question-answer library, to add information indicating that sentences in the corpus belong to the same question and optimize the language model through the optimization loss function;
The shorthand module is further configured to obtain, from the word vector, words satisfying a preset condition through a self-attention mechanism in a sequence-to-sequence shorthand model, wherein the preset condition is that a word's weight in the word vector meets a preset threshold, and the shorthand model comprises: a sequence-to-sequence shorthand model; and to rewrite the query long sentence using the words satisfying the preset condition, so as to obtain a shorthand problem.
8. A storage medium comprising a stored program, wherein the program, when run, controls a device on which the storage medium resides to perform the steps of: acquiring a query long sentence input by a user; acquiring a word vector of the query long sentence through a language model; rewriting the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem; optimizing the language model through an optimization loss function according to the questions input by the user and the question-answer library corpus to obtain an optimized language model; and, according to the corpus in the question-answer library, adding information indicating that sentences in the corpus belong to the same question and optimizing the language model through the optimization loss function; wherein the rewriting of the query long sentence through the shorthand model according to the word vector to obtain the shorthand problem comprises: acquiring words satisfying a preset condition from the word vector through a self-attention mechanism in a sequence-to-sequence shorthand model, wherein the preset condition is that a word's weight in the word vector meets a preset threshold, and the shorthand model comprises: a sequence-to-sequence shorthand model; and rewriting the query long sentence using the words satisfying the preset condition, so as to obtain the shorthand problem.
9. A processor for running a program, wherein the program, when run, performs the steps of: acquiring a query long sentence input by a user; acquiring a word vector of the query long sentence through a language model; rewriting the query long sentence through a shorthand model according to the word vector to obtain a shorthand problem; optimizing the language model through an optimization loss function according to the questions input by the user and the question-answer library corpus to obtain an optimized language model; and, according to the corpus in the question-answer library, adding information indicating that sentences in the corpus belong to the same question and optimizing the language model through the optimization loss function; wherein the rewriting of the query long sentence through the shorthand model according to the word vector to obtain the shorthand problem comprises: acquiring words satisfying a preset condition from the word vector through a self-attention mechanism in a sequence-to-sequence shorthand model, wherein the preset condition is that a word's weight in the word vector meets a preset threshold, and the shorthand model comprises: a sequence-to-sequence shorthand model; and rewriting the query long sentence using the words satisfying the preset condition, so as to obtain the shorthand problem.
CN201910973268.6A 2019-10-14 2019-10-14 Data processing method, device and system Active CN112732877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910973268.6A CN112732877B (en) 2019-10-14 2019-10-14 Data processing method, device and system

Publications (2)

Publication Number Publication Date
CN112732877A CN112732877A (en) 2021-04-30
CN112732877B true CN112732877B (en) 2024-05-17

Family

ID=75588427

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910973268.6A Active CN112732877B (en) 2019-10-14 2019-10-14 Data processing method, device and system

Country Status (1)

Country Link
CN (1) CN112732877B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704521A * 2017-09-07 2018-02-16 北京零秒科技有限公司 Question-answer processing server, client, and implementation method
CN107870964A * 2017-07-28 2018-04-03 北京中科汇联科技股份有限公司 Sentence ranking method and system applied to an answer fusion system
KR20190059084A * 2017-11-22 2019-05-30 한국전자통신연구원 Natural language question-answering system and learning method
CN110135551A * 2019-05-15 2019-08-16 西南交通大学 Robot chat method based on word vectors and recurrent neural networks
CN110134771A * 2019-04-09 2019-08-16 广东工业大学 Implementation method of a question-answering system based on a multi-attention-mechanism fusion network
CN110263160A * 2019-05-29 2019-09-20 中国电子科技集团公司第二十八研究所 Question classification method in a computer question-answering system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10418023B2 (en) * 2017-10-17 2019-09-17 International Business Machines Corporation Automatic answer rephrasing based on talking style

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Conversational Query Understanding Using Sequence to Sequence Modeling; Gary Ren et al.; Track: Web Search and Mining; 2018-04-27; full text *
Short-text semantic simplification based on a temporal recurrent sequence model; Lin Weibin; Yang Shihan; 物联网技术 (Internet of Things Technologies); 2019-05-20 (Issue 05); full text *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant