CN112825247A - Data processing method and device and electronic equipment - Google Patents

Data processing method and device and electronic equipment Download PDF

Info

Publication number
CN112825247A
Authority
CN
China
Prior art keywords
training
language model
training data
shared language
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911136852.2A
Other languages
Chinese (zh)
Inventor
黄海兵
邱晓杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201911136852.2A priority Critical patent/CN112825247A/en
Publication of CN112825247A publication Critical patent/CN112825247A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0635 Training updating or merging of old and new templates; Mean values; Weighting
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis, using predictive techniques
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/12 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters, the extracted parameters being prediction coefficients

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the invention provide a data processing method, a data processing apparatus and an electronic device. The method includes: acquiring training data; judging whether the training data has training value for training a shared language model; and, if it does, training the shared language model with the training data to update the model parameters of the shared language model and uploading the updated model parameters to a server. Because only training data with training value is used for training, the amount of computation on the terminal device and the amount of data transmitted between the terminal device and the server are reduced, achieving the effect of reducing resource consumption.

Description

Data processing method and device and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, and an electronic device.
Background
With the development of computer technology, electronic devices such as mobile phones and tablet computers have become increasingly popular, bringing great convenience to people's daily life, study and work. An input method application (input method for short) is typically installed on these devices so that users can enter information with it.
To improve users' input efficiency, the input method usually uses a language model to predict candidates that the user can select directly. The traditional way to train the language model is to gather the training data on a single machine or in a single data center (also called the cloud) and then train the language model on the centralized data; such training data is generally collected from users under an agreement with them. However, as users pay more and more attention to privacy, this way of collecting training data is increasingly restricted. Federated learning has therefore been proposed: model training is decoupled from centralized cloud storage, that is, the model is trained on the terminal device with the user's own data and only the updated model parameters are uploaded to the cloud, which avoids exposing the user's private data.
However, as the number of user devices and the amount of user data on each device keep growing, the local training computation in federated learning and the number of model parameters to be uploaded also keep increasing, which raises resource consumption.
Disclosure of Invention
Embodiments of the invention provide a data processing method for reducing resource consumption.
Correspondingly, embodiments of the invention also provide a data processing apparatus and an electronic device to ensure the implementation and application of the above method.
In order to solve the above problem, an embodiment of the present invention discloses a data processing method, which specifically includes: acquiring training data; judging whether the training data has a training value for training a shared language model; and if the training data has a training value, training the shared language model through the training data to update the model parameters of the shared language model, and uploading the updated model parameters to a server.
Optionally, the training data comprises training text and reference results; the judging whether the training data has a training value for training a shared language model comprises: inputting the training text into the shared language model for forward calculation to obtain a prediction result; and judging whether the training data has a training value for training a shared language model or not according to the prediction result and the reference result.
Optionally, the prediction result comprises predicted texts and corresponding prediction probabilities; judging whether the training data has training value for training the shared language model according to the prediction result and the reference result comprises: judging whether the predicted text with the maximum prediction probability matches the reference text; and if the predicted text with the maximum prediction probability does not match the reference text, determining that the training data has training value for training the shared language model.
Optionally, the reference result further includes a reference probability corresponding to a reference text, and the training the shared language model through the training data to update the model parameters of the shared language model includes: determining the prediction probability of a predicted text matched with a reference text and the error between the prediction probability of the matched predicted text and the reference probability corresponding to the reference text; and carrying out reverse learning on the shared language model according to the error, and updating the model parameters of the shared language model.
Optionally, the method further comprises: if the training data does not have training value, discarding the training data.
Optionally, the method further comprises: and obtaining the shared language model after the model parameters are updated, wherein the model parameters of the shared language model are updated by the server according to the model parameters uploaded by each terminal device.
Optionally, the training data comprises information relating to user input.
The embodiment of the invention also discloses a data processing device, which specifically comprises: a training data acquisition module for acquiring training data; a judging module for judging whether the training data has training value for training the shared language model; and an updating module for, if the training data has training value, training the shared language model with the training data to update the model parameters of the shared language model and uploading the updated model parameters to a server.
Optionally, the training data comprises training text and reference results; the judging module comprises: the forward calculation sub-module is used for inputting the training text into the shared language model for forward calculation to obtain a prediction result; and the training value judgment sub-module is used for judging whether the training data has the training value for training the shared language model or not according to the prediction result and the reference result.
Optionally, the prediction result includes predicted texts and corresponding prediction probabilities, and the training value determination sub-module is configured to judge whether the predicted text with the maximum prediction probability matches the reference text; and if it does not match the reference text, determine that the training data has training value for training the shared language model.
Optionally, the reference result further includes a reference probability corresponding to the reference text, and the updating module is configured to determine a prediction probability of the predicted text matching the reference text and an error between the prediction probability of the matched predicted text and the reference probability corresponding to the reference text; and carrying out reverse learning on the shared language model according to the error, and updating the model parameters of the shared language model.
Optionally, the apparatus further comprises: and the data abandoning module is used for abandoning the training data if the training data does not have the training value.
Optionally, the apparatus further comprises: and the model acquisition module is used for acquiring the shared language model after the model parameters are updated, and the model parameters of the shared language model are updated by the server according to the model parameters uploaded by each terminal device.
Optionally, the training data comprises information relating to user input.
The embodiment of the invention also discloses a readable storage medium, and when the instructions in the storage medium are executed by a processor of the electronic equipment, the electronic equipment can execute the data processing method according to any one of the embodiments of the invention.
An embodiment of the present invention also discloses an electronic device, including a memory, and one or more programs, where the one or more programs are stored in the memory, and configured to be executed by one or more processors, and the one or more programs include instructions for: acquiring training data; judging whether the training data has a training value for training a shared language model; and if the training data has a training value, training the shared language model through the training data to update the model parameters of the shared language model, and uploading the updated model parameters to a server.
Optionally, the training data comprises training text and reference results; the judging whether the training data has a training value for training a shared language model comprises: inputting the training text into the shared language model for forward calculation to obtain a prediction result; and judging whether the training data has a training value for training a shared language model or not according to the prediction result and the reference result.
Optionally, the prediction result includes predicted texts and corresponding prediction probabilities, and determining whether the training data has training value for training a shared language model according to the prediction result and the reference result includes: judging whether the predicted text with the maximum prediction probability matches the reference text; and if it does not match the reference text, determining that the training data has training value for training the shared language model.
Optionally, the reference result further includes a reference probability corresponding to a reference text, and the training the shared language model through the training data to update the model parameters of the shared language model includes: determining the prediction probability of a predicted text matched with a reference text and the error between the prediction probability of the matched predicted text and the reference probability corresponding to the reference text; and carrying out reverse learning on the shared language model according to the error, and updating the model parameters of the shared language model.
Optionally, further comprising instructions for: if the training data does not have training value, discarding the training data.
Optionally, further comprising instructions for: and obtaining the shared language model after the model parameters are updated, wherein the model parameters of the shared language model are updated by the server according to the model parameters uploaded by each terminal device.
Optionally, the training data comprises information relating to user input.
The embodiment of the invention has the following advantages:
in the embodiment of the invention, training data can be obtained, and whether the training data has training value for training the shared language model is then judged; if it does, the shared language model is trained with the training data to update its model parameters, and the updated model parameters are uploaded to a server. Because only training data with training value is used to train the shared language model, the amount of training data can be effectively reduced, thereby reducing the computation on the terminal device and the amount of data transmitted between the terminal device and the server, achieving the effect of reducing resource consumption.
Drawings
FIG. 1 is a flow chart of the steps of one data processing method embodiment of the present invention;
FIG. 2 is a flow chart of the steps of an alternative embodiment of a data processing method of the present invention;
FIG. 3 is a block diagram of an embodiment of a data processing apparatus according to the present invention;
FIG. 4 is a block diagram of an alternate embodiment of a data processing apparatus of the present invention;
FIG. 5 illustrates a block diagram of an electronic device for data processing in accordance with an exemplary embodiment;
fig. 6 is a schematic structural diagram of an electronic device for data processing according to another exemplary embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
One of the core ideas of the embodiment of the invention is that training is carried out on the shared language model by adopting training data with training value to update and upload model parameters of the shared language model, so that the calculation amount of the terminal equipment and the data transmission amount between the terminal equipment and the server are reduced, and the resource consumption is reduced.
The shared language model is initially obtained by the server training a language model on shared data. The server then issues the shared language model to each terminal device; each terminal device trains the shared language model with its own training data, updates the model parameters, and uploads the updated model parameters to the server. The server then updates the shared language model based on the model parameters uploaded by the terminal devices, issues the updated shared language model to each terminal device again, and the cycle repeats. Shared data here refers to data that users have allowed to be uploaded to the server.
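To make the flow above concrete, the following sketch reduces the shared language model to a plain list of parameters and walks through one round of issuing, local updating and fusion; the function names, the learning rate and the numeric values are illustrative assumptions and are not part of the patent.

    # Minimal sketch of one federated round, assuming the model is a flat list of floats.
    # Names, learning rate and values are illustrative only.
    from typing import List

    def local_update(shared: List[float], grads: List[float], lr: float = 0.1) -> List[float]:
        """Terminal side: adjust the issued shared parameters with locally computed gradients."""
        return [w - lr * g for w, g in zip(shared, grads)]

    def server_fuse(uploads: List[List[float]]) -> List[float]:
        """Server side: fuse the uploaded parameters, here by simple averaging."""
        return [sum(ws) / len(ws) for ws in zip(*uploads)]

    if __name__ == "__main__":
        shared_model = [0.0, 0.0, 0.0]                       # model issued by the server
        device_grads = [[0.2, -0.1, 0.0], [0.4, 0.1, -0.2]]  # gradients computed on two terminals
        uploads = [local_update(shared_model, g) for g in device_grads]
        shared_model = server_fuse(uploads)                  # fused model is re-issued to terminals
        print(shared_model)                                  # approximately [-0.03, 0.0, 0.01]

Only the updated model parameters, never the raw input data, leave the terminal device, which is consistent with the privacy aim described above.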
The shared language model may refer to various models used in the input method, and the corresponding functions may also include multiple functions, such as association, error correction, long word association, and the like, which is not limited in this embodiment of the present invention.
The following description will take an example in which a terminal device updates model parameters of a shared language model and uploads the updated model parameters to a server.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
Step 102: acquire training data.
In the embodiment of the present invention, after the input method application (hereinafter, the input method) on a terminal device receives the shared language model issued by the server, it may collect the input information of the user of that terminal device and use the input information as training data; the model parameters of the shared language model are then updated based on this training data. The input information may include input-related information such as the input sequence, the candidates committed to the screen, chat logs, location information and weather information, which is not limited by the embodiment of the present invention.
Step 104: judge whether the training data has training value for training the shared language model.
Step 106: if the training data has training value, train the shared language model with the training data to update the model parameters of the shared language model, and upload the updated model parameters to a server.
In the embodiment of the invention, the shared language model can be trained with data pruning (Data Prune): the training data is pruned first, the shared language model is then trained with the training data retained after pruning, and the model parameters of the shared language model are updated. This reduces the amount of training data used for training, and therefore reduces both the computation on the terminal device and the amount of data transmitted between the terminal device and the server.
In the embodiment of the present invention, one way to prune the training data is to judge whether the training data has training value; if it has no training value, the training data is pruned (e.g. discarded); if it has training value, the training data is retained. Training value here refers to the value of the data for training the shared language model.
The shared language model can then be trained with the training data that has training value (i.e. the retained training data), updating the model parameters of the shared language model, and the updated model parameters are uploaded to the server. The server then updates the shared language model based on the model parameters uploaded by the terminal devices across the whole network and issues it to each terminal device again.
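The pruning step described above can be written as a simple filter; in the sketch below, has_training_value stands in for the prediction/reference comparison of steps 204 to 206 and is a hypothetical placeholder, not an interface defined by the patent.

    # Illustrative data pruning: keep only the samples judged to have training value.
    from typing import Callable, List, Tuple

    Sample = Tuple[str, str]  # (training text, reference text)

    def prune(samples: List[Sample],
              has_training_value: Callable[[Sample], bool]) -> List[Sample]:
        """Discard valueless samples; only the retained ones are used for local training."""
        return [s for s in samples if has_training_value(s)]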
In summary, in the embodiment of the present invention, training data can be obtained and it is then judged whether the training data has training value for training the shared language model; if it does, the shared language model is trained with the training data to update its model parameters, and the updated model parameters are uploaded to a server. Because only training data with training value is used to train the shared language model, the amount of training data can be effectively reduced, thereby reducing the computation on the terminal device and the amount of data transmitted between the terminal device and the server, achieving the effect of reducing resource consumption.
In another embodiment of the present invention, whether the training data has training value may be determined by comparing the reference result contained in the training data with the prediction result output by the shared language model, as follows:
referring to fig. 2, a flowchart illustrating steps of an alternative embodiment of the data processing method of the present invention is shown, which may specifically include the following steps:
step 202, acquiring training data; the training data comprises training texts and reference results, and the reference results comprise reference texts and corresponding reference probabilities.
In the embodiment of the invention, to further reduce the computation on the terminal device and the amount of data transmitted between the terminal device and the server, the input method may acquire training data at a set period, which reduces both the frequency of training the shared language model and the frequency of data transmission between the terminal device and the server. The set period may be chosen as required, for example 24 hours, which is not limited by the embodiment of the present invention.
In the embodiment of the invention, multiple sets of training data can be generated from the input information, and whether each set of training data has training value is then judged separately. In one example of the invention, a training text and a corresponding reference text may be determined from the input information according to the function of the shared language model, together with a reference probability for the reference text; the reference text and its reference probability constitute the reference result, and the reference probability may be a default value (e.g. 1). A training text and its corresponding reference result then form one set of training data. For example, when the function of the shared language model is association, the front part of a sentence can be taken as the training text and the rear part as the reference text. For the sentence "I have an appointment this evening", "I this evening" may be taken as the training text and "have an appointment" as the reference text with reference probability 1, giving the training data {I this evening; have an appointment (reference probability 1)}. Alternatively, "I today" may be taken as the training text and "this evening have an appointment" as the reference text with reference probability 1, giving {I today; this evening have an appointment (reference probability 1)}; the embodiment of the invention does not limit how the front and rear parts are divided. For another example, when the shared language model provides phonetic-annotation (pronunciation) prediction, a word may be taken as the training text and the word's phonetic annotation as the reference text, with the reference probability of the reference text being 1, which likewise yields one set of training data {word; annotation (reference probability 1)}.
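A minimal sketch of this construction for the association function is given below; the word-level split position and the dictionary layout are illustrative assumptions, since the embodiment does not limit how the sentence is divided.

    # Illustrative construction of one set of training data for the association function.
    from typing import Dict, List

    def build_training_sample(words: List[str], split: int) -> Dict:
        """Front words form the training text, rear words the reference text;
        the reference probability defaults to 1."""
        return {
            "training_text": " ".join(words[:split]),
            "reference_result": {
                "reference_text": " ".join(words[split:]),
                "reference_probability": 1.0,  # default reference probability
            },
        }

    sample = build_training_sample("I have an appointment this evening".split(), split=2)
    # {'training_text': 'I have',
    #  'reference_result': {'reference_text': 'an appointment this evening',
    #                       'reference_probability': 1.0}}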
Whether a set of training data has training value can be judged as described in steps 204 to 206:
Step 204: input the training text into the shared language model for forward calculation to obtain a prediction result.
In the embodiment of the present invention, the training texts in the set of training data may be input into the shared language model; and the shared language model performs forward calculation based on the training text and outputs a corresponding prediction result. Then, whether the training data has the training value for training the shared language model or not can be judged by comparing the prediction result with the reference result in the set of training data; reference may be made to step 206:
Step 206: judge whether the training data has training value for training the shared language model according to the prediction result and the reference result.
Wherein the predicted results output by the shared language model may include multiple groups, and a group of predicted results may include predicted text and corresponding predicted probabilities.
This step 206 may include the following sub-steps:
and a substep 22 of judging whether the predicted text with the maximum prediction probability is matched with the reference text.
And a substep 24, if the prediction text with the maximum prediction probability is not matched with the reference text, determining that the training data has the training value for training the shared language model.
And a substep 26, if the predicted text with the maximum prediction probability is matched with the reference text, determining that the training data does not have the training value for training the shared language model.
During the user's input, when the input method uses the shared language model for prediction, it can take the N predicted texts with the highest prediction probabilities as candidates and present them for the user to choose from, where N is a positive integer. Therefore, to improve the prediction accuracy of the shared language model, training can aim at making the model output the reference text with the maximum prediction probability. Accordingly, in the embodiment of the invention, whether the shared language model predicts accurately can be judged by checking whether the predicted text with the maximum prediction probability matches the reference text, which in turn determines whether the training data has training value for training the shared language model. The match can be judged by whether the predicted text is identical or similar to the reference text.
If the predicted text with the maximum prediction probability matches the reference text, the shared language model can be considered to predict accurately, and training it on this data would bring no benefit; that is, the training data has no training value, and step 208 may be performed. If the predicted text with the maximum prediction probability does not match the reference text, the shared language model can be considered to predict inaccurately, and training it on this data is worthwhile; that is, the training data has training value, and step 210 may be performed.
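Assuming the forward calculation returns each predicted text with its prediction probability, substeps 22 to 26 can be sketched as below, with matching simplified to exact string equality:

    # Illustrative value judgment: a sample is worth training on only when the
    # model's top-1 prediction fails to match the reference text.
    from typing import Dict

    def has_training_value(predictions: Dict[str, float], reference_text: str) -> bool:
        """`predictions` maps each predicted text to its prediction probability."""
        top_text = max(predictions, key=predictions.get)
        return top_text != reference_text

    # The model already ranks the reference text first, so this sample is discarded.
    print(has_training_value({"have an appointment": 0.7, "am free": 0.3},
                             "have an appointment"))  # False -> no training value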
Step 208: discard the training data.
In the embodiment of the invention, when any set of training data is determined to have no training value, there is no need to train the shared language model with that set of data or to update the model parameters of the shared language model; the set of training data may simply be discarded.
Step 210: determine the prediction probability of the predicted text that matches the reference text, and the error between that prediction probability and the reference probability corresponding to the reference text.
Step 212: perform reverse learning on the shared language model according to the error, and update the model parameters of the shared language model.
In the embodiment of the invention, when any group of training data is determined to have the training value, the model parameters of the shared language model can be updated according to the reference result in the group of training data.
In the embodiment of the invention, the predicted text that matches the reference text can be found among the groups of prediction results, and its prediction probability determined; the error between that prediction probability and the reference probability corresponding to the reference text is then computed. In one example of the invention, a loss function can be computed from this error, and the model parameters of the shared language model can be updated by performing reverse learning (e.g. back-propagation) on the shared language model so as to minimize the loss function.
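As a sketch of steps 210 and 212, the error below is the gap between the reference probability and the probability the model assigns to the matching predicted text; squared error is only one possible loss, since the patent does not fix the form of the loss function.

    # Illustrative error and loss for steps 210-212; squared error is an assumption.
    from typing import Dict

    def prediction_error(predictions: Dict[str, float],
                         reference_text: str,
                         reference_probability: float = 1.0) -> float:
        """Gap between the reference probability and the probability assigned to the
        predicted text that matches the reference text (matching = exact equality)."""
        matched_probability = predictions.get(reference_text, 0.0)
        return reference_probability - matched_probability

    def squared_error_loss(error: float) -> float:
        return 0.5 * error * error  # minimised during reverse learning (back-propagation)

    err = prediction_error({"have an appointment": 0.2, "am busy": 0.6}, "have an appointment")
    print(err, squared_error_loss(err))  # approximately 0.8 and 0.32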
Step 214: upload the updated model parameters to the server.
In the embodiment of the present invention, the model parameters updated by the terminal device may be part of the model parameters of the shared language model or all of them, depending on factors such as the error and the type of loss function used. The updated model parameters are then uploaded to the server.
After the server receives the model parameters uploaded by each terminal device, all the received model parameters can be fused, and then the model parameters of the corresponding dimensionality of the shared language model in the server are updated according to the fused model parameters. In an example of the present invention, the server may perform weighted calculation, such as averaging, on the model parameters uploaded by each terminal device to obtain the fused model parameters.
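One concrete reading of this fusion step is a per-dimension weighted mean over terminals; weighting each terminal by the number of training samples it retained is an illustrative assumption, since the embodiment only requires a weighted calculation such as averaging.

    # Illustrative server-side fusion: per-dimension weighted mean of uploaded parameters.
    from typing import List

    def fuse_weighted(uploads: List[List[float]], weights: List[float]) -> List[float]:
        """Fuse parameters uploaded by several terminals; `weights` could, for example,
        be each terminal's number of retained training samples (an assumption)."""
        total = sum(weights)
        return [sum(w * params[i] for w, params in zip(weights, uploads)) / total
                for i in range(len(uploads[0]))]

    fused = fuse_weighted([[0.1, 0.3], [0.5, 0.7]], weights=[30, 10])
    # fused is approximately [0.2, 0.4]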
The server may then issue the shared language model with the updated model parameters to each terminal device. On one hand, the terminal device can repeat steps 202 to 214 in a loop, so the shared language model is updated continuously. On the other hand, the updated shared model can be used to predict the user's input; since its model parameters have been updated by the server according to the parameters uploaded by all terminal devices, the updated shared model predicts more accurately, which further improves the user's input efficiency.
In summary, in the embodiment of the present invention, training data can be obtained and it is then judged whether the training data has training value for training the shared language model; if it does, the shared language model is trained with the training data to update its model parameters, and the updated model parameters are uploaded to a server. Because only training data with training value is used to train the shared language model, the amount of training data can be effectively reduced, thereby reducing the computation on the terminal device and the amount of data transmitted between the terminal device and the server, achieving the effect of reducing resource consumption. At the same time, the user's privacy is protected.
Secondly, in the embodiment of the invention, the training text can be input into the shared language model for forward calculation to obtain a prediction result, and whether the training data has training value for training the shared language model is then judged according to the prediction result and the reference result. Because the training value of the data is judged by how accurately the shared language model predicts on it, misjudgments about whether the training data has training value can be reduced, further lowering resource consumption.
Thirdly, in the embodiment of the invention, the terminal device can obtain the shared language model with updated model parameters and then use this updated shared model to predict the user's input; since the model parameters of the shared language model are updated by the server according to the parameters uploaded by all terminal devices, the updated shared model predicts more accurately and improves the user's input efficiency.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 3, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
a training data acquisition module 302, configured to acquire training data;
a judging module 304, configured to judge whether the training data has a training value for training a shared language model;
an updating module 306, configured to, if the training data has training value, train the shared language model with the training data to update the model parameters of the shared language model, and upload the updated model parameters to a server.
Referring to fig. 4, a block diagram of an alternative embodiment of a data processing apparatus of the present invention is shown.
In an optional embodiment of the present invention, the training data includes a training text and a reference result; the determining module 304 includes:
a forward calculation submodule 3042, configured to input the training text into the shared language model for forward calculation, so as to obtain a prediction result;
a training value determining submodule 3044, configured to determine whether the training data has a training value for training the shared language model according to the prediction result and the reference result.
In an optional embodiment of the present invention, the prediction result includes predicted texts and corresponding prediction probabilities, and the training value determining sub-module 3044 is configured to judge whether the predicted text with the maximum prediction probability matches the reference text; and if it does not match the reference text, determine that the training data has training value for training the shared language model.
In an optional embodiment of the present invention, the reference result further includes a reference probability corresponding to the reference text, and the updating module 306 is configured to determine a prediction probability of the predicted text matching the reference text, and an error between the prediction probability of the matched predicted text and the reference probability corresponding to the reference text; and carrying out reverse learning on the shared language model according to the error, and updating the model parameters of the shared language model.
In an optional embodiment of the present invention, the apparatus further comprises: a data discarding module 308, configured to discard the training data if the training data does not have a training value.
In an optional embodiment of the present invention, the apparatus further comprises: the model obtaining module 310 is configured to obtain a shared language model after model parameters are updated, where the model parameters of the shared language model are updated by the server according to the model parameters uploaded by each terminal device.
In an alternative embodiment of the invention, the training data comprises information relating to user input.
In summary, in the embodiment of the present invention, training data can be obtained and it is then judged whether the training data has training value for training the shared language model; if it does, the shared language model is trained with the training data to update its model parameters, and the updated model parameters are uploaded to a server. Because only training data with training value is used to train the shared language model, the amount of training data can be effectively reduced, thereby reducing the computation on the terminal device and the amount of data transmitted between the terminal device and the server, achieving the effect of reducing resource consumption.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Fig. 5 is a block diagram illustrating a structure of an electronic device 500 for data processing according to an example embodiment. For example, the electronic device 500 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 5, electronic device 500 may include one or more of the following components: a processing component 502, a memory 504, a power component 506, a multimedia component 508, an audio component 510, an input/output (I/O) interface 512, a sensor component 514, and a communication component 516.
The processing component 502 generally controls overall operation of the electronic device 500, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing elements 502 may include one or more processors 520 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 502 can include one or more modules that facilitate interaction between the processing component 502 and other components. For example, the processing component 502 can include a multimedia module to facilitate interaction between the multimedia component 508 and the processing component 502.
The memory 504 is configured to store various types of data to support operation at the device 500. Examples of such data include instructions for any application or method operating on the electronic device 500, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 504 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 506 provides power to the various components of the electronic device 500. Power components 506 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device 500.
The multimedia component 508 includes a screen that provides an output interface between the electronic device 500 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 508 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 500 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 510 is configured to output and/or input audio signals. For example, the audio component 510 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 500 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 504 or transmitted via the communication component 516. In some embodiments, audio component 510 further includes a speaker for outputting audio signals.
The I/O interface 512 provides an interface between the processing component 502 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 514 includes one or more sensors for providing various aspects of status assessment for the electronic device 500. For example, the sensor assembly 514 may detect an open/closed state of the device 500, the relative positioning of components, such as a display and keypad of the electronic device 500, the sensor assembly 514 may detect a change in the position of the electronic device 500 or a component of the electronic device 500, the presence or absence of user contact with the electronic device 500, orientation or acceleration/deceleration of the electronic device 500, and a change in the temperature of the electronic device 500. The sensor assembly 514 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 514 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 514 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 516 is configured to facilitate wired or wireless communication between the electronic device 500 and other devices. The electronic device 500 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 516 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 516 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 500 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 504 comprising instructions, executable by the processor 520 of the electronic device 500 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform a data processing method, the method comprising: acquiring training data; judging whether the training data has a training value for training a shared language model; and if the training data has a training value, training the shared language model through the training data to update the model parameters of the shared language model, and uploading the updated model parameters to a server.
Optionally, the training data comprises training text and reference results; the judging whether the training data has a training value for training a shared language model comprises: inputting the training text into the shared language model for forward calculation to obtain a prediction result; and judging whether the training data has a training value for training a shared language model or not according to the prediction result and the reference result.
Optionally, the prediction result comprises predicted texts and corresponding prediction probabilities; judging whether the training data has training value for training the shared language model according to the prediction result and the reference result comprises: judging whether the predicted text with the maximum prediction probability matches the reference text; and if the predicted text with the maximum prediction probability does not match the reference text, determining that the training data has training value for training the shared language model.
Optionally, the reference result further includes a reference probability corresponding to a reference text, and the training the shared language model through the training data to update the model parameters of the shared language model includes: determining the prediction probability of a predicted text matched with a reference text and the error between the prediction probability of the matched predicted text and the reference probability corresponding to the reference text; and carrying out reverse learning on the shared language model according to the error, and updating the model parameters of the shared language model.
Optionally, the method further comprises: if the training data does not have training value, discarding the training data.
Optionally, the method further comprises: and obtaining the shared language model after the model parameters are updated, wherein the model parameters of the shared language model are updated by the server according to the model parameters uploaded by each terminal device.
Optionally, the training data comprises information relating to user input.
Fig. 6 is a schematic structural diagram of an electronic device 600 for data processing according to another exemplary embodiment of the present invention. The electronic device 600 may be a server, which may vary greatly due to different configurations or capabilities, and may include one or more Central Processing Units (CPUs) 622 (e.g., one or more processors) and memory 632, one or more storage media 630 (e.g., one or more mass storage devices) storing applications 642 or data 644. Memory 632 and storage medium 630 may be, among other things, transient or persistent storage. The program stored in the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 622 may be configured to communicate with the storage medium 630 to execute a series of instruction operations in the storage medium 630 on the server.
The server may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input-output interfaces 658, one or more keyboards 656, and/or one or more operating systems 641, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
An electronic device comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for: acquiring training data; judging whether the training data has a training value for training a shared language model; and if the training data has a training value, training the shared language model through the training data to update the model parameters of the shared language model, and uploading the updated model parameters to a server.
Optionally, the training data comprises training text and reference results; the judging whether the training data has a training value for training a shared language model comprises: inputting the training text into the shared language model for forward calculation to obtain a prediction result; and judging whether the training data has a training value for training a shared language model or not according to the prediction result and the reference result.
Optionally, the prediction result includes predicted texts and corresponding prediction probabilities, and determining whether the training data has training value for training a shared language model according to the prediction result and the reference result includes: judging whether the predicted text with the maximum prediction probability matches the reference text; and if it does not match the reference text, determining that the training data has training value for training the shared language model.
Optionally, the reference result further includes a reference probability corresponding to a reference text, and the training the shared language model through the training data to update the model parameters of the shared language model includes: determining the prediction probability of a predicted text matched with a reference text and the error between the prediction probability of the matched predicted text and the reference probability corresponding to the reference text; and carrying out reverse learning on the shared language model according to the error, and updating the model parameters of the shared language model.
Optionally, further comprising instructions for: if the training data does not have training value, discarding the training data.
Optionally, further comprising instructions for: and obtaining the shared language model after the model parameters are updated, wherein the model parameters of the shared language model are updated by the server according to the model parameters uploaded by each terminal device.
Optionally, the training data comprises information relating to user input.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The data processing method, the data processing apparatus and the electronic device provided by the present invention are described in detail above, and specific examples are applied herein to illustrate the principles and embodiments of the present invention, and the descriptions of the above embodiments are only used to help understand the method and the core ideas of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A data processing method, comprising:
acquiring training data;
determining whether the training data has training value for training a shared language model; and
if the training data has training value, training the shared language model with the training data to update model parameters of the shared language model, and uploading the updated model parameters to a server.
2. The method of claim 1, wherein the training data comprises training text and a reference result, and wherein the determining whether the training data has training value for training the shared language model comprises:
inputting the training text into the shared language model for forward calculation to obtain a prediction result; and
determining, according to the prediction result and the reference result, whether the training data has training value for training the shared language model.
3. The method of claim 2, wherein the prediction result comprises predicted texts and corresponding prediction probabilities, and wherein the determining, according to the prediction result and the reference result, whether the training data has training value for training the shared language model comprises:
determining whether the predicted text with the maximum prediction probability matches the reference text; and
if the predicted text with the maximum prediction probability does not match the reference text, determining that the training data has training value for training the shared language model.
4. The method of claim 3, wherein the reference result further comprises a reference probability corresponding to the reference text, and wherein the training the shared language model with the training data to update the model parameters of the shared language model comprises:
determining the prediction probability of the predicted text that matches the reference text, and an error between that prediction probability and the reference probability corresponding to the reference text; and
performing backward learning on the shared language model according to the error to update the model parameters of the shared language model.
5. The method of claim 2, further comprising:
if the training data does not have training value, discarding the training data.
6. The method of claim 1, further comprising:
obtaining the shared language model with the updated model parameters, wherein the model parameters of the shared language model are updated by the server according to model parameters uploaded by respective terminal devices.
7. The method of any one of claims 1 to 6, wherein the training data comprises information related to user input.
8. A data processing apparatus, comprising:
a training data acquisition module, configured to acquire training data;
a determining module, configured to determine whether the training data has training value for training a shared language model; and
an updating module, configured to, if the training data has training value, train the shared language model with the training data to update model parameters of the shared language model, and upload the updated model parameters to a server.
9. A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method according to any one of claims 1 to 7.
10. An electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for:
acquiring training data;
determining whether the training data has training value for training a shared language model; and
if the training data has training value, training the shared language model with the training data to update model parameters of the shared language model, and uploading the updated model parameters to a server.
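For illustration only, the following sketch shows one possible device-side realization of the flow recited in claims 1 to 5. It assumes a small next-word language model implemented in PyTorch that maps tokenized input to a logits vector over the vocabulary; the function names (has_training_value, local_update, process_sample), the vocab list, the upload_fn callback, and the squared-error loss are all hypothetical choices, not prescribed by this application.

```python
# Hypothetical device-side sketch (not the claimed implementation): judge the
# training value of a sample by forward calculation, train locally only when the
# top prediction misses the reference text, then upload the updated parameters.
import torch
import torch.nn.functional as F


def has_training_value(model, input_ids, reference_text, vocab):
    """Forward calculation: training value exists when the predicted text with the
    maximum prediction probability does not match the reference text."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(input_ids), dim=-1)         # prediction probabilities
    predicted_text = vocab[int(probs.argmax(dim=-1))]       # predicted text with max probability
    return predicted_text != reference_text


def local_update(model, optimizer, input_ids, reference_text, reference_prob, vocab):
    """Backward learning on the error between the matched prediction probability
    and the reference probability; returns a copy of the updated parameters."""
    model.train()
    probs = F.softmax(model(input_ids), dim=-1)
    matched_prob = probs[vocab.index(reference_text)]
    loss = (matched_prob - reference_prob) ** 2              # assumed squared-error term
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return {name: p.detach().clone() for name, p in model.named_parameters()}


def process_sample(model, optimizer, sample, vocab, upload_fn):
    """Claim-1 flow: acquire data, judge training value, train or discard, upload."""
    input_ids, reference_text, reference_prob = sample
    if has_training_value(model, input_ids, reference_text, vocab):
        upload_fn(local_update(model, optimizer, input_ids, reference_text,
                               reference_prob, vocab))
    # otherwise the sample has no training value and is discarded (claim 5)
```

In practice the tokenization, the error term, and the upload format would follow whatever shared language model and server protocol are actually deployed; the sketch only mirrors the control flow of acquiring data, judging training value, training locally, and uploading the updated parameters.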
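Claim 6 states only that the server updates the shared language model according to the model parameters uploaded by the terminal devices; the aggregation rule is not specified. A minimal sketch, assuming plain federated-averaging-style parameter averaging, with hypothetical function names aggregate and refresh_shared_model:

```python
# Hypothetical server-side sketch: average the parameter sets uploaded by the
# terminal devices and load the result so terminals can fetch the updated model.
import torch


def aggregate(uploaded_param_sets):
    """Element-wise mean over the parameter dictionaries uploaded by the terminals."""
    if not uploaded_param_sets:
        raise ValueError("no uploaded parameters to aggregate")
    return {
        name: torch.stack([params[name] for params in uploaded_param_sets]).mean(dim=0)
        for name in uploaded_param_sets[0]
    }


def refresh_shared_model(model, uploaded_param_sets):
    """Update the shared language model that terminal devices later obtain (claim 6)."""
    model.load_state_dict(aggregate(uploaded_param_sets), strict=False)
    return model
```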
CN201911136852.2A 2019-11-19 2019-11-19 Data processing method and device and electronic equipment Pending CN112825247A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911136852.2A CN112825247A (en) 2019-11-19 2019-11-19 Data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911136852.2A CN112825247A (en) 2019-11-19 2019-11-19 Data processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112825247A 2021-05-21

Family

ID=75906179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911136852.2A Pending CN112825247A (en) 2019-11-19 2019-11-19 Data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112825247A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107508866A * 2017-08-08 2017-12-22 重庆大学 Method for reducing the transmission cost of updating a neural network model on a mobile device
CN108108428A * 2017-12-18 2018-06-01 苏州思必驰信息科技有限公司 Method, input method and system for building a language model
CN109284313A * 2018-08-10 2019-01-29 深圳前海微众银行股份有限公司 Federated modeling method, device and readable storage medium based on semi-supervised learning
CN110262819A * 2019-06-04 2019-09-20 深圳前海微众银行股份有限公司 Model parameter updating method and device for federated learning
CN110297848A * 2019-07-09 2019-10-01 深圳前海微众银行股份有限公司 Recommendation model training method, terminal and storage medium based on federated learning
KR20190103090A * 2019-08-15 2019-09-04 엘지전자 주식회사 Method and apparatus for learning a model to generate POI data using federated learning

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024065755A1 (en) * 2022-09-30 2024-04-04 华为技术有限公司 Communication method and apparatus

Similar Documents

Publication Publication Date Title
CN109243430B (en) Voice recognition method and device
CN111160448B (en) Training method and device for image classification model
CN113362812B (en) Voice recognition method and device and electronic equipment
CN109961791B (en) Voice information processing method and device and electronic equipment
CN109961094B (en) Sample acquisition method and device, electronic equipment and readable storage medium
CN109447125B (en) Processing method and device of classification model, electronic equipment and storage medium
CN110992979B (en) Detection method and device and electronic equipment
CN110929176A (en) Information recommendation method and device and electronic equipment
US20210089726A1 (en) Data processing method, device and apparatus for data processing
CN111160047A (en) Data processing method and device and data processing device
CN110764627A (en) Input method and device and electronic equipment
CN111739535A (en) Voice recognition method and device and electronic equipment
CN113362813A (en) Voice recognition method and device and electronic equipment
CN110941727A (en) Resource recommendation method and device, electronic equipment and storage medium
CN111831132A (en) Information recommendation method and device and electronic equipment
CN108549641B (en) Song evaluation method, device, equipment and storage medium
CN110968246A (en) Intelligent Chinese handwriting input recognition method and device
CN112712385B (en) Advertisement recommendation method and device, electronic equipment and storage medium
CN112784151B (en) Method and related device for determining recommended information
CN112825247A (en) Data processing method and device and electronic equipment
CN109887492B (en) Data processing method and device and electronic equipment
CN112331194A (en) Input method and device and electronic equipment
CN114268815A (en) Video quality determination method and device, electronic equipment and storage medium
CN111611339B (en) Recommendation method and related device for inputting related users
CN112818841A (en) Method and related device for recognizing user emotion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210521