CN112684907B

CN112684907B - Text input method, device, equipment and storage medium

Info

Publication number: CN112684907B
Application number: CN202011554136.9A
Authority: CN
Inventors: 刘中媛; 吴志强
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2020-12-24
Filing date: 2020-12-24
Publication date: 2024-04-26
Anticipated expiration: 2040-12-24
Also published as: CN112684907A

Abstract

The application provides a text input method, a text input device, text input equipment and a storage medium, wherein the text input method comprises the following steps: acquiring a candidate text corresponding to a current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation; respectively determining the matching degree of each candidate text and the current editing scene; and determining the content of the input text corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing procedure, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the user input operation is remarkably simplified, and the user input efficiency is improved.

Description

Text input method, device, equipment and storage medium

Technical Field

The present application relates to the field of input methods, and in particular, to a text input method, apparatus, device, and storage medium.

Background

Text input is an indispensable operation when a user uses an electronic product, and an input method is the main tool with which the user inputs text on the electronic product.

When a user uses an electronic product, the input text is in many cases the same or similar; in particular, the text content input by the user in the same application scene is highly similar. Therefore, if the text input in historical input operations can be used to determine the current input content, user input efficiency can be improved and user input operation can be simplified.

Disclosure of Invention

Based on the state of the art, the application provides a text input method, a text input device, text input equipment and a storage medium, which can remarkably simplify user input operation and improve user text input efficiency.

The technical scheme provided by the application is as follows:

a text input method, comprising:

acquiring a candidate text corresponding to a current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation;

determining, for each candidate text, the matching degree between the candidate text and the current editing scene;

and determining, from the candidate texts, the input text content corresponding to the current input operation according to the matching degree between each candidate text and the current editing scene.

Optionally, the obtaining the candidate text corresponding to the current input operation includes:

extracting text entities from texts corresponding to current input operation and texts corresponding to historical input operation matched with the current input operation respectively;

aggregating the extracted text entities according to the rule that each text entity category contains only the latest text entity, to obtain a text entity set in which each element corresponds to one category of text entity;

and determining the text in which each text entity in the text entity set is located as a candidate text corresponding to the current input operation.

Optionally, the extracting text entities from the text corresponding to the current input operation and the text corresponding to the history input operation matched with the current input operation includes:

extracting text entities from the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation by using preset regular expressions;

or

using a pre-trained entity recognition model to recognize text entities in the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation;

wherein the entity recognition model is trained at least to recognize text entities in text samples.

Alternatively, when a text entity cannot be extracted from a text using a regular expression, the text entity is identified from the text using a pre-trained entity identification model.

Optionally, the determining the matching degree of each candidate text and the current editing scene includes:

Respectively determining text entities in each candidate text;

calculating a matching score of each text entity and the current editing scene;

and determining the matching degree of the candidate text and the current editing scene based on the matching score of the text entity contained in the candidate text and the current editing scene.

Optionally, the calculating a matching score of each text entity and the current editing scene includes:

and calculating and determining a matching score of the text entity and the current editing scene at least according to the text entity category, the current application type and the current input box type.

Optionally, the determining, according to the matching degree between each candidate text and the current editing scene, the content of the input text corresponding to the current input operation from the candidate texts includes:

determining whether, among the candidate texts, there is a candidate text whose matching degree with the current editing scene is greater than or equal to a set first matching degree threshold;

if there is such a candidate text, determining the text entity in that candidate text as the input text content corresponding to the current input operation, and filling the determined input text content into the input box corresponding to the current input operation.

Optionally, when there are a plurality of candidate texts whose matching degree with the current editing scene is greater than or equal to the set first matching degree threshold, text entities are extracted from each of these candidate texts, and the text entity with the highest matching degree with the current editing scene is selected from the extracted text entities as the input text content corresponding to the current input operation.

Optionally, when no candidate text has a matching degree with the current editing scene that is greater than or equal to the set first matching degree threshold, the method further includes:

determining whether, among the candidate texts, there is a candidate text whose matching degree with the current editing scene is greater than or equal to a set second matching degree threshold and smaller than the first matching degree threshold, wherein the second matching degree threshold is smaller than the first matching degree threshold;

and if there is such a candidate text, extracting the text entity in that candidate text as the input text content corresponding to the current input operation.

Optionally, when there is a candidate text whose matching degree with the current editing scene is greater than or equal to the set second matching degree threshold and smaller than the first matching degree threshold, the method further includes:

sorting the candidate texts in descending order of their matching degree with the current editing scene to obtain a candidate text sequence;

and outputting the candidate text sequence.
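The two-threshold logic of the optional features above can be sketched as follows. This is only an illustrative reading of the text, not the claimed implementation: the `Candidate` type, the function name `decide`, and the threshold values 0.8 and 0.5 are assumptions made for the example. In the second tier the sketch returns the ranked candidates for output and leaves the input box untouched, which is one possible interpretation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    text: str      # full candidate text
    entity: str    # text entity extracted from the candidate text
    score: float   # matching degree with the current editing scene


def decide(candidates: List[Candidate],
           first_threshold: float = 0.8,   # assumed value
           second_threshold: float = 0.5,  # assumed value, below first_threshold
           ) -> Tuple[Optional[str], List[Candidate]]:
    """Return (entity to fill in automatically, ranked candidates to output)."""
    # Tier 1: some candidate reaches the first threshold, so the entity with
    # the highest matching degree is filled into the input box directly.
    strong = [c for c in candidates if c.score >= first_threshold]
    if strong:
        best = max(strong, key=lambda c: c.score)
        return best.entity, []

    # Tier 2: no candidate reaches the first threshold. Candidates whose score
    # lies in [second_threshold, first_threshold) are sorted from high to low.
    medium = [c for c in candidates
              if second_threshold <= c.score < first_threshold]
    ranked = sorted(medium, key=lambda c: c.score, reverse=True)
    return None, ranked
```

Returning both values from one function keeps the caller simple: a non-empty first element means "fill in", otherwise the second element is the sequence to display.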

Optionally, the current input operation is an operation of clicking an input box after performing text copying or cutting operation;

The obtaining the candidate text corresponding to the current input operation comprises the following steps:

acquiring all texts stored in the clipboard as candidate texts corresponding to the current input operation, wherein the clipboard stores the text copied or cut this time and the texts copied or cut in historical input operations.

A text input device, comprising:

A text preselection unit for acquiring a candidate text corresponding to the current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation;

The text analysis unit is used for respectively determining the matching degree of each candidate text and the current editing scene;

and the input text determining unit is used for determining the content of the input text corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene.

Text input equipment, comprising:

a memory and a processor;

the memory is connected with the processor and used for storing programs;

The processor is configured to implement the text input method by running the program in the memory.

A storage medium having stored thereon a computer program which, when executed by a processor, implements the text input method described above.

The text input method provided by the application is characterized in that the text corresponding to the current input operation and the text corresponding to the history input operation matched with the current input operation are used as candidate texts corresponding to the current input operation together. And then, respectively determining the matching degree of each candidate text and the current editing scene, and determining the content of the input text corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing procedure, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the user input operation is remarkably simplified, and the user input efficiency is improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a text input method according to an embodiment of the present application;

FIG. 2 is a schematic diagram of a structure of an entity recognition model according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a matching degree score distribution of a text sample and an editing scene according to an embodiment of the present application;

FIG. 4 is a schematic diagram of candidate text ranking provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of a text input device according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of text input equipment according to an embodiment of the present application.

Detailed Description

The technical scheme of the embodiment of the application is suitable for the application scene of text input, and can simplify the user input operation and improve the text input efficiency.

Input method applications are the primary tools for users to enter text information into electronic devices. In order to improve the user input efficiency, the input method application program generally has a memory function, for example, a word group or a phrase selected in the user history input is recorded in an input method word stock, and text content copied by the user is stored in an input method clipboard.

When a user uses an electronic device, the input content is in many cases the same or similar; in particular, content input in the same scene or for the same service is highly similar. In such cases, the input method application can predict the user's input intention from a small number of input operations and screen, from the historical input content, texts that conform to that intention for the user to select, so that the user does not have to repeat input operations and input efficiency is improved.

Existing input method applications can only offer relevant historical input texts according to the user's current input; the user must still select a suitable text from those offered, or select part of a text, as the input content. The process therefore still depends on user operation, and the input text content corresponding to the current input operation cannot be determined automatically.

In view of the above state of the art, the embodiments of the present application provide a text input method, which can automatically determine the content of an input text corresponding to a current input operation according to the current input operation and a historical user input operation, so as to completely avoid the operation of selecting an input text by a user, further improve the user input efficiency, and simplify the user input operation.

The text input method provided by the embodiment of the application can be applied to software programs such as input method application programs or hardware processing equipment such as processors, so that the application programs or the processing equipment can automatically determine complete input text content based on simple input operation of users.

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

An embodiment of the present application provides a text input method, as shown in fig. 1, including:

S101, acquiring a candidate text corresponding to the current input operation.

The candidate text at least comprises a text corresponding to the current input operation and a text corresponding to the historical input operation.

Specifically, the current input operation refers to an input editing operation performed by the user in an input box of the application, where the input operation includes, but is not limited to, a text input operation, a text paste operation, or the like.

The text input method provided by the embodiment of the application aims to determine the complete input text content based on a small number of input operations executed by the user, thereby simplifying the user's input operations, improving the user's input efficiency, and achieving the text input purpose without the complete input operation being executed. Thus, the current input operation described above is not a complete input operation, such as the user entering all of the text content to be input; rather, it is an incomplete input operation, i.e., a partial input operation performed by the user toward the final text input.

For example, assuming that the final purpose of the user's current input is to input a long text, the current input operation described above may be the user inputting individual words in the long text, or a portion of pinyin characters of individual words, such as initials, etc.; assuming that the purpose of the user's current input operation is to paste a piece of text into the target input box, the current input operation may be an operation in which the user clicks into the target input box after performing text copying or cutting.

When the current input operation of the user is a text input operation, the text corresponding to the current input operation may be the text input by the user in the current input operation, or text related to it, such as a related phrase, a similar phrase, or a long text containing the text content input by the user; when the current input operation of the user is a text pasting operation, the text corresponding to the current input operation is the text copied or cut by the user.

The above-mentioned history input operation matched with the current input operation refers to an input operation which is performed by the user using the input method application program in the history process and is the same as or similar to the input content or the input operation type of the current input operation, and the history input operation includes but is not limited to a text input operation, a text paste operation, or the like.

When the history input operation is a text input operation, the text corresponding to the history input operation is the text input in the history text input operation; when the history input operation is a text pasting operation, the text corresponding to the history input operation is the text copied or cut by the user in the history text pasting operation.

The text copied or cut by the user when executing the text pasting operation is stored in a clipboard of the input method application program.

At the current moment, when a user executes an input operation, the embodiment of the application acquires a text corresponding to the current input operation and a text corresponding to a history input operation matched with the current input operation as candidate texts corresponding to the current input operation.

As a preferred implementation manner, when the input operation currently executed by the user is a text input operation, acquiring a text corresponding to the current text input operation and a text corresponding to a history text input operation matched with the current input operation as candidate texts corresponding to the current input operation; when the input operation currently executed by the user is a text pasting operation, the text copied or cut by the user in the current text pasting operation and the text copied or cut by the user in the historical text pasting operation are obtained and used as candidate texts corresponding to the current input operation.

For example, when the user performs a text copying or cutting operation and then clicks an input box, the text copied or cut by the user is already stored in the clipboard. When step S101 is performed to obtain the candidate texts corresponding to the current input operation, all texts stored in the clipboard may be used as the candidate texts corresponding to the current input operation.
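Step S101 can be sketched roughly as below. The operation labels `"type"` and `"paste"`, the function name `get_candidates`, and the use of substring containment to decide which historical inputs are "matched" are all assumptions made for illustration; the text does not prescribe a specific matching rule.

```python
from typing import List


def get_candidates(op_type: str, current_text: str,
                   typed_history: List[str], clipboard: List[str]) -> List[str]:
    """Collect candidate texts for the current input operation.

    op_type is "type" or "paste" (hypothetical labels). The clipboard list is
    assumed to hold the text copied or cut this time together with the texts
    copied or cut in earlier operations.
    """
    if op_type == "paste":
        # Paste: every text stored in the clipboard is a candidate.
        return list(clipboard)
    if op_type == "type":
        # Typing: the current text plus historical inputs that contain it.
        # Substring containment stands in for "matched" history here.
        matched = [h for h in typed_history if current_text and current_text in h]
        return [current_text] + [h for h in matched if h != current_text]
    raise ValueError(f"unknown operation type: {op_type}")
```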

S102, respectively determining the matching degree of each candidate text and the current editing scene.

Specifically, the current editing scene refers to an editing scene in which a user currently performs a text input operation, and is a text input scene formed by factors such as a type of a current application and a type of a current input box.

In general, there is a clear association between text editing scenes and text input content. For example, in a shopping application, the content input by the user on the order page is usually information such as the name, address, telephone, etc. of the consignee; in the verification code input text box, most of contents input by a user are digital verification codes; in a video barrage input text box, the user-entered content is typically text content associated with the video content.

Based on the association relation between the input text and the editing scene, whether the text can be used as the input text content in the current editing scene can be determined by judging whether the text is matched with the editing scene. Theoretically, a text matching the current editing scene can be used as the current input text content, while a text not matching the current editing scene is not suitable as the current input text content.

Based on the above, the embodiment of the application respectively determines the matching degree of each candidate text in the candidate texts corresponding to the current input operation and the current editing scene. The matching degree of the candidate text and the current editing scene can be used for representing the feasibility of the candidate text as the input text content in the current editing scene. If the matching degree of the candidate text and the current editing scene is higher, the feasibility of the candidate text as the input text content in the current editing scene is stronger, and if the matching degree of the candidate text and the current editing scene is lower, the feasibility of the candidate text as the input text content in the current editing scene is weaker.

Illustratively, the matching degree of the candidate text and the current editing scene can be measured by the matching degree of the text content of the candidate text and the factors such as the type of the current application, the type of the current input box and the like. If the matching degree of the text content of the candidate text and the factors such as the type of the current application, the type of the current input box and the like is higher, the matching degree of the candidate text and the current editing scene is higher; if the matching degree of the text content of the candidate text and the factors such as the type of the current application, the type of the current input box and the like is low, the matching degree of the candidate text and the current editing scene is low.

For example, assuming that the current input box is a "recipient" input box of the shopping order page, if the content of the candidate text is name content, it may be determined that the content of the candidate text matches the type of input content required by the "recipient" input box, and thus it may be determined that the matching degree of the candidate text and the current editing scene is higher; if the content of the candidate text is telephone number content, it may be determined that the text content does not match the type of input content required by the "recipient" input box, and therefore the candidate text has a low degree of matching with the current editing scene.
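One simple way to realize such a matching degree is a lookup keyed by entity category, application type, and input box type. The sketch below is an assumption-laden illustration: the table `SCENE_SCORES`, its keys and score values, the `"any"` wildcard, and the choice of taking the maximum entity score as the matching degree of a candidate text are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical table: (entity category, application type, input box type) -> score.
SCENE_SCORES: Dict[Tuple[str, str, str], float] = {
    ("name", "shopping", "recipient"): 0.9,
    ("phone", "shopping", "recipient"): 0.1,
    ("phone", "shopping", "phone"): 0.9,
    ("verification_code", "any", "verification_code"): 0.95,
}


def match_score(category: str, app_type: str, box_type: str) -> float:
    """Look up how well an entity category fits the current editing scene."""
    # Try the exact scene first, then an application-independent entry.
    for key in ((category, app_type, box_type), (category, "any", box_type)):
        if key in SCENE_SCORES:
            return SCENE_SCORES[key]
    return 0.0  # unknown combinations are treated as not matching


def text_match_degree(categories: List[str], app_type: str, box_type: str) -> float:
    """Matching degree of a candidate text, taken here as its best entity score."""
    return max((match_score(c, app_type, box_type) for c in categories), default=0.0)
```

With this table, a name entity scores higher than a telephone number for the "recipient" input box of a shopping application, mirroring the example above.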

S103, according to the matching degree of each candidate text and the current editing scene, determining the content of the input text corresponding to the current input operation from the candidate texts.

For example, the candidate text with a higher matching degree (for example, highest or higher than a certain threshold value) with the current editing scene may be selected from the candidate texts directly according to the matching degree of the candidate text with the current editing scene, and the candidate text is used as the input text content corresponding to the current input operation.

Alternatively, a candidate text whose matching degree with the current editing scene meets a preset requirement is selected from the candidate texts, and text content is then extracted from the selected candidate text as the input text content corresponding to the current input operation.

The input text content determined in the above manner is text content matched with the current editing scene, so that the input text content can be directly input into an input box corresponding to the current input operation, namely, text input is realized.
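The two variants of step S103 described above (using the selected candidate text as is, or extracting content from it) can be sketched with pluggable functions. The name `select_input_text` and the parameters `score`, `threshold`, and `extract` are hypothetical and only serve the illustration.

```python
from typing import Callable, List, Optional


def select_input_text(candidates: List[str],
                      score: Callable[[str], float],
                      threshold: float,
                      extract: Optional[Callable[[str], str]] = None) -> Optional[str]:
    """Pick the input text content from the candidate texts.

    score   -- returns the matching degree of a candidate with the current scene
    extract -- optional; pulls the relevant content out of the chosen candidate
    """
    if not candidates:
        return None
    best = max(candidates, key=score)
    if score(best) < threshold:
        return None  # no candidate matches the scene well enough
    # Variant 1: use the candidate as is. Variant 2: extract content from it.
    return extract(best) if extract else best
```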

As can be seen from the above description, the text input method provided by the embodiment of the present application uses the text corresponding to the current input operation and the text corresponding to the history input operation matched with the current input operation together as the candidate text corresponding to the current input operation. And then, respectively determining the matching degree of each candidate text and the current editing scene, and determining the content of the input text corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing procedure, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the user input operation is remarkably simplified, and the user input efficiency is improved.

For example, when acquiring the candidate text corresponding to the current input operation, the text corresponding to the current input operation and all the texts corresponding to the history input operation matched with the current input operation may be acquired together as the candidate text corresponding to the current input operation.

According to the method, the candidate text corresponding to the current input operation is obtained, so that the richness of the candidate text can be ensured, and the content of the input text can be more comprehensively selected from the candidate text through subsequent processing.

For example, assume that the user performs a text copying or cutting operation and then clicks into a text box to perform a text pasting operation. The text copied or cut this time and the texts copied or cut in historical text pasting operations are then used together as the candidate texts corresponding to the current input operation, that is, the candidate texts for the text pasting operation the user is about to perform, and the text content to be pasted is determined from these candidate texts. The texts copied or cut in historical text pasting operations can be obtained by querying the input method clipboard.

In some scenarios, the content that the user really wants to input is the most recently input content, or only the key part of the content input this time or recently. For example, the user has copied the text "your mobile phone verification code 343456 is available for login registration", but what the user really wants to paste into the input box is only the verification code "343456" in that text. In this case, if many texts containing verification codes are all used as candidate texts, it becomes harder to determine the correct verification code from the candidate texts.

Based on the above situation, the embodiment of the present application obtains the candidate text corresponding to the current input operation by:

First, text entities are extracted from texts corresponding to a current input operation and texts corresponding to historical input operations matched with the current input operation respectively.

The specific text entity extraction methods are described in the embodiments below.

Then, the extracted text entities are aggregated according to the rule that each text entity category contains only the latest text entity, to obtain a text entity set in which each element corresponds to one category of text entity.

The text entity category refers to a type to which the content of the text entity belongs.

For example, assume that the existing texts are "Li Si's telephone is 137XXXXXXXX", "I live at Room 101, Building 1, Happy Community, XX District, XX City, XX Province; send it to this address", and "your mobile phone verification code 343456 is available for login registration". Extracting text entities from these three texts yields text entities of the following categories:

Text entity category	Text entity
Name	Li Si
Telephone number	137XXXXXXXX
Address	Building 1, Happy Community, XX District, XX City, XX Province
Verification code	343456

It will be appreciated that when there are more texts, text entities of the same category but with different content may be extracted, for example more names, such as Zhang San, Wang, etc., and more telephone numbers or addresses.

In the embodiment of the application, the text entities extracted from all texts are aggregated by category, and for each category only the latest text entity is retained, namely the text entity of that category extracted from the most recent text.

In the above manner, the obtained text entities may form a text entity set, and each element in the set corresponds to a text entity of a category.

Finally, the text in which each text entity in the text entity set is located is determined as a candidate text corresponding to the current input operation.

Specifically, for each text entity in the text entity set, the text in which that text entity is located is determined as a candidate text corresponding to the current input operation.
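The aggregation rule and the final candidate selection can be sketched as follows. The record layout (timestamp, source text, category, entity) and the function names are assumptions introduced for the example; the text only requires that one latest entity per category be kept and that the texts containing those entities become the candidates.

```python
from typing import Dict, List, Tuple

# Each record: (timestamp, source text, entity category, entity string).
Record = Tuple[int, str, str, str]


def latest_entity_per_category(records: List[Record]) -> Dict[str, Record]:
    """Keep, for every entity category, only the record from the most recent text."""
    latest: Dict[str, Record] = {}
    for rec in records:
        timestamp, _text, category, _entity = rec
        if category not in latest or timestamp > latest[category][0]:
            latest[category] = rec
    return latest


def candidate_texts(records: List[Record]) -> List[str]:
    """Return the texts containing the retained entities, without duplicates."""
    texts: List[str] = []
    for rec in latest_entity_per_category(records).values():
        if rec[1] not in texts:
            texts.append(rec[1])
    return texts
```

An older text whose only entity has been superseded by a newer entity of the same category, such as an outdated verification code message, drops out of the candidate list.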

As an exemplary implementation, text entities may be extracted from the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation by using preset regular expressions.

A regular expression is used to screen text entities from a text: when text content conforms to the regular expression, it can be confirmed to be a text entity matching that regular expression; when it does not conform, it can be confirmed not to be such a text entity.

The embodiment of the application presets the following regular expressions for text entity extraction:

A) Telephone: "telephone number of handset?

B) Address: "(address|live)(:|is)?([\u4e00-\u9fa5]+)[^\u4e00-\u9fa5]"

C) Addressee: "addressee(:|is)?([\u4e00-\u9fa5]{2,4})"

D) Verification code: "verification code(is)?([0-9]{6})[^0-9]"

Each text from which text entities are to be extracted is matched against all of the preset regular expressions; if text content in the text matches one of the regular expressions, that text content is determined to be a text entity of the corresponding category. Processed in this way, text entities such as telephone numbers, addresses, addressees, and verification codes can be extracted from the text.

It should be noted that, the regular expression may be flexibly set according to the field to which the processed text belongs and the text entity extraction requirement, and the embodiment of the present application is not limited.
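Regular-expression extraction of this kind can be sketched with Python's `re` module. The patterns below are illustrative English stand-ins and not the expressions of the embodiment: the keywords, the 11-digit telephone format, and the names `PATTERNS` and `extract_entities` are assumptions. A negative lookahead is used instead of a trailing negated character class so that an entity at the very end of a text still matches.

```python
import re
from typing import Dict, List, Tuple

# Illustrative stand-in patterns; keywords and formats are assumptions.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "phone": re.compile(
        r"(?:telephone|phone)(?: number)?(?: is|:)?\s*([0-9]{11})(?![0-9])"),
    "verification_code": re.compile(
        r"verification code(?: is|:)?\s*([0-9]{6})(?![0-9])"),
}


def extract_entities(text: str) -> List[Tuple[str, str]]:
    """Return (category, entity) pairs for every pattern that matches the text."""
    found: List[Tuple[str, str]] = []
    for category, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Group 1 holds the entity itself, without the leading keyword.
            found.append((category, match.group(1)))
    return found
```

Because each pattern captures only the entity in group 1, the surrounding keyword and filler words are discarded, which is what allows the verification code alone to be pasted.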

Alternatively, a pre-trained entity recognition model may be used to recognize text entities in the text corresponding to the current input operation and in the text corresponding to the history input operations matched with the current input operation.

For example, as shown in FIG. 2, the entity recognition model has a Bi-LSTM+CRF structure.

The Bi-LSTM part outputs a hidden-layer sequence of the same length as the sequence of text sentence fragments:

Left-to-right output: $h_{L0}, h_{L1}, h_{L2}, \ldots, h_{Ln}$

Right-to-left output: $h_{R0}, h_{R1}, h_{R2}, \ldots, h_{Rn}$

Splicing the bidirectional LSTM results gives $[[h_{L0}, h_{Rn}], \ldots, [h_{Ln}, h_{R0}]]$, which may also be written as $h_0, h_1, \ldots, h_n$.
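The splicing step pairs the i-th left-to-right state with the (n - i)-th right-to-left state, because the right-to-left pass emits its first state for the last fragment. A minimal sketch, assuming each hidden state is a plain list of floats (the function name `splice_bidirectional` is hypothetical):

```python
def splice_bidirectional(forward, backward):
    """Concatenate forward state i with backward state n - i.

    forward:  [h_L0, ..., h_Ln], produced reading left to right
    backward: [h_R0, ..., h_Rn], produced reading right to left
    """
    if len(forward) != len(backward):
        raise ValueError("both directions must cover the same fragments")
    # Reversing the backward list aligns h_Rn with h_L0, ..., h_R0 with h_Ln.
    return [f + b for f, b in zip(forward, reversed(backward))]


if __name__ == "__main__":
    h_left = [[1.0], [2.0], [3.0]]
    h_right = [[30.0], [20.0], [10.0]]
    print(splice_bidirectional(h_left, h_right))
```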

The CRF part outputs a tag for each text sentence fragment; the tag indicates whether the fragment belongs to a text entity and, if so, the text entity category.

For example, suppose the text is "Li Si's telephone is 137XXXXXXXX". After processing by the above model, the text is split into seven sentence fragments, giving the sentence fragment sequence $x_1, x_2, x_3, x_4, x_5, x_6, x_7$.

Here $x_1, x_2$ correspond to "Li Si", and the corresponding labels are BNAME ENAME (a name entity);

$x_3$ corresponds to the possessive particle ("'s"), and the corresponding label is O, representing a non-entity;

$x_4, x_5$ correspond to "telephone", and the corresponding labels are O, representing a non-entity;

$x_6$ corresponds to "is", and the corresponding label is O, representing a non-entity;

$x_7$ corresponds to "137XXXXXXXX", and the corresponding label is ENTITY, representing the target entity.
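Once the model has labelled each fragment, the entities can be read off the label sequence. The sketch below assumes a simple reading of the labels used in this example: a label starting with B opens a multi-fragment entity, the matching label starting with E closes it, ENTITY marks a single-fragment target entity, and O marks a non-entity. The function name `decode_tags` and the fragment strings are placeholders for illustration; fragments are concatenated without a separator, as for character-level Chinese text.

```python
def decode_tags(fragments, tags):
    """Collect (category, text) pairs from per-fragment labels."""
    entities = []
    buffer, current = [], None
    for fragment, tag in zip(fragments, tags):
        if tag == "ENTITY":
            # Single-fragment target entity.
            entities.append(("ENTITY", fragment))
            buffer, current = [], None
        elif tag.startswith("B"):
            # Open a multi-fragment entity; the category follows the B.
            buffer, current = [fragment], tag[1:]
        elif tag.startswith("E") and current == tag[1:]:
            # Close the entity opened by the matching B label.
            buffer.append(fragment)
            entities.append((current, "".join(buffer)))
            buffer, current = [], None
        else:
            # "O" (non-entity) or an unexpected label: drop any open entity.
            buffer, current = [], None
    return entities


if __name__ == "__main__":
    fragments = ["Li", "Si", "'s", "tele", "phone", "is", "137XXXXXXXX"]
    tags = ["BNAME", "ENAME", "O", "O", "O", "O", "ENTITY"]
    print(decode_tags(fragments, tags))
```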

The sentence fragment sequence $x_1, x_2, x_3, x_4, x_5, x_6, x_7$ represents a character-level feature sequence. In the model training stage, a dictionary is obtained by encoding the characters of the text samples, and each character is given a randomly initialized feature vector that is input to the entity recognition network. The parameters are then learned as the network model is trained, so that the model outputs entity tag information when given a new sequence.

By training on text samples annotated with text entity labels, the entity recognition model learns to label each character or character segment in a text automatically, and thereby to recognize the text entities in the text. The text corresponding to the current input operation and the text corresponding to the history input operations matched with the current input operation are each input into the entity recognition model to extract text entities.

On the basis of the entity recognition model, the embodiment of the application provides that when text entities cannot be extracted from a text with the regular expressions, for example because the regular expressions are incomplete, the pre-trained entity recognition model is used instead; that is, the text is input into the entity recognition model so that text entities are recognized from it.
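The fallback order just described (regular expressions first, entity recognition model only when they yield nothing) can be sketched as below. Both extractors are passed in as callables; the names `extract_with_fallback`, `regex_extractor` and `model_extractor` are hypothetical and stand for whatever regex matcher and trained model an implementation actually uses.

```python
def extract_with_fallback(text, regex_extractor, model_extractor):
    """Try the regex extractor first; call the model only if it finds nothing."""
    entities = regex_extractor(text)
    if entities:
        return entities
    # The regular expressions found no entity, so defer to the trained model.
    return model_extractor(text)


if __name__ == "__main__":
    # Stub extractors, for illustration only.
    no_regex_hit = lambda text: []
    fake_model = lambda text: [("name", text.split()[0])]
    print(extract_with_fallback("Alice lives nearby", no_regex_hit, fake_model))
```

Keeping the model behind the regex path means the cheaper matcher handles the common cases and the model is invoked only for text the expressions do not cover.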

As a preferred implementation manner, the embodiment of the present application determines the matching degree of each candidate text and the current editing scene in the following manner:

First, the text entities in each candidate text are determined.

Specifically, for each candidate text, the text entity contained therein is determined.

Based on the candidate text screening method described in the above embodiment, when the candidate text is a candidate text determined based on a set of text entities extracted from all texts, text entities in the candidate text may be directly acquired.

If the text corresponding to the current input operation and the text corresponding to the history input operation matched with the current input operation are all directly used as candidate texts, the text entity can be extracted from each candidate text based on the text entity extraction method described in the above embodiment.

Then, a matching score for each text entity to the current editing scene is calculated.

For example, a matching score of the text entity and the current editing scene may be calculated and determined according to the category of the text entity, the type of the current application, and the type of the current input box.

The more consistent the category of the text entity is with the type of the current application and the type of the current input box, the higher the matching score of the text entity and the current editing scene is, otherwise, the lower the matching score of the text entity and the current editing scene is.

For example, assume that there are N text entity categories (such as telephone number, address and verification code) and that the scores of M text entities are calculated at a time. If text entity i (i = 0, 1, …, M) is classified as an entity of the j-th category (j = 0, 1, …, M), the text entity category feature $C_j = 1$ for text entity i; otherwise $C_j = 0$. The current editing scene comprises an app (application) type and an input box type. Supposing that there are S app types and Q input box types, if the current application is of the k-th app type, the app type feature $A_k = 1$; otherwise $A_k = 0$. Likewise, if the current input box is of the l-th input box type, the input box type feature $X_l = 1$; otherwise $X_l = 0$.

Based on the rules, a scoring model is established:

The formula can be expressed in terms of vectors:

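$c = (C_0, C_1, C_2, \ldots, C_M)$

$\alpha = (\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_M)$

$a = (A_0, A_1, A_2, \ldots, A_S)$

$\beta = (\beta_0, \beta_1, \beta_2, \ldots, \beta_S)$

$x = (X_0, X_1, X_2, \ldots, X_Q)$

$\gamma = (\gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_Q)$

$\delta = (\delta_0, \delta_1, \delta_2, \ldots, \delta_{M \times S \times Q})$

The scoring formula is:

$y = \omega_0 + c\alpha + a\beta + x\gamma + cax\delta$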

Here $\omega_0$ is the bias term, and c, a and x are the text entity category feature vector, the application type feature vector and the input box type feature vector, respectively. The values in the α vector are the contribution coefficients of the text entity category to the matching score, the values in the β vector are the contribution coefficients of the application type, the values in the γ vector are the contribution coefficients of the input box type, and the values in the δ vector are the contribution coefficients of the combinations of the three features. The values of the parameter vectors are obtained by training a machine learning model.

The parameter vector calculation method is as follows:

First, in the cold start stage, because the values of M and Q are small, α and γ can be set manually, based on experience of how likely different text entity categories and input box types are to match. For example, with three text entity categories (telephone number, address and verification code) and three input box types (search box, chat box and comment box), α = (0.2, 0.5, 0.8) and γ = (0.1, 0.5, 0.3) may be set, while β and δ are initially set to zero vectors. User behavior is collected during the cold start stage and used as training samples for an FM (factorization machine) model, which yields all of the parameter vectors.

After all the parameter vectors are obtained, the matching score of the text entity and the current editing scene can be calculated by utilizing the formula.
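As a worked illustration of the scoring formula, the sketch below evaluates the score for one-hot feature vectors in plain Python. Several details are assumptions made only for this example: the bias term is taken as 0, two app types are assumed, the entries of α and γ are assumed to follow the order in which the categories and input box types are listed in the cold start example, and the combined feature is taken as the flattened product of the three one-hot vectors. The helper names `one_hot`, `dot` and `match_score` are hypothetical.

```python
def one_hot(index, size):
    """Return a feature vector with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def match_score(c, a, x, alpha, beta, gamma, delta, omega0=0.0):
    """Bias + three single-feature terms + one three-way combination term."""
    # Flatten the three-way product of the feature vectors (row-major order).
    cax = [ci * aj * xk for ci in c for aj in a for xk in x]
    return omega0 + dot(c, alpha) + dot(a, beta) + dot(x, gamma) + dot(cax, delta)


if __name__ == "__main__":
    alpha = [0.2, 0.5, 0.8]        # telephone number, address, verification code
    gamma = [0.1, 0.5, 0.3]        # search box, chat box, comment box
    beta = [0.0, 0.0]              # two app types assumed; zero at cold start
    delta = [0.0] * (3 * 2 * 3)    # zero at cold start

    c = one_hot(2, 3)              # entity is a verification code
    a = one_hot(0, 2)              # first assumed app type
    x = one_hot(1, 3)              # input box is a chat box
    print(match_score(c, a, x, alpha, beta, gamma, delta))
```

With β and δ at zero, the cold start score reduces to the sum of the selected α and γ entries, so a verification code in a chat box scores 0.8 + 0.5 under these assumptions.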

Finally, the matching degree of the candidate text and the current editing scene is determined based on the matching scores of the text entities contained in the candidate text with the current editing scene.

For example, when only one text entity exists in the candidate text, the matching score of the text entity and the current editing scene is used as the matching score of the candidate text and the current editing scene.

When a plurality of text entities exist in the candidate text, a weighted average value or an average value of the matching scores of each text entity in the candidate text and the current editing scene is used as the matching score of the candidate text and the current editing scene.
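The aggregation rule for a candidate text can be sketched as follows: a single entity score is used as is, and several scores are combined by a plain or weighted average. The function name `candidate_score` and the optional `weights` argument are hypothetical; how the weights would be chosen is not specified here.

```python
def candidate_score(entity_scores, weights=None):
    """Combine per-entity matching scores into one score for the candidate text."""
    if not entity_scores:
        raise ValueError("a candidate text needs at least one entity score")
    if len(entity_scores) == 1:
        # Only one entity: its score is the candidate's score.
        return entity_scores[0]
    if weights is None:
        # Plain average over all entities.
        return sum(entity_scores) / len(entity_scores)
    if len(weights) != len(entity_scores):
        raise ValueError("one weight is required per entity score")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Weighted average, normalized by the total weight.
    return sum(s * w for s, w in zip(entity_scores, weights)) / total


if __name__ == "__main__":
    print(candidate_score([0.7]))
    print(candidate_score([1.0, 0.5]))
    print(candidate_score([1.0, 0.5], weights=[3.0, 1.0]))
```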

Based on the matching degree score of each candidate text and the current editing scene, text content can be selected from the candidate texts as text input content corresponding to the current input operation.

By way of example, the embodiments of the present application respectively determine text input contents corresponding to a current input operation in the following three ways.

The embodiment of the application studies in advance how the matching degree between candidate text samples and editing scenes relates to the content that users actually input, in order to determine how the matching degree between a candidate text and the editing scene affects the determination, from the candidate texts, of the input text content corresponding to the current input operation.

Through the research, the following conclusion is drawn:

The matching degree score of the candidate text samples and the editing scene follows a normal distribution. Letting X be the matching degree score of a candidate text sample and the editing scene:

$X \sim N(\mu, \sigma^2)$

Because the parameters of this normal distribution differ under different sample processing strategies, a decision criterion cannot be given as a fixed raw score. The distribution is therefore converted into a standard normal distribution by introducing a new score variable Y:

$Y = \dfrac{X - \mu}{\sigma}$

where $Y \sim N(0, 1)$. The standard normal distribution curve is shown in FIG. 3.
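Converting a normally distributed score to a standard normal score is the usual transformation Y = (X - μ) / σ. A minimal sketch, assuming μ and σ are estimated from a sample of matching degree scores (the function name `standardize` is hypothetical, and the text does not state how the two parameters are obtained):

```python
from statistics import mean, pstdev


def standardize(scores):
    """Map raw matching scores to standard scores using the sample mean and deviation."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation of the sample
    if sigma == 0:
        raise ValueError("scores are all identical; cannot standardize")
    return [(score - mu) / sigma for score in scores]


if __name__ == "__main__":
    print(standardize([1.0, 2.0, 3.0]))
```

After this step the thresholds can be stated once on the standard scale, independently of the raw score range produced by a particular sample processing strategy.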

The standard normal distribution curve may be used to select the manner of determining the text input content corresponding to the current input operation.

First, it is determined whether any of the candidate texts has a matching degree with the current editing scene that is greater than or equal to a set first matching degree threshold.

The first matching degree threshold denotes a first matching degree score. Referring to FIG. 3, empirical data analysis shows that texts whose matching degree with the current editing scene is 1.29 or more account for 10% of all texts. Candidate texts in this interval match the current editing scene very closely, so the input text content corresponding to the current input operation can be determined directly from them.

Therefore, if the matching degree of a candidate text and the current editing scene is greater than or equal to the set first matching degree threshold, the text entity in that candidate text is determined as the input text content corresponding to the current input operation.

If several candidate texts have a matching degree with the current editing scene that is greater than or equal to the set first matching degree threshold, the candidate text with the largest matching degree is selected from them, and its text entity is determined as the input text content corresponding to the current input operation.

If only one candidate text has a matching degree with the current editing scene that is greater than or equal to the set first matching degree threshold, but that candidate text contains several text entities, the text entity with the highest matching score with the current editing scene is used as the input text content corresponding to the current input operation.

For example, suppose that the user copies text content containing a name, a telephone number and an address from another page, and then taps the "recipient" input box on an online shopping order page, intending to paste the content into that input box. According to the technical scheme of the embodiment of the present application, it may be determined by calculation that, among the text just copied and the previously copied or cut text stored in the clipboard, the "name" content in the text just copied has the highest matching score with the current editing scene (greater than or equal to the first matching degree threshold). The "name" content is then determined directly as the input text content corresponding to the current input operation, that is, as the text content that should be filled into the "recipient" input box.

Furthermore, in the above scenario, after determining the input text content corresponding to the current input operation, the input text content may also be directly filled into the input box corresponding to the current input operation, so as to implement automatic text input.

For example, on the online shopping order page, when the "name" content in the clipboard is determined to be the text content to be input into the "addressee" input box, the "name" content is filled directly into the "addressee" input box, so that the copied text is put on screen automatically without the user performing a paste operation.

It can be understood that the above processing extracts the text entity corresponding to the current input operation directly from the candidate text and displays it automatically in the input box corresponding to the current input operation. In particular, when the user has copied text and intends to paste it into an input box, the user needs neither to perform the paste operation nor to pick the content actually needed out of the copied text: the text content that fits the current input box is extracted from the clipboard content and displayed in the input box automatically. The target text is thus screened from the clipboard and put on screen automatically.

And the process comprehensively analyzes the text corresponding to the current input operation and the text corresponding to the historical input operation, and selects the text content most conforming to the current input operation from the text content as the current input text content. Therefore, the processing mode realizes that the text content of the input text is automatically determined according to the text corresponding to the input operation and the text corresponding to the history input operation, and can realize automatic text input, and the automation degree of the text input is higher.

If there is no candidate text having a matching degree with the current editing scene greater than or equal to a set first matching degree threshold value among the candidate texts corresponding to the current input operation, it is further determined whether there is a candidate text having a matching degree with the current editing scene greater than or equal to a set second matching degree threshold value and less than the first matching degree threshold value among the candidate texts.

Wherein the second match threshold represents a second match score that is less than the first match score represented by the first match threshold.

Referring to FIG. 3, empirical data analysis shows that texts whose matching degree with the current editing scene is 0.84 or more and less than 1.29 account for 10% of all texts. Candidate texts in this interval match the current editing scene well, so the user can be offered text content from them to choose as input, which simplifies the user's input operation.
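The two cut-off values can be checked against the standard normal distribution itself. The sketch below uses Python's `statistics.NormalDist`; the helper names `share_at_or_above` and `share_between` are hypothetical. Under a standard normal distribution, roughly 10% of the probability mass lies at or above 1.29, and roughly another 10% lies between 0.84 and 1.29.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def share_at_or_above(threshold):
    """Fraction of a standard normal population scoring at or above `threshold`."""
    return 1.0 - STANDARD_NORMAL.cdf(threshold)


def share_between(low, high):
    """Fraction of a standard normal population scoring in [low, high)."""
    return STANDARD_NORMAL.cdf(high) - STANDARD_NORMAL.cdf(low)


if __name__ == "__main__":
    print(round(share_at_or_above(1.29), 3))
    print(round(share_between(0.84, 1.29), 3))
```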

Therefore, if the candidate texts corresponding to the current input operation include a candidate text whose matching degree with the current editing scene is greater than or equal to the set second matching degree threshold and less than the first matching degree threshold, text entities are extracted from that candidate text as the input text content corresponding to the current input operation.

For example, suppose that the user copies text content containing a name, a telephone number and an address from another page, and then taps the "recipient" input box on an online shopping order page, intending to paste the content into that input box. According to the technical scheme of the embodiment of the present application, it may be determined by calculation that, among the text just copied and the previously copied or cut text stored in the clipboard, the text just copied has a relatively high matching score with the current editing scene (greater than or equal to the second matching degree threshold and less than the first matching degree threshold). The "name" content, the "telephone" content and the "address" content in the text just copied are then all determined as input text content corresponding to the current input operation.

In this case, however, the determined input text content is not filled directly into the input box. Instead, it is output for the user to choose from, and when the user taps one or more of the output items, the selected input text content is put on screen in the input box.

For example, on the online purchase order page, the "name" content, "telephone" content and "address" content copied by the user are respectively displayed, and the three contents are displayed in different areas, and when the user clicks a certain content, for example, when the user clicks the "name" content, the "name" content is filled in the text box of the "addressee".

Therefore, the processing realizes the prompt function of the input text content, when a user executes the input operation, the input text content corresponding to the current input operation is determined from the text corresponding to the current input operation and the text corresponding to the historical input operation according to the user input operation, and the user is prompted, so that the user can complete the complete text input in a selection mode after executing part of the input operation, the user input efficiency can be improved, and the user input operation is simplified.

The above processing has been described mainly in the context of text copy-and-paste operations: with the technical scheme of the embodiment of the application, text entities are automatically screened from the copied or cut content and the history content of the clipboard, and are either used as the input text content or offered to the user for selection. In fact, after the user performs a text copy or cut operation, a complete text entry may also be selected from all the text entries in the clipboard as the current input text content, or text entries suitable as the current input text content may be selected and offered to the user to choose from.

Further, if the candidate texts corresponding to the current input operation include no candidate text whose matching degree with the current editing scene is greater than or equal to the set second matching degree threshold and less than the first matching degree threshold, it can be determined, as shown in FIG. 3, that the matching degrees of the candidate texts with the current editing scene all lie in the interval below 0.84. These candidate texts match the current editing scene poorly and may not contain the text content that the user actually wants to input. In this case, the embodiment of the present application sorts the candidate texts corresponding to the current input operation in descending order of matching degree with the current editing scene to obtain a candidate text sequence, and outputs the candidate text sequence so that the user can select a candidate text from it as the current input text content.

For example, assuming that a user copies text from another page to be pasted into a barrage input box for barrage input while watching video content, the embodiment of the present application can select text from the text copied currently and the text copied or cut historically saved in the clipboard as a candidate text corresponding to the current input operation.

According to the conventional text input scheme, only the candidate text can be output and displayed, and the user is required to further judge which candidate text is more matched with the current input, and then the candidate text can be selected as the current input content. According to the embodiment of the application, the matching degree of each candidate text and the current editing scene is automatically calculated, the candidate texts are sequenced according to the sequence from high to low of the matching degree of the candidate texts and the current editing scene to obtain the candidate text sequence, and then the candidate text sequence is output, so that a user can intuitively determine through the candidate text sequence that the text sequenced earlier in the sequence is matched with the current editing scene, and therefore, the user can conveniently select the text matched with the current editing scene from the text as input content.

For example, referring to FIG. 4, suppose that the text "high energy ahead", stored in the clipboard by a previous copy or cut operation, has the highest matching degree with the current editing scene. A conventional input scheme does not display this text first. After the sorting described above, the technical scheme of the embodiment of the application displays "high energy ahead" first, so the user can readily see that it is the text best suited to the current barrage input and can select it directly from the first position as the barrage input content.
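The three ways of determining the input content can be summarized as a single decision over standardized matching scores. The sketch below is a simplification under stated assumptions: it works on whole candidate texts rather than on the text entities inside them, it uses the cut-off values 1.29 and 0.84 from FIG. 3, and the mode names `"autofill"`, `"prompt"` and `"ranked_list"` as well as the function name `select_input` are hypothetical labels chosen for this example.

```python
FIRST_THRESHOLD = 1.29   # at or above: fill the input box automatically
SECOND_THRESHOLD = 0.84  # at or above (and below the first): prompt the user


def select_input(candidates):
    """Choose a handling mode for (text, standardized_score) candidates."""
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    if not ranked:
        return ("none", [])
    # Mode 1: the best candidate clears the first threshold.
    if ranked[0][1] >= FIRST_THRESHOLD:
        return ("autofill", [ranked[0][0]])
    # Mode 2: some candidates fall between the two thresholds.
    middle = [text for text, score in ranked
              if SECOND_THRESHOLD <= score < FIRST_THRESHOLD]
    if middle:
        return ("prompt", middle)
    # Mode 3: everything is below the second threshold; output the sorted list.
    return ("ranked_list", [text for text, _ in ranked])


if __name__ == "__main__":
    print(select_input([("name", 1.5), ("address", 1.3)]))
    print(select_input([("name", 1.0), ("memo", 0.2), ("address", 0.9)]))
    print(select_input([("old note", 0.1), ("high energy ahead", 0.5)]))
```

In the third mode nothing is filled in automatically; the sketch only reorders the candidates so that the best-matching text is listed first.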

As can be seen from the above description, the text input method provided by the embodiment of the present application is capable of comprehensively analyzing the text corresponding to the current input and the text corresponding to the history input operation matched with the current input, determining the candidate text corresponding to the current input operation therefrom, calculating the matching degree of each candidate text and the current editing scene, and determining the content of the input text corresponding to the current input operation from the candidate texts in different manners according to the matching degree of each candidate text and the current editing scene. The processing scheme can automatically provide input references for users or automatically replace users to execute text input operation, so that the complexity of the user input operation can be remarkably reduced, and the input efficiency is improved.

The embodiment of the application also provides a text input device. Referring to FIG. 5, the device comprises:

A text preselection unit 100 for acquiring a candidate text corresponding to a current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation;

A text analysis unit 110, configured to determine a matching degree between each candidate text and the current editing scene;

The input text determining unit 120 is configured to determine, according to a matching degree between each candidate text and the current editing scene, an input text content corresponding to the current input operation from the candidate texts.

The text input device provided by the embodiment of the application uses the text corresponding to the current input operation and the text corresponding to the history input operation matched with the current input operation together as the candidate text corresponding to the current input operation. And then, respectively determining the matching degree of each candidate text and the current editing scene, and determining the content of the input text corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing procedure, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the user input operation is remarkably simplified, and the user input efficiency is improved.

Or alternatively

Respectively determining text entities in each candidate text;

calculating a matching score of each text entity and the current editing scene;

Optionally, when no candidate text has a matching degree with the current editing scene that is greater than or equal to the set first matching degree threshold, the input text determining unit is further configured to:

Optionally, when no candidate text has a matching degree with the current editing scene that is greater than or equal to the set second matching degree threshold and less than the first matching degree threshold, the input text determining unit is further configured to:

And outputting the candidate text sequence.

In detail, for the specific working content of each unit of the text input device, please refer to the content of the method embodiment, and the description is omitted herein.

Another embodiment of the present application also proposes a text input device which, as shown in FIG. 6, comprises:

A memory 200 and a processor 210;

wherein the memory 200 is connected to the processor 210, and is used for storing a program;

The processor 210 is configured to implement the respective processing steps of the text input method disclosed in any of the above embodiments by executing the program stored in the memory 200.

Specifically, the text input device may further include: a bus, a communication interface 220, an input device 230, and an output device 240.

The processor 210, the memory 200, the communication interface 220, the input device 230, and the output device 240 are interconnected by a bus. Wherein:

A bus may comprise a path that communicates information between components of a computer system.

Processor 210 may be a general-purpose processor, such as a general-purpose central processing unit (CPU) or a microprocessor, or may be an application-specific integrated circuit (ASIC) or one or more integrated circuits for controlling the execution of programs in accordance with aspects of the present invention. It may also be a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Processor 210 may include a main processor, and may also include a baseband chip, modem, and the like.

The memory 200 stores programs for implementing the technical scheme of the present invention, and may also store an operating system and other key services. In particular, the program may include program code including computer-operating instructions. More specifically, memory 200 may include read-only memory (ROM), other types of static storage devices that may store static information and instructions, random access memory (random access memory, RAM), other types of dynamic storage devices that may store information and instructions, disk storage, flash, and the like.

The input device 230 may include means for receiving data and information entered by a user, such as a keyboard, mouse, camera, scanner, light pen, voice input device, touch screen, pedometer, or gravity sensor, among others.

Output device 240 may include means, such as a display screen, printer, speakers, etc., that allow information to be output to a user.

The communication interface 220 may include any transceiver-type device for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN) or a wireless local area network (WLAN).

The processor 210 executes the programs stored in the memory 200 and invokes the other devices, so as to implement the steps of the text input method provided by the embodiments of the present application.

Another embodiment of the present application also provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the text input method provided in any of the above embodiments.

Specifically, the specific working content of each portion of the above text input device and the specific processing content of the computer program on the storage medium when executed by the processor may refer to the content of each embodiment of the above text input method, which is not described herein again.

For the foregoing method embodiments, for simplicity of explanation, the methodologies are shown as a series of acts, but one of ordinary skill in the art will appreciate that the present application is not limited by the order of acts, as some steps may, in accordance with the present application, occur in other orders or concurrently. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.

It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for parts that are identical or similar the embodiments may be referred to one another. The apparatus embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant points, refer to the description of the method embodiments.

The steps in the method of each embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs, and the technical features described in each embodiment can be replaced or combined.

The modules and the submodules in the device and the terminal of the embodiments of the application can be combined, divided and deleted according to actual needs.

In the embodiments provided in the present application, it should be understood that the disclosed terminal, apparatus and method may be implemented in other manners. For example, the above-described terminal embodiments are merely illustrative, and for example, the division of modules or sub-modules is merely a logical function division, and there may be other manners of division in actual implementation, for example, multiple sub-modules or modules may be combined or integrated into another module, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.

The modules or sub-modules illustrated as separate components may or may not be physically separate, and components that are modules or sub-modules may or may not be physical modules or sub-modules, i.e., may be located in one place, or may be distributed over multiple network modules or sub-modules. Some or all of the modules or sub-modules may be selected according to actual needs to achieve the purpose of the embodiment.

In addition, each functional module or sub-module in the embodiments of the present application may be integrated in one processing module, or each module or sub-module may exist alone physically, or two or more modules or sub-modules may be integrated in one module. The integrated modules or sub-modules may be implemented in hardware or in software functional modules or sub-modules.

Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software unit executed by a processor, or in a combination of the two. The software unit may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Finally, it is further noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A text input method, comprising:

acquiring candidate texts corresponding to a current input operation, wherein the candidate texts comprise at least a text corresponding to the current input operation and a text corresponding to a historical input operation that matches the current input operation; the text corresponding to the current input operation is the text of, or text related to, a partial input operation that the user performs toward the final text input goal; and the text corresponding to the historical input operation that matches the current input operation is the text corresponding to an input operation that the user previously performed using the input method application and that matches the input content or the input operation type of the current input operation;

2. The method of claim 1, wherein the obtaining the candidate text corresponding to the current input operation comprises:

3. The method according to claim 2, wherein extracting text entities from text corresponding to a current input operation and text corresponding to a history input operation that matches the current input operation, respectively, comprises:

Or alternatively

4. The method according to claim 3, wherein when no text entity is extracted from the text using regular expressions, text entities are identified from the text using a pre-trained entity recognition model.
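The fallback order in claim 4 can be illustrated with a minimal sketch. This is not the patented implementation: the two regular expressions, the function name `extract_entities`, and the `model` callable (a stand-in for a pre-trained entity recognition model) are all assumptions introduced for illustration, since the source does not specify them.

```python
import re
from typing import Callable, List

# Hypothetical patterns; the source does not say which expressions are used.
# Digit lookarounds replace \b so the phone pattern also matches when the
# number directly follows CJK characters (which count as word characters).
PATTERNS = {
    "phone": re.compile(r"(?<!\d)1\d{10}(?!\d)"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def extract_entities(text: str, model: Callable[[str], List[str]]) -> List[str]:
    """Try regular expressions first; call the model only if they find nothing."""
    found: List[str] = []
    for pattern in PATTERNS.values():
        found.extend(pattern.findall(text))
    if found:
        return found
    # Fallback: `model` stands in for a pre-trained entity recognition model.
    return model(text)
```

Running the cheap regular expressions first means the model is invoked only for text that the patterns cannot handle.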

5. The method according to claim 1 or 2, wherein the determining the matching degree of each candidate text and the current editing scene respectively includes:

Respectively determining text entities in each candidate text;

calculating a matching score of each text entity and the current editing scene;

6. The method of claim 5, wherein calculating a match score for each text entity to a current editing scene comprises:

7. The method according to claim 2, wherein the determining the content of the input text corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene comprises:

8. The method according to claim 7, wherein when there are a plurality of candidate texts having a degree of matching with the current editing scene equal to or greater than a set first degree of matching threshold, text entities are extracted from the plurality of candidate texts, respectively, and a text entity having a highest degree of matching with the current editing scene is selected from the extracted text entities as the input text content corresponding to the current input operation.
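As an illustration only, the selection rule of claim 8 might be sketched as follows. The function and parameter names (`select_input_text`, `extract`, `score_entity`) are hypothetical, and how the matching scores themselves are computed is left to the caller because it is not given in this claim. The sketch returns `None` when fewer than two candidates reach the threshold, since claim 8 covers only the multiple-candidate case.

```python
from typing import Callable, Dict, List, Optional


def select_input_text(
    candidate_scores: Dict[str, float],
    first_threshold: float,
    extract: Callable[[str], List[str]],
    score_entity: Callable[[str], float],
) -> Optional[str]:
    """Pick the best-matching text entity among the qualifying candidates."""
    # Candidates whose matching degree reaches the first threshold.
    qualifying = [t for t, s in candidate_scores.items() if s >= first_threshold]
    if len(qualifying) < 2:
        return None  # not the situation this sketch illustrates
    # Extract text entities from every qualifying candidate.
    entities = [e for text in qualifying for e in extract(text)]
    if not entities:
        return None
    # The entity with the highest matching degree becomes the input text.
    return max(entities, key=score_entity)
```

A caller would supply its own entity extractor and scene-matching scorer for `extract` and `score_entity`.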

9. The method of claim 7, wherein when no candidate text has a matching degree with the current editing scene that is greater than or equal to the set first matching degree threshold, the method further comprises:

10. The method of claim 9, wherein when no candidate text has a matching degree with the current editing scene that is greater than or equal to a set second matching degree threshold and less than the first matching degree threshold, the method further comprises:

And outputting the candidate text sequence.

11. The method of claim 1, wherein the current input operation is an operation of clicking into an input box after performing a text copying or cutting operation;

12. A text input apparatus, comprising:

a text preselection unit for acquiring candidate texts corresponding to a current input operation, wherein the candidate texts comprise at least a text corresponding to the current input operation and a text corresponding to a historical input operation that matches the current input operation; the text corresponding to the current input operation is the text of, or text related to, a partial input operation that the user performs toward the final text input goal; and the text corresponding to the historical input operation that matches the current input operation is the text corresponding to an input operation that the user previously performed using the input method application and that matches the input content or the input operation type of the current input operation;

13. A text input device, comprising:

a memory and a processor;

the memory is connected with the processor and used for storing programs;

the processor is configured to implement the text input method according to any one of claims 1 to 11 by running a program in the memory.

14. A storage medium having stored thereon a computer program which, when executed by a processor, implements the text input method of any of claims 1 to 11.