CN112684907A

CN112684907A - Text input method, device, equipment and storage medium

Info

Publication number: CN112684907A
Application number: CN202011554136.9A
Authority: CN
Inventors: 刘中媛; 吴志强
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2020-12-24
Filing date: 2020-12-24
Publication date: 2021-04-20
Anticipated expiration: 2040-12-24

Abstract

The application provides a text input method, a text input device, text input equipment and a storage medium, wherein the method comprises the following steps: acquiring a candidate text corresponding to the current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation; respectively determining the matching degree of each candidate text and the current editing scene; and determining the input text content corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing process, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the input operation of the user is obviously simplified, and the input efficiency of the user is improved.

Description

Text input method, device, equipment and storage medium

Technical Field

The present application relates to the field of input methods, and in particular, to a text input method, apparatus, device, and storage medium.

Background

Text input is indispensable operation content when a user applies an electronic product, and an input method is a main tool for the user to input texts on the electronic product.

When a user applies an electronic product, the input texts are the same or similar in many cases, and especially the text contents input by the user under the same application scene have higher similarity. Therefore, if the text input in the history input operation can be applied to determine the current input content, the method can help to improve the input efficiency of the user and simplify the input operation of the user.

Disclosure of Invention

In view of the above state of the art, the application provides a text input method, a text input device, text input equipment and a storage medium, which can significantly simplify the input operation of the user and improve the text input efficiency of the user.

The technical scheme provided by the application is as follows:

a text entry method comprising:

acquiring a candidate text corresponding to the current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation;

respectively determining the matching degree of each candidate text and the current editing scene;

and determining the input text content corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene.

Optionally, the obtaining of the candidate text corresponding to the current input operation includes:

respectively extracting text entities from texts corresponding to current input operations and texts corresponding to historical input operations matched with the current input operations;

summarizing the extracted text entities according to a rule that each text entity category only comprises a latest text entity to obtain a text entity set, wherein each element in the text entity set corresponds to a text entity of one category;

and respectively determining the text where each text entity in the text entity set is located as a candidate text corresponding to the current input operation.

Optionally, the extracting the text entities from the texts corresponding to the current input operations and the texts corresponding to the historical input operations matched with the current input operations respectively includes:

extracting a text entity from a text corresponding to the current input operation and a text corresponding to the historical input operation matched with the current input operation by using a preset regular expression;

or,

recognizing a text entity from a text corresponding to the current input operation and a text corresponding to the historical input operation matched with the current input operation by using a pre-trained entity recognition model;

the entity recognition model is obtained by training at least through recognizing text entities from text samples.

Optionally, when the regular expression is used and the text entity cannot be extracted from the text, the pre-trained entity recognition model is used to recognize the text entity from the text.

Optionally, the determining the matching degree between each candidate text and the current editing scene respectively includes:

respectively determining a text entity in each candidate text;

calculating the matching score of each text entity and the current editing scene;

and determining the matching degree of the candidate text and the current editing scene based on the matching scores of the text entities contained in the candidate text and the current editing scene.

Optionally, the calculating a matching score of each text entity with the current editing scenario includes:

and calculating and determining the matching score of the text entity and the current editing scene at least according to the text entity category, the current application type and the current input box type.

Optionally, the determining, according to the matching degree between each candidate text and the current editing scene, the input text content corresponding to the current input operation from the candidate texts includes:

determining, for each candidate text, whether the matching degree of the candidate text and the current editing scene is greater than or equal to a set first matching degree threshold;

and if the matching degree of the candidate text and the current editing scene is larger than or equal to a set first matching degree threshold value, determining the text entity in the candidate text as the input text content corresponding to the current input operation, and filling the determined input text content into the input box corresponding to the current input operation.

Optionally, when the matching degree between a plurality of candidate texts and the current editing scene is greater than or equal to the set first matching degree threshold, text entities are respectively extracted from the plurality of candidate texts, and a text entity with the maximum matching degree with the current editing scene is selected from the extracted text entities as the input text content corresponding to the current input operation.

Optionally, when the matching degree between the candidate text and the current editing scene is not greater than or equal to the set first matching degree threshold, the method further includes:

determining, for each candidate text, whether the matching degree of the candidate text and the current editing scene is greater than or equal to a set second matching degree threshold and smaller than the first matching degree threshold; wherein the second matching degree threshold is less than the first matching degree threshold;

and if the matching degree of the candidate text and the current editing scene is greater than or equal to a set second matching degree threshold value and smaller than a first matching degree threshold value, extracting a text entity in the candidate text as the input text content corresponding to the current input operation.

Optionally, when the matching degree between the candidate text and the current editing scene is not greater than or equal to the set second matching degree threshold and is less than the first matching degree threshold, the method further includes:

sequencing all candidate texts according to the sequence of the matching degree of the candidate texts with the current editing scene from high to low to obtain a candidate text sequence;

and outputting the candidate text sequence.

Optionally, the current input operation is an operation of clicking to enter an input box after text copying or cutting operation is executed;

the acquiring of the candidate text corresponding to the current input operation includes:

and acquiring all texts stored in a clipboard as candidate texts corresponding to the current input operation, wherein the clipboard stores the texts copied or cut at this time and the texts copied or cut in the historical input operation.

A text input device comprising:

the text preselection unit is used for acquiring a candidate text corresponding to the current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation;

the text analysis unit is used for respectively determining the matching degree of each candidate text and the current editing scene;

and the input text determining unit is used for determining the input text content corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene.

A text input device comprising:

a memory and a processor;

the memory is connected with the processor and used for storing programs;

the processor is used for implementing the text input method by running the program in the memory.

A storage medium having stored thereon a computer program which, when executed by a processor, implements the text input method described above.

According to the text input method, the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation are jointly used as candidate texts corresponding to the current input operation. Then, the matching degree of each candidate text and the current editing scene is respectively determined, and the input text content corresponding to the current input operation is determined from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing process, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the input operation of the user is obviously simplified, and the input efficiency of the user is improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

Fig. 1 is a schematic flowchart of a text input method provided in an embodiment of the present application;

Fig. 2 is a schematic structural diagram of an entity recognition model provided in an embodiment of the present application;

Fig. 3 is a schematic diagram of a distribution of matching degree scores of text samples and editing scenes provided in an embodiment of the present application;

Fig. 4 is a schematic diagram of an ordering of candidate texts according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a text input device according to an embodiment of the present application;

Fig. 6 is a schematic structural diagram of a text input device according to an embodiment of the present application.

Detailed Description

The technical scheme of the embodiment of the application is suitable for the text input application scene, and the input operation of a user can be simplified and the text input efficiency can be improved by adopting the technical scheme of the embodiment of the application.

Input method applications are the primary tools by which users enter textual information into electronic devices. In order to improve the input efficiency of the user, the input method application program usually has a memory function, for example, words or phrases selected in the user's historical input are recorded in the input method lexicon, and text contents copied by the user are stored in the input method clipboard.

When a user applies an electronic device, there are many cases where user input contents are the same or similar, and especially under the same scene or the same service, the user input contents have a high similarity. At the moment, the input method application program can predict the input intention of the user based on a small amount of input operation of the user, and screens texts which accord with the input intention of the user from historical input contents for the user to select, so that repeated input operation of the user is avoided, and the input efficiency is improved.

The existing input method application program can only provide related historical input texts for the user to select according to the current input of the user; the user still has to select a text meeting the requirement from the provided texts, or select partial content from such a text, as the input content. The process therefore still depends on user operation, and the input text content corresponding to the current input operation cannot be determined automatically.

In view of the above technical current situation, an embodiment of the present application provides a text input method, which can automatically determine an input text content corresponding to a current input operation according to the current user input operation and a historical user input operation, thereby completely eliminating an operation of selecting an input text by a user, further improving user input efficiency, and simplifying user input operations.

The text input method provided by the embodiment of the application can be applied to software programs such as input method application programs and the like or hardware processing equipment such as a processor and the like, so that the application programs or the processing equipment can automatically determine the complete input text content based on simple input operation of a user.

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

An embodiment of the present application provides a text input method, as shown in Fig. 1, the method includes:

S101, acquiring a candidate text corresponding to the current input operation.

The candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations.

Specifically, the current input operation refers to an input editing operation performed by the user in the input box of the application program at this time, and the input operation includes, but is not limited to, a text input operation, a text paste operation, or the like.

The text input method provided by the embodiment of the application aims to determine the complete input text content based on a small amount of input operation executed by the user, thereby simplifying the user's input operation, improving input efficiency, and achieving text input without the user executing the complete input operation. Therefore, the current input operation is not a complete input operation, such as the user inputting all the text content to be entered; rather, it is an incomplete input operation, that is, a partial input operation performed by the user toward the final text input purpose.

For example, assuming that the final purpose of the current input by the user is to input a long text, the current input operation may be the user inputting an individual character in the long text, or a partial pinyin character of the individual character, such as an initial letter; assuming that the purpose of the current input operation of the user is to paste a piece of text into the target input box, the current input operation may be an operation in which the user clicks into the target input box after performing text copy or cut.

When the current input operation of the user is a text input operation, the text corresponding to the current input operation may be the text input by the user in the current input operation, or may be a text related to the text input by the user in the current input operation, such as an associated phrase, an approximate phrase of the text input by the user, a long text containing the text content already input by the user, and the like; when the current input operation of the user is a text pasting operation, the text corresponding to the current input operation is the text copied or cut by the user at this time.

The history input operation matched with the current input operation refers to an input operation which is executed by the user by applying the input method application program and is the same as or similar to the input content or the input operation type of the current input operation in the history process, and the history input operation includes but is not limited to a text input operation, a text paste operation and the like.

When the historical input operation is a text input operation, the text corresponding to the historical input operation is the text input in the historical text input operation; when the history input operation is a text pasting operation, the text corresponding to the history input operation is the text copied or cut by the user in the history text pasting operation.

The text copied or cut by the user when executing the text pasting operation is stored in the clipboard of the input method application program.

At the current moment, when a user executes an input operation, a text corresponding to the current input operation and a text corresponding to a historical input operation matched with the current input operation are obtained and serve as candidate texts corresponding to the current input operation.

As a preferred implementation manner, when the input operation currently executed by the user is a text input operation, acquiring a text corresponding to the current text input operation and a text corresponding to a historical text input operation matched with the current input operation as candidate texts corresponding to the current input operation; when the input operation currently executed by the user is a text pasting operation, acquiring a text copied or cut by the user in the current text pasting operation and a text copied or cut by the user in the historical text pasting operation as candidate texts corresponding to the current input operation.

For example, after performing a text copy or cut operation, the user clicks into the input box. At this time, the text copied or cut by the user this time is stored in the clipboard together with the texts copied or cut in historical operations, so when the candidate text corresponding to the current input operation is acquired in step S101, all the texts stored in the clipboard may be used as candidate texts corresponding to the current input operation.
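The clipboard case of step S101 can be sketched as follows. This is a minimal illustration only: the `ClipboardEntry` type, its `timestamp` field, and the most-recent-first ordering are assumptions made for the example and are not specified by the application.

```python
from dataclasses import dataclass


@dataclass
class ClipboardEntry:
    text: str         # text that was copied or cut
    timestamp: float  # hypothetical field: when the copy or cut happened


def candidate_texts_from_clipboard(clipboard):
    """Return every clipboard text as a candidate, most recent first."""
    ordered = sorted(clipboard, key=lambda entry: entry.timestamp, reverse=True)
    return [entry.text for entry in ordered]


clipboard = [
    ClipboardEntry("Li Si's telephone number is 137XXXXXXXX", 1.0),
    ClipboardEntry("Your mobile phone verification code is 343456", 2.0),
]
candidates = candidate_texts_from_clipboard(clipboard)
```

Keeping the timestamp alongside each text is a design choice of this sketch; it makes it easy to tell later which text is the most recent one.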

S102, respectively determining the matching degree of each candidate text and the current editing scene.

Specifically, the current editing scene refers to an editing scene in which a user currently performs a text input operation, and is a text input scene formed by a type of a current application, a type of a current input box, and other factors.

Generally, there is an obvious association between a text editing scene and text input content. For example, in a shopping application, the content input by the user on the order page is usually information such as name, address, telephone number, etc. of the receiver; in the verification code input text box, the content input by the user is mostly a digital verification code; in the video bullet screen input text box, the content input by the user is usually text content related to the video content.

Based on the incidence relation between the input text and the editing scene, whether the text is matched with the editing scene or not can be determined, and whether the text can be used as the input text content in the current editing scene or not can be determined. Theoretically, a text matching the current editing scenario may be used as the current input text content, while a text not matching the current editing scenario is not suitable for being used as the current input text content.

Based on this, the embodiment of the application determines the matching degree of each candidate text in the candidate texts corresponding to the current input operation and the current editing scene respectively. The matching degree of the candidate text and the current editing scene can be used for representing the feasibility of the candidate text as the input text content in the current editing scene. If the matching degree of the candidate text and the current editing scene is high, the feasibility of the candidate text as the input text content in the current editing scene is high, and if the matching degree of the candidate text and the current editing scene is low, the feasibility of the candidate text as the input text content in the current editing scene is low.

For example, the matching degree of the candidate text with the current editing scene can be measured by the matching degree of the text content of the candidate text with the type of the current application, the type of the current input box, and the like. If the matching degree of the text content of the candidate text with the factors such as the type of the current application, the type of the current input box and the like is high, the matching degree of the candidate text with the current editing scene is high; and if the matching degree of the text content of the candidate text with the factors such as the type of the current application, the type of the current input box and the like is low, the matching degree of the candidate text with the current editing scene is low.

For example, assuming that the current input box is a "recipient" input box of a shopping order page, if the content of the candidate text is name content, the content of the candidate text may be determined to be consistent with the type of input content required by the "recipient" input box, and thus it may be determined that the matching degree of the candidate text with the current editing scene is high; if the content of the candidate text is the telephone number content, it can be determined that the text content does not conform to the type of input content required by the "recipient" input box, and therefore the candidate text has a low degree of matching with the current editing scene.
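The "recipient" example can be illustrated with a small lookup table. Everything in this sketch is hypothetical: the category names, the score values, and the choice of taking the maximum entity score as the matching degree of a candidate text are assumptions for illustration, not the scoring used by the application.

```python
# Hypothetical scores: how well an entity category fits an editing scene,
# where a scene is described by (application type, input box type).
SCENE_FIT = {
    ("name", "shopping", "recipient"): 0.9,
    ("telephone", "shopping", "recipient"): 0.1,
    ("telephone", "shopping", "phone"): 0.9,
    ("verification_code", "login", "verification_code"): 0.95,
}


def entity_scene_score(entity_category, app_type, input_box_type):
    """Look up the fit of one entity category with the editing scene."""
    return SCENE_FIT.get((entity_category, app_type, input_box_type), 0.0)


def candidate_matching_degree(entity_categories, app_type, input_box_type):
    """Aggregate entity scores for one candidate text.

    Taking the maximum is an assumption of this sketch.
    """
    scores = [
        entity_scene_score(category, app_type, input_box_type)
        for category in entity_categories
    ]
    return max(scores, default=0.0)


name_fit = candidate_matching_degree(["name"], "shopping", "recipient")
phone_fit = candidate_matching_degree(["telephone"], "shopping", "recipient")
```

With these illustrative values, a candidate text containing a name fits the "recipient" input box better than one containing only a telephone number.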

S103, determining input text content corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text and the current editing scene.

For example, a candidate text with a higher matching degree (e.g., highest, or higher than a certain threshold) with the current editing scene may be selected from the candidate texts as the input text content corresponding to the current input operation directly according to the matching degree of the candidate text with the current editing scene.

Alternatively, a candidate text with a higher matching degree with the current editing scene is selected according to the matching degree of each candidate text with the current editing scene, and text content is then extracted from the selected candidate text as the input text content corresponding to the current input operation.

The input text content determined in the above manner is the text content matched with the current editing scene, so that the input text content can be directly displayed in the input box corresponding to the current input operation, that is, the text input is realized.
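Step S103, combined with the optional two-threshold scheme set out in the Disclosure of Invention, might be sketched as below. The threshold values, the `(candidate_text, entity, matching_degree)` tuple layout, and the mode names are assumptions of this sketch.

```python
def choose_input(scored_candidates, first_threshold=0.8, second_threshold=0.5):
    """Pick the input text content from scored candidates.

    scored_candidates: list of (candidate_text, entity, matching_degree).
    Returns (mode, payload), where mode is one of:
      "fill"    - the best entity reaches the first threshold and is filled
                  into the input box
      "extract" - the best entity lies between the two thresholds and is
                  extracted as the input text content
      "ranked"  - no candidate reaches the second threshold; payload is the
                  candidate texts sorted by matching degree, high to low
    """
    if second_threshold >= first_threshold:
        raise ValueError("second_threshold must be below first_threshold")
    ranked = sorted(scored_candidates, key=lambda item: item[2], reverse=True)
    if not ranked:
        return ("ranked", [])
    _, best_entity, best_degree = ranked[0]
    if best_degree >= first_threshold:
        return ("fill", best_entity)
    if best_degree >= second_threshold:
        return ("extract", best_entity)
    return ("ranked", [text for text, _, _ in ranked])
```

When several candidates exceed the first threshold, sorting first means the entity with the maximum matching degree is the one selected, which mirrors the optional behaviour described for that case.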

As can be seen from the above description, in the text input method provided in the embodiment of the present application, the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation are jointly used as the candidate text corresponding to the current input operation. Then, the matching degree of each candidate text and the current editing scene is respectively determined, and the input text content corresponding to the current input operation is determined from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing process, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the input operation of the user is obviously simplified, and the input efficiency of the user is improved.

For example, when the candidate text corresponding to the current input operation is obtained, the text corresponding to the current input operation and all texts corresponding to the historical input operations matching the current input operation may be obtained as the candidate text corresponding to the current input operation.

The candidate texts corresponding to the current input operation are obtained according to the method, the richness of the candidate texts can be ensured, and therefore the subsequent processing is facilitated, and the input text content can be selected from the candidate texts more comprehensively.

For example, if the user clicks into a text box after currently performing a text copy or cut operation and then performs a text paste operation, the text copied or cut by the user this time and the text copied or cut by the user in the history text paste operation are collectively used as candidate texts corresponding to the current input operation, that is, candidate texts corresponding to the text paste operation to be performed by the user, and the text content pasted by the user is determined from these candidate texts. The text copied or cut by the user in the history text pasting operation can be searched in the input method clipboard.

In some scenarios, the content that the user currently wants to input is content that was input recently, or is only the key part of the content input this time or recently. For example, the user copied the text "your mobile phone verification code is 343456, available for login and registration", but what the user really wants to paste into the input box is the verification code "343456" in that text. In this case, if many texts containing verification codes are all used as candidate texts, it becomes harder to determine the correct verification code from the candidate texts.

Based on the above situation, in the embodiment of the present application, the candidate text corresponding to the current input operation is obtained as follows:

firstly, text entities are extracted from texts corresponding to current input operations and texts corresponding to historical input operations matched with the current input operations.

Specifically, text entities are respectively extracted from a text corresponding to the current input operation and a text corresponding to the historical input operation matched with the current input operation. The specific text entity extraction method will be described in the following embodiments.

Then, the extracted text entities are summarized according to a rule that each text entity category only comprises the latest text entity, to obtain a text entity set, wherein each element in the text entity set corresponds to a text entity of one category.

The text entity category refers to a type to which the content of the text entity belongs.

For example, assume that the existing texts are "Li Si's telephone number is 137XXXXXXXX", "my home is at 1-101 Happy Community, XX District, XX City, XX Province, please note the address", and "your mobile phone verification code is 343456, available for login and registration". Extracting text entities from these three texts yields the following categories of text entities:

Text entity category	Text entity
Name	Li Si
Telephone number	137XXXXXXXX
Address	1-101 Happy Community, XX District, XX City, XX Province
Verification code	343456

It is understood that when the number of texts is larger, text entities of the same category but with different contents may be extracted; for example, more names such as Zhang San and Wang Wu may be extracted, and more telephone numbers or addresses may be extracted.

In the embodiment of the application, the text entities extracted from all texts are summarized by text entity category, and for text entities of the same category only the latest one is retained, that is, only the text entity of that category extracted from the most recent text.

In the above manner, the obtained text entities may form a text entity set, and each element in the set corresponds to a type of text entity.

Finally, the text in which each text entity in the text entity set is located is determined as a candidate text corresponding to the current input operation.

Specifically, for each text entity in the text entity set, the text in which the text entity is located is determined as a candidate text corresponding to the current input operation.
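The three steps above (extract entities, keep only the latest entity per category, take the texts in which the kept entities are located) can be sketched as follows. The tuple layout and the integer timestamps are assumptions of this sketch, and the entity extraction itself is taken as already done.

```python
def build_candidates(extractions):
    """Apply the latest-entity-per-category rule.

    extractions: iterable of (category, entity, source_text, timestamp).
    Returns the text entity set and the de-duplicated list of texts in
    which the kept entities are located.
    """
    latest = {}
    for category, entity, source_text, timestamp in extractions:
        kept = latest.get(category)
        if kept is None or timestamp > kept[2]:
            latest[category] = (entity, source_text, timestamp)
    entity_set = {category: entity for category, (entity, _, _) in latest.items()}
    candidate_texts = []
    for _, source_text, _ in latest.values():
        if source_text not in candidate_texts:
            candidate_texts.append(source_text)
    return entity_set, candidate_texts


extractions = [
    ("verification_code", "112233", "Your verification code is 112233", 1),
    ("name", "Li Si", "Li Si's telephone number is 137XXXXXXXX", 2),
    ("telephone", "137XXXXXXXX", "Li Si's telephone number is 137XXXXXXXX", 2),
    ("verification_code", "343456", "Your verification code is 343456", 3),
]
entity_set, candidate_texts = build_candidates(extractions)
```

In this example the older verification code "112233" is displaced by "343456", so the text that contained it does not become a candidate.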

As an exemplary implementation manner, the text entities are extracted from the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation, and specifically, a preset regular expression may be used to extract the text entities from the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation.

The regular expression is an expression used for screening the text entity from the text, when the text content conforms to the regular expression, the text content can be confirmed to be the text entity conforming to the regular expression, and when the text content does not conform to the regular expression, the text content can be confirmed not to be the text entity conforming to the regular expression.

The embodiment of the application presets the regular expressions shown as follows, and is used for realizing text entity extraction:

a) telephone: "(phone | mobile phone number?

b) Address: "(address|live)(is|:)?([\u4e00-\u9fa5]+)[^\u4e00-\u9fa5]"

c) Recipient: "recipient(is|:)?([\u4e00-\u9fa5]{2,4})]"

d) Verification code: "verification code(is)?([0-9]{6})[^0-9]"

Each text from which text entities are to be extracted is matched against each preset regular expression; if text content in the text matches a regular expression, that text content is determined to be the text entity corresponding to that regular expression. Processing in this manner extracts text entities such as telephone number, address, recipient, and verification code from the text.

It should be noted that the regular expression may be flexibly set according to the field to which the processed text belongs and the text entity extraction requirement, and the embodiment of the present application is not limited.

Alternatively, a pre-trained entity recognition model can be used to recognize text entities from the text corresponding to the current input operation and from the text corresponding to the historical input operation matched with the current input operation.

The entity recognition model is obtained at least by training it to recognize text entities from text samples.

Illustratively, as shown in FIG. 2, the entity recognition model is a Bi-LSTM + CRF model structure.

Wherein, the Bi-LSTM part outputs a hidden state sequence with the same length as the text sentence fragment:

Output from left to right: h_L0, h_L1, h_L2, …, h_Ln

Output from right to left: h_R0, h_R1, h_R2, …, h_Rn

Splicing the bidirectional LSTM results gives [[h_L0, h_Rn], …, [h_Ln, h_R0]], which may also be written as h₀, h₁, …, h_n.
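The index alignment in the splicing step can be illustrated with a small sketch. It assumes each hidden state is a plain Python list of numbers; the function name `splice_bidirectional` is hypothetical, and the LSTM computation itself is not shown.

```python
def splice_bidirectional(left_to_right, right_to_left):
    """Pair each position's forward state with the backward state of the same position."""
    n = len(left_to_right)
    if len(right_to_left) != n:
        raise ValueError("both directions must cover the same fragments")
    # right_to_left[k] is emitted after reading k + 1 fragments from the right,
    # so it belongs to position n - 1 - k of the original sentence.
    return [left_to_right[t] + right_to_left[n - 1 - t] for t in range(n)]
```

With this alignment the first spliced state combines h_L0 with the last state of the right-to-left pass, matching the pairing [h_L0, h_Rn] given above.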

The CRF section is configured to output a text sentence fragment tag including whether the text sentence fragment is a text entity, and a text entity category.

For example, suppose that the text is "Li Si's phone is 137XXXXXXXX". After processing by the above model, the text is split into sentence fragments, and the sentence fragment sequence x₁, x₂, x₃, x₄, x₅, x₆, x₇ is obtained correspondingly.

Wherein x₁, x₂ correspond to "Li Si", and the corresponding labels are BNAME ENAME (name entity);

x₃ corresponds to "of" (the possessive particle), and the corresponding label o denotes a non-entity;

x₄, x₅ correspond to "phone", and the corresponding label o denotes a non-entity;

x₆ corresponds to "is", and the corresponding label o denotes a non-entity;

x₇ corresponds to "137XXXXXXXX", and the corresponding label ENTITY denotes the target entity.
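Reading the entities off such a label sequence can be sketched as below. The label names follow the example above; the `collect_entities` helper, its `sep` parameter, and the placeholder fragment strings are assumptions made for illustration.

```python
def collect_entities(fragments, labels, sep=""):
    """Group labeled sentence fragments into (category, text) entities."""
    entities = []
    name_parts = []
    for fragment, label in zip(fragments, labels):
        if label == "BNAME":            # beginning of a name entity
            name_parts = [fragment]
        elif label == "ENAME":          # end of a name entity
            name_parts.append(fragment)
            entities.append(("NAME", sep.join(name_parts)))
            name_parts = []
        elif label == "ENTITY":         # single-fragment target entity
            entities.append(("ENTITY", fragment))
        # Fragments labeled "o" are non-entities and are skipped.
    return entities


# Placeholder fragments standing in for x1..x7 of the example.
fragments = ["Li", "Si", "of", "pho", "ne", "is", "137XXXXXXXX"]
labels = ["BNAME", "ENAME", "o", "o", "o", "o", "ENTITY"]
result = collect_entities(fragments, labels, sep=" ")
```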

The sentence fragment sequence x₁, x₂, x₃, x₄, x₅, x₆, x₇ represents a character feature sequence. In the model training stage, a dictionary is obtained by encoding the characters of the text samples, the vector features of each character are randomly initialized and input into the above entity recognition network, and the parameters are learned as the network model is trained; when a new sequence is given, the entity tag information is output.

By training on text samples labeled with text entity tags, the entity recognition model learns to recognize each character or character segment in a text automatically, and thereby to recognize text entities in the text. The text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation are each input into the entity recognition model to extract the text entities.

On the basis of the entity recognition model, the embodiment of the present application provides that, when a text entity cannot be extracted from a text with the regular expressions (for example, because the regular expressions are incomplete), the pre-trained entity recognition model is used instead: the text is input into the entity recognition model so that the text entity is recognized from the text.
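The fallback order described above can be sketched as follows. The function `extract_with_fallback` and the `recognize` callable are hypothetical; `recognize` stands in for the pre-trained entity recognition model, which is not implemented here.

```python
import re
from typing import Callable, Dict, List, Tuple

Entity = Tuple[str, str]  # (category, value)


def extract_with_fallback(
    text: str,
    patterns: Dict[str, "re.Pattern"],
    recognize: Callable[[str], List[Entity]],
) -> List[Entity]:
    """Use the regular expressions first; call the model only if they find nothing."""
    found = [
        (category, match.group(1))
        for category, pattern in patterns.items()
        for match in pattern.finditer(text)
    ]
    return found if found else recognize(text)
```

Passing the model as a callable keeps the sketch independent of any particular model framework.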

As a preferred implementation manner, in the embodiment of the present application, the matching degree between each candidate text and the current editing scene is determined as follows:

first, the text entities in each candidate text are determined separately.

Specifically, for each candidate text, the text entity contained therein is determined.

Based on the candidate text screening method introduced in the above embodiment, when the candidate text is determined based on the text entity set extracted from all the texts, the text entities in the candidate text may be directly obtained.

If the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation are all directly used as candidate texts, the text entities can be respectively extracted from each candidate text based on the text entity extraction method described in the above embodiment.

Then, a matching score of each text entity with the current editing scene is calculated.

For example, the matching score of a text entity with the current editing scene may be calculated according to the category of the text entity, the type of the current application, and the type of the current input box.

The more consistent the category of the text entity is with the type of the current application and the type of the current input box, the higher the matching score of the text entity with the current editing scene is, and otherwise, the lower the matching score of the text entity with the current editing scene is.

For example, assume that there are N text entity categories (such as phone number, address, and verification code) and that the scores of M text entities are calculated at a time. If text entity i (i = 0, 1, …, M) belongs to category j (j = 0, 1, …, M), the text entity category feature Cj of text entity i is 1; otherwise, Cj is 0. The current editing scene comprises an app (application) type and an input box type; assume that there are S app types and Q input box types in total. If the current app is of the k-th type, the app type feature Ak is 1; otherwise, Ak is 0. If the current input box is of the l-th type, the input box type feature Xl is 1; otherwise, Xl is 0.

Based on the rules, a score model is established:

The formula can be expressed with the following vectors:

c = (C₀, C₁, C₂, …, C_M)

α = (α₀, α₁, α₂, …, α_M)

a = (A₀, A₁, A₂, …, A_S)

β = (β₀, β₁, β₂, …, β_S)

x = (X₀, X₁, X₂, …, X_Q)

γ = (γ₀, γ₁, γ₂, …, γ_Q)

δ = (δ₀, δ₁, δ₂, …, δ_(M*S*Q))

The score formula is then:

y = ω₀ + cα + aβ + xγ + caxδ

Wherein, ω₀ is an offset; c, a, and x are the text entity category feature vector, the application type feature vector, and the input box type feature vector, respectively; α, β, γ, and δ are parameter vectors. The values in α represent the contribution coefficients of the text entity categories to the matching score, the values in β represent those of the application types, the values in γ represent those of the input box types, and the values in δ represent those of the combinations of the three features. The values of the parameter vectors are calculated by a machine learning model.

The calculation method of the parameter vector is as follows:

First, in the cold start phase, since the values of M and Q are small, α and γ can be set from human experience according to how likely different text entity categories and input box types are to match. Assuming three text entity categories (telephone number, address, and verification code) and three input box types (search box, chat box, and comment box), α may be set to (0.2, 0.5, 0.8) and γ may be set to (0.1, 0.5, 0.3), while β and δ may be set to zero vectors in the early stage. User behaviors are then collected in the cold start phase and used as samples for training an FM model, so as to obtain all the parameter vectors.

After all the parameter vectors are obtained, the matching score of the text entity and the current editing scene can be calculated by using the formula.
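As a worked illustration of the score formula with the cold-start values above, consider the following sketch. Several points are assumptions: the term caxδ is read as the dot product of δ with the flattened three-way product of c, a, and x; the vector positions are taken to follow the order in which the categories and input box types are listed; and two app types are assumed so that β and δ have concrete sizes.

```python
from itertools import product


def one_hot(index, size):
    """Feature vector with 1.0 at the given index and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def match_score(c, a, x, alpha, beta, gamma, delta, omega0=0.0):
    """y = w0 + c.alpha + a.beta + x.gamma + (c (x) a (x) x).delta"""
    # Flattened three-way interaction; entry (j, k, l) sits at (j * len(a) + k) * len(x) + l.
    cross = [cj * ak * xl for cj, ak, xl in product(c, a, x)]
    return omega0 + dot(c, alpha) + dot(a, beta) + dot(x, gamma) + dot(cross, delta)


# Cold-start parameters from the text; beta and delta start as zero vectors.
alpha = [0.2, 0.5, 0.8]      # telephone number, address, verification code (assumed order)
gamma = [0.1, 0.5, 0.3]      # search box, chat box, comment box (assumed order)
beta = [0.0, 0.0]            # two app types assumed for illustration
delta = [0.0] * (3 * 2 * 3)

# A verification code entity, in app type 0, in a chat box.
score = match_score(one_hot(2, 3), one_hot(0, 2), one_hot(1, 3), alpha, beta, gamma, delta)
```

Because the features are one-hot, each dot product simply picks out one coefficient, so in the cold start the score reduces to the sum of the selected α and γ entries.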

And finally, determining the matching degree of the candidate text and the current editing scene based on the matching scores of the text entities contained in the candidate text and the current editing scene.

Illustratively, when only one text entity exists in the candidate text, the matching score of the text entity and the current editing scene is used as the matching degree score of the candidate text and the current editing scene.

When a plurality of text entities exist in the candidate text, a weighted average value or an average value of the matching scores of the text entities in the candidate text and the current editing scene is used as the matching degree score of the candidate text and the current editing scene.
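The aggregation rule for a candidate text can be sketched as follows. The function name `candidate_match_degree` and the optional `weights` argument are hypothetical; how the weights would be chosen is not specified in the text.

```python
def candidate_match_degree(entity_scores, weights=None):
    """Average (or weighted average) of the entity matching scores of one candidate text."""
    if not entity_scores:
        raise ValueError("a candidate text needs at least one text entity")
    if weights is None:
        # Plain average; with a single entity this is just that entity's score.
        return sum(entity_scores) / len(entity_scores)
    if len(weights) != len(entity_scores):
        raise ValueError("one weight per entity score is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(entity_scores, weights)) / total
```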

Based on the matching degree score of each candidate text and the current editing scene, the text content can be selected from the candidate texts as the text input content corresponding to the current input operation.

Illustratively, the embodiment of the application determines the text input content corresponding to the current input operation in the following three ways.

In the embodiment of the present application, the relationship between the matching degree of candidate text samples with the editing scene and the content actually input by the user is studied in advance, so as to determine how the matching degree between a candidate text and the editing scene should influence the determination of the text input content corresponding to the current input operation from the candidate texts.

Through the research, the following conclusions can be reached:

The matching degree scores of the candidate text samples and the editing scenes follow a normal distribution. If X is the matching degree score of a candidate text sample and the editing scene, then:

X ~ N(μ, σ²)

Because the parameters of the normal distribution differ under different sample processing strategies, it is difficult to set a specific score as a decision threshold. The normal distribution is therefore converted into a standard normal distribution. Let the new score variable be Y = (X − μ)/σ; then:

Y ~ N(0, 1). The standard normal distribution curve is shown in FIG. 3.

The standard normal distribution curve may be used to select a manner of determining the text input content corresponding to the current input operation.
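The conversion to a standard score can be sketched as below. It assumes that μ and σ are estimated from a collection of sample matching degree scores, which the text does not spell out; the function name `standardize` is hypothetical.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw matching degree scores X into standard scores Y = (X - mu) / sigma."""
    mu = mean(scores)
    sigma = pstdev(scores, mu)  # population standard deviation of the samples
    if sigma == 0:
        raise ValueError("scores must not all be equal")
    return [(score - mu) / sigma for score in scores]
```

After this step the same thresholds can be applied regardless of the scale of the raw scores.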

First, it is determined whether any of the candidate texts has a matching degree with the current editing scene that is greater than or equal to a set first matching degree threshold.

The first matching degree threshold is a first matching degree score. Referring to FIG. 3, empirical data analysis shows that texts whose matching degree with the current editing scene is 1.29 or more account for 10% of all texts. Candidate texts in this interval match the current editing scene extremely well, so the input text content corresponding to the current input operation can be determined directly from them.

Therefore, if the matching degree of the candidate text and the current editing scene is larger than or equal to the set first matching degree threshold value, the text entity in the candidate text is determined as the input text content corresponding to the current input operation.

And if a plurality of candidate texts with the matching degree with the current editing scene being more than or equal to the set first matching degree threshold value exist, selecting the candidate text with the maximum matching degree with the current editing scene from the candidate texts, and determining a text entity in the candidate text as the input text content corresponding to the current input operation.

And if only one candidate text with the matching degree with the current editing scene being more than or equal to the set first matching degree threshold value exists, but the candidate text contains a plurality of text entities, taking the text entity with the highest matching score with the current editing scene from the plurality of text entities as the input text content corresponding to the current input operation.

For example, assume that a user copies text content including a name, a telephone number, and an address from another page, then clicks the "recipient" input box on a shopping order page, intending to fill the input box by pasting text. According to the technical solution of the embodiment of the present application, it may be determined by calculation that, among the text just copied and the previously copied or cut texts stored in the clipboard, the "name" content in the text just copied has the highest matching score with the current editing scene (greater than or equal to the first matching degree threshold). The "name" content is then directly determined as the input text content corresponding to the current input operation, that is, as the text content that should be filled into the "recipient" input box.

Further, in the above scenario, after the input text content corresponding to the current input operation is determined, the input text content may be directly filled in the input box corresponding to the current input operation, that is, the automatic text input is realized.

For example, on the above online shopping order page, when it is determined that the "name" content in the clipboard is the text content to be input into the "recipient" input box, the "name" content is directly filled into the "recipient" input box, so that the copied text is committed to the screen automatically without the user performing a paste operation.

It can be understood that this processing extracts the text entity corresponding to the current input operation directly from the candidate text and automatically commits it to the input box. In particular, when the user has copied text and intends to paste it into an input box, the text content that fits the current input box is extracted from the clipboard content and committed to the input box automatically; the user neither has to perform a paste operation nor has to pick the needed content out of the copied text. The target text is thus automatically screened from the clipboard and committed to the input box.

In addition, the process comprehensively analyzes the text corresponding to the current input operation and the text corresponding to the historical input operation, and selects the text content which is most consistent with the current input operation from the texts as the current input text content. Therefore, the processing mode realizes that the content of the input text is automatically determined according to the text corresponding to the input operation and the text corresponding to the historical input operation, can realize automatic text input, and has higher automation degree of text input.

And if no candidate text with the matching degree with the current editing scene being greater than or equal to the set first matching degree threshold exists in the candidate texts corresponding to the current input operation, further determining whether a candidate text with the matching degree with the current editing scene being greater than or equal to the set second matching degree threshold and smaller than the first matching degree threshold exists in each candidate text.

Wherein the second match threshold represents a second match score that is less than the first match score represented by the first match threshold.

Referring to FIG. 3, empirical data analysis shows that texts whose matching degree with the current editing scene is greater than or equal to 0.84 and less than 1.29 account for 10% of all texts. Candidate texts in this interval match the current editing scene well, so input text contents can be offered from these candidate texts for the user to select, which simplifies the user's input operation.

Therefore, if a candidate text whose matching degree with the current editing scene is greater than or equal to the set second matching degree threshold and less than the first matching degree threshold exists among the candidate texts corresponding to the current input operation, the text entities are extracted from that candidate text as the input text content corresponding to the current input operation.
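The two threshold values can be checked against the standard normal distribution with a short sketch. It assumes the standardized scores follow N(0, 1) exactly; the variable names are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Share of scores at or above the first threshold (1.29).
share_first = 1.0 - standard_normal.cdf(1.29)

# Share of scores between the second threshold (0.84) and the first (1.29).
share_second = standard_normal.cdf(1.29) - standard_normal.cdf(0.84)
```

Both shares come out at roughly one tenth, in line with the 10% figures given for the two intervals.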

For example, assume that a user copies text content including a name, a telephone number, and an address from another page, then clicks the "recipient" input box on a shopping order page, intending to fill the input box by pasting text. According to the technical solution of the embodiment of the present application, it may be determined that, among the text just copied and the previously copied or cut texts stored in the clipboard, the text just copied has a high matching score with the current editing scene (greater than or equal to the second matching degree threshold and less than the first matching degree threshold). The "name" content, the "telephone" content, and the "address" content in the text just copied are then all determined as input text content corresponding to the current input operation.

In this case, however, the determined input text content is not filled into the input box directly. Instead, it is output for the user to select, and when the user clicks one or more items of the output input text content, the selected content is committed to the input box.

For example, the "name" content, the "telephone" content, and the "address" content copied by the user are displayed in separate areas of the online shopping order page. When the user clicks one of them, for example the "name" content, that content is filled into the "recipient" text box.

Therefore, the processing realizes the function of prompting the content of the input text, when the user executes the input operation, the content of the input text corresponding to the current input operation can be determined from the text corresponding to the current input operation and the text corresponding to the historical input operation according to the input operation of the user, and the content is prompted to the user, so that the user can complete the complete text input through a selection mode after executing part of the input operation, thereby improving the input efficiency of the user and simplifying the input operation of the user.

The processing process mainly introduces that in the text copying and pasting operation, through the technical scheme of the embodiment of the application, the text entity is automatically screened out from the content copied or cut at this time and the historical content of the clipboard as the content of the text input at this time, or is provided for the user to select. In fact, after the user performs the text copying or cutting operation, a complete text entry may be selected from all text entries of the clipboard as the text content to be input this time, or a text entry suitable for being the text content to be input this time may be selected for the user to select.

Further, if no candidate text corresponding to the current input operation has a matching degree with the current editing scene that is greater than or equal to the set second matching degree threshold, then, as shown in FIG. 3, the matching degrees of all the candidate texts fall in the interval below 0.84. These candidate texts match the current editing scene poorly and may not contain the text content that the user really wants to input. In this case, the embodiment of the present application sorts the candidate texts corresponding to the current input operation in descending order of their matching degree with the current editing scene to obtain a candidate text sequence, and outputs the candidate text sequence so that the user can select a candidate text from it as the current input text content.
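The three ways of determining the input text content can be summarized in one sketch. The `Candidate` structure, the `decide` function, and the returned action names are hypothetical; the thresholds are the values 1.29 and 0.84 read from FIG. 3, and candidates are assumed to carry already standardized matching degrees.

```python
from dataclasses import dataclass
from typing import List, Tuple

FIRST_THRESHOLD = 1.29   # at or above: fill the input box automatically
SECOND_THRESHOLD = 0.84  # at or above (but below the first): prompt the user


@dataclass
class Candidate:
    text: str
    degree: float                      # standardized matching degree of the text
    entities: List[Tuple[str, float]]  # (entity text, entity matching score)


def decide(candidates: List[Candidate]) -> Tuple[str, List[str]]:
    top = [c for c in candidates if c.degree >= FIRST_THRESHOLD]
    if top:
        # Best candidate, then its best-scoring entity (whole text if none is listed).
        best = max(top, key=lambda c: c.degree)
        if best.entities:
            return ("auto_fill", [max(best.entities, key=lambda e: e[1])[0]])
        return ("auto_fill", [best.text])
    mid = [c for c in candidates if SECOND_THRESHOLD <= c.degree < FIRST_THRESHOLD]
    if mid:
        # Offer every entity of the well-matching candidates for selection.
        return ("prompt_entities", [e[0] for c in mid for e in c.entities])
    # Otherwise output all candidates, best match first.
    ranked = sorted(candidates, key=lambda c: c.degree, reverse=True)
    return ("ranked_list", [c.text for c in ranked])
```

Checking the tiers from the highest threshold downward mirrors the order in which the conditions are introduced above.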

For example, assume that a user, while viewing video content, copies text from another page in order to paste it into a bullet screen input box. The embodiment of the present application can select the candidate texts corresponding to the current input operation from the currently copied text and the historically copied or cut texts stored in the clipboard.

According to a conventional text input scheme, only candidate texts can be output and displayed, at the moment, a user is further required to judge which candidate texts are more matched with the current input, and then the candidate texts can be selected as the input content. According to the method and the device, the matching degree of each candidate text and the current editing scene is automatically calculated, the candidate texts are sequenced according to the sequence from high to low of the matching degree of the candidate texts and the current editing scene to obtain the candidate text sequence, and then the candidate text sequence is output.

For example, referring to fig. 4, it is assumed that the text of "front high energy" stored in the clipboard in the history copy or cut operation has the highest matching degree with the current editing scene, but the text of "front high energy" is not displayed at the head in the conventional input scheme, but the text of "front high energy" can be displayed at the head after the sorting processing, so that the user can confirm that the text is the text most suitable for the current bullet screen input, and therefore the text of "front high energy" can be directly selected from the head as the bullet screen input content.

As can be seen from the above description, the text input method provided in the embodiment of the present application can comprehensively analyze the text corresponding to the current input and the text corresponding to the historical input operation matched with the current input, determine the candidate text corresponding to the current input operation from the text, calculate the matching degree between each candidate text and the current editing scene, and determine the input text content corresponding to the current input operation from the candidate texts in different manners according to the matching degree between each candidate text and the current editing scene. The processing scheme can automatically provide input reference for the user or automatically replace the user to execute text input operation, so that the complexity of the input operation of the user can be obviously reduced, and the input efficiency is improved.

An embodiment of the present application further provides a text input device, as shown in fig. 5, the text input device includes:

a text preselection unit 100, configured to acquire a candidate text corresponding to a current input operation; the candidate texts at least comprise texts corresponding to the current input operation and texts corresponding to historical input operations matched with the current input operation;

the text analysis unit 110 is configured to determine matching degrees of each candidate text and the current editing scene respectively;

an input text determining unit 120, configured to determine, according to a matching degree of each candidate text with the current editing scene, an input text content corresponding to the current input operation from the candidate texts.

The text input device provided by the embodiment of the application takes the text corresponding to the current input operation and the text corresponding to the historical input operation matched with the current input operation as candidate texts corresponding to the current input operation. Then, the matching degree of each candidate text and the current editing scene is respectively determined, and the input text content corresponding to the current input operation is determined from the candidate texts according to the matching degree of each candidate text and the current editing scene. Through the processing process, the input text content corresponding to the current input operation can be automatically and directly determined based on the current input operation of the user, so that the input operation of the user is obviously simplified, and the input efficiency of the user is improved.

Alternatively,

respectively determining a text entity in each candidate text;

Optionally, when the matching degree between the candidate text and the current editing scene is less than the set first matching degree threshold, the input text determining unit is further configured to:

Optionally, when the matching degree between the candidate text and the current editing scene is less than the set second matching degree threshold, the input text determining unit is further configured to:

and outputting the candidate text sequence.

For the specific working contents of each unit of the text input device, please refer to the above method embodiments; they are not described herein again.

Another embodiment of the present application further provides a text input device, as shown in fig. 6, the text input device including:

a memory 200 and a processor 210;

wherein, the memory 200 is connected to the processor 210 for storing programs;

the processor 210 is configured to implement the processing steps of the text input method disclosed in any of the above embodiments by running the program stored in the memory 200.

Specifically, the text input device may further include: a bus, a communication interface 220, an input device 230, and an output device 240.

The processor 210, the memory 200, the communication interface 220, the input device 230, and the output device 240 are connected to each other through a bus. Wherein:

a bus may include a path that transfers information between components of a computer system.

The processor 210 may be a general-purpose processor, such as a general-purpose central processing unit (CPU) or a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs in accordance with the present invention. It may also be a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components.

The processor 210 may include a main processor and may also include a baseband chip, modem, and the like.

The memory 200 stores programs for executing the technical solution of the present invention, and may also store an operating system and other key services. In particular, the program may include program code including computer operating instructions. More specifically, the memory 200 may include a read-only memory (ROM), other types of static storage devices that may store static information and instructions, a random access memory (RAM), other types of dynamic storage devices that may store information and instructions, a disk storage, a flash memory, and so forth.

The input device 230 may include a means for receiving data and information input by a user, such as a keyboard, mouse, camera, scanner, light pen, voice input device, touch screen, pedometer, or gravity sensor, among others.

Output device 240 may include equipment that allows output of information to a user, such as a display screen, a printer, speakers, and the like.

Communication interface 220 may include any device that uses any transceiver or the like to communicate with other devices or communication networks, such as an ethernet network, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), etc.

The processor 210 executes the programs stored in the memory 200 and invokes the other devices, so as to implement the steps of the text input method provided by the embodiments of the present application.

Another embodiment of the present application further provides a storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the text input method provided in any of the above embodiments.

Specifically, the specific working contents of each part of the text input device and the specific processing contents of the computer program on the storage medium when being executed by the processor can refer to the contents of each embodiment of the text input method, which are not described herein again.

While, for purposes of simplicity of explanation, the foregoing method embodiments have been described as a series of acts or a combination of acts, it will be appreciated by those skilled in the art that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently with other steps in accordance with the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.

It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

The steps in the method of each embodiment of the present application may be sequentially adjusted, combined, and deleted according to actual needs, and technical features described in each embodiment may be replaced or combined.

The modules and sub-modules in the device and the terminal in the embodiments of the application can be combined, divided and deleted according to actual needs.

In the several embodiments provided in the present application, it should be understood that the disclosed terminal, apparatus and method may be implemented in other manners. For example, the above-described terminal embodiments are merely illustrative, and for example, the division of a module or a sub-module is only one logical division, and there may be other divisions when the terminal is actually implemented, for example, a plurality of sub-modules or modules may be combined or integrated into another module, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.

The modules or sub-modules described as separate parts may or may not be physically separate, and parts that are modules or sub-modules may or may not be physical modules or sub-modules, may be located in one place, or may be distributed over a plurality of network modules or sub-modules. Some or all of the modules or sub-modules can be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, each functional module or sub-module in the embodiments of the present application may be integrated into one processing module, or each module or sub-module may exist alone physically, or two or more modules or sub-modules may be integrated into one module. The integrated modules or sub-modules may be implemented in the form of hardware, or may be implemented in the form of software functional modules or sub-modules.

Those of skill in the art would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software unit executed by a processor, or in a combination of the two. The software unit may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A text input method, comprising:

2. The method of claim 1, wherein obtaining the candidate text corresponding to the current input operation comprises:

3. The method according to claim 2, wherein the extracting text entities from the text corresponding to the current input operation and the text corresponding to the historical input operation matching the current input operation respectively comprises:

or,

4. The method of claim 3, wherein when a text entity cannot be extracted from a text using a regular expression, the text entity is identified from the text using a pre-trained entity identification model.

5. The method according to claim 1 or 2, wherein the determining the matching degree of each candidate text with the current editing scene respectively comprises:

respectively determining a text entity in each candidate text;

6. The method of claim 5, wherein calculating a matching score for each text entity with the current editing scenario comprises:

7. The method according to claim 2, wherein the determining the input text content corresponding to the current input operation from the candidate texts according to the matching degree of each candidate text with the current editing scene comprises:

8. The method according to claim 7, wherein when there are a plurality of candidate texts whose matching degree with the current editing scene is greater than or equal to a set first matching degree threshold, text entities are respectively extracted from the plurality of candidate texts, and a text entity with the largest matching degree with the current editing scene is selected from the extracted text entities as the input text content corresponding to the current input operation.

9. The method according to claim 7, wherein when the matching degree of the candidate text and the current editing scene is less than the set first matching degree threshold, the method further comprises:

10. The method according to claim 9, wherein when the matching degree of the candidate text with the current editing scene is not greater than or equal to the set second matching degree threshold and is less than the first matching degree threshold, the method further comprises:

and outputting the candidate text sequence.

11. The method according to claim 1, wherein the current input operation is an operation of clicking an entry box after a text copy or cut operation is performed;

12. A text input device, comprising:

13. A text input device, comprising:

a memory and a processor;

the memory is connected with the processor and used for storing programs;

the processor is configured to implement the text input method according to any one of claims 1 to 11 by executing the program in the memory.

14. A storage medium, having stored thereon a computer program which, when executed by a processor, implements a text input method as claimed in any one of claims 1 to 11.
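
For illustration only, the entity extraction of claim 4 and the selection of claim 8 can be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of the application: the regular expressions, the names `extract_entity`, `select_entity` and `toy_email_field_score`, the fallback model interface, and all score and threshold values are hypothetical. Cases in which fewer than two candidate texts reach the first matching degree threshold are handled by other claims and are not sketched here.

```python
import re
from typing import Callable, List, Optional

# Hypothetical entity patterns; the application does not specify concrete ones.
PHONE = re.compile(r"(?<!\d)1\d{10}(?!\d)")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
ENTITY_PATTERNS = [PHONE, EMAIL]

Model = Callable[[str], Optional[str]]
Scorer = Callable[[str], float]


def extract_entity(text: str, fallback_model: Optional[Model] = None) -> Optional[str]:
    """Try regular expressions first; if none matches, ask the fallback model."""
    for pattern in ENTITY_PATTERNS:
        match = pattern.search(text)
        if match:
            return match.group(0)
    # Stand-in for a pre-trained entity identification model (assumed interface).
    if fallback_model is not None:
        return fallback_model(text)
    return None


def select_entity(
    candidates: List[str],
    score: Scorer,
    first_threshold: float,
    fallback_model: Optional[Model] = None,
) -> Optional[str]:
    """When several candidates reach the first threshold, return the extracted
    entity with the largest matching degree; otherwise return None."""
    qualifying = [c for c in candidates if score(c) >= first_threshold]
    if len(qualifying) < 2:
        return None  # single-candidate and below-threshold cases are not sketched
    entities = [extract_entity(c, fallback_model) for c in qualifying]
    entities = [e for e in entities if e is not None]
    if not entities:
        return None
    return max(entities, key=score)


def toy_email_field_score(text: str) -> float:
    """Toy matching degree for an editing scene assumed to be an e-mail field."""
    if EMAIL.fullmatch(text):
        return 1.0
    if EMAIL.search(text):
        return 0.7
    if PHONE.search(text):
        return 0.6
    return 0.1


if __name__ == "__main__":
    texts = ["reach me at bob@example.com", "or call 13800001111", "see you"]
    print(select_entity(texts, toy_email_field_score, first_threshold=0.5))
```

In this sketch the same scoring function is applied to whole candidate texts and to extracted entities, which is a simplification; the matching degree of a text entity with the editing scene could equally be computed by a separate function.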