CN110244861B - Data processing method and device - Google Patents


Info

Publication number
CN110244861B
Authority
CN
China
Prior art keywords: content, screen content, screen, type, input
Prior art date
Legal status (an assumption, not a legal conclusion; Google has not performed a legal analysis): Active
Application number
CN201810196490.5A
Other languages
Chinese (zh)
Other versions
CN110244861A (en)
Inventor
左艳波
Current Assignee
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201810196490.5A
Publication of CN110244861A
Application granted
Publication of CN110244861B
Legal status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/02 - Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F 3/023 - Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F 3/0233 - Character input methods
    • G06F 3/0237 - Character input methods using prediction or retrieval techniques
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the invention provide a data processing method and device. The method comprises: acquiring first on-screen content and second on-screen content from continuous on-screen content, wherein the content types of the first on-screen content and the second on-screen content are different, and the two contents are adjacent; and establishing a multi-element relation between the first on-screen content and the second on-screen content. Because adjacent on-screen contents of different content types are extracted from the continuous on-screen content and linked by a multi-element relation, when a user later needs to input combined content that mixes content types, candidates corresponding to the combined content can be provided according to the established multi-element relation between the first on-screen content and the second on-screen content, improving the user's input efficiency.

Description

Data processing method and device
Technical Field
The present invention relates to the field of input methods, and in particular, to a data processing method and apparatus.
Background
Currently, interactive devices typically require the user to convey an operational intent through an input method program. For example, the user enters an input string, the input method program converts the string into candidates in the corresponding language according to its preset mapping rules and displays them, and the candidate selected by the user is then committed to the screen.
Currently, an input method program can obtain candidates corresponding to an input string through binary relations, so as to improve candidate accuracy. A binary relation is a collocation relation between two words; for example, "weather" may form binary relations with collocates such as "very hot" or "clear". Binary relations can support an association function or an intelligent word-forming function. Association searches the binary library for relations matching the context the user has already input, and returns the matching collocates as association candidates. Intelligent word forming searches the binary library, computes the path probability of each candidate segmentation of the input string according to which binary relations are hit, and returns the segmentation with the highest path probability to the user as the preferred candidate.
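The association function described above can be sketched as a simple bigram lookup. This is an illustration only: the `BIGRAMS` table and the `associate` function are hypothetical names, and the patent does not specify any implementation.

```python
# Illustrative bigram ("binary relation") library: each entry maps a
# preceding word to the collocates it is known to pair with.
BIGRAMS = {
    "weather": ["very hot", "clear"],
    "birthday": ["cake", "party"],
}

def associate(context_word):
    """Return association candidates for the word just committed to the
    screen: look the context up in the bigram library and return its
    collocates, mirroring the association function described above."""
    return BIGRAMS.get(context_word, [])
```

For example, after the user commits "weather", `associate("weather")` would surface "very hot" and "clear" as association candidates.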
Currently, the vocabulary covered by binary relations generally includes only Chinese words. However, a user may need to mix Chinese words with characters of other types, such as numbers, letters, or combinations of numbers and letters; for example, the user may want to input "36 carats", "12 zodiac", "72 transformations", "92 oil price", and the like. In such cases, the input method program cannot provide the desired content on the basis of binary relations alone.
Disclosure of Invention
Embodiments of the invention provide a data processing method and a data processing device. When a user needs to input combined content that mixes different content types, candidates corresponding to the combined content can be provided according to the established multi-element relation between first on-screen content and second on-screen content, thereby improving the user's input efficiency.
In order to solve the above problems, an embodiment of the present invention discloses a data processing method, including:
acquiring first on-screen content and second on-screen content from continuous on-screen content, wherein the content types corresponding to the first on-screen content and the second on-screen content are different, and the first on-screen content and the second on-screen content are adjacent;
and establishing a multi-element relation between the first on-screen content and the second on-screen content.
Optionally, the content length corresponding to the first screen content does not exceed a first length threshold, and the content length corresponding to the second screen content does not exceed a second length threshold.
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content includes:
acquiring target screen contents which do not contain preset segmentation symbols from continuous screen contents;
and dividing the target on-screen content according to the content type corresponding to the content contained in the target on-screen content to obtain a first on-screen content and a second on-screen content contained in the target on-screen content.
Optionally, the dividing the target on-screen content according to the content type corresponding to the content included in the target on-screen content includes:
according to the content types corresponding to the content contained in the target on-screen content, acquiring first on-screen content, second on-screen content and third on-screen content with different content types from the target on-screen content;
if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a first merging condition, merging the first on-screen content and the second on-screen content as the first on-screen content, and taking the third on-screen content as the second on-screen content; or alternatively
if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a second merging condition, merging the second on-screen content and the third on-screen content as the second on-screen content.
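The type-based segmentation and merging steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the type taxonomy in `char_type` and the merge pairs passed to `merge_adjacent` are our own choices, since the disclosure leaves both the content types and the merging conditions configurable.

```python
def char_type(ch):
    """Classify a character into a coarse content type (assumed taxonomy
    of 'chinese', 'digit', 'letter', 'symbol', loosely mirroring the
    types listed in this disclosure)."""
    if "\u4e00" <= ch <= "\u9fff":   # CJK Unified Ideographs block
        return "chinese"
    if ch.isdigit():
        return "digit"
    if ch.isalpha():
        return "letter"
    return "symbol"

def split_by_type(text):
    """Split committed on-screen text into maximal runs of one content
    type, e.g. '12生肖' -> [('digit', '12'), ('chinese', '生肖')]."""
    runs = []
    for ch in text:
        t = char_type(ch)
        if runs and runs[-1][0] == t:
            runs[-1] = (t, runs[-1][1] + ch)   # extend the current run
        else:
            runs.append((t, ch))               # start a new run
    return runs

def merge_adjacent(runs, merge_pairs):
    """If two adjacent runs' types form a pair in merge_pairs (the
    'merging condition'), fuse them into one segment tagged with the
    first run's type."""
    out = []
    for t, s in runs:
        if out and (out[-1][0], t) in merge_pairs:
            out[-1] = (out[-1][0], out[-1][1] + s)
        else:
            out.append((t, s))
    return out
```

With a merge condition such as `{("digit", "letter")}`, a mixed run like "92x" would be kept as a single first on-screen content rather than split into two segments.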
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content includes:
judging whether a segmentation operation triggered by a user is received or not in the process of receiving the on-screen content of the user so as to obtain a corresponding first judgment result;
judging whether the content types corresponding to two adjacent language units in the on-screen content are consistent or not in the process of receiving the on-screen content of the user so as to obtain a corresponding second judging result; the language unit includes: characters, or individual words, or words;
if the first judgment result is negative and a negative second judgment result occurs for the first time for a pair of adjacent language units, extracting the first on-screen content, located at the front, from the on-screen content according to that first-occurring pair of adjacent language units with inconsistent content types;
and if the first judgment result is negative and a negative second judgment result occurs for the second time, extracting the second on-screen content, located after the first on-screen content, from the on-screen content according to the second-occurring pair of adjacent language units with inconsistent content types.
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content further includes:
if the first judgment result is no and the second judgment result corresponding to two adjacent language units appears for the third time is no, extracting third on-screen content positioned behind the second on-screen content from the on-screen content according to two adjacent language units with inconsistent content types appearing for the third time;
if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a first preset condition, combining the second on-screen content and the third on-screen content, and taking the combined content as second on-screen content; or alternatively
If the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a second preset condition, merging the first on-screen content and the second on-screen content, taking the merged content as the first on-screen content, and taking the third on-screen content as the second on-screen content.
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content further includes:
If the content length corresponding to the first screen content does not exceed a first length threshold, reserving the first screen content; or alternatively
And discarding the first screen content if the content length corresponding to the first screen content exceeds a first length threshold.
Optionally, the extracting the second on-screen content located after the first on-screen content from the on-screen content includes:
and extracting the content which is positioned behind the first on-screen content and has the content length not exceeding a second length threshold value from the on-screen content according to two adjacent language units with inconsistent content types appearing for the second time, and taking the content as second on-screen content.
Optionally, the extracting the third on-screen content located after the second on-screen content from the on-screen content includes:
and extracting the content which is positioned behind the second on-screen content and has the content length not exceeding a third length threshold value from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the third time, and taking the content as third on-screen content.
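The boundary-based extraction with length thresholds described above can be sketched in one function. This is an illustrative reading, not the patent's implementation: the type taxonomy inside `ctype` and the default threshold values `first_max` and `second_max` are assumptions, since the disclosure leaves the thresholds unspecified.

```python
def extract_segments(text, first_max=4, second_max=4):
    """Find the points where the content type changes in committed text,
    cut the text there, and apply the length thresholds: discard the
    result if the first segment is over-long, and cap the second
    segment's length. Returns (first, second) or None."""
    def ctype(ch):
        if "\u4e00" <= ch <= "\u9fff":
            return "chinese"
        return "digit" if ch.isdigit() else "other"

    # Positions where two adjacent language units have inconsistent types.
    boundaries = [i for i in range(1, len(text))
                  if ctype(text[i]) != ctype(text[i - 1])]
    if not boundaries:
        return None                      # no type change: nothing to extract
    first = text[:boundaries[0]]
    if len(first) > first_max:
        return None                      # discard over-long first content
    end = boundaries[1] if len(boundaries) > 1 else len(text)
    second = text[boundaries[0]:end][:second_max]  # cap second content
    return first, second
```

For example, the committed text "12生肖" has a single type change after "12", so the pair ("12", "生肖") would be extracted and could then be stored as a multi-element relation.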
Optionally, the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content include: at least two of a Chinese type, a letter type, a number type, a symbol type, an English type, and a picture type.
In yet another aspect, an embodiment of the present invention discloses a data processing method, including:
receiving input content of a user; the input content includes: on screen content and/or input strings;
determining candidate items corresponding to the input content according to a multi-element relation; the multi-element relation comprises language units with different content types;
and outputting the candidate items corresponding to the input content.
Optionally, the determining the candidate items corresponding to the input content according to the multi-element relation includes:
determining a target multi-element relation matched with the on-screen content from at least one multi-element relation, and determining candidate items corresponding to the on-screen content according to the target multi-element relation; or alternatively
determining a target multi-element relation matched with the on-screen content from at least one multi-element relation, and determining a language unit matched with the input string from the target multi-element relation as a candidate item corresponding to the input content; or alternatively
searching the multi-element relations according to the input string, or according to word candidates corresponding to the input string, to obtain candidates corresponding to the input string.
Optionally, a preset segmentation symbol does not exist between the on-screen content and the input string; and/or
And the content length corresponding to the on-screen content does not exceed a length threshold.
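The candidate-determination options above can be sketched as a lookup against stored multi-element relations. The `RELATIONS` store and the `candidates` function below are hypothetical illustrations; the patent does not prescribe a data structure, and the entries shown are examples of our own.

```python
# Hypothetical store of multi-element relations learned from committed
# input: each entry pairs a first on-screen content with the second
# on-screen content that followed it (and their differing types).
RELATIONS = [
    ("12", "zodiac"),
    ("72", "transformations"),
    ("92", "oil price"),
]

def candidates(on_screen, input_prefix=""):
    """Return candidates for the current input: match the committed
    on-screen content against the first element of each relation, then
    keep the second elements that also match the raw input string (an
    empty prefix yields pure association candidates)."""
    return [second for first, second in RELATIONS
            if first == on_screen and second.startswith(input_prefix)]
```

For example, after the user commits "12" with no further input, the relation yields "zodiac" as an association candidate; typing a prefix narrows the matches further.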
In still another aspect, an embodiment of the present invention discloses a data processing apparatus, including:
the acquisition module is used for acquiring the first on-screen content and the second on-screen content from the continuous on-screen content; the content types corresponding to the first screen content and the second screen content are different, and the first screen content and the second screen content are adjacent;
and the relation establishing module is used for establishing a multi-element relation between the first on-screen content and the second on-screen content.
Optionally, the content length corresponding to the first screen content does not exceed a first length threshold, and the content length corresponding to the second screen content does not exceed a second length threshold.
Optionally, the acquiring module includes:
the acquisition sub-module is used for acquiring target on-screen contents which do not contain preset segmentation symbols from the continuous on-screen contents;
and the segmentation sub-module is used for segmenting the target on-screen content according to the content type corresponding to the content contained in the target on-screen content so as to obtain a first on-screen content and a second on-screen content contained in the target on-screen content.
Optionally, the segmentation submodule includes:
the acquisition unit is used for acquiring first screen contents, second screen contents and third screen contents with different content types from the target screen contents according to the content types corresponding to the contents contained in the target screen contents;
the first merging unit is configured to merge the first on-screen content and the second on-screen content as first on-screen content and use the third on-screen content as second on-screen content if a content type corresponding to the first on-screen content and a content type corresponding to the second on-screen content meet a first merging condition; or alternatively
And the second merging unit is used for merging the second on-screen content and the third on-screen content as second on-screen content if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a second merging condition.
Optionally, the acquiring module includes:
the first judging sub-module is used for judging whether the segmentation operation triggered by the user is received or not in the process of receiving the on-screen content of the user so as to obtain a corresponding first judging result;
The second judging sub-module is used for judging whether the content types corresponding to two adjacent language units in the on-screen content are consistent or not in the process of receiving the on-screen content of the user so as to obtain a corresponding second judging result; the language unit includes: characters, or individual words, or words;
the first extraction sub-module is used for extracting first on-screen content positioned at the front part from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the first time if the first judgment result is NO and the second judgment result corresponding to the two adjacent language units appears for the first time is NO;
and the second extraction sub-module is used for extracting second on-screen content positioned behind the first on-screen content from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the second time if the first judgment result is NO and the second judgment result corresponding to the two adjacent language units appears for the second time is NO.
Optionally, the acquiring module further includes:
a third extraction sub-module, configured to extract, from the on-screen content, third on-screen content located after the second on-screen content according to the two adjacent language units whose content types are inconsistent in the third occurrence if the first determination result is no and the second determination result corresponding to the two adjacent language units appears in the third occurrence is no;
The first merging sub-module is used for merging the second on-screen content and the third on-screen content if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a first preset condition, and taking the merged content as second on-screen content; or alternatively
And the second merging sub-module is used for merging the first on-screen content and the second on-screen content if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a second preset condition, taking the merged content as the first on-screen content and the third on-screen content as the second on-screen content.
Optionally, the acquiring module further includes:
a content reservation sub-module, configured to reserve the first on-screen content if a content length corresponding to the first on-screen content does not exceed a first length threshold; or alternatively
And the content discarding sub-module is used for discarding the first on-screen content if the content length corresponding to the first on-screen content exceeds a first length threshold value.
Optionally, the second extraction submodule includes:
and the extraction unit is used for extracting the content which is positioned behind the first on-screen content and has the content length not exceeding a second length threshold value from the on-screen content according to two adjacent language units with inconsistent content types appearing for the second time, and the content is used as second on-screen content.
Optionally, the third extraction submodule includes:
and the extraction unit is used for extracting the content which is positioned behind the second on-screen content and has the content length not exceeding a third length threshold value from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the third time, and the content is used as third on-screen content.
Optionally, the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content include: at least two of a Chinese type, a letter type, a number type, a symbol type, an English type, and a picture type.
In yet another aspect, an embodiment of the present invention discloses a data processing apparatus, including:
the receiving module is used for receiving the input content of the user; the input content includes: on screen content and/or input strings;
the determining module is used for determining candidate items corresponding to the input content according to a multi-element relation; the multi-element relation comprises language units with different content types;
and the output module is used for outputting the candidate items corresponding to the input content.
Optionally, the determining module includes:
the first determining sub-module is used for determining a target multi-element relation matched with the on-screen content from at least one multi-element relation, and determining candidates corresponding to the on-screen content according to the target multi-element relation; or alternatively
the second determining sub-module is used for determining a target multi-element relation matched with the on-screen content from at least one multi-element relation, and determining a language unit matched with the input string from the target multi-element relation as a candidate item corresponding to the input content; or alternatively
the searching sub-module is used for searching the multi-element relations according to the input string, or according to word candidates corresponding to the input string, to obtain candidates corresponding to the input string.
Optionally, a preset segmentation symbol does not exist between the on-screen content and the input string; and/or
And the content length corresponding to the on-screen content does not exceed a length threshold.
In yet another aspect, an embodiment of the present invention discloses an apparatus for data processing, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring a first screen content and a second screen content from the continuous screen content; the content types corresponding to the first screen content and the second screen content are different, and the first screen content and the second screen content are adjacent;
And establishing a multi-element relation between the first on-screen content and the second on-screen content.
Optionally, the content length corresponding to the first screen content does not exceed a first length threshold, and the content length corresponding to the second screen content does not exceed a second length threshold.
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content includes:
acquiring target screen contents which do not contain preset segmentation symbols from continuous screen contents;
and dividing the target on-screen content according to the content type corresponding to the content contained in the target on-screen content to obtain a first on-screen content and a second on-screen content contained in the target on-screen content.
Optionally, the dividing the target on-screen content according to the content type corresponding to the content included in the target on-screen content includes:
according to the content types corresponding to the content contained in the target on-screen content, acquiring first on-screen content, second on-screen content and third on-screen content with different content types from the target on-screen content;
if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a first merging condition, merging the first on-screen content and the second on-screen content as the first on-screen content, and taking the third on-screen content as the second on-screen content; or alternatively
if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a second merging condition, merging the second on-screen content and the third on-screen content as the second on-screen content.
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content includes:
judging whether a segmentation operation triggered by a user is received or not in the process of receiving the on-screen content of the user so as to obtain a corresponding first judgment result;
judging whether the content types corresponding to two adjacent language units in the on-screen content are consistent or not in the process of receiving the on-screen content of the user so as to obtain a corresponding second judging result; the language unit includes: characters, or individual words, or words;
if the first judgment result is negative and a negative second judgment result occurs for the first time for a pair of adjacent language units, extracting the first on-screen content, located at the front, from the on-screen content according to that first-occurring pair of adjacent language units with inconsistent content types;
and if the first judgment result is negative and a negative second judgment result occurs for the second time, extracting the second on-screen content, located after the first on-screen content, from the on-screen content according to the second-occurring pair of adjacent language units with inconsistent content types.
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content further includes:
if the first judgment result is no and the second judgment result corresponding to the two adjacent language units appears for the third time is no, extracting third on-screen content positioned behind the second on-screen content from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the third time;
if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a first preset condition, combining the second on-screen content and the third on-screen content, and taking the combined content as second on-screen content; or alternatively
If the content type corresponding to the first screen content and the content type corresponding to the second screen content meet a second preset condition, combining the first screen content and the second screen content, taking the combined content as the first screen content, and taking the third screen content as the second screen content.
Optionally, the acquiring the first on-screen content and the second on-screen content from the continuous on-screen content further includes:
If the content length corresponding to the first screen content does not exceed a first length threshold, reserving the first screen content; or alternatively
And discarding the first screen content if the content length corresponding to the first screen content exceeds a first length threshold.
Optionally, the extracting the second on-screen content located after the first on-screen content from the on-screen content includes:
and extracting the content which is positioned behind the first on-screen content and has the content length not exceeding a second length threshold value from the on-screen content according to two adjacent language units with inconsistent content types appearing for the second time, and taking the content as second on-screen content.
Optionally, the extracting the third on-screen content located after the second on-screen content from the on-screen content includes:
and extracting the content which is positioned behind the second on-screen content and has the content length not exceeding a third length threshold value from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the third time, and taking the content as third on-screen content.
Optionally, the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content include: at least two of a Chinese type, a letter type, a number type, a symbol type, an English type, and a picture type.
In yet another aspect, embodiments of the invention disclose a machine-readable medium having instructions stored thereon that, when executed by one or more processors, cause an apparatus to perform a data processing method as described in one or more of the embodiments above.
In yet another aspect, an embodiment of the present invention discloses an apparatus for data processing, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
receiving input content of a user; the input content includes: on screen content and/or input strings;
determining candidate items corresponding to the input content according to a multi-element relation; the multi-element relation comprises language units with different content types;
and outputting the candidate items corresponding to the input content.
Optionally, the determining the candidate items corresponding to the input content according to the multi-element relation includes:
determining a target multi-element relation matched with the on-screen content from at least one multi-element relation, and determining candidate items corresponding to the on-screen content according to the target multi-element relation; or alternatively
determining a target multi-element relation matched with the on-screen content from at least one multi-element relation, and determining a language unit matched with the input string from the target multi-element relation as a candidate item corresponding to the input content; or alternatively
searching the multi-element relations according to the input string, or according to word candidates corresponding to the input string, to obtain candidates corresponding to the input string.
Optionally, no preset segmentation symbol exists between the on-screen content and the input string; and/or
the content length corresponding to the on-screen content does not exceed a length threshold.
The embodiment of the invention has the following advantages:
acquiring, from continuous on-screen content, adjacent first on-screen content and second on-screen content with different content types, and establishing a multivariate relation between the first on-screen content and the second on-screen content; in this way, when the user needs to input combined content of different content types, candidate items corresponding to the combined content can be provided to the user according to the established multivariate relation between the first on-screen content and the second on-screen content, improving the user's input efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of steps of an embodiment of a data processing method of the present invention;
FIG. 2 is a flow chart of steps of an embodiment of a data processing method of the present invention;
FIG. 3 is a flow chart of steps of an embodiment of a data processing method of the present invention;
FIG. 4 is a flow chart of steps of an embodiment of a data processing method of the present invention;
FIG. 5 is a flow chart of steps of an embodiment of a data processing method of the present invention;
FIG. 6 is a block diagram of an embodiment of a data processing apparatus of the present invention;
FIG. 7 is a block diagram of an embodiment of a data processing apparatus of the present invention;
FIG. 8 is a block diagram of a data processing apparatus 800 of the present invention; and
Fig. 9 is a schematic diagram of a server in some embodiments of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A current multivariate relation generally includes only Chinese words. In practical applications, however, a user may need to input a combination of Chinese-type characters and characters of other types; in that case the input method program cannot provide the required candidate items according to the multivariate relation, so the user's input efficiency is low.
The embodiment of the invention provides a data processing scheme which can acquire first on-screen content and second on-screen content from continuous on-screen content and establish a multi-element relation between the first on-screen content and the second on-screen content. Wherein the content type of the first on-screen content and the content type of the second on-screen content may be different, and the first on-screen content is adjacent to the second on-screen content.
According to the embodiment of the invention, adjacent first on-screen content and second on-screen content with different content types are acquired from continuous on-screen content, and a multivariate relation between the first on-screen content and the second on-screen content is established; in this way, when the user needs to input combined content of different content types, candidate items corresponding to the combined content can be provided to the user according to the established multivariate relation, improving the user's input efficiency.
For example, when the multivariate relation is established, a binary relation can be established according to the acquired first on-screen content and second on-screen content, and the third on-screen content is acquired again, and the binary relation between the second on-screen content and the third on-screen content is established, so that the multivariate relation is established.
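The chaining of binary relations described above can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; the function name and the pair-list representation are assumptions made for illustration.

```python
def build_relations(segments):
    """Link each pair of adjacent on-screen segments.

    The second segment of one pair is the first segment of the next
    pair, so the chain of binary relations forms a multivariate
    relation over the whole continuous on-screen content.
    """
    return [(first, second) for first, second in zip(segments, segments[1:])]

# "I like Nike basketball shoes", segmented by content type:
pairs = build_relations(["like", "Nike", "basketball shoes"])
print(pairs)  # [('like', 'Nike'), ('Nike', 'basketball shoes')]
```

Note how "Nike" appears as the second element of the first pair and the first element of the second pair, matching the chaining described above.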
The embodiment of the invention can be applied to input method programs of various input modes such as keyboard symbols, voice, handwriting, and the like; that is, a user can input characters through coded character strings (namely, the input strings in the embodiment of the invention). In the field of input methods, for input method programs of Chinese, Japanese, Korean, or other languages, it is common to convert an input string entered by a user into candidates in the corresponding language. In the following, Chinese is mainly taken as an example; other languages may refer to it. It will be appreciated that the Chinese input method may include, but is not limited to, full spelling, simple spelling, strokes, Wubi, etc., and the embodiment of the present invention does not limit the specific input method program corresponding to a certain language.
The input method program may run on a terminal, where the terminal specifically includes, but is not limited to: smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, car computers, desktop computers, set-top boxes, smart televisions, wearable devices, and the like.
In practical applications, for the keyboard symbol input method, the user may input the input string through a physical keyboard, a virtual keyboard, or the like. For example, for a terminal with a touch screen, a virtual keyboard may be provided in the input interface so that an input string can be entered by triggering the virtual keys included in the virtual keyboard. Examples of the virtual keyboard may include: a 9-key keyboard, a 26-key keyboard, etc.
According to some embodiments, the input string may include, but is not limited to: one key symbol or a combination of key symbols entered by the user through the keys. The key symbol may specifically include: pinyin, strokes, kana, etc.
In embodiments of the present invention, candidates may be used to represent one or more characters, corresponding to an input string, to be selected by a user. A candidate item can be a character of a language such as Chinese, English, or Japanese; a candidate item can also be a symbol combination in the form of an emoticon, a picture, and the like. The emoticons include, but are not limited to, combinations of lines, symbols, and pictures formed from text characters; examples of emoticons may include: ":p", ":-o", ":-)", and the like. In practical application, a candidate item can be obtained by searching a word stock according to the input string. For example, the word stock may include: a system word stock, a user word stock, a cell word stock, a cloud word stock, etc.
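The word-stock lookup just described can be sketched minimally as follows. This is a toy assumption for illustration only: a real input method merges several word stocks (system, user, cell, cloud) and ranks candidates by frequency, typically using trie structures rather than a flat dictionary.

```python
# Toy word stock (assumed contents); keys are input strings, values are
# candidate lists. An emoticon can appear as a candidate alongside text.
WORD_STOCK = {
    "ni": ["你", "尼", ":-)"],
    "nihao": ["你好"],
}

def candidates(input_string):
    """Search the word stock according to the input string and return
    the candidate items, or an empty list if nothing matches."""
    return WORD_STOCK.get(input_string, [])

print(candidates("nihao"))  # ['你好']
```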
Method embodiment
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention may specifically include:
step 101, acquiring a first on-screen content and a second on-screen content from continuous on-screen content;
the first screen content and the second screen content are adjacent, and the content type corresponding to the first screen content and the content type corresponding to the second screen content are different.
Step 102, establishing a multivariate relation between the first on-screen content and the second on-screen content.
The method embodiment shown in fig. 1 may be used to establish a multivariate relation based on continuous on-screen content, so that combined content of different content types can be provided to a user based on the language units of different content types in the multivariate relation. A language unit is a component unit of the on-screen content and may include a word, a letter, a number, a symbol, and the like.
Optionally, the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content may include: at least two of Chinese type, letter type, number type, symbol type, English type, and picture type.
Step 101 may perform statistical analysis on the continuous on-screen content and acquire the first on-screen content and the second on-screen content from it, so that in step 102 a multivariate relation is established according to the acquired first and second on-screen content.
According to one embodiment, the on-screen content of the user may be recorded in advance, and the first on-screen content and the second on-screen content may be acquired according to the recorded on-screen content. In this case, whether the on-screen content is continuous or not can be determined according to whether punctuation marks or paragraph marks exist between language units in the recorded on-screen content. If so, the first on-screen content and the second on-screen content can be acquired from the continuous on-screen content according to the content type of the language unit.
According to another embodiment, the on-screen content can be monitored in real time, and the first on-screen content and the second on-screen content acquired from the monitored on-screen content. Specifically, punctuation marks or paragraph marks input by a user can be monitored in real time, and continuous on-screen content extracted from the on-screen content; the first on-screen content and the second on-screen content are then acquired from the continuous on-screen content according to the content types of the on-screen content.
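Acquiring the first and second on-screen content by content type amounts to classifying each character and splitting the continuous content into maximal runs of a single type. The sketch below is an assumption-laden illustration: it covers only four of the content types named above, and the per-character classifier is deliberately rough (a real input method classifies more finely).

```python
def content_type(ch):
    """Rough per-character classifier (assumption: only four of the
    content types named in the text; the Chinese check uses the CJK
    Unified Ideographs block)."""
    if '\u4e00' <= ch <= '\u9fff':
        return 'chinese'
    if ch.isdigit():
        return 'number'
    if ch.isalpha():
        return 'letter'
    return 'symbol'

def split_by_type(text):
    """Split continuous on-screen content into maximal runs of one
    content type; adjacent runs then have different content types."""
    runs = []
    for ch in text:
        t = content_type(ch)
        if runs and runs[-1][1] == t:
            runs[-1][0] += ch          # extend the current run
        else:
            runs.append([ch, t])       # start a new run
    return [tuple(run) for run in runs]

print(split_by_type("航班MN3587"))
# [('航班', 'chinese'), ('MN', 'letter'), ('3587', 'number')]
```

Adjacent runs from `split_by_type` are natural candidates for the first and second on-screen content, since their content types differ by construction.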
After the first on-screen content and the second on-screen content are acquired, a multivariate relation between them may be established. Optionally, the established multivariate relation may be stored in a preset storage space so that the pre-stored multivariate relation can be invoked during the input process.
Further, multiple first on-screen contents and multiple second on-screen contents can be acquired, and a multivariate relation established between each first on-screen content and its corresponding second on-screen content, finally obtaining multiple pairs of multivariate relations, which are stored in a preset storage space.
Moreover, when establishing multiple pairs of multivariate relations, the second on-screen content in a first pair may be the first on-screen content in a second pair. For example, if the continuous on-screen content is "I like Nike basketball shoes", a first pair of multivariate relations between "like" and "Nike" may be established, and a second pair between "Nike" and "basketball shoes"; here "Nike" is the second on-screen content in the first pair but the first on-screen content in the second pair.
In addition, the content length corresponding to the first on-screen content should not exceed a first length threshold, and the content length corresponding to the second on-screen content should not exceed a second length threshold. Therefore, in the process of acquiring the first on-screen content and the second on-screen content, their respective content lengths can be acquired, it can be judged whether the acquired first on-screen content and/or second on-screen content meet the corresponding length conditions, and the first on-screen content and/or second on-screen content then processed according to the judgment result.
The content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content may include at least two of Chinese type, letter type, number type, symbol type, English type, and picture type. When the content length of the first on-screen content and/or the second on-screen content does not meet the corresponding requirement, different processing can be applied according to the content type.
According to an embodiment, if the content type corresponding to the first on-screen content or the second on-screen content is a chinese type, the first on-screen content or the second on-screen content may be segmented, so that the content length of the segmented first on-screen content or second on-screen content meets the corresponding length threshold.
According to another embodiment, if the content type corresponding to the first on-screen content or the second on-screen content is an English type, the first on-screen content or the second on-screen content is discarded, and the multivariate relation between the first on-screen content and the second on-screen content is not established.
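The two length-handling rules above can be sketched together. Everything concrete here is an assumption: the threshold value (the patent fixes no number), the choice of truncation as the "segmentation" of over-long Chinese content (one possible reading of the text), and the function name.

```python
LENGTH_THRESHOLD = 4  # assumed example value; the patent names no number

def enforce_length(segment, ctype, threshold=LENGTH_THRESHOLD):
    """Length handling described above: over-long Chinese content is
    re-segmented (here simply truncated, one possible reading), while
    over-long English content is discarded and no relation is built."""
    if len(segment) <= threshold:
        return segment
    if ctype == 'chinese':
        return segment[:threshold]   # which sub-segment survives is an
                                     # implementation choice (assumption)
    if ctype == 'english':
        return None                  # discard; no multivariate relation
    return segment

print(enforce_length("internationalization", "english"))  # None
```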
At least one step of the method embodiment shown in fig. 1 may be performed by a client, a server, or any combination of the two; the embodiment of the present invention does not limit the specific execution body of each step. For example, the client may establish a multivariate relation according to pre-acquired on-screen content, or according to real-time on-screen content. For another example, the server may obtain the first on-screen content and the second on-screen content from the on-screen content periodically uploaded by the client, so as to establish a multivariate relation. For another example, during the user's input, the client may send the user's on-screen content to the server, so that the server establishes the multivariate relation according to the on-screen content sent by the client.
In summary, according to the data processing method provided by the embodiment of the invention, adjacent first on-screen content and second on-screen content with different content types are acquired from continuous on-screen content, and a multivariate relation between the first on-screen content and the second on-screen content is established; in this way, when the user needs to input combined content of different content types, candidate items corresponding to the combined content can be provided to the user according to the established multivariate relation, improving the user's input efficiency.
Referring to fig. 2, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention may specifically include:
step 201, acquiring target on-screen content which does not contain preset segmentation symbols from continuous on-screen content;
step 202, dividing the target on-screen content according to the content type corresponding to the content contained in the target on-screen content, so as to obtain a first on-screen content and a second on-screen content contained in the target on-screen content.
In step 201, the preset segmentation symbol may be a punctuation mark, a paragraph mark, a space, or the like, which is not limited in the embodiment of the present invention.
In order to reduce the workload of establishing the multivariate relation, the continuous on-screen content can be screened first: on-screen content containing a preset segmentation symbol is removed, and on-screen content not containing a preset segmentation symbol is reserved as the target on-screen content, so that in the subsequent step the first on-screen content and the second on-screen content can be obtained from the target on-screen content.
In step 201, a preset segmentation symbol in the continuous on-screen content may be detected; when preset segmentation symbols are detected, the on-screen content that lies after a first preset segmentation symbol and before the adjacent second preset segmentation symbol, with no other preset segmentation symbol in between, may be used as the target on-screen content. That is, the on-screen content between two adjacent preset segmentation symbols is used as the target on-screen content.
In practical application, the continuous on-screen content may include more than two preset segmentation symbols, yielding multiple target on-screen contents. The embodiment of the invention mainly takes one target on-screen content as an example; the data processing for multiple target on-screen contents is analogous, and the numbers of preset segmentation symbols and target on-screen contents are not limited.
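Extracting the target on-screen content between adjacent preset segmentation symbols can be sketched with a regular-expression split. The delimiter set below is an assumption: the text only says punctuation marks, paragraph marks, and spaces may serve as preset segmentation symbols.

```python
import re

# Assumed delimiter set (ASCII and full-width punctuation plus whitespace);
# a real implementation would use the input method's configured symbols.
PRESET_DELIMITERS = r'[\s,.!?;:。，！？；：、]+'

def target_contents(on_screen):
    """Return every stretch of on-screen content that contains no preset
    segmentation symbol (the 'target on-screen content' of step 201)."""
    return [seg for seg in re.split(PRESET_DELIMITERS, on_screen) if seg]

print(target_contents("我坐的航班MN3587晚点了，要改签。"))
# ['我坐的航班MN3587晚点了', '要改签']
```

Each returned stretch can then be divided by content type in step 202.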
In the embodiment of the invention, the content type corresponding to the first screen content is different from the content type corresponding to the second screen content, and the first screen content is adjacent to the second screen content.
After the target on-screen content is obtained, the target on-screen content is divided according to the content types of the content it contains, to obtain first on-screen content and second on-screen content of different content types.
Specifically, the content type corresponding to each content item in the target on-screen content can be identified; when it is detected that the content types corresponding to any two adjacent content items are different, the first on-screen content and the second on-screen content are acquired according to the two adjacent content items.
The first on-screen content may include a first content item (such as the preceding or the following one) of the two adjacent content items, where the content type of that item is consistent with the content type of the other content in the first on-screen content; the second on-screen content may include the second content item of the two adjacent content items, where its content type is consistent with the content type of the other content in the second on-screen content.
In one embodiment of the present invention, the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content may include: at least two of Chinese type, letter type, number type, symbol type, English type, and picture type. Accordingly, in practical application, the target on-screen content may first be segmented by content type, in order of appearance, into a first on-screen content, a second on-screen content, and a third on-screen content; two of these three may then be merged according to content type to obtain the final first on-screen content and second on-screen content.
The above-described process of merging two of the first on-screen content, the second on-screen content, and the third on-screen content may include: if the content type corresponding to the first screen content and the content type corresponding to the second screen content meet a first merging condition, merging the first screen content and the second screen content to be used as first screen content, and taking the third screen content as second screen content; or if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a second merging condition, merging the second on-screen content and the third on-screen content to be used as second on-screen content.
Wherein, the first merging condition may be: the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content are two different content types among letter type, number type, symbol type, English type, and picture type. The second merging condition may be: the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content are two different content types among letter type, number type, symbol type, English type, and picture type.
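Both merging conditions reduce to the same type-level check, which can be sketched as below; the function and set names are illustrative assumptions, not the patent's terminology.

```python
NON_CHINESE_TYPES = {'letter', 'number', 'symbol', 'english', 'picture'}

def meets_merging_condition(type_a, type_b):
    """First/second merging condition as stated above: two adjacent
    segments merge when their types are two *different* content types
    among letter, number, symbol, English, and picture (i.e., both
    non-Chinese)."""
    return (type_a != type_b
            and type_a in NON_CHINESE_TYPES
            and type_b in NON_CHINESE_TYPES)

print(meets_merging_condition('letter', 'number'))   # True  (e.g. MN + 3587)
print(meets_merging_condition('chinese', 'letter'))  # False (no merge)
```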
In summary, according to the data processing method provided by the embodiment of the invention, adjacent first on-screen content and second on-screen content with different content types are acquired from continuous on-screen content, and a multivariate relation between them is established, so that when a user needs to input combined content of different content types, the client can accurately recommend the candidate items the user expects to put on screen according to the established multivariate relation, reducing the time spent selecting candidate items and improving the user's input efficiency.
Referring to fig. 3, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention may specifically include:
step 301, judging whether a segmentation operation triggered by a user is received or not in the process of receiving the on-screen content of the user so as to obtain a corresponding first judgment result;
step 302, in the process of receiving the on-screen content of the user, judging whether the content types corresponding to two adjacent language units in the on-screen content are consistent or not so as to obtain a corresponding second judgment result;
step 303, acquiring first screen content and second screen content according to the first judgment result and the second judgment result;
Step 304, establishing a multivariate relation between the first on-screen content and the second on-screen content.
The embodiment of the invention can monitor the on-screen content of the user in real time and establish a multivariate relation according to the user's on-screen content.
Specifically, in the process of receiving the on-screen content of the user, the on-screen content can be recorded in real time, and it can be monitored whether the user triggers a segmentation operation; the segmentation operation is used to segment one piece of on-screen content from the next. This judgment yields a first judgment result, so that a multivariate relation can be established according to the first judgment result in a subsequent step.
The segmentation operation may be an operation triggered by a segmentation key, and the segmentation key may include: keys corresponding to punctuation marks, keys corresponding to paragraph segmentation, the space key, and the like, which are not limited in the embodiment of the present invention.
In step 302, a language unit may include: a character, a single character (e.g., a Chinese character), or a word.
When receiving the on-screen content of the user, the content types corresponding to two adjacent language units in the on-screen content are acquired, and whether the content types of the two adjacent language units are consistent is judged according to the acquired content types, giving the corresponding second judgment result.
For example, when a language unit newly put on screen by the user is detected, its content type can be obtained and compared with the content type of the adjacent language unit previously put on screen, so as to judge whether the content types of the two adjacent language units are consistent, giving the second judgment result.
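The real-time comparison of each newly committed language unit against the previous one can be sketched as a small monitor class. The class name, structure, and rough per-unit type classifier are all assumptions made for illustration; the patent describes only the judgments, not code.

```python
class OnScreenMonitor:
    """Minimal sketch of the real-time second-judgment check above."""

    def __init__(self):
        self.prev_type = None
        self.boundaries = []  # unit positions where the content type changed
        self.count = 0

    @staticmethod
    def _content_type(unit):
        # Rough classifier (assumption): CJK range, digits, letters, else symbol.
        if all('\u4e00' <= c <= '\u9fff' for c in unit):
            return 'chinese'
        if unit.isdigit():
            return 'number'
        if unit.isalpha():
            return 'letter'
        return 'symbol'

    def on_commit(self, unit):
        """Record one language unit going on screen; return the second
        judgment result: True if its type matches the previous unit's."""
        t = self._content_type(unit)
        consistent = self.prev_type is None or t == self.prev_type
        if not consistent:
            self.boundaries.append(self.count)
        self.prev_type = t
        self.count += 1
        return consistent

m = OnScreenMonitor()
for unit in ["航班", "MN", "3587"]:
    m.on_commit(unit)
print(m.boundaries)  # [1, 2] — type changed before the 2nd and 3rd units
```

Each recorded boundary marks two adjacent language units with inconsistent content types, which is exactly where the first, second, and third on-screen contents are extracted in step 303.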
Step 303 may, after the first judgment result and the second judgment result are obtained, acquire the first on-screen content and the second on-screen content from the continuous on-screen content according to the two judgment results, so that the multivariate relation between them can be established in a subsequent step.
The content type corresponding to the first screen content is different from the content type corresponding to the second screen content, and the first screen content is adjacent to the second screen content.
Because the on-screen content of the user can comprise a plurality of adjacent language units, multiple judgment can be performed, so that a plurality of second judgment results are obtained. Thus, this step 303 may include step 303a and step 303b.
Step 303a, if the first judgment result is no, and the second judgment result for two adjacent language units is no for the first time, extracting the first on-screen content, located at the front, from the on-screen content according to the first-appearing two adjacent language units with inconsistent content types.
If the first judgment result is no and the first-obtained second judgment result is no, this indicates that the user's on-screen content is continuous; when the content types of two adjacent language units are inconsistent for the first time, the language unit located at the front of the on-screen content can be used as the first on-screen content.
For example, when the user's on-screen content is "the flight MN I take", the content types corresponding to the front language unit "flight" and the rear language unit "MN" are inconsistent, so the front language unit "flight" may be used as the first on-screen content.
Step 303b, if the first judgment result is no, and the second judgment result for two adjacent language units is no for the second time, extracting the second on-screen content, located after the first on-screen content, from the on-screen content according to the second-appearing two adjacent language units with inconsistent content types.
On the premise that the first judgment result is no, if the obtained second judgment result is no for the second time, this indicates that the content type of the rear language unit in step 303a differs from the content type of the language unit the user next puts on screen; the front one of the currently inconsistent adjacent language units may then be taken from the on-screen content as the second on-screen content.
Alternatively, the content which is located after the first on-screen content and has the content length not exceeding the second length threshold value can be extracted from the on-screen content according to two adjacent language units with inconsistent content types appearing for the second time, and the content is taken as the second on-screen content.
For example, when the user's on-screen content is "the flight MN3587 I take", the content types corresponding to the front language unit "MN" and the rear language unit "3587" are inconsistent, and the front language unit "MN" may be used as the second on-screen content.
Optionally, the method may further include: step 303c and step 303d.
Step 303c, if the first judgment result is no, and the second judgment result for two adjacent language units is no for the third time, extracting third on-screen content, located after the second on-screen content, from the on-screen content according to the third-appearing two adjacent language units with inconsistent content types.
If the first judgment result remains no and the second judgment result is no for the third time, this indicates that the content type has changed again between the rear language unit in step 303b and the language unit the user next puts on screen, so the third on-screen content can be extracted according to the two adjacent language units with inconsistent content types.
Alternatively, the content which is located after the second on-screen content and has the content length not exceeding the third length threshold value may be extracted from the on-screen content according to two adjacent language units of which the content types are inconsistent in the third occurrence, as the third on-screen content.
For example, when the user's on-screen content is "the flight MN3587 I take is late", the content types corresponding to the front language unit "3587" and the rear language unit "late" are inconsistent, and the front language unit "3587" may be used as the third on-screen content.
Step 303d, if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet the first preset condition, merging the second on-screen content and the third on-screen content, and taking the merged content as the second on-screen content.
If the third on-screen content is extracted, a judgment is made according to the content types corresponding to the second on-screen content and the third on-screen content, to determine whether they meet the first preset condition. When the first preset condition is met, the second on-screen content and the third on-screen content can be merged, and the merged content used as the second on-screen content.
The content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content may include two of Chinese type, letter type, number type, symbol type, English type, and picture type. The first preset condition may be: the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content are two different content types among letter type, number type, symbol type, English type, and picture type.
For example, when the user's on-screen content is "the flight MN3587 I take is late", the first on-screen content is "flight", the second on-screen content is "MN", and the third on-screen content is "3587"; since the content type of the second on-screen content is letter type and the content type of the third on-screen content is number type, the second and third on-screen contents may be merged, with "MN3587" as the new second on-screen content.
Correspondingly, the first screen content and the second screen content can be combined when corresponding preset conditions are met.
Optionally, if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet the second preset condition, combining the first on-screen content and the second on-screen content, taking the combined content as the first on-screen content, and taking the third on-screen content as the second on-screen content.
Similar to the first preset condition, the second preset condition may be: the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content are two different content types among the letter type, number type, symbol type, English type and picture type.
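The type-based merging rules above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the type classifier and the type names are assumptions.

```python
# Types that may be merged when they differ; Chinese-type content is excluded.
MERGEABLE_TYPES = {"letter", "number", "symbol", "english", "picture"}

def content_type(segment: str) -> str:
    """Very rough content-type classifier, for illustration only."""
    if segment.isdigit():
        return "number"
    if segment.isascii() and segment.isalpha():
        return "letter"
    if all('\u4e00' <= ch <= '\u9fff' for ch in segment):
        return "chinese"
    return "symbol"

def meets_preset_condition(a: str, b: str) -> bool:
    """Adjacent segments merge when their types differ and neither is Chinese."""
    ta, tb = content_type(a), content_type(b)
    return ta != tb and ta in MERGEABLE_TYPES and tb in MERGEABLE_TYPES

# Example from the text: "MN" (letter) + "3587" (number) -> "MN3587"
second, third = "MN", "3587"
if meets_preset_condition(second, third):
    second = second + third
print(second)  # MN3587
```

The same predicate serves for both preset conditions; only the pair of segments it is applied to (first/second versus second/third) differs.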
It should be noted that if the first determination result remains no but the second determination result is no on only two occasions, steps 303c and 303d do not need to be executed; that is, if only two cases where the second determination result is no occur before the first determination result becomes yes, step 303 may include only step 303a and step 303b.
In addition, in the process of acquiring the first on-screen content and the second on-screen content, the content lengths of the first on-screen content and the second on-screen content can be checked, so as to avoid overlong content occupying storage space while being rarely used.
Optionally, if the content length corresponding to the first on-screen content does not exceed the first length threshold, the first on-screen content may be reserved; or if the content length corresponding to the first on-screen content exceeds the first length threshold, discarding the first on-screen content.
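A minimal sketch of this optional length check, assuming a concrete threshold value (the patent only names a "first length threshold" without fixing one):

```python
FIRST_LENGTH_THRESHOLD = 10  # assumed value; the patent does not specify one

def filter_first_content(first_on_screen: str):
    """Reserve the first on-screen content only if it is not overlong."""
    if len(first_on_screen) <= FIRST_LENGTH_THRESHOLD:
        return first_on_screen  # reserve
    return None                 # discard

print(filter_first_content("MN3587"))  # MN3587
print(filter_first_content("A" * 20))  # None
```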
In summary, according to the data processing method provided by the embodiment of the invention, adjacent first on-screen content and second on-screen content with different content types are acquired from continuous on-screen content, and a multivariate relation between the first on-screen content and the second on-screen content is established. When the user later needs to input combined content of different content types, the client can accurately recommend, according to the established multivariate relation, the candidate the user expects to put on screen, thereby reducing the time the user spends selecting candidates and improving input efficiency.
The embodiment of the invention can not only establish the multivariate relation according to the on-screen content input by the user, but also, during user input, recommend to the user candidates matched with the user's input string according to the established multivariate relation, thereby improving candidate accuracy and user input efficiency.
Referring to fig. 4, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention may specifically include:
step 401, receiving input content of a user.
Step 402, determining candidate items corresponding to the input content according to the multivariate relation.
Step 403, outputting candidates corresponding to the input content.
During user input, the input method program can monitor the content input by the user in real time, such as on-screen content and/or an input string. When content input by the user is detected, it can be taken as the input content, so that candidates matched with the input content can be found in subsequent steps.
Wherein the input content may include: on screen content and/or input strings.
Specifically, after the user starts the input method program, the input method program can monitor the content input by the user in real time, and when detecting that the user types in a character corresponding to a certain key, the input method program can receive the character corresponding to the key and take the character as the input content of the user.
Moreover, the input method program can provide candidates for the user according to the input string, and when an on-screen operation triggered by the user is detected, the candidate corresponding to the operation can be put on screen; the on-screen candidate can also be taken as input content.
For example, when the input string typed by the user is "s-h-e-n-g-x-i-a-o", the input method program may sequentially acquire "s", "h", "e", "n", "g", "x", "i", "a", and "o" as the input string in the order of user input. If the user has not yet input "i", "a" and "o", then "s-h-e-n-g-x" may be used as the input string; if it is detected that the user continues to input "i", "a" and "o", then "s-h-e-n-g-x-i-a-o" may be used as the input string. Further, the input method program can recommend the candidate "Chinese zodiac" to the user according to the input string "s-h-e-n-g-x-i-a-o"; when an operation putting the candidate "Chinese zodiac" on screen is detected, "Chinese zodiac" is displayed on the screen and also taken as input content of the user.
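The keystroke accumulation described above can be sketched as follows; the candidate dictionary is a hypothetical stand-in for the input method's lexicon:

```python
# Keystrokes arrive one at a time and are joined into the input string.
keystrokes = ["s", "h", "e", "n", "g", "x", "i", "a", "o"]

# Hypothetical lexicon entry; a real input method holds many such mappings.
candidates_by_pinyin = {"s-h-e-n-g-x-i-a-o": "Chinese zodiac"}

partial = "-".join(keystrokes[:6])  # user has typed only "s-h-e-n-g-x" so far
full = "-".join(keystrokes)         # user finishes typing

print(candidates_by_pinyin.get(partial))  # None: no exact entry yet
print(candidates_by_pinyin.get(full))     # Chinese zodiac
```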
In step 402, the multivariate relation may comprise language units with different content types; that is, the language units forming the multivariate relation differ in content type. A language unit is a constituent unit of on-screen content and may include characters, words, letters, numbers, symbols, and the like. For example, if the multivariate relation is "12-Chinese zodiac", it is composed of the front language unit "12" and the rear language unit "Chinese zodiac".
Since multiple pairs of multivariate relations are established in advance, after the input content of the user is detected, the input content can be compared with the language units in these multivariate relations to obtain candidates matched with the input content.
For example, a target multivariate relation matched with the on-screen content can be found from the multiple pairs of multivariate relations according to the on-screen content in the input content, and then, according to the input strings corresponding to the language units in the target multivariate relation, a language unit matched with the input string in the input content is selected as the candidate.
It should be noted that, the candidate may form a multiple relationship with the on-screen content in the input content, and the candidate may also include a front language unit and a rear language unit in the multiple relationship, which is not limited in the embodiment of the present invention.
During user input, the input method program displays candidates matched with the input content; since the input content may match multiple candidates, the order of the candidates obtained in step 402 can be adjusted to improve the accuracy of recommendation, so that the user can quickly screen the candidates before they are output.
Specifically, after the target candidate item matched with the input content is obtained according to the multivariate relation, the sorting result corresponding to the candidate item can be obtained first, then the sorting result is updated according to the target candidate item, and the candidate item is output according to the updated sorting result, so that a user can quickly look up the target candidate item when the candidate item is on screen, and the screen of the candidate item is completed.
For example, if the input content is "s-h-e-n-g-x-i-a-o", the candidates matched with the input content, in their original order, may include "effective", "vertical bamboo flute", "sound effect", and "Chinese zodiac", with "Chinese zodiac" ranked 4th, that is, presented to the user in the 4th position. The rank of "Chinese zodiac" may be changed to 1 and the ranks of the other candidates each increased by 1, that is, each moved back one position, so as to adjust the order and output the candidates accordingly.
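The rank adjustment in this example can be sketched as a simple list operation, under the assumption that candidates are held as an ordered list:

```python
def promote_target(candidates, target):
    """Move the target candidate to rank 1; everyone else moves back one place."""
    reordered = [c for c in candidates if c != target]
    return [target] + reordered

ranked = ["effective", "vertical bamboo flute", "sound effect", "Chinese zodiac"]
print(promote_target(ranked, "Chinese zodiac"))
# ['Chinese zodiac', 'effective', 'vertical bamboo flute', 'sound effect']
```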
In summary, according to the data processing method provided by the embodiment of the invention, the input content of the user is received and matched against the multivariate relation, which comprises language units with different content types; the candidate corresponding to the input content is determined and finally output, so that the user can quickly look up and put the desired candidate on screen, improving user input efficiency.
Referring to fig. 5, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention may specifically include:
step 501, receiving the on-screen content of the user.
Step 502, receiving an input string of a user.
Step 503, determining candidates corresponding to input content according to the multivariate relation, where the input content includes: on screen content and/or input strings.
Step 504, outputting candidates corresponding to the input content.
In the input process of the user, different contents can be continuously displayed on the screen; the on-screen content entered by the user may be received to obtain candidates that can make up a multivariate relationship with the on-screen content.
Moreover, the content types corresponding to the on-screen content can be the same or different, and the on-screen content can therefore be acquired in different ways according to its content type.
For example, when the on-screen content is Chinese-type content, the content length of the on-screen content may be acquired first, and when the content length does not exceed the length threshold, the on-screen content may be received. However, if the content length of the on-screen content exceeds the length threshold, the on-screen content may be subjected to word segmentation, and the processed on-screen content is used as the received on-screen content.
For another example, when the on-screen content is letters and/or numbers, the content length of the on-screen content also needs to be acquired first, that is, the continuous letters, numbers, or combination of letters and numbers is acquired; when the content length corresponding to the on-screen content does not exceed the length threshold, the on-screen content can be received. However, if the length of the on-screen content exceeds the length threshold, the on-screen content is no longer received. Therefore, the content length corresponding to the received on-screen content does not exceed the length threshold.
After receiving the on-screen content input by the user, whether the user triggers a selection operation on a split key can be detected, so as to decide whether to split the on-screen content. If a selection operation triggered by the user is detected, it indicates that the user intends to separate the on-screen content from the subsequently input string, so it can be determined that this on-screen content will not be put on screen as a language unit of the multivariate relation; the subsequent steps then need not be executed, and step 501 continues to be executed until no selection operation on the split key is detected. The split key may include: keys corresponding to punctuation marks, the enter key, the space key, and the like, which is not limited in the embodiment of the present invention.
However, if, after receiving the on-screen content input by the user, no selection operation on the split key is detected, the user does not intend to divide the on-screen content, and the on-screen content can be determined as a language unit of the multivariate relation. Step 502 may therefore be further executed; in other words, no preset division symbol exists between the on-screen content and the input string, the preset division symbol being the symbol corresponding to the split key.
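The split-key check can be sketched as follows, assuming a concrete set of division symbols (the patent leaves the set open-ended):

```python
# Assumed split symbols: punctuation, enter (newline) and space.
SPLIT_SYMBOLS = set(",.!?\n ")

def should_build_relation(on_screen: str, next_input: str) -> bool:
    """True when no split symbol sits at the boundary between the two pieces."""
    boundary = on_screen[-1:] + next_input[:1]
    return not any(s in boundary for s in SPLIT_SYMBOLS)

print(should_build_relation("12", "s"))   # True: the pieces may form a relation
print(should_build_relation("12 ", "s"))  # False: the user split the input
```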
After receiving the on-screen content and the input string input by the user, the on-screen content and the input string may be used as input content of the user, and candidates corresponding to the input content may be determined according to a preset multivariate relation and the input content, so as to output the candidates in a subsequent step.
Alternatively, a target multivariate relation matched with the on-screen content may be determined from at least one pair of multivariate relations, and candidates corresponding to the on-screen content may be determined according to the target multivariate relation.
Specifically, according to the preset pairs of multivariate relations and the on-screen content in the input content, a target multivariate relation corresponding to the on-screen content is selected from those pairs, and candidates corresponding to the on-screen content are determined according to the target multivariate relation.
For example, if the user puts "12" on screen, then before the user types any input string, a target multivariate relation corresponding to "12" may first be found among the multivariate relations, and the language unit with the highest use frequency may be selected as the candidate according to the target multivariate relation. If the target multivariate relation corresponding to "12" includes language units such as "Chinese zodiac" and "month", "Chinese zodiac" may be used as the candidate.
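Selecting a candidate from the target multivariate relation before any input string is typed might look like the following sketch; the relation table and the use frequencies are assumptions for illustration:

```python
# Hypothetical relation store: on-screen content -> (rear unit, use frequency).
relations = {"12": [("Chinese zodiac", 0.9), ("month", 0.7)]}

def best_candidate(on_screen: str):
    """Return the most frequently used rear language unit, if any."""
    units = relations.get(on_screen)
    if not units:
        return None
    return max(units, key=lambda u: u[1])[0]

print(best_candidate("12"))  # Chinese zodiac
```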
Furthermore, the input method program can also search according to the input string input by the user to obtain the candidate item matched with both the on-screen content and the input string, so as to accurately recommend the corresponding candidate item to the user and improve the input efficiency of the user.
Alternatively, a target multivariate relation matching the on-screen content may be determined from at least one pair of multivariate relations, and a language unit matching the input string may be determined from the target multivariate relation as a candidate for the input content.
Specifically, after the target multivariate relation is determined according to the on-screen content, the input strings corresponding to the language units in the target multivariate relation can be matched against the input string in the input content, so as to obtain the candidate corresponding to the input string.
When the input strings corresponding to the language units in the target multivariate relation are not completely identical to the input string in the input content, the matching degree between the input string in the input content and the input string corresponding to each language unit can be obtained, and the language unit whose input string has the highest matching degree can be determined as the candidate corresponding to the input content.
For example, after the user inputs "12", a target multivariate relation corresponding to "12" may be selected from the preset pairs of multivariate relations as shown in Table 1; if the input string in the user's input content is detected to be "s-h-e-n-g-x", the input string corresponding to each rear language unit may be obtained from the target multivariate relation as shown in Table 2, the matching degree between the input string in the input content and the input string of each rear language unit is obtained, and finally the candidate "Chinese zodiac" matched with the input string is obtained.
TABLE 1

                           Front language unit    Rear language unit
Multivariate relation 1    12                     Chinese zodiac
Multivariate relation 2    36                     Stratagems
Multivariate relation 3    72                     Transformations
TABLE 2

Front language unit    Rear language unit    Input string corresponding to rear language unit
12                     Chinese zodiac        s-h-e-n-g-x-i-a-o
12                     Month                 y-u-e
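The lookup over Table 2 can be sketched as below; the matching degree here is taken to be the shared-prefix length, which is an assumption, since the patent does not define the measure:

```python
# Rear language units of the target relation for "12" and their input strings.
rear_units = {
    "Chinese zodiac": "s-h-e-n-g-x-i-a-o",
    "month": "y-u-e",
}

def matching_degree(typed: str, full: str) -> int:
    """Length of the common prefix of two input strings (assumed measure)."""
    n = 0
    for a, b in zip(typed, full):
        if a != b:
            break
        n += 1
    return n

def best_match(typed: str) -> str:
    """Pick the rear unit whose input string best matches what was typed."""
    return max(rear_units, key=lambda u: matching_degree(typed, rear_units[u]))

print(best_match("s-h-e-n-g-x"))  # Chinese zodiac
```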
In addition, during input, the user may input not only number-type content but also letter-type content; in that case the letter-type content and the input string corresponding to the candidate may together be used as the input content.
Therefore, in practical application, the input method program can identify the input string, judge whether the input content comprises two language units capable of forming a multi-element relation, and then search matched candidate items in the preset multi-element relation according to the judging result.
Alternatively, the candidate item corresponding to the input string may be obtained by searching in the multiple relations according to the input string or the word candidate corresponding to the input string.
Specifically, according to the input content of the user, a search can be performed in the pairs of multivariate relations to judge whether the front language unit of each of the at least one pair of multivariate relations matches part of the characters of the input content. When the front language unit of any multivariate relation matches part of the characters of the input content, the input string corresponding to the rear language unit of that relation can be compared with the remaining characters of the input content; when these also match, the multivariate relation can be used as the candidate.
For example, if the input content of the user is "Q-u" and the preset multivariate relations include "Q-interest", the character "Q" in the input content may be compared with the front language unit "Q" in that relation, so as to determine "Q-interest" as the target multivariate relation; the remaining input string is then compared with the input string corresponding to the rear language unit "interest", and finally "Q-interest" is used as the candidate.
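The front/rear matching in this example can be sketched as follows; the relation data and the containment-based remainder match are assumptions, not the patent's defined algorithm:

```python
# Hypothetical store: (front unit, rear unit, input string of the rear unit).
relations = [("Q", "interest", "q-u")]

def find_candidate(input_content: str):
    for front, rear, rear_string in relations:
        if input_content.startswith(front):
            # Remaining characters after the front unit, separators dropped.
            remainder = input_content[len(front):].replace("-", "").lower()
            if remainder and remainder in rear_string.replace("-", ""):
                return front + "-" + rear
    return None

print(find_candidate("Q-u"))  # Q-interest
```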
In summary, according to the data processing method provided by the embodiment of the invention, the input content of the user is received and matched against the multivariate relation, which comprises language units with different content types; the candidate corresponding to the input content is determined and finally output, so that the user can quickly look up and put the desired candidate on screen, improving user input efficiency.
It should be noted that, for simplicity of description, the method embodiments are described as a series of action combinations, but those skilled in the art should appreciate that the embodiments of the present invention are not limited by the order of actions described, as some steps may be performed in another order or simultaneously in accordance with the embodiments of the present invention. Further, those skilled in the art should understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Device embodiment
Referring to fig. 6, there is shown a block diagram of an embodiment of a data processing apparatus of the present invention, which may specifically include:
An obtaining module 601, configured to obtain a first on-screen content and a second on-screen content from continuous on-screen content; the content type corresponding to the first screen content is different from the content type corresponding to the second screen content, and the first screen content is adjacent to the second screen content;
a relationship establishment module 602, configured to establish a multiple relationship between the first on-screen content and the second on-screen content.
Optionally, the content length corresponding to the first screen content does not exceed the first length threshold, and the content length corresponding to the second screen content does not exceed the second length threshold.
Alternatively, the obtaining module 601 may include:
the acquisition sub-module is used for acquiring target on-screen contents which do not contain preset segmentation symbols from the continuous on-screen contents;
the dividing sub-module is used for dividing the target on-screen content according to the content type corresponding to the content contained in the target on-screen content so as to obtain a first on-screen content and a second on-screen content contained in the target on-screen content.
Optionally, the dividing sub-module may include:
the acquisition unit is used for acquiring first screen content, second screen content and third screen content with different content types from the target screen content according to the content types corresponding to the content contained in the target screen content;
The first merging unit is configured to merge the first on-screen content and the second on-screen content as first on-screen content and use the third on-screen content as second on-screen content if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a first merging condition; or alternatively
And the second merging unit is used for merging the second on-screen content and the third on-screen content to be used as second on-screen content if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a second merging condition.
Alternatively, the obtaining module 601 may include:
the first judging sub-module is used for judging whether the segmentation operation triggered by the user is received or not in the process of receiving the on-screen content of the user so as to obtain a corresponding first judging result;
the second judging sub-module is used for judging whether the content types corresponding to two adjacent language units in the on-screen content are consistent in the process of receiving the on-screen content of the user, so as to obtain a corresponding second judgment result; the language unit includes: a character, a single character, or a word;
the first extraction sub-module is used for extracting first on-screen content positioned at the front part from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the first time if the first judgment result is NO and the second judgment result corresponding to the two adjacent language units appears for the first time is NO;
And the second extraction sub-module is used for extracting second on-screen content positioned behind the first on-screen content from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the second time if the first judgment result is NO and the second judgment result corresponding to the two adjacent language units appears for the second time is NO.
Optionally, the obtaining module 601 may further include:
a third extraction sub-module, configured to extract, from the on-screen content, third on-screen content located after the second on-screen content according to the two adjacent language units whose content types are inconsistent in the third occurrence if the first determination result is no and the second determination result corresponding to the two adjacent language units appears in the third occurrence is no;
the first merging sub-module is used for merging the second on-screen content and the third on-screen content if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a first preset condition, and taking the merged content as the second on-screen content; or alternatively
And the second merging sub-module is used for merging the first on-screen content and the second on-screen content if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a second preset condition, taking the merged content as the first on-screen content and the third on-screen content as the second on-screen content.
Optionally, the obtaining module 601 may further include:
the content reservation sub-module is used for reserving the first on-screen content if the content length corresponding to the first on-screen content does not exceed a first length threshold; or alternatively
And the content discarding sub-module is used for discarding the first on-screen content if the content length corresponding to the first on-screen content exceeds a first length threshold value.
Optionally, the second extraction sub-module may include:
and the extraction unit is used for extracting the content which is positioned behind the first on-screen content and has the content length not exceeding a second length threshold value from the on-screen content according to two adjacent language units with inconsistent content types appearing for the second time, and the content is used as second on-screen content.
Optionally, the third extraction sub-module may include:
and the extraction unit is used for extracting the content which is positioned behind the second on-screen content and has the content length not exceeding a third length threshold value from the on-screen content according to the two adjacent language units with inconsistent content types appearing for the third time, and the content is used as the third on-screen content.
Optionally, the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content include: at least two of the Chinese type, letter type, number type, symbol type, English type, and picture type.
In summary, the data processing device provided by the embodiment of the invention acquires the adjacent first on-screen content and second on-screen content with different content types from the continuous on-screen content, and establishes a multiple relationship between the first on-screen content and the second on-screen content; in this way, in the case that the user needs to input the combined content of different content types, the candidate items corresponding to the combined content can be provided to the user according to the established multiple relation between the first upper screen content and the second upper screen content, so that the input efficiency of the user can be improved.
Referring to FIG. 7, there is shown a block diagram of another embodiment of a data processing apparatus of the present invention, which may include:
a receiving module 701, configured to receive input content of a user; the input content includes: on screen content and/or input strings;
a determining module 702, configured to determine, according to the multivariate relation, candidates corresponding to the input content; the multiple relationship includes: language units with different content types;
and an output module 703, configured to output candidates corresponding to the input content.
Optionally, the determining module 702 includes:
the first determining submodule is used for determining a target multivariate relation matched with the on-screen content from at least one pair of multivariate relations and determining candidates corresponding to the on-screen content according to the target multivariate relation; or alternatively
A second determining sub-module, configured to determine a target multivariate relation matched with the on-screen content from at least one pair of multivariate relations, and determine a language unit matched with the input string from the target multivariate relation, as a candidate item corresponding to the input content; or alternatively
And the searching sub-module is used for searching in the multi-element relation according to the input string or the word candidate corresponding to the input string so as to obtain the candidate corresponding to the input string.
Optionally, a preset segmentation symbol does not exist between the on-screen content and the input string; and/or
The content length corresponding to the on-screen content does not exceed the length threshold.
In summary, the data processing device provided by the embodiment of the invention receives the input content of the user and matches it against the multivariate relation, which comprises language units with different content types; it determines the candidate corresponding to the input content and finally outputs the candidate, so that the user can quickly look up and put the desired candidate on screen, thereby improving user input efficiency.
In this specification, each embodiment is described in a progressive manner, and each embodiment focuses on its differences from the other embodiments; for identical or similar parts between the embodiments, reference may be made to one another.
The specific manner in which the various modules perform operations in the apparatus of the above embodiments has been described in detail in the method embodiments and will not be elaborated here.
An embodiment of the invention discloses a device for data processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing any of the methods of figs. 1-5.
Referring to fig. 8, an apparatus 800 for data processing may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the apparatus 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 806 provides power to the various components of the device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the apparatus 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the apparatus 800 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of the apparatus 800. For example, the sensor assembly 814 may detect an on/off state of the device 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor assembly 814 may also detect a change in position of the apparatus 800 or of one component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication between the apparatus 800 and other devices, either in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as the memory 804 including instructions executable by the processor 820 of the apparatus 800 to perform the above-described methods. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Fig. 9 is a schematic diagram of a server in some embodiments of the invention. The server 900 may vary considerably in configuration or performance and may include one or more Central Processing Units (CPUs) 922 (e.g., one or more processors), memory 932, and one or more storage media 930 (e.g., one or more mass storage devices) storing applications 942 or data 944. The memory 932 and the storage medium 930 may be transitory or persistent. The program stored in the storage medium 930 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Still further, the central processing unit 922 may be arranged to communicate with the storage medium 930 to execute, on the server 900, the series of instruction operations in the storage medium 930.
The server 900 may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input/output interfaces 958, one or more keyboards 956, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
Embodiments of the present invention provide a machine-readable medium having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform a data processing method as described in one or more of fig. 1-5.
A non-transitory computer readable storage medium has stored thereon instructions that, when executed by a processor of an apparatus (terminal or server), cause the apparatus to perform a data processing method, the method comprising: acquiring first on-screen content and second on-screen content from continuous on-screen content, wherein the content types corresponding to the first on-screen content and the second on-screen content are different, and the first on-screen content and the second on-screen content are adjacent; and establishing a multi-element relation between the first on-screen content and the second on-screen content.
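Purely as an illustrative sketch (the names `content_type`, `split_by_type`, and `build_relation` are invented here and are not from the patent, and the type categories are a guess at the patent's content types), the two steps above could be modeled by splitting the continuous on-screen content into runs of one content type and recording each pair of adjacent, differently-typed runs as a relation:

```python
def content_type(ch):
    # Coarse per-character classifier; the category names are illustrative.
    if '\u4e00' <= ch <= '\u9fff':  # CJK Unified Ideographs block
        return 'chinese'
    if ch.isascii() and ch.isalpha():
        return 'letter'
    if ch.isdigit():
        return 'digit'
    return 'symbol'


def split_by_type(text):
    """Split continuous on-screen content into runs of a single content type."""
    runs = []
    for ch in text:
        t = content_type(ch)
        if runs and runs[-1][1] == t:
            runs[-1][0] += ch  # extend the current run
        else:
            runs.append([ch, t])  # start a new run at a type change
    return [(s, t) for s, t in runs]


def build_relation(text, relation=None):
    """Record a multi-element relation between each pair of adjacent
    segments whose content types differ (adjacent runs always differ)."""
    if relation is None:
        relation = {}
    runs = split_by_type(text)
    for (first, _), (second, _) in zip(runs, runs[1:]):
        relation.setdefault(first, []).append(second)
    return relation
```

For example, for the on-screen content "跑了5公里" ("ran 5 kilometers"), the adjacent cross-type pairs ("跑了", "5") and ("5", "公里") would be recorded.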
A non-transitory computer readable storage medium has stored thereon instructions that, when executed by a processor of an apparatus (terminal or server), cause the apparatus to perform a data processing method, the method comprising: receiving input content of a user, the input content comprising on-screen content and/or an input string; determining candidate items corresponding to the input content according to a multivariate relation, the multivariate relation comprising language units with different content types; and outputting the candidate items corresponding to the input content.
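A minimal lookup sketch for this retrieval step, under the same invented names as above; the simplifying assumption here is that the input string is matched literally against candidate prefixes, whereas a real input method would typically match against the candidates' pinyin:

```python
def suggest(relation, on_screen, input_string=""):
    """Return language units that previously followed `on_screen` in the
    recorded relation, optionally narrowed by the current input string."""
    candidates = relation.get(on_screen, [])
    if input_string:
        # Literal prefix match stands in for pinyin matching.
        candidates = [c for c in candidates if c.startswith(input_string)]
    return candidates
```

With the relation built earlier, committing "5" to the screen could then prompt "公里" as an association candidate.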
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It is to be understood that the invention is not limited to the precise arrangements and instrumentalities described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed; any modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within its scope.
The foregoing has described in detail a data processing method and apparatus provided by the present invention. Specific examples have been used herein to illustrate the principles and embodiments of the invention; the above examples are provided only to assist in understanding the method and its core idea. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present invention.

Claims (17)

1. A method of data processing, the method comprising:
acquiring first on-screen content and second on-screen content from continuous on-screen content; the content types corresponding to the first on-screen content and the second on-screen content are different, and the first on-screen content and the second on-screen content are adjacent;
establishing a multi-element relation between the first on-screen content and the second on-screen content;
wherein the acquiring first on-screen content and second on-screen content from the continuous on-screen content comprises:
judging, in the process of receiving the on-screen content of the user, whether a segmentation operation triggered by the user is received, to obtain a corresponding first judgment result;
judging, in the process of receiving the on-screen content of the user, whether the content types corresponding to two adjacent language units in the on-screen content are consistent, to obtain a corresponding second judgment result; the language unit comprising: a character, a single character, or a word;
if the first judgment result is no and the second judgment result corresponding to the first occurrence of two adjacent language units is no, extracting first on-screen content located at the front from the on-screen content according to the first-occurring two adjacent language units with inconsistent content types;
and if the first judgment result is no and the second judgment result corresponding to the second occurrence of two adjacent language units is no, extracting second on-screen content located after the first on-screen content from the on-screen content according to the second-occurring two adjacent language units with inconsistent content types.
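Assuming no user-triggered segmentation operation was received (i.e., the first judgment result is no), the boundary detection in the claim can be sketched as follows; all names here, and the single-character classifier, are this sketch's own:

```python
def char_type(ch):
    # Illustrative classifier for a single language unit.
    if ch.isdigit():
        return 'digit'
    if ch.isascii() and ch.isalpha():
        return 'letter'
    return 'chinese' if '\u4e00' <= ch <= '\u9fff' else 'symbol'


def extract_first_and_second(units):
    """The first content-type change between adjacent units delimits the
    first on-screen content; the second change delimits the second."""
    boundaries = []
    for i in range(len(units) - 1):
        if char_type(units[i]) != char_type(units[i + 1]):
            boundaries.append(i + 1)
            if len(boundaries) == 2:
                break
    if len(boundaries) < 2:
        return None, None  # fewer than two type changes observed
    first = "".join(units[:boundaries[0]])
    second = "".join(units[boundaries[0]:boundaries[1]])
    return first, second
```

For "跑了5公里", the first type change (了|5) yields the first on-screen content "跑了", and the second change (5|公) yields the second on-screen content "5".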
2. The method of claim 1, wherein the first on-screen content corresponds to a content length that does not exceed a first length threshold and the second on-screen content corresponds to a content length that does not exceed a second length threshold.
3. The method of claim 1, wherein the obtaining the first on-screen content and the second on-screen content from the continuous on-screen content comprises:
acquiring, from the continuous on-screen content, target on-screen content which does not contain a preset segmentation symbol;
and dividing the target on-screen content according to the content types corresponding to the content contained in the target on-screen content, to obtain the first on-screen content and the second on-screen content contained in the target on-screen content.
4. The method of claim 3, wherein the dividing the target on-screen content according to the content type corresponding to the content included in the target on-screen content includes:
according to the content types corresponding to the content contained in the target on-screen content, acquiring first on-screen content, second on-screen content, and third on-screen content with different content types from the target on-screen content;
if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a first merging condition, merging the first on-screen content and the second on-screen content to serve as the first on-screen content, and taking the third on-screen content as the second on-screen content;
if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a second merging condition, merging the second on-screen content and the third on-screen content to serve as the second on-screen content;
the first merging condition being that the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content are two different content types among the letter type, number type, symbol type, English type, and picture type;
the second merging condition being that the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content are two different content types among the letter type, number type, symbol type, English type, and picture type.
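The two merging conditions might be sketched like this; the type labels and the `NON_CHINESE` set are this sketch's shorthand for "two different content types among the letter, number, symbol, English and picture types", and keeping the leading segment's type for the merged segment is a placeholder choice:

```python
NON_CHINESE = {"letter", "number", "symbol", "english", "picture"}


def merge_three(segments):
    """segments: [(text, type)] for the first, second and third on-screen
    content. Apply the first merging condition, then the second."""
    (s1, t1), (s2, t2), (s3, t3) = segments
    if t1 != t2 and t1 in NON_CHINESE and t2 in NON_CHINESE:
        # First condition: merged first+second becomes the first content,
        # and the third becomes the second content.
        return [(s1 + s2, t1), (s3, t3)]
    if t2 != t3 and t2 in NON_CHINESE and t3 in NON_CHINESE:
        # Second condition: merged second+third becomes the second content.
        return [(s1, t1), (s2 + s3, t2)]
    return segments
```

So a letter run followed by a digit run (e.g., a model number like "abc12") is kept together as one unit rather than split at the type change.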
5. The method of claim 1, wherein the obtaining the first on-screen content and the second on-screen content from the continuous on-screen content further comprises:
if the first judgment result is no and the second judgment result corresponding to the third occurrence of two adjacent language units is no, extracting third on-screen content located after the second on-screen content from the on-screen content according to the third-occurring two adjacent language units with inconsistent content types;
if the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content meet a first preset condition, combining the second on-screen content and the third on-screen content, and taking the combined content as the second on-screen content;
if the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content meet a second preset condition, combining the first on-screen content and the second on-screen content, taking the combined content as the first on-screen content, and taking the third on-screen content as the second on-screen content;
the first preset condition being that the content type corresponding to the second on-screen content and the content type corresponding to the third on-screen content are two different content types among the letter type, number type, symbol type, English type, and picture type;
the second preset condition being that the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content are two different content types among the letter type, number type, symbol type, English type, and picture type.
6. The method of claim 5, wherein the obtaining the first on-screen content and the second on-screen content from the continuous on-screen content further comprises:
if the content length corresponding to the first on-screen content does not exceed a first length threshold, retaining the first on-screen content;
and if the content length corresponding to the first on-screen content exceeds the first length threshold, discarding the first on-screen content.
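The retention rule in claims 2 and 6 reduces to a length check; the threshold values below are arbitrary placeholders, since the claims leave the actual values unspecified:

```python
FIRST_LENGTH_THRESHOLD = 5   # placeholder value, not from the patent
SECOND_LENGTH_THRESHOLD = 5  # placeholder value, not from the patent


def keep_pair(first, second):
    """Retain a (first, second) on-screen content pair only when each
    side stays within its length threshold; otherwise discard it."""
    return (first is not None and second is not None
            and len(first) <= FIRST_LENGTH_THRESHOLD
            and len(second) <= SECOND_LENGTH_THRESHOLD)
```

The practical effect is that long stretches of text never enter the relation, keeping the recorded associations short and reusable.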
7. The method of claim 1, wherein the extracting second on-screen content located after the first on-screen content from the on-screen content comprises:
and extracting, from the on-screen content according to the second-occurring two adjacent language units with inconsistent content types, content which is located after the first on-screen content and whose content length does not exceed a second length threshold, as the second on-screen content.
8. The method of claim 5, wherein the extracting third on-screen content located after the second on-screen content from the on-screen content comprises:
and extracting, from the on-screen content according to the third-occurring two adjacent language units with inconsistent content types, content which is located after the second on-screen content and whose content length does not exceed a third length threshold, as the third on-screen content.
9. The method of claim 1, wherein the content type corresponding to the first on-screen content and the content type corresponding to the second on-screen content comprise: at least two of a Chinese type, a letter type, a number type, a symbol type, an English type, and a picture type.
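One plausible, assumption-laden reading of these six content types as a classifier over a language unit; in particular, treating emoji code points as "picture" content and multi-letter Latin strings as "English" (versus a lone "letter") is this sketch's guess, not the patent's definition:

```python
def classify(unit):
    """Map a language unit to one of the six content types of claim 9."""
    if unit and all('\u4e00' <= c <= '\u9fff' for c in unit):
        return 'chinese'
    if unit.isascii() and unit.isalpha():
        # Guess: one Latin letter is 'letter', a longer run is 'english'.
        return 'english' if len(unit) > 1 else 'letter'
    if unit.isdigit():
        return 'number'
    if any('\U0001F300' <= c <= '\U0001FAFF' for c in unit):
        return 'picture'  # emoji stand in for picture content
    return 'symbol'
```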
10. A method of data processing, the method comprising:
receiving input content of a user; the input content comprising: on-screen content and/or an input string;
determining candidate items corresponding to the input content according to a multivariate relation; the multivariate relation comprising: language units with different content types; the multivariate relation being acquired by the data processing method of claim 1;
outputting candidate items corresponding to the input content;
a preset segmentation symbol does not exist between the on-screen content and the input string; and/or
the content length corresponding to the on-screen content does not exceed a length threshold.
11. The method of claim 10, wherein the determining candidates corresponding to the input content according to the multivariate relation comprises:
determining a target multivariate relation matched with the on-screen content from at least one multivariate relation, and determining candidate items corresponding to the on-screen content according to the target multivariate relation; or
determining a target multivariate relation matched with the on-screen content from at least one multivariate relation, and determining a language unit matched with the input string from the target multivariate relation as a candidate item corresponding to the input content; or
searching in a multivariate relation according to the input string or the word candidate corresponding to the input string, to obtain the candidate item corresponding to the input string.
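The three alternative lookup paths of claim 11 can be folded into a single sketch; `to_word` stands in for a hypothetical converter from an input string (e.g., pinyin) to its word candidate, and the literal prefix match in path 2 is a simplification:

```python
def candidates(relation, on_screen=None, input_string=None, to_word=None):
    """Path 1: on-screen content alone -> all recorded followers.
    Path 2: on-screen content + input string -> followers filtered
            by the string.
    Path 3: input string alone -> look up its word candidate."""
    if on_screen is not None:
        found = relation.get(on_screen, [])
        if input_string:
            found = [w for w in found if w.startswith(input_string)]
        return found
    if input_string is not None and to_word is not None:
        return relation.get(to_word(input_string), [])
    return []
```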
12. A data processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring first on-screen content and second on-screen content from continuous on-screen content; the content types corresponding to the first on-screen content and the second on-screen content are different, and the first on-screen content and the second on-screen content are adjacent;
the relation establishing module is used for establishing a multi-element relation between the first on-screen content and the second on-screen content;
wherein the acquiring first on-screen content and second on-screen content from the continuous on-screen content comprises:
judging, in the process of receiving the on-screen content of the user, whether a segmentation operation triggered by the user is received, to obtain a corresponding first judgment result;
judging, in the process of receiving the on-screen content of the user, whether the content types corresponding to two adjacent language units in the on-screen content are consistent, to obtain a corresponding second judgment result; the language unit comprising: a character, a single character, or a word;
if the first judgment result is no and the second judgment result corresponding to the first occurrence of two adjacent language units is no, extracting first on-screen content located at the front from the on-screen content according to the first-occurring two adjacent language units with inconsistent content types;
and if the first judgment result is no and the second judgment result corresponding to the second occurrence of two adjacent language units is no, extracting second on-screen content located after the first on-screen content from the on-screen content according to the second-occurring two adjacent language units with inconsistent content types.
13. A data processing apparatus, the apparatus comprising:
the receiving module is used for receiving input content of a user; the input content comprising: on-screen content and/or an input string;
the determining module is used for determining candidate items corresponding to the input content according to a multivariate relation; the multivariate relation comprising: language units with different content types; the multivariate relation being acquired by the data processing method of claim 1;
and the output module is used for outputting the candidate items corresponding to the input content;
wherein a preset segmentation symbol does not exist between the on-screen content and the input string; and/or
the content length corresponding to the on-screen content does not exceed a length threshold.
14. An apparatus for data processing comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
acquiring first on-screen content and second on-screen content from continuous on-screen content; the content types corresponding to the first on-screen content and the second on-screen content are different, and the first on-screen content and the second on-screen content are adjacent;
establishing a multi-element relation between the first on-screen content and the second on-screen content;
wherein the acquiring first on-screen content and second on-screen content from the continuous on-screen content comprises:
judging, in the process of receiving the on-screen content of the user, whether a segmentation operation triggered by the user is received, to obtain a corresponding first judgment result;
judging, in the process of receiving the on-screen content of the user, whether the content types corresponding to two adjacent language units in the on-screen content are consistent, to obtain a corresponding second judgment result; the language unit comprising: a character, a single character, or a word;
if the first judgment result is no and the second judgment result corresponding to the first occurrence of two adjacent language units is no, extracting first on-screen content located at the front from the on-screen content according to the first-occurring two adjacent language units with inconsistent content types;
and if the first judgment result is no and the second judgment result corresponding to the second occurrence of two adjacent language units is no, extracting second on-screen content located after the first on-screen content from the on-screen content according to the second-occurring two adjacent language units with inconsistent content types.
15. A machine readable medium having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the data processing method of one or more of claims 1 to 9.
16. An apparatus for data processing comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
receiving input content of a user; the input content comprising: on-screen content and/or an input string;
determining candidate items corresponding to the input content according to a multivariate relation; the multivariate relation comprising: language units with different content types; the multivariate relation being acquired by the data processing method of claim 1;
outputting the candidate items corresponding to the input content;
wherein a preset segmentation symbol does not exist between the on-screen content and the input string; and/or
the content length corresponding to the on-screen content does not exceed a length threshold.
17. A machine readable medium having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the data processing method of one or more of claims 10 to 11.
CN201810196490.5A 2018-03-09 2018-03-09 Data processing method and device Active CN110244861B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810196490.5A CN110244861B (en) 2018-03-09 2018-03-09 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810196490.5A CN110244861B (en) 2018-03-09 2018-03-09 Data processing method and device

Publications (2)

Publication Number Publication Date
CN110244861A CN110244861A (en) 2019-09-17
CN110244861B true CN110244861B (en) 2024-02-02

Family

ID=67882340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810196490.5A Active CN110244861B (en) 2018-03-09 2018-03-09 Data processing method and device

Country Status (1)

Country Link
CN (1) CN110244861B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101290632A (en) * 2008-05-30 2008-10-22 北京搜狗科技发展有限公司 Input method for user words participating in intelligent word-making and input method system
WO2012115492A2 (en) * 2011-02-27 2012-08-30 Park Taeun Information input system and information input method using an extension key
CN103049098A (en) * 2012-12-28 2013-04-17 华为技术有限公司 Device and method for input method shifting
CN103064967A (en) * 2012-12-31 2013-04-24 百度在线网络技术(北京)有限公司 Method and device used for establishing user binary relation bases
CN103092826A (en) * 2012-12-31 2013-05-08 百度在线网络技术(北京)有限公司 Method and device for structuring input entry according to input information of user
CN104268166A (en) * 2014-09-09 2015-01-07 北京搜狗科技发展有限公司 Input method, device and electronic device
CN104281649A (en) * 2014-09-09 2015-01-14 北京搜狗科技发展有限公司 Input method and device and electronic equipment
CN104915264A (en) * 2015-05-29 2015-09-16 北京搜狗科技发展有限公司 Input error-correction method and device
CN104978047A (en) * 2015-06-24 2015-10-14 北京搜狗科技发展有限公司 Cross-keyboard association method and apparatus
CN105759984A (en) * 2016-02-06 2016-07-13 上海触乐信息科技有限公司 Method and device for conducting secondary text inputting
CN106527752A (en) * 2016-09-23 2017-03-22 百度在线网络技术(北京)有限公司 Method and device for providing input candidate items
CN106557178A (en) * 2016-11-29 2017-04-05 百度国际科技(深圳)有限公司 For updating the method and device of input method entry
CN106855748A (en) * 2015-12-08 2017-06-16 阿里巴巴集团控股有限公司 A kind of data inputting method, device and intelligent terminal
CN107422872A (en) * 2016-05-24 2017-12-01 北京搜狗科技发展有限公司 A kind of input method, device and the device for input
CN107608532A (en) * 2016-07-11 2018-01-19 北京搜狗科技发展有限公司 A kind of association-feeding method, device and electronic equipment
CN107688399A (en) * 2016-08-05 2018-02-13 北京搜狗科技发展有限公司 A kind of input method and device, a kind of device for being used to input


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Graph-based retrieval of building information models for supporting the early design stages; Christoph Langenhan et al.; Advanced Engineering Informatics; pp. 413-426 *
A dynamic programming algorithm improves the speed and accuracy of syllable segmentation in Chinese character input methods; Cheng Xinyu, Meng Chuanliang; Journal of Guizhou University of Technology (Natural Science Edition) (No. 02); full text *
Design of a fast Chinese input method based on an embedded system; Zhang Jun, Ji Weidong; Natural Science Journal of Harbin Normal University (No. 01); full text *
Research and application of the lexicon structure of an intelligent Uyghur input method; Yuan Tinglei et al.; Computer Engineering and Applications; pp. 131-145 *

Also Published As

Publication number Publication date
CN110244861A (en) 2019-09-17

Similar Documents

Publication Publication Date Title
CN107608532B (en) Association input method and device and electronic equipment
CN109144285B (en) Input method and device
CN107918496B (en) Input error correction method and device for input error correction
CN108304412B (en) Cross-language search method and device for cross-language search
CN107688399B (en) Input method and device and input device
CN107665046B (en) Input method and device and input device
CN107291260B (en) Information input method and device for inputting information
CN109521888B (en) Input method, device and medium
CN108628461B (en) Input method and device and method and device for updating word stock
CN109388249B (en) Input information processing method and device, terminal and readable storage medium
CN108536653B (en) Input method, input device and input device
CN110795014B (en) Data processing method and device and data processing device
CN107943317B (en) Input method and device
CN110244861B (en) Data processing method and device
CN108073294B (en) Intelligent word forming method and device for intelligent word forming
CN111198620A (en) Method, device and equipment for presenting input candidate items
CN109426359B (en) Input method, device and machine readable medium
CN109725736B (en) Candidate sorting method and device and electronic equipment
CN107977089B (en) Input method and device and input device
CN110633017A (en) Input method, input device and input device
CN112987941B (en) Method and device for generating candidate words
CN110716653B (en) Method and device for determining association source
CN110858100B (en) Method and device for generating association candidate words
CN111381685B (en) Sentence association method and sentence association device
CN114967939A (en) Input method, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant