CN116956829A - Word stock construction method, device, equipment and storage medium - Google Patents

Word stock construction method, device, equipment and storage medium

Info

Publication number
CN116956829A
Authority
CN
China
Prior art keywords
text
custom
collection
word stock
program
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310118779.6A
Other languages
Chinese (zh)
Inventor
田野
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a word stock construction method, apparatus, device, and storage medium, and relates to the field of computers. The method comprises: displaying a text editing area in which on-screen text is displayed; in response to an operation of selecting and collecting a first text in the on-screen text, adding the first text to a custom word stock, wherein the custom word stock comprises at least one collection text, the collection text being at least one of a noun, a phrase, and a sentence; and in response to input of a first character string, displaying a recommendation window that displays at least one collection text in the custom word stock that matches the first character string. The method can improve the user's text input efficiency.

Description

Word stock construction method, device, equipment and storage medium
Technical Field
The present application relates to the field of computers, and in particular, to a method, an apparatus, a device, and a storage medium for word stock construction.
Background
A user typically inputs Chinese with an input method program as follows: the user inputs pinyin; the input method program recognizes the pinyin and displays a plurality of candidate words corresponding to it in a candidate word window; and the user selects one of the candidate words to commit it to the screen.
In the related art, the input method program has an automatic completion function: when a user inputs part of the pinyin of a phrase, the input method program can match the phrase according to that partial pinyin and display it among the candidate words. For example, when a user inputs 'fei liu zhi xia', the auto-completed common phrase '飞流直下三千尺' ("the torrent plunges straight down three thousand feet") appears among the candidate words, which improves the user's input efficiency.
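The automatic completion described above can be sketched as a prefix match on pinyin. This is only a minimal illustration, not the method of any particular input method program: the word stock contents, the pinyin keys, and the function name `autocomplete` are assumptions made for the example.

```python
# Hypothetical common word stock: full pinyin (no spaces) -> phrase.
COMMON_WORD_STOCK = {
    "feiliuzhixiasanqianchi": "飞流直下三千尺",
    "feiji": "飞机",
}


def autocomplete(typed: str, word_stock: dict[str, str]) -> list[str]:
    """Return phrases whose full pinyin starts with the typed partial pinyin."""
    key = typed.replace(" ", "").lower()
    if not key:
        return []
    return [phrase for pinyin, phrase in word_stock.items() if pinyin.startswith(key)]


print(autocomplete("fei liu zhi xia", COMMON_WORD_STOCK))
```

Because the lookup only consults the common word stock, a phrase that is absent from it can never be offered as a completion, which is the limitation the application addresses.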
However, in the related art the input method program can only provide the automatic completion function based on the common word stock. For nouns that a user uses frequently but that do not exist in the common word stock, the user can only input them by typing the full pinyin.
Disclosure of Invention
The embodiments of the application provide a word stock construction method, apparatus, device, and storage medium, which can improve the user's text input efficiency. The technical solutions are as follows:
According to an aspect of the present application, there is provided a word stock construction method, the method being performed by a terminal, the method comprising:
displaying a text editing area, wherein on-screen text is displayed in the text editing area;
in response to an operation of selecting and collecting a first text in the on-screen text, adding the first text to a custom word stock, wherein the custom word stock comprises at least one collection text, the collection text being at least one of a noun, a phrase, and a sentence;
and in response to input of a first character string, displaying a recommendation window, wherein the recommendation window displays at least one collection text in the custom word stock that matches the first character string.
According to another aspect of the present application, there is provided a word stock construction apparatus implemented in a terminal, the apparatus comprising:
a display module, configured to display a text editing area in which on-screen text is displayed;
an interaction module, configured to receive an operation of selecting and collecting a first text in the on-screen text;
a word stock module, configured to add the first text to a custom word stock in response to the operation of selecting and collecting the first text in the on-screen text, wherein the custom word stock comprises at least one collection text, the collection text being at least one of a noun, a phrase, and a sentence;
the interaction module being further configured to receive an operation of inputting a first character string;
the display module being further configured to display a recommendation window in response to the input of the first character string, wherein the recommendation window displays at least one collection text in the custom word stock that matches the first character string.
According to another aspect of the present application, there is provided a computer device comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the word stock construction method described in the above aspect.
According to another aspect of the present application, there is provided a computer readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the word stock construction method described in the above aspect.
According to another aspect of the present application, there is provided a computer program product storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the word stock construction method described in the above aspect.
The technical solutions provided by the embodiments of the application have at least the following beneficial effects:
During document editing, the terminal can receive an operation of selecting and collecting part of the edited text, and adds that text to the custom word stock. The custom word stock is dedicated to collecting text edited by the user, and when candidate words are later matched against a character string input by the user, the custom word stock is included in the matching. With this method, the user can add frequently used text to the custom word stock, and when that text needs to be input again, the automatic completion function can offer it as a completion candidate from the custom word stock. Moreover, different users have their own custom word stocks, and the text in each custom word stock is that user's own customary wording, which greatly improves how well the word stock fits the user. Displaying candidate words based on the custom word stock therefore improves both the match between the candidate words and the user's expectation and the user's input efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block diagram of a computer system provided in accordance with an exemplary embodiment of the present application;
FIG. 2 is a flowchart of a method for word stock construction provided by an exemplary embodiment of the present application;
FIG. 3 is a schematic diagram of a word stock construction method according to an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram of a word stock construction method according to an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of a word stock construction method according to an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of a word stock construction method provided by an exemplary embodiment of the present application;
FIG. 7 is a flowchart of a method for thesaurus construction provided by an exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of a word stock construction method according to an exemplary embodiment of the present application;
FIG. 9 is a schematic diagram of a word stock construction method provided by an exemplary embodiment of the present application;
FIG. 10 is a flowchart of a method for thesaurus construction provided by an exemplary embodiment of the present application;
FIG. 11 is a flowchart of a method for thesaurus construction provided by an exemplary embodiment of the present application;
FIG. 12 is a schematic diagram of a word stock construction method provided by an exemplary embodiment of the present application;
FIG. 13 is a schematic diagram of a word stock construction method provided by an exemplary embodiment of the present application;
FIG. 14 is a flowchart of a method for thesaurus construction provided by an exemplary embodiment of the present application;
FIG. 15 is a flowchart of a method for thesaurus construction provided by an exemplary embodiment of the present application;
FIG. 16 is a flowchart of a method for thesaurus construction provided by an exemplary embodiment of the present application;
FIG. 17 is a schematic diagram of a word stock construction method according to an exemplary embodiment of the present application;
FIG. 18 is a diagram of a method for word stock construction provided by an exemplary embodiment of the present application;
FIG. 19 is a flowchart of a method for thesaurus construction provided by an exemplary embodiment of the present application;
FIG. 20 is a block diagram illustrating a construction of a thesaurus construction apparatus according to an exemplary embodiment of the present application;
FIG. 21 is a block diagram of a terminal provided in an exemplary embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
FIG. 1 is a block diagram illustrating a computer system according to an exemplary embodiment of the present application. The computer system 100 includes: a first terminal 110, a server 120, a second terminal 130.
The first terminal 110 installs and runs a first client 111 of an input method program and/or a document editing program. When the first terminal 110 runs the first client 111, an interface of the first client 111 is displayed on a screen of the first terminal 110. The input method program can be any one of a pinyin input method program, a wubi input method program, a voice input method program, a handwriting input method program, a Chinese input method program, an English input method program, a Korean input method program, a Japanese input method program and the like. In this embodiment, the input method program is exemplified by a Chinese Pinyin input method program. The document editing program may be any of a local document editing program, an online document editing program (e.g., a collaborative document editing program). The online document editing program may refer to a web page or an application program. The first terminal 110 is a terminal used by the first user 112, and the first user account of the first user 112 is registered on the first client 111. Optionally, the first client 111 stores a custom word stock corresponding to the first user account; or, the server of the input method program/document editing program stores a custom word stock corresponding to the first user account. Optionally, the first client 111 stores a custom word stock corresponding to each online document; or, the server of the document editing program stores a custom word stock corresponding to each online document.
The second terminal 130 installs and runs a second client 131 of the input method program and/or the document editing program. When the second terminal 130 operates the second client 131, an interface of the second client 131 is displayed on a screen of the second terminal 130. The input method program can be any one of a pinyin input method program, a wubi input method program, a voice input method program, a handwriting input method program, a Chinese input method program, an English input method program, a Korean input method program, a Japanese input method program and the like. In this embodiment, the input method program is exemplified by a Chinese Pinyin input method program. The document editing program may be any one of a local document editing program and an online document editing program. The online document editing program may refer to a web page or an application program. The second terminal 130 is a terminal used by the second user 132, and the second client 131 has a second user account of the second user 132 registered thereon. Optionally, the second client 131 stores a custom word stock corresponding to the second user account; or, the server of the input method program/document editing program stores a custom word stock corresponding to the second user account. The custom thesaurus of the first user account and the second user account are different. Optionally, the second client 131 stores a custom word stock corresponding to each online document; or, the server of the document editing program stores a custom word stock corresponding to each online document.
Optionally, the input method programs/document editing programs installed on the first terminal 110 and the second terminal 130 are the same, or they are the same type of application program on different operating system platforms (Android or iOS). When the first client and the second client are clients of an online/collaborative document editing program, the first user account and the second user account may collaboratively edit the same online document. The first terminal 110 may refer broadly to one of a plurality of terminals and the second terminal 130 may refer broadly to another of the plurality of terminals; the present embodiment is illustrated with only the first terminal 110 and the second terminal 130. The device types of the first terminal 110 and the second terminal 130 are the same or different, and the device types include at least one of a smart phone, a tablet computer, an electronic book reader, an MP3 player, an MP4 player, a laptop computer, and a desktop computer.
Only two terminals are shown in fig. 1, but in different embodiments there are a plurality of other terminals 140 that can access the server 120. Optionally, there are one or more terminals 140 corresponding to the developer, a development and editing platform for the input method program is installed on the terminal 140, the developer can edit and update the client on the terminal 140, and transmit the updated installation package to the server 120 through a wired or wireless network, and the first terminal 110 and the second terminal 130 can download the installation package from the server 120 to implement the update of the client.
The first terminal 110, the second terminal 130, and the other terminals 140 are connected to the server 120 through a wireless network or a wired network.
Server 120 includes at least one of a server, a plurality of servers, a cloud computing platform, and a virtualization center. The server 120 is used to provide background services for clients of the input method program/document editing program. Optionally, the server 120 takes on primary computing work and the terminal takes on secondary computing work; alternatively, the server 120 takes on secondary computing work and the terminal takes on primary computing work; alternatively, a distributed computing architecture is used for collaborative computing between the server 120 and the terminals.
In one illustrative example, server 120 includes a processor 122, a user account database 123, a custom thesaurus 124, and a user-oriented Input/Output Interface (I/O Interface) 125. The processor 122 is configured to load instructions stored in the server 121, and process data in the user account database 123 and the custom thesaurus 124; the user account database 123 is used for storing data of user accounts used by the first terminal 110, the second terminal 130, and the other terminals 140, such as an avatar of the user account, a nickname of the user account, a group in which the user account is located, and the like; the custom word stock 124 is used for storing, managing and updating custom word stocks corresponding to the user accounts; the user-oriented I/O interface 125 is used to establish communication exchanges of data with the first terminal 110 and/or the second terminal 130 via a wireless network or a wired network.
With the implementation environment described above, the word stock construction method provided by the embodiments of the present application is described below, taking a terminal shown in fig. 1 as the execution subject of the method. The terminal runs a client of an input method program.
Fig. 2 is a flowchart illustrating a word stock construction method according to an exemplary embodiment of the present application. This embodiment is illustrated with the method being performed by the terminal shown in fig. 1. The method comprises the following steps:
Step 220: Display a text editing area, in which on-screen text is displayed.
The text editing area may be provided by a second application program. The second application program may be any one of a document editing program, an online document editing program, and a social application program, or may be an online document editing web page. More generally, the second application program may be any application program that supports text input, and the text editing area may be any area in it that supports text input. For example, the text editing area may be the document editing area of a document editing program, the document editing area on an online document editing web page, or the chat content input area in a social application program.
The text editing area is an area supporting text input, and the text editing area can display text content (on-screen text) that has been input by the user. The user may enter text using an input method program in the text editing area.
The text editing area can be displayed with an input cursor, the input cursor is used for indicating the input position of a text, a user inputs a character string at the input cursor, the input method program can be triggered to display a character editing box and a candidate word window, the character editing box is displayed with the character string just input by the user, and the candidate word window is displayed with at least one candidate word matched with the character string. The user clicks on a candidate word in the candidate word window to cause the candidate word to be displayed on the screen at the input cursor in the text editing area.
The on-screen text is text that the user has entered within the text editing area. The term "on-screen" is used relative to the input method program: because the input method program displays various texts during the input process, the text that the user has committed to the text editing area using the input method program is called on-screen text.
In one input mode of the input method program, as shown in fig. 3, a character string 302 that the user inputs using a keyboard is displayed in a character editing box; this character string may be referred to as the user input text. The input method program matches candidate words according to the user input text and displays at least one matched candidate word 303 in a candidate word window. When the user selects a candidate word, the candidate word is committed to the screen, that is, displayed in the text editing area 301. In other words, the on-screen text is the text finally input by the user using the input method program.
In another input mode of the input method program, the character string input by the user is displayed directly in the text editing area, and this character string is also on-screen text. For example, English characters input by the user in the English mode of the input method program, as well as symbols, spaces, special characters, and the like input using the input method program, are displayed directly in the text editing area.
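The two input modes above can be sketched as a small state holder that separates the character string being composed from the on-screen text. This is only an illustrative model; the class name `TextEditingArea` and its method names are assumptions, not part of the application.

```python
class TextEditingArea:
    """Toy model of a text editing area fed by an input method program."""

    def __init__(self) -> None:
        self.on_screen_text = ""  # text already committed to the editing area
        self.composition = ""     # character string shown in the character editing box

    def type_pinyin(self, chars: str) -> None:
        # First mode: keystrokes go to the character editing box, not the screen.
        self.composition += chars

    def commit(self, candidate: str) -> None:
        # Selecting a candidate word puts it on screen and clears the composition.
        self.on_screen_text += candidate
        self.composition = ""

    def type_direct(self, chars: str) -> None:
        # Second mode: English characters, symbols, spaces go straight on screen.
        self.on_screen_text += chars
```

In this model, only the content of `on_screen_text` is eligible for the later select-and-collect operation; the composition string never is.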
The on-screen text may include Chinese characters, English characters, numbers, symbols, spaces, and special characters.
Step 240: In response to an operation of selecting and collecting a first text in the on-screen text, add the first text to a custom word stock, wherein the custom word stock comprises at least one collection text, the collection text being at least one of a noun, a phrase, and a sentence.
The first text may include at least one of Chinese characters, English characters, numbers, symbols, spaces, and special characters. The first text may be a noun, a phrase, or a sentence. The first text may be some or all of the on-screen text. The first text may be one continuous segment of the on-screen text. Alternatively, the first text may be several discontinuous segments of the on-screen text, in which case the segments are spliced in order into a single segment of text, which is stored in the custom word stock as the first text.
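The splicing of discontinuous segments can be sketched as follows. The representation of a selection as `(start, end)` character offsets and the function name `splice_selection` are assumptions made for illustration; the application does not specify how selections are represented.

```python
def splice_selection(on_screen_text: str, ranges: list[tuple[int, int]]) -> str:
    """Join selected (start, end) ranges in document order; end is exclusive."""
    ordered = sorted(ranges)  # splice in the order the segments appear on screen
    return "".join(on_screen_text[start:end] for start, end in ordered)


text = "Virtual Reality (VR) headset"
# Two discontinuous selections, given out of order on purpose.
first_text = splice_selection(text, [(16, 20), (0, 7)])
print(first_text)  # Virtual(VR)
```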
In one embodiment, the custom word stock is a word stock that the input method program sets up specifically for a user, and different users have different custom word stocks. Optionally, if a user account (the first user account) is logged in on the client of the input method program, the first text may be stored in the custom word stock of the first user account (which may be synchronized with the custom word stock of the first user account on the server); if no user account is logged in on the client of the input method program, the first text may be stored in a local custom word stock. The custom word stock may be stored on the terminal (in the client of the input method program) or on the server of the input method program.
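The choice between an account-level and a local custom word stock can be sketched as below. This is a simplified in-memory model under stated assumptions: the class `WordStockStore`, its fields, and the returned location strings are hypothetical, and server synchronization is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WordStockStore:
    """In-memory stand-in for local and per-account custom word stocks."""

    local: list[str] = field(default_factory=list)
    by_account: dict[str, list[str]] = field(default_factory=dict)

    def collect(self, text: str, account: Optional[str] = None) -> str:
        """Store the text and report which word stock received it."""
        if account is None:
            target, location = self.local, "local"
        else:
            target = self.by_account.setdefault(account, [])
            location = f"account:{account}"
        if text not in target:  # avoid collecting the same text twice
            target.append(text)
        return location
```

Keeping one list per account mirrors the statement that different users have different custom word stocks.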
In one embodiment, the custom word stock is a word stock that the document editing program sets up for a user, with different users having different custom word stocks. Alternatively, the custom word stock is a word stock that the document editing program sets up for each online document; one online document may have multiple custom word stocks, and multiple online documents may share one custom word stock. For example, after a first online document is opened, any user account can use the custom word stock of the first online document, and while editing the first online document a user account can collect text from it into that custom word stock. The custom word stock of an online document is updated in real time: text collected by one user account is immediately synchronized to all clients that have the online document open.
The custom word stock is used for storing at least one text collected by the user. For example, the custom word stock stores at least one uncommon noun (a noun not in the common word stock) that the user collected while editing a document, or stores phrases and sentences that the user commonly uses.
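The real-time synchronization of a per-document custom word stock can be sketched with a simple listener list. This is an assumption-laden illustration: the class `DocumentWordStock`, its `open` and `collect` methods, and the callback mechanism are hypothetical, and a real collaborative editor would push updates over a network rather than call functions in process.

```python
from typing import Callable


class DocumentWordStock:
    """Custom word stock attached to one online document and shared by its clients."""

    def __init__(self) -> None:
        self.entries: list[str] = []
        self._listeners: list[Callable[[str], None]] = []

    def open(self, on_update: Callable[[str], None]) -> list[str]:
        """Register a client that opened the document; return the current entries."""
        self._listeners.append(on_update)
        return list(self.entries)

    def collect(self, text: str) -> None:
        """Add a collection text and notify every client that has the document open."""
        if text in self.entries:
            return
        self.entries.append(text)
        for notify in self._listeners:
            notify(text)
```

For example, if two clients call `open` with their own callbacks and one of them then calls `collect`, both callbacks receive the new collection text.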
The custom word stock is used to match collection text when the user inputs a character string. When a user inputs a character string, the input method program or the document editing program can match collection text for the user according to the custom word stock. When the custom word stock belongs to the input method program, the input method program can match candidate words according to its original rules while also matching collection text. It can display the candidate words and the collection text obtained by these two kinds of matching together in one window (for example, a recommendation window or a candidate word window), or display them in different windows (for example, the candidate words in the candidate word window and the collection text in the recommendation window).
When the custom word stock belongs to the input method program, it can also be used by the automatic completion function of the input method program, in which the collection text in the custom word stock is partially matched against the character string input by the user (the character string is considered to match a collection text as long as it matches part of the content of that collection text).
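A minimal sketch of the two kinds of matching and the two display options follows. Several things here are assumptions for illustration only: collection texts are keyed by hand-written pinyin, "partial match" is interpreted as substring containment, ordinary candidates use prefix matching, and the Chinese text 虚拟现实(VR) is assumed to be the original of the "Virtual Reality (VR)" example.

```python
# Hypothetical custom word stock: full pinyin -> collection text.
CUSTOM_WORD_STOCK = {"xunixianshi": "虚拟现实(VR)"}
# Hypothetical common word stock used by the ordinary candidate matching.
COMMON_WORD_STOCK = {"xuni": "虚拟", "xunuo": "许诺"}


def normalize(typed: str) -> str:
    return typed.replace(" ", "").lower()


def match_collection(typed: str, custom: dict[str, str]) -> list[str]:
    """Partial match: the typed string only has to match part of the pinyin."""
    key = normalize(typed)
    if not key:
        return []
    return [text for pinyin, text in custom.items() if key in pinyin]


def match_candidates(typed: str, common: dict[str, str]) -> list[str]:
    """Ordinary matching, modelled here as a simple prefix match."""
    key = normalize(typed)
    if not key:
        return []
    return [word for pinyin, word in common.items() if pinyin.startswith(key)]


def build_windows(typed: str, single_window: bool = True) -> dict[str, list[str]]:
    collected = match_collection(typed, CUSTOM_WORD_STOCK)
    candidates = match_candidates(typed, COMMON_WORD_STOCK)
    if single_window:
        # Both kinds of result share one window, collection texts listed first.
        return {"candidate_window": collected + candidates}
    return {"recommendation_window": collected, "candidate_window": candidates}
```

Listing collection texts first in the shared window is a design choice of this sketch; the application only states that both kinds of result can be shown in one window or in separate windows.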
The text in the custom word stock is referred to as collection text; that is, once collected, the first text is also a collection text, and the custom word stock includes at least one collection text. The collection text may be at least one of a noun, a phrase, and a sentence.
Optionally, the collection texts in the custom word stock may be divided into multiple groups, with different groups storing collection texts of different categories or fields. Optionally, the groups may be set by the user or generated automatically by the input method program/document editing program.
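Grouping can be sketched as a mapping from group name to collection texts. The class `GroupedWordStock`, the default group name, and the example group names are assumptions for illustration; automatic generation of groups is not modelled.

```python
from collections import defaultdict
from typing import Optional


class GroupedWordStock:
    """Custom word stock whose collection texts are organised into named groups."""

    def __init__(self) -> None:
        self._groups: defaultdict[str, list[str]] = defaultdict(list)

    def collect(self, text: str, group: str = "ungrouped") -> None:
        if text not in self._groups[group]:
            self._groups[group].append(text)

    def texts(self, group: Optional[str] = None) -> list[str]:
        """Return one group's texts, or every text when no group is given."""
        if group is not None:
            return list(self._groups.get(group, []))
        return [text for texts in self._groups.values() for text in texts]
```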
The operation of selecting and collecting may be any of the following:
(1) In response to an operation of selecting a first text in the on-screen text, a first collection control corresponding to the first text is displayed; in response to an operation of triggering the first collection control, the first text is added to the custom word stock.
The first collection control may be displayed near the first text, e.g., the first collection control is displayed in the upper right corner of the first text.
For example, as shown in fig. 4, the on-screen text "Virtual Reality (VR)" 304 is displayed in the text editing area 301. When the on-screen text "Virtual Reality (VR)" 304 is selected as the first text, it is highlighted (to mark that the text has been selected) and a first collection control 305 is displayed in the upper right corner of the first text. Clicking the first collection control 305 collects the first text "Virtual Reality (VR)" into the custom word stock. The next time the user wants to input the phrase, the user only needs to input "xu ni" to see the collection text "Virtual Reality (VR)" in the recommendation window.
(2) A second collection control is displayed in the status bar of the input method program/document editing program; in response to an operation of selecting a first text in the on-screen text and dragging the first text to the second collection control, the first text is added to the custom word stock.
It should be noted that, as shown in fig. 5, when the user uses the input method program, the following three windows of the input method program may appear: status bar 308, character edit box 306, candidate word window 307.
The status bar is used for displaying and managing the input state of the input method program, for example, switching the input mode (Chinese/English, simplified/traditional, half-width/full-width), changing the skin of the input method program, opening the settings interface of the input method program, and the like.
The character editing box is used for displaying the character string input by the user. For example, in a pinyin input method program, the character editing box displays the pinyin characters input by the user; in a Wubi input method program, the character editing box displays the English characters input by the user. The character string in the character editing box is used to match candidate words.
The candidate word window is used for displaying candidate words, and the candidate words match the character string in the character editing box. For example, in a pinyin input method program, a pinyin character string is input in the character editing box, and a candidate word is a character, word, or phrase whose pinyin is the same as the pinyin character string; or a character, word, or phrase whose pinyin is partially the same as the pinyin character string; or a character, word, or phrase whose pinyin initials (acronym) are the same or partially the same as the pinyin character string. The candidate word window can also serve as the recommendation window to display collection text.
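The three ways a pinyin character string can match a candidate (full pinyin, partial pinyin, and initials) can be sketched as below. The syllable lists are supplied by hand, "partially the same" is interpreted as a prefix match, and the function name `match_kind` is hypothetical; none of this is specified by the application.

```python
from typing import Optional


def match_kind(typed: str, syllables: list[str]) -> Optional[str]:
    """Classify how a typed string matches a candidate's pinyin syllables."""
    key = typed.replace(" ", "").lower()
    if not key:
        return None
    full = "".join(syllables)                 # e.g. "xunixianshi"
    acronym = "".join(s[0] for s in syllables)  # e.g. "xnxs"
    if key == full:
        return "full pinyin"
    if full.startswith(key):
        return "partial pinyin"
    if acronym.startswith(key):
        return "acronym"  # same or partially the same initials
    return None


VIRTUAL_REALITY = ["xu", "ni", "xian", "shi"]  # assumed pinyin of 虚拟现实
for typed in ("xu ni xian shi", "xu ni", "xnxs", "xn", "abc"):
    print(typed, "->", match_kind(typed, VIRTUAL_REALITY))
```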
For example, as shown in fig. 6, a second collection control 309 may be displayed in the status bar of the input method program, and the on-screen text "Virtual Reality (VR)" 304 is displayed in the text editing area 301. When the on-screen text "Virtual Reality (VR)" 304 is selected as the first text, it is highlighted (to mark that the text has been selected). Dragging the first text "Virtual Reality (VR)" to the second collection control 309 collects the first text into the custom word stock. The next time the user wants to input the phrase, the user only needs to input "xu ni" to see the candidate word "Virtual Reality (VR)" in the candidate word window.
(3) In response to an operation of triggering a collection shortcut key while a first text in the on-screen text is selected, the first text is added to the custom word stock.
The collection shortcut key can be preset by the input method program or set by the user. For example, the collection shortcut key may be Ctrl+1.
(4) The input method program/document editing program automatically identifies an on-screen text, automatically highlights the identified first text, and adds the first text to a custom thesaurus in response to an operation of triggering a first collection control of the first text; or, in response to an operation of dragging the first text to the second collection control, adding the first text to the custom thesaurus.
For example, with the user's authorization, the input method program may be allowed to capture the picture currently displayed by the terminal. The input method program obtains the current display picture of the terminal, performs character recognition on it to obtain at least one character recognition result, and determines a first character recognition result (which may include one or more segments of text) that matches the on-screen text record (the history of text committed to the screen by the user, as recorded by the input method program) as the on-screen text.
A word segmentation algorithm is then called to segment the on-screen text into at least one phrase, and the common words among the at least one phrase are removed using a common word stock to obtain at least one non-common word. The at least one non-common word is taken as the first text, a highlight prompt is displayed at the position of the first text, and a first collection control is displayed near the first text (when the first text includes a plurality of non-common words at a plurality of positions, one first collection control is displayed for each non-common word). The user can click the first collection control to collect the first text quickly, or drag the first text to the second collection control in the status bar to collect it quickly.
For another example, the document editing program can directly obtain the on-screen text input by the user, call the word segmentation algorithm to segment the on-screen text into at least one phrase, and remove the common words among the at least one phrase using the common word stock to obtain at least one non-common word. The at least one non-common word is taken as the first text, a highlight prompt is displayed at the position of the first text, and a first collection control is displayed near the first text (when the first text includes a plurality of non-common words at a plurality of positions, one first collection control is displayed for each non-common word). The user can click the first collection control to collect the first text quickly, or drag the first text to the second collection control in the status bar to collect it quickly.
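The selection of the on-screen text among the character recognition results can be sketched as follows. This is a minimal illustration only: the function name `find_on_screen_text` is hypothetical, and "matches" is assumed here to mean that a recognition result contains at least one entry of the on-screen text record once whitespace is ignored; the application does not fix a particular matching criterion.

```python
def find_on_screen_text(recognition_results, on_screen_record):
    """Return the first recognition result that matches the on-screen
    text record, or None if no result matches.

    Assumption: a result "matches" when it contains any recorded
    on-screen text, compared with whitespace removed.
    """
    def squash(s):
        return "".join(s.split())

    records = [squash(r) for r in on_screen_record if squash(r)]
    for result in recognition_results:
        flat = squash(result)
        if any(rec in flat for rec in records):
            return result
    return None
```

Menu captions, clock text and other recognized strings that the user never typed are skipped, because they do not appear in the record kept by the input method program.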
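The segmentation and filtering step described above can be sketched as follows. The segmenter here is a deliberately simple whitespace splitter standing in for a real word segmentation algorithm, and `COMMON_WORDS` is a tiny hypothetical common word stock; both are assumptions made for illustration.

```python
# Hypothetical common word stock; a real program would ship a far larger one.
COMMON_WORDS = {"the", "is", "a", "of", "in", "and", "used", "for"}

def segment(text):
    """Stand-in segmenter: split on whitespace and strip punctuation.
    A Chinese-language program would call a word segmentation algorithm here."""
    words = []
    for token in text.split():
        token = token.strip(".,;:!?()\"'")
        if token:
            words.append(token)
    return words

def candidate_first_texts(on_screen_text, common_words=COMMON_WORDS):
    """Return the segments that are not in the common word stock,
    in order of first appearance and without duplicates."""
    seen = set()
    result = []
    for word in segment(on_screen_text):
        key = word.lower()
        if key in common_words or key in seen:
            continue
        seen.add(key)
        result.append(word)
    return result
```

Each returned word is a candidate first text; a user interface would highlight it and place a first collection control next to it.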
Step 260: in response to the input of a first character string, display a recommendation window, where the recommendation window displays at least one collection text in the custom word stock that matches the first character string.
After the custom word stock is constructed by the methods in step 220 and step 240, when the user inputs a character string using the input method program, the input method program matches candidate words for the user not only according to the original rules but also in the custom word stock. If a collection text is matched in the custom word stock, the collection text is displayed as a candidate word in the candidate word window.
The method of matching collection text according to the first character string may include at least one of the following:
(1) The first character string is the same as the trigger abbreviation of the collection text. The trigger abbreviation is set by the user for the collection text, and the user can set a separate trigger abbreviation for each collection text. Alternatively, the trigger abbreviation is a character string formed from the initial letters of the collection text.
(2) The first character string is identical to the pinyin full-spelling of the collection text. For example, the favorites text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the first string matching it may be "wu li shang xing gong xiang xin dao".
(3) The first character string is the same as the pinyin of a portion of the text of the favorite text. For example, the favorites text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the first string matched thereto may be "wu li" or "shang xing". Alternatively, the portion of text may be the first n Chinese characters of the favorite text, n being a positive integer.
(4) The first character string is identical to the pinyin initials of the favorite text. For example, the favorites text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the first string matched thereto may be "w l s x g x x d".
(5) The first character string is identical to the pinyin initials of the partial text of the favorite text. For example, the favorites text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the first string matching it may be "w l" or "s x". Alternatively, the portion of text may be the first n Chinese characters of the favorite text, n being a positive integer.
(6) The first character string is identical to a portion of English in the collection text. For example, the favorites text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the first string matched thereto may be "PUSCH". Alternatively, the portion of English may be an english capital letter in the collection text, or the portion of English may be an english first letter in the collection text.
(7) The first character string is identical to a portion of the digits in the favorite text. For example, the favorites text is "π=3.1415926" and the first string that matches it may be "3.14".
(8) The first character string is identical to a portion of the special characters in the favorite text. For example, the favorites text is "π=3.1415926" and the first string that matches it may be "π".
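The matching methods (1) to (8) can be sketched together as one predicate. This is a minimal sketch under stated assumptions: the `Entry` type and its `syllables` field (the pinyin of the Chinese characters of the collection text, supplied by the caller) are hypothetical, and methods (6) to (8) are loosened to a plain substring test on the collection text.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    text: str                                      # the collection text itself
    syllables: list = field(default_factory=list)  # hypothetical: pinyin of its Chinese characters
    trigger: str = ""                              # user-defined trigger abbreviation, if any

def squash(s):
    """Drop whitespace and lowercase so 'wu li' and 'wuli' compare equal."""
    return "".join(s.split()).lower()

def matches(first_string, entry):
    q = squash(first_string)
    if not q:
        return False
    # (1) trigger abbreviation
    if entry.trigger and q == squash(entry.trigger):
        return True
    # (2)-(5) full or partial pinyin, and full or partial pinyin initials,
    # always aligned to syllable boundaries
    initials = "".join(s[0] for s in entry.syllables)
    n = len(entry.syllables)
    for i in range(n):
        for j in range(i + 1, n + 1):
            if q == "".join(entry.syllables[i:j]) or q == initials[i:j]:
                return True
    # (6)-(8) part of the English, digits or special characters in the text
    # (assumption: approximated by a substring test)
    return q in squash(entry.text)
```

Comparing on syllable boundaries keeps a partial match such as "shang xing" valid while rejecting a string that cuts a syllable in half.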
In one embodiment, in response to entering a first string in a character edit box of the input method program, a candidate word window (recommendation window) of the input method program is displayed, the candidate word window (recommendation window) displaying at least one favorite text in the custom thesaurus that matches the first string.
In another embodiment, in response to entering the first string in a text editing area of the document editing program, a recommendation window of the document editing program is displayed, the recommendation window displaying at least one favorite text in the custom thesaurus that matches the first string.
In another alternative embodiment, in response to a second character string being input in the text editing area, a candidate word window (recommendation window) of the input method program is displayed, the candidate word window (recommendation window) displaying at least one collection text in the custom word stock of the input method program that matches the second character string, the second character string being on-screen text that has already been entered in the text editing area. That is, the input method program matches collection text in the custom word stock according to the character string already on the screen, and displays the successfully matched collection text in a candidate word window dedicated to completing the on-screen second character string. The method of matching collection text according to the second character string is the same as the method of matching collection text according to the first character string described above.
The candidate word window of the input method program is used to display candidate words, and a candidate word in the candidate word window is committed to the screen when the user selects it.
In summary, in the method provided by the embodiment of the application, when a document is being edited, the terminal may receive an operation of selecting and collecting part of the edited text and add that text to the custom word stock. The custom word stock is dedicated to collecting text edited by the user, and it is consulted when candidate words are later matched against character strings input by the user. With this method, the user can add frequently used text to the custom word stock, and when that text needs to be input later, the automatic completion function can provide completion candidates from the custom word stock. Moreover, each user has their own custom word stock containing that user's own customary words, which greatly improves how well the word stock fits the user. Displaying candidate words based on the custom word stock both brings the candidate words closer to what the user expects and improves the user's input efficiency.
For example, the terminal may automatically categorize the text of the user's collection.
For example, when multiple favorite texts are matched, the matched favorite texts may be arranged according to a certain ranking rule in the recommendation window.
The input method program also provides a custom word stock management window for custom word stock, so that a user can manage the custom word stock conveniently.
Fig. 7 is a flowchart illustrating a word stock construction method according to an exemplary embodiment of the present application. This embodiment is illustrated by the execution of the method by the terminal shown in fig. 1. Based on the embodiment shown in fig. 2, step 240 includes step 241, and step 260 is followed by step 280.
Step 220: display a text editing area, where on-screen text is displayed in the text editing area.
Step 241: in response to an operation of selecting and collecting a first text in the on-screen text, add the first text to a first group in the custom word stock that is semantically related to the on-screen text, where the custom word stock includes collection text of at least one of the following types: nouns, phrases and sentences.
The custom thesaurus includes at least one grouping, each of the at least one grouping including at least one favorite text. These groupings may be set by the user, or may be automatically classified by an input method program or a document editing program (hereinafter, simply referred to as "input method program/document editing program" for convenience of description).
The input method program/document editing program may have a semantic recognition model stored therein. The semantic recognition model is used to extract semantic features of the input text. For example, the semantic recognition model may be BERT (Bidirectional Encoder Representations from Transformers, a multi-layer bidirectional Transformer encoder model).
The input method program/document editing program calls the semantic recognition model to perform semantic recognition on the on-screen text to obtain a first semantic feature; calls the semantic recognition model to perform semantic recognition on the collection texts in one group to obtain a group semantic feature, and repeats this step to obtain at least one group semantic feature corresponding to the at least one group respectively; calculates the semantic distance between the first semantic feature and each of the at least one group semantic feature to obtain at least one semantic distance; and adds the first text to a first group in the custom word stock, the first group being the group corresponding to the minimum of the at least one semantic distance.
The input method program/document editing program can extract a semantic feature for each group in the custom word stock: all collection texts in a group are concatenated into one sequence and input into the semantic recognition model to obtain one group semantic feature. Alternatively, each collection text in a group is input into the semantic recognition model to obtain a plurality of semantic features, and the plurality of semantic features are averaged to obtain one group semantic feature.
When a user edits a document, the input method program/document editing program can acquire the on-screen text that the user has already edited (the content of the online document) and input it into the semantic recognition model to obtain the first semantic feature. Alternatively, the input method program/document editing program may segment the on-screen text using a word segmentation algorithm (e.g., the jieba word segmentation algorithm), filter the common words out of the segmentation result using a common word stock, and then input the non-common words in the segmentation result (the words other than the common words) into the semantic recognition model to obtain the first semantic feature.
Then, the semantic distance between the first semantic feature and the group semantic feature of each group is calculated, and the group with the smallest semantic distance (the first group) is selected; text collected by the user in the document (for example, the first text) is automatically classified into the first group.
In this way, the input method program/document editing program can automatically classify text collected from a document into the group for the technical field of the document the user is currently editing, which improves the efficiency of classifying collection text in the custom word stock. Moreover, when collection text is later recommended to the user from the custom word stock, collection text in the group for the field to which the text being edited belongs can be recommended preferentially.
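The grouping procedure can be sketched as follows. Note the assumptions: `embed` is a toy bag-of-words function standing in for the semantic recognition model (it is not BERT), cosine distance is used as one possible semantic distance, and the group semantic feature is taken as the average of the features of the collection texts in the group. All function names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for the semantic recognition model: a bag-of-words vector."""
    return Counter(text.lower().split())

def mean_vector(vectors):
    """Average several sparse vectors into one group semantic feature."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {k: c / len(vectors) for k, c in total.items()}

def cosine_distance(a, b):
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def nearest_group(on_screen_text, groups):
    """Return the name of the group whose feature is closest to the on-screen text.
    Assumes at least one non-empty group exists."""
    first_feature = embed(on_screen_text)
    distances = {
        name: cosine_distance(first_feature, mean_vector([embed(t) for t in texts]))
        for name, texts in groups.items() if texts
    }
    return min(distances, key=distances.get)

def collect(first_text, on_screen_text, groups):
    """Add first_text to the group semantically closest to the on-screen text."""
    name = nearest_group(on_screen_text, groups)
    groups[name].append(first_text)
    return name
```

Replacing `embed` with a real sentence encoder would leave the rest of the procedure unchanged, since only the feature extraction depends on the model.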
Step 260: in response to the input of a first character string, display a recommendation window, where the recommendation window displays at least one collection text in the custom word stock that matches the first character string.
The collection texts displayed in the recommendation window are sorted according to a recommendation rule; the recommendation rule includes at least one of the following:
(1) A first collection text whose trigger abbreviation is the same as the first character string has a first priority, the trigger abbreviation comprising trigger characters set by the user for the first collection text;
(2) A second collection text that matches the first character string and is present in a first domain word stock has a second priority; the input method program/document editing program is provided with at least one domain word stock, and the first domain word stock is the domain word stock, among the at least one domain word stock, that is semantically closest to the on-screen text;
(3) A third collection text that matches the first character string and is not present in the first domain word stock has a third priority;
wherein the first priority is higher than the second priority, and the second priority is higher than the third priority; the first, second and third collection texts are collection texts in the custom word stock.
If the trigger abbreviation of the first collection text is identical to the first character string, the first collection text is arranged at the forefront. For example, the first collection text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the trigger abbreviation set by the user for the first collection text is "PUSCH". Then, when the first character string input by the user is "PUSCH", the first collection text "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)" is ranked first in the candidate word window.
In addition, the matched collection texts can be ranked according to the degree of domain matching between each collection text and the current on-screen text.
The degree of domain matching is determined as follows:
First, domain word stocks for a plurality of domains are provided in the input method program/document editing program. Each domain word stock contains a plurality of texts and has a feature vector, which is obtained by extracting features from each of the texts in the domain word stock with a BERT feature extraction model and averaging them.
Then, the current on-screen text is input into the BERT feature extraction model for feature extraction to obtain the on-screen text feature.
The feature distance between the on-screen text feature and the feature vector of each domain word stock is calculated, and the domain word stock with the smallest feature distance (for example, the first domain word stock) is selected.
Then, when a plurality of collection texts are matched according to the first character string, the second collection text (comprising at least one text) that belongs to the first domain word stock is selected from the plurality of collection texts and arranged in front, and the third collection text (comprising at least one text) that does not belong to the first domain word stock is arranged behind.
Illustratively, an input method program runs on the terminal, and the custom word stock can be a word stock of the input method program; and/or a document editing program runs on the terminal, and the custom word stock can be a word stock of the document editing program.
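The three-level ordering can be sketched as a stable sort on a priority key. This sketch assumes that the first domain word stock has already been selected (for example by a nearest-feature search) and is passed in as a set of texts; the function and parameter names are hypothetical.

```python
def rank_collection_texts(first_string, matched, triggers, first_domain_words):
    """Order matched collection texts by the three priorities.

    matched            -- collection texts that already match first_string
    triggers           -- dict: collection text -> user-defined trigger abbreviation
    first_domain_words -- texts of the domain word stock closest to the on-screen text
    """
    query = first_string.lower()

    def priority(text):
        trigger = triggers.get(text, "")
        if trigger and trigger.lower() == query:
            return 0  # first priority: trigger abbreviation equals the input
        if text in first_domain_words:
            return 1  # second priority: present in the first domain word stock
        return 2      # third priority: everything else that matched

    # sorted() is stable, so texts with equal priority keep their original order.
    return sorted(matched, key=priority)
```

Because the sort is stable, any earlier ordering of the matched texts (for example by frequency of use) is preserved within each priority level.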
Step 280: in response to an operation of triggering the custom word stock management entry, display a custom word stock management window.
The custom thesaurus management window may have a variety of management functions, such as removing the favorite text, modifying the favorite text, setting a trigger abbreviation for the favorite text, setting groupings of the favorite text, and the like.
For example, as shown in fig. 8, the status bar of the input method program may have a custom thesaurus management entry control, and the user triggers the custom thesaurus management entry control to display a custom thesaurus management window 401. The custom thesaurus management window displays at least one grouping of custom thesaurus, e.g., default grouping 402. At least one collection text, such as a physical uplink control channel, augmented reality, virtual reality, is also displayed in the custom thesaurus management window.
1. The favorite text is removed.
Displaying a custom word stock management window, wherein the custom word stock management window displays a fourth collection text in the custom word stock and a removal control corresponding to the fourth collection text; and in response to triggering the operation of the removal control, removing the fourth collection text from the custom thesaurus.
As shown in FIG. 8, the custom thesaurus management window may also display a remove control 403 for each favorite text, which can be removed from the custom thesaurus by triggering the remove control 403.
2. The favorite text is modified.
Displaying a custom word stock management window, wherein the custom word stock management window displays a fifth collection text in the custom word stock and a modification control corresponding to the fifth collection text; in response to a text modification operation triggering the modification control, modifying text content of the fifth favorite text according to the text modification operation.
As shown in FIG. 8, in the custom thesaurus management window, the user may also modify the favorites text, e.g., the user may modify the favorites text within the modification area 404.
3. Trigger abbreviations for the favorite text are set.
Displaying a custom thesaurus management window, wherein the custom thesaurus management window displays a sixth collection text in the custom thesaurus and a trigger abbreviation editing control corresponding to the sixth collection text; and in response to a trigger abbreviation editing operation on the trigger abbreviation editing control, setting the trigger abbreviation of the sixth collection text to the first trigger abbreviation input by the trigger abbreviation editing operation.
For example, the user may open a trigger abbreviation settings window for the sixth favorite text by double clicking on the sixth favorite text in the custom thesaurus management window. As shown in fig. 9, in the trigger abbreviation setting window 405, there is a region 406 in which the trigger abbreviation is set, and the user can input the abbreviation in the region 406 to set the trigger abbreviation of the sixth favorite text, for example, input VR in the region 406 to set the trigger abbreviation of the sixth favorite text to VR.
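The management functions described above (removing, modifying, setting a trigger abbreviation and setting a group) can be sketched as operations on a small in-memory structure. The class `CustomWordStock` and its method names are hypothetical and are not taken from the application; persistence and the user interface are omitted.

```python
class CustomWordStock:
    """Minimal in-memory model of a custom word stock."""

    def __init__(self):
        # collection text -> {"trigger": str, "group": str}
        self.entries = {}

    def add(self, text, group="default"):
        self.entries.setdefault(text, {"trigger": "", "group": group})

    def remove(self, text):
        """Remove a collection text; removing an unknown text is a no-op."""
        self.entries.pop(text, None)

    def modify(self, old_text, new_text):
        """Change the text content while keeping its trigger and group."""
        if old_text not in self.entries:
            raise KeyError(old_text)
        self.entries[new_text] = self.entries.pop(old_text)

    def set_trigger(self, text, abbreviation):
        self.entries[text]["trigger"] = abbreviation

    def set_group(self, text, group):
        self.entries[text]["group"] = group

    def groups(self):
        """Return a mapping from group name to the collection texts in it."""
        result = {}
        for text, meta in self.entries.items():
            result.setdefault(meta["group"], []).append(text)
        return result
```

For instance, calling `set_trigger("Virtual Reality (VR)", "VR")` corresponds to the user typing VR into region 406 of the trigger abbreviation setting window.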
In summary, according to the method provided by the embodiment of the application, the terminal can automatically classify collection text according to the field to which the on-screen text currently being edited by the user belongs, which makes it easier for the user to manage the custom word stock and easier to recommend collection text from it, thereby improving input efficiency.
The method provided by the embodiment of the application can also sort the matched collection texts according to how closely each one matches the domain of the on-screen text currently being edited. Collection texts in the same domain are arranged in front and those in different domains behind, which improves the efficiency with which the user selects candidate words and, in turn, the efficiency of inputting collection text.
According to the method provided by the embodiment of the application, a custom word stock management window is provided in which the user can remove, modify and configure collection text, so that the user can manage the custom word stock conveniently and quickly. In the custom word stock management window, the user can create groups to classify the collection texts, or set trigger abbreviations for the collection texts so that they can be input quickly, which improves the efficiency of input using the custom word stock and makes the custom word stock fit the user's habits more closely.
Illustratively, an input method program runs on the terminal, and the custom word stock is a word stock of the input method program.
The input method program can display the candidate words matched in the custom word stock in a separate candidate word window, or can display them mixed into the original candidate word window.
Fig. 10 is a flowchart illustrating a word stock construction method according to an exemplary embodiment of the present application. This embodiment is illustrated by the execution of the method by the terminal shown in fig. 1. Based on the embodiment shown in fig. 2, step 240 includes step 242 and step 260 includes step 261.
Step 220: display a text editing area, where on-screen text is displayed in the text editing area.
Step 242: in response to an operation of selecting and collecting a first text in the on-screen text, add the first text to a custom word stock of the input method program, where the custom word stock includes collection text of at least one of the following types: nouns, phrases and sentences.
Step 261: in response to a first character string being input in the character edit box of the input method program, display a candidate word window of the input method program, where the candidate word window displays at least one collection text in the custom word stock that matches the first character string.
In an alternative embodiment, the recommendation window includes a first candidate word window of the input method program; and in response to the first character string being input in the character editing box of the input method program, displaying a first candidate word window of the input method program, wherein the first candidate word window displays candidate words matched with the first character string and collection texts, the candidate words are obtained according to text input rules, and the collection texts are obtained by matching in a custom word stock.
In another alternative embodiment, the recommendation window includes a third candidate word window of the input method program; and in response to the first character string being input in the character editing box of the input method program, displaying a second candidate word window and a third candidate word window of the input method program, wherein the second candidate word window displays candidate words matched with the first character string, and the third candidate word window displays collection text matched with the first character string in a custom word stock, and the candidate words are obtained according to a text input rule.
In one embodiment, the collection text matched by the custom thesaurus and the candidate word matched by the original matching rule are displayed in the same candidate word window.
In response to the first character string being input in the character edit box of the input method program, a first candidate word window of the input method program is displayed, where the first candidate word window displays candidate words and collection texts that match the first character string, the candidate words being obtained according to a text input rule and the collection texts being matched in the custom word stock.
The text input rule refers to the original matching rule of the input method program, for example, obtaining characters, words and phrases whose pinyin matches the first character string according to a pinyin matching method, or obtaining automatically completed words and phrases from the common word stock according to the first character string by means of the automatic completion function. That is, the candidate words obtained according to the text input rule are those matched by the input method program using any method other than matching in the custom word stock.
In the first candidate word window, the candidate words obtained by the two matching modes are ordered according to a preset rule. The preset rule may be to display the matched collection text first, followed by the candidate words matched according to the text input rule. The preset rule may also be to rank by a weight coefficient, which may be calculated from multi-dimensional information such as the degree of matching, the frequency of use, and whether the candidate is a collection text.
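The two preset rules for the shared candidate word window can be sketched as follows. The `Candidate` fields and the weight values are hypothetical; the application only states that the weight coefficient may depend on information such as the degree of matching, the usage, and whether the candidate is a collection text.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    match_degree: float   # assumed to lie in [0, 1]
    usage_count: int      # how often the user has committed this candidate
    is_collection: bool   # True if it came from the custom word stock

def weight(c, w_match=1.0, w_usage=0.1, w_collection=0.5):
    """Hypothetical weight coefficient combining the three kinds of information.
    The usage term is capped so that heavy use cannot dominate the score."""
    bonus = w_collection if c.is_collection else 0.0
    return w_match * c.match_degree + w_usage * min(c.usage_count, 10) + bonus

def merge_first_window(rule_candidates, collection_candidates, prefer_collection=True):
    """Build the ordered contents of the first candidate word window."""
    if prefer_collection:
        # Rule A: matched collection texts first, then the ordinary candidates.
        ordered = collection_candidates + rule_candidates
    else:
        # Rule B: a single ranking by weight coefficient, highest first.
        ordered = sorted(collection_candidates + rule_candidates, key=weight, reverse=True)
    return [c.text for c in ordered]
```

With rule B a frequently used ordinary candidate can outrank a collection text, whereas rule A always shows the collection texts first.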
In another embodiment, the candidate words obtained by matching the custom word stock and the candidate words matched by the original matching rules are displayed in different candidate word windows.
In response to the first character string being input in the character edit box of the input method program, a second candidate word window and a third candidate word window of the input method program are displayed, where the second candidate word window displays candidate words that match the first character string and the third candidate word window displays collection texts that match the first character string, the candidate words being obtained according to a text input rule and the collection texts being matched in the custom word stock.
The candidate words obtained according to the text input rule and the candidate words matched in the custom word stock are displayed in two separate candidate word windows, so the ordering rules of the two windows do not interfere with each other. The candidate words matched in the custom word stock are not the user's everyday words; rather, they are uncommon words that some users nevertheless use repeatedly. If such words were frequently displayed in the original candidate word window, the user's everyday input efficiency could suffer. By displaying the collection text matched in the custom word stock in a separate candidate word window, the method preserves the user's everyday input efficiency while still allowing uncommon words to be input efficiently. Optionally, the third candidate word window may also be referred to as a collection text recommendation window.
Since there are now two candidate word windows, the newly added third candidate word window may have its own on-screen shortcut keys. For example, the on-screen shortcut keys of the second candidate word window are the space key and the number keys, and the on-screen shortcut keys of the third candidate word window may be "shift + space key" and "shift + number key".
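The shortcut scheme for the two windows can be sketched as a small dispatch function. The key names and the convention that the space key selects the first candidate are assumptions made for illustration.

```python
def commit_candidate(key, shift, second_window, third_window):
    """Return the candidate to put on screen, or None if the key selects nothing.

    key           -- "space" or a digit string "1".."9" (hypothetical key names)
    shift         -- True when the shift key is held
    second_window -- ordinary candidates obtained by the text input rule
    third_window  -- collection texts matched in the custom word stock
    """
    window = third_window if shift else second_window
    if key == "space":
        index = 0                 # assumption: space selects the first candidate
    elif key.isdigit() and key != "0":
        index = int(key) - 1      # number keys are 1-based
    else:
        return None
    return window[index] if index < len(window) else None
```

Holding shift changes only which window is addressed, so the user's existing habits with the space key and number keys carry over to the collection text window.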
The collection texts displayed in the candidate word windows (the first candidate word window and/or the third candidate word window) are sorted according to a recommendation rule; the recommendation rule includes at least one of the following:
(1) A first collection text whose trigger abbreviation is the same as the first character string has a first priority, the trigger abbreviation comprising trigger characters set by the user for the first collection text;
(2) A second collection text that matches the first character string and is present in a first domain word stock has a second priority; the input method program is provided with at least one domain word stock, and the first domain word stock is the domain word stock, among the at least one domain word stock, that is semantically closest to the on-screen text;
(3) A third collection text that matches the first character string and is not present in the first domain word stock has a third priority;
wherein the first priority is higher than the second priority, and the second priority is higher than the third priority; the first, second and third collection texts are collection texts in the custom word stock.
If the trigger abbreviation of the first collection text is identical to the first character string, the first collection text is arranged at the forefront. For example, the first collection text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the trigger abbreviation set by the user for the first collection text is "PUSCH". Then, when the first character string input by the user is "PUSCH", the first collection text "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)" is ranked first in the candidate word window.
In addition, the matched collection texts can be ranked according to the degree of domain matching between each collection text and the current on-screen text.
The degree of domain matching is determined as follows:
First, domain word stocks for a plurality of domains are provided in the input method program. Each domain word stock contains a plurality of texts and has a feature vector, which is obtained by extracting features from each of the texts in the domain word stock with a BERT feature extraction model and averaging them.
Then, the current on-screen text is input into the BERT feature extraction model for feature extraction to obtain the on-screen text feature.
The feature distance between the on-screen text feature and the feature vector of each domain word stock is calculated, and the domain word stock with the smallest feature distance (for example, the first domain word stock) is selected.
Then, when a plurality of collection texts are matched according to the first character string, the second collection text (comprising at least one text) that belongs to the first domain word stock is selected from the plurality of collection texts and arranged in front, and the third collection text (comprising at least one text) that does not belong to the first domain word stock is arranged behind.
In summary, in the method provided by this embodiment of the application, the input method program displays the candidate words matched according to the conventional rules and the candidate words matched from the custom word stock in two separate candidate word windows. The user's ordinary input efficiency is therefore unaffected, while the efficiency of inputting rarely used words is improved. The rankings of the two candidate word windows also do not affect each other: even if the user repeatedly selects a rarely used word from the custom word stock while editing a document, the ranking of candidate words in the conventional candidate word window is unchanged, which prevents frequent input of the same word within a short time from disturbing the candidate ranking.
Illustratively, a document editing program runs on the terminal, and the custom word stock is a word stock of the document editing program.
For example, each online document may correspond to at least one custom thesaurus.
For example, different user accounts may have different thesaurus editing rights for different custom thesauruses.
Fig. 11 is a flowchart illustrating a word stock construction method according to an exemplary embodiment of the present application. This embodiment is illustrated by the execution of the method by the terminal shown in fig. 1. Based on the embodiment shown in fig. 2, step 240 includes step 243 and step 260 includes step 262.
Step 220: displaying a text editing area, in which on-screen text is displayed.
Step 243: in response to an operation of selecting and collecting a first text in the on-screen text, adding the first text to a custom thesaurus of the document editing program, the custom thesaurus including at least one of the following kinds of collection text: nouns, phrases, and sentences.
Illustratively, the document editing program stores a custom thesaurus. The custom word stock can be stored in association with the user account or in association with the document currently being edited.
When the user-defined word stock is stored in association with the user account, the user account can use the user-defined word stock to recommend the collection text when any document is opened, or the user account can use the text in the document as the collection text to be added into the user-defined word stock when any document is edited.
When the custom word stock is stored in association with a document (including an online document/collaborative document), any user account that opens the document can receive collection text recommendations from the custom word stock corresponding to the document, and any user account that edits the document can add text in the document to the custom word stock as collection text.
Further, when the custom thesaurus is stored in association with a document (including an online document/collaborative document), different thesaurus editing authorities can be set for different users. Only a user account that has the thesaurus editing authority (or thesaurus use authority) can receive collection text recommendations from the custom word stock corresponding to the document; only a user account that has the thesaurus editing authority (or thesaurus management authority) can add text in the document to the custom thesaurus as collection text.
Alternatively, when one document corresponds to a plurality of custom word stocks, the authority of each user account over each custom word stock can be set separately. For example, the first document has ten custom word stocks corresponding to ten departments. The employee accounts of each department have the thesaurus editing authority only for the custom word stock of their own department and not for the custom word stocks of other departments, so that the people in one department maintain and use the word stock of their own professional field. For example, employees of the finance department may maintain and use the finance-domain thesaurus (a first custom thesaurus) of the first document, and employees of the algorithm department may maintain and use the algorithm-domain thesaurus (a second custom thesaurus) of the first document.
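The per-thesaurus authority described above can be sketched as follows. This is a minimal illustration only: the class name `CustomThesaurus`, the two account sets, the account names, and prefix matching are all assumptions made for the example and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class CustomThesaurus:
    """One custom word stock attached to a document (hypothetical structure)."""
    name: str
    entries: set = field(default_factory=set)
    editors: set = field(default_factory=set)  # accounts with editing authority
    users: set = field(default_factory=set)    # accounts with use authority

    def add_entry(self, account: str, text: str) -> bool:
        """Add a collection text only if the account has editing authority."""
        if account not in self.editors:
            return False
        self.entries.add(text)
        return True

    def recommend(self, account: str, prefix: str) -> list:
        """Return matching collection texts only if the account may use the stock."""
        if account not in self.users and account not in self.editors:
            return []
        return sorted(t for t in self.entries if t.startswith(prefix))


# Two department word stocks of the same document (hypothetical accounts).
finance = CustomThesaurus("finance", editors={"alice"}, users={"alice", "bob"})
algorithm = CustomThesaurus("algorithm", editors={"carol"}, users={"carol"})

added_by_alice = finance.add_entry("alice", "accrued liabilities")
added_by_carol = finance.add_entry("carol", "gradient descent")  # no authority
seen_by_bob = finance.recommend("bob", "acc")
seen_by_carol = finance.recommend("carol", "acc")
```

Keeping the editing and use authorities as separate sets lets an account read recommendations from a department word stock without being able to change it.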
Step 262: in response to a first character string being input in the text editing area of the document editing program, displaying a collection text recommendation window of the input method program, the collection text recommendation window displaying at least one collection text in the custom thesaurus that matches the first character string.
In an alternative embodiment, in response to entering the first string in a text editing area of the document editing program, a favorite text recommendation window of the document editing program is displayed, the favorite text recommendation window displaying favorite text in the custom thesaurus that matches the first string.
The document editing program can match the collection text in the custom word stock according to the character string input by the user in the text editing area, and display a recommendation window after the matching is successful, and display the matched collection text.
Optionally, the first character string may be the n characters most recently input by the user in the text editing area, where n is a positive integer. Illustratively, whenever newly input text appears in the text editing area, the document editing program takes the most recently input 1 character and matches it against the collection texts to obtain a first result; takes the most recently input 2 characters and matches them to obtain a second result; takes the most recently input 3 characters to obtain a third result; and so on, until it takes the most recently input n characters and obtains an nth result. The first through nth results are displayed in the recommendation window.
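The repeated matching on the last 1, 2, ..., n characters can be sketched as below. The text does not state how a character string is matched against a collection text, so prefix matching is assumed here; the function name and the sample data are hypothetical.

```python
def match_recent_suffixes(recent_text, collection, n):
    """Match the last 1..n typed characters against the collection texts.

    Returns a list of (key, matches) pairs, one pair per suffix length.
    Prefix matching is an assumption made for this sketch.
    """
    results = []
    for k in range(1, min(n, len(recent_text)) + 1):
        key = recent_text[-k:]  # the k most recently input characters
        matches = [text for text in collection if text.startswith(key)]
        results.append((key, matches))
    return results


collection = ["image recognition module", "module interface"]
results = match_recent_suffixes("the im", collection, 3)
for key, matches in results:
    print(repr(key), matches)
```

In this example the 1-character key "m" matches "module interface", the 2-character key "im" matches "image recognition module", and the 3-character key " im" matches nothing.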
In an alternative embodiment, the document editing program is an online document editing program, and the text editing area displays a first online document that is currently opened by the online document editing program; the online document editing program logs in with a first user account; responding to the operation of selecting and collecting a first text in the on-screen text of the first online document, and adding the first text into a custom thesaurus of the first online document under the condition that the first user account has thesaurus editing authority of the first online document; responding to the input of a first character string in a text editing area of a first online document, displaying a collection text recommendation window of a document editing program, wherein the collection text recommendation window displays collection texts matched with the first character string in a custom word stock corresponding to the first online document; the custom word stock is used for storing: a user account with word stock editing rights for a first online document, text collected during editing of the first online document.
In an alternative embodiment, the first online document corresponds to at least one custom word stock, and the word stock editing authority comprises a word stock editing authority corresponding to each custom word stock in the at least one custom word stock; the word stock editing authorities comprise first word stock editing authorities corresponding to the first custom word stock; in the case that the first user account has the first thesaurus editing authority, the first user account is allowed to edit and/or use the first custom thesaurus; in the case that the first user account does not have the first thesaurus editing authority, the first user account is not allowed to edit and/or use the first custom thesaurus.
The collection texts displayed in the recommendation window are ranked according to a recommendation rule; the recommendation rule includes at least one of the following:
(1) A first collection text whose trigger abbreviation is identical to the first character string has a first priority, the trigger abbreviation comprising trigger characters set by the user for the first collection text;
(2) A second collection text that matches the first character string and exists in the first domain word stock has a second priority; the input method program is provided with at least one domain word stock, and the first domain word stock is the domain word stock, among the at least one domain word stock, that is semantically closest to the on-screen text;
(3) A third collection text that matches the first character string and does not exist in the first domain word stock has a third priority;
wherein the first priority is higher than the second priority, and the second priority is higher than the third priority; the first, second and third collection texts are collection texts in the custom word stock.
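A minimal sketch of the three-level ranking follows. It assumes the custom word stock is held as a mapping from collection text to an optional trigger abbreviation and that ordinary matching is prefix matching; both are illustrative assumptions, as are the function name and the sample entries.

```python
def rank_collection_texts(first_string, collection, first_domain_terms):
    """Rank matched collection texts by the three priorities.

    collection: dict mapping collection text -> trigger abbreviation or None.
    first_domain_terms: set of texts in the semantically closest domain word stock.
    """
    def priority(text):
        if collection[text] is not None and collection[text] == first_string:
            return 1  # trigger abbreviation identical to the input
        if text in first_domain_terms:
            return 2  # matched and present in the first domain word stock
        return 3      # matched but absent from the first domain word stock

    matched = [
        text for text, abbr in collection.items()
        if abbr == first_string or text.startswith(first_string)
    ]
    # sorted() is stable, so texts with equal priority keep their stored order.
    return sorted(matched, key=priority)


collection = {
    "impairment loss": None,
    "image recognition module": None,
    "inverse mapping": "im",  # hypothetical user-set trigger abbreviation
}
ranked = rank_collection_texts("im", collection, {"image recognition module"})
print(ranked)
```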
If the trigger abbreviation of the first collection text is identical to the first character string, the first collection text is ranked first. For example, the first collection text is "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)", and the trigger abbreviation set by the user for the first collection text is "PUSCH". Then, when the first character string input by the user is "PUSCH", the first collection text "physical uplink shared channel (Physical Uplink Shared Channel, PUSCH)" is ranked first in the candidate word window.
In addition, the matching collection text can be ranked according to the field matching degree of the collection text and the current on-screen text.
The field matching degree is determined by the following method:
First, the input method program is provided with domain word stocks for a plurality of domains. Each domain word stock contains a plurality of texts and has a feature vector, which is obtained by extracting features from each of the texts in the domain word stock with a BERT feature extraction model and averaging the results.
Then, the current on-screen text is input into the BERT feature extraction model for feature extraction to obtain the on-screen text feature.
The feature distance between the on-screen text feature and the feature vector of each domain word stock is calculated, and the domain word stock with the minimum feature distance (for example, the first domain word stock) is selected.
Then, when a plurality of collection texts are matched according to the first character string, the second collection text (comprising at least one text) that belongs to the first domain word stock is selected from the plurality of collection texts and ranked in front, and the third collection text (comprising at least one text) that does not belong to the first domain word stock is ranked behind.
In summary, according to the method provided by the embodiment of the application, the document editing program can store at least one custom word stock for each document, and when a user edits the document, the custom word stock can be used and managed to improve the input efficiency when the user edits the document.
In the method provided by this embodiment of the application, the document editing program can set different thesaurus editing authorities for different users and thereby control how each user manages and uses the custom word stock. When several people edit a document collaboratively, different users can be limited to managing different custom word stocks, which keeps the collection texts in each custom word stock professional and prevents the custom word stocks from being edited at will by users of other types.
In the following, the method provided by the embodiment of the application is applied to an online document and is described from the product side and the technical side respectively.
1. Product side
The description covers the collection function for proper nouns (collection texts), the management function for collected proper nouns, quick association triggered by custom abbreviations, the proper noun recommendation function for the currently edited document, and so on.
1. Collecting function of proper nouns
The proper noun collection function may be used when the user has entered a proper noun and expects that it will need to be intelligently recommended and filled in later.
As shown in FIG. 4, the user may select the corresponding proper noun with the mouse so that it is highlighted, which brings up the quick-collect button (first collection control 305). Clicking the button completes the collection of the proper noun in the selected area.
2. Management function of proper nouns
After a proper noun is collected, it is added to the default group.
The user can create custom groups to classify proper nouns by label.
As shown in FIG. 8, the user may modify and refine the content of the proper nouns, such as refining the "physical control channel" to "physical uplink control channel" and adding English definitions.
3. Custom abbreviation association function for proper nouns
When collecting a proper noun, a trigger abbreviation for triggering the association of the proper noun recommendation may be set.
For example, for the proper noun "virtual reality", the letters "VR" may be set as the abbreviation that triggers the association. When the user later inputs "VR", "virtual reality" is recommended directly as a candidate word so that the proper noun can be filled in quickly.
4. Intelligent ordering recommendation of collected proper nouns and quick filling recommendation function
When the user newly inputs characters in the online document, entries matching the newly input characters are identified in real time. If the leading characters match a collected proper noun, a quick-fill floating window appears, and the user can fill in the proper noun quickly by pressing the corresponding shortcut key or clicking with the mouse.
For example, suppose "image recognition module" has been collected as a proper noun. As shown in (1) of fig. 12, when the user inputs "image", the input method program matches the existing proper nouns against "image"; as shown in (2) of fig. 12, all candidate words are then presented in the quick-fill floating window 501, where recommendations of the proper nouns "image filtering module" and "image filtering system" appear. By pressing the shortcut key "shift+1", the user can quickly input the first-ranked candidate word, "image filtering module", into the text.
Candidate word ordering rules: the higher the correlation calculated by the algorithm, the earlier the candidate ranking.
High-priority recommendation: among the collected proper nouns, those that also appear in the proper noun table of the domain category to which the currently edited online document belongs;
Secondary-priority recommendation: among the collected proper nouns, those that do not appear in the proper noun table of the domain category to which the currently edited online document belongs.
2. Technical side
On the one hand, the capability of collecting proper nouns and quickly recommending and completing them is provided on the basis of online documents;
on the other hand, artificial intelligence technology is used to solve the problem of ordering recommended proper nouns by priority when too many proper nouns have been collected, making intelligent completion recommendations more convenient and accurate.
1. Schematic functional diagram
As shown in FIG. 13, after the user logs in with account 601, online documents can be edited using Tencent Docs.
When text 602 is entered, text that meets the proper noun triggering conditions triggers the proper noun recommendation capability, with which the user can quickly complete the content of a proper noun.
Quick fill 603 can be performed in two ways: by pressing a shortcut key, or by clicking the recommendation panel with the mouse.
The user can add text in the online document to the proper noun list (custom word stock) through the proper noun collection function. The user also has a dedicated management panel for managing proper nouns 604: modifying the group; removing a collected proper noun; modifying the content of a collected proper noun; and modifying the trigger association abbreviation of a proper noun.
As shown in FIG. 14, after the user opens the online document, text can be edited in it. Step 701 may then be performed: the user selects a proper noun in the document body, and the collection control appears. Step 702: the user uses the collection control to collect the proper noun. In the proper noun management panel, the user can set the trigger association abbreviation (trigger abbreviation) of the proper noun, set proper noun groups, and modify proper noun content. Through these operations, the user's collected proper noun list (custom word stock) is finally obtained.
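The management operations listed for the proper noun panel can be pictured with a small data structure. The sketch below is hypothetical: the class `ProperNounList`, its method names, the group name "wireless", and the abbreviation used in the usage lines are illustrative assumptions, not part of the application.

```python
class ProperNounList:
    """Hypothetical in-memory model of a user's collected proper noun list."""

    def __init__(self):
        # content -> {"group": group name, "abbr": trigger abbreviation or None}
        self._items = {}

    def collect(self, content, abbr=None):
        """Collect a proper noun; new entries go to the default group."""
        self._items[content] = {"group": "default", "abbr": abbr}

    def move_to_group(self, content, group):
        """Modify the group of a collected proper noun."""
        self._items[content]["group"] = group

    def rename(self, old_content, new_content):
        """Modify the content of a collected proper noun, keeping its settings."""
        self._items[new_content] = self._items.pop(old_content)

    def set_abbreviation(self, content, abbr):
        """Modify the trigger association abbreviation of a proper noun."""
        self._items[content]["abbr"] = abbr

    def remove(self, content):
        """Remove a collected proper noun."""
        del self._items[content]

    def snapshot(self):
        """Return a shallow copy of the current list for display."""
        return dict(self._items)


nouns = ProperNounList()
nouns.collect("physical control channel")
nouns.rename("physical control channel", "physical uplink control channel")
nouns.move_to_group("physical uplink control channel", "wireless")
nouns.set_abbreviation("physical uplink control channel", "PUCCH")
nouns.collect("temporary term")
nouns.remove("temporary term")
```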
As shown in fig. 15, after the user opens the online document, the input method program captures the document content, processes and analyzes it in the document classification module 703 to obtain the classification of the document, and provides the document classification to the proper noun quick-fill module 704. The proper noun quick-fill module 704 obtains a weighted proper noun recommendation table for the current document based on the document classification and the user's historically collected proper noun library (custom word stock). When the user inputs characters, the intelligent recommendation and completion function is triggered; it matches and completes candidate words according to the weighted proper noun recommendation table. The user then fills in the text quickly by mouse click or shortcut key.
2. Functional timing diagram
A characteristic of online documents is that the document content is stored in the cloud. The content data of the document can therefore be preprocessed in the background to determine the professional field to which the document belongs, and sorting weights can then be added to the collected proper nouns.
When the user starts editing the online document, newly input words are detected. If a proper noun recommendation rule is triggered, the proper noun recommendation function is invoked, and the user can quickly select and fill in content by shortcut key or mouse click, which improves input efficiency.
As shown in FIG. 16, in an online document 705, a user inputs document content. The document classification module 706 of the input method program intercepts the document content and classifies the document. The proper noun quick-fill module 706 of the input method program provides the collected proper nouns to the document classification module, which generates recommendation weights (weighted proper noun recommendation tables) for the document categories based on the document classifications and the collected proper nouns. When a user inputs a new word in the online document, the input method program can display a recommended proper noun set according to the recommendation weight, the user selects proper nouns through shortcut keys/clicking, and the input method program fills the selected proper nouns in the online document.
3. Document classification module
Using artificial intelligence technology, proper noun recommendations are ordered more reasonably on the basis of document content, so that the matching order of proper nouns is oriented to the domain category of the individual document rather than relying on the user's global input habits.
The domain-specific proper noun generation algorithm module is divided into two core stages: semantic dictionary database construction and semantic dictionary dynamic extraction.
3.1 semantic dictionary database construction
In order to complete the task of automatically matching proper nouns based on the user input vocabulary, a more powerful relational dictionary is needed.
A. Preprocessing stage-labeling category labels based on existing document data
First, the data material that the algorithm needs to learn from must be constructed. Because each document is written around a particular topic, documents can be classified by topic into a series of sets and labeled with different category labels.
Taking the digital image field as an example, documents on topics such as image classification and object detection can be classified into the category set $C$ of the digital image field.
Next, several documents may be randomly extracted from each document set, and each document is labeled $A_i$, where $A_i$ denotes the $i$-th document; that is, $A_i$, s.t. $\mathrm{Classification}(A_i) \in C$.
B. Generating proper noun tables
Taking set $C$ as an example, a plurality of documents $A_i$ are taken from set $C$, and jieba word segmentation is applied to each document to split it into a plurality of words. All common words are then filtered out using a common word list, and the remaining words form the proper noun table (domain word stock) of the field.
For example, take the sentence "in order to improve the robustness of the image recognition system, a cost-effective CMOS camera is used as the hardware platform to program the image recognition module of the intelligent detection trolley". After jieba word segmentation, the whole sentence is decomposed into phrases, and the phrases obtained are shown in Table 1.
TABLE 1 (phrases obtained by jieba word segmentation; the table content is not recoverable in this text)
Then, preposition filtering and common word filtering are carried out, and proper noun tables shown in the table 2 are obtained.
TABLE 2
Proper noun list
Image recognition system
Intelligent detection trolley
Image recognition module
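The filtering step that turns segmented words into a proper noun table can be sketched as follows. The sketch takes already segmented documents as input (for instance, lists produced by a segmenter such as `jieba.lcut`); the token list and the common word list below are hypothetical stand-ins, since the content of Table 1 is not available.

```python
def build_domain_term_table(segmented_docs, common_words):
    """Build a proper noun table from segmented documents.

    segmented_docs: list of token lists, one list per document.
    common_words: set of common words and prepositions to filter out.
    Returns the remaining tokens in first-seen order, without duplicates.
    """
    terms = []
    seen = set()
    for tokens in segmented_docs:
        for token in tokens:
            if token in common_words or token in seen:
                continue
            seen.add(token)
            terms.append(token)
    return terms


# Hypothetical segmentation of the example sentence (abridged).
tokens = [
    "in order to", "improve", "image recognition system", "robustness",
    "intelligent detection trolley", "image recognition module", "program",
]
common_words = {"in order to", "improve", "robustness", "program"}

term_table = build_domain_term_table([tokens], common_words)
print(term_table)
```

With these assumed inputs the result contains the same three entries as Table 2.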
C. Feature encoding the document and obtaining the average feature encoding of each set
When the extraction of the proper noun table is completed, feature extraction is performed on the documents. BERT encoding is applied, and after processing the feature encoding of each document $A_i$ is obtained. Then, the feature encodings of all documents in each set $C$ are averaged to obtain the feature encoding of that document set label category in the feature space.
D. Building a dictionary for each document collection category
In the preceding steps, the feature encoding of each document set label category $C_i$ has been obtained.
The proper noun table of each document under each document set label category $C_i$ has also been obtained.
Next, all the proper noun tables under each label category are merged to build the dictionary of that document set label category, summarized as the proper noun table of the category.
The feature encodings are averaged to obtain the feature encoding of the category. After this processing, both the proper noun table of the documents and the feature encoding of the category label to which the documents belong are available.
In summary, fig. 17 shows how the proper noun table and the feature vector of a document set label category are generated. Taking the document set of category 1 as an example, each document in the set is segmented to obtain a word segmentation table, and common words are screened out of the word segmentation table to obtain the proper noun table of category 1. In addition, the features of each document in the document set of category 1 are extracted with a BERT feature extractor, and the extracted features are averaged to obtain the feature vector of category 1.
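The averaging step can be illustrated with plain lists standing in for BERT feature vectors. The function name and the two-dimensional sample vectors are assumptions for the example; real BERT features would have many more dimensions.

```python
def mean_feature(doc_features):
    """Average per-document feature vectors into one category feature vector.

    doc_features: non-empty list of equal-length lists of floats, one per document.
    """
    if not doc_features:
        raise ValueError("at least one document feature is required")
    dim = len(doc_features[0])
    count = len(doc_features)
    return [sum(vec[d] for vec in doc_features) / count for d in range(dim)]


# Two hypothetical document features for category 1.
category_1_feature = mean_feature([[1.0, 2.0], [3.0, 6.0]])
print(category_1_feature)
```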
3.2 dynamic extraction of semantic dictionaries
By predicting the category labels of the newly added editing content vocabulary, proper nouns are matched more accurately.
When a document has a large amount of content, re-extracting the spatial features of the whole document after every content update is too costly in time complexity and difficult to implement. In normal editing, the main content of the edited document does not change drastically; for example, the document will not jump from the "image processing" domain to the "principles of economics" domain during editing. However, the hidden variables of the document do change as the document structure advances, and the gate structure of an LSTM (Long Short-Term Memory) network fits this problem. An LSTM network is therefore used to predict the features of the document: by forgetting early features of the document, it better fuses the features of recently edited words. During training, the content of the document is continuously input into the LSTM network, and when a proper noun is encountered, the LSTM network is trained using the feature of the category to which the proper noun belongs as the target feature constraint. When the network is used, the feature information generated by BERT is continuously input into the LSTM network, and the hidden layer output by the LSTM serves as the current hidden feature of the document.
After each edit, the edited sentences are fed into BERT for processing without regard to their order, and the resulting features are fused to serve as the feature information $F$ of the document being edited.
The proper nouns that the user has actively collected are defined as the proper noun table $W_a$.
After the feature $F$ of the document is obtained, the cosine distance is used to measure the distance between the document and each initially preprocessed document set label category $C_i$, i.e. $|F \cdot C_i|$. By calculating the cosine distance, the document set $C_b$ nearest to the current document is found, and the proper noun table $W_b$ of that nearest document set label category $C_b$ is read. As shown in fig. 18, the intersection of the two tables is taken as the priority recommendation table $W_r = \cap(W_a, W_b)$. The parts of $W_a$ and $W_b$ outside the intersection serve as the lower-priority candidate recommendation table.
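A minimal sketch of this selection step is given below, with short lists standing in for BERT/LSTM features. The function names and sample data are hypothetical. The lower-priority set is restricted here to collected proper nouns outside the intersection, which is one reading of the text and is an assumption of the sketch.

```python
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v); assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def split_recommendations(doc_feature, class_features, class_terms, collected):
    """Find the nearest category C_b and split the collected nouns W_a.

    Returns (nearest category, W_r = W_a & W_b, lower-priority nouns).
    """
    nearest = min(
        class_features,
        key=lambda name: cosine_distance(doc_feature, class_features[name]),
    )
    w_b = class_terms[nearest]
    w_r = collected & w_b      # priority recommendation table
    lower = collected - w_r    # collected nouns outside the intersection
    return nearest, w_r, lower


class_features = {"digital image": [0.9, 0.2], "economics": [0.1, 1.0]}
class_terms = {
    "digital image": {"image recognition module", "image recognition system"},
    "economics": {"marginal cost"},
}
collected = {"image recognition module", "marginal cost"}

nearest, w_r, lower = split_recommendations(
    [1.0, 0.1], class_features, class_terms, collected
)
```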
As shown in fig. 19, according to the content input by the user, step 801 is executed to determine whether the content matches a trigger abbreviation. If it does, the proper noun matching the abbreviation is recommended with the highest priority. If no trigger abbreviation is matched, the collected proper noun table is read; BERT feature extraction is performed on the sentences recently input in the current document, and the extracted features are input into the LSTM network for feature fusion to obtain the document feature of the current document. According to the document feature, the closest proper noun table is selected from the proper noun tables of the series of domain set categories. The intersection of the closest proper noun table and the collected proper noun table is used as the high-priority recommended noun table, and the remaining collected proper nouns are used as the secondary-priority recommended noun table.
Many people who work with text, such as researchers, publishers, patent engineers and composers, often use online documents for editing and writing, and often use proper nouns specific to a professional field when that field is involved. In particular, scholars working across several subdivided disciplines use proper nouns frequently in documents on a particular domain direction.
For example, when writing patents, long technical terms often appear in the patent documents. In patent documents in the field of image recognition, the term "image recognition module" may appear hundreds of times. With a conventional input method, the three words "image", "recognition" and "module" are input separately, so a user who relies only on the conventional functions of the input method has to spell and input them hundreds of times.
The method provided by the embodiment of the application provides the capability of actively collecting the proper nouns for the user, and the collected proper nouns can be rapidly recommended and complemented into the document, so that the text input efficiency is improved. When the user edits the online document, the collected proper nouns can be intelligently recommended by combining the prefix input by the user or triggering abbreviations capable of triggering association, and the user can quickly supplement the collected proper nouns to the document text through simple interaction. The recommended proper nouns can semantically adjust the recommendation priority based on the current document content, and even if a large number of proper nouns are collected, the recommendation can be accurate, so that the purpose of improving the document input efficiency is achieved.
To realize this function, a collection control is provided in the input method program, and the user can collect nouns into the collection (custom word stock) through operations such as dragging, or selecting the text and then right-clicking to collect it.
In the method provided by the embodiment of the application, the current document is classified by category through an algorithm, which facilitates subsequent targeted recommendation. Specifically, when the user edits the input text and a prefix keyword is matched, an intelligent recommendation floating window (a third candidate word window) appears. The user can click the recommended content in the floating window to fill it into the text, or press a shortcut key such as "shift+1" to fill in the content quickly.
The method provided by the embodiment of the application greatly improves the working efficiency of people whose daily text processing work involves different professional fields and requires inputting many proper nouns.
Fig. 20 is a block diagram showing a construction of a word stock construction apparatus according to an exemplary embodiment of the present application. The device is used for realizing the terminal, and the device comprises:
the display module 901 is configured to display a text editing area, where an on-screen text is displayed in the text editing area;
The interaction module 903 is configured to receive an operation of selecting and collecting a first text in the on-screen text;
a thesaurus module 902, configured to, in response to an operation of selecting and collecting a first text in the on-screen text, add the first text to a custom thesaurus, where the custom thesaurus includes at least one collected text of a noun, a phrase, and a sentence;
the interaction module 903 is configured to receive an operation of inputting a first character string;
the display module 901 is configured to display a recommendation window in response to inputting a first character string, where the recommendation window displays at least one collection text matching the first character string in the custom thesaurus.
In an optional embodiment, the display module 901 is configured to display a first collection control corresponding to the first text in response to an operation of selecting the first text in the on-screen text;
the thesaurus module 902 is configured to add the first text to the custom thesaurus in response to an operation of triggering the first collection control.
In an alternative embodiment, the display module 901 is configured to display a status bar, where the status bar displays a second collection control;
the thesaurus module 902 is configured to add the first text to the custom thesaurus in response to an operation of selecting the first text in the on-screen text and dragging the first text to the second collection control.
In an alternative embodiment, the custom thesaurus comprises at least one group, each of the at least one group comprising at least one favorite text;
the thesaurus module 902 is configured to: invoke a semantic recognition model to perform semantic recognition on the on-screen text to obtain a first semantic feature; invoke the semantic recognition model to perform semantic recognition on the collection text in one group to obtain a group semantic feature, and repeat this step for each group to obtain at least one group semantic feature respectively corresponding to the at least one group; calculate the semantic distance between the first semantic feature and each of the at least one group semantic feature to obtain at least one semantic distance; and add the first text to a first group in the custom thesaurus, where the first group is the group corresponding to the minimum of the at least one semantic distance.
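The grouping step above amounts to a nearest-group assignment. The sketch below illustrates it under stated assumptions: the application does not specify the semantic recognition model or the distance measure, so a bag-of-words count vector (`embed`) and cosine distance stand in for them, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for the semantic recognition model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine_distance(a: Counter, b: Counter) -> float:
    # Assumed distance measure: 1 - cosine similarity (1.0 if either vector is empty).
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm


def assign_group(on_screen_text: str, first_text: str, groups: dict[str, list[str]]) -> str:
    # The first semantic feature comes from the on-screen text, as in the description.
    first_feature = embed(on_screen_text)
    # One group semantic feature per group, built here from the group's joined texts.
    distances = {
        name: cosine_distance(first_feature, embed(" ".join(texts)))
        for name, texts in groups.items()
    }
    # The first group is the one at minimum distance (groups must be non-empty).
    nearest = min(distances, key=distances.get)
    groups[nearest].append(first_text)
    return nearest


groups = {
    "medicine": ["myocardial infarction", "coronary artery stent"],
    "law": ["force majeure clause", "breach of contract"],
}
on_screen = "The patient was admitted with acute myocardial infarction of the coronary artery"
chosen = assign_group(on_screen, "thrombolysis", groups)
```

In practice the stand-in `embed` would be replaced by a sentence-embedding model; the assignment logic itself would stay the same.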
In an optional embodiment, an input method program is run on the terminal, and the custom word stock is a word stock of the input method program;
and/or a document editing program is run on the terminal, and the custom word stock is a word stock of the document editing program.
In an optional embodiment, the terminal is provided with the input method program, and the recommendation window comprises a first candidate word window of the input method program;
the display module 901 is configured to display the first candidate word window of the input method program in response to the first character string being input in a character edit box of the input method program, where the first candidate word window displays both a candidate word matching the first character string and the favorite text, the candidate word is obtained according to a text input rule, and the favorite text is obtained by matching in the custom word stock.
In an optional embodiment, the terminal is provided with the input method program, and the recommendation window comprises a third candidate word window of the input method program;
the display module 901 is configured to display, in response to the first character string being input in a character edit box of the input method program, a second candidate word window and a third candidate word window of the input method program, where the second candidate word window displays candidate words matching the first character string, the third candidate word window displays the collection text matching the first character string in the custom word stock, and the candidate words are obtained according to a text input rule.
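The two window layouts described above (one shared first candidate word window, or separate second and third candidate word windows) can be contrasted in a short sketch. Everything here is assumed for illustration: the lookup table standing in for the text input rule, the prefix match, and the ordering inside the shared window, which the application does not specify.

```python
def ime_candidates(first_string: str, lexicon: dict[str, list[str]]) -> list[str]:
    # Stand-in for the "text input rule": look the string up in a hypothetical table.
    return list(lexicon.get(first_string, []))


def favorite_matches(first_string: str, favorites: list[str]) -> list[str]:
    # Assumption: a favorite "matches" when it starts with the typed string.
    key = first_string.lower()
    return [f for f in favorites if key and f.lower().startswith(key)]


def single_window(
    first_string: str, lexicon: dict[str, list[str]], favorites: list[str]
) -> list[str]:
    # First candidate word window: candidates and favorites share one list.
    # Assumption: candidates come first and duplicates are shown once.
    merged = ime_candidates(first_string, lexicon)
    for favorite in favorite_matches(first_string, favorites):
        if favorite not in merged:
            merged.append(favorite)
    return merged


def split_windows(
    first_string: str, lexicon: dict[str, list[str]], favorites: list[str]
) -> tuple[list[str], list[str]]:
    # Second window holds the ordinary candidates, third window holds the favorites.
    return ime_candidates(first_string, lexicon), favorite_matches(first_string, favorites)


lexicon = {"neu": ["neutral", "neuron"]}
favorites = ["neural network", "neuron", "gradient descent"]
shared = single_window("neu", lexicon, favorites)
second, third = split_windows("neu", lexicon, favorites)
```

The split layout keeps the ordinary candidate list unchanged, which may matter when users rely on fixed candidate positions; the shared layout saves screen space.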
In an alternative embodiment, the document editing program is run on the terminal, and the recommendation window comprises a collection text recommendation window of the document editing program;
the display module 901 is configured to display the favorite text recommendation window of the document editing program in response to the first character string being input in the text editing area of the document editing program, where the favorite text recommendation window displays the favorite text matching the first character string in the custom thesaurus.
In an optional embodiment, the document editing program is run on the terminal, the document editing program is an online document editing program, and the text editing area displays a first online document currently opened by the online document editing program; the online document editing program is logged in with a first user account;
the thesaurus module 902 is configured to, in response to an operation of selecting and collecting the first text in the on-screen text of the first online document, add the first text to the custom thesaurus of the first online document if the first user account has a thesaurus editing authority of the first online document;
the display module 901 is configured to display the favorite text recommendation window of the document editing program in response to the first character string being input in the text editing area of the first online document, where the favorite text recommendation window displays the favorite text matched with the first character string in the custom word stock corresponding to the first online document;
wherein the custom word stock is used to store text collected, in the process of editing the first online document, by user accounts that have the word stock editing authority of the first online document.
In an optional embodiment, the first online document corresponds to at least one custom word stock, and the word stock editing authority includes a word stock editing authority corresponding to each custom word stock in the at least one custom word stock;
the word stock editing authority includes a first word stock editing authority corresponding to a first custom word stock; in the case that the first user account has the first word stock editing authority, the first user account is allowed to edit and/or use the first custom word stock; in the case that the first user account does not have the first word stock editing authority, the first user account is not allowed to edit and/or use the first custom word stock.
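The per-word-stock authority check for an online document can be sketched as below. This is an illustrative model only: the class names, the representation of the editing authority as a set of account names, and the decision to gate both editing and use on the same authority are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentThesaurus:
    """Hypothetical custom word stock attached to one online document."""

    editors: set[str] = field(default_factory=set)  # accounts holding the editing authority
    collected: list[str] = field(default_factory=list)


@dataclass
class OnlineDocument:
    thesauri: dict[str, DocumentThesaurus] = field(default_factory=dict)

    def collect(self, account: str, thesaurus_name: str, first_text: str) -> bool:
        # Only an account with the authority for this word stock may add text to it.
        thesaurus = self.thesauri[thesaurus_name]
        if account not in thesaurus.editors:
            return False
        if first_text not in thesaurus.collected:
            thesaurus.collected.append(first_text)
        return True

    def recommend(self, account: str, first_string: str) -> list[str]:
        # Assumption: the same authority also gates use; matching is a prefix match.
        key = first_string.lower()
        results: list[str] = []
        for thesaurus in self.thesauri.values():
            if account in thesaurus.editors:
                results.extend(t for t in thesaurus.collected if t.lower().startswith(key))
        return results


doc = OnlineDocument(
    {
        "legal": DocumentThesaurus(editors={"alice"}),
        "finance": DocumentThesaurus(editors={"alice", "bob"}),
    }
)
ok_alice = doc.collect("alice", "legal", "force majeure")
ok_bob = doc.collect("bob", "finance", "forward contract")
denied = doc.collect("bob", "legal", "forum selection")
```

Because the collected text is stored with the document rather than with the account, every authorized collaborator sees the same recommendations while editing that document.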
In an alternative embodiment, the collection text displayed in the recommendation window is ranked according to recommendation rules;
the recommendation rules include at least one of:
a first collection text whose trigger abbreviation is the same as the first character string has a first priority, where the trigger abbreviation comprises trigger characters set by a user for the first collection text;
a second collection text that matches the first character string and exists in a first domain word stock has a second priority, where the first domain word stock is the domain word stock, among at least one domain word stock, that is semantically closest to the on-screen text;
a third collection text that matches the first character string and does not exist in the first domain word stock has a third priority;
wherein the first priority is higher than the second priority, and the second priority is higher than the third priority; the first collection text, the second collection text and the third collection text are collection texts in the custom word stock.
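The three-level ordering can be expressed as a sort key. The sketch below assumes that the first domain word stock has already been chosen and is passed in as a set of texts, that "matching" is a case-insensitive prefix match, and that ties keep their collection order; the names `Favorite`, `priority`, and `rank` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Favorite:
    text: str
    trigger_abbreviation: str = ""  # user-set trigger characters, "" if none


def priority(fav: Favorite, first_string: str, domain_lexicon: set[str]) -> Optional[int]:
    # First priority: the trigger abbreviation equals the typed string.
    if fav.trigger_abbreviation and fav.trigger_abbreviation == first_string:
        return 1
    # Assumption: otherwise a favorite "matches" on a case-insensitive prefix.
    if fav.text.lower().startswith(first_string.lower()):
        # Second priority if the text is in the first domain word stock, else third.
        return 2 if fav.text in domain_lexicon else 3
    return None  # not shown in the recommendation window


def rank(favorites: list[Favorite], first_string: str, domain_lexicon: set[str]) -> list[str]:
    scored = []
    for index, fav in enumerate(favorites):
        level = priority(fav, first_string, domain_lexicon)
        if level is not None:
            scored.append((level, index, fav.text))
    # Sort by priority level, then by original collection order.
    return [text for _, _, text in sorted(scored)]


favorites = [
    Favorite("tensor decomposition"),
    Favorite("tensor processing unit"),
    Favorite("Tencent Technology (Shenzhen) Co., Ltd.", trigger_abbreviation="ten"),
    Favorite("gradient descent"),
]
first_domain = {"tensor processing unit"}
ranked = rank(favorites, "ten", first_domain)
```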
In an optional embodiment, the display module 901 is configured to display a custom lexicon management window, where the custom lexicon management window displays a fourth collection text in the custom lexicon and a removal control corresponding to the fourth collection text;
the thesaurus module 902 is configured to remove the fourth favorite text from the custom thesaurus in response to an operation of triggering the removal control.
In an optional embodiment, the display module 901 is configured to display a custom lexicon management window, where the custom lexicon management window displays a fifth collection text in the custom lexicon and a modification control corresponding to the fifth collection text;
the thesaurus module 902 is configured to modify text content of the fifth favorite text according to a text modification operation that triggers the modification control.
In an optional embodiment, the display module 901 is configured to display a custom thesaurus management window, where the custom thesaurus management window displays a sixth collection text in the custom thesaurus and a trigger abbreviation editing control corresponding to the sixth collection text;
the thesaurus module 902 is configured to, in response to a trigger abbreviation editing operation that triggers the trigger abbreviation editing control, set the trigger abbreviation of the sixth favorite text to a first trigger abbreviation input by the trigger abbreviation editing operation.
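The three management operations above (remove, modify, and edit the trigger abbreviation) can be sketched as operations on a single mapping. This is a hedged illustration: the class `ThesaurusManager` and the storage of entries as a text-to-abbreviation dictionary are assumptions, not structures named in the application.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusManager:
    """Hypothetical backing store for a custom word stock management window."""

    # Maps each collected text to its trigger abbreviation ("" when none is set).
    entries: dict[str, str] = field(default_factory=dict)

    def remove(self, text: str) -> None:
        # Removal control: drop the entry if it exists.
        self.entries.pop(text, None)

    def modify(self, old_text: str, new_text: str) -> None:
        # Modification control: replace the text, keeping its position and abbreviation.
        if old_text not in self.entries:
            raise KeyError(old_text)
        self.entries = {
            (new_text if text == old_text else text): abbreviation
            for text, abbreviation in self.entries.items()
        }

    def set_trigger(self, text: str, abbreviation: str) -> None:
        # Trigger abbreviation editing control: store the user-entered abbreviation.
        if text not in self.entries:
            raise KeyError(text)
        self.entries[text] = abbreviation


manager = ThesaurusManager(
    {"word stock": "", "semantic distnace": "", "candidate window": ""}
)
manager.remove("word stock")
manager.modify("semantic distnace", "semantic distance")
manager.set_trigger("candidate window", "cw")
```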
The application also provides a computer device (terminal), which comprises a processor and a memory, wherein at least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to realize the word stock construction method provided by each method embodiment. It should be noted that the computer device may be a computer device as provided in fig. 21 below.
As shown in fig. 21, the computer device 1000 may include a processor 1001, a network interface 1004, and a memory 1005. In addition, the computer device 1000 may further include a target user interface 1003 and at least one communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The target user interface 1003 may include a display and a keyboard, and optionally may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as at least one magnetic disk memory. Optionally, the memory 1005 may also be at least one storage device located remotely from the processor 1001. As shown in fig. 21, the memory 1005, which is a type of computer-readable storage medium, may include an operating system, a network communication module, a target user interface module, and a device control application.
In the computer device 1000 shown in fig. 21, the network interface 1004 may provide a network communication function; the target user interface 1003 mainly provides an input interface for a target user; and the processor 1001 may be used to invoke the device control application stored in the memory 1005 to implement the word stock construction method.
It should be understood that the computer device 1000 described in the embodiments of the present application may perform the word stock construction method described in any of the foregoing embodiments, and details are not repeated here.
The application also provides a computer readable storage medium, wherein at least one instruction is stored in the storage medium, and the at least one instruction is loaded and executed by a processor to realize the word stock construction method provided by each method embodiment.
The application also provides a computer program product which, when run on a computer, causes the computer to execute the word stock construction method provided by the method embodiments.
The ordering of the above embodiments of the application is for description only and does not indicate that any embodiment is better or worse than another.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description of the preferred embodiments of the present application is not intended to limit the application, but rather, the application is to be construed as limited to the appended claims.

Claims (18)

1. A method for constructing a word stock, the method being performed by a terminal, the method comprising:
displaying a text editing area, wherein the text editing area displays an on-screen text;
in response to an operation of selecting and collecting a first text in the on-screen text, adding the first text to a custom thesaurus, wherein the custom thesaurus comprises collected text that is at least one of a noun, a phrase and a sentence;
and in response to the input of the first character string, displaying a recommendation window, wherein the recommendation window displays at least one collection text matched with the first character string in the custom word stock.
2. The method of claim 1, wherein the adding the first text to the custom thesaurus in response to the selecting and collecting a first text of the on-screen text comprises:
in response to an operation of selecting the first text in the on-screen text, displaying a first collection control corresponding to the first text;
and in response to an operation of triggering the first collection control, adding the first text to the custom thesaurus.
3. The method according to claim 1, wherein the method further comprises:
displaying a status bar, wherein the status bar displays a second collection control;
the adding the first text to a custom thesaurus in response to an operation of selecting and collecting a first text in the on-screen text comprises:
and in response to an operation of selecting the first text in the on-screen text and dragging the first text to the second collection control, adding the first text to the custom thesaurus.
4. A method according to any one of claims 1 to 3, wherein the custom thesaurus comprises at least one grouping, each of the at least one grouping comprising at least one favorite text;
the adding the first text to the custom thesaurus includes:
invoking a semantic recognition model to carry out semantic recognition on the on-screen text to obtain a first semantic feature;
invoking the semantic recognition model to carry out semantic recognition on the collection text in one group to obtain a group semantic feature, and repeating this step for each group to obtain at least one group semantic feature respectively corresponding to the at least one group;
calculating the semantic distance between the first semantic feature and each of the at least one group semantic feature to obtain at least one semantic distance;
and adding the first text to a first group in the custom thesaurus, wherein the first group is the group corresponding to the minimum of the at least one semantic distance.
5. A method according to any one of claims 1 to 3, wherein an input method program is run on the terminal, and the custom word stock is a word stock of the input method program;
and/or a document editing program is run on the terminal, and the custom word stock is a word stock of the document editing program.
6. The method of claim 5, wherein the input method program is run on the terminal, and the recommendation window comprises a first candidate word window of the input method program;
the displaying a recommendation window in response to entering the first string, comprising:
in response to the first character string being input in a character edit box of the input method program, displaying the first candidate word window of the input method program, wherein the first candidate word window displays both a candidate word matching the first character string and the collection text, the candidate word is obtained according to a text input rule, and the collection text is obtained by matching in the custom word stock.
7. The method of claim 5, wherein the input method program is run on the terminal, and the recommendation window comprises a third candidate word window of the input method program;
the displaying a recommendation window in response to entering the first string, comprising:
in response to the first character string being input in a character edit box of the input method program, displaying a second candidate word window and the third candidate word window of the input method program, wherein the second candidate word window displays candidate words matching the first character string, the third candidate word window displays the collection text matching the first character string in the custom word stock, and the candidate words are obtained according to a text input rule.
8. The method of claim 5, wherein the document editing program is run on the terminal, and the recommendation window comprises a favorite text recommendation window of the document editing program;
the displaying a recommendation window in response to entering the first string, comprising:
and in response to the first character string being input in the text editing area of the document editing program, displaying the collection text recommendation window of the document editing program, wherein the collection text recommendation window displays the collection text matched with the first character string in the custom word stock.
9. The method of claim 5, wherein the terminal has the document editing program running thereon, the document editing program being an online document editing program, the text editing area displaying a first online document currently opened by the online document editing program; the online document editing program logs in with a first user account;
the adding the first text to a custom thesaurus in response to an operation of selecting and collecting a first text in the on-screen text comprises:
in response to an operation of selecting and collecting the first text in the on-screen text of the first online document, adding the first text to the custom thesaurus of the first online document under the condition that the first user account has thesaurus editing authority of the first online document;
the displaying a recommendation window in response to entering the first string, comprising:
in response to the first character string being input in the text editing area of the first online document, displaying the collection text recommendation window of the document editing program, wherein the collection text recommendation window displays the collection text matched with the first character string in the custom word stock corresponding to the first online document;
wherein the custom word stock is used to store text collected, in the process of editing the first online document, by user accounts that have the word stock editing authority of the first online document.
10. The method of claim 9, wherein the first online document corresponds to at least one custom thesaurus, and the thesaurus editing rights comprise a thesaurus editing rights corresponding to each of the at least one custom thesaurus;
the word stock editing authority comprises a first word stock editing authority corresponding to a first custom word stock; in the case that the first user account has the first word stock editing authority, the first user account is allowed to edit and/or use the first custom word stock; in the case that the first user account does not have the first word stock editing authority, the first user account is not allowed to edit and/or use the first custom word stock.
11. A method according to any one of claims 1 to 3, wherein the favorite text displayed in the recommendation window is ordered according to recommendation rules;
the recommendation rules include at least one of:
a first collection text whose trigger abbreviation is the same as the first character string has a first priority, wherein the trigger abbreviation comprises trigger characters set by a user for the first collection text;
a second collection text that matches the first character string and exists in a first domain word stock has a second priority, wherein the first domain word stock is the domain word stock, among at least one domain word stock, that is semantically closest to the on-screen text;
a third collection text that matches the first character string and does not exist in the first domain word stock has a third priority;
wherein the first priority is higher than the second priority, and the second priority is higher than the third priority; the first collection text, the second collection text and the third collection text are collection texts in the custom word stock.
12. A method according to any one of claims 1 to 3, wherein the method further comprises:
displaying a custom lexicon management window, wherein the custom lexicon management window displays a fourth collection text in the custom lexicon and a removal control corresponding to the fourth collection text;
and in response to an operation of triggering the removal control, removing the fourth collection text from the custom word stock.
13. A method according to any one of claims 1 to 3, wherein the method further comprises:
displaying a custom thesaurus management window, wherein the custom thesaurus management window displays a fifth collection text in the custom thesaurus and a modification control corresponding to the fifth collection text;
and in response to a text modification operation that triggers the modification control, modifying the text content of the fifth collection text according to the text modification operation.
14. A method according to any one of claims 1 to 3, wherein the method further comprises:
displaying a custom thesaurus management window, wherein the custom thesaurus management window displays a sixth collection text in the custom thesaurus and a trigger abbreviation editing control corresponding to the sixth collection text;
and in response to a trigger abbreviation editing operation that triggers the trigger abbreviation editing control, setting the trigger abbreviation of the sixth collection text to a first trigger abbreviation input by the trigger abbreviation editing operation.
15. A word stock construction device, wherein the device is used to implement a terminal, and the device comprises:
a display module, configured to display a text editing area, wherein the text editing area displays an on-screen text;
an interaction module, configured to receive an operation of selecting and collecting a first text in the on-screen text;
a word stock module, configured to add the first text to a custom word stock in response to the operation of selecting and collecting the first text in the on-screen text, wherein the custom word stock comprises collected text that is at least one of a noun, a phrase and a sentence;
the interaction module is further configured to receive an operation of inputting a first character string;
the display module is further configured to display a recommendation window in response to the input of the first character string, wherein the recommendation window displays at least one collection text matched with the first character string in the custom word stock.
16. A computer device comprising a processor and a memory having stored therein at least one instruction, at least one program, code set, or instruction set that is loaded and executed by the processor to implement the lexicon construction method of any of claims 1 to 14.
17. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by a processor to implement the lexicon construction method of any one of claims 1 to 14.
18. A computer program product having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the lexicon construction method according to any one of claims 1 to 14.
CN202310118779.6A 2023-01-31 2023-01-31 Word stock construction method, device, equipment and storage medium Pending CN116956829A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310118779.6A CN116956829A (en) 2023-01-31 2023-01-31 Word stock construction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310118779.6A CN116956829A (en) 2023-01-31 2023-01-31 Word stock construction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116956829A true CN116956829A (en) 2023-10-27

Family

ID=88453627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310118779.6A Pending CN116956829A (en) 2023-01-31 2023-01-31 Word stock construction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116956829A (en)

Legal Events

Date Code Title Description
PB01 Publication