CN107203504B - Character string replacing method and device - Google Patents

Info

Publication number
CN107203504B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710351638.3A
Other languages
Chinese (zh)
Other versions
CN107203504A (en)
Inventor
朱德伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710351638.3A
Publication of CN107203504A
Application granted
Publication of CN107203504B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application discloses a character string replacing method and device. One embodiment of the method comprises: in response to detecting a pointing operation of a user on a target text, extracting a character string in the target text within a preset range of the position pointed to by the user, determining the extracted character string as a character string to be replaced, and extracting a clip character string stored in a clipboard in advance; segmenting the clip character string to generate a clip word set, and segmenting the character string to be replaced to generate a word set to be replaced; parsing the clip word set and the word set to be replaced, and determining a target clip word in the clip word set and a target word to be replaced, in the word set to be replaced, that matches the target clip word; and replacing the target word to be replaced in the character string to be replaced with the target clip word. The embodiment reduces labor cost and improves text processing efficiency.

Description

Character string replacing method and device
Technical Field
The application relates to the technical field of computers, in particular to the technical field of internet, and particularly relates to a character string replacing method and device.
Background
Users often use the copy-and-paste function when working on a mobile device or a computer. Generally, a character string must be copied and then manually pasted at a specified position, or an original character string must be selected and manually pasted over in order to replace it.
However, when a large number of character strings need to be replaced, having the user manually search for each character string to be replaced incurs a high labor cost and makes text processing inefficient.
Disclosure of Invention
An object of the embodiments of the present application is to provide an improved method and apparatus for replacing character strings, so as to solve the technical problems mentioned in the above background.
In a first aspect, an embodiment of the present application provides a character string replacement method, where the method includes: in response to detecting a pointing operation of a user on a target text, extracting a character string in the target text within a preset range of the position pointed to by the user, determining the extracted character string as a character string to be replaced, and extracting a clip character string stored in a clipboard in advance; segmenting the clip character string to generate a clip word set, and segmenting the character string to be replaced to generate a word set to be replaced; parsing the clip word set and the word set to be replaced, and determining a target clip word in the clip word set and a target word to be replaced, in the word set to be replaced, that matches the target clip word; and replacing the target word to be replaced in the character string to be replaced with the target clip word.
In some embodiments, parsing the clip word set and the word set to be replaced, and determining a target clip word in the clip word set and a target word to be replaced, in the word set to be replaced, that matches the target clip word includes: extracting a plurality of preset near-meaning phrases, wherein each near-meaning phrase comprises a plurality of words that are near-synonyms of one another; and, for each clip word in the clip word set: taking the near-meaning phrase, among the plurality of near-meaning phrases, that contains a word matching the clip word as a target near-meaning phrase; matching each word to be replaced in the word set to be replaced, one by one, against the words in the target near-meaning phrase other than the clip word; and, in response to determining that a successfully matched word to be replaced exists in the word set to be replaced, determining that word as the target word to be replaced matching the clip word, and determining the clip word as the target clip word.
In some embodiments, parsing the set of clipped words and the set of words to be replaced, and determining a target clipped word in the set of clipped words and a target word to be replaced matching the target clipped word in the set of words to be replaced includes: extracting a plurality of preset similar phrases, wherein each similar phrase in the plurality of similar phrases comprises a plurality of words belonging to the same type; for each clip word in the clip word set, the similar word group with the word matched with the clip word in the similar word groups is used as a target similar word group, each word to be replaced in the word set to be replaced is matched with the word except the clip word in the target similar word group one by one, in response to the fact that the word to be replaced which is successfully matched exists in the word set to be replaced, the determined word to be replaced which is successfully matched is determined as the target word to be replaced which is matched with the clip word, and the clip word is determined as the target clip word.
In some embodiments, parsing the clip word set and the word set to be replaced, and determining a target clip word in the clip word set and a target word to be replaced, in the word set to be replaced, that matches the target clip word includes: determining a word vector of each clip word in the clip word set and a word vector of each word to be replaced in the word set to be replaced; for each clip word in the clip word set, taking the word vector of the clip word as a target word vector, determining the similarity between the target word vector and the word vector of each word to be replaced, and, in response to determining that the word set to be replaced contains a word to be replaced whose word vector has a similarity to the target word vector greater than a preset similarity threshold, determining the clip word as a specified clip word and determining that word to be replaced as a specified word to be replaced matching the specified clip word; and, for each determined specified clip word, determining, based on preset near-meaning phrases, whether the specified clip word and the specified word to be replaced matching it are near-synonyms of each other, and if so, determining the specified word to be replaced as a target word to be replaced and the specified clip word as the target clip word.
In some embodiments, after determining the clip word as a specified clip word and determining the word to be replaced as a specified word to be replaced matching the specified clip word, parsing the clip word set and the word set to be replaced, and determining a target clip word in the clip word set and a target word to be replaced, in the word set to be replaced, that matches the target clip word further includes: for each determined specified clip word, determining, based on preset similar phrases, whether the specified clip word and the specified word to be replaced matching it belong to the same type, and if so, determining the specified word to be replaced as a target word to be replaced and the specified clip word as the target clip word.
In some embodiments, the method further comprises: for each determined specified clip word, in response to determining that the specified clip word and the specified word to be replaced matching it belong neither to the same near-meaning phrase nor to the same similar phrase, generating a similar phrase formed by the specified clip word and the specified word to be replaced matching it, and replacing that specified word to be replaced in the character string to be replaced with the specified clip word.
In a second aspect, an embodiment of the present application provides a character string replacing apparatus, including: an extraction unit configured to, in response to detecting a pointing operation of a user on a target text, extract a character string in the target text within a preset range of the position pointed to by the user, determine the extracted character string as a character string to be replaced, and extract a clip character string stored in a clipboard in advance; a word segmentation unit configured to segment the clip character string to generate a clip word set, and segment the character string to be replaced to generate a word set to be replaced; a parsing unit configured to parse the clip word set and the word set to be replaced, and determine a target clip word in the clip word set and a target word to be replaced, in the word set to be replaced, that matches the target clip word; and a first replacing unit configured to replace the target word to be replaced in the character string to be replaced with the target clip word.
In some embodiments, the parsing unit comprises: a first extraction module configured to extract a plurality of preset near-meaning phrases, wherein each near-meaning phrase comprises a plurality of words that are near-synonyms of one another; and a first determining module configured to, for each clip word in the clip word set, take the near-meaning phrase, among the plurality of near-meaning phrases, that contains a word matching the clip word as a target near-meaning phrase, match each word to be replaced in the word set to be replaced, one by one, against the words in the target near-meaning phrase other than the clip word, and, in response to determining that a successfully matched word to be replaced exists in the word set to be replaced, determine that word as the target word to be replaced matching the clip word and determine the clip word as the target clip word.
In some embodiments, the parsing unit comprises: the second extraction module is configured to extract a plurality of preset similar phrases, wherein each similar phrase in the plurality of similar phrases comprises a plurality of words belonging to the same type; and the second determining module is configured to, for each clip word in the clip word set, take a similar word group in the plurality of similar word groups in which a word matched with the clip word exists as a target similar word group, match each word to be replaced in the set of words to be replaced one by one with a word in the target similar word group except the clip word, determine, in response to determining that a successfully-matched word to be replaced exists in the set of words to be replaced, the determined successfully-matched word to be replaced as a target word to be replaced matched with the clip word, and determine the clip word as the target clip word.
In some embodiments, the parsing unit comprises: a third determining module configured to determine a word vector of each clip word in the clip word set and a word vector of each word to be replaced in the word set to be replaced; a fourth determining module configured to, for each clip word in the clip word set, take the word vector of the clip word as a target word vector, determine the similarity between the target word vector and the word vector of each word to be replaced, and, in response to determining that the word set to be replaced contains a word to be replaced whose word vector has a similarity to the target word vector greater than a preset similarity threshold, determine the clip word as a specified clip word and determine that word to be replaced as a specified word to be replaced matching the specified clip word; and a fifth determining module configured to determine, for each determined specified clip word and based on preset near-meaning phrases, whether the specified clip word and the specified word to be replaced matching it are near-synonyms of each other, and if so, determine the specified word to be replaced as a target word to be replaced and the specified clip word as the target clip word.
In some embodiments, the parsing unit further comprises: a sixth determining module configured to determine, for each determined specified clip word and based on preset similar phrases, whether the specified clip word and the specified word to be replaced matching it belong to the same type, and if so, determine the specified word to be replaced as a target word to be replaced and the specified clip word as the target clip word.
In some embodiments, the apparatus further comprises: a second replacing unit configured to, for each determined specified clip word, in response to determining that the specified clip word and the specified word to be replaced matching it belong neither to the same near-meaning phrase nor to the same similar phrase, generate a similar phrase formed by the specified clip word and the specified word to be replaced matching it, and replace that specified word to be replaced in the character string to be replaced with the specified clip word.
In a third aspect, an embodiment of the present application provides a terminal device, where the terminal device includes: one or more processors; a storage device for storing one or more programs which, when executed by one or more processors, cause the one or more processors to implement a method as in any one of the embodiments of the string replacement method described above.
According to the character string replacing method and device provided by the embodiments of the application, in response to detecting a pointing operation of the user on the target text, the character string in the target text within a preset range of the position pointed to by the user is extracted and determined as the character string to be replaced. The clip character string stored in the clipboard in advance and the character string to be replaced are then each segmented, generating a clip word set and a word set to be replaced respectively. The two sets are parsed to determine the target clip word and the target word to be replaced, and finally the target word to be replaced in the character string to be replaced is replaced with the target clip word. Manual editing and replacement are therefore not needed, which reduces labor cost and improves text processing efficiency.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram for one embodiment of a string replacement method according to the present application;
FIG. 3 is a schematic diagram of an application scenario of a string replacement method according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a string replacement method according to the present application;
FIG. 5 is a schematic block diagram of one embodiment of a string replacement apparatus according to the present application;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which the character string replacement method or the character string replacement apparatus of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The terminal devices 101, 102, 103 interact with a server 105 via a network 104 to receive or send messages or the like. Various communication client applications, such as a text editing application, a reading application, etc., may be installed on the terminal devices 101, 102, 103. The terminal devices 101, 102, 103 may perform word segmentation, parsing, replacement, and the like on the character strings copied, selected, and pointed by the user.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting text editing, including but not limited to smart phones, tablet computers, e-book readers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a management server that provides support for a thesaurus stored on the terminal devices 101, 102, 103. The management server can perform operations such as updating, managing, and the like on the word banks stored on the terminal apparatuses 101, 102, 103.
Note that, the terminal apparatuses 101, 102, and 103 may directly perform operations such as updating and managing a lexicon, and in this case, the server 105 and the network 104 may not be present.
It should be noted that the character string replacement method provided in the embodiment of the present application is generally executed by the terminal devices 101, 102, and 103, and accordingly, the character string replacement apparatus is generally disposed in the terminal devices 101, 102, and 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a string replacement method according to the present application is shown. The character string replacing method comprises the following steps:
Step 201, in response to detecting a pointing operation of the user on the target text, extracting a character string in the target text within a preset range of the position pointed to by the user, determining the extracted character string as the character string to be replaced, and extracting a clip character string stored in a clipboard in advance.
In this embodiment, an electronic device on which the character string replacement method runs (for example, the terminal devices 101, 102, and 103 shown in fig. 1) may be provided with a clipboard, in which a clip character string copied in advance by the user may be stored. The electronic device may also display a text; when the user selects a region or paragraph of the displayed text, the electronic device may determine the selected region or paragraph as the target text. In practice, the clipboard is an area of memory through which information can be transferred and shared among applications.
In this embodiment, in response to detecting a pointing operation of the user to the target text, the electronic device may extract a character string in the target text within a preset range of a position pointed by the user, determine the extracted character string as a character string to be replaced, and extract a clip character string stored in a clipboard in advance. Here, the pointing operation may be an operation of moving a mouse pointer or a cursor to a certain position in the target text, or an operation of clicking or pressing a certain position in the target text displayed on a touch panel of the electronic device by a user. It should be noted that the preset range may be a range formed by a preset number of characters (for example, 5 or 10 characters) before and after the position pointed by the user. As an example, the target text is "small edited here brings you ten taboos of wearing small girls", the user pointing position is the position where the character "woman" is located, the preset range is the range of a character string composed of 3 characters before and after the position pointed by the user, and the character string in the preset range is "small girls wearing clothes".
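As an illustration of the preset range described above, the following is a minimal Python sketch, not the implementation of this application. The function name `extract_window` and the parameter `radius` are hypothetical, and the sketch assumes that the pointed position is available as a character index into the target text.

```python
def extract_window(text: str, position: int, radius: int) -> str:
    """Return the pointed character plus up to `radius` characters on each side."""
    if not 0 <= position < len(text):
        raise ValueError("position must index a character of text")
    start = max(0, position - radius)            # clamp at the start of the text
    end = min(len(text), position + radius + 1)  # clamp at the end of the text
    return text[start:end]


# Pointing at "f" (index 5) with a radius of 3 selects 7 characters.
string_to_replace = extract_window("abcdefghijk", 5, 3)
```

With a radius of 3, the extracted string contains the pointed character and three characters on either side, which mirrors the 3-character example given above.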
Step 202, performing word segmentation on the clipped character string to generate a clipped word set, and performing word segmentation on the character string to be replaced to generate a word set to be replaced.
In this embodiment, the electronic device may perform word segmentation on the clipped character string by using various word segmentation methods to generate a clipped word set, and perform word segmentation on the character string to be replaced to generate a word set to be replaced.
In some optional implementations of this embodiment, the word segmentation method may be a statistics-based word segmentation method. Specifically, the occurrences of each character combination formed by adjacent characters in the clip character string and the character string to be replaced may be counted, and the frequency of occurrence of the combination calculated. When the frequency is higher than a preset frequency threshold, the combination is judged to form a word, thereby segmenting the clip character string and the character string to be replaced.
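The statistics-based approach can be sketched as follows. This is a simplified illustration under stated assumptions: only two-character combinations are considered, raw occurrence counts stand in for frequencies, and the function names and the greedy left-to-right merging rule are hypothetical choices not specified in this application.

```python
from collections import Counter


def adjacent_pair_counts(strings):
    """Count every pair of adjacent characters across the given strings."""
    counts = Counter()
    for s in strings:
        for a, b in zip(s, s[1:]):
            counts[a + b] += 1
    return counts


def segment_by_frequency(s, counts, threshold):
    """Greedily treat an adjacent pair as a word when its count exceeds threshold."""
    words = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if len(pair) == 2 and counts[pair] > threshold:
            words.append(pair)   # frequent pair: judged to form a word
            i += 2
        else:
            words.append(s[i])   # otherwise keep the single character
            i += 1
    return words


counts = adjacent_pair_counts(["abxab", "abyab"])
words = segment_by_frequency("abxab", counts, threshold=1)
```

In practice the counts would normally be gathered from a much larger corpus than the two strings themselves, since two short strings rarely contain enough repetition to be informative.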
In some optional implementation manners of this embodiment, the word segmentation method may also be a word segmentation method based on a character string matching principle. Specifically, the clipped character string and the character string to be replaced may be respectively matched with each word in a machine dictionary preset in the electronic device by using a character string matching principle, where the character string matching principle may be a forward maximum matching method, a reverse maximum matching method, a segmentation labeling method, a word-by-word traversal matching method, a forward optimal matching method, or a reverse optimal matching method.
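Of the string matching principles listed above, the forward maximum matching method is sketched below as a minimal illustration. The function name, the toy dictionary, and the fallback of emitting a single character when no dictionary word matches are assumptions of this sketch.

```python
def forward_max_match(text, dictionary, max_word_len=None):
    """Segment text by repeatedly taking the longest dictionary word at the cursor."""
    if max_word_len is None:
        max_word_len = max((len(w) for w in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a match is found.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)  # a single character is the fallback
                i += size
                break
    return words


machine_dictionary = {"small", "child", "smallchild", "girl"}
segmented = forward_max_match("smallchildgirl", machine_dictionary)
```

The reverse maximum matching method follows the same idea but scans from the end of the string toward the beginning.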
In some optional implementations of this embodiment, the electronic device may segment the clip character string and the character string to be replaced using a Hidden Markov Model (HMM). Taking segmentation of the clip character string as an example, the electronic device may first determine the quintuple that defines the hidden Markov model: an observable sequence, a hidden state set, an initial state probability distribution, a state transition matrix, and an observation probability distribution matrix. The observable sequence is the clip character string. The hidden state set may comprise four states: single-character word, word beginning, word middle, and word end. The initial state probability distribution may be the initial probability distribution of each state of the hidden state set in a preset lexicon. The state transition matrix may be used to represent the state transition probabilities for each character in the clip character string (e.g., the probability of transitioning from the word-beginning state to another state). The observation probability distribution matrix is used to characterize the probability of each character in each state. Then, the electronic device may label each character with a state and determine the maximum-probability state of each character based on the Viterbi algorithm. Finally, the clip character string is segmented based on the maximum-probability state of each character, yielding the clip word set.
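The HMM-based segmentation can be sketched with a small Viterbi decoder over the four states. This is an illustrative sketch only: the state letters B, M, E, S (word beginning, word middle, word end, single-character word) and every probability below are invented toy values, whereas a real system would estimate them from a lexicon or corpus.

```python
import math

NEG_INF = float("-inf")
STATES = "BMES"  # word Beginning, word Middle, word End, Single-character word


def log(p):
    return math.log(p) if p > 0 else NEG_INF


# Toy parameters, invented for illustration only.
START = {"B": log(0.6), "M": NEG_INF, "E": NEG_INF, "S": log(0.4)}
TRANS = {
    "B": {"B": NEG_INF, "M": log(0.3), "E": log(0.7), "S": NEG_INF},
    "M": {"B": NEG_INF, "M": log(0.3), "E": log(0.7), "S": NEG_INF},
    "E": {"B": log(0.6), "M": NEG_INF, "E": NEG_INF, "S": log(0.4)},
    "S": {"B": log(0.6), "M": NEG_INF, "E": NEG_INF, "S": log(0.4)},
}
EMIT = {"B": {"a": 0.8}, "M": {}, "E": {"b": 0.8}, "S": {"c": 0.8}}
DEFAULT_EMIT = 0.05  # assumed probability for unseen (state, character) pairs


def emit(state, ch):
    return log(EMIT[state].get(ch, DEFAULT_EMIT))


def viterbi(text):
    """Return the most probable state string for a non-empty text."""
    scores = {s: START[s] + emit(s, text[0]) for s in STATES}
    back = []  # back[t][s] is the best predecessor of state s at step t + 1
    for ch in text[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: scores[p] + TRANS[p][s])
            new_scores[s] = scores[best] + TRANS[best][s] + emit(s, ch)
            pointers[s] = best
        scores = new_scores
        back.append(pointers)
    # A complete segmentation must finish on a word end or a single-character word.
    state = max("ES", key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


def segment(text):
    """Cut the text after every character labeled E or S."""
    if not text:
        return []
    words, current = [], ""
    for ch, state in zip(text, viterbi(text)):
        current += ch
        if state in "ES":
            words.append(current)
            current = ""
    return words


clip_words = segment("abc")
```

Log probabilities are used so that long strings do not underflow, and impossible transitions (for example, word beginning followed directly by another word beginning) are given a log probability of negative infinity.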
It should be noted that the above word segmentation methods are well-known technologies that are widely researched and applied at present, and are not described herein again.
Step 203, parsing the clipped word set and the word set to be replaced, and determining a target clipped word in the clipped word set and a target word to be replaced which is matched with the target clipped word in the word set to be replaced.
In this embodiment, the electronic device may parse the clip word set and the word set to be replaced using various parsing methods, and determine a target clip word in the clip word set and a target word to be replaced, in the word set to be replaced, that matches the target clip word. As an example, the electronic device may compute the similarity between each clip word in the clip word set and each word to be replaced in the word set to be replaced, determine the clip word in the pair of words with the highest similarity as the target clip word, and determine the word to be replaced in that pair as the target word to be replaced. Alternatively, the electronic device may determine a clip word in the clip word set that is input or selected by the user as the target clip word, and determine a word in the word set to be replaced that is input or selected by the user as the target word to be replaced matching the target clip word.
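The highest-similarity selection can be sketched as below. The paragraph above does not fix a similarity measure; this sketch assumes cosine similarity over word vectors, and the function names and the two-dimensional toy vectors are hypothetical values chosen only for illustration.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_pair(clip_words, words_to_replace, vectors):
    """Return the (clip word, word to be replaced) pair with the highest similarity."""
    return max(
        product(clip_words, words_to_replace),
        key=lambda pair: cosine(vectors[pair[0]], vectors[pair[1]]),
        default=None,  # returned when either word set is empty
    )


# Toy word vectors, invented for illustration only.
vectors = {
    "short child": [1.0, 0.1],
    "small child": [0.9, 0.2],
    "girl": [0.1, 1.0],
    "dressing": [0.5, 0.5],
}
pair = best_pair(["short child"], ["small child", "girl", "dressing"], vectors)
```

The first element of the returned pair would serve as the target clip word and the second as the target word to be replaced.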
In some optional implementations of this embodiment, a plurality of near-meaning phrases preset by a user may be stored in the electronic device in advance, where each near-meaning phrase includes a plurality of words that are near-synonyms of one another. For example, a certain near-meaning phrase includes the word "like" and the word "like", another near-meaning phrase includes the word "short" and the word "small", and so on. The electronic device may extract the preset plurality of near-meaning phrases and, for each clip word in the clip word set, perform the following steps:
First, the near-meaning phrase, among the plurality of near-meaning phrases, that contains a word matching the clip word may be determined as the target near-meaning phrase. As an example, the plurality of near-meaning phrases are a first near-meaning phrase composed of the word "like" and the word "like", and a second near-meaning phrase composed of the word "short child" and the word "small child". The clip word set contains the clip word "short child". The electronic device can match the clip word "short child" against the words in the first and second near-meaning phrases one by one. In this example, the second near-meaning phrase contains a word matching the clip word "short child", so the second near-meaning phrase may be determined as the target near-meaning phrase corresponding to the clip word "short child".
Then, the electronic device may match each word to be replaced in the word set to be replaced, one by one, against the words in the target near-meaning phrase other than the clip word. Continuing the above example, the word set to be replaced may include the words to be replaced "small child", "girl", and "dressing". The electronic device may match each of these three words, one by one, against the word in the second near-meaning phrase other than "short child" (i.e., the word "small child").
Finally, in response to determining that a successfully matched word to be replaced exists in the word set to be replaced, the electronic device may determine that word as the target word to be replaced matching the clip word, and determine the clip word as the target clip word. Continuing the above example, the target near-meaning phrase corresponding to the clip word "short child" is the second near-meaning phrase. After matching the words to be replaced "small child", "girl", and "dressing" one by one against the word "small child" in the second near-meaning phrase, the electronic device may determine that the successfully matched word to be replaced "small child" exists in the word set to be replaced. The electronic device may then determine the word to be replaced "small child" as the target word to be replaced matching the clip word "short child", and determine the clip word "short child" as the target clip word.
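The three steps above can be sketched as follows. This is a minimal illustration rather than the implementation of this application: the function name is hypothetical, each near-meaning phrase is modeled as a Python set, and the group {"fast", "quick"} is an invented placeholder added only so that more than one group is searched.

```python
def match_by_groups(clip_words, words_to_replace, groups):
    """Return (target clip word, target word to be replaced) pairs."""
    pairs = []
    for clip in clip_words:
        for group in groups:
            if clip not in group:
                continue                      # not the target group for this clip word
            others = group - {clip}           # words in the group other than the clip word
            for candidate in words_to_replace:
                if candidate in others:       # successful match
                    pairs.append((clip, candidate))
    return pairs


near_meaning_phrases = [
    {"fast", "quick"},                 # hypothetical group, for illustration only
    {"short child", "small child"},    # second group from the example above
]
matches = match_by_groups(
    ["short child"],
    ["small child", "girl", "dressing"],
    near_meaning_phrases,
)
```

Because the function only tests group membership, the same sketch would apply unchanged to groups of words that belong to the same type rather than groups of near-synonyms.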
In some optional implementations of this embodiment, a plurality of same-type phrases preset by a user may be stored in the electronic device in advance, where each of the plurality of same-type phrases includes a plurality of words belonging to the same type. For example, one same-type phrase includes the word "Integer" and the word "String", both of which indicate a variable type; another same-type phrase includes the word "apple" and the word "watermelon", both of which are fruits; and so on. It should be noted that the words in a same-type phrase may also be a plurality of words preset by the user for describing the same object; for example, a same-type phrase may include the word "red apple" and the word "green apple". The electronic device may first extract the plurality of preset same-type phrases. Then, for each clip word in the clip word set, the electronic device may take the same-type phrase, among the plurality of same-type phrases, that contains a word matching the clip word as the target same-type phrase; match each word to be replaced in the set of words to be replaced, one by one, with the words in the target same-type phrase other than the clip word; and, in response to determining that a successfully matched word to be replaced exists in the set of words to be replaced, determine that word as the target word to be replaced matching the clip word and determine the clip word as the target clip word. It should be noted that the method for determining the target clip word and the target word to be replaced based on same-type phrases is substantially the same as the method based on near-meaning phrases, and is not described again here.
And step 204, replacing the target word to be replaced in the character string to be replaced with the target cut-and-pasted word.
In this embodiment, the electronic device may replace a target word to be replaced in the character string to be replaced with a corresponding target clip word. As an example, the character string to be replaced is "small girl dresses", the target word to be replaced is "small", the target clip word is "short", and the electronic device may replace the character string "small" in the character string "small girl dresses" to be replaced with "short", so as to obtain a replaced character string "short girl dresses".
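The replacement in step 204 can be illustrated with a short sketch. The function name `replace_target_word` is hypothetical, and replacing only the first occurrence is an assumption; the text does not say how repeated occurrences should be handled.

```python
def replace_target_word(string_to_replace: str, target_word: str, clip_word: str) -> str:
    """Replace the target word to be replaced with the target clip word."""
    # Assumption: only the first occurrence is replaced.
    return string_to_replace.replace(target_word, clip_word, 1)


replaced = replace_target_word("small girl dresses", "small", "short")
print(replaced)  # short girl dresses
```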
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the character string replacement method according to the present embodiment. In the application scenario of fig. 3, the user first selects the target text 301 and points to the location of the character "woman" in the target text 301. Then, the terminal device extracts the character string in the range formed by three characters before and after the position pointed to by the user in the target text 301, that is, the character string "small child girl dresses", determines that character string as the character string to be replaced, and extracts the clip character string stored in the clipboard in advance. Next, the terminal device segments the clip character string "short child classmates" and the character string to be replaced "small child girl dresses" respectively, generating a clip word set and a set of words to be replaced. The terminal device then parses the clip word set and the set of words to be replaced, and determines the target clip word "short child" in the clip word set and the target word to be replaced "small child" in the set of words to be replaced that matches the target clip word. Finally, the target word to be replaced "small child" in the character string to be replaced "small child girl dresses" is replaced with the target clip word "short child".
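The extraction of a character string within a preset range of the pointed position might look like the sketch below. The name `extract_string_to_replace`, the symmetric `radius` parameter, and the decision to include the pointed character itself are assumptions made for illustration; the example uses a neutral letter string rather than the text of fig. 3.

```python
def extract_string_to_replace(target_text: str, pointed_index: int, radius: int = 3) -> str:
    """Return the characters within `radius` positions of the pointed character."""
    # Clamp the window to the bounds of the target text.
    start = max(0, pointed_index - radius)
    end = min(len(target_text), pointed_index + radius + 1)
    return target_text[start:end]


# Pointing at "f" (index 5) keeps three characters on each side.
print(extract_string_to_replace("abcdefghijk", 5))  # cdefghi
```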
The method provided by the embodiment of the application, in response to detecting that a user performs an operation on a target text, extracts a character string in the target text within a preset range of a position pointed by the user, determines the extracted character string as a character string to be replaced, performs word segmentation on the extracted clipped character string and the extracted character string to be replaced, which are stored in a clipboard in advance, so as to generate a clipped word set and a word set to be replaced respectively, analyzes the clipped word set and the word set to be replaced, determines a target clipped word and a target word to be replaced, and finally replaces the target word to be replaced in the character string to be replaced with the target clipped word, so that manual editing and replacement are not required, the labor cost is reduced, and the text processing efficiency is improved.
With further reference to FIG. 4, a flow 400 of yet another embodiment of a string replacement method is shown. The process 400 of the character string replacement method includes the following steps:
step 401, in response to detecting a pointing operation of a user on a target text, extracting a character string in a preset range of a position pointed by the user in the target text, determining the extracted character string as a character string to be replaced, and extracting a clip character string stored in a clipboard in advance.
In this embodiment, an electronic device (for example, the terminal devices 101, 102, and 103 shown in fig. 1) on which the character string replacement method operates may be provided with a clipboard in which a clipboard character string copied in advance by a user may be stored. The electronic device may further display a text, and when a user selects a certain region or paragraph in the displayed text, the electronic device may determine the region or paragraph selected by the user as a target text. In response to detecting a pointing operation of the user to the target text, the electronic device may extract a character string in the target text within a preset range of a position pointed by the user, determine the extracted character string as a character string to be replaced, and extract a clip character string stored in a clipboard in advance.
Step 402, performing word segmentation on the clipped character string to generate a clipped word set, and performing word segmentation on the character string to be replaced to generate a word set to be replaced.
In this embodiment, the electronic device may perform word segmentation on the clipped character string by using various word segmentation methods to generate a clipped word set, and perform word segmentation on the character string to be replaced to generate a word set to be replaced. For example, the electronic device may perform segmentation of the clipped character string and the character string to be replaced by using a hidden markov model.
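As a simple stand-in for the segmentation step, the sketch below uses dictionary-based forward maximum matching rather than the hidden Markov model mentioned above; it is one of many possible word segmentation methods and is shown only because it fits in a few lines. The function name `segment`, the `vocabulary` set, and the `max_len` parameter are hypothetical.

```python
from typing import List, Set


def segment(text: str, vocabulary: Set[str], max_len: int = 4) -> List[str]:
    """Split unspaced text into words by forward maximum matching."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a known word is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


# Neutral letter strings stand in for the clip string and the string to be replaced.
vocab = {"ab", "cde"}
clip_word_set = segment("abcdex", vocab)
print(clip_word_set)  # ['ab', 'cde', 'x']
```

The same function would be applied once to the clip character string and once to the character string to be replaced, yielding the two word sets used by the later steps.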
Step 403, determining word vectors of each clipped word in the clipped word set and word vectors of each word to be replaced in the word set to be replaced.
In this embodiment, the electronic device may determine a word vector of each clip word in the clip word set and a word vector of each word to be replaced in the set of words to be replaced. In practice, a word vector is a vector used to represent the features of a word, where the value of each dimension represents a feature with a certain semantic or grammatical interpretation. Here, the electronic device may determine the word vectors by using various open-source word vector calculation tools (e.g., word2vec).
Step 404, for each clip word in the clip word set, taking the word vector of the clip word as a target word vector, determining the similarity between the target word vector and the word vector of each word to be replaced, and, in response to determining that the set of words to be replaced contains a word to be replaced whose word vector has a similarity to the target word vector greater than a preset similarity threshold, determining the clip word as a designated clip word and determining that word to be replaced as the designated word to be replaced matching the designated clip word.
In this embodiment, for each clip word in the clip word set, the electronic device may take the word vector of the clip word as a target word vector and determine the similarity between the target word vector and the word vector of each word to be replaced, by using various similarity calculation methods (e.g., a similarity calculation method based on Euclidean distance, a cosine similarity calculation method, etc.) or an open-source word vector calculation tool (e.g., word2vec). In response to determining that the set of words to be replaced contains a word to be replaced whose word vector has a similarity to the target word vector greater than a preset similarity threshold, the electronic device may determine the clip word as a designated clip word, determine that word to be replaced as the designated word to be replaced matching the designated clip word, and perform the operation in step 405 or step 406. It should be noted that the above methods for generating word vectors and for calculating word vector similarity are well-known technologies that are widely researched and applied at present, and are not described again here.
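Step 404 can be sketched with cosine similarity over toy vectors. The two-dimensional vectors, the threshold of 0.8, and the names `cosine_similarity` and `designate_pairs` are assumptions for illustration; in practice the vectors would come from a tool such as word2vec. Choosing the most similar word when several exceed the threshold is also an assumption, since the text only requires that the similarity be greater than the threshold.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def designate_pairs(
    clip_vectors: Dict[str, Sequence[float]],
    replace_vectors: Dict[str, Sequence[float]],
    threshold: float = 0.8,
) -> List[Tuple[str, str]]:
    """Pair each clip word with its most similar word to be replaced above the threshold."""
    pairs = []
    for clip_word, target_vector in clip_vectors.items():
        best_word, best_sim = None, threshold
        for word, vector in replace_vectors.items():
            sim = cosine_similarity(target_vector, vector)
            if sim > best_sim:
                best_word, best_sim = word, sim
        if best_word is not None:
            pairs.append((clip_word, best_word))
    return pairs


# Toy vectors, invented for this example only.
clip_vectors = {"short child": [1.0, 0.0], "classmates": [0.0, 1.0]}
replace_vectors = {"small child": [0.9, 0.1], "girl": [0.5, 0.5], "dressing": [-1.0, 0.2]}
pairs = designate_pairs(clip_vectors, replace_vectors)
print(pairs)  # [('short child', 'small child')]
```

Each returned pair corresponds to a designated clip word and its designated word to be replaced, which steps 405 and 406 then check against the preset phrases.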
Step 405, for each determined designated clip word, determining whether the designated clip word and the designated word to be replaced matched with the designated clip word are mutually similar words based on a preset near-meaning phrase, if so, determining the designated word to be replaced matched with the designated clip word as a target word to be replaced, and determining the designated clip word as the target clip word.
In this embodiment, for each determined designated clip word, the electronic device may determine, based on preset near-meaning phrases, whether the designated clip word and the designated word to be replaced matched with it are near-meaning words of each other; if so, it determines the designated word to be replaced as the target word to be replaced and the designated clip word as the target clip word. In practice, if the designated clip word and the designated word to be replaced matched with it belong to the same near-meaning phrase, the electronic device may determine that they are near-meaning words of each other.
Step 406, for each determined designated clip word, determining whether the designated clip word and the designated word to be replaced matched with the designated clip word belong to the same type based on a preset same-type phrase, if so, determining the designated word to be replaced matched with the designated clip word as a target word to be replaced, and determining the designated clip word as the target clip word.
In this embodiment, for each determined designated clip word, the electronic device may determine, based on preset same-type phrases, whether the designated clip word and the designated word to be replaced matched with it belong to the same type; if so, it determines the designated word to be replaced as the target word to be replaced and the designated clip word as the target clip word. In practice, if the designated clip word and the designated word to be replaced matched with it belong to the same same-type phrase, the electronic device may determine that they are words of the same type.
It should be noted that, for each determined designated clip word, in response to determining that the designated clip word and the designated word to be replaced matched with it belong neither to the same near-meaning phrase nor to the same same-type phrase, the electronic device may generate a near-meaning phrase formed by the designated clip word and the designated word to be replaced matched with it, and replace that designated word to be replaced in the character string to be replaced with the designated clip word.
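The checks in steps 405 and 406, together with the fallback just described, can be sketched as one function. The name `confirm_or_learn`, the list-of-sets representation, and the boolean return value are assumptions made for illustration only.

```python
from typing import List, Set


def confirm_or_learn(
    clip_word: str,
    word_to_replace: str,
    near_meaning_phrases: List[Set[str]],
    same_type_phrases: List[Set[str]],
) -> bool:
    """Return True if a new near-meaning phrase had to be generated for the pair."""

    def share_phrase(phrases: List[Set[str]]) -> bool:
        # True when one phrase contains both words.
        return any(clip_word in p and word_to_replace in p for p in phrases)

    if share_phrase(near_meaning_phrases) or share_phrase(same_type_phrases):
        return False  # the pair is already covered by a preset phrase
    # Neither kind of phrase covers the pair: record it as a new near-meaning phrase.
    near_meaning_phrases.append({clip_word, word_to_replace})
    return True


near = [{"short child", "small child"}]
same_type = [{"red apple", "green apple"}]
print(confirm_or_learn("red apple", "green apple", near, same_type))  # False
print(confirm_or_learn("Integer", "String", near, same_type))         # True
print(len(near))                                                      # 2
```

In both branches the caller would go on to perform the replacement; the return value only signals whether the phrase store grew.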
Step 407, replacing the target word to be replaced in the character string to be replaced with the target clip word.
In this embodiment, the electronic device may replace a target word to be replaced in the character string to be replaced with a corresponding target clip word.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the character string replacement method in this embodiment highlights the steps of parsing the clip word set and the set of words to be replaced. Therefore, the scheme described in this embodiment can more accurately determine the target clip word in the clip word set and the target word to be replaced in the set of words to be replaced, and can also add a near-meaning phrase based on the parsing result, thereby reducing the labor cost and further improving the efficiency of text processing.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present application provides an embodiment of a character string replacement apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 5, the character string replacing apparatus 500 according to the present embodiment includes: an extracting unit 501, configured to, in response to detection of a pointing operation of a user on a target text, extract a character string in a preset range of a position pointed by the user in the target text, determine the extracted character string as a character string to be replaced, and extract a clip character string stored in a clipboard in advance; a word segmentation unit 502 configured to segment the clipped character string to generate a clipped word set, and segment the character string to be replaced to generate a word set to be replaced; an analyzing unit 503 configured to analyze the clipped word set and the to-be-replaced word set, and determine a target clipped word in the clipped word set and a target to-be-replaced word in the to-be-replaced word set, which is matched with the target clipped word; a first replacing unit 504, configured to replace the target to-be-replaced word in the to-be-replaced character string with the target clip word.
In this embodiment, the string replacing apparatus 500 may be provided with a clipboard, and the clipboard may store a clipboard string copied in advance by a user. The character string replacing apparatus 500 may further display text, and when a user selects a certain region or paragraph in the displayed text, the region or paragraph selected by the user may be determined as the target text. In response to detecting a pointing operation of the user to the target text, the extracting unit 501 of the character string replacing apparatus 500 may extract a character string in the target text within a preset range of a position pointed by the user, determine the extracted character string as a character string to be replaced, and extract a clip character string stored in a clipboard in advance.
In this embodiment, the word segmentation unit 502 may perform word segmentation on the clipped character string by using various word segmentation methods to generate a clipped word set, and perform word segmentation on the character string to be replaced to generate a word set to be replaced.
In this embodiment, the parsing unit 503 may parse the clipped word set and the to-be-replaced word set by using various analysis methods, and determine a target clipped word in the clipped word set and a target to-be-replaced word in the to-be-replaced word set, which is matched with the target clipped word.
In this embodiment, the first replacing unit 504 may replace a target word to be replaced in the character string to be replaced with a corresponding target clip word.
In some optional implementations of this embodiment, the parsing unit 503 may include a first extracting module and a first determining module (not shown in the figure). The first extraction module may be configured to extract a plurality of preset near-meaning phrases, where each of the plurality of near-meaning phrases includes a plurality of words that are near-meaning words to each other. The first determining module may be configured to, for each clip word in the clip word set, take a near-meaning word group in the plurality of near-meaning word groups in which a word matching the clip word exists as a target near-meaning word group, match each word to be replaced in the set of words to be replaced with a word other than the clip word in the target near-meaning word group one by one, determine, in response to determining that a successfully-matched word to be replaced exists in the set of words to be replaced, the determined successfully-matched word to be replaced as a target word to be replaced matching the clip word, and determine the clip word as the target clip word.
In some optional implementations of this embodiment, the parsing unit 503 may include a second extracting module and a second determining module (not shown in the figure). The second extracting module may be configured to extract a plurality of preset same-type phrases, where each of the same-type phrases includes a plurality of words belonging to the same type. The second determining module may be configured to, for each clip word in the clip word set, take the same-type phrase, among the plurality of same-type phrases, that contains a word matching the clip word as the target same-type phrase, match each word to be replaced in the set of words to be replaced one by one with the words in the target same-type phrase other than the clip word, and, in response to determining that a successfully matched word to be replaced exists in the set of words to be replaced, determine that word as the target word to be replaced matching the clip word and determine the clip word as the target clip word.
In some optional implementations of this embodiment, the parsing unit 503 may include a third determining module, a fourth determining module, and a fifth determining module (not shown in the figure). The third determining module may be configured to determine a word vector of each clip word in the clip word set and a word vector of each word to be replaced in the set of words to be replaced. The fourth determining module may be configured to, for each clip word in the clip word set, take the word vector of the clip word as a target word vector, determine the similarity between the target word vector and the word vector of each word to be replaced, and, in response to determining that the set of words to be replaced contains a word to be replaced whose word vector has a similarity to the target word vector greater than a preset similarity threshold, determine the clip word as a designated clip word and determine that word to be replaced as the designated word to be replaced matching the designated clip word. The fifth determining module may be configured to determine, for each determined designated clip word, based on preset near-meaning phrases, whether the designated clip word and the designated word to be replaced matched with it are near-meaning words of each other, and if so, determine the designated word to be replaced as the target word to be replaced and the designated clip word as the target clip word.
In some optional implementations of this embodiment, the parsing unit 503 may further include a sixth determining module (not shown in the figure). The sixth determining module may be configured to determine, for each determined designated clip word, based on preset same-type phrases, whether the designated clip word and the designated word to be replaced matched with it belong to the same type, and if so, determine the designated word to be replaced as the target word to be replaced and the designated clip word as the target clip word.
In some optional implementations of the present embodiment, the character string replacing apparatus 500 may further include a second replacing unit (not shown in the figure). The second replacing unit may be configured to, for each determined designated clip word, in response to determining that the designated clip word and the designated word to be replaced matched with it belong neither to the same near-meaning phrase nor to the same same-type phrase, generate a near-meaning phrase formed by the designated clip word and the designated word to be replaced matched with it, and replace that designated word to be replaced in the character string to be replaced with the designated clip word.
In the apparatus provided in the above embodiment of the present application, in response to detecting a pointing operation of a user on a target text, the extracting unit 501 extracts a character string in the target text within a preset range of the position pointed to by the user and determines the extracted character string as the character string to be replaced. The word segmentation unit 502 then segments the clip character string stored in the clipboard in advance and the character string to be replaced, generating a clip word set and a set of words to be replaced respectively. The parsing unit 503 then parses the clip word set and the set of words to be replaced to determine a target clip word and a target word to be replaced. Finally, the first replacing unit 504 replaces the target word to be replaced in the character string to be replaced with the target clip word, so that manual editing and replacement are not required, labor cost is reduced, and the efficiency of text processing is improved.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing a terminal device of an embodiment of the present application. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a touch screen, a touch panel, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601. It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor includes an extracting unit, a word segmentation unit, a parsing unit, and a first replacing unit. The names of the units do not in some cases constitute a limitation on the units themselves; for example, the extracting unit may also be described as a "unit that extracts a clip character string and a character string to be replaced".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: in response to the detection of the pointing operation of a user to a target text, extracting a character string in the target text within a preset range of a position pointed by the user, determining the extracted character string as a character string to be replaced, and extracting a clip character string stored in a clipboard in advance; segmenting the cut and pasted character string to generate a cut and pasted word set, and segmenting the character string to be replaced to generate a word set to be replaced; analyzing the cut and pasted word set and the word set to be replaced, and determining a target cut and pasted word in the cut and pasted word set and a target word to be replaced which is matched with the target cut and pasted word in the word set to be replaced; and replacing the target word to be replaced in the character string to be replaced with the target clip word. The embodiment reduces the labor cost and improves the text processing efficiency.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (12)

1. A character string replacement method, the method comprising:
in response to detecting a pointing operation by a user on a target text, extracting a character string within a preset range of the position pointed to by the user in the target text, determining the extracted character string as a character string to be replaced, and extracting a clipboard string stored in advance in a clipboard;
segmenting the clipboard string into words to generate a clipboard word set, and segmenting the character string to be replaced into words to generate a set of words to be replaced;
analyzing the clipboard word set and the set of words to be replaced to determine a target clipboard word in the clipboard word set and a target word to be replaced, in the set of words to be replaced, that matches the target clipboard word, including: extracting a plurality of preset synonym groups, wherein each synonym group comprises a plurality of words that are synonyms of one another; for each clipboard word in the clipboard word set, taking the synonym group, among the plurality of synonym groups, that contains a word matching the clipboard word as a target synonym group, matching each word to be replaced in the set of words to be replaced one by one against the words in the target synonym group other than the clipboard word, and, in response to determining that a successfully matched word to be replaced exists in the set of words to be replaced, determining the successfully matched word to be replaced as the target word to be replaced matching the clipboard word and determining the clipboard word as the target clipboard word;
and replacing the target word to be replaced in the character string to be replaced with the target clipboard word.
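The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the whitespace-based `segment` function stands in for a real word segmenter, the contents of `SYNONYM_GROUPS` are invented sample data, and the sketch stops at the first matching pair, which the claim does not require. All function and variable names are hypothetical.

```python
# Hypothetical preset synonym (near-meaning) groups; sample data only.
SYNONYM_GROUPS = [
    {"quick", "fast", "rapid"},
    {"big", "large", "huge"},
]


def segment(text):
    """Stand-in for a real word segmenter: split on whitespace."""
    return text.split()


def find_match(clip_words, words_to_replace, groups):
    """Return (target clipboard word, target word to replace), or None."""
    for clip_word in clip_words:
        for group in groups:
            if clip_word not in group:
                continue  # not a target group for this clipboard word
            # Match only against group members other than the clipboard word.
            candidates = group - {clip_word}
            for word in words_to_replace:
                if word in candidates:
                    return clip_word, word
    return None


def replace_from_clipboard(string_to_replace, clip_string, groups=SYNONYM_GROUPS):
    """Replace the matched word with the clipboard word, if a match exists."""
    words = segment(string_to_replace)
    match = find_match(segment(clip_string), words, groups)
    if match is None:
        return string_to_replace  # nothing matched; leave the text unchanged
    target_clip, target_old = match
    return " ".join(target_clip if w == target_old else w for w in words)


print(replace_from_clipboard("a quick brown fox", "fast"))  # a fast brown fox
```

Because the sketch rejoins words with single spaces, it normalizes whitespace; an implementation that must preserve the original layout would replace by character offset instead.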
2. The method according to claim 1, wherein the analyzing the clipboard word set and the set of words to be replaced to determine a target clipboard word in the clipboard word set and a target word to be replaced, in the set of words to be replaced, that matches the target clipboard word includes:
extracting a plurality of preset same-type word groups, wherein each same-type word group comprises a plurality of words belonging to the same type;
for each clipboard word in the clipboard word set, taking the same-type word group, among the plurality of same-type word groups, that contains a word matching the clipboard word as a target same-type word group, matching each word to be replaced in the set of words to be replaced one by one against the words in the target same-type word group other than the clipboard word, and, in response to determining that a successfully matched word to be replaced exists in the set of words to be replaced, determining the successfully matched word to be replaced as the target word to be replaced matching the clipboard word and determining the clipboard word as the target clipboard word.
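Matching by type, as recited in claim 2, can be sketched with an inverted index from each word to the labels of the groups containing it. This is one possible design, assumed here for illustration; the claim does not prescribe a data structure, and the group labels and contents in `TYPE_GROUPS` are invented sample data.

```python
from collections import defaultdict

# Hypothetical preset groups of words of the same type; sample data only.
TYPE_GROUPS = {
    "color": ["red", "blue", "green"],
    "weekday": ["Monday", "Tuesday", "Friday"],
}


def build_index(groups):
    """Map each word to the set of group labels it belongs to."""
    index = defaultdict(set)
    for label, words in groups.items():
        for word in words:
            index[word].add(label)
    return index


def match_by_type(clip_words, words_to_replace, groups=TYPE_GROUPS):
    """Pair each clipboard word with the first different word of the same type."""
    index = build_index(groups)
    pairs = []
    for clip_word in clip_words:
        clip_labels = index.get(clip_word, set())
        for word in words_to_replace:
            # A match is a different word sharing at least one group label.
            if word != clip_word and clip_labels & index.get(word, set()):
                pairs.append((clip_word, word))
                break
    return pairs


print(match_by_type(["blue"], ["the", "red", "door"]))  # [('blue', 'red')]
```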
3. The method according to claim 1, wherein the analyzing the clipboard word set and the set of words to be replaced to determine a target clipboard word in the clipboard word set and a target word to be replaced, in the set of words to be replaced, that matches the target clipboard word includes:
determining a word vector for each clipboard word in the clipboard word set and a word vector for each word to be replaced in the set of words to be replaced;
for each clipboard word in the clipboard word set, taking the word vector of the clipboard word as a target word vector, determining the similarity between the target word vector and the word vector of each word to be replaced, and, in response to determining that the set of words to be replaced contains a word to be replaced whose word vector has a similarity to the target word vector greater than a preset similarity threshold, determining the clipboard word as a designated clipboard word and determining that word to be replaced as the designated word to be replaced matching the designated clipboard word;
and for each determined designated clipboard word, determining, based on preset synonym groups, whether the designated clipboard word and the designated word to be replaced matching it are synonyms of each other, and if so, determining the designated word to be replaced matching the designated clipboard word as the target word to be replaced and determining the designated clipboard word as the target clipboard word.
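The two-stage matching of claim 3 can be sketched as follows. The claim does not name a similarity measure, so cosine similarity is an assumption here, as are the two-dimensional toy vectors in `VECTORS`, the threshold value, and the choice to keep only the most similar candidate per word. A real system would obtain vectors from a trained embedding model.

```python
import math

# Hypothetical toy word vectors; sample data only.
VECTORS = {
    "fast": (0.9, 0.1),
    "quick": (0.8, 0.2),
    "car": (0.7, 0.7),
}
SYNONYM_GROUPS = [{"fast", "quick", "rapid"}]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def designate(clip_words, words_to_replace, vectors, threshold):
    """Stage 1: pair each clipboard word with its most similar word above the threshold."""
    pairs = []
    for clip_word in clip_words:
        best, best_sim = None, threshold
        for word in words_to_replace:
            if word == clip_word or clip_word not in vectors or word not in vectors:
                continue
            sim = cosine(vectors[clip_word], vectors[word])
            if sim > best_sim:
                best, best_sim = word, sim
        if best is not None:
            pairs.append((clip_word, best))
    return pairs


def confirm(pairs, synonym_groups):
    """Stage 2: keep only pairs that share a preset synonym group."""
    return [(c, w) for c, w in pairs
            if any(c in g and w in g for g in synonym_groups)]


designated = designate(["fast"], ["a", "quick", "car"], VECTORS, threshold=0.7)
print(confirm(designated, SYNONYM_GROUPS))  # [('fast', 'quick')]
```

With these sample vectors, "car" also exceeds the threshold against "fast" when it is the only candidate, but the synonym check in the second stage rejects that pair.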
4. The method according to claim 3, wherein, after the determining the clipboard word as a designated clipboard word and determining that word to be replaced as the designated word to be replaced matching the designated clipboard word, the analyzing the clipboard word set and the set of words to be replaced to determine a target clipboard word in the clipboard word set and a target word to be replaced, in the set of words to be replaced, that matches the target clipboard word further includes:
for each determined designated clipboard word, determining, based on preset same-type word groups, whether the designated clipboard word and the designated word to be replaced matching it belong to the same type, and if so, determining the designated word to be replaced matching the designated clipboard word as the target word to be replaced and determining the designated clipboard word as the target clipboard word.
5. The character string replacement method according to claim 4, further comprising:
for each determined designated clipboard word, in response to determining that the designated clipboard word and the designated word to be replaced matching it belong neither to the same synonym group nor to the same same-type word group, generating a new group of similar words formed by the designated clipboard word and the designated word to be replaced matching it, and replacing the designated word to be replaced matching the designated clipboard word in the character string to be replaced with the designated clipboard word.
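The fallback in claim 5 amounts to learning a new group from a pair that passed the word-vector threshold but appears in no preset group. The sketch below is illustrative only. The translated claim does not make clear which kind of group the new one is, so storing it alongside the synonym groups is an assumption, and all names are hypothetical.

```python
def in_same_group(a, b, groups):
    """True if some group contains both words."""
    return any(a in g and b in g for g in groups)


def apply_designated_pair(words, clip_word, old_word, synonym_groups, type_groups):
    """Replace old_word with clip_word, recording a new group if none covers the pair."""
    known = (in_same_group(clip_word, old_word, synonym_groups)
             or in_same_group(clip_word, old_word, type_groups))
    if not known:
        # Assumption: the newly generated group is kept with the synonym groups.
        synonym_groups.append({clip_word, old_word})
    # The replacement happens in both cases: known pairs are replaced as
    # target words, unknown pairs are replaced under this fallback.
    return [clip_word if w == old_word else w for w in words]


synonyms = [{"fast", "quick"}]
types = [{"red", "blue"}]
print(apply_designated_pair(["a", "speedy", "car"], "fast", "speedy", synonyms, types))
# ['a', 'fast', 'car']; synonyms now also holds {"fast", "speedy"}
```

Recording the pair means a later paste of the same word can be matched by group lookup alone.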
6. A character string replacing apparatus, characterized in that the apparatus comprises:
an extraction unit configured to, in response to detecting a pointing operation by a user on a target text, extract a character string within a preset range of the position pointed to by the user in the target text, determine the extracted character string as a character string to be replaced, and extract a clipboard string stored in advance in a clipboard;
a word segmentation unit configured to segment the clipboard string into words to generate a clipboard word set and to segment the character string to be replaced into words to generate a set of words to be replaced;
an analysis unit configured to analyze the clipboard word set and the set of words to be replaced and to determine a target clipboard word in the clipboard word set and a target word to be replaced, in the set of words to be replaced, that matches the target clipboard word;
a first replacing unit configured to replace the target word to be replaced in the character string to be replaced with the target clipboard word;
wherein the analysis unit includes: a first extraction module configured to extract a plurality of preset synonym groups, wherein each synonym group comprises a plurality of words that are synonyms of one another; and a first determining module configured to, for each clipboard word in the clipboard word set, take the synonym group, among the plurality of synonym groups, that contains a word matching the clipboard word as a target synonym group, match each word to be replaced in the set of words to be replaced one by one against the words in the target synonym group other than the clipboard word, and, in response to determining that a successfully matched word to be replaced exists in the set of words to be replaced, determine the successfully matched word to be replaced as the target word to be replaced matching the clipboard word and determine the clipboard word as the target clipboard word.
7. The character string replacing apparatus according to claim 6, wherein the analysis unit includes:
a second extraction module configured to extract a plurality of preset same-type word groups, wherein each same-type word group comprises a plurality of words belonging to the same type;
and a second determining module configured to, for each clipboard word in the clipboard word set, take the same-type word group, among the plurality of same-type word groups, that contains a word matching the clipboard word as a target same-type word group, match each word to be replaced in the set of words to be replaced one by one against the words in the target same-type word group other than the clipboard word, and, in response to determining that a successfully matched word to be replaced exists in the set of words to be replaced, determine the successfully matched word to be replaced as the target word to be replaced matching the clipboard word and determine the clipboard word as the target clipboard word.
8. The character string replacing apparatus according to claim 6, wherein the analysis unit includes:
a third determining module configured to determine a word vector for each clipboard word in the clipboard word set and a word vector for each word to be replaced in the set of words to be replaced;
a fourth determining module configured to, for each clipboard word in the clipboard word set, take the word vector of the clipboard word as a target word vector, determine the similarity between the target word vector and the word vector of each word to be replaced, and, in response to determining that the set of words to be replaced contains a word to be replaced whose word vector has a similarity to the target word vector greater than a preset similarity threshold, determine the clipboard word as a designated clipboard word and determine that word to be replaced as the designated word to be replaced matching the designated clipboard word;
and a fifth determining module configured to determine, for each determined designated clipboard word and based on preset synonym groups, whether the designated clipboard word and the designated word to be replaced matching it are synonyms of each other, and if so, determine the designated word to be replaced matching the designated clipboard word as the target word to be replaced and determine the designated clipboard word as the target clipboard word.
9. The character string replacing apparatus according to claim 8, wherein the analysis unit further includes:
a sixth determining module configured to determine, for each determined designated clipboard word and based on preset same-type word groups, whether the designated clipboard word and the designated word to be replaced matching it belong to the same type, and if so, determine the designated word to be replaced matching the designated clipboard word as the target word to be replaced and determine the designated clipboard word as the target clipboard word.
10. The character string replacing apparatus according to claim 9, wherein the apparatus further comprises:
a second replacing unit configured to, for each determined designated clipboard word, in response to determining that the designated clipboard word and the designated word to be replaced matching it belong neither to the same synonym group nor to the same same-type word group, generate a new group of similar words formed by the designated clipboard word and the designated word to be replaced matching it, and replace the designated word to be replaced matching the designated clipboard word in the character string to be replaced with the designated clipboard word.
11. A terminal device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
CN201710351638.3A 2017-05-18 2017-05-18 Character string replacing method and device Active CN107203504B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710351638.3A CN107203504B (en) 2017-05-18 2017-05-18 Character string replacing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710351638.3A CN107203504B (en) 2017-05-18 2017-05-18 Character string replacing method and device

Publications (2)

Publication Number Publication Date
CN107203504A CN107203504A (en) 2017-09-26
CN107203504B true CN107203504B (en) 2021-02-26

Family

ID=59906530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710351638.3A Active CN107203504B (en) 2017-05-18 2017-05-18 Character string replacing method and device

Country Status (1)

Country Link
CN (1) CN107203504B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062879A (en) * 2018-07-04 2018-12-21 珠海市魅族科技有限公司 A kind of replacement method and device
CN110162794A (en) * 2019-05-29 2019-08-23 腾讯科技(深圳)有限公司 A kind of method and server of participle
CN110929522A (en) * 2019-08-19 2020-03-27 网娱互动科技(北京)股份有限公司 Intelligent synonym replacement method and system
CN111159978B (en) * 2019-12-30 2023-07-21 北京爱医生智慧医疗科技有限公司 Character string replacement processing method and device
CN111611788B (en) * 2020-04-14 2024-02-09 大唐软件技术股份有限公司 Data processing method and device, electronic equipment and storage medium
CN113688359B (en) * 2020-05-18 2024-08-16 北京京东尚科信息技术有限公司 Processing method, device, computing equipment and medium for program code
CN111783858B (en) * 2020-06-19 2022-07-15 厦门市美亚柏科信息股份有限公司 Method and device for generating category vector
CN113705213A (en) * 2021-03-01 2021-11-26 腾讯科技(深圳)有限公司 Wrongly written character recognition method, device, equipment and readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7818458B2 (en) * 2007-12-03 2010-10-19 Microsoft Corporation Clipboard for application sharing
CN102141933A (en) * 2011-01-17 2011-08-03 博视联(苏州)信息科技有限公司 System for providing multiple multiplexing and pasting of computer application program and method thereof
CN103617154A (en) * 2013-11-29 2014-03-05 百度在线网络技术(北京)有限公司 Method and device for having control over content paste operation
CN105095222A (en) * 2014-04-25 2015-11-25 阿里巴巴集团控股有限公司 Unit word replacing method, search method and replacing apparatus
CN105868236A (en) * 2015-12-09 2016-08-17 乐视网信息技术(北京)股份有限公司 Synonym data mining method and system
CN106649783A (en) * 2016-12-28 2017-05-10 上海智臻智能网络科技股份有限公司 Synonym mining method and apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Approach to bulk synonym replacement [大批量同义词替换的思路];didibaba;《https://bbs.csdn.net/topics/330009727》;20091227;pp. 1-3 *
A clipboard enhancement tool for Android [安卓上的剪贴板增强神器工具];X-Force;《https://www.iplaysoft.com/clipboard-plus.html》;20161103;pp. 1-6 *

Also Published As

Publication number Publication date
CN107203504A (en) 2017-09-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant