CN112559711A - Synonymous text prompting method and device and electronic equipment


Info

Publication number
CN112559711A
CN112559711A
Authority
CN
China
Prior art keywords
candidate
word
input text
text
words
Prior art date
Legal status (an assumption, not a legal conclusion)
Pending
Application number
CN202011539680.6A
Other languages
Chinese (zh)
Inventor
任帅
王博弘
张振
蒋宏飞
宋旸
王瑞阳
王阳
赵慧娟
Current Assignee (the listed assignees may be inaccurate)
Zuoyebang Education Technology Beijing Co Ltd
Original Assignee
Zuoyebang Education Technology Beijing Co Ltd
Priority date (an assumption, not a legal conclusion)
Filing date
Publication date
Application filed by Zuoyebang Education Technology Beijing Co Ltd filed Critical Zuoyebang Education Technology Beijing Co Ltd
Priority to CN202011539680.6A priority Critical patent/CN112559711A/en
Publication of CN112559711A publication Critical patent/CN112559711A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3322 Query formulation using system suggestions
    • G06F 16/3331 Query processing
    • G06F 16/3332 Query translation
    • G06F 16/3334 Selection or weighting of terms from queries, including natural language queries
    • G06F 16/3338 Query expansion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)

Abstract

The invention belongs to the field of computer technology and provides a synonymous text prompting method, a synonymous text prompting device, and electronic equipment. The method comprises the following steps: segmenting an input text into word units; determining a target word unit from the word units according to the segmentation condition of the input text; obtaining candidate words corresponding to the target word unit through a preset model to form a candidate set; ranking the candidate words in the candidate set to obtain a comprehensive ranking candidate set corresponding to the target word unit; and prompting the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set. The method and device improve the synonymous text recognition rate while improving the user experience: the user can select the target synonymous text from the top-ranked synonymous texts that are prompted.

Description

Synonymous text prompting method and device and electronic equipment
Technical Field
The invention belongs to the field of computer technology and in particular relates to synonymous text recognition by a computer; it provides a synonymous text prompting method, a synonymous text prompting device, electronic equipment, and a computer-readable medium.
Background
With the development of computer and internet technology, synonym information plays an indispensable role in application fields such as quality inspection systems, web search, question-and-answer systems, and knowledge graph construction. For example, a quality inspection platform looks for text that is synonymous with a keyword or phrase entered by the user, a search engine looks for text that is semantically identical or similar to the text entered by the user, and a question-and-answer platform looks for a set of questions synonymous with a new question posed by the user.
In the prior art, identifying synonymous text requires auxiliary processing of the text with tools such as word segmentation, part-of-speech analysis, and sentence template extraction to obtain core words, and whether two core words are synonymous is then determined by the edit distance between them. The edit distance is the minimum number of single-character edits needed to change one string into another, and it measures the difference between the two strings. When the edit distance between two words is smaller than or equal to a preset value, the two words are determined to be synonyms; when it is larger than the preset value, they are determined to be non-synonyms. In reality, however, non-synonym pairs with a small edit distance exist, so the recognition accuracy of synonymous text is low.
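For illustration, the edit distance described above can be sketched in a few lines of Python; the word pairs used are illustrative English stand-ins:

```python
def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b
    (the Wagner-Fischer dynamic-programming formulation)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[len(b)]

# Synonyms can be far apart by edit distance ...
print(edit_distance("big", "large"))  # 4
# ... while non-synonyms can be one edit apart, which is the
# weakness of the prior-art approach described above.
print(edit_distance("cat", "car"))    # 1
```

The second pair shows the flaw exactly: "cat" and "car" are not synonyms, yet their edit distance is minimal.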
Disclosure of Invention
Technical problem to be solved
The method and device aim to solve the technical problem in the prior art that the accuracy of synonymous text recognition is low.
(II) technical scheme
In order to solve the above technical problem, a first aspect of the present invention provides a synonymous text prompting method. Here, synonymous text refers to text having the same or a similar meaning as the input text; it includes text with the same meaning as the input text and/or text with a similar meaning, and a text may be a single word or a text composed of at least two words. The method comprises the following steps:
dividing an input text into word units;
determining a target word unit from the word units according to the segmentation condition of the input text, wherein the segmentation condition of the input text includes two cases: the input text is segmented into only one word unit, or the input text is segmented into at least two word units;
obtaining candidate words corresponding to the target word unit through a preset model to form a candidate set, wherein the candidate words are synonyms or near-synonyms of the target word unit;
ranking the candidate words in the candidate set to obtain a comprehensive ranking candidate set corresponding to the target word unit;
and prompting the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set.
According to a preferred embodiment of the present invention, the obtaining, through a preset model, of candidate words corresponding to the target word unit to form a candidate set includes:
acquiring different corpora as training sets to train a plurality of word2vec models;
and obtaining candidate words corresponding to the target word unit through each trained word2vec model to form a candidate set.
According to a preferred embodiment of the present invention, the obtaining candidate words corresponding to the target word unit by using each trained word2vec model to form a candidate set includes:
inputting the target word unit into a trained word2vec model to obtain a word vector of the target word unit output by the word2vec model;
acquiring candidate word vectors whose similarity score with respect to the word vector of the target word unit is smaller than a threshold value;
and converting the candidate word vectors into corresponding candidate words to form a candidate set.
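A minimal sketch of this retrieval step, using a hand-made toy embedding table in place of a trained word2vec model. All words, vector values, and the threshold are illustrative assumptions; following the text's later examples, the "similarity" score here is a Euclidean distance, so smaller means more similar:

```python
import math

# Toy word vectors standing in for a trained word2vec model's output;
# the words and vector values are invented for illustration.
EMBEDDINGS = {
    "exam":   [0.9, 0.1, 0.0],
    "test":   [0.8, 0.2, 0.1],
    "quiz":   [0.7, 0.3, 0.1],
    "banana": [0.0, 0.9, 0.8],
}

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def candidate_set(target, threshold):
    """Return words whose distance-based similarity score with respect
    to the target's word vector is smaller than the threshold (smaller
    distance means more similar), ordered from closest to farthest."""
    tv = EMBEDDINGS[target]
    scored = [(euclidean(tv, v), w)
              for w, v in EMBEDDINGS.items() if w != target]
    return [w for d, w in sorted(scored) if d < threshold]

print(candidate_set("exam", threshold=0.5))  # ['test', 'quiz']
```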
According to a preferred embodiment of the present invention, the sorting the candidate words in the candidate set includes:
acquiring a candidate set output by each word2vec model;
determining the total weight of the candidate words according to the preset weight of the candidate words and the arrangement positions of the candidate words in each candidate set;
and sorting the candidate words in the candidate set according to the total weight of the candidate words.
According to a preferred embodiment of the present invention, when the input text is segmented into only one word unit, that word unit is used as the target word unit.
According to a preferred embodiment of the present invention, the prompting the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set includes:
selecting target words that meet preset conditions from the comprehensive ranking candidate set according to a preset word length, a preset part of speech, and a preset word frequency;
and prompting the target words according to their order in the comprehensive ranking candidate set.
According to a preferred embodiment of the present invention, when the input text is segmented into at least two word units, one of the at least two word units is selected as the target word unit according to part of speech and/or word frequency.
According to a preferred embodiment of the present invention, the prompting the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set includes:
acquiring the top-N candidate words ranked by total weight from the comprehensive ranking candidate set;
combining the other word units segmented from the input text with each of the top-N candidate words to obtain N candidate texts;
ranking the candidate texts according to their historical occurrence frequency;
and prompting the candidate texts according to their ranking.
According to a preferred embodiment of the present invention, if the historical occurrence frequency of a candidate text is zero, the candidate texts are ranked according to the similarity between the word vector corresponding to the candidate word in each candidate text and the word vector corresponding to the target word unit.
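The steps above, together with the zero-frequency fallback, can be sketched as follows. All data is hypothetical: the segmented units, candidate words, historical frequencies, and similarity scores are invented for illustration, and using the similarity score as a tie-breaker within zero-frequency texts is one way to read the fallback rule:

```python
# Hypothetical data: the input "exam paper download" is segmented into
# ["exam paper", "download"], "exam paper" is the target word unit, and
# the top-N candidate words, historical frequencies, and similarity
# scores below are all invented for illustration.
top_n_candidates = ["test paper", "exam sheet", "question paper"]
history_freq = {"test paper download": 42, "question paper download": 7}
similarity = {"test paper": 0.95, "exam sheet": 0.90, "question paper": 0.85}

def rank_candidate_texts(other_units):
    # Splice each top-N candidate word back into the input text in place
    # of the target word unit (assumed here to be the first unit).
    texts = [(cand, " ".join([cand] + other_units))
             for cand in top_n_candidates]
    # Rank by historical occurrence frequency; when the frequency is
    # zero, the candidate word's similarity score breaks the tie, which
    # is how the zero-frequency fallback is read here.
    texts.sort(key=lambda t: (history_freq.get(t[1], 0), similarity[t[0]]),
               reverse=True)
    return [text for _, text in texts]

print(rank_candidate_texts(["download"]))
# ['test paper download', 'question paper download', 'exam sheet download']
```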
A second aspect of the present invention provides a synonymous text prompting device for prompting synonymous text having the same or a similar meaning as an input text, the device including:
a segmentation module, configured to segment the input text into word units;
a determining module, configured to determine a target word unit from the word units according to the segmentation condition of the input text, where the segmentation condition of the input text includes: the input text is segmented into only one word unit, or the input text is segmented into at least two word units;
an acquisition module, configured to acquire candidate words corresponding to the target word unit through a preset model to form a candidate set, where the candidate words are synonyms or near-synonyms of the target word unit;
a ranking module, configured to rank the candidate words in the candidate set to obtain a comprehensive ranking candidate set corresponding to the target word unit;
and a prompting module, configured to prompt the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set.
A third aspect of the present invention provides an electronic device including a processor and a memory, the memory storing a computer-executable program that, when executed by the processor, performs the above method.
A fourth aspect of the present invention provides a computer-readable medium storing a computer-executable program that, when executed, implements the above method.
(III) advantageous effects
The invention obtains the candidate set of the target word unit through a preset model and then ranks the candidate words in the candidate set, so that the candidate words are ordered according to the accuracy of synonym recognition between each candidate word and the target word unit, which improves the accuracy of synonym recognition for the target word unit. Finally, the synonymous text of the input text is determined and prompted according to the segmentation condition of the input text and the order of the candidate words corresponding to the target word unit, ensuring that synonymous texts are prompted in order of synonym recognition accuracy. The invention thus improves the synonymous text recognition rate while improving the user experience: the user can select the target synonymous text from the top-ranked synonymous texts that are prompted.
The invention trains a plurality of word2vec models by acquiring different corpora as training sets, and obtains candidate words corresponding to the target word unit through each trained word2vec model to form a candidate set. Thus, a target word unit has a plurality of candidate sets, and the candidate words in each candidate set may be the same or different. Multiple word2vec models trained on different corpora ensure the comprehensiveness of the candidate words.
The invention adopts a weighted ranking approach: the candidate set output by each word2vec model is acquired; the total weight of each candidate word is determined according to its preset weight and its positions in the candidate sets; and the candidate words are ranked according to their total weights. In this way, candidate words output by different word2vec models are ranked according to the accuracy of synonym recognition between each candidate word and the target word unit, which improves the accuracy of synonym recognition for the target word unit.
Drawings
FIG. 1 is a schematic flow chart of a synonymous text prompting method according to the present invention;
FIG. 2 is a schematic diagram illustrating a candidate set formed by candidate words corresponding to a target word unit obtained by a preset model according to the present invention;
FIG. 3 is a schematic structural diagram of a synonymous text prompt device according to the present invention;
FIG. 4 is a schematic structural diagram of an electronic device of one embodiment of the invention;
fig. 5 is a schematic diagram of a computer-readable recording medium of an embodiment of the present invention.
Detailed Description
In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, this does not exclude that one skilled in the art may implement the invention in a particular case without one or more of the above structures, properties, effects, or other features.
The flow charts in the drawings are only exemplary and do not imply that all of the contents, operations, and steps shown must be included in the scheme of the invention, nor that they must be executed in the order shown. For example, some operations/steps in the flow charts may be divided, and some may be combined or partially combined, and the execution order shown may be changed according to the actual situation without departing from the gist of the invention.
The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities; these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processing unit devices and/or microcontroller devices.
The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, so repeated descriptions of them may be omitted below. Although the terms first, second, third, etc. may be used herein to describe various elements, components, or parts, these elements, components, or parts should not be limited by these terms; the terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include all combinations of any one or more of the listed items.
To solve the above technical problem, the invention provides a synonymous text prompting method, where synonymous text refers to text having the same or a similar meaning as the input text. The method segments the input text into word units; determines a target word unit from the word units according to the segmentation condition of the input text; obtains candidate words corresponding to the target word unit through a preset model to form a candidate set; ranks the candidate words in the candidate set to obtain a comprehensive ranking candidate set corresponding to the target word unit; and prompts the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set. The segmentation condition of the input text includes two cases: the input text is segmented into only one word unit, or the input text is segmented into at least two word units; the candidate words are synonyms or near-synonyms of the target word unit. Because the candidate set is obtained through the preset model and the candidate words are then ranked, the candidate words are ordered according to the accuracy of synonym recognition between each candidate word and the target word unit, which improves the accuracy of synonym recognition for the target word unit, and the synonymous text is determined and prompted according to the segmentation condition and the order of the candidate words, ensuring that synonymous texts are prompted in order of synonym recognition accuracy. The invention thus improves the synonymous text recognition rate while improving the user experience: the user can select the target synonymous text from the top-ranked synonymous texts that are prompted.
The invention trains a plurality of word2vec models by acquiring different corpora as training sets, and obtains candidate words corresponding to the target word unit through each trained word2vec model to form a candidate set. Thus, a target word unit has a plurality of candidate sets, and the candidate words in each candidate set may be the same or different. Multiple word2vec models trained on different corpora ensure the comprehensiveness of the candidate words.
When each word2vec model obtains a candidate set for a target word unit, the target word unit is first input into the trained word2vec model to obtain the word vector of the target word unit output by the model; candidate word vectors whose similarity score with respect to that word vector is smaller than a threshold value are then acquired; and the candidate word vectors are converted into the corresponding candidate words to form a candidate set.
To rank the candidate words in the candidate sets, a weighted ranking approach is adopted: the candidate set output by each word2vec model is acquired; the total weight of each candidate word is determined according to its preset weight and its positions in the candidate sets; and the candidate words are ranked according to their total weights. In this way, candidate words output by different word2vec models are ranked according to the accuracy of synonym recognition between each candidate word and the target word unit, which improves the accuracy of synonym recognition for the target word unit.
In one segmentation case, the input text is segmented into only one word unit, and that word unit is used as the target word unit. Target words meeting the conditions are selected from the comprehensive ranking candidate set according to a preset word length, a preset part of speech, and a preset word frequency, and are then prompted according to their order in the comprehensive ranking candidate set.
In the other segmentation case, the input text is segmented into at least two word units, and one of them is selected as the target word unit according to part of speech and/or word frequency. The top-N candidate words ranked by total weight are acquired from the comprehensive ranking candidate set; the other word units segmented from the input text are combined with each of these candidate words to obtain N candidate texts; the candidate texts are ranked according to their historical occurrence frequency; and finally the candidate texts are prompted according to their ranking.
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
Fig. 1 is a schematic flow chart of a method for prompting a synonymous text according to the present invention, where the synonymous text refers to a text having the same or similar meaning as an input text, and as shown in fig. 1, the method includes the following steps:
s1, dividing the input text into word units;
The input text may be a single word, such as a keyword entered by the user, or a phrase consisting of several words. A word unit may specifically be a Chinese word, an English word, etc. The invention may adopt a word segmentation tool (such as the HanLP tokenizer or the Ansj tokenizer) to segment the input text into word units.
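As a rough illustration of what such a tool does (real tokenizers like HanLP and Ansj are far more sophisticated), a forward-maximum-matching segmenter over a toy dictionary; the dictionary entries are illustrative:

```python
# A forward-maximum-matching segmenter: at each position, take the
# longest dictionary word that matches, falling back to a single
# character. The dictionary entries are invented for illustration.
DICTIONARY = {"试卷", "下载", "期末", "期末试卷"}

def segment(text, max_len=4):
    units, i = [], 0
    while i < len(text):
        for L in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + L]
            if L == 1 or piece in DICTIONARY:
                units.append(piece)
                i += L
                break
    return units

print(segment("期末试卷下载"))  # ['期末试卷', '下载']
```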
S2, determining a target word unit from the word units according to the segmentation condition of the input text,
Specifically, if the input text is a single word in step S1, the segmentation condition of the input text is: the input text is segmented into only one word unit, and that word unit is taken as the target word unit.
If the input text is a phrase in step S1, the segmentation condition of the input text is: the input text is segmented into at least two word units, and one of them is selected as the target word unit according to part of speech and/or word frequency. Word frequency refers to how often a word unit appears in a predetermined corpus or document. When part of speech and word frequency are used together to determine the target word unit, their priorities are preset; for example, if part of speech has a higher priority than word frequency, the target word unit is selected according to part of speech, and word frequency is used only when all word units have the same part of speech. Taking selection by part of speech as an example, the parts of speech may include nouns, adjectives, verbs, and words other than these three, and the invention may preset a priority for each part of speech, for example with the priorities of nouns, adjectives, verbs, and other words increasing in that order. The word unit with the highest priority is then selected as the target word unit. For example, if the user inputs "no time", it is segmented into the two word units "no" and "time"; since the part-of-speech priority of "no" is higher than that of "time", "no" is the target word unit.
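This selection rule can be sketched as follows. The part-of-speech tags and frequencies are invented for illustration, the numeric priorities simply encode the order described above, and preferring the more frequent unit when parts of speech tie is an assumption:

```python
# Numeric priorities encoding the order described above: nouns lowest,
# then adjectives, verbs, and other words highest. The tags and
# frequencies attached to each word unit are invented for illustration.
POS_PRIORITY = {"noun": 1, "adjective": 2, "verb": 3, "other": 4}

def pick_target(units):
    """units: list of (word, pos, frequency) triples.
    Part of speech takes priority; word frequency breaks ties."""
    return max(units, key=lambda u: (POS_PRIORITY[u[1]], u[2]))[0]

# The "no time" example: "no" outranks the noun "time" by part of speech.
print(pick_target([("no", "other", 120), ("time", "noun", 300)]))  # no
# With equal parts of speech, the higher-frequency unit is assumed to win.
print(pick_target([("red", "adjective", 10), ("fast", "adjective", 50)]))  # fast
```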
S3, obtaining candidate words corresponding to the target word unit through a preset model to form a candidate set,
The candidate words are synonyms or near-synonyms of the target word unit. The invention trains a plurality of word2vec models by acquiring different corpora as training sets, and obtains candidate words corresponding to the target word unit through each trained word2vec model to form a candidate set. Thus, a target word unit has a plurality of candidate sets, and the candidate words in each candidate set may be the same or different. Multiple word2vec models trained on different corpora ensure the comprehensiveness of the candidate words.
When each word2vec model obtains a candidate set for the target word unit, the target word unit is first input into the trained word2vec model to obtain the word vector of the target word unit output by the model; candidate word vectors whose similarity score with respect to that word vector is smaller than a threshold value are then acquired; and the candidate word vectors are converted into the corresponding candidate words to form a candidate set. Here, word2vec is a group of models used to generate word vectors. After training is complete, a word2vec model can map each word to a vector, and these vectors can represent word-to-word relationships. In the invention, the similarity between word vectors can be calculated by, for example, the Euclidean distance, cosine distance, edit distance, or Hamming distance between them.
For example, as shown in fig. 2, take the case of obtaining candidate sets for a target word unit with three word2vec models. First, the three models are trained with different corpora, for example: a first word2vec model is trained with an open-source corpus, a second word2vec model is trained with a first in-house corpus (such as a juvenile-subject corpus), and a third word2vec model is trained with a second in-house corpus (such as a high-school-subject corpus). If the text input by the user is "test paper", then "test paper" is the target word unit and is input into each trained word2vec model. Each word2vec model outputs the word vector of "test paper"; the Euclidean distances between this word vector and other word vectors are calculated to obtain their similarity scores; the other word vectors whose similarity score is smaller than the threshold value are taken as candidate word vectors; and the candidate word vectors are converted into candidate words to form a candidate word set. For example, the candidate word set of "test paper" obtained by the first word2vec model is: paper, examination questions, test paper, true questions, examination, interim, early, end, and exercise questions. The candidate word set obtained by the second word2vec model is: examination questions, examination papers, examinees, examination paper reading, examination questions, composition, examination, pen test, question setting, and examination room. The candidate word set obtained by the third word2vec model is: a small section, a paper, an end-of-term test paper, an interim test paper, a monthly exam paper, a test question, and a test coupon.
S4, ranking the candidate words in the candidate set to obtain a comprehensive ranking candidate set corresponding to the target word unit;
To rank the candidate words, a weighted ranking approach is adopted: the candidate set output by each word2vec model is acquired; the total weight of each candidate word is determined according to its preset weight and its positions in the candidate sets; and the candidate words are ranked according to their total weights. In this way, candidate words output by different word2vec models are ranked according to the accuracy of synonym recognition between each candidate word and the target word unit, which improves the accuracy of synonym recognition for the target word unit. Within each candidate set, the candidate words may be arranged according to the similarity between each candidate word's vector and the target word unit's vector.
Specifically, the total weight Pi of candidate word i can be obtained by the following formula:
Pi = pi × [rank(Ai) + rank(Bi) + … + rank(Ni)]
where pi is the preset weight of candidate word i, rank(Ai) is the position of candidate word i in the candidate set output by the word2vec model numbered A, rank(Bi) is its position in the candidate set output by the word2vec model numbered B, and rank(Ni) is its position in the candidate set output by the word2vec model numbered N.
Taking the candidate word "test question" in fig. 2 as an example, assume that the preset weight of "test question" is 0.9 and that it is ranked second by the first word2vec model, first by the second word2vec model, and eighth by the third word2vec model. Its total weight is then: 0.9 × (2 + 1 + 8) = 9.9.
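The weighting formula and the worked example above can be reproduced directly. The function below is a minimal sketch; the preset weight and rank positions are the ones from the "test question" example.

```python
def total_weight(preset_weight, ranks):
    """Pi = pi * (rank(Ai) + rank(Bi) + ... + rank(Ni)):
    multiply the candidate word's preset weight by the sum of its
    positions in the candidate sets output by the word2vec models."""
    return preset_weight * sum(ranks)


# "test question": preset weight 0.9; ranked 2nd, 1st and 8th
# in the three models' candidate sets.
print(total_weight(0.9, [2, 1, 8]))  # 0.9 * (2 + 1 + 8)
```

Whether a lower or higher total weight ranks first depends on the rank convention chosen; the patent leaves this to the implementation.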
And S5, prompting the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive sorting candidate set.
Specifically, if the input text is a single word in step S1, the segmentation condition of the input text is that the input text is segmented into only one word unit, and this step includes:
S51, filtering out target words that meet the conditions from the comprehensive sorting candidate set according to a preset word length, a preset part of speech and a preset word frequency;
For example, the preset word length may be set to: greater than 1 byte and less than 4 bytes. The preset part of speech may be set to be the same as the part of speech of the target word unit. The preset word frequency refers to the frequency with which the candidate word occurs in a predetermined corpus or predetermined document, and may be set according to the training samples of the word2vec models.
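A minimal sketch of the S51 filter follows, under stated assumptions: the part-of-speech tags and corpus frequencies are hypothetical lookup tables introduced for illustration, and character count stands in for the patent's byte-length condition.

```python
def filter_candidates(candidates, pos_of, freq_of, target_pos,
                      min_len=2, max_len=10, min_freq=5):
    """Keep the candidate words whose length, part of speech and corpus
    frequency all satisfy the preset conditions; the ranking order of
    the input list is preserved."""
    return [w for w in candidates
            if min_len <= len(w) <= max_len
            and pos_of.get(w) == target_pos
            and freq_of.get(w, 0) >= min_freq]


# Hypothetical POS tags and corpus frequencies for illustration.
pos_of = {"exam": "n", "quiz": "n", "try": "v", "examinations": "n"}
freq_of = {"exam": 50, "quiz": 2, "try": 30, "examinations": 40}

# "quiz" fails the frequency test, "try" the POS test,
# "examinations" the length test; only "exam" survives.
print(filter_candidates(["exam", "quiz", "try", "examinations"],
                        pos_of, freq_of, target_pos="n"))
```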
And S52, prompting the target words according to the sequence of the target words in the comprehensive sorting candidate set.
Specifically, the target words may be sequentially displayed from front to back according to the order of the target words in the comprehensive ranking candidate set.
If the input text is a phrase in step S1, the segmentation condition of the input text is: the input text is segmented into at least two word units, and the step includes:
S501, acquiring the top-N candidate words by total weight from the comprehensive ranking candidate set;
where N can be set according to the user's needs.
S502, combining the other word units segmented from the input text with each of the top-N candidate words to obtain N candidate texts;
Taking the input text "what time" as an example, it is segmented into "what" and "time", where "what" is the target word unit and "time" is the other word unit segmented from the input text. If the top-ranked candidate words for "what" by total weight in the comprehensive ranking candidate set are: any, what, other, which, each, where, various and everything, then the candidate texts obtained are: any time, what time, other times, which times, each time, where time, various times and all times.
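Step S502's merging can be sketched as a simple substitution: replace the target word unit with each top-N candidate word while keeping the other word units in place. The word lists below are illustrative only.

```python
def candidate_texts(input_words, target_index, top_candidates):
    """Build one candidate text per top-N candidate word by swapping
    the target word unit for the candidate and rejoining the units."""
    texts = []
    for cand in top_candidates:
        words = list(input_words)       # copy the segmented input text
        words[target_index] = cand      # substitute the target word unit
        texts.append(" ".join(words))
    return texts


# "what time" segmented into ["what", "time"], target unit at index 0.
print(candidate_texts(["what", "time"], 0, ["any", "which", "every"]))
```

(For Chinese text the units would be joined without spaces; the space-join here is just for the English rendering of the example.)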
S503, sorting the candidate texts according to the historical occurrence frequency of the candidate texts;
The historical occurrence frequency refers to the frequency with which a candidate text appears in historical query documents. If the historical occurrence frequency of several candidate texts is zero, those candidate texts are ranked according to the similarity between the word vector of their candidate word and the word vector of the target word unit.
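A hedged sketch of the S503 ordering: sort by historical occurrence frequency in descending order, falling back to word-vector similarity for texts whose frequency is zero. The frequency and similarity tables are hypothetical stand-ins for the query log and the word2vec similarities.

```python
def rank_candidate_texts(texts, hist_freq, similarity):
    """Sort candidate texts by historical query frequency (descending);
    texts never seen before (frequency 0) are ordered among themselves
    by the similarity between their candidate word's vector and the
    target word unit's vector (descending)."""
    return sorted(texts, key=lambda t: (-hist_freq.get(t, 0),
                                        -similarity.get(t, 0.0)))


texts = ["which time", "any time", "every time", "what times"]
hist_freq = {"any time": 120, "what times": 7}   # unseen texts default to 0
similarity = {"which time": 0.8, "every time": 0.6}
print(rank_candidate_texts(texts, hist_freq, similarity))
```

Note the similarity key also breaks ties between texts with equal nonzero frequency; the patent only specifies the zero-frequency fallback, so this is a design choice of the sketch.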
S504, presenting the candidate texts according to the sequence of the candidate texts.
Fig. 3 is a schematic diagram of a framework of a synonymous text presentation device according to the present invention, where the synonymous text refers to a text having the same or similar meaning as an input text, and as shown in fig. 3, the device includes:
a segmentation unit 31 configured to segment an input text into word units;
a determining module 32, configured to determine a target word unit from the word units according to a segmentation condition of the input text, where the segmentation condition of the input text includes: the input text is segmented into only one word unit, and the input text is segmented into at least two word units;
the obtaining module 33 is configured to obtain candidate words corresponding to the target word unit through a preset model to form a candidate set, where the candidate words are synonyms or near-synonyms of the target word unit;
the sorting module 34 is configured to sort the candidate words in the candidate set to obtain a comprehensive sorting candidate set corresponding to the target word unit;
and the prompting module 35 is configured to prompt the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive sorting candidate set.
In a specific embodiment, the obtaining module 33 includes:
the first acquisition module is used for acquiring different corpora as a training set to train a plurality of word2vec models;
and the second acquisition module is used for acquiring candidate words corresponding to the target word unit through each trained word2vec model to form a candidate set.
Further, the second obtaining module includes:
the input module is used for inputting the target word unit into a trained word2vec model to obtain a word vector of the target word unit output by the word2vec model;
the sub-acquisition module is used for acquiring candidate word vectors whose similarity to the word vector of the target word unit is smaller than a threshold value;
and the conversion module is used for converting the candidate word vectors into corresponding candidate words to form a candidate set.
The sorting module 34 includes:
a third obtaining module, configured to obtain a candidate set output by each word2vec model;
the sub-determination module is used for determining the total weight of the candidate words according to the preset weight of the candidate words and the arrangement positions of the candidate words in each candidate set;
and the sub-ordering module is used for ordering the candidate words in the candidate set according to the total weight of the candidate words.
In one example, the input text is segmented into only one word unit, and the only one word unit is used as a target word unit. The prompt module 35 includes:
the filtering module is used for filtering out target words meeting conditions from the comprehensive sorting candidate set according to preset word length, preset word property and preset word frequency;
and the first prompting module is used for prompting the target words according to the sequence of the target words in the comprehensive sorting candidate set.
In another example, the input text is segmented into at least two word units, and one of the at least two word units is selected as a target word unit according to the part of speech and/or the word frequency. The prompt module 35 includes:
a fourth obtaining module, configured to obtain the top-N candidate words by total weight from the comprehensive ranking candidate set;
the merging module is used for combining the other word units segmented from the input text with each of the top-N candidate words to obtain N candidate texts;
the first sequencing module is used for sequencing the candidate texts according to the historical occurrence frequency of the candidate texts; if the historical occurrence frequency of the candidate texts is zero, the first ordering module orders the candidate texts according to the similarity between word vectors corresponding to candidate words in the candidate texts and word vectors corresponding to the target word unit.
And the second prompting module is used for prompting the candidate texts according to the sequence of the candidate texts.
Those skilled in the art will appreciate that the modules in the above apparatus embodiment may be distributed in the apparatus as described, or may, with corresponding changes, be located in one or more apparatuses different from the above embodiment. The modules of the above embodiment may be combined into one module, or further split into a plurality of sub-modules.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, where the electronic device includes a processor and a memory, where the memory is used to store a computer-executable program, and when the computer program is executed by the processor, the processor executes a synonymous text prompting method.
As shown in fig. 4, the electronic device is in the form of a general-purpose computing device. There may be one or more processors, and they may work together. Distributed processing is not excluded: the processors may be distributed over different physical devices. The electronic device of the present invention is not limited to a single entity and may be a combination of a plurality of physical devices.
The memory stores a computer executable program, typically machine readable code. The computer readable program may be executed by the processor to enable an electronic device to perform the method of the invention, or at least some of the steps of the method.
The memory may include volatile memory, such as Random Access Memory (RAM) and/or cache memory, and may also be non-volatile memory, such as read-only memory (ROM).
Optionally, in this embodiment, the electronic device further includes an I/O interface for exchanging data between the electronic device and an external device. The electronic device may also include a bus representing one or more of several types of bus structures, including a memory-unit bus or memory-unit controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
It should be understood that the electronic device shown in fig. 4 is only one example of the present invention, and elements or components not shown in the above example may be further included in the electronic device of the present invention. For example, some electronic devices further include a display unit such as a display screen, and some electronic devices further include a human-computer interaction element such as a button, a keyboard, and the like. Electronic devices are considered to be covered by the present invention as long as the electronic devices are capable of executing a computer-readable program in a memory to implement the method of the present invention or at least a part of the steps of the method.
Fig. 5 is a schematic diagram of a computer-readable recording medium of an embodiment of the present invention. As shown in fig. 5, the computer-readable recording medium stores a computer-executable program which, when executed, implements the synonymous text prompting method of the present invention described above. The computer-readable medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
From the above description of the embodiments, those skilled in the art will readily appreciate that the present invention can be implemented by hardware capable of executing a specific computer program, such as the system of the present invention and the electronic processing units, servers, clients, mobile phones, control units, processors, etc. included in the system; it can also be implemented by a device including at least a part of the above system or components. The invention can also be implemented by computer software executing the method of the invention, for example by control software executed by a microprocessor, an electronic control unit, a client or a server of the prompting device. The computer software for executing the method of the present invention is not limited to execution by one specific hardware entity and may also be implemented in a distributed manner by unspecified hardware entities. The software product may be stored in a computer-readable storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or stored in a distributed manner on a network, as long as it enables an electronic device to execute the method according to the present invention.
While the foregoing embodiments have described the objects, technical solutions and advantages of the present invention in further detail, it should be understood that the present invention is not inherently related to any particular computer, virtual machine or electronic device, and various general-purpose machines may be used to implement it. The invention is not limited to the specific embodiments described; all modifications, changes and equivalents that come within the spirit and scope of the invention are intended to be covered.

Claims (10)

1. A method for prompting synonymous texts is characterized by comprising the following steps:
dividing an input text into word units;
determining a target word unit from the word units according to the segmentation condition of the input text, wherein the segmentation condition of the input text comprises the following steps: the input text is segmented into only one word unit, and the input text is segmented into at least two word units;
obtaining candidate words corresponding to the target word unit through a preset model to form a candidate set, wherein the candidate words are synonyms or near-synonyms of the target word unit;
sorting the candidate words in the candidate set to obtain a comprehensive sorting candidate set corresponding to the target word unit;
and prompting the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive sorting candidate set.
2. The method for prompting synonymous text according to claim 1, wherein the obtaining of the candidate word composition candidate set corresponding to the target word unit through a preset model comprises:
acquiring different corpora as a training set to train a plurality of word2vec models;
and obtaining candidate words corresponding to the target word unit through each trained word2vec model to form a candidate set.
3. The method for prompting synonymous text according to claim 1 or 2, wherein the obtaining of the candidate words corresponding to the target word unit by the trained word2vec model to form a candidate set comprises:
inputting the target word unit into a trained word2vec model to obtain a word vector of the target word unit output by the word2vec model;
acquiring candidate word vectors whose similarity to the word vector of the target word unit is smaller than a threshold value;
and converting the candidate word vectors into corresponding candidate words to form a candidate set.
4. The method for hinting synonymous text according to any one of claims 1-3, wherein the ranking the candidate words in the candidate set comprises:
acquiring a candidate set output by each word2vec model;
determining the total weight of the candidate words according to the preset weight of the candidate words and the arrangement positions of the candidate words in each candidate set;
sorting the candidate words in the candidate set according to the total weight of the candidate words; optionally, the input text is segmented into only one word unit, and the only one word unit is used as a target word unit.
5. The method for prompting synonymous text according to any one of claims 1-4, wherein the prompting of the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set comprises:
filtering out target words meeting conditions from the comprehensive ordering candidate set according to preset word length, preset part of speech and preset word frequency;
and prompting the target words according to the sequence of the target words in the comprehensive ranking candidate set.
6. The method according to claim 4, wherein the input text is segmented into at least two word units, and one of the at least two word units is selected as the target word unit according to part of speech and/or word frequency.
7. The method for prompting synonymous text according to claim 6, wherein the prompting of the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive ranking candidate set comprises:
acquiring the top-N candidate words by total weight from the comprehensive sorting candidate set;
merging the other word units segmented from the input text with each of the top-N candidate words to obtain N candidate texts;
sorting the candidate texts according to the historical occurrence frequency of the candidate texts;
and optionally prompting the candidate texts according to the ordering of the candidate texts, and if the historical occurrence frequency of the candidate texts is zero, ordering the candidate texts according to the similarity between the word vector corresponding to the candidate word in the candidate texts and the word vector corresponding to the target word unit.
8. A synonymous text presentation device characterized in that the synonymous text refers to a text having the same or similar meaning as an input text, the device comprising:
the segmentation unit is used for segmenting the input text into word units;
a determining module, configured to determine a target word unit from the word units according to a segmentation condition of the input text, where the segmentation condition of the input text includes: the input text is segmented into only one word unit, and the input text is segmented into at least two word units;
the acquisition module is used for acquiring candidate words corresponding to the target word unit through a preset model to form a candidate set, wherein the candidate words are synonyms or near-synonyms of the target word unit;
the sorting module is used for sorting the candidate words in the candidate set to obtain a comprehensive sorting candidate set corresponding to the target word unit;
and the prompt module is used for prompting the synonymous text of the input text according to the segmentation condition of the input text and the comprehensive sorting candidate set.
9. An electronic device comprising a processor and a memory, the memory for storing a computer-executable program, characterized in that:
the computer program, when executed by the processor, performs the method of any one of claims 1-7.
10. A computer-readable medium storing a computer-executable program, wherein the computer-executable program, when executed, implements the method of any of claims 1-7.
CN202011539680.6A 2020-12-23 2020-12-23 Synonymous text prompting method and device and electronic equipment Pending CN112559711A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011539680.6A CN112559711A (en) 2020-12-23 2020-12-23 Synonymous text prompting method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112559711A true CN112559711A (en) 2021-03-26

Family

ID=75032297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011539680.6A Pending CN112559711A (en) 2020-12-23 2020-12-23 Synonymous text prompting method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112559711A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113971216A (en) * 2021-10-22 2022-01-25 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and memory
CN114416213A (en) * 2022-03-29 2022-04-29 北京沃丰时代数据科技有限公司 Word vector file loading method and device and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294396A (en) * 2015-05-20 2017-01-04 北京大学 Keyword expansion method and keyword expansion system
CN107357776A (en) * 2017-06-16 2017-11-17 北京奇艺世纪科技有限公司 A kind of related term method for digging and device
CN107451126A (en) * 2017-08-21 2017-12-08 广州多益网络股份有限公司 A kind of near synonym screening technique and system
CN108628821A (en) * 2017-03-21 2018-10-09 腾讯科技(深圳)有限公司 A kind of vocabulary mining method and device
CN109388803A (en) * 2018-10-12 2019-02-26 北京搜狐新动力信息技术有限公司 Chinese word cutting method and system
CN110276010A (en) * 2019-06-24 2019-09-24 腾讯科技(深圳)有限公司 A kind of weight model training method and relevant apparatus
CN111274353A (en) * 2020-01-14 2020-06-12 百度在线网络技术(北京)有限公司 Text word segmentation method, device, equipment and medium
KR20200123544A (en) * 2019-04-22 2020-10-30 넷마블 주식회사 Mehtod for extracting synonyms

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
_KEVIN_DUAN: "Training a synonym model with Word2Vec" (Word2Vec训练同义词模型), Retrieved from the Internet <URL:www.bing.com> *
ZHANG Le: "Application and implementation of word-vector semantic expansion technology in a library intelligent consultation system", Library and Information Service (图书情报工作), vol. 64, no. 18, 20 September 2020 (2020-09-20), pages 126-136 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination