CN115080695A

CN115080695A - Chinese similar character retrieval method and device based on knowledge graph and electronic equipment

Info

Publication number: CN115080695A
Application number: CN202210752941.5A
Authority: CN
Inventors: 贾伟; 倪江柳伊; 许春媛; 董传磊; 屈迪; 张安洁; 陈梓健; 汪利飞
Original assignee: Rajax Network Technology Co Ltd
Current assignee: Rajax Network Technology Co Ltd
Priority date: 2022-06-29
Filing date: 2022-06-29
Publication date: 2022-09-20

Abstract

The application provides a knowledge-graph-based method and device for retrieving similar Chinese characters, an electronic device, and a storage medium, and relates to the technical field of the internet. The method comprises: obtaining a character to be retrieved, and searching a pre-constructed knowledge graph for the Chinese similar character data pairs containing a Chinese character that matches the character to be retrieved, wherein the knowledge graph comprises data pairs representing the similarity relation between every two Chinese characters; and obtaining the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found. In the embodiments of the application, a knowledge graph of Chinese characters is constructed that comprises structured data pairs representing the similarity relation between every two Chinese characters, and retrieving similar characters through the knowledge graph strengthens security prevention and control for Chinese content.

Description

Chinese similar character retrieval method and device based on knowledge graph and electronic equipment

Technical Field

The present application relates to the field of internet technologies, and in particular to a knowledge-graph-based method and device for retrieving similar Chinese characters, an electronic device, and a storage medium.

Background

As more and more content is presented on the Chinese internet, users pay increasing attention to internet platforms. Chinese content can be expressed in many forms, so the same content may be written in a number of different ways, and the traditional prevention and control measures that internet platforms use against black and gray market activity and the like are increasingly inadequate. In dealing with such activity, how to retrieve similar characters efficiently and accurately, so as to complete and expand Chinese content and strengthen security prevention and control for Chinese content, has become a technical problem that urgently needs to be solved.

Disclosure of Invention

In view of the above problems, the present application provides a knowledge-graph-based method and device for retrieving similar Chinese characters, an electronic device, and a storage medium that overcome or at least partially solve the above problems. The technical solution is as follows:

In a first aspect, a method for retrieving similar Chinese characters based on a knowledge graph is provided, which includes:

obtaining a character to be retrieved, and searching a pre-constructed knowledge graph for the Chinese similar character data pairs containing a Chinese character that matches the character to be retrieved, wherein the knowledge graph comprises data pairs representing the similarity relation between every two Chinese characters;

and obtaining the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found.

In one possible implementation, the knowledge-graph is constructed by:

acquiring a plurality of Chinese characters, and constructing a Chinese character characteristic index aiming at each Chinese character in the plurality of Chinese characters;

determining similar characters in the plurality of Chinese characters according to the Chinese character characteristic indexes of the Chinese characters;

generating a data pair representing the similarity relation between every two Chinese characters according to the determined similar characters in the Chinese characters;

and constructing a knowledge graph by using the data pairs representing the similarity between every two Chinese characters as knowledge items.

In one possible implementation, the Chinese character characteristic index includes one or more of a pinyin index, a structure index, a character splitting index, a four-corner code index, a five-stroke index, a stroke sequence index, and a semantic index.

In a possible implementation, determining similar characters among the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character includes:

based on the pinyin index of each Chinese character, after treating front and back nasal finals, and flat-tongue and curled-tongue (retroflex) initials, as equivalent, determining Chinese characters with the same pinyin to be phonetically similar characters.

determining a radical and a remainder of each Chinese character based on the structure index and the character splitting index of each Chinese character, and determining Chinese characters with the same remainder among the plurality of Chinese characters to be visually similar characters.

determining, based on the stroke sequence index of each Chinese character, the parts with the same position and the same stroke sequence as a common string;

and determining Chinese characters for which the length of the longest continuous common string, as a proportion of the total length of the stroke sequence, is greater than or equal to a first preset proportion threshold to be visually similar characters.

determining, based on the four-corner code index or the five-stroke index of each Chinese character, Chinese characters for which the proportion of identical code parts is greater than or equal to a second preset proportion threshold to be visually similar characters.

inputting the Chinese character characteristic index of each Chinese character, as features, into a pre-constructed relation prediction model between Chinese character nodes;

and predicting the relations between the Chinese characters by using the relation prediction model between Chinese character nodes, so as to determine the visually similar characters among the plurality of Chinese characters.

In a possible implementation, obtaining the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found includes:

extracting the two Chinese characters and the similarity relation in each Chinese similar character data pair found;

and obtaining the Chinese similar characters corresponding to the character to be retrieved according to the extracted pairs of Chinese characters and the similarity relations.

In a second aspect, a device for retrieving similar Chinese characters based on a knowledge graph is provided, which includes:

a first obtaining module, configured to obtain a character to be retrieved;

a searching module, configured to search a pre-constructed knowledge graph for the Chinese similar character data pairs containing a Chinese character that matches the character to be retrieved, wherein the knowledge graph comprises data pairs representing the similarity relation between every two Chinese characters;

and a second obtaining module, configured to obtain the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found.

In one possible implementation, the apparatus further includes:

the building module is used for obtaining a plurality of Chinese characters and building a Chinese character characteristic index aiming at each Chinese character in the plurality of Chinese characters;

In one possible implementation, the building module is further configured to:

predict the relations between the Chinese characters by using the relation prediction model between Chinese character nodes, so as to determine the visually similar characters among the plurality of Chinese characters.

In a possible implementation manner, the second obtaining module is further configured to:

In a third aspect, an electronic device is provided, which includes a processor and a memory, wherein the memory stores a computer program, and the processor is configured to run the computer program to perform the knowledge-graph-based method for retrieving similar Chinese characters according to any one of the above.

In a fourth aspect, a storage medium is provided, in which a computer program is stored, wherein the computer program is configured to perform, when run, the knowledge-graph-based method for retrieving similar Chinese characters according to any one of the above.

By means of the above technical solution, the knowledge-graph-based method and device for retrieving similar Chinese characters, the electronic device, and the storage medium provided by the embodiments of the application obtain a character to be retrieved; search a pre-constructed knowledge graph for the Chinese similar character data pairs containing a Chinese character that matches the character to be retrieved, wherein the knowledge graph comprises data pairs representing the similarity relation between every two Chinese characters; and obtain the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found. In the embodiments of the application, a knowledge graph of Chinese characters is constructed that comprises structured data pairs representing the similarity relation between every two Chinese characters, and retrieving similar characters through the knowledge graph strengthens security prevention and control for Chinese content.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.

FIG. 1 is a flowchart illustrating a method for retrieving similar Chinese characters based on a knowledge-graph according to an embodiment of the present application;

FIG. 2 is a schematic diagram of Chinese character property indices, node triples, and relationship triples of a knowledge-graph underlying data store provided by another embodiment of the present application;

FIG. 3 shows a graphical illustration of a knowledge-graph provided by another embodiment of the present application;

FIG. 4 is a block diagram of a knowledge-graph based Chinese similar word retrieval apparatus according to an embodiment of the present application;

FIG. 5 is a block diagram of a knowledge-graph based Chinese similar words retrieval device according to another embodiment of the present application;

FIG. 6 shows a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the accompanying drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that such uses are interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the term "include" and its variants are to be read as open-ended terms meaning "including, but not limited to".

Before describing embodiments of the present application in detail, the following technical terms are introduced.

Knowledge graph: also known as knowledge domain visualization or knowledge domain mapping, a family of graphs that display the development process and structural relationships of knowledge, using visualization technology to describe knowledge resources and their carriers and to mine, analyze, construct, draw, and display knowledge and the interrelations among them. The knowledge graph is a modern theory that combines the theories and methods of applied mathematics, graphics, information visualization, information science, and other disciplines with methods such as bibliometric citation analysis and co-occurrence analysis, and uses visualized graphs to vividly display the core structure, development history, frontier fields, and overall knowledge framework of a discipline, thereby achieving multi-disciplinary integration.

Triple: the data structure used for the underlying data storage of the knowledge graph. Triples can be divided by type into node triples and relation triples: node triples store the attribute information of nodes, and relation triples represent the relations between nodes.

In the embodiments of the application, a node in a node triple or a relation triple may be a Chinese character. A node triple may store a Chinese character, an attribute of that character, and the value of the attribute. A relation triple may store two Chinese characters and the similarity relation between them.
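The two triple types can be sketched as plain records. This is a minimal illustration only: the class names, field names, relation label, and the helper `attributes_of` are assumptions introduced here, not the storage schema of the application.

```python
from typing import NamedTuple


class NodeTriple(NamedTuple):
    """(character, attribute, value): attribute information of one node."""
    char: str
    attribute: str
    value: str


class RelationTriple(NamedTuple):
    """(character, relation, character): similarity relation between two nodes."""
    head: str
    relation: str
    tail: str


# Illustrative data; the attribute names and relation label are hypothetical.
node_triples = [
    NodeTriple("坏", "pinyin", "huai4"),
    NodeTriple("坏", "wubi", "FGIY"),
]
relation_triples = [
    RelationTriple("坏", "visually_similar", "环"),
]


def attributes_of(char, triples):
    """Collect all attribute values stored for one character."""
    return {t.attribute: t.value for t in triples if t.char == char}
```

Keeping node attributes and relations in separate triple lists mirrors the split described above, so either list can be extended without touching the other.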

Chinese language features: Hanyu Pinyin, strokes, four-corner codes, five-stroke (Wubi) codes, radicals and remainders, structure, composition, and the like.

Chinese character characteristic index: an index constructed from the Chinese language features, which may include one or more of a pinyin index, a structure index, a character splitting index, a four-corner code index, a five-stroke index, a stroke sequence index, and a semantic index.

Relation prediction: in a knowledge graph, predicting the relation between nodes, that is, between one Chinese character and another. Common methods include GCN (Graph Convolutional Network), CNN (Convolutional Neural Network), and the like.

To solve the above technical problem, an embodiment of the application provides a method for retrieving similar Chinese characters based on a knowledge graph. As shown in FIG. 1, the method may include the following steps S101 and S102:

Step S101: obtain a character to be retrieved, and search a pre-constructed knowledge graph for the Chinese similar character data pairs containing a Chinese character that matches the character to be retrieved, wherein the knowledge graph comprises data pairs representing the similarity relation between every two Chinese characters.

In this step, the character to be retrieved may be obtained from text, a picture, audio or video, or the like. Taking text as an example, one or more characters to be retrieved may be obtained from the text, and for each of them the pre-constructed knowledge graph is searched for the Chinese similar character data pairs containing the matching Chinese character.

The knowledge graph may contain node triples and relation triples. A node triple may store a Chinese character, an attribute of the character, and the value of the attribute; a relation triple may store two Chinese characters and the similarity relation between them, and is thus a data pair representing the similarity relation between two Chinese characters.

Step S102: obtain the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found.
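For the text case mentioned in step S101, obtaining the characters to be retrieved could look like the sketch below. It assumes that a "Chinese character" means a code point in the CJK Unified Ideographs block (U+4E00 to U+9FFF) and that duplicates are dropped; both choices, and the function name, are assumptions made for illustration.

```python
def characters_to_retrieve(text):
    """Return the distinct Chinese characters of a text, in order of first use.

    Assumption: only the basic CJK Unified Ideographs block is considered.
    """
    seen = []
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff" and ch not in seen:
            seen.append(ch)
    return seen
```

Each returned character would then be looked up in the knowledge graph separately.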

In this embodiment, a character to be retrieved is obtained, and a pre-constructed knowledge graph is searched for the Chinese similar character data pairs containing a Chinese character that matches it, wherein the knowledge graph comprises data pairs representing the similarity relation between every two Chinese characters; the Chinese similar characters corresponding to the character to be retrieved are then obtained by using the data pairs found. By constructing a knowledge graph of Chinese characters that comprises structured data pairs representing the similarity relation between every two Chinese characters, and retrieving similar characters through it, the embodiment completes and expands Chinese content and thereby strengthens security prevention and control for Chinese content.

An embodiment of the application provides a possible implementation in which the knowledge graph may be constructed through the following steps A1 to A4:

Step A1: obtain a plurality of Chinese characters, and construct a Chinese character characteristic index for each of them.

In this step, the Chinese character characteristic index may be constructed from Chinese language features, specifically Hanyu Pinyin, strokes, four-corner codes, five-stroke (Wubi) codes, radicals and remainders, structure, composition, and the like. The index constructed here may include a pinyin index, a structure index, a character splitting index, a four-corner code index, a five-stroke index, a stroke sequence index, a semantic index, and so on, such as the Chinese character characteristic indexes in the underlying data storage of the knowledge graph shown in FIG. 2.

For example, for the character 坏 ("bad"):

Pinyin index: 坏 -> huai4

Structure index: 坏 -> left-right structure

Character splitting index: 坏 -> 土 ("earth"), 不 ("not")

Four-corner code index: 坏 -> 81790

Five-stroke index: 坏 -> FGIY

Stroke sequence index: 坏 -> horizontal, vertical, rising, horizontal, left-falling, vertical, dot

It should be noted that the above examples are merely illustrative and do not limit the embodiments of the present application.
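One way to hold these indexes together is a single record per character that can be flattened into node triples. The sketch below is illustrative: the class `CharIndex`, its field names, and `to_node_triples` are hypothetical, and the values are simply the example values listed above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CharIndex:
    """Characteristic indexes of one Chinese character (hypothetical layout)."""
    char: str
    pinyin: str
    structure: str
    parts: tuple        # character splitting index: (radical, remainder)
    four_corner: str
    wubi: str
    strokes: tuple      # stroke sequence index, in writing order


HUAI = CharIndex(
    char="坏",
    pinyin="huai4",
    structure="left-right",
    parts=("土", "不"),
    four_corner="81790",
    wubi="FGIY",
    strokes=("horizontal", "vertical", "rising", "horizontal",
             "left-falling", "vertical", "dot"),
)


def to_node_triples(index):
    """Flatten a record into (character, attribute, value) node triples."""
    return [(index.char, name, value)
            for name, value in asdict(index).items()
            if name != "char"]
```

A semantic index is omitted here because the text gives no example value for it.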

Step A2: determine the similar characters among the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character.

Step A3: generate data pairs representing the similarity relation between every two Chinese characters according to the similar characters determined among the plurality of Chinese characters.

As shown in FIG. 2, the knowledge graph may include node triples and relation triples. A node triple may store a Chinese character, an attribute of the character, and the value of the attribute; a relation triple may store two Chinese characters and the similarity relation between them, and is thus a data pair representing the similarity relation between two Chinese characters. It should be noted that the example in FIG. 2 is only illustrative and does not limit the embodiments of the application.

Step A4: construct the knowledge graph by using the data pairs representing the similarity relation between every two Chinese characters as knowledge items.
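Step A4 can be sketched as loading the data pairs into an adjacency structure. The sketch assumes that similarity is symmetric, so each pair is stored in both directions; that assumption, the relation labels, and the function name `build_graph` are not taken from the application.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build {character: {neighbor: {relations}}} from (a, relation, b) pairs.

    Assumption: similarity is symmetric, so both directions are stored.
    """
    graph = defaultdict(dict)
    for a, relation, b in pairs:
        graph[a].setdefault(b, set()).add(relation)
        graph[b].setdefault(a, set()).add(relation)
    return graph


# Illustrative data pairs (relation labels are hypothetical).
pairs = [
    ("坏", "visual", "环"),
    ("坏", "visual", "怀"),
    ("坏", "phonetic", "怀"),
]
graph = build_graph(pairs)
```

Storing a set of relations per neighbor lets one pair of characters carry more than one similarity relation.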

Based on the Chinese character characteristic indexes, node triples, and relation triples in the underlying data storage described above, the constructed knowledge graph may be as shown in FIG. 3. In each of the following pairs, both characters show their own node triple information (characters are given by their English glosses): "bad" (坏) and "blank" are similar characters; "bad" and "plutonium" are similar characters; "bad" and "nostalgic" are both visually similar and phonetically similar characters; "bad" and "ring" are similar characters; "bad" and "projection" are similar characters; and a further pair involving "bad", whose second character is not legible in this text, is likewise marked as similar characters.

In FIG. 3, some pairs of nodes in the knowledge graph are joined by a question mark. For example, where there is a question mark between "projection" and "embryo", whether the two have a similarity relation can be determined according to their Chinese character characteristic indexes; likewise, where there is a question mark on an edge involving "ring", whether the two characters have a similarity relation can be determined according to their Chinese character characteristic indexes. It should be noted that the example in FIG. 3 is only illustrative and does not limit the embodiments of the application.

With the knowledge graph of FIG. 3, the relations between Chinese characters can readily be determined manually, but with limited manpower it is difficult to exhaust the relations among all Chinese characters, so inference is needed. The embodiments of the application provide inference schemes based on rules and on models. The rule-based scheme considers, for example, that characters with the same pinyin are phonetically similar and that characters with the same remainder are visually similar. The model-based scheme converts the Chinese character characteristic indexes into features and then performs algorithmic inference; common approaches include GCN, CNN, and the like. Both are described in detail below.

An embodiment of the application provides a possible implementation in which step A2, determining similar characters among the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character, may specifically be: based on the pinyin index of each Chinese character, after treating front and back nasal finals, and flat-tongue and curled-tongue (retroflex) initials, as equivalent, determining Chinese characters with the same pinyin to be phonetically similar characters. For example, "bad" (huai4) and a character read huai2 are phonetically similar characters, as are two characters both read pi1 (glossed "blank" and "Brassica"), and so on.
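The pinyin rule can be sketched as a normalization step followed by grouping. The sketch makes several assumptions that the text does not state: pinyin is written with a trailing tone digit, tones are ignored (suggested by the huai4/huai2 example), every final "ng" is folded to "n", each character has a single reading, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations


def normalize_pinyin(pinyin, ignore_tone=True):
    """Fold retroflex initials and back nasal finals onto their counterparts."""
    p = pinyin.lower()
    if ignore_tone:
        p = re.sub(r"\d$", "", p)            # drop a trailing tone digit
    p = re.sub(r"^([zcs])h", r"\1", p)       # zh/ch/sh -> z/c/s
    p = re.sub(r"ng(?=\d?$)", "n", p)        # final ng -> n (simplification)
    return p


def phonetic_pairs(pinyin_index):
    """Pair up characters whose normalized pinyin is identical."""
    groups = defaultdict(list)
    for char, pinyin in pinyin_index.items():
        groups[normalize_pinyin(pinyin)].append(char)
    pairs = []
    for chars in groups.values():
        pairs.extend(combinations(sorted(chars), 2))
    return pairs


example = {"坏": "huai4", "怀": "huai2", "环": "huan2"}
```

With `ignore_tone=False` the tone digit is kept, which gives a stricter notion of "same pinyin".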

An embodiment of the application provides a possible implementation in which step A2, determining similar characters among the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character, may specifically be: determining a radical and a remainder of each Chinese character based on the structure index and the character splitting index of each Chinese character, and determining Chinese characters with the same remainder among the plurality of Chinese characters to be visually similar characters. For example, "bad" and "carry" both have the remainder 不 ("not"), so the two are visually similar characters; the same holds for "bad" and other characters whose remainder is 不, and so on.
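The remainder rule amounts to grouping characters by the non-radical part of their split. In this sketch the split index maps each character to a (radical, remainder) tuple; that layout, the sample entries, and the function name are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def same_remainder_pairs(split_index):
    """Group characters by remainder and pair up those sharing one.

    split_index: {character: (radical, remainder)} (hypothetical layout).
    """
    by_remainder = defaultdict(list)
    for char, (_radical, remainder) in split_index.items():
        by_remainder[remainder].append(char)
    return {
        remainder: list(combinations(sorted(chars), 2))
        for remainder, chars in by_remainder.items()
        if len(chars) > 1
    }


example = {
    "坏": ("土", "不"),
    "环": ("王", "不"),
    "怀": ("忄", "不"),
    "地": ("土", "也"),
}
```

Characters that share only the radical, such as 坏 and 地 here, are not paired by this rule.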

An embodiment of the application provides a possible implementation in which step A2, determining similar characters among the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character, may specifically be: based on the stroke sequence index of each Chinese character, determining the parts with the same position and the same stroke sequence as a common string; and determining Chinese characters for which the length of the longest continuous common string, as a proportion of the total length of the stroke sequence, is greater than or equal to a first preset proportion threshold to be visually similar characters. For example, "children" and "writings" are visually similar characters, as are two characters both glossed "wins", and "bad" and "blank", and so on.
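A minimal sketch of the stroke-sequence rule follows. It compares the two sequences position by position, takes the longest run of identical strokes, and divides by the length of the longer sequence. Using the longer sequence as the "total length", the threshold value 0.6, and the function names are assumptions; the text leaves them open.

```python
def longest_aligned_run(strokes_a, strokes_b):
    """Length of the longest run of identical strokes at identical positions."""
    best = run = 0
    for x, y in zip(strokes_a, strokes_b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


def stroke_similar(strokes_a, strokes_b, threshold=0.6):
    """True if the longest common run covers at least `threshold` of the strokes.

    Assumption: the ratio is taken against the longer of the two sequences.
    """
    if not strokes_a or not strokes_b:
        return False
    total = max(len(strokes_a), len(strokes_b))
    return longest_aligned_run(strokes_a, strokes_b) / total >= threshold


# 千 and 干 differ only in the first stroke.
QIAN = ("left-falling", "horizontal", "vertical")
GAN = ("horizontal", "horizontal", "vertical")
```

Here `stroke_similar(QIAN, GAN)` holds because two of the three strokes form a common run (2/3 is above the assumed threshold of 0.6).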

An embodiment of the application provides a possible implementation in which step A2, determining similar characters among the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character, may specifically be: based on the four-corner code index or the five-stroke index of each Chinese character, determining Chinese characters for which the proportion of identical code parts is greater than or equal to a second preset proportion threshold to be visually similar characters. For example, "wind" and "phoenix" are visually similar characters, and so on.
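The code-based rule can be sketched as a position-wise comparison of two code strings. Interpreting "identical code parts" as identical symbols at identical positions, dividing by the longer code, the threshold 0.6, and the function names are all assumptions. The codes "MQI" and "MCI" are illustrative values for a "wind"/"phoenix"-style pair and are not taken from the source.

```python
def code_match_ratio(code_a, code_b):
    """Share of positions at which two codes carry the same symbol."""
    if not code_a or not code_b:
        return 0.0
    same = sum(1 for x, y in zip(code_a, code_b) if x == y)
    return same / max(len(code_a), len(code_b))


def code_similar(code_a, code_b, threshold=0.6):
    """True if the matching share reaches the (assumed) second threshold."""
    return code_match_ratio(code_a, code_b) >= threshold
```

The same two functions apply unchanged to four-corner codes, since those are also fixed-position strings.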

An embodiment of the application provides a possible implementation in which step A2, determining similar characters among the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character, may, in the model-based inference scheme, specifically include the following steps B1 and B2:

Step B1: input the Chinese character characteristic index of each Chinese character, as features, into a pre-constructed relation prediction model between Chinese character nodes.

Step B2: predict the relations between the Chinese characters by using the relation prediction model between Chinese character nodes, and determine the visually similar characters among the plurality of Chinese characters.

In steps B1 and B2, the relation prediction model between Chinese character nodes may be pre-constructed with an algorithm such as GCN or CNN. Using the rule-based scheme described above, preliminary relations are constructed from identical pinyin, identical remainders (for example "bad" and "wye"), and similar stroke sequences (for example "thousand" and "stem"), yielding a batch of visually similar and phonetically similar character pairs as initial samples. The pre-constructed relation prediction model is trained on the initial samples to obtain a trained relation prediction model between Chinese character nodes. After the characteristics of the Chinese characters are expressed as features, the trained model predicts the relations between Chinese character nodes, thereby completing the graph network of the knowledge graph.
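To make the model-based scheme more concrete, the sketch below shows the forward pass of a single graph convolution layer in plain Python, followed by a dot-product score for a candidate pair of nodes. It is not the model of the application: the layer form (symmetric normalization with self-loops, ReLU), the dot-product decoder, the fixed identity weights, and the toy features are all assumptions, and training on the rule-derived initial samples is not shown.

```python
import math


def normalized_adjacency(n, edges):
    """D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two nested lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(a_norm, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


def edge_score(embeddings, i, j):
    """Sigmoid of the dot product of two node embeddings (assumed decoder)."""
    dot = sum(p * q for p, q in zip(embeddings[i], embeddings[j]))
    return 1.0 / (1.0 + math.exp(-dot))


# Toy input: three character nodes with 2-dimensional features; nodes 0 and 1
# are linked by a rule-derived similarity pair. Weights are an untrained identity.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
a_norm = normalized_adjacency(3, [(0, 1)])
embeddings = gcn_layer(a_norm, features, weights)
```

In this toy setting the linked nodes 0 and 1 receive a higher score than the unrelated nodes 0 and 2; a real model would learn the weights from the initial samples rather than fix them.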

This embodiment draws comprehensively on multiple language features such as pinyin, radicals and remainders, four-corner codes, five-stroke codes, and stroke sequences, and on that basis uses a knowledge graph to store and retrieve similar characters, thereby completing and expanding Chinese content and strengthening security prevention and control for Chinese content.

An embodiment of the application provides a possible implementation in which step S102, obtaining the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found, may specifically be: extracting the two Chinese characters and the similarity relation in each Chinese similar character data pair found; and obtaining the Chinese similar characters corresponding to the character to be retrieved according to the extracted pairs of Chinese characters and similarity relations.

For example, if the character to be retrieved is "hua", steps S101 and S102 may yield Chinese similar characters glossed as "changed", "hua", "swoosh", "bright-bright", "birch", "light", "flower", "corrupt", "goods", "ploughshare", "boot", "Wei", "rail", and the like.
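Steps S101 and S102 can be sketched over a list of relation triples: first find every data pair that contains the character, then take the other character of each pair together with its relation. The triple layout, relation labels, sample data, and function names are assumptions for illustration.

```python
def find_pairs(char, relation_triples):
    """Step S101 (sketch): all data pairs in which the character appears."""
    return [t for t in relation_triples if char in (t[0], t[2])]


def similar_characters(char, relation_triples):
    """Step S102 (sketch): map each similar character to its relations."""
    result = {}
    for head, relation, tail in find_pairs(char, relation_triples):
        other = tail if head == char else head
        result.setdefault(other, set()).add(relation)
    return result


# Illustrative relation triples (labels are hypothetical).
triples = [
    ("坏", "visually_similar", "环"),
    ("怀", "visually_similar", "坏"),
    ("坏", "phonetically_similar", "怀"),
    ("千", "visually_similar", "干"),
]
```

Because the character is matched in either position of a pair, each pair needs to be stored only once.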

It should be noted that, in practical applications, all the possible embodiments described above may be combined in a combined manner at will to form possible embodiments of the present application, and details are not described here again.

Based on the same inventive concept as the knowledge-graph-based method for retrieving similar Chinese characters provided by the above embodiments, the embodiments of the application also provide a knowledge-graph-based device for retrieving similar Chinese characters.

FIG. 4 is a block diagram of a knowledge-graph-based device for retrieving similar Chinese characters according to an embodiment of the application. As shown in FIG. 4, the device may specifically include a first obtaining module 410, a searching module 420, and a second obtaining module 430.

The first obtaining module 410 is configured to obtain a character to be retrieved;

the searching module 420 is configured to search a pre-constructed knowledge graph for the Chinese similar character data pairs containing a Chinese character that matches the character to be retrieved, wherein the knowledge graph comprises data pairs representing the similarity relation between every two Chinese characters;

the second obtaining module 430 is configured to obtain the Chinese similar characters corresponding to the character to be retrieved by using the data pairs found.

An embodiment of the application provides a possible implementation in which, as shown in FIG. 5, the device shown in FIG. 4 may further include a building module 510, configured to obtain a plurality of Chinese characters and construct a Chinese character characteristic index for each of the plurality of Chinese characters;

An embodiment of the application provides a possible implementation in which the Chinese character characteristic index includes one or more of a pinyin index, a structure index, a character splitting index, a four-corner code index, a five-stroke index, a stroke sequence index, and a semantic index.

In the embodiment of the present application, a possible implementation manner is provided, and the building module 510 shown in fig. 5 is further configured to:

In the embodiment of the present application, a possible implementation manner is provided, and the second obtaining module 430 shown in fig. 4 or fig. 5 is further configured to:

Based on the same inventive concept, the embodiments of the application further provide an electronic device, which includes a processor and a memory, wherein the memory stores a computer program, and the processor is configured to run the computer program to perform the knowledge-graph-based method for retrieving similar Chinese characters according to any of the above embodiments.

In an exemplary embodiment, an electronic device is provided. As shown in FIG. 6, the electronic device 600 includes a processor 601 and a memory 603. The processor 601 is coupled to the memory 603, for example via a bus 602. Optionally, the electronic device 600 may also include a transceiver 604. It should be noted that, in practical applications, the number of transceivers 604 is not limited to one, and the structure of the electronic device 600 does not constitute a limitation on the embodiments of the application.

The processor 601 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor 601 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.

Bus 602 may include a path that transfers information between the above components. The bus 602 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 602 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 6, but this is not intended to represent only one bus or type of bus.

The memory 603 may be, but is not limited to, a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

The memory 603 is used for storing the computer program code that implements the solution of the present application, and execution of that code is controlled by the processor 601. The processor 601 is configured to execute the computer program code stored in the memory 603 to implement the content shown in the foregoing method embodiments.

The electronic devices include, but are not limited to: mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), tablet computers (PADs), PMPs (portable multimedia players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.

Based on the same inventive concept, the present application further provides a storage medium in which a computer program is stored, where the computer program is configured to execute, when run, the method for retrieving Chinese similar words based on a knowledge graph according to any one of the above embodiments.

It can be clearly understood by those skilled in the art that the specific working processes of the system, the apparatus, and the module described above may refer to the corresponding processes in the foregoing method embodiments, and for the sake of brevity, the details are not repeated herein.

Those of ordinary skill in the art will understand that the technical solution of the present application may be embodied, in essence or in whole or in part, in the form of a software product. The computer software product is stored in a storage medium and includes program instructions that, when executed, enable an electronic device (e.g., a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Alternatively, all or part of the steps of the foregoing method embodiments may be implemented by hardware (an electronic device such as a personal computer, a server, or a network device) under the direction of program instructions. The program instructions may be stored in a computer-readable storage medium, and when they are executed by a processor of the electronic device, the electronic device executes all or part of the steps of the method described in the embodiments of the present application.

The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments can be modified or some or all of the technical features can be equivalently replaced within the spirit and principle of the present application; such modifications or substitutions do not depart from the scope of the present application.

Claims

1. A Chinese similar word retrieval method based on a knowledge graph, characterized by comprising the following steps:

2. The method of claim 1, wherein the knowledge-graph is constructed by:

3. The method of claim 2, wherein the Chinese character characteristic index comprises one or more of: a pinyin index, a structure index, a character splitting index, a four-corner code index, a Wubi (five-stroke) index, a stroke order index, and a semantic index.

4. The method of claim 3, wherein determining similar words in the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character comprises:

5. The method of claim 3, wherein determining similar words in the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character comprises:

6. The method of claim 3, wherein determining similar words in the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character comprises:

7. The method of claim 3, wherein determining similar words in the plurality of Chinese characters according to the Chinese character characteristic index of each Chinese character comprises:

8. A Chinese similar word retrieval device based on a knowledge graph, characterized by comprising:

the first acquisition module is used for acquiring the word to be retrieved;

9. An electronic device comprising a processor and a memory, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the method for retrieving Chinese similar words based on a knowledge-graph according to any one of claims 1 to 7.

10. A storage medium having a computer program stored therein, wherein the computer program is configured to execute, when run, the method for retrieving Chinese similar words based on a knowledge graph according to any one of claims 1 to 7.