CN112651226B - Knowledge analysis system and method based on dependency syntax tree - Google Patents


Info

Publication number
CN112651226B
CN112651226B (application CN202010997505.5A)
Authority
CN
China
Prior art keywords
words
knowledge
dependency
word
dependency syntax
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010997505.5A
Other languages
Chinese (zh)
Other versions
CN112651226A (en)
Inventor
裴正奇
王树徽
朱斌斌
刘潇
段必超
于秋鑫
余志炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Qianhai Heidun Technology Co ltd
Original Assignee
Shenzhen Qianhai Heidun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Qianhai Heidun Technology Co ltd filed Critical Shenzhen Qianhai Heidun Technology Co ltd
Priority to CN202010997505.5A priority Critical patent/CN112651226B/en
Publication of CN112651226A publication Critical patent/CN112651226A/en
Application granted granted Critical
Publication of CN112651226B publication Critical patent/CN112651226B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/205: Parsing
    • G06F 40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30: Information retrieval of unstructured textual data
    • G06F 16/31: Indexing; Data structures therefor; Storage structures
    • G06F 16/313: Selection or weighting of terms for indexing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/237: Lexical tools
    • G06F 40/247: Thesauruses; Synonyms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00: Computing arrangements using knowledge-based models
    • G06N 5/02: Knowledge representation; Symbolic representation
    • G06N 5/022: Knowledge engineering; Knowledge acquisition

Abstract

The invention provides a knowledge analysis system and method based on a dependency syntax tree. The knowledge parsing system based on the dependency syntax tree comprises a knowledge base module and an analysis module. The knowledge analysis method based on the dependency syntax tree gives knowledge points in the Chinese context a clear definition so they can be parsed accurately. The knowledge base can be maintained dynamically in real time; it is transparent and controllable, so unreasonable parts can be located and fixed directly, unlike a traditional deep learning model, which behaves like an unanalyzable black box. Knowledge parsing scenarios are no longer limited by the diversity and complexity of Chinese grammar and syntax; as long as the knowledge base resources are high-quality and comprehensive, the application requirements of the scenario can be met to the greatest extent.

Description

Knowledge analysis system and method based on dependency syntax tree
Technical Field
The invention relates to the field of natural language processing, and in particular to a knowledge analysis system and method based on a dependency syntax tree.
Background
Dependency syntax analysis is an important component of natural language processing. Dependency syntax reflects the internal logical rules of natural language; it is a syntactic theory that transcends individual languages and exists in every language family. The concept of "dependency syntax" was first proposed by the Indian grammarian Panini in the 4th century BC, originally intended for the classified study of grammar, syntax, semantics and morphology. The book "Éléments de syntaxe structurale" ("Elements of Structural Syntax"), published by the French linguist Lucien Tesnière in 1959, has long been regarded as the theoretical foundation of modern dependency syntax. In 1970, Robinson proposed four principles of dependency syntax that laid its theoretical and structural foundation: (1) the pure-node condition: the dependency tree contains only terminal (leaf) nodes, i.e. words; (2) the single-parent condition: every non-root node in the dependency tree has one and only one parent node; (3) the single-root condition: a complete dependency tree contains only one root node, on which all other nodes depend; (4) the mutual-exclusion condition: the sibling relation and the parent-child dominance relation in the dependency tree are mutually exclusive, that is, if there is a dominating/dominated relation between two nodes, there can be no sibling relation between them.
Dependency syntax analysis builds a formal mathematical model and designs efficient algorithms so that a computer can analyze a sentence, converting it from a word-sequence form into a syntax-tree form, thereby capturing the internal structure of the sentence and the dependency relations between words and revealing its syntactic structure. The theory holds that the core verb of a sentence is the central component that governs the other components, while itself being governed by no other component; every governed component is subordinate to its governor through some dependency relation. For a computer, dependency syntax analysis means analyzing, for the word sequence of a given input sentence, the collocation relation between the words and the structure of the whole sentence, and obtaining a dependency syntax analysis tree. The dependency parse tree is the representation of the dependency analysis result. At present, mainstream dependency syntax research focuses on data-driven methods, i.e. iterative learning over a training data set to obtain a dependency parser. There are two mainstream approaches: transition-based dependency parsing (Transition-based Dependency Parsing) and graph-based dependency parsing (Graph-based Dependency Parsing). The former models the generation of the dependency parse tree as an action sequence, converting the parsing problem into the problem of finding the optimal action sequence; the latter converts the parsing problem into the problem of finding the maximum spanning tree in a fully connected directed graph.
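As a concrete illustration of the word-sequence-to-tree conversion described above, a dependency parse can be held as a head-index array. This is a minimal hypothetical sketch; the English sentence, head indices and relation labels are invented for illustration and are not taken from the patent:

```python
# A dependency parse stored as, for each word, the index of its head
# (parent) and the relation label; index 0 is a virtual ROOT.
# Hypothetical parse of "The cat chased the mouse".
words = ["ROOT", "The", "cat", "chased", "the", "mouse"]
heads = [None, 2, 3, 0, 5, 3]   # head index of each word
rels  = [None, "ATT", "SBV", "HED", "ATT", "VOB"]

def children(i):
    """All words whose head is word i."""
    return [j for j in range(1, len(words)) if heads[j] == i]

# The core verb "chased" governs the subject and the object,
# but is itself governed only by the virtual root.
root = children(0)[0]
print(words[root])                          # chased
print([words[j] for j in children(root)])   # ['cat', 'mouse']
print(rels[2], "links", words[2], "to", words[heads[2]])
```

Each governed word is subordinate to exactly one governor, which is what makes a flat head array a complete encoding of the tree.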
However, the dependency syntax analysis method in the prior art has the following problems:
(1) Excessive reliance on the "proximity principle". By observation, linguists summarized the existence of a "proximity" principle in language organization: when organizing language, people tend to place modifiers close to the central component. However, natural language does not always follow this principle. For example, in the recognition of long-distance dependencies, the "proximity principle" essentially implies a higher probability and priority for short-distance dependencies than for long-distance ones; and in parallel (coordinate) structures, the components usually occupy the same position in the semantic hierarchy, and can even be interchanged without affecting the semantic relation. Both cases reduce the accuracy of the analysis.
(2) Analysis and judgment through dependency syntax depend heavily on large, high-quality text corpora. The biggest task in building a corpus is alignment: the higher the alignment efficiency, the higher the accuracy and the greater the usefulness. Existing corpora have several problems. Their overall development is unbalanced, mainly reflected in the large gap between the amounts of written and spoken data, because the collection and sampling of spoken data are complicated and tedious. The accuracy of a corpus also cannot be guaranteed: a large corpus contains many sentences that need revision, fundamentally because an effective self-checking method is lacking. These problems all reflect the urgent need for flexible and accurate corpus construction.
Disclosure of Invention
In order to solve the above problems in the prior art, the technical solution proposed by the present application is as follows:
according to one aspect of the present invention, a dependency syntax tree based knowledge parsing system is disclosed, comprising: the knowledge base module and the analysis module; wherein knowledge base module includes:
the word segmentation module is used for carrying out word segmentation processing on the natural language sentence according to the pre-trained dependency syntactic model and indicating the syntactic dependency relationship among all the components;
the dependency syntax tree generation module collects the sentences covering the target knowledge points, obtains the dependency syntax trees of all the sentences by using the dependency syntax model, and labels the core words;
the simplified processing module, which retains the core words in the dependency syntax tree obtained by the dependency syntax tree generation module and simplifies the redundant words and their peripheral structures;
the calculation module is used for calculating and obtaining adjacent characteristics of each core word and storing the adjacent characteristics corresponding to the core words of each knowledge point to form a knowledge base;
wherein the analysis module includes:
the syntax tree processing module is used for processing the text input by the user through the dependency syntax tree to obtain a corresponding word segmentation result;
and the adjacent feature comparison module, which compares the obtained adjacent features of each word with the adjacent features in the knowledge base; if the matching degree is greater than a first threshold, it judges whether the word corresponding to the matched adjacent feature in the knowledge base is similar to the core word; if so, it outputs an analysis result; if not, it prompts the word corresponding to that adjacent feature in the knowledge base as the correction.
According to an aspect of the present invention, a knowledge parsing method based on a dependency syntax tree is also disclosed, which includes the following steps:
step S1, performing word segmentation processing on the natural language sentence according to the pre-trained dependency syntax model and indicating the syntax dependency relationship among the components;
step S2, summarizing sentences covering the target knowledge points, obtaining dependency syntax trees of all the sentences by using a dependency syntax model, and labeling core words;
step S3, reserving the core words in the dependency syntax tree obtained in step S2, and simplifying processing of redundant words and peripheral structures thereof;
step S4, calculating to obtain adjacent features of each core word, and storing the adjacent features corresponding to the core words of each knowledge point to form a knowledge base;
step S5, processing the text input by the user through the dependency syntax tree to obtain the corresponding word segmentation result;
step S6, comparing the obtained adjacent features of each word with the adjacent features in the knowledge base; if the matching degree is greater than a first threshold, judging whether the word corresponding to the matched adjacent feature in the knowledge base is similar to the core word; if so, outputting an analysis result; if not, prompting the word corresponding to that adjacent feature in the knowledge base as the correction.
Compared with the prior art, the invention has the following beneficial effects:
1. the knowledge points in the Chinese context can be clearly defined for accurate resolution.
2. Knowledge points can be stored efficiently and unambiguously; that is, knowledge points are no longer stored in isolation and without distinction, but are stored specifically with respect to a particular context and particular words, thereby increasing the accuracy of knowledge point retrieval.
3. A knowledge tree (the adjacent features) describing a knowledge point in a particular context undergoes a series of screening processes, tailored according to the linguistic features of individual dependency relations (e.g., COO, ATT).
4. Knowledge points in the Chinese context can be accurately analyzed. For example, if a user inputs "Maotai liquor uses rice to make its distiller's yeast", the analysis system can correct the knowledge in the user's text according to the knowledge points for the contexts of "Maotai liquor" and "distiller's yeast" pre-stored in the knowledge base, informing the user that "rice" should be corrected to "wheat".
5. The knowledge base can be maintained dynamically in real time; it is transparent and controllable, so unreasonable parts can be located and fixed directly, unlike a traditional deep learning model, which behaves like an unanalyzable black box.
6. Knowledge parsing scenarios are no longer limited by the diversity and complexity of Chinese grammar and syntax; as long as the knowledge base resources are high-quality and comprehensive, the application requirements of the scenario can be met to the greatest extent.
Drawings
FIG. 1 is a flow chart of building a dynamic structured knowledge base according to the present invention;
FIG. 2 is a flow chart of computing adjacent features according to the technical solution of the present invention;
FIG. 3 is a schematic diagram of obtaining an analysis result according to the technical solution of the present invention.
Detailed Description
The technical solution of the present invention is described in detail below with reference to the accompanying drawings and the detailed description.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the present application can be implemented in many other ways than those described herein, and those skilled in the art can make similar adaptations without departing from the spirit of the present application; the application is therefore not limited to the specific implementations disclosed below.
Fig. 1 is a flow chart of building a dynamic structured knowledge base according to the technical solution of the present invention. Knowledge points in the Chinese context can be clearly defined for accurate parsing, and can be stored efficiently and unambiguously: knowledge points are no longer stored in isolation, but specifically with respect to a particular context and particular words. Specifically, the knowledge parsing system based on the dependency syntax tree of the present invention includes a knowledge base module and an analysis module, wherein the knowledge base module includes:
the word segmentation module is used for carrying out word segmentation processing on the natural language sentence according to the pre-trained dependency syntactic model and indicating the syntactic dependency relationship among all the components;
the dependency syntax tree generation module collects the sentences covering the target knowledge points, obtains the dependency syntax trees of all the sentences by using the dependency syntax model, and labels the core words;
the simplified processing module, which retains the core words in the dependency syntax tree obtained by the dependency syntax tree generation module and simplifies the redundant words and their peripheral structures;
and the calculation module is used for calculating the adjacent characteristics of each core word and storing the adjacent characteristics corresponding to the core words of each knowledge point to form a knowledge base.
The analysis module includes:
the syntax tree processing module is used for processing the text input by the user through the dependency syntax tree to obtain a corresponding word segmentation result;
and the adjacent feature comparison module, which compares the obtained adjacent features of each word with the adjacent features in the knowledge base; if the matching degree is greater than a first threshold, it judges whether the word corresponding to the matched adjacent feature in the knowledge base is similar to the core word; if so, it outputs an analysis result; if not, it prompts the word corresponding to that adjacent feature in the knowledge base as the correction.
According to one aspect of the invention, a knowledge parsing method based on a dependency syntax tree is disclosed, which comprises the following steps:
step S1, performing word segmentation processing on the natural language sentence according to the pre-trained dependency syntax model and indicating the syntax dependency relationship among the components;
step S2, summarizing sentences covering the target knowledge points, obtaining dependency syntax trees of all the sentences by using a dependency syntax model, and labeling core words;
step S3, reserving the core words in the dependency syntax tree obtained in step S2, and simplifying processing of redundant words and peripheral structures thereof;
and step S4, calculating to obtain the adjacent characteristics of each core word, and storing the adjacent characteristics corresponding to the core words of each knowledge point to form a knowledge base.
In step S1, the dependency relation between words is directional. Each sentence has exactly one root word; any word other than the root word has one and only one parent node, and may have any number of child nodes.
In step S3, the simplification process includes: if two redundant words have a dependency relation, they are combined into one new redundant word; if the dependency relation of two words is a parallel relation, their parent node and child nodes are shared.
In step S1, a pre-trained dependency syntax model must first be prepared. The model can perform word segmentation on natural language sentences and mark the syntactic dependency relations among the components. The details are as follows:
given a sentence of n characters, S ═ S1S2S3…SnAfter the dependency syntax tree processing, the sentence S has a structure S of m words, i.e., W1W2W3…WmAnd obtain dependency syntactic relations between words, e.g., R (W)i,Wj) SBV, for WiAnd WjThere is a SBV (main predicate) relationship between them. WjIs WiParent node of WiIs WjThe child node of (1).
Specifically, in step S1, the dependency relation between words is directional, i.e., R(Wi, Wj) ≠ R(Wj, Wi). Every sentence has exactly one root word Wroot. For any word Wi other than the root word Wroot, there is one and only one word Wj such that the relation R(Wi, Wj) exists; i.e., Wi has exactly one parent node. For a word Wj, there may be multiple words (e.g., W1, W2, W3) standing in relations such as R(W1, Wj), R(W2, Wj), R(W3, Wj) with it; i.e., Wj may have multiple child nodes.
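The structural constraints just stated (exactly one root, exactly one parent per non-root word, every word ultimately reaching the root) can be checked mechanically. A minimal hypothetical sketch, using a plain head-index array rather than the patent's own data format:

```python
def is_valid_dependency_tree(heads):
    """heads[i] is the parent index of word i+1 (words are 1-based),
    with 0 denoting the virtual root.  Checks: exactly one root word,
    a valid parent index for every word, and no cycles (every word's
    upward walk eventually reaches the root)."""
    n = len(heads)
    roots = [i for i, h in enumerate(heads) if h == 0]
    if len(roots) != 1:                 # single-root condition
        return False
    for i, h in enumerate(heads):
        if not (0 <= h <= n):           # parent index out of range
            return False
        seen, node = set(), i + 1
        while node != 0:                # climb toward the root
            if node in seen:            # revisiting a node means a cycle
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

print(is_valid_dependency_tree([2, 0, 2]))   # True: word 2 is the root
print(is_valid_dependency_tree([2, 1, 2]))   # False: no root word
```

The single-parent condition is enforced by the representation itself: a head array can record only one parent per word.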
Specifically, in step S2, the sentences covering the target knowledge point are collected, the dependency syntax trees of all the sentences are obtained using the dependency syntax model, and the core words are labeled. For example, for the sentence "China's Maotai liquor uses high-quality sorghum as its raw material", we can mark the core words that constitute the knowledge point: Maotai, liquor, sorghum and raw material; the non-core words may also be called "redundant words".
Specifically, in step S3, a series of screening and simplification processes are performed on the obtained dependency syntax tree: the core words are retained, the redundant words and their peripheral structures are simplified, and a dependency syntax structure for each knowledge point, a normalized knowledge tree, is formed and stored for later use. The simplification means include:
if two redundant words xi,xjThere is a dependency relationship, and R (x)i,xj) Where ATT stands for "centered relationship", e.g., "red" and "apple" are centered relationship, then x may be expressedi,xjAnd the words are combined into a new redundant word, thereby achieving the aim of simplification.
If two words Wi, Wj have the dependency relation R(Wi, Wj) = COO (where COO stands for the coordinate, parallel, relation), then the parent node and the child nodes of Wi may be shared with Wj.
The above dependencies may also be any of the types of relationships shown in the dependency table in the appendix.
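The simplification rules above can be sketched as operations on a parent-pointer tree. A hypothetical illustration (the node names, relation labels and the toy tree are invented, not taken from the patent's figures):

```python
# Each node: name -> (parent, relation-to-parent). None marks the root.
tree = {
    "red":   ("apple", "ATT"),   # redundant modifier in an ATT relation
    "apple": ("eat",   "VOB"),
    "eat":   (None,    "HED"),
}

def merge_att(tree, child, parent):
    """Rule 1: merge two redundant words joined by ATT into one node.
    The merged node inherits the parent word's own parent and relation;
    children of either word are re-pointed to the merged node."""
    merged = child + "+" + parent
    gp, rel = tree[parent]
    new = {}
    for node, (p, r) in tree.items():
        if node in (child, parent):
            continue
        new[node] = (merged if p in (child, parent) else p, r)
    new[merged] = (gp, rel)
    return new

simplified = merge_att(tree, "red", "apple")
print(simplified)
# {'eat': (None, 'HED'), 'red+apple': ('eat', 'VOB')}
```

The COO rule is analogous but shares rather than merges: the parent pointer and the children of one coordinate word are duplicated onto the other.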
Fig. 2 is a flow chart of calculating the adjacent features according to the technical solution of the present invention. Specifically, in step S4, the adjacent feature of each core word is calculated; for an arbitrary word Wi, its adjacent feature Fi represents the relationships of Wi with the other words:
Fi = {gi1, gi2, …, gim}
where gij denotes, in the normalized knowledge tree, the path from the node where Wi is located to the node where Wj is located. The path can be encoded into a high-dimensional vector by a neural network model, or expressed as a specific functional relationship, so that the structure (the dependency relations between the nodes) and the content (the core words on the path) of two different paths can be compared. To simplify processing, only the core words may be considered when calculating the adjacent feature Fi, ignoring the redundant words. The adjacent feature of a core word Wi in a knowledge tree is denoted Fi.
In the knowledge tree of a specific knowledge sentence S(x), the adjacent feature of a specific core word Wi is denoted Fi(x). The adjacent features of each core word of each knowledge point are stored to form the knowledge base; the structure of a storage unit is:
Fi(x) → Wi
Strictly speaking, Wi and each core word in Fi(x) can be expressed in the form of a high-dimensional vector or as a set of similar words, so that replacement by near-synonyms can be handled.
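The path gij underlying an adjacent feature can be computed by climbing from each node to the root and splicing the two chains at their lowest common ancestor. A minimal hypothetical sketch (the parent-pointer encoding, and the toy tree using the G_/t_ naming convention of the embodiments, are illustrative assumptions):

```python
def path(tree, a, b):
    """tree: node -> (parent, relation).  Returns the g_ij path from a
    to b as steps (src, (direction, relation), dst), where 'f' means
    forward (child to parent) and 'b' backward (parent to child)."""
    def ancestors(x):
        chain = [x]
        while tree[x][0] is not None:
            x = tree[x][0]
            chain.append(x)
        return chain
    up_a, up_b = ancestors(a), ancestors(b)
    lca = next(n for n in up_a if n in up_b)   # lowest common ancestor
    steps = []
    for n in up_a[:up_a.index(lca)]:           # climb from a to the LCA
        steps.append((n, ("f", tree[n][1]), tree[n][0]))
    prev = lca
    for n in up_b[:up_b.index(lca)][::-1]:     # descend from LCA to b
        steps.append((prev, ("b", tree[n][1]), n))
        prev = n
    return steps

tree = {"G_0": ("t_1", "SBV"), "t_1": (None, "HED"),
        "G_14": ("t_1", "VOB"), "G_13": ("G_14", "ATT")}
print(path(tree, "G_0", "G_13"))
# [('G_0', ('f', 'SBV'), 't_1'), ('t_1', ('b', 'VOB'), 'G_14'),
#  ('G_14', ('b', 'ATT'), 'G_13')]
```

The adjacent feature Fi would then be the collection of such paths from Wi's node to every other (core) node.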
After the knowledge base is established, the knowledge parsing system based on the dependency syntax tree of the present invention can be used to parse the user's input. Fig. 3 is a schematic diagram of obtaining the analysis result according to the technical solution of the present invention; the process includes the following steps:
step S5, processing the text input by the user through the dependency syntax tree to obtain the corresponding word segmentation result;
step S6, comparing the obtained adjacent features of each word with the adjacent features in the knowledge base; if the matching degree is greater than a first threshold, judging whether the word corresponding to the matched adjacent feature in the knowledge base is similar to the core word; if so, outputting an analysis result; if not, prompting the word corresponding to that adjacent feature in the knowledge base as the correction.
Specifically, in step S5, given the text S(U) = S1S2S3…Sn input by the user, its dependency syntax tree is obtained after processing, and the corresponding word segmentation result is S(U) = W1W2W3…Wm.
Specifically, in step S6, the adjacent feature of each word is obtained:
Fi(U) = {gi1(U), gi2(U), …, gim(U)}
i.e., for a core word Wi(U) in the user text, its adjacent feature Fi(U) in the user input is acquired.
Specifically, in step S6, each adjacent feature Fa, Fb, Fc, … in the knowledge base is compared with Fi(U), and the adjacent feature with the highest matching degree (say Fj) is taken. If the matching degree is higher than a certain threshold (e.g., the first threshold), the word Wj corresponding to that adjacent feature in the knowledge base is obtained; Wj should then be highly similar to the core word Wi(U). If it is not sufficiently similar, the core word Wi(U) in the user's text does not fit the knowledge base and should be marked and corrected, thereby realizing a series of knowledge parsing operations such as knowledge auditing and correction.
The matching degree may be calculated by comparing the semantic proximity of two words: for example, by comparing the word vectors of the two words, or by defining a synonym table in advance and checking whether the two words are synonyms according to the table.
Preferably, different weights may be assigned to the core word and the other words in the adjacent feature to calculate a total score, which is output as the analysis result. For example, core words are assigned a first weight and redundant words a second weight. If an adjacent feature is judged similar, its output value is 1; otherwise it is 0. Each output value is multiplied by the corresponding weight, and the overall score is counted as the similarity result. Because the weights differ in this embodiment of the invention, the score is higher when the core words are more similar, improving the parsing precision of the system.
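The weighted overall score just described can be sketched as follows. The concrete weights and the 0/1 similarity outputs are hypothetical values chosen for illustration:

```python
def weighted_score(matches, is_core, w_core=2.0, w_redundant=1.0):
    """matches[i]: 1 if word i's adjacent feature was judged similar,
    else 0.  is_core[i]: whether word i is a core word.  Core words
    carry a larger weight, so agreement on core words dominates the
    similarity result.  Returns a score normalized to [0, 1]."""
    total = sum(m * (w_core if c else w_redundant)
                for m, c in zip(matches, is_core))
    best = sum(w_core if c else w_redundant for c in is_core)
    return total / best if best else 0.0

# three words: two core, one redundant; both core words match
print(weighted_score([1, 1, 0], [True, True, False]))   # 0.8
```

Matching only the redundant word would score 0.2 under the same weights, showing how core-word agreement is prioritized.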
Embodiment one:
According to the first embodiment of the invention, intelligent error correction and intelligent filling can be realized by constructing the dynamic structured knowledge base in advance and then processing the user's input through the analysis algorithm module.
Constructing a dynamic structured knowledge base:
Assume the knowledge sentence "Einstein proposed the special theory of relativity in the miracle year 1905 and expounded the principle of the photoelectric effect". After dependency syntax tree processing, the following can be obtained:
[dependency syntax tree diagram shown as a figure in the original]
Assuming the knowledge we are interested in is "Einstein proposed the principle of the photoelectric effect in 1905", we label as core words: Einstein, 1905, photoelectric, effect and principle. After a series of screening and simplification processes, a normalized knowledge tree is obtained:
[normalized knowledge tree diagram shown as a figure in the original]
Here, variables beginning with "G_" represent core words, variables beginning with "t_" represent redundant words, and the specific vocabulary is:
{'G_0': ['Einstein'], 't_1': […], 'G_2': ['1905'], 't_4': […], 'G_12': ['photoelectric'], 'G_13': ['effect'], 'G_14': ['principle']}
For convenience of illustration, the words are not represented by high-dimensional vectors, but by a set of similar words.
The adjacent feature of the word "Einstein" is in fact a summary of the paths from the node where "Einstein" is located (i.e., "G_0") to the other words. For example, the adjacent feature of "Einstein" is denoted F0(x):
[adjacent-feature listing shown as a figure in the original]
Here "f" and "b" represent forward (from child node to parent node) and backward (from parent node to child node), respectively. For example, the path from "G_0" to "G_13" can be looked up under the index "G_13", i.e., the path from "G_0" to "G_13" is
[['G_0', ['f','SBV'], 't_1'],
 ['t_1', ['b','VOB'], 'G_14'],
 ['G_14', ['b','ATT'], 'G_13']]
This represents that going from "G_0" to "G_13" requires first going forward to the redundant node "t_1" (the dependency relation on this step being SBV), then going backward from "t_1" to the core node "G_14" (the relation being VOB), and finally going backward to "G_13" (the relation being ATT). Judging whether two paths are consistent requires comparing not only whether the dependency relations between the nodes of the two paths are consistent, but also whether the contents of the nodes are consistent or sufficiently similar.
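The consistency check just described, comparing direction and dependency relation at every step and requiring node contents to be equal or sufficiently similar, can be sketched as follows. This is a hypothetical illustration: the synonym set stands in for the high-dimensional-vector similarity of the actual scheme:

```python
def paths_match(p, q, synonyms=None):
    """p, q: lists of steps (src, (direction, relation), dst).
    Structure (direction + relation) must match exactly; node contents
    must be equal or listed as synonyms."""
    synonyms = synonyms or set()
    def similar(a, b):
        return a == b or (a, b) in synonyms or (b, a) in synonyms
    if len(p) != len(q):
        return False
    for (s1, edge1, d1), (s2, edge2, d2) in zip(p, q):
        if edge1 != edge2:                       # direction + relation
            return False
        if not (similar(s1, s2) and similar(d1, d2)):
            return False
    return True

p = [("Einstein", ("f", "SBV"), "proposed")]
q = [("Einstein", ("f", "SBV"), "put_forward")]
print(paths_match(p, q, synonyms={("proposed", "put_forward")}))  # True
```

A mismatch in either the step direction or the relation label rejects the pair immediately, regardless of node content.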
Repeating the above steps, massive knowledge sentences are collected; each knowledge sentence may correspond to more than one knowledge point. A knowledge tree is generated for each knowledge point according to the above steps, and the adjacent feature Fi of each node Wi in the knowledge tree is obtained. Each adjacent feature is then stored as an index.
An intelligent analysis process:
Assume the user inputs "In 1995, the German physicist Einstein proved the principle of the photoelectric effect". The resulting dependency syntax tree is:
[dependency syntax tree diagram shown as a figure in the original]
The adjacent features of the respective words are obtained. Each adjacent feature stored in the knowledge base is traversed to judge whether the adjacent feature of any word completely matches some pre-stored adjacent feature. Finally, the adjacent feature of the word "1995" is found to completely match an adjacent feature in the knowledge base whose corresponding node is G_2, which holds the word "1905". Therefore, the word "1995" should be made consistent with the content of the G_2 node in the knowledge base; that is, the word "1995" need only be replaced by "1905" to ensure that the sentence input by the user does not conflict with the knowledge base.
To prevent mis-correction: if the adjacent feature of the word "1995" also completely matches another adjacent feature in the knowledge base, and the word "1995" itself matches the node corresponding to that adjacent feature, then the conflict described previously is invalidated.
For example, if the user inputs "Maotai liquor uses rice to make its distiller's yeast", the parsing system can correct the user's text against the knowledge points pre-stored in the knowledge base for contexts such as "Maotai liquor" and "distiller's yeast", and inform the user that "rice" should be corrected to "wheat".
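The matching-and-correction flow of this example can be sketched as follows (a hypothetical simplification: the real system compares path-based adjacent features, while here a feature is reduced to a word's labeled links, and the index maps each stored feature directly to its knowledge-base word; all names are illustrative):

```python
def adjacent_feature(tree, word):
    # Feature of a word: sorted labeled links to its neighbours in the tree.
    feats = []
    for head, rel, dep in tree:
        if head == word:
            feats.append((rel, "child"))
        elif dep == word:
            feats.append((rel, "parent"))
    return tuple(sorted(feats))

def check_against_knowledge(input_tree, index):
    # A correction is proposed when a word's feature matches a stored feature
    # but the word differs; if the word itself matches the stored node, the
    # conflict is invalidated and no correction is made.
    corrections = {}
    words = {w for h, _, d in input_tree for w in (h, d)}
    for w in words:
        stored = index.get(adjacent_feature(input_tree, w))
        if stored is not None and stored != w:
            corrections[w] = stored
    return corrections

user_tree = [("proved", "SBV", "Einstein"),
             ("proved", "VOB", "principle"),
             ("proved", "ADV", "1995")]
kb_index = {(("ADV", "parent"),): "1905"}   # stored feature of the G_2 node
print(check_against_knowledge(user_tree, kb_index))  # -> {'1995': '1905'}
```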
Example two:
Intelligent knowledge filling can also be achieved with the parsing system of the present invention. If the user inputs a sentence of the form "In x, the German physicist Einstein proved the principle of the photoelectric effect", the system only needs to retrieve x from the knowledge base, realizing the product effect of "knowledge filling". Likewise, if the user enters "Einstein won the x Nobel Prize in y", the system informs the user that x is "1921" and y is "physics". The retrieval process is the same as in Example one.
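The knowledge-filling effect described above can be sketched as follows (hypothetical code; the simplified word-level adjacent feature and the feature-to-word index layout are illustrative assumptions, not the patent's implementation):

```python
def fill_slots(template_tree, index, slots=("x",)):
    # For each placeholder, look up the knowledge-base word whose stored
    # adjacent feature matches the placeholder's feature in the template.
    def adjacent_feature(tree, word):
        feats = []
        for head, rel, dep in tree:
            if head == word:
                feats.append((rel, "child"))
            elif dep == word:
                feats.append((rel, "parent"))
        return tuple(sorted(feats))

    return {s: index.get(adjacent_feature(template_tree, s)) for s in slots}

# "In x, the German physicist Einstein proved the principle of the
# photoelectric effect" -- "x" sits where the time adverbial would be.
template = [("proved", "SBV", "Einstein"),
            ("proved", "VOB", "principle"),
            ("proved", "ADV", "x")]
kb_index = {(("ADV", "parent"),): "1905"}
print(fill_slots(template, kb_index))  # -> {'x': '1905'}
```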
Although the present application has been described with reference to preferred embodiments, these are not intended to limit the present application. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; therefore, the scope of protection of the present application shall be determined by the appended claims.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include forms of volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Appendix, dependency relationship table:
Relation type          Tag  Description             Example
Subject-verb           SBV  subject-verb            I send her a bunch of flowers (I <-- send)
Verb-object            VOB  direct object           I send her a bunch of flowers (send --> flowers)
Indirect object        IOB  indirect-object         I send her a bunch of flowers (send --> her)
Fronted object         FOB  fronting-object         He reads every book (book <-- reads)
Double                 DBL  double                  He invites me to dinner (invites --> me)
Attribute              ATT  attribute               red apple (red <-- apple)
Adverbial              ADV  adverbial               very beautiful (very <-- beautiful)
Complement             CMP  complement              has finished the homework (do --> finish)
Coordinate             COO  coordinate              mountains and seas (mountains --> seas)
Preposition-object     POB  preposition-object      in the trade area (in --> inside)
Left adjunct           LAD  left adjunct            mountains and seas (and <-- seas)
Right adjunct          RAD  right adjunct           children (child --> plural suffix)
Independent structure  IS   independent structure   two clauses that are structurally independent of each other
Punctuation            WP   punctuation             (punctuation marks)
Head                   HED  head                    the core of the whole sentence
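As an illustration (not part of the patent), a sentence can be encoded with the tag set above as (head, relation, dependent) arcs, with HED linking a virtual ROOT to the sentence core:

```python
# Subset of the relation table above, used for readable printing.
RELATIONS = {
    "HED": "head", "SBV": "subject-verb", "IOB": "indirect-object",
    "VOB": "verb-object", "ATT": "attribute",
}

# "I send her a bunch of flowers" as labeled dependency arcs.
arcs = [
    ("ROOT", "HED", "send"),
    ("send", "SBV", "I"),
    ("send", "IOB", "her"),
    ("send", "VOB", "flowers"),
    ("flowers", "ATT", "a bunch of"),
]

for head, rel, dep in arcs:
    print(f"{head} --{rel} ({RELATIONS[rel]})--> {dep}")
```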

Claims (10)

1. A dependency syntax tree based knowledge parsing system, comprising: the knowledge base module and the analysis module; wherein, knowledge base module includes:
the word segmentation module is used for carrying out word segmentation processing on the natural language sentence according to the pre-trained dependency syntactic model and indicating the syntactic dependency relationship among all the components;
the dependency syntax tree generation module collects the sentences covering the target knowledge points, obtains the dependency syntax trees of all the sentences by using the dependency syntax model, and labels the core words;
the simplified processing module is used for retaining the core words in the dependency syntax tree obtained by the dependency syntax tree generating module and simplifying the redundant words and their peripheral structures;
the calculation module is used for calculating the adjacent characteristics of each core word and storing the adjacent characteristics corresponding to the core words of each knowledge point to form a knowledge base;
wherein, analysis module includes:
the syntax tree processing module is used for processing the text input by the user through the dependency syntax tree to obtain a corresponding word segmentation result;
and the adjacent feature comparison module is used for comparing the obtained adjacent features of each word with each adjacent feature in the knowledge base; if the matching degree is greater than a first threshold value, judging whether the word corresponding to that adjacent feature in the knowledge base is similar to the core word obtained by the adjacent feature acquisition module; if so, outputting an analysis result; if not, prompting with the word corresponding to the adjacent feature in the knowledge base.
2. The dependency syntax tree-based knowledge parsing system of claim 1 wherein: in the word segmentation module, the dependency syntactic relation among the words is directional.
3. The dependency syntax tree-based knowledge parsing system of claim 1 wherein: in the word segmentation module, each sentence has at least one root source word, and for any word except the root source word, there is only one parent node and at least one child node.
4. The dependency syntax tree-based knowledge parsing system of claim 1 wherein: in the simplified processing module, if two redundant words have a dependency relationship between them, they are combined into a new redundant word; if the dependency relationship between two words is a parallel relationship, the two words share each other's parent and child nodes.
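The two simplification rules of claim 4 can be sketched as follows (a hypothetical minimal implementation; the arc-list representation and helper names are illustrative assumptions, not claim language):

```python
def merge_redundant(tree, redundant):
    # Rule 1: if two redundant words are linked by a dependency, merge them
    # into a single new redundant word (here joined with "_").
    for head, rel, dep in list(tree):
        if head in redundant and dep in redundant:
            merged = head + "_" + dep
            tree = [(merged if h in (head, dep) else h, r,
                     merged if d in (head, dep) else d)
                    for h, r, d in tree if (h, r, d) != (head, rel, dep)]
            redundant = (redundant - {head, dep}) | {merged}
    return tree, redundant

def share_coordinate(tree):
    # Rule 2: a word in a COO (parallel) relation inherits its partner's
    # parent and child links, so the two words share them.
    extra = []
    for head, rel, dep in tree:
        if rel == "COO":
            for h, r, d in tree:
                if h == head and r != "COO":
                    extra.append((dep, r, d))   # inherit the partner's children
                if d == head:
                    extra.append((h, r, dep))   # inherit the partner's parent
    return tree + extra
```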
5. A knowledge parsing method based on a dependency syntax tree is characterized by comprising the following steps:
step S1, performing word segmentation processing on the natural language sentence according to the pre-trained dependency syntax model and indicating the syntax dependency relationship among the components;
step S2, summarizing sentences covering the target knowledge points, obtaining dependency syntax trees of all the sentences by using a dependency syntax model, and labeling core words;
step S3, reserving the core words in the dependency syntax tree obtained in step S2, and simplifying processing of redundant words and peripheral structures thereof;
step S4, calculating to obtain adjacent features of each core word, and storing the adjacent features corresponding to the core words of each knowledge point to form a knowledge base;
step S5, processing the text input by the user through the dependency syntax tree to obtain the corresponding word segmentation result;
step S6, comparing the obtained adjacent features of each word with each adjacent feature in the knowledge base, if the matching degree is larger than a first threshold value, judging whether the word corresponding to the adjacent features in the knowledge base is approximate to the adjacent features of the core word, if so, outputting an analysis result, and if not, prompting the word corresponding to the adjacent features in the knowledge base.
6. The dependency syntax tree-based knowledge parsing method of claim 5, wherein: in step S1, the dependency syntax relationship between words is directional.
7. The dependency syntax tree-based knowledge parsing method of claim 5, wherein: in step S1, each sentence has at least one root source word, and for any word except the root source word, there is only one parent node and at least one child node.
8. The dependency syntax tree-based knowledge parsing method of claim 5, wherein: in step S3, the simplification process includes: if the two redundant words have dependency relationship, combining the two redundant words into a new redundant word; if the dependency relationship of the two words is a parallel relationship, the parent node and the child node of the two words are shared.
9. An intelligent learning content push system, comprising: memory, processor and computer program stored on said memory and executable on said processor, characterized in that the processor performs the method according to any of the claims 5-8.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 5-8.
CN202010997505.5A 2020-09-21 2020-09-21 Knowledge analysis system and method based on dependency syntax tree Active CN112651226B (en)


Publications (2)

Publication Number Publication Date
CN112651226A CN112651226A (en) 2021-04-13
CN112651226B true CN112651226B (en) 2022-03-29

Family

ID=75347072


Country Status (1)

Country Link
CN (1) CN112651226B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282762B (en) * 2021-05-27 2023-06-02 深圳数联天下智能科技有限公司 Knowledge graph construction method, knowledge graph construction device, electronic equipment and storage medium
CN115270786B (en) * 2022-09-27 2022-12-27 炫我信息技术(北京)有限公司 Method, device and equipment for identifying question intention and readable storage medium

Citations (8)

Publication number Priority date Publication date Assignee Title
CN1628298A (en) * 2002-05-28 2005-06-15 弗拉迪米尔·叶夫根尼耶维奇·涅博利辛 Method for synthesising self-learning system for knowledge acquistition for retrieval systems
CN105528349A (en) * 2014-09-29 2016-04-27 华为技术有限公司 Method and apparatus for analyzing question based on knowledge base
CN109241538A (en) * 2018-09-26 2019-01-18 上海德拓信息技术股份有限公司 Based on the interdependent Chinese entity relation extraction method of keyword and verb
CN109522418A (en) * 2018-11-08 2019-03-26 杭州费尔斯通科技有限公司 A kind of automanual knowledge mapping construction method
CN109815230A (en) * 2018-12-23 2019-05-28 国网浙江省电力有限公司 A kind of full-service data center Data Audit method of knowledge based map
CN111177394A (en) * 2020-01-03 2020-05-19 浙江大学 Knowledge map relation data classification method based on syntactic attention neural network
CN111194401A (en) * 2017-10-10 2020-05-22 国际商业机器公司 Abstraction and portability of intent recognition
CN111597351A (en) * 2020-05-14 2020-08-28 上海德拓信息技术股份有限公司 Visual document map construction method

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10509860B2 (en) * 2016-02-10 2019-12-17 Weber State University Research Foundation Electronic message information retrieval system
US10325215B2 (en) * 2016-04-08 2019-06-18 Pearson Education, Inc. System and method for automatic content aggregation generation


Non-Patent Citations (2)

Title
Yannis Haralambous et al., "Arabic Language Text Classification Using Dependency Syntax-Based Feature Selection", Eprint Arxiv, Dec. 2014, pp. 1-10. *
Li Zhenghua et al., "Research on Data-Driven Dependency Parsing Methods" (数据驱动的依存句法分析方法研究), Intelligent Computer and Applications, Oct. 2013, Vol. 3, No. 5, pp. 1-4. *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant