CN110889034A

CN110889034A - Data analysis method and data analysis system

Info

Publication number: CN110889034A
Application number: CN201910105173.2A
Authority: CN
Inventors: 李实恭; 詹诗涵; 柯兆轩; 蓝国诚
Original assignee: Delta Electronics Inc
Current assignee: Delta Electronics Inc
Priority date: 2018-09-07
Filing date: 2019-02-01
Publication date: 2020-03-17
Also published as: CN110891202A; TW202011231A; TW202011749A; TWI696386B; SG10201906347QA; CN110888896B; SG10201905532QA; CN110888994A; JP2020042777A; TWI700597B; TWI709905B; TWI725375B; SG10201905236WA; TW202011221A; TW202011222A; TW202011232A; CN110895654A; SG10201905523TA; TWI699663B; CN110888896A

Abstract

The present disclosure relates to a data analysis method and a data analysis system. The data analysis method includes receiving first learning data and adding multiple segment marks to the first learning data to divide the first learning data into multiple first learning sections. The first learning sections are connected with each other in time sequence. A first keyword string corresponding to each first learning section is screened out from that first learning section. An analysis instruction is received, and the analysis instruction and the respective first keyword string of each first learning section are analyzed to obtain the similarity between the analysis instruction and each corresponding first learning section. Finally, the first learning section with the highest similarity is screened out.

Description

Data analysis method and data analysis system

Technical Field

The present disclosure relates to a data analysis method and a data analysis system, which are used for screening out corresponding learning materials according to an analysis instruction.

Background

An online learning platform is a network service that stores a plurality of learning materials on a server, so that a user can connect to the server through the internet and browse the learning materials at any time. In existing online learning platforms, the types of learning materials provided include videos, audio, presentations, documents, and forums.

Because the amount of learning materials stored in an online learning platform is huge, a user needs to input a search instruction according to his or her own needs to retrieve the relevant learning materials from the online learning platform. This retrieval mechanism still leaves room for improvement.

Disclosure of Invention

One aspect of the present disclosure is a data analysis method. The data analysis method comprises the following steps. First learning data is received. A plurality of first segment marks are added to the first learning data to divide the first learning data into a plurality of first learning sections. According to each of the plurality of first learning sections, a first keyword string corresponding to each first learning section is generated. An analysis instruction related to the operation of a user is received. The analysis instruction and the first keyword string of each first learning section are analyzed to obtain a plurality of first similarities between the analysis instruction and each corresponding first learning section. The first learning section with the highest similarity is screened out from the plurality of first learning sections.

Another aspect of the present disclosure is a data analysis system. The data analysis system comprises a first server, a second server and a storage unit. The first server is used to receive the first learning data. The storage unit is used for receiving and storing the first learning data from the first server. The second server is used to add a plurality of first segment marks in the first learning data to divide the first learning data into a plurality of first learning segments. The second server is used for generating a first keyword string corresponding to each first learning section according to each of the plurality of first learning sections. The second server is used for receiving the analysis instruction and analyzing the analysis instruction and the respective first keyword string of each first learning section to obtain a plurality of first similarities between the analysis instruction and each corresponding first learning section, and the second server screens out one of the first learning sections with the highest similarity from the plurality of first learning sections.

Therefore, the data analysis system can add the first segment marks to the first learning materials, and generate the first keyword string on each first learning segment after distinguishing the first learning segments, so that each first learning segment of the first learning materials can be accurately searched when a subsequent user logs in the data analysis system, or the data analysis system can actively recommend a proper learning segment to the user, thereby improving the operation experience of the user.

Drawings

Fig. 1 is a schematic diagram of a data analysis system according to some embodiments of the present disclosure.

Fig. 2A is a schematic view of a text file of first learning data according to some embodiments of the disclosure.

Fig. 2B is a schematic image frame of the first learning data according to some embodiments of the disclosure.

Fig. 3 is a schematic diagram of a data analysis method according to some embodiments of the disclosure.

Fig. 4 is a schematic diagram of a data analysis method according to some embodiments of the disclosure.

[ description of reference ]

100 data analysis system

110 first server

120 second server

130 storage unit

131 course database

131a first learning material

131b second learning material

131c third learning material

132 analysis database

133 behavior database

134 recommendation database

200 terminal device

A1 text file

A11 learning segment

A12 learning segment

A13 learning segment

A14 learning segment

A21 learning segment

A22 learning segment

B1 video file

B01 video frame

B02 video frame

B03 video frame

B04 video frame

B11 learning section

B12 learning section

T1 first correlation marker

T2 second correlation marker

S301 to S311 steps

S401 to S408 steps

Detailed Description

Reference will now be made in detail to the present embodiments of the present application, examples of which are illustrated in the accompanying drawings. It should be understood, however, that these implementation details should not be used to limit the application. That is, in some embodiments of the disclosure, such practical details are not necessary. In addition, for simplicity, some conventional structures and elements are shown in the drawings in a simple schematic manner.

When an element is referred to as being "connected" or "coupled," it can be referred to as being "electrically connected" or "electrically coupled." "Connected" or "coupled" may also be used to indicate that two or more elements are in mutual engagement or interaction. Moreover, although terms such as "first," "second," …, etc., may be used herein to describe various elements, these terms are used merely to distinguish one element or operation from another element or operation described in similar technical terms. Unless the context clearly dictates otherwise, the terms do not specifically refer to or imply an order or sequence, nor are they intended to limit the invention.

In existing online learning platforms, when a user inputs a search command, the server only compares the search command with the file name, caption text, or marks (such as comments) of the learning material. However, if the content of the learning material is large (e.g., a video two hours long), the user still needs to manually adjust the learning material (e.g., move the playing time to the 45th minute) to find the section most related to his or her own needs. That is, the analysis mechanism of existing online learning platforms can only search names or subtitles and cannot perform a fine-grained search according to the user's needs. In addition, the user cannot discover learning materials of interest except through active search.

Fig. 1 is a schematic diagram of a data analysis system according to some embodiments of the present disclosure. Referring to fig. 1, the present disclosure relates to a data analysis system. The data analysis system 100 includes a first server 110, a second server 120, and a storage unit 130. In the present embodiment, the first server 110 is electrically connected to the second server 120, and in other embodiments, a connection can be established between the first server 110 and the second server 120 through a network for data transmission. The storage unit 130 is a data storage device, for example: flash memory devices, memory cards, hard disks, and the like. In some embodiments, the storage unit 130 is stored in a separate server. In some other embodiments, the storage unit 130 may be disposed in the first server 110 or the second server 120. In other embodiments, the first server 110 and the second server 120 can be integrated into a single server.

In the present embodiment, the data analysis system 100 is used to provide an online learning service, for example, a user can connect to the first server 110 through a terminal device 200 (e.g., a personal computer, a notebook computer, or a smart phone) to browse an online learning interface. When the user wants to browse the learning content, the first server 110 can access the corresponding file from the storage unit 130 through the processor therein. The second server 120 is used to perform classification, management and statistics functions through its processor. However, the application of the present disclosure is not limited thereto, and the data analysis system 100 may also be applied to an audio video streaming platform or an internet discussion forum.

The first server 110 is used to receive a plurality of learning data. In some embodiments, the first server 110 receives the learning data from the terminal device 200 via the internet. The learning material may be a video, audio, a presentation, or a discussion thread. For convenience of description, the present embodiment divides the plurality of learning materials into the first learning material 131a, the second learning material 131b, and the third learning material 131c. However, the disclosure is not limited thereto, and the number of learning materials can be arbitrarily adjusted.

In some embodiments, after the first server 110 receives the first learning data 131a from the terminal device 200, the first server 110 uploads the first learning data 131a to the storage unit 130, and the first server 110 sends a notification message to the second server 120. The second server 120 is connected to the storage unit 130 to add a plurality of first segment marks in the first learning data 131a, so that the first learning data 131a can be divided into a plurality of first learning sections according to the first segment marks. In some embodiments, the first learning sections are connected to each other according to a chronological sequence (e.g., a predetermined time axis in the first learning material 131a).

For example, if the first learning data 131a is a movie file with a film length of 30 minutes, the second server 120 can add the first segment marks at the locations of the movie time of 10 minutes and 20 minutes, respectively, to divide the movie file into three first learning segments. Similarly, if the first learning data is a 10-page presentation document, the second server 120 can add the first segment marks at pages 2, 5 and 7 respectively to divide the presentation document into four learning segments. In some other embodiments, the first learning sections are not necessarily consecutive to each other, but only in chronological order. For example, the first learning section is the 1-20 minute paragraph of the movie, and the second learning section is the 30-45 minute paragraph of the movie.
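The two examples above can be sketched in a few lines of Python. This is an illustration only: the function name `split_by_marks` and the convention that a mark at position p starts a new section at p are assumptions, not part of the disclosure.

```python
def split_by_marks(start, end, marks):
    """Split the half-open range [start, end) at each segment mark.

    Assumption: a mark at position p means a new section begins at p.
    """
    bounds = [start] + sorted(m for m in marks if start < m < end) + [end]
    return list(zip(bounds[:-1], bounds[1:]))

# 30-minute video with marks at minutes 10 and 20: three sections.
video_sections = split_by_marks(0, 30, [10, 20])

# 10-page presentation (pages 1 to 10) with marks at pages 2, 5 and 7: four sections.
page_sections = split_by_marks(1, 11, [2, 5, 7])
```

Under this convention the video yields the sections 0-10, 10-20 and 20-30 minutes, and the presentation yields four page ranges.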

In some embodiments, the segment mark is an identification tag (tag) or an identifier (identifier) added to the first learning material 131a, so that the first server 110 or the second server 120 can quickly find a specific part of the content of the first learning material 131a, but the form of the segment mark is not limited thereto. The generation of the segmentation markers will be described in detail later.

After the second server 120 adds the first segment marks to the first learning data 131a, the second server 120 extracts a respective first keyword string from each of the first learning sections. In some embodiments, the first keyword string comprises at least one keyword. For example, for a movie file divided into three first learning sections, the keyword strings of the first learning sections can be "projector, image, principle", "high frequency signal, sharpening, enhancing" and "lifting, definition", respectively. In some embodiments, the first keyword string may consist of the text content in each first learning section whose occurrence frequency is higher than a set value. The analysis of the first keyword string will be described in detail later.

The second server 120 is used for storing the first learning data and the corresponding first segment flag and the first keyword string in the storage unit 130. In some embodiments, the first server 110 stores the first learning material 131a in the course database 131 in the storage unit 130. After the second server 120 generates the first segment flag and generates the first keyword string, the second server 120 stores the first segment flag and the first keyword string in the analysis database 132 of the storage unit 130. In some other embodiments, the second server 120 further stores a first identification code corresponding to the first learning material in the analysis database 132, so that the first segment mark and the first keyword string can correspond to the first learning material in the course database 131 according to the first identification code.
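The relationship between the course database 131 and the analysis database 132 can be pictured as two tables joined on the identification code. The sketch below is only one possible layout; the class `AnalysisRecord`, its field names, the identifier "m-001" and the file name are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AnalysisRecord:
    """One entry of the analysis database 132 (field names are hypothetical)."""
    material_id: str          # first identification code
    segment_marks: list       # e.g. minute offsets of the marks
    keyword_strings: list     # one keyword list per learning section

# Course database 131: identification code -> stored learning material.
course_db = {"m-001": "projector_intro.mp4"}

# Analysis database 132: identification code -> marks and keyword strings.
analysis_db = {
    "m-001": AnalysisRecord(
        material_id="m-001",
        segment_marks=[10, 20],
        keyword_strings=[
            ["projector", "image", "principle"],
            ["high frequency signal", "sharpening", "enhancing"],
            ["lifting", "definition"],
        ],
    )
}

def lookup(material_id):
    """Join the two databases on the identification code."""
    return course_db[material_id], analysis_db[material_id]

material, record = lookup("m-001")
```

Keeping the marks and keyword strings separate from the material itself means the stored file never has to be modified when the analysis is rerun.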

The data analysis system 100 of the present disclosure can recommend suitable learning materials to the user according to an analysis instruction related to the user operation. In some embodiments, the first server 110 is configured to transmit the analysis command to the second server 120. Then, the second server 120 is connected to the storage unit 130 according to the analysis command, and analyzes the analysis command and the respective first keyword string of each first learning section to obtain a first similarity between the analysis command and the respective first keyword string corresponding to each first learning section. The second server 120 will select the first learning section with the highest similarity from the first learning sections, and integrate the selected first learning sections into the analysis information, which is transmitted to the terminal device 200 for displaying (e.g., displaying the search screen or the recommendation screen) on the terminal device 200.

Next, the generation method and timing of the analysis command will be described. In some embodiments, the analysis command may be generated according to a user operation (e.g., a search operation), and the analysis command includes a first search command transmitted from the terminal device 200. For example, the first search command may be a search string "projector, principle", and the keyword strings of the three first learning sections in the first learning data 131a are "projector, image, principle", "high frequency signal, sharpening, enhancing" and "lifting, definition", respectively. After analysis (e.g., comparing the similarity of the strings), the keyword string of the first of the three learning sections is found to be most similar to the search string, so the second server 120 can transmit the comparison result back to the terminal device 200 through the first server 110 (e.g., a recommended result is displayed on the user interface), so that the user knows which learning section of the first learning data 131a is most similar to the search command. In some other embodiments, the second server 120 can compare the search command with the keyword string of each learning section in all the learning data 131a-131c to accurately determine the learning data most related to the search command and the corresponding learning section.
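The comparison in the example above can be sketched as follows. The disclosure does not name a similarity measure; Jaccard similarity over keyword sets is used here purely as an assumed stand-in, and the function name `jaccard` is illustrative.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections (assumed metric)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

search_string = ["projector", "principle"]
keyword_strings = [
    ["projector", "image", "principle"],
    ["high frequency signal", "sharpening", "enhancing"],
    ["lifting", "definition"],
]

# One first similarity per learning section, then pick the highest.
similarities = [jaccard(search_string, ks) for ks in keyword_strings]
best_index = max(range(len(similarities)), key=similarities.__getitem__)
```

With these inputs the first keyword string shares two of its three keywords with the search string, while the other two share none, so index 0 is selected.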

In some other embodiments, the user inputs the first search command through the terminal device 200, and the first search command may be colloquial natural-language text, for example: "what is the principle of enhancement of the projector". The second server 120 performs an analysis process on the first search command to generate a search string. In some embodiments, the second server 120 analyzes the first search command using a semantic analysis technique; for example, the first search command can be analyzed to find words such as "projector", "principle", "what", etc. Then, the search string is compared with the first keyword string of each first learning section. The operation principle of semantic analysis can be understood by those skilled in the art and therefore will not be described herein.
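As a very rough illustration of turning such a query into a search string, the sketch below simply tokenizes the text and drops a few stop words. It is a crude stand-in for real semantic analysis, which the disclosure leaves unspecified; the stop-word list and the function name `to_search_string` are hypothetical.

```python
import re

# Hypothetical stop-word list; a real system would use a proper semantic analyzer.
STOP_WORDS = {"is", "the", "of", "a", "an"}

def to_search_string(query):
    """Lower-case the query, split it into words and drop stop words."""
    words = re.findall(r"[a-z]+", query.lower())
    return [w for w in words if w not in STOP_WORDS]

search_string = to_search_string(
    "what is the principle of enhancement of the projector"
)
```

The resulting word list can then be compared with the keyword string of each learning section in the same way as an explicitly typed search string.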

In some other embodiments, the analysis instruction may be a recommendation instruction actively generated by the data analysis system 100. That is, the first server 110 can generate the analysis command according to the user's operation. For example, when the first server 110 determines that the terminal device 200 is connected to the data analysis system 100 (e.g., the user logs in to the online learning system), the first server 110 generates an analysis command to actively analyze the document that may be of interest to the user through the second server 120. Alternatively, the first server 110 can generate the analysis command when the operation of the user is determined to meet the predetermined condition (e.g., the user browses the learning data for half an hour, the user asks questions, leaves a message or marks about the learning data) according to the operation of the user.

In some other embodiments, the first server 110 further generates an analysis command according to the behavior data stored in the storage unit 130 after confirming that the user operation meets the predetermined condition. For example, the data analysis system 100 may generate the analysis command according to behavior data (e.g., usage records of the user) stored in the behavior database 133 of the storage unit 130, the details of which will be described later.

In addition, in some embodiments, if the first server 110 generates the analysis command after confirming that the operation of the user meets the predetermined condition, and the second server 120 selects the first learning section closest to the analysis command, the second server 120 may first store the selected first learning section in the recommendation database 134 in the storage unit 130. The first server 110 may transmit the first learning section stored in the recommendation database 134 to the terminal device 200 at a predetermined recommendation time (e.g., when the user logs in or out of the online learning system, or after the user browses a movie).

Accordingly, after receiving the first learning material 131a, the data analysis system 100 subdivides the first learning material 131a into a plurality of first learning sections, and each of the first learning sections has a corresponding first keyword string. Therefore, after the user logs in the data analysis system 100, the data analysis system 100 can accurately provide the appropriate learning data to the user according to the analysis command. As mentioned above, the present disclosure includes at least two application modes: first, when the user searches by using the data analysis system 100, the data analysis system 100 can precisely find out which learning section of the first learning material 131a is most similar to the first search command, in addition to finding out the first learning material 131a most similar to the first search command. Second, when the user logs in the data analysis system 100, the first server 110 can generate an analysis command (e.g., search the user's usage record) according to the user's operation when the predetermined condition is met, and find a corresponding first learning segment according to the analysis command to recommend the first learning segment to the user. Accordingly, the accuracy of the data analysis system 100 in analysis and search can be greatly improved, and the user experience can be improved.

Next, the manner of generating the segment marks is described with reference to fig. 1 and fig. 2A. Fig. 2A is a schematic view of a text file of the first learning data 131a according to some embodiments of the disclosure. In some embodiments, the first learning material 131a includes a text file A1 (e.g., subtitles). After receiving the first learning data 131a, the second server 120 analyzes the text file A1, for example by generating a plurality of characteristic sentences through a semantic analysis method. These characteristic sentences have a sequential order. Then, the similarity between adjacent characteristic sentences is calculated to generate the first segment marks.

For example, after the text file A1 is analyzed, the generated characteristic sentences include "the projector adjusts the light emitting unit according to the image signal", "the light emitted by the light emitting unit is reflected as an image picture", and "in another type of projector". The first sentence and the second sentence share the words "image" and "light emitting" and therefore have higher similarity, whereas the second sentence and the third sentence have lower similarity. Therefore, when the second server 120 determines that the similarity between adjacent characteristic sentences is lower than a predetermined value (e.g., there is no identical word, or one of the adjacent characteristic sentences is a transitional sentence, such as "… in other embodiments"), the second server 120 generates a first segment mark, so as to divide the text file A1 into a plurality of first learning sections A11-A14.
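The rule described above, placing a mark wherever adjacent characteristic sentences fall below a similarity threshold, can be sketched as follows. The word-overlap measure, the threshold value 0.2 and the function names are assumptions chosen for illustration; the disclosure does not fix any of them.

```python
def word_overlap(s1, s2):
    """Share of common words between two sentences (assumed similarity measure)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / len(w1 | w2) if w1 | w2 else 0.0

def segment_marks(sentences, threshold):
    """Return indices i where a mark is placed before sentences[i]."""
    return [
        i for i in range(1, len(sentences))
        if word_overlap(sentences[i - 1], sentences[i]) < threshold
    ]

sentences = [
    "the projector adjusts the light emitting unit according to the image signal",
    "the light emitted by the light emitting unit is reflected as an image picture",
    "in another type of projector",
]
marks = segment_marks(sentences, threshold=0.2)
```

Here the first two sentences share several words and stay in the same section, while the second and third share none, so a single mark is placed before the third sentence.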

In the foregoing embodiment, the feature sentences are generated from the text file A1 through a semantic analysis technique, and the similarity between the feature sentences is then calculated, but the disclosure is not limited thereto. In some embodiments, the processor in the second server 120 may also perform binarization on the text file A1 and compare the binarized data to determine similarity, so as to establish the feature sentences or determine the similarity between the feature sentences.

The text file in the foregoing embodiment refers to the subtitles of a video or the text content of a presentation. If the text file is the discussion content of an internet forum, it can still be segmented by the same principle. Similarly, if the first learning data 131a includes a voice file, the second server 120 can generate the text file A1 through voice recognition and then perform semantic analysis to generate a plurality of characteristic sentences.

In some other embodiments, referring to fig. 2B, the first learning data 131a includes a video file B1. The video file B1 further includes a plurality of video frames B01-B04. The video frames B01-B04 are frames of the video file linked in time sequence. The second server 120 is used for determining the similarity between adjacent video frames B01-B04 to generate a first segment mark. For example, the video frames B01-B02 display the structure of the projector, and the video frames B03-B04 display the path diagram of the light projection. Since the similarity between the video frames B02 and B03 is low, the second server 120 can add the first segment mark between the video frames B02 and B03 to form a plurality of first learning sections B11 and B12.
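The same idea applies to frames. The sketch below assumes each frame is a flat list of 8-bit grayscale pixel values of equal length and uses mean absolute pixel difference as the similarity measure; both the representation and the measure are assumptions, and the toy pixel values are invented for illustration.

```python
def frame_similarity(f1, f2):
    """1.0 for identical frames, 0.0 for maximally different 8-bit frames."""
    diff = sum(abs(p - q) for p, q in zip(f1, f2))
    return 1.0 - diff / (255 * len(f1))

def frame_marks(frames, threshold):
    """Return indices i where a mark is placed between frames[i-1] and frames[i]."""
    return [
        i for i in range(1, len(frames))
        if frame_similarity(frames[i - 1], frames[i]) < threshold
    ]

# Toy 4-pixel grayscale frames standing in for B01..B04.
frames = [
    [10, 10, 200, 200],   # B01
    [12, 10, 198, 200],   # B02
    [200, 200, 10, 10],   # B03
    [200, 198, 10, 12],   # B04
]
marks = frame_marks(frames, threshold=0.8)
```

Only the transition from the second to the third toy frame falls below the threshold, which mirrors the mark placed between B02 and B03.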

Referring again to fig. 2A, a method by which the data analysis system 100 generates the first keyword string is described. The second server 120 performs an analysis process (e.g., semantic analysis) on the text file A1 in the first learning data 131a to generate a plurality of feature words. Then, after the second server 120 generates the first segment marks in the manner described above, so that the first learning data 131a is divided into a plurality of first learning sections A11-A14 or B11-B12, the second server 120 counts the number of occurrences of each feature word in each of the first learning sections A11-A14 or A21-A22 and, when the number is greater than a threshold, sets that feature word as part of the first keyword string of the section. For example, the first learning section A11 of the text file A1 includes the following content: "the projector adjusts the light-emitting unit according to the image signal, and the light projected by the light-emitting unit is reflected as an image". Here "image" appears 2 times, "light-emitting unit" appears 2 times, and "projector" and "light" each appear 1 time. The second server 120 may set the feature words "image, light-emitting unit", which appear 2 times, as the first keyword string.
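The counting step in this example can be sketched with a frequency table. The sketch assumes the feature words of the section have already been extracted (the extraction itself is not shown), and it uses a threshold of 1, which is simply the value that reproduces the example; the function name `keyword_string` is illustrative.

```python
from collections import Counter

def keyword_string(feature_words, threshold):
    """Keep feature words whose count in the section exceeds the threshold."""
    counts = Counter(feature_words)
    return [word for word, n in counts.items() if n > threshold]

# Feature words found in learning section A11, in order of appearance.
a11_words = ["projector", "light-emitting unit", "image",
             "light", "light-emitting unit", "image"]
first_keyword_string = keyword_string(a11_words, threshold=1)
```

Only "light-emitting unit" and "image" occur more than once, so they form the keyword string of the section.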

In some embodiments, if the second server 120 finds a plurality of matching learning data (e.g., the first learning data 131a and the second learning data 131b have the "projector" keyword) according to the analysis command (e.g., the first search command), the second server 120 can further generate the search list. In some other embodiments, the first server 110 is further configured to provide a management interface. The management interface is used for a manager or a maintenance person of the data analysis system 100 to view the internal parameters and response data of the data analysis system 100, so that the manager or the maintenance person can adjust the parameters (e.g., threshold values, semantic recognition parameters, etc.) in the data analysis system 100 to optimize the performance of the data analysis system 100.

In addition, the second server 120 can refer to the behavior record of the user to sort the learning data in the search list. Referring to fig. 1, in some embodiments, the storage unit 130 stores a first learning material 131a, a second learning material 131b, and a third learning material 131c. The second learning material 131b can be divided into a plurality of second learning sections according to a plurality of second segment marks in the manner described above, and each second learning section includes a respective second keyword string; similarly, the third learning data 131c is divided into a plurality of third learning sections according to a plurality of third segment marks, and each third learning section includes a third keyword string. After the second server 120 screens out the matched first learning data 131a according to the first search instruction, the second server 120 transmits the first learning data 131a to the terminal device 200 through the first server 110 for the user to browse. Meanwhile, the first server 110 or the second server 120 can generate behavior data accordingly and store the behavior data in the behavior database 133 in the storage unit 130.

The behavior data is used to record various operations of the user after connecting to the data analysis system 100, such as: browsing specific learning materials, transmitting messages, marking important documents, etc. The data analysis system 100 can sort the search listings based on the behavioral data. In some embodiments, the first server 110 is configured to generate an analysis command according to the behavior data, for example, the first server 110 may filter out the search string (e.g., the most frequently appearing title name) according to the learning data most frequently browsed by the user to calculate the first similarity. Accordingly, even if the user does not actively input the first search command, the data analysis system 100 can still periodically generate the analysis command and recommend the appropriate learning data and the learning sections thereof.

In some embodiments, the first server 110 is further configured to receive a second search command from the terminal device 200. The second server 120 screens a plurality of learning data, such as the second learning data 131b and the third learning data 131c, from the storage unit 130 according to the second search command. In some embodiments, the second server 120 compares the second search command with the keyword string of each learning section in each of the learning data 131a-131c and selects the learning sections whose similarity is higher than a predetermined value. For example, if the second search command is "image enhancement", and "image enhancement" occurs 3 times in one of the second learning sections of the second learning data 131b and 5 times in one of the third learning sections of the third learning data 131c, the second server 120 lists the two learning sections as entries in the search list.

Accordingly, after the second server 120 screens the second learning material 131b and the third learning material 131c according to the second search instruction, since the user has previously viewed the first learning material 131a, the second server 120 will further compare the first keyword string in the first learning material 131a with the second keyword string in the second learning material 131b and the third keyword string in the third learning material 131c to obtain a second similarity. Then, a search list is generated and the second learning data 131b and the third learning data 131c are sorted according to the level of the second similarity.

For example, the keyword string of the first learning section of the first learning data 131a previously browsed by the user includes 5 keywords. The second keyword string in the second learning data 131b screened by the second server 120 includes "projector" and 3 other keywords; the third keyword string in the third learning data 131c screened by the second server 120 includes "projector" and 4 other keywords. The similarity between the first keyword string and the second keyword string is 60% (e.g., three keywords are the same), and the similarity between the first keyword string and the third keyword string is 20% (e.g., only one keyword is the same). This means that the content of the second learning data 131b is relatively similar to the first learning data 131a previously browsed by the user. Therefore, the second server 120 arranges the screened second learning section of the second learning data 131b before the third learning section of the third learning data 131c.
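The 60% and 20% figures are consistent with dividing the number of shared keywords by the 5 keywords of the previously browsed section (3/5 and 1/5). The sketch below assumes exactly that ratio; the concrete keywords other than "projector" are invented placeholders, and the function name `overlap_ratio` is illustrative.

```python
def overlap_ratio(reference, candidate):
    """Fraction of reference keywords that also occur in the candidate."""
    reference = set(reference)
    return len(reference & set(candidate)) / len(reference)

# Keyword string of the section the user browsed before (5 keywords).
first = ["projector", "image", "principle", "light", "lens"]

# Keyword strings of the sections found by the second search command.
candidates = {
    "131b": ["projector", "image", "principle", "sharpening"],
    "131c": ["projector", "signal", "contrast", "color", "noise"],
}

second_similarity = {k: overlap_ratio(first, v) for k, v in candidates.items()}
search_list = sorted(second_similarity, key=second_similarity.get, reverse=True)
```

Sorting by the second similarity in descending order places the section of the second learning data 131b ahead of the section of the third learning data 131c.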

In some embodiments, the data analysis system 100 can also establish correlations between multiple learning sections. Referring to fig. 1, fig. 2A and fig. 2B, for convenience of description, the text file A1 of fig. 2A and the video file B1 of fig. 2B are regarded as contents of different learning materials. The learning section A13 has a first related flag T1, and the video frame B02 in the learning section B11 has a second related flag T2. After the second server 120 screens out the learning section A13 according to the first search command, if the learning section B11 is determined to have the second related flag T2, the second server 120 generates the recommendation list according to the learning section B11. For example, the learning section A13 is a movie illustrating the "principle of operation of the projector", and the learning section B11 is a presentation file illustrating the "structure of the projector", so that when the user is browsing the learning section A13, the data analysis system 100 can recommend that the user browse the learning section B11 as well.

Please refer to fig. 3, which is a schematic diagram illustrating a data analysis method according to some embodiments of the present disclosure. The data analysis method segments the first learning data 131a and generates a first keyword string through the following steps S301-S311. In step S301, the terminal device 200 transmits the first learning data 131a to the first server 110. In step S302, the first server 110 uploads the first learning data 131a to the storage unit 130. In step S303, the storage unit 130 stores the first learning data 131a in the course database 131. In step S304, the storage unit 130 notifies the first server 110 that the storage operation is completed.

In step S305, the first server 110 transmits the analysis information to the second server 120. In steps S306 and S307, the second server 120 sends a request message to the storage unit 130 to obtain the first learning data 131a from the storage unit 130. The second server 120 adds the first segment marks in the manner described above, and generates a respective first keyword string for each first learning section. In step S308, the second server 120 uploads the first segment marks and the corresponding first keyword strings to the storage unit 130, so that the storage unit 130 stores them in the analysis database 132. Then, the storage unit 130 transmits the completion information to the second server 120, which forwards it to the first server 110, so that the message "learning data upload completed" is displayed to the user on the interface of the online learning system.
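The work performed by the second server 120 between steps S307 and S308 can be sketched for a text file as follows. The sketch assumes that adjacent sentences are compared by word overlap (Jaccard similarity) and that a segment mark is added wherever the overlap falls below a cut-off, and that the feature words occurring more often than a threshold form the keyword string. The stop-word list, the cut-off of 0.2, the threshold of 1, and all function names are assumptions made for illustration only.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "is"}  # assumed, not from the disclosure


def feature_words(sentence):
    """Lower-case the words of a sentence, dropping punctuation and stop words."""
    cleaned = (w.strip(".,;:!?").lower() for w in sentence.split())
    return [w for w in cleaned if w and w not in STOP_WORDS]


def word_overlap(a, b):
    """Jaccard similarity of two word lists (assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def add_segment_marks(sentences, min_similarity=0.2):
    """Return each index i at which a new learning section starts before sentences[i]."""
    return [
        i for i in range(1, len(sentences))
        if word_overlap(feature_words(sentences[i - 1]), feature_words(sentences[i])) < min_similarity
    ]


def split_sections(sentences, marks):
    bounds = [0, *marks, len(sentences)]
    return [sentences[start:end] for start, end in zip(bounds, bounds[1:])]


def keyword_string(section, threshold=1):
    """Feature words that occur more than `threshold` times in the section."""
    counts = Counter(w for s in section for w in feature_words(s))
    return sorted(w for w, n in counts.items() if n > threshold)


sentences = [
    "The projector lens focuses light.",
    "The lens focuses the projector image.",
    "Speakers play audio.",
    "Audio speakers need power.",
]
marks = add_segment_marks(sentences)
# Stand-in for the records uploaded to the analysis database 132 in step S308.
analysis_records = [
    {"section": i, "keywords": keyword_string(sec)}
    for i, sec in enumerate(split_sections(sentences, marks))
]
```

In this toy input a single segment mark is placed before the third sentence, so two learning sections result, each with its own keyword string.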

Please refer to fig. 4, which is a schematic diagram illustrating a data analysis method according to some embodiments of the present disclosure. The data analysis method searches for learning data and learning sections according to an analysis command (e.g., a first search command) through the following steps S401-S408. In step S401, the terminal device 200 sends an analysis command to the first server 110. In steps S402 and S403, the first server 110 transmits the first search command contained in the analysis command to the second server 120, and the second server 120 searches the learning data in the storage unit 130 according to the first search command. In step S404, the second server 120 obtains the screened-out learning data from the storage unit 130. If multiple learning data are screened out, such as the second learning data 131b and the third learning data 131c, then in steps S405 and S406 the second server 120 obtains the behavior data from the behavior database 133 in the storage unit 130 and compares the behavior data (e.g., the first keyword string) with the second learning data 131b and the third learning data 131c to determine their similarities and generate a search list. Finally, in step S407, the second server 120 transmits the search list to the first server 110, and in step S408, the first server 110 displays the search list on an interface of the online learning system for the user to browse or download.
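Steps S403 to S407 can be summarized in a brief sketch, with in-memory dictionaries standing in for the storage unit 130. It assumes that a learning data item matches when its keyword string contains at least one search term, and that the ordering uses the share of behavior-data keywords found in each match; both rules, along with all names and sample keywords other than "projector", are hypothetical.

```python
def overlap(reference, candidate):
    """Share of reference keywords also present in the candidate (assumed metric)."""
    ref = set(reference)
    return len(ref & set(candidate)) / len(ref) if ref else 0.0


def build_search_list(search_string, course_db, behavior_keywords):
    # S403-S404: screen out the learning data whose keyword string matches a search term.
    terms = set(search_string.lower().split())
    hits = [name for name, keywords in course_db.items() if terms & set(keywords)]
    # S405-S406: if several items match, order them by similarity to the behavior data.
    if len(hits) > 1:
        hits.sort(key=lambda name: overlap(behavior_keywords, course_db[name]), reverse=True)
    # S407: this list is what would be returned to the first server.
    return hits


# Hypothetical stand-ins for the course database 131 and the behavior database 133.
course_db = {
    "131c": ["projector", "speaker", "cable", "remote", "battery"],
    "131b": ["projector", "lens", "focus", "bracket"],
    "131d": ["camera", "tripod"],
}
behavior_keywords = ["projector", "lens", "light source", "focus", "screen"]

search_list = build_search_list("projector", course_db, behavior_keywords)
```

Only the two items that mention "projector" are kept, and the one sharing more keywords with the behavior data appears first.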

Although the present disclosure has been described with reference to the above embodiments, it should be understood that various changes and modifications can be made by one skilled in the art without departing from the spirit and scope of the disclosure. Therefore, the scope of the disclosure should be defined by the appended claims.

Claims

1. A method of data analysis, comprising:

receiving a first learning material;

adding a plurality of first segment marks into the first learning material to divide the first learning material into a plurality of first learning sections;

generating a first keyword string corresponding to each of the first learning sections according to each of the first learning sections;

receiving an analysis instruction related to user operation;

analyzing the analysis instruction and the first keyword string of each first learning section to obtain a plurality of first similarities between the analysis instruction and each corresponding first learning section; and

screening out, from the plurality of first learning sections, the first learning section with the highest first similarity.

2. The data analysis method as claimed in claim 1, wherein the first learning material comprises a text file, the data analysis method further comprising:

analyzing the text file to generate a plurality of characteristic sentences, wherein the characteristic sentences have a precedence relationship; and

judging the similarity of adjacent characteristic sentences to generate the plurality of first segment marks.

3. The data analysis method of claim 1, wherein the first learning material comprises a plurality of image frames, the data analysis method further comprising:

judging the similarity of adjacent image frames to generate the plurality of first segment marks.

4. The data analysis method as claimed in claim 1, wherein the first learning material comprises a text file, the data analysis method further comprising:

analyzing the text file to generate a plurality of feature words;

after the first learning material is divided into the plurality of first learning sections, judging the number of each of the feature words in each first learning section; and

setting the feature words whose number is greater than a threshold as the first keyword string.

5. The data analysis method as claimed in claim 1, wherein the analysis command comprises a first search command, the data analysis method further comprising:

analyzing the first search command to generate a first search string; and

calculating the plurality of first similarities between the first search command and the first learning sections according to the first search string.

6. The data analysis method of claim 5, further comprising:

transmitting the first learning section with the highest similarity and the corresponding first learning data to a terminal device, and generating behavior data according to the transmitted first learning section and the corresponding first learning data.

7. The data analysis method of claim 6, further comprising:

storing the first learning data into a storage unit, wherein the storage unit further stores a second learning data and a third learning data, the second learning data is divided into a plurality of second learning sections according to a plurality of second segment marks, and each second learning section comprises a second keyword string; the third learning data is divided into a plurality of third learning sections according to a plurality of third segment marks, and each third learning section comprises a third keyword string.

8. The data analysis method of claim 7, further comprising:

receiving a second search command;

screening out the second learning data and the third learning data from the storage unit according to the second search command;

calculating, according to the behavior data, a plurality of second similarities between the first keyword string and the second keyword string and between the first keyword string and the third keyword string, respectively; and

sorting the second learning data and the third learning data according to the second similarities to generate a search list.

9. A data analysis system, comprising:

a first server for receiving a first learning data;

a storage unit for receiving and storing the first learning data from the first server; and

a second server for adding a plurality of first segment marks to the first learning data to divide the first learning data into a plurality of first learning sections; the second server is used for respectively generating a first keyword string corresponding to each first learning section according to each of the plurality of first learning sections; the second server is configured to receive an analysis command and analyze the analysis command and the first keyword string of each first learning section to obtain a plurality of first similarities between the analysis command and each first learning section; and the second server screens out, from the plurality of first learning sections, the first learning section with the highest first similarity.

10. The data analysis system of claim 9, wherein the first learning data includes a text file, and the second server is configured to analyze the text file to generate a plurality of feature sentences, wherein the feature sentences have a precedence relationship; the second server is further configured to determine similarities between the adjacent feature sentences to generate the first segment marks.

11. The data analysis system of claim 9, wherein the first learning data comprises a plurality of image frames, and the second server is configured to determine similarity between adjacent image frames to generate the first segment marks.

12. The data analysis system of claim 9, wherein the first learning data comprises a text file, and the second server is used for analyzing the text file to generate a plurality of feature words; after the first learning data is divided into the first learning sections, the second server is configured to determine the number of each of the plurality of feature words in each first learning section, and to set the feature words whose number is greater than a threshold as the first keyword string.

13. The data analysis system of claim 9, wherein the analysis command comprises a first search command; the second server is used for analyzing the first search command to generate a first search string, and for calculating, by using the first search string, the plurality of first similarities between the first search command and each first learning section.

14. The data analysis system of claim 13, wherein the second server is configured to transmit the first learning section with the highest similarity and the corresponding first learning data to a terminal device, and to generate a behavior data according to the transmitted first learning section and the corresponding first learning data.

15. The data analysis system of claim 14, wherein the storage unit further stores a second learning data and a third learning data, the second learning data is divided into a plurality of second learning sections according to a plurality of second segment marks, and each of the second learning sections comprises a second keyword string; the third learning data is divided into a plurality of third learning sections according to a plurality of third segment marks, and each third learning section comprises a third keyword string.

16. The data analysis system of claim 15, wherein the second server is configured to screen out the second learning data and the third learning data from the storage unit according to a second search command; the second server is further configured to calculate, according to the behavior data, a plurality of second similarities between the first keyword string and the second keyword string and between the first keyword string and the third keyword string, respectively, and to sort the second learning data and the third learning data according to the plurality of second similarities to generate a search list.