CN110888896A - Data searching method and data searching system thereof - Google Patents

Info

Publication number
CN110888896A
Authority
CN
China
Prior art keywords
learning
data
search
string
keyword
Prior art date
Legal status
Granted
Application number
CN201910104937.6A
Other languages
Chinese (zh)
Other versions
CN110888896B (en)
Inventor
詹诗涵
柯兆轩
蓝国诚
Current Assignee
Delta Electronics Inc
Original Assignee
Delta Electronics Inc
Priority date
Filing date
Publication date
Application filed by Delta Electronics Inc
Priority to JP2019090932A (patent JP6829740B2)
Priority to SG10201905532QA
Priority to EP19188646.4A (publication EP3621021A1)
Priority to US16/529,820 (patent US11386163B2)
Publication of CN110888896A
Application granted
Publication of CN110888896B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/435 Filtering based on additional data, e.g. user or group profiles
    • G06F16/437 Administration of user profiles, e.g. generation, initialisation, adaptation, distribution

Abstract

The present disclosure relates to a data searching method and a data searching system thereof. The data searching method comprises the following steps. First learning data is received, the first learning data including a plurality of first learning sections. The first learning data is analyzed to generate a first keyword string corresponding to each first learning section. Search information is then received and analyzed to generate a search string. The search string is compared with the first keyword strings, and a search list is generated according to the first learning sections corresponding to the first keyword strings that match the search string.

Description

Data searching method and data searching system thereof
Technical Field
The present disclosure relates to a data searching method and a data searching system thereof, and more particularly, to a technique for searching corresponding learning data in a database according to search information.
Background
An online learning platform is a network service that stores a plurality of learning data on a server, so that a user can connect to the server through the internet and browse the learning data at any time. In existing online learning platforms, the types of learning data provided include films, audio, presentations, documents, and forums.
Because the amount of learning data stored on an online learning platform is huge, a user must input search information describing what he or she needs in order to retrieve the relevant learning data from the platform. Therefore, whether the search mechanism of the online learning platform can accurately interpret the user's search information and quickly and correctly provide the corresponding learning data is a key indicator of the platform's service efficiency.
Disclosure of Invention
One aspect of the present disclosure is a data searching method. The data searching method comprises the following steps. First learning data is received, wherein the first learning data comprises a plurality of first learning sections. The first learning data is analyzed to generate a plurality of first keyword strings, each corresponding to one of the first learning sections. Search information is received. The search information is analyzed to generate a search string. The search string is compared with the first keyword strings. A search list is generated according to the first learning sections corresponding to the first keyword strings that match the search string.
Another aspect of the present disclosure is a data searching system. The data searching system comprises a storage unit, an analysis unit and an operation unit. The storage unit is used for storing first learning data, wherein the first learning data comprises a plurality of first learning sections. The analysis unit is used for generating, according to the first learning data, a plurality of first keyword strings, each corresponding to one of the first learning sections. The analysis unit is further used for analyzing search information to generate a search string. The operation unit is electrically connected to the analysis unit. The operation unit is used for comparing the search string with the first keyword strings and generating a search list according to the first learning sections corresponding to the first keyword strings that match the search string.
Therefore, when search information is subsequently received, the data searching system can compare it with the first keyword strings to accurately locate the corresponding first learning section within the first learning data, so that the user can quickly browse the learning content being sought, which improves learning efficiency.
Drawings
Fig. 1A is a schematic diagram of a data search system according to some embodiments of the disclosure.
Fig. 1B is a schematic diagram of a first server and a behavior database according to some embodiments of the disclosure.
Fig. 2 is a schematic diagram illustrating an operation of a data search system according to some embodiments of the present disclosure.
Fig. 3A is a schematic view of a text file of first learning data according to some embodiments of the disclosure.
Fig. 3B is a schematic image frame of the first learning data according to some embodiments of the disclosure.
Fig. 4 is a schematic diagram of a data searching method according to some embodiments of the disclosure.
[ description of reference ]
100 data search system
110 first server
120 second server
121 operation unit
122 analysis unit
122a automatic encoder
122b semantic analysis network
123 transmission unit
130 storage unit
131 course database
131a first learning data
131b second learning data
131c third learning data
132 analysis database
133 behavior database
133a behavioral data
133b behavioral data
133c behavioral data
133d weight value
133e weight value
133f weight value
200 terminal device
A1 text file
A11 learning segment
A12 learning segment
A13 learning segment
A14 learning segment
A21 learning segment
A22 learning segment
B1 video file
B01 video picture
B02 video picture
B03 video picture
B04 video picture
B11 learning section
B12 learning section
S401 to S408 steps
Detailed Description
Reference will now be made in detail to the present embodiments of the present application, examples of which are illustrated in the accompanying drawings. It should be understood, however, that these implementation details should not be used to limit the application. That is, in some embodiments of the disclosure, such practical details are not necessary. In addition, for simplicity, some conventional structures and elements are shown in the drawings in a simple schematic manner.
When an element is referred to as being "connected" or "coupled," it may mean "electrically connected" or "electrically coupled." "Connected" or "coupled" may also be used to indicate that two or more elements cooperate or interact with each other. Moreover, although terms such as "first," "second," etc., may be used herein to describe various elements, these terms are used merely to distinguish one element or operation from another element or operation described in the same technical terms. Unless the context clearly dictates otherwise, the terms do not specifically refer to or imply an order or sequence, nor are they intended to limit the invention.
In existing online learning platforms, when a user inputs search information, the server only compares the search information with the file names of the learning data to screen out similar learning data. However, if a piece of learning data is lengthy (e.g. a two-hour film), the user still needs to navigate it manually (e.g. move the playback position to the 45th minute) to find the section most relevant to his or her needs. In addition, if the search information is too colloquial, a conventional online learning platform may return irrelevant learning data because it cannot interpret the search information. That is, the search mechanism of a conventional online learning platform cannot perform a fine-grained search according to the user's needs. The data search system and method provided by the present disclosure can improve on this.
Referring to fig. 1A and 1B, the present disclosure relates to a data search system 100. The data search system 100 includes a first server 110, a second server 120, and a storage unit 130. The first server 110 is electrically connected to the second server 120; in other embodiments, the first server 110 and the second server 120 can establish a connection via a network for data transmission. The storage unit 130 is a data storage device, for example a flash memory device, a memory card, or a hard disk. In some embodiments, the storage unit 130 is disposed in a separate server. In some other embodiments, the storage unit 130 may be disposed in the first server 110 or the second server 120. In other embodiments, the first server 110 and the second server 120 can be integrated into a single server.
In the present embodiment, the data search system 100 is used to provide online learning services, such as: the user can connect to the first server 110 through the terminal device 200 to browse the online learning interface. When the user wants to browse the learning content, the first server 110 can obtain the corresponding file from the storage unit 130. The second server 120 is used to perform the functions of classification, management and statistics. However, the application of the present disclosure is not limited thereto, and the data search system 100 may also be applied to an audio video streaming platform or an internet discussion forum.
The first server 110 is used to receive a plurality of learning data. In some embodiments, the first server 110 receives the learning data from the terminal device 200 via the internet. The learning data may be a film, an audio recording, a presentation, or a discussion thread. For convenience of description, the present embodiment divides the plurality of learning data into the first learning data 131a, the second learning data 131b, and the third learning data 131c. However, the disclosure is not limited thereto, and the number of learning data can be adjusted arbitrarily.
In some embodiments, after the first server 110 receives the first learning data 131a, the first server 110 uploads the first learning data 131a to the course database 131 of the storage unit 130, wherein the first learning data 131a includes a plurality of first learning sections. The first learning sections are connected (or arranged) according to a time sequence (e.g., a predetermined time axis in the first learning data 131a). For example, if the first learning data 131a is a film file 30 minutes long, the first learning data 131a may include two first learning sections, each corresponding to 15 minutes of the film.
As shown in fig. 1B, the second server 120 includes an operation unit 121, an analysis unit 122, and a transmission unit 123. The operation unit 121 is electrically connected to the analysis unit 122 and the transmission unit 123. The second server 120 performs data transmission with the first server 110 and the storage unit 130 through the transmission unit 123. The second server 120 can obtain the first learning data 131a from the storage unit 130 according to the analysis information transmitted from the first server 110, and perform analysis processing on the first learning data 131a to generate a plurality of first keyword strings, each corresponding to one of the first learning sections.
For example, the first learning data 131a is a film file that includes a subtitle file. The analysis unit 122 can apply a semantic analysis technique (natural language processing) to the text in the subtitle file to create semantically related word strings (also called inference word strings), thereby generating a first keyword string corresponding to each first learning section, for example: "projector, image, principle", "high frequency signal, sharpening, enhancement" and "boost, sharpness". In some embodiments, a semantically related word string may consist of words taken from the original text of the subtitle file or of inferred words. For example, if the subtitle file includes "apple, memory, processor", the analysis unit 122 may automatically infer "smartphone, iphone". After the second server 120 generates the first keyword strings, the second server 120 can further store them in the analysis database 132 of the storage unit 130. In some other embodiments, the second server 120 further stores a first identification code corresponding to the first learning data 131a in the analysis database 132, so that the first keyword strings can be mapped to the first learning data 131a in the course database 131 according to the first identification code.
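The relationship between learning sections, keyword strings and the identification code can be pictured with a small data structure. The following is a minimal sketch, not the disclosed implementation: the class name `SectionIndexEntry`, its field names, the use of "131a" as an identification code, and the start offsets are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SectionIndexEntry:
    """One hypothetical row of the analysis database 132."""
    material_id: str            # assumed form of the "first identification code"
    section_index: int          # position of the section on the time axis
    start_seconds: int          # assumed playback offset of the section
    keywords: Tuple[str, ...]   # the keyword string generated for the section


# Two 15-minute sections of a 30-minute film, using keyword strings from the text.
analysis_db: List[SectionIndexEntry] = [
    SectionIndexEntry("131a", 0, 0, ("projector", "image", "principle")),
    SectionIndexEntry("131a", 1, 900,
                      ("high frequency signal", "sharpening", "enhancement")),
]


def sections_for(material_id: str, db: List[SectionIndexEntry]) -> List[SectionIndexEntry]:
    """Return every indexed section that belongs to one piece of learning data."""
    return [entry for entry in db if entry.material_id == material_id]
```

Keeping the identification code on each entry lets a matching keyword string be traced back to the stored learning data and to a playback position.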
In some embodiments, the first learning data 131a further includes time axis data, and the first learning sections are connected according to the time axis data to form the first learning data 131a. The first server 110 can transmit the correct first learning section of the first learning data 131a to the terminal device 200 according to the time axis data, so that the user can browse the content of the first learning data 131a directly from the correct time point.
When the first server 110 receives the search information from the terminal device 200, the first server 110 forwards the search information to the second server 120. The second server 120 analyzes the search information via the analysis unit 122 to generate a search string. For example, if the search information is "projector principle", the second server 120 can first segment the search information into words and then generate the search string "projector, principle" through analysis, extraction, or inference.
The analysis unit 122 is used for analyzing the text content submitted by the user and extracting information such as people, events, objects, and places from the text, so that the developer can understand the user's real intention and estimate the answer to the question being asked. The analysis unit 122 may perform word segmentation on the search information and create word vectors (e.g., with analysis models such as word2vec or sentence2vec) to infer similar words. In addition, the analysis unit 122 may connect to a semantic network (ontology) through the internet to perform inference.
In some embodiments, the analysis unit 122 includes an auto-encoder 122a. The second server 120 may receive a plurality of training data and input the training data into the auto-encoder 122a to establish a semantic analysis network 122b through data compression and dimension conversion. The semantic analysis network 122b is used for performing semantic analysis on the first learning data and the search information. The auto-encoder 122a may build the semantic analysis network 122b using deep learning. For example, the training data includes a plurality of original learning data and a plurality of confirmed keyword strings; the auto-encoder 122a can convert the original learning data into embedding vectors after semantic analysis, and generate corresponding weighting parameters according to the confirmed keyword strings to establish the semantic analysis network 122b. Since the principle of semantic analysis can be understood by those skilled in the art, it is not further described herein.
After the analysis unit 122 obtains the search string, the operation unit 121 compares the search string with the first keyword strings, and generates a search list according to the first learning sections corresponding to the first keyword strings that match the search string. For example, the search string "projector, principle" is similar to the first keyword string "projector, image, principle", so the operation unit 121 lists the corresponding first learning section in the search list for the user's reference. Referring to fig. 1A, if the first keyword string corresponding to one of the first learning sections of the first learning data 131a and the second keyword string corresponding to one of the second learning sections of the second learning data 131b are both similar to the search string, the operation unit 121 lists both learning sections in the search list. The user can operate the terminal device 200 to click the corresponding learning section on the online learning interface provided by the first server 110, whereupon the first server 110 provides the corresponding learning section to the terminal device 200 (e.g., a film starts playing from the 15-minute mark).
Accordingly, since the data search system 100 performs semantic analysis on each first learning section of the first learning data 131a to establish a first keyword string that indexes each first learning section, when search information is subsequently received, the data search system 100 can compare the search information with the first keyword strings to accurately locate the corresponding first learning section in the first learning data 131a, so that the user can quickly start browsing the learning content being sought, which improves learning efficiency. In addition, the data search system 100 can store the search information and the analysis results of the first keyword strings in the recommendation database 134, so as to generate recommendation information according to the searched first learning data 131a at a specific time (e.g., when the user has finished watching a film or has asked a question), and transmit the recommendation information to the terminal device 200.
In some embodiments, the operation unit 121 is further configured to calculate a plurality of first similarities between the search string and the first keyword strings. Each first similarity is the degree of matching between the search string and one of the first keyword strings. For example, if the search string is "projector, principle" and only "projector" appears in a first keyword string, the first similarity is 50%; if both "projector" and "principle" appear, the first similarity is 100%. The operation unit 121 can determine whether each first similarity is greater than a threshold (e.g., 60%), and list in the search list only the first learning sections corresponding to the first keyword strings whose first similarity is greater than the threshold.
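The 50% and 100% figures above are consistent with measuring the fraction of search terms found in a keyword string. The following is a minimal sketch under that assumption; the function names, the dictionary-based index and the section labels are hypothetical, and the disclosure does not state that this exact formula is used.

```python
def first_similarity(search_terms, keyword_string):
    """Fraction of search terms that appear in the keyword string (assumed measure)."""
    if not search_terms:
        return 0.0
    keywords = set(keyword_string)
    hits = sum(1 for term in search_terms if term in keywords)
    return hits / len(search_terms)


def build_search_list(search_terms, index, threshold=0.6):
    """Keep only the sections whose first similarity exceeds the threshold."""
    results = []
    for section_id, keyword_string in index.items():
        score = first_similarity(search_terms, keyword_string)
        if score > threshold:
            results.append((section_id, score))
    return results


# Hypothetical index: section label -> keyword string.
index = {
    "A11": ["projector", "image", "principle"],
    "A12": ["projector", "definition"],
}
search_list = build_search_list(["projector", "principle"], index)
```

With the 60% threshold, section A12 scores 50% and is dropped, while A11 scores 100% and is listed.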
In some embodiments, the data search system 100 records the user's actions as a "behavior record". Behavioral records include, but are not limited to: film viewing records, film marking records, note making records, rating records, sharing records, forum records, upload/edit (film) records, page switching records. The second server 120 can refer to the behavior record of the user to sort the learning data in the search list.
As shown in fig. 1A, in some embodiments, the storage unit 130 stores the first learning data 131a, the second learning data 131b, and the third learning data 131c. The second learning data 131b includes a plurality of second learning sections, and each second learning section has a corresponding second keyword string; similarly, the third learning data 131c includes a plurality of third learning sections, and each third learning section has a corresponding third keyword string. When the user transmits operation information for one of the second learning sections of the second learning data 131b to the data search system 100 through the terminal device 200, the first server 110 can receive the operation information and store the corresponding second keyword string in the behavior database 133 of the storage unit 130, thereby setting that second keyword string as a piece of behavior data. In some other embodiments, the operation unit 121 can record the second keyword string as the behavior data after receiving the operation information. As shown in fig. 1B, after the user sends different operation information several times, the behavior database 133 records a plurality of corresponding behavior data 133a to 133c.
After the operation unit 121 selects the first similarities greater than the threshold, the operation unit 121 can further compare the selected first keyword strings (i.e., the keyword strings matching the search string) with the behavior data 133a to 133c in the behavior database 133, and calculate a plurality of second similarities (e.g., sentence similarities) between the behavior data 133a to 133c and the first keyword strings. Each second similarity describes how closely the behavior data matches one of the first keyword strings. For example, suppose the operation unit 121 compares the search information "projector" with a plurality of first keyword strings and screens out two first keyword strings, "projector, principle" and "projector, definition". If the behavior database 133 stores the behavior data "definition", indicating that the user has previously browsed learning data on the topic of "definition", the operation unit 121 determines that the second similarity between the behavior data "definition" and the first keyword string "projector, definition" is higher. When the operation unit 121 generates the search list, the first learning section corresponding to "projector, definition" is therefore placed before the first learning section corresponding to "projector, principle".
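The re-ranking in this example can be sketched as follows. This is an illustration only: the disclosure mentions sentence similarity without fixing a formula, so the overlap ratio used here, and the names `second_similarity` and `rank_by_behavior`, are assumptions.

```python
def second_similarity(behavior_terms, keyword_string):
    """Share of a keyword string's terms that also occur in the behavior data.

    This overlap ratio is an assumed stand-in for the unspecified similarity measure.
    """
    if not keyword_string:
        return 0.0
    seen = set(behavior_terms)
    return sum(1 for term in keyword_string if term in seen) / len(keyword_string)


def rank_by_behavior(candidates, behavior_terms):
    """Order candidate keyword strings by descending second similarity."""
    return sorted(
        candidates,
        key=lambda keyword_string: second_similarity(behavior_terms, keyword_string),
        reverse=True,
    )


# The two keyword strings screened out for the search information "projector".
candidates = [("projector", "principle"), ("projector", "definition")]
ranked = rank_by_behavior(candidates, behavior_terms=["definition"])
```

Because the user previously browsed content about "definition", the string "projector, definition" moves ahead of "projector, principle" in `ranked`.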
In some other embodiments, when the first server 110 or the second server 120 records the second keyword strings as the behavior data 133a to 133c, it further records weight values 133d to 133f for the behavior data 133a to 133c respectively, according to the number of times each second keyword string has been recorded in the behavior database 133. For example, if the user browses one of the second learning sections of the second learning data 131b three times, the second keyword string corresponding to that second learning section is recorded three times, and the weight value of the corresponding behavior data becomes larger (e.g., +3). The operation unit 121 may adjust the second similarity according to the weight value. For example, if the operation unit 121 compares two first keyword strings with the behavior data 133a to 133c in the behavior database 133 and obtains a second similarity of 40% for both, but the weight values 133d to 133f of the behavior data corresponding to one of the first keyword strings are higher, the operation unit 121 raises the corresponding second similarity (e.g., +10%) so that the first learning section corresponding to that first keyword string is placed before the other first learning section in the search list. Accordingly, the search list can be sorted and recommended in a more personalized way according to the topics the user has browsed in the past.
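One way to picture the weight adjustment is to count how often each behavior term was recorded and add a bonus proportional to that count. The sketch below is hypothetical: the linear bonus, the `step` parameter and the term-level behavior log are assumptions, since the text gives only the "+3" and "+10%" examples.

```python
from collections import Counter


def behavior_weights(behavior_log):
    """Weight of each behavior term = number of times it was recorded."""
    return Counter(behavior_log)


def adjusted_similarity(base, keyword_string, weights, step=0.05):
    """Raise a second similarity by an assumed bonus of `step` per recorded occurrence."""
    weight = sum(weights.get(term, 0) for term in keyword_string)
    return base + step * weight


# "definition" content was browsed three times, "principle" content once.
weights = behavior_weights(["definition", "definition", "definition", "principle"])

# Both keyword strings start from the same second similarity of 40%.
score_definition = adjusted_similarity(0.40, ("projector", "definition"), weights)
score_principle = adjusted_similarity(0.40, ("projector", "principle"), weights)
```

The more frequently browsed topic ends up with the higher adjusted score, which breaks the tie between the two sections.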
The aforementioned operation information can be a viewing record of learning data, a film marking record (e.g. the user marks the film as "important"), a note, a score, a sharing action, a message, etc. In some embodiments, the operation information causes the operation unit 121 to transmit the second learning data 131b to the terminal device 200 for browsing. In other embodiments, the operation information causes the operation unit 121 to write annotation data into the course database 131. The annotation data corresponds to the second learning data 131b, and may be a learning note, comment, score, share, question, discussion, or annotation made by the user.
In some embodiments, the analysis unit 122 identifies the first learning sections according to metadata in the first learning data 131a. Metadata is information describing the properties of data, and may be regarded as field data in the first learning data 131a, such as the title, keywords, summary, tags, discussions, and replies in a film file. The analysis unit 122 can identify the first learning sections according to the metadata and perform semantic analysis on each of them.
In some embodiments, the operation unit 121 can further use word embedding to perform binary encoding on the metadata in the first learning data 131a, and then store the first learning data 131a in the storage unit 130.
The above-mentioned metadata-based method identifies the learning sections according to preset fields in the first learning data 131a. In some other embodiments, the first learning sections are delimited after the operation unit 121 analyzes the first learning data 131a. For example, the operation unit 121 may add first segment flags to the first learning data 131a to divide it into a plurality of first learning sections.
Please refer to fig. 2, which is a schematic diagram illustrating an operation of the data search system 100 according to some embodiments of the present disclosure. The data search system 100 receives the first learning data 131a and the search information 210. The data search system 100 sequentially performs segmentation processing P01 and binary encoding P02 on the first learning data 131a, and stores the first learning data 131a in the course database 131 to create an index. Then, after the data search system 100 receives the search information 210, it performs analysis processing P03 (e.g., semantic analysis or metadata analysis) on the search information 210, performs comparison processing P04 between the indexed first learning data 131a (including the analyzed first keyword strings) and the analyzed search information 210, and generates a search list 300 according to the behavior data in the behavior database 133.
Next, the manner of generating the segment flags is described with reference to fig. 1A and fig. 3A. Fig. 3A is a schematic view of a text file of the first learning data 131a according to some embodiments of the disclosure. In some embodiments, the first learning data 131a includes a text file A1 (e.g., subtitles). After receiving the first learning data 131a, the second server 120 analyzes the text file A1, for example by generating a plurality of characteristic sentences through semantic analysis. These characteristic sentences have a sequential order. The similarity between adjacent characteristic sentences is then determined to generate the first segment flags.
For example, after the text file A1 is analyzed, the generated characteristic sentences include "the projector adjusts the light emitting unit according to the image signal", "the light emitted by the light emitting unit is reflected as an image picture", and "in another type of projector". The first and second sentences share the words "image" and "light emitting" and therefore have a higher similarity, whereas the second and third sentences have a lower similarity. When the second server 120 determines that the similarity between adjacent characteristic sentences is lower than a predetermined value (e.g., they share no words, or one of them is a transitional sentence such as "… in other embodiments"), the second server 120 generates a first segment flag, thereby dividing the text file A1 into a plurality of first learning sections A11 to A14.
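The segmentation rule in this example (start a new section when adjacent sentences share no words or when a transitional phrase appears) can be sketched as below. The stop-word list, the transitional markers and the function names are assumptions chosen for the illustration; the disclosure relies on semantic analysis rather than on this simple word overlap.

```python
import re

# Assumed helper data: words ignored when comparing, and transitional openings.
STOPWORDS = {"the", "a", "an", "of", "by", "is", "as", "to", "in", "according"}
TURN_MARKERS = ("in another", "in other")


def content_words(sentence):
    """Lower-cased words of a sentence, without stop words."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}


def segment(sentences):
    """Group consecutive sentences; a boundary plays the role of a segment flag."""
    if not sentences:
        return []
    sections = [[sentences[0]]]
    for previous, current in zip(sentences, sentences[1:]):
        shared = content_words(previous) & content_words(current)
        turning = current.lower().startswith(TURN_MARKERS)
        if not shared or turning:
            sections.append([current])      # low similarity: open a new section
        else:
            sections[-1].append(current)    # similar enough: extend the section
    return sections


sentences = [
    "the projector adjusts the light emitting unit according to the image signal",
    "the light emitted by the light emitting unit is reflected as an image picture",
    "in another type of projector",
]
sections = segment(sentences)
```

Here the first two sentences stay together because they share "light", "emitting", "unit" and "image", and the third sentence opens a new section.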
In the foregoing embodiment, the characteristic sentences are generated from the text file A1 through a semantic analysis technique and the similarities between them are then analyzed, but the disclosure is not limited thereto. In some embodiments, the processor in the second server 120 may instead binarize the text file A1 and compare the binarized data to establish the characteristic sentences or to determine the similarity between them.
The text file in the foregoing embodiment refers to the text of a film's subtitles or of a presentation; if the text file is the discussion content of an internet forum, it can be segmented by the same principle. Similarly, if the first learning data 131a includes an audio file, the second server 120 can generate the text file A1 through speech recognition and then analyze it to obtain a plurality of characteristic sentences.
In some other embodiments, referring to fig. 3B, the first learning data 131a includes a video file B1. The video file B1 includes a plurality of video frames B01 to B04, which are frames of the video file linked in time sequence. The second server 120 determines the similarity between adjacent video frames B01 to B04 to generate a first segment flag. For example, the video frames B01 and B02 show the structure of the projector, and the video frames B03 and B04 show a diagram of the light projection path. Since the similarity between the video frames B02 and B03 is low, the second server 120 can add the first segment flag between the video frames B02 and B03 to form a plurality of first learning sections B11 and B12.
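The same idea applies to frames. The sketch below treats each frame as a flat list of grayscale pixel values and uses the mean absolute pixel difference as a similarity measure. Both the measure and the 0.8 threshold are assumptions for illustration; the disclosure does not specify how frame similarity is computed.

```python
def frame_similarity(frame_a, frame_b):
    """Similarity in [0, 1] based on mean absolute difference of 8-bit pixels (assumed)."""
    diff = sum(abs(x - y) for x, y in zip(frame_a, frame_b)) / len(frame_a)
    return 1.0 - diff / 255.0


def split_frames(frames, threshold=0.8):
    """Group frame indices; a new group starts where adjacent frames differ strongly."""
    if not frames:
        return []
    sections = [[0]]
    for i in range(1, len(frames)):
        if frame_similarity(frames[i - 1], frames[i]) < threshold:
            sections.append([i])        # segment flag between frame i-1 and frame i
        else:
            sections[-1].append(i)
    return sections


# Toy stand-ins for B01 to B04: two dark frames followed by two bright frames.
frames = [
    [10, 10, 10, 10],
    [12, 11, 10, 9],
    [200, 210, 190, 205],
    [198, 209, 192, 204],
]
frame_sections = split_frames(frames)
```

The boundary falls between the second and third frames, mirroring the split into sections B11 and B12.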
Referring to FIG. 3A again, a method by which the analysis unit 122 derives the first keyword strings is described as follows. The analysis unit 122 performs analysis processing (e.g., semantic analysis) on the text file A1 in the first learning data 131a to obtain a plurality of feature words. Then, after the first learning data 131a is divided into a plurality of first learning sections A11 to A14 or B11 to B12, the second server 120 counts the occurrences of each feature word in each of the first learning sections, and sets a feature word as part of the first keyword string when its count is greater than a predetermined value. For example, the first learning section A11 of the text file A1 contains the following: "The projector adjusts the light-emitting unit according to the image signal, and the light projected by the light-emitting unit is reflected as an image picture." The analysis unit 122 first segments the text into a plurality of words (e.g., projector, image signal, adjust …, etc.). Here "image" appears 2 times, "light-emitting unit" appears 2 times, and "projector" and "light" each appear 1 time. The analysis unit 122 may set the feature words "image, light-emitting unit", which appear 2 times, as the first keyword string.
Similarly, after receiving the search information, the analysis unit 122 can segment the text in the search information into words to obtain the search string. Alternatively, the analysis unit 122 can set the words whose occurrence count is greater than the predetermined value, among the words produced by segmentation, as the search string.
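The counting step in this example can be reproduced with a few lines of code. This is a minimal sketch: the feature-word list is taken as given (in the disclosure it comes from semantic analysis), and the function names and the `predetermined` parameter are assumptions.

```python
import re


def count_term(term, text):
    """Count whole-term occurrences; a hyphenated compound is not split into parts."""
    pattern = r"(?<![\w-])" + re.escape(term) + r"(?![\w-])"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def keyword_string(feature_words, section_text, predetermined=1):
    """Keep the feature words that occur more often than the predetermined value."""
    return [w for w in feature_words if count_term(w, section_text) > predetermined]


section_a11 = (
    "The projector adjusts the light-emitting unit according to the image signal, "
    "and the light projected by the light-emitting unit is reflected as an image picture."
)
feature_words = ["projector", "image", "light-emitting unit", "light"]
keywords_a11 = keyword_string(feature_words, section_a11)
```

The lookarounds stop "light" from being counted inside "light-emitting", so the counts match the example: "image" and "light-emitting unit" occur twice and form the keyword string.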
Please refer to FIG. 4, which is a schematic diagram illustrating a data searching method according to some embodiments of the present disclosure. The data searching method includes the following steps S401 to S409. In step S401, the first server 110 receives the first learning data 131a and stores the first learning data 131a in the course database 131. The first learning data 131a includes a plurality of first learning sections. In step S402, the second server 120 is connected to the storage unit 130, and the analysis unit 122 analyzes the first learning data 131a to generate a first keyword string corresponding to each first learning section. In some embodiments, the analysis unit 122 finds the first keyword string by a semantic analysis technique. In some other embodiments, the analysis unit 122 may also perform a binarization process on the first learning data 131a to filter out the first keyword string according to a metadata comparison.
In step S403, the first server 110 receives the search information and sends the search information to the second server 120 for background analysis. In step S404, the analysis unit 122 performs semantic analysis on the search information to generate a search string. In step S405, the arithmetic unit 121 compares the search string with the first keyword strings to generate a first similarity.
In step S406, when the arithmetic unit 121 determines that the first similarity is greater than the threshold, a search list is generated according to the first learning sections corresponding to the first keyword strings. In step S407, the behavior data in the behavior database 133 and the first keyword strings screened in step S406 are compared to generate a plurality of second similarities. In step S408, the first learning sections in the search list are sorted according to the second similarity.
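Steps S405 to S408 can be sketched as a filter followed by a re-ranking. The disclosure does not fix a similarity measure, so the Jaccard similarity over word sets, the 0.3 threshold, the function names, and the sample data below are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of words (assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def search(search_string, section_keywords, behavior_keywords, threshold=0.3):
    """Build a search list and sort it by behavior data.

    section_keywords maps a section id to its first keyword string.
    S405/S406: keep sections whose first similarity exceeds the threshold.
    S407/S408: sort the kept sections by their second similarity to the
    behavior data, highest first.
    """
    hits = [
        section_id
        for section_id, keywords in section_keywords.items()
        if jaccard(search_string, keywords) > threshold
    ]
    return sorted(
        hits,
        key=lambda section_id: jaccard(behavior_keywords, section_keywords[section_id]),
        reverse=True,
    )


# Hypothetical keyword strings for three learning sections.
section_keywords = {
    "A11": ["image", "light-emitting unit"],
    "A12": ["projector", "image", "lens"],
    "A13": ["fan", "heat"],
}
search_list = search(
    search_string=["projector", "image"],
    section_keywords=section_keywords,
    behavior_keywords=["light-emitting unit", "light"],
)
print(search_list)
```

In this toy data, section A12 is the closer match to the search string, but A11 is listed first because its keywords overlap with the recorded behavior data; A13 is filtered out by the threshold.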
The foregoing steps are described with reference to the first learning data 131a, but in other embodiments, the data search system 100 stores a plurality of learning data 131a-131c. The analysis unit 122 can perform semantic analysis on each of the learning data 131a-131c to find the corresponding keyword strings. The arithmetic unit 121 compares the search string with each keyword string in each of the learning data 131a-131c to find the learning sections corresponding to the search string. For example, if one of the first learning sections of the first learning data 131a is closest to the search string and one of the second learning sections of the second learning data 131b is also associated with the search string, the arithmetic unit 121 can display both learning sections in the search list.
Although the present disclosure has been described with reference to the above embodiments, it should be understood that various changes and modifications can be made by one skilled in the art without departing from the spirit and scope of the disclosure, and therefore, the scope of the disclosure should be determined by that of the appended claims.

Claims (20)

1. A data searching method, comprising:
receiving a first learning data, wherein the first learning data comprises a plurality of first learning sections;
analyzing the first learning data to generate a plurality of first keyword strings respectively corresponding to the first learning sections;
receiving search information;
analyzing the search information to generate a search string;
comparing the search string with the plurality of first keyword strings; and
generating a search list according to the first learning sections corresponding to the first keyword strings matching the search string.
2. The data searching method of claim 1, further comprising:
calculating a plurality of first similarities between the search string and the first keyword strings, wherein the first similarities correspond to the search string and the first keyword strings, respectively; and
generating the search list according to the first learning sections corresponding to the first keyword strings whose first similarities are larger than a threshold value.
3. The data searching method of claim 1, further comprising:
receiving an operation information, wherein the operation information corresponds to a second learning section in a second learning data, and the second learning section comprises a second keyword string; and
storing the second keyword string corresponding to the operation information into a behavior database to be recorded as behavior data.
4. The data searching method of claim 3, further comprising:
calculating a plurality of second similarities between the behavior data and the first keyword strings corresponding to the search string; and
sorting the first learning sections in the search list according to the second similarities.
5. The data searching method of claim 4, further comprising:
setting a weight value of the behavior data according to the number of times the second keyword string is stored in the behavior database; and
adjusting the plurality of second similarities according to the weight value.
6. The data searching method of claim 3, wherein the operation information is used to transmit the second learning data to a terminal device.
7. The data searching method of claim 3, wherein the operation information is used to write an annotation data in a course database, the annotation data corresponding to the second learning data.
8. The data searching method of claim 1, further comprising:
receiving a plurality of training data;
inputting the training data into an auto-encoder, and establishing a semantic analysis network through data compression processing and dimension conversion processing; and
using the semantic analysis network to perform semantic analysis on the first learning data and the search information.
9. The data searching method of claim 1, wherein the plurality of first learning sections are identified according to metadata in the first learning data after receiving the first learning data.
10. The data searching method of claim 9, further comprising:
binary coding the metadata in the first learning data by using a word embedding technology.
11. A data search system, comprising:
a storage unit for storing a first learning data, wherein the first learning data comprises a plurality of first learning sections;
an analysis unit for generating a plurality of first keyword strings respectively corresponding to the first learning sections according to the first learning data, the analysis unit being further used for generating a search string according to search information; and
an arithmetic unit electrically connected to the analysis unit, for comparing the search string with the first keyword strings and generating a search list according to the first learning sections corresponding to the first keyword strings that match the search string.
12. The data searching system of claim 11, wherein the arithmetic unit is configured to calculate a plurality of first similarities between the search string and the first keyword strings, the first similarities corresponding to the search string and the first keyword strings, respectively; the arithmetic unit is configured to generate the search list according to the first learning sections corresponding to the first keyword strings whose first similarities are larger than a threshold.
13. The data searching system of claim 11, wherein the storage unit further stores a second learning data, the second learning data includes a second learning section, and the second learning section includes a second keyword string;
after the arithmetic unit receives operation information, the arithmetic unit stores the second keyword string of the second learning section corresponding to the operation information into a behavior database of the storage unit, so as to record the second keyword string as behavior data.
14. The data searching system of claim 13, wherein the arithmetic unit is configured to calculate a plurality of second similarities between the behavior data and the first keyword strings matching the search string, the second similarities corresponding to the behavior data and the first keyword strings matching the search string, respectively; the arithmetic unit is further configured to sort the plurality of first learning sections in the search list according to the plurality of second similarities.
15. The data searching system of claim 14, wherein the behavior database further stores a weight value, the weight value being a number of times the second keyword string is stored in the behavior database; the arithmetic unit is used for adjusting the plurality of second similarities according to the weight value.
16. The data searching system of claim 13, wherein the operation information is used to transmit the second learning data to a terminal device.
17. The data searching system of claim 13, wherein the operation information is used to write an annotation data in a course database of the storage unit, the annotation data corresponding to the second learning data.
18. The data searching system of claim 11, wherein the analysis unit further comprises an auto-encoder, and the analysis unit is configured to input a plurality of training data into the auto-encoder to establish a semantic analysis network through data compression processing and dimension conversion processing.
19. The data searching system of claim 11, wherein the analysis unit identifies the first learning sections according to metadata in the first learning data.
20. The data searching system of claim 19, wherein the arithmetic unit uses a word embedding technology to binary code the metadata of the first learning data, and stores the first learning data in the storage unit.
CN201910104937.6A 2018-09-07 2019-02-01 Data searching method and data searching system thereof Active CN110888896B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2019090932A JP6829740B2 (en) 2018-09-07 2019-05-13 Data search method and its data search system
SG10201905532QA SG10201905532QA (en) 2018-09-07 2019-06-17 Data search method and data search system thereof
EP19188646.4A EP3621021A1 (en) 2018-09-07 2019-07-26 Data search method and data search system thereof
US16/529,820 US11386163B2 (en) 2018-09-07 2019-08-02 Data search method and data search system thereof for generating and comparing strings

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862728082P 2018-09-07 2018-09-07
US62/728,082 2018-09-07

Publications (2)

Publication Number Publication Date
CN110888896A true CN110888896A (en) 2020-03-17
CN110888896B CN110888896B (en) 2023-09-05

Family

ID=69745778

Family Applications (5)

Application Number Title Priority Date Filing Date
CN201910104937.6A Active CN110888896B (en) 2018-09-07 2019-02-01 Data searching method and data searching system thereof
CN201910104946.5A Active CN110891202B (en) 2018-09-07 2019-02-01 Segmentation method, segmentation system and non-transitory computer readable medium
CN201910105172.8A Pending CN110895654A (en) 2018-09-07 2019-02-01 Segmentation method, segmentation system and non-transitory computer readable medium
CN201910105173.2A Pending CN110889034A (en) 2018-09-07 2019-02-01 Data analysis method and data analysis system
CN201910266133.6A Pending CN110888994A (en) 2018-09-07 2019-04-03 Multimedia data recommendation system and multimedia data recommendation method

Country Status (4)

Country Link
JP (3) JP6829740B2 (en)
CN (5) CN110888896B (en)
SG (5) SG10201905236WA (en)
TW (5) TWI699663B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117351794A (en) * 2023-10-13 2024-01-05 浙江上国教育科技有限公司 Online course management system based on cloud platform

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI756703B (en) * 2020-06-03 2022-03-01 南開科技大學 Digital learning system and method thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104123332A (en) * 2014-01-24 2014-10-29 腾讯科技(深圳)有限公司 Search result display method and device
CN104572716A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 System and method for playing video files
WO2015068947A1 (en) * 2013-11-06 2015-05-14 주식회사 시스트란인터내셔널 System for analyzing speech content on basis of extraction of keywords from recorded voice data, indexing method using system and method for analyzing speech content

Family Cites Families (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07311539A (en) * 1994-05-17 1995-11-28 Hitachi Ltd Teaching material edition supporting system
KR100250540B1 (en) * 1996-08-13 2000-04-01 김광수 Studying method of foreign language dictation with apparatus of playing caption video cd
JP2002041823A (en) * 2000-07-27 2002-02-08 Nippon Telegr & Teleph Corp <Ntt> Information distributing device, information receiving device and information distributing system
JP3685733B2 (en) * 2001-04-11 2005-08-24 株式会社ジェイ・フィット Multimedia data search apparatus, multimedia data search method, and multimedia data search program
JP2002341735A (en) * 2001-05-16 2002-11-29 Alice Factory:Kk Broadband digital learning system
CN1432932A (en) * 2002-01-16 2003-07-30 陈雯瑄 English examination and score estimation method and system
TW200411462A (en) * 2002-12-20 2004-07-01 Hsiao-Lien Wang A method for matching information exchange on network
KR101109023B1 (en) * 2003-04-14 2012-01-31 코닌클리케 필립스 일렉트로닉스 엔.브이. Method and apparatus for summarizing a music video using content analysis
JP4471737B2 (en) * 2003-10-06 2010-06-02 日本電信電話株式会社 Grouping condition determining device and method, keyword expansion device and method using the same, content search system, content information providing system and method, and program
JP4426894B2 (en) * 2004-04-15 2010-03-03 株式会社日立製作所 Document search method, document search program, and document search apparatus for executing the same
JP2005321662A (en) * 2004-05-10 2005-11-17 Fuji Xerox Co Ltd Learning support system and method
JP2006003670A (en) * 2004-06-18 2006-01-05 Hitachi Ltd Educational content providing system
KR20070116945A (en) * 2005-03-31 2007-12-11 코닌클리케 필립스 일렉트로닉스 엔.브이. Augmenting lectures based on prior exams
US9058406B2 (en) * 2005-09-14 2015-06-16 Millennial Media, Inc. Management of multiple advertising inventories using a monetization platform
WO2008023470A1 (en) * 2006-08-21 2008-02-28 Kyoto University Sentence search method, sentence search engine, computer program, recording medium, and document storage
TW200825900A (en) * 2006-12-13 2008-06-16 Inst Information Industry System and method for generating wiki by sectional time of handout and recording medium thereof
JP5010292B2 (en) * 2007-01-18 2012-08-29 株式会社東芝 Video attribute information output device, video summarization device, program, and video attribute information output method
JP5158766B2 (en) * 2007-10-23 2013-03-06 シャープ株式会社 Content selection device, television, content selection program, and storage medium
TW200923860A (en) * 2007-11-19 2009-06-01 Univ Nat Taiwan Science Tech Interactive learning system
CN101382937B (en) * 2008-07-01 2011-03-30 深圳先进技术研究院 Multimedia resource processing method based on speech recognition and on-line teaching system thereof
US8140544B2 (en) * 2008-09-03 2012-03-20 International Business Machines Corporation Interactive digital video library
CN101453649B (en) * 2008-12-30 2011-01-05 浙江大学 Key frame extracting method for compression domain video stream
JP5366632B2 (en) * 2009-04-21 2013-12-11 エヌ・ティ・ティ・コミュニケーションズ株式会社 Search support keyword presentation device, method and program
JP5493515B2 (en) * 2009-07-03 2014-05-14 富士通株式会社 Portable terminal device, information search method, and information search program
EP2524362A1 (en) * 2010-01-15 2012-11-21 Apollo Group, Inc. Dynamically recommending learning content
JP2012038239A (en) * 2010-08-11 2012-02-23 Sony Corp Information processing equipment, information processing method and program
US8839110B2 (en) * 2011-02-16 2014-09-16 Apple Inc. Rate conform operation for a media-editing application
CN102222227B (en) * 2011-04-25 2013-07-31 中国华录集团有限公司 Video identification based system for extracting film images
CN102348049B (en) * 2011-09-16 2013-09-18 央视国际网络有限公司 Method and device for detecting position of cut point of video segment
CN102509007A (en) * 2011-11-01 2012-06-20 北京瑞信在线系统技术有限公司 Method, system and device for multimedia teaching evaluation and multimedia teaching system
JP5216922B1 (en) * 2012-01-06 2013-06-19 Flens株式会社 Learning support server, learning support system, and learning support program
US9846696B2 (en) * 2012-02-29 2017-12-19 Telefonaktiebolaget Lm Ericsson (Publ) Apparatus and methods for indexing multimedia content
US20130263166A1 (en) * 2012-03-27 2013-10-03 Bluefin Labs, Inc. Social Networking System Targeted Message Synchronization
US9058385B2 (en) * 2012-06-26 2015-06-16 Aol Inc. Systems and methods for identifying electronic content using video graphs
TWI513286B (en) * 2012-08-28 2015-12-11 Ind Tech Res Inst Method and system for continuous video replay
CN102937972B (en) * 2012-10-15 2016-06-22 上海外教社信息技术有限公司 A kind of audiovisual subtitle making system and method
WO2014100893A1 (en) * 2012-12-28 2014-07-03 Jérémie Salvatore De Villiers System and method for the automated customization of audio and video media
JP6205767B2 (en) * 2013-03-13 2017-10-04 カシオ計算機株式会社 Learning support device, learning support method, learning support program, learning support system, and server device
TWI549498B (en) * 2013-06-24 2016-09-11 wu-xiong Chen Variable audio and video playback method
US20150206441A1 (en) * 2014-01-18 2015-07-23 Invent.ly LLC Personalized online learning management system and method
US9892194B2 (en) * 2014-04-04 2018-02-13 Fujitsu Limited Topic identification in lecture videos
US20150293995A1 (en) * 2014-04-14 2015-10-15 David Mo Chen Systems and Methods for Performing Multi-Modal Video Search
US20160239155A1 (en) * 2015-02-18 2016-08-18 Google Inc. Adaptive media
JP6334431B2 (en) * 2015-02-18 2018-05-30 株式会社日立製作所 Data analysis apparatus, data analysis method, and data analysis program
CN104978961B (en) * 2015-05-25 2019-10-15 广州酷狗计算机科技有限公司 A kind of audio-frequency processing method, device and terminal
CN105047203B (en) * 2015-05-25 2019-09-10 广州酷狗计算机科技有限公司 A kind of audio-frequency processing method, device and terminal
TWI571756B (en) * 2015-12-11 2017-02-21 財團法人工業技術研究院 Methods and systems for analyzing reading log and documents corresponding thereof
CN105978800A (en) * 2016-07-04 2016-09-28 广东小天才科技有限公司 Method and system for pushing subjects to mobile terminal and server
CN106202453B (en) * 2016-07-13 2020-08-04 网易(杭州)网络有限公司 Multimedia resource recommendation method and device
CN106231399A (en) * 2016-08-01 2016-12-14 乐视控股(北京)有限公司 Methods of video segmentation, equipment and system
CN106331893B (en) * 2016-08-31 2019-09-03 科大讯飞股份有限公司 Real-time caption presentation method and system
CN108122437A (en) * 2016-11-28 2018-06-05 北大方正集团有限公司 Adaptive learning method and device
CN107256262B (en) * 2017-06-13 2020-04-14 西安电子科技大学 Image retrieval method based on object detection
CN107623860A (en) * 2017-08-09 2018-01-23 北京奇艺世纪科技有限公司 Multi-medium data dividing method and device

Also Published As

Publication number Publication date
TWI725375B (en) 2021-04-21
JP6829740B2 (en) 2021-02-10
TW202011231A (en) 2020-03-16
TWI700597B (en) 2020-08-01
CN110889034A (en) 2020-03-17
CN110895654A (en) 2020-03-20
SG10201906347QA (en) 2020-04-29
JP2020042777A (en) 2020-03-19
SG10201907250TA (en) 2020-04-29
CN110891202B (en) 2022-03-25
TWI709905B (en) 2020-11-11
TWI696386B (en) 2020-06-11
SG10201905523TA (en) 2020-04-29
JP2020042771A (en) 2020-03-19
SG10201905236WA (en) 2020-04-29
TW202011749A (en) 2020-03-16
TW202011222A (en) 2020-03-16
TW202011232A (en) 2020-03-16
TWI699663B (en) 2020-07-21
CN110888896B (en) 2023-09-05
SG10201905532QA (en) 2020-04-29
CN110891202A (en) 2020-03-17
TW202011221A (en) 2020-03-16
CN110888994A (en) 2020-03-17
JP2020042770A (en) 2020-03-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant