CN113254708A - Video searching method and device, computer equipment and storage medium - Google Patents

Video searching method and device, computer equipment and storage medium

Info

Publication number
CN113254708A
Authority
CN
China
Prior art keywords
video
information
search
learning
target
Prior art date
Legal status
Pending
Application number
CN202110715993.0A
Other languages
Chinese (zh)
Inventor
刘煊
甄学文
Current Assignee
Beijing Lexuebang Network Technology Co., Ltd.
Original Assignee
Beijing Lexuebang Network Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Beijing Lexuebang Network Technology Co., Ltd.
Priority to CN202110715993.0A
Publication of CN113254708A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval of video data
    • G06F16/73: Querying
    • G06F16/735: Filtering based on additional data, e.g. user or group profiles
    • G06F16/738: Presentation of query results
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783: Retrieval characterised by using metadata automatically derived from the content
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/20: Education
    • G06Q50/205: Education administration or guidance

Abstract

The present disclosure provides a video search method, apparatus, computer device, and storage medium, wherein the method comprises: receiving a search request and acquiring search information in the search request; acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information; determining at least one video segment which is segmented based on the target segmentation rule and matched with the search information from a plurality of video segments corresponding to a plurality of learning videos based on the search information; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance; and displaying the at least one video clip on a user terminal interface as a search result.

Description

Video searching method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a video search method and apparatus, a computer device, and a storage medium.
Background
With the rapid development of internet technology and of online education, more and more people improve themselves by watching learning videos. The learning video watched by a user is usually a lengthy live-broadcast playback or a lecture video recorded in a real classroom.
In the related art, when a user wants to watch specific content in a particular video, the user typically must first find the video containing that content through searching or other means, and then locate the desired content by dragging the progress bar or the like; the whole process is overly tedious, degrading user experience and learning efficiency.
Disclosure of Invention
The embodiment of the disclosure at least provides a video searching method, a video searching device, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a video search method, including:
receiving a search request and acquiring search information in the search request;
acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information;
determining at least one video segment which is segmented based on the target segmentation rule and matched with the search information from a plurality of video segments corresponding to a plurality of learning videos based on the search information; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance;
and displaying the at least one video clip on a user terminal interface as a search result.
In a possible embodiment, the determining a target segmentation rule matching the user attribute information includes:
determining target statistical information matched with the user attribute information;
and determining a target segmentation rule for segmenting based on the target statistical information.
In one possible embodiment, the user attribute information includes at least one of class, grade, age, subject preference, teacher preference, segment length preference;
the target statistical information comprises at least one of: the user's historical answer records, keywords in the sentences commonly used by teachers in class, the real-time viewer count in live teaching, and the frequency of interactive information.
In a possible implementation manner, the receiving a search request and acquiring search information in the search request includes at least one of the following:
receiving a voice search request, recognizing the voice search request, and taking a voice recognition result as the search information;
receiving a picture search request, identifying the picture search request, and taking a picture identification result as the search information;
receiving a character search request, identifying the character search request, and taking a character identification result as the search information.
In a possible embodiment, the determining, from a plurality of learning videos based on the search information, at least one video segment matching the search information includes:
determining target text information matched with the search information based on the text information of the plurality of learning videos, wherein the learning videos comprise live videos and recorded broadcast videos;
taking the video frame corresponding to the target text information as a target video frame;
and taking the video segment where the target video frame is located as the at least one video segment matched with the search information.
In one possible embodiment, the text information of the learning video includes at least one of:
the learning video processing method comprises the steps of obtaining first text information by recognizing voice information in a learning video, obtaining second text information by recognizing character information in the learning video, obtaining third text information by recognizing picture information in the learning video, and obtaining fourth text information by recognizing characteristic information in the learning video.
In one possible embodiment, the text information of the learning video is obtained by at least one of:
when the learning video is a live video, generating the text information in real time;
and when the learning video is a recorded and broadcast video, generating the text information according to at least one of picture information, character information, voice information and feature information in the recorded and broadcast video.
In a possible implementation, the determining, based on the text information of the learning video, target text information that matches the search information includes:
determining a first keyword in text information of the learning video;
determining a second keyword which has an association relation with the first keyword;
and determining target text information matched with the search information based on the first keyword and the second keyword.
In one possible embodiment, presenting the at least one video segment as a search result on a user-side interface includes:
and displaying the at least one video clip on a user terminal interface according to a set preview mode.
In one possible embodiment, presenting the at least one video segment as a search result on a user-side interface includes:
and displaying the at least one video clip on a user terminal interface according to a set arrangement mode.
In one possible embodiment, after determining at least one video segment matching the search information from the plurality of learning videos, the method further includes:
storing the corresponding relation between the search request and the at least one video clip;
after receiving other search requests, detecting whether search requests matched with the other search requests exist in the stored corresponding relation;
and displaying the corresponding video clips of the search requests matched with the other search requests on a user side interface as search results.
In a possible embodiment, the presenting the at least one video segment as a search result on a user-side interface includes:
and displaying the at least one video segment on a user side interface as a search result, and marking the source and the start time of each video segment, so that after the mark corresponding to the source of any video segment is triggered, the learning video from which that video segment originates is displayed on the user side interface.
In a second aspect, an embodiment of the present disclosure further provides a video search apparatus, including:
the receiving module is used for receiving a search request and acquiring search information in the search request;
the first determining module is used for acquiring user attribute information and determining a target segmentation rule matched with the user attribute information;
the second determining module is used for determining at least one video segment which is segmented based on the target segmentation rule and matched with the search information from a plurality of video segments corresponding to a plurality of learning videos based on the search information; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance;
and the display module is used for displaying the at least one video clip as a search result on the user side interface.
In a possible implementation manner, the first determining module, when determining the target segmentation rule matching the user attribute information, is configured to:
determining target statistical information matched with the user attribute information;
and determining a target segmentation rule for segmenting based on the target statistical information.
In one possible embodiment, the user attribute information includes at least one of class, grade, age, subject preference, teacher preference, segment length preference;
the target statistical information comprises at least one of: the user's historical answer records, keywords in the sentences commonly used by teachers in class, the real-time viewer count in live teaching, and the frequency of interactive information.
In a possible implementation manner, when receiving a search request and acquiring search information in the search request, the receiving module is configured to:
receiving a voice search request, recognizing the voice search request, and taking a voice recognition result as the search information; and/or
receiving a picture search request, identifying the picture search request, and taking a picture identification result as the search information; and/or
receiving a character search request, identifying the character search request, and taking a character identification result as the search information.
In one possible embodiment, the second determining module, when determining, based on the search information, at least one video segment matching the search information from a plurality of learning videos, is configured to:
determining target text information matched with the search information based on the text information of the plurality of learning videos, wherein the learning videos comprise live videos and recorded broadcast videos;
taking the video frame corresponding to the target text information as a target video frame;
and taking the video segment where the target video frame is located as the at least one video segment matched with the search information.
In one possible embodiment, the text information of the learning video includes at least one of:
the learning video processing method comprises the steps of obtaining first text information by recognizing voice information in a learning video, obtaining second text information by recognizing character information in the learning video, obtaining third text information by recognizing picture information in the learning video, and obtaining fourth text information by recognizing characteristic information in the learning video.
In a possible implementation manner, the second determining module is configured to obtain text information of the learning video according to the following steps:
when the learning video is a live video, generating the text information in real time;
and when the learning video is a recorded and broadcast video, generating the text information according to at least one of picture information, character information, voice information and feature information in the recorded and broadcast video.
In one possible implementation, the second determining module, when determining the target text information matching the search information based on the text information of the learning video, is configured to:
determining a first keyword in text information of the learning video;
determining a second keyword which has an association relation with the first keyword;
and determining target text information matched with the search information based on the first keyword and the second keyword.
In one possible embodiment, the presentation module, when presenting the at least one video segment as a search result on the user-side interface, is configured to:
and displaying the at least one video clip on a user terminal interface according to a set preview mode.
In one possible embodiment, the presentation module, when presenting the at least one video segment as a search result on the user-side interface, is configured to:
and displaying the at least one video clip on a user terminal interface according to a set arrangement mode.
In one possible embodiment, after determining at least one video segment matching the search information from the plurality of learning videos, the presentation module is further configured to:
storing the corresponding relation between the search request and the at least one video clip;
after receiving other search requests, detecting whether search requests matched with the other search requests exist in the stored corresponding relation;
and displaying the corresponding video clips of the search requests matched with the other search requests on a user side interface as search results.
In one possible embodiment, the presentation module, when presenting the at least one video segment as a search result on the user-side interface, is configured to:
and displaying the at least one video segment on a user side interface as a search result, and marking the source and the start time of each video segment, so that after the mark corresponding to the source of any video segment is triggered, the learning video from which that video segment originates is displayed on the user side interface.
In a third aspect, an embodiment of the present disclosure further provides a computer device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the computer device is running, the machine-readable instructions when executed by the processor performing the steps of the first aspect described above, or any possible implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, performs the steps in the first aspect or any one of the possible implementation manners of the first aspect.
According to the video searching method, the video searching device, the computer equipment and the storage medium, the searching request is received, and the searching information in the searching request is obtained; acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information; after the corresponding target segmentation rule is determined based on the user attribute information, a plurality of video segments corresponding to the target segmentation rule can be obtained, then matching is carried out based on the search information carried in the search request, and the video segments obtained after matching are used as search results to be displayed on a user side interface. Therefore, the video segments matched with the search information can be searched from the plurality of video segments corresponding to the target segmentation rule, the searched video segments are pre-segmented based on the target segmentation rule matched with the attribute information, the user can search the video segments matched with the search information and the user attribute information, further personalized search of the video segments can be realized, the interest of the user in viewing the video segments is improved, and further the learning efficiency is improved.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings incorporated in and forming a part of the specification illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. The following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art may derive other related drawings from them without creative effort.
Fig. 1 shows a flowchart of a video search method provided by an embodiment of the present disclosure;
fig. 2a is a schematic diagram illustrating a user searching page in the video searching method according to the embodiment of the disclosure;
fig. 2b is a schematic diagram illustrating a target display area of a learning video in a video search method provided by an embodiment of the present disclosure;
fig. 2c is a schematic diagram illustrating a display page of a search result in the video search method provided by the embodiment of the disclosure;
fig. 2d is a schematic diagram illustrating another display page of search results in the video search method provided by the embodiment of the present disclosure;
fig. 2e is a schematic diagram illustrating a preview page of a target video frame in the video search method provided by the embodiment of the disclosure;
fig. 2f is a schematic diagram illustrating a user-side interface including a tagged result in the video search method according to the embodiment of the disclosure;
fig. 3 is a flowchart illustrating a specific method for segmenting a learning video into a plurality of video segments in a video search method provided by an embodiment of the present disclosure;
fig. 4 is a flowchart illustrating another specific method for segmenting a learning video into a plurality of video segments in the video search method provided by the embodiment of the present disclosure;
fig. 5 is a flowchart illustrating a specific method for determining a video segment in at least one learning video matching search information in a video search method provided by an embodiment of the present disclosure;
fig. 6 is a schematic diagram illustrating an architecture of a video search apparatus provided in an embodiment of the present disclosure;
fig. 7 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The term "and/or" herein merely describes an associative relationship, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
Research shows that when a user wants to watch specific content in a particular video, the user typically must first find the video containing that content through searching or other means, and then locate the desired content by dragging the progress bar or the like; the whole process is overly tedious, degrading user experience and learning efficiency.
Based on the research, the present disclosure provides a video search method, apparatus, computer device, and storage medium, by receiving a search request and obtaining search information in the search request; acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information; after the corresponding target segmentation rule is determined based on the user attribute information, a plurality of video segments corresponding to the target segmentation rule can be obtained, then matching is carried out based on the search information carried in the search request, and the video segments obtained after matching are used as search results to be displayed on a user side interface. Therefore, the video segments matched with the search information can be searched from the plurality of video segments corresponding to the target segmentation rule, the searched video segments are pre-segmented based on the target segmentation rule matched with the attribute information, the user can search the video segments matched with the search information and the user attribute information, further personalized search of the video segments can be realized, the interest of the user in viewing the video segments is improved, and further the learning efficiency is improved.
To facilitate understanding of the present embodiment, the video search method disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the video search method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability, for example a terminal device or a server; the terminal device may be a smart terminal device with a display function, such as a smartphone, a tablet computer, or a smart wearable device. In some possible implementations, the video search method may be implemented by a processor calling computer-readable instructions stored in a memory.
Referring to fig. 1, which is a flowchart of a video search method provided in the embodiment of the present disclosure, the method includes steps S101 to S103, where:
s101: receiving a search request and acquiring search information in the search request.
S102: acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information.
S103: determining at least one video segment which is segmented based on the target segmentation rule and matched with the search information from a plurality of video segments corresponding to a plurality of learning videos based on the search information; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance.
S104: and displaying the at least one video clip on a user terminal interface as a search result.
Each step and the corresponding implementation method in the embodiments of the present disclosure will be described in detail below.
For S101, if the execution subject is a terminal device, the search request may be generated after the user side responds to a user operation; if the execution subject is a server, receiving the search request may mean receiving a search request sent by a user side in response to a user operation. Receiving the search request and obtaining the search information in it may mean that, after the search request is received, the content carried in the request is recognized to obtain a recognition result, which serves as the search information. The recognition result may be a text recognition result obtained by recognizing text, i.e., the text content; or a speech recognition result obtained by recognizing speech, i.e., the text content corresponding to the speech; or an image recognition result obtained by recognizing an image, i.e., the names of objects contained in the image and the text content contained in the image.
For example, taking the application scenario as an educational Application (APP) (of course, an applet embedded in the APP, a public account, an H5 page, a web page link, or a web landing page may also be used), the search page of the user side may be as shown in fig. 2a. In fig. 2a, the user may type the search information into a preset area, or trigger a voice entry button to start the voice acquisition device of the terminal device and enter the search information by voice, or use a picture entry option (e.g., a scan feature, a camera, etc.) to start the picture acquisition device of the terminal device and enter the search information as a picture.
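By way of illustration only, the following Python sketch shows one way the three request modalities could each be reduced to text serving as the search information. The request model and the recognizer stubs are assumptions for illustration; the disclosure only requires that voice, picture and character requests each yield a recognition result used as the search information.

```python
# A minimal sketch of reducing any supported request type to plain-text search
# information. SearchRequest and the recognizer stubs are illustrative assumptions.
from dataclasses import dataclass
from typing import Union

@dataclass
class SearchRequest:
    kind: str                    # "text" | "voice" | "picture"
    payload: Union[str, bytes]   # typed text, or raw audio/image bytes

def recognize_speech(audio: bytes) -> str:
    raise NotImplementedError    # a real system would call an ASR engine here

def recognize_image(image: bytes) -> str:
    raise NotImplementedError    # a real system would run OCR / object recognition

def extract_search_info(request: SearchRequest) -> str:
    """Dispatch on request kind and return the recognition result as search information."""
    if request.kind == "text":
        return request.payload.strip()
    if request.kind == "voice":
        return recognize_speech(request.payload)
    if request.kind == "picture":
        return recognize_image(request.payload)
    raise ValueError(f"unsupported request kind: {request.kind}")
```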
S102: acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information.
In order to provide personalized search experience for different users, a plurality of learning videos can be segmented in advance based on a plurality of segmentation rules to obtain a plurality of video segments, and then the video segments matched with the search information can be ensured to be related to the attribute information of the users by determining the target segmentation rule matched with the attribute information of the users in the plurality of segmentation rules, so that the video search result is more in line with the actual search intention and requirements of the users.
Illustratively, suppose the user attribute information is a segment length preference of, say, 5 minutes. When the plurality of learning videos are pre-segmented, they may be segmented on the basis of 3 minutes, 4 minutes and 5 minutes to obtain video segments, and the target segmentation rule matched with the user attribute information is the 5-minute one. By matching the user's segment length preference, the video segments finally displayed to the user are those obtained by segmenting the learning videos on a 5-minute basis, and thus better conform to the user's preference.
In a possible implementation manner, when determining a target segmentation rule matched with the user attribute information, target statistical information matched with the user attribute information may be determined; and determining a target segmentation rule for segmenting based on the target statistical information.
Wherein the user attribute information comprises at least one of class, grade, age, subject preference, teacher preference, segment length preference; the target statistical information comprises at least one of historical answer conditions of the user, keywords in commonly used sentences of teachers in the class, the number of real-time watching persons in live teaching and the frequency of interactive information.
Here, the teacher preference indicates which teacher the user prefers; the segment length preference indicates the length of video segment the user prefers; the historical answer record represents how the user answered the questions corresponding to each knowledge point, e.g., an answer accuracy of 75% for knowledge point A; the keywords in the teacher's common sentences in class may be words the teacher uses to signal switching between knowledge points, such as "next"; and the frequency of interactive information represents the intensity of teacher-student interaction in live teaching, e.g., a bullet-comment frequency of 15 messages/minute during minutes 9 to 10 of a live lesson.
Illustratively, if the user attribute information is age 8, it may be determined that the matched target segmentation rule is such that each segmented video segment contains 1 knowledge point with a low accuracy; and if the user attribute information is that the age is 15 years, the matched target segmentation rule can be determined to be that each segmented video segment contains 2 knowledge points with lower accuracy, so that the video segments matched for users with smaller ages are shorter and more accord with education and teaching habits (the attention of the users with smaller ages is difficult to concentrate for a long time, and if the duration of the video segments is too long, a better teaching effect is difficult to generate).
For example, if the user attribute information is class and the class is class 3 and class 2, it may be determined that the matched target segmentation rule is to perform segmentation based on a keyword in a teacher common sentence in class 3 and class 2.
Illustratively, if the user attribute information indicates that the subject preference is math, the matched target segmentation rule may segment video content watched by more than 10 real-time viewers in a math teaching live broadcast into one video segment. If the learning video to be segmented is a Chinese teaching live broadcast, then because the user's subject preference is not Chinese, the real-time viewer threshold may be raised to 15. That is, the real-time viewer count used as the basis for segmenting the learning video is related to the subject preference, so that users who like math are favored: the math segments displayed in their search results are longer and better meet their needs.
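A minimal sketch, in the spirit of the examples above, of mapping user attribute information to a target segmentation rule follows. The rule identifiers, attribute keys and thresholds are assumptions for illustration and are not prescribed by the disclosure.

```python
# A sketch, under assumed rule names, of matching user attribute information
# to a pre-computed segmentation rule. Identifiers and thresholds are illustrative.

def match_segmentation_rule(attrs: dict) -> str:
    """Return the identifier of the pre-computed segmentation matching the user."""
    if "segment_length_preference_min" in attrs:
        # Prefer the pre-segmented variant closest to the preferred duration.
        return f"duration_{attrs['segment_length_preference_min']}min"
    if attrs.get("age", 99) <= 10:
        # Younger users: shorter segments, one low-accuracy knowledge point each.
        return "one_knowledge_point_per_segment"
    if "class" in attrs:
        # Segment on the class teacher's habitual transition phrases.
        return f"teacher_keywords_{attrs['class']}"
    return "default"

print(match_segmentation_rule({"age": 8}))        # one_knowledge_point_per_segment
print(match_segmentation_rule({"class": "3-2"}))  # teacher_keywords_3-2
```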
In some possible embodiments, the obtained user attribute information may be multiple, and at this time, the search result may be determined according to the accuracy requirement, for example, multiple video segments may be obtained based on segmentation rules corresponding to the multiple user attribute information (where each segmentation rule may obtain multiple video segments), and an intersection, a part, a union, and the like of the video segments are taken, which is not described herein again.
For example, if the user attribute information indicates that the subject preference is math, the grade is grade three, the teacher preference is teacher XXX, and the segment length preference is 3 minutes, a matching target segmentation rule may be determined as follows (a sketch of the combination step follows this list):
1. respectively determine N math video segments, M grade-three teaching video segments, K teacher-XXX video segments, and L video segments of 3 minutes in length;
2. determine the final video segment(s) according to a processing principle, for example taking the intersection or union of all the video segments, or of a subset of them.
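The combination step in item 2 above can be sketched as follows; segments are modelled as (video_id, start, end) tuples and the policy names are illustrative assumptions.

```python
# A minimal sketch of combining the candidate segments produced by several
# matched rules. The intersection/union policy is chosen by the accuracy
# requirement, as described above; all names here are illustrative.

def combine(candidates: list[set], policy: str = "union") -> set:
    if not candidates:
        return set()
    if policy == "intersection":   # strict: a segment must satisfy every matched rule
        out = candidates[0].copy()
        for s in candidates[1:]:
            out &= s
        return out
    out = set()                    # lenient: a segment may satisfy any matched rule
    for s in candidates:
        out |= s
    return out

math_clips = {("v1", 0, 180), ("v2", 60, 240)}     # (video_id, start_s, end_s)
teacher_clips = {("v1", 0, 180)}
print(combine([math_clips, teacher_clips], policy="intersection"))  # {("v1", 0, 180)}
```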
It should be noted that the segmentation process of the plurality of learning videos based on the plurality of segmentation rules will be described in detail below, and will not be further described herein.
In practical applications, the learning video may be a recorded broadcast video associated with the user terminal, such as the recording of a live lesson the user side participated in. Alternatively, the learning video may be a lecture video recorded separately by the teacher, such as a paid sixth-grade math series containing 10 lecture videos. Or the learning video may be a live video, i.e., a video cached or recorded in time slices while the live broadcast is still in progress: for example, when the current live lesson has run for 20 minutes, the first 20 minutes have been recorded but recording continues and no complete recorded broadcast video has yet been generated; the live video may then be, say, minutes 10 to 15 of the lesson (the live broadcast not having ended).
That is, a live video may be converted in real time; for example, it may be stored in segments according to the elapsed duration of the live broadcast, such as one segment per fixed interval (2, 5, 10 or 30 minutes). Thus, even if a recording problem occurs in one time slot, the file content of the other time slots is not affected.
Furthermore, for one live broadcast, multiple recordings can be performed. The multiple recording mode can include two modes, the first mode is a mode of adopting multiple processes and the same time interval. For example, suppose there are 2 processes, each process is to store a live video at an interval of 5 minutes, and store 2 identical files in 2 different storage spaces, delete duplicate files after the live broadcast is completed, and keep a complete file, thereby ensuring data security; the second is a multi-process, different time interval approach. For example, suppose there are 2 processes, the first process is to store a live video every 5 minutes, the second process is to store a live video every 3 minutes, and store 2 identical files in 2 different storage spaces, delete duplicate files after the whole live broadcast is completed, and keep a complete file. Compared with the first mode, the data security can be further improved.
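A minimal sketch of the second (multi-process, different-interval) recording scheme follows. Real capture is replaced by a stub; the stream name, paths and intervals are assumptions, and the point illustrated is the independent slice clocks, separate storage paths, and post-broadcast deduplication described above.

```python
# A sketch of multi-process live recording with different slice intervals.
import multiprocessing as mp
import time

def record_slices(stream_id, interval_s, storage_dir, stop):
    """Persist the live stream as fixed-length slices until told to stop."""
    index = 0
    while not stop.is_set():
        time.sleep(interval_s)  # stand-in for capturing interval_s of video
        path = f"{storage_dir}/{stream_id}_{interval_s}s_{index:04d}.ts"
        print(f"saved slice {path}")  # stand-in for flushing the buffered slice
        index += 1

if __name__ == "__main__":
    stop = mp.Event()
    # Two independent recorders (seconds stand in for the 5-minute / 3-minute
    # intervals in the description) writing to two different storage spaces.
    procs = [
        mp.Process(target=record_slices, args=("live42", 5, "/tmp/store_a", stop)),
        mp.Process(target=record_slices, args=("live42", 3, "/tmp/store_b", stop)),
    ]
    for p in procs:
        p.start()
    time.sleep(12)  # stand-in for the live broadcast running
    stop.set()      # broadcast ended: stop both recorders
    for p in procs:
        p.join()
    # A post-processing step would then merge the slices, delete the redundant
    # copy, and keep one complete file, as the description above notes.
```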
In practical application, a recorded broadcast video or lecture video often lasts more than 40 minutes. Therefore, when a user wants to watch certain target content in a learning video, the user has to find the recorded/lecture video containing it through watch history or the like, and then locate the target content within the long video by dragging the progress bar or similar means.
When the learning video is segmented, the server may segment the learning video into a plurality of video segments in advance, and then send the video segments to the corresponding user side, and after the user side initiates a search request, the user side can directly perform matching on the corresponding search result locally; or, after receiving the search request, the server may search on the server side, and send the matched video segment to the corresponding user side for presentation.
In a specific implementation, when the learning video is divided, any one of the following methods may be used:
method 1, segmenting the learning video based on the statistical information.
Here, the statistical information includes at least one of: knowledge points, keywords in each teacher's commonly used sentences, the segment durations each user watches most frequently, the real-time viewer count in live teaching, and the frequency of interactive information.
Specifically, when the learning video is segmented based on knowledge points, one video segment may be cut for each knowledge point. If one video segment is later required to include two knowledge points, aggregation may be performed according to the user attribute information: for example, if the user's segment length preference is 5 minutes, the two adjacent video segments whose combined length is closest to 5 minutes are spliced into one video segment containing two knowledge points.
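A sketch of that aggregation step follows; segments are modelled as (start, end) pairs in seconds, one knowledge point each, and the function name is an assumption.

```python
# A sketch of splicing adjacent single-knowledge-point segments so that the
# combined length is closest to the user's preferred segment length.

def splice_to_preference(segments: list[tuple], preferred_s: int) -> tuple:
    """segments: consecutive (start_s, end_s) pairs, one knowledge point each.
    Returns the adjacent pair whose combined length is closest to the
    preference, spliced into one (start_s, end_s) segment."""
    best, best_gap = None, float("inf")
    for a, b in zip(segments, segments[1:]):
        combined = b[1] - a[0]
        gap = abs(combined - preferred_s)
        if gap < best_gap:
            best, best_gap = (a[0], b[1]), gap
    return best

clips = [(0, 160), (160, 340), (340, 430)]  # three knowledge points
print(splice_to_preference(clips, 300))     # -> (160, 430): 270 s, closest to 5 min
```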
Specifically, when the learning videos are segmented based on keywords in teachers' common sentences, each learning video may be segmented according to the words each teacher habitually uses to signal knowledge point switching, drawn from a preset teacher database, yielding segmented video segments corresponding to each teacher. Subsequent operations then only need to match the corresponding teacher based on the user attribute information to directly determine that teacher's video segments.
Specifically, when the learning video is segmented based on the segment durations users watch most frequently, the learning video may be segmented on the basis of several statistically popular segment durations, producing video segments for each duration. The duration closest to the segment length preference in the user attribute information is then selected, yielding the video segments corresponding to the user's segment length preference.
Specifically, when the learning video is segmented based on the real-time viewer count in the live lesson, the viewer count may be pre-divided into several tiers (such as 0-10, 10-20 and 20-30), so that the learning video can be segmented into video segments corresponding to each tier. A target tier can then be set according to the subject preference in the user attribute information (for example, the target tier for users with a matching subject preference may be set to 10-20, and to 20-30 for users without one), so that users without the subject preference are shown more of the video segments likely to interest them, raising their interest in learning.
Specifically, when the learning video is segmented based on the frequency of interactive information, the frequency may be pre-divided into several tiers (for example, 0-5, 5-10 and 10-15 messages/minute), so that the learning video can be segmented into video segments corresponding to each tier. A target tier can then be set according to the subject preference in the user attribute information (for example, 0-5 messages/minute for users with a matching subject preference and 5-10 messages/minute for users without one), so that users without the subject preference are shown more of the video segments likely to interest them, raising their interest in learning.
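The tiering just described can be sketched as follows; the tier boundaries are the illustrative ones from the text, and the function names are assumptions.

```python
# A sketch of tiering interaction frequency (messages per minute) and choosing
# a target tier from whether the user's subject preference matches the video.

TIERS = [(0, 5), (5, 10), (10, 15)]  # messages per minute, from the text above

def tier_of(freq_per_min: float) -> int:
    """Return the index of the tier containing the given frequency."""
    for i, (lo, hi) in enumerate(TIERS):
        if lo <= freq_per_min < hi:
            return i
    return len(TIERS) - 1  # clamp anything above the top boundary

def target_tier(subject_matches: bool) -> int:
    # Matched preference -> the calmer 0-5/min tier; unmatched -> the livelier
    # 5-10/min tier, surfacing segments more likely to spark interest.
    return 0 if subject_matches else 1

print(tier_of(7.5))        # -> 1
print(target_tier(False))  # -> 1
```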
For example, taking the statistical information as knowledge points and the learning video as a recorded lecture video: after finishing recording and before uploading the lecture video, the teacher may segment the recording by knowledge point, marking and cutting at the boundaries between knowledge points to obtain a plurality of video segments, each corresponding to one knowledge point. One or more tags may be added to each knowledge point; tag content may include characters, symbols and the like, e.g., "trigonometric function", "teacher XX", "junior high school math competition", "Beijing middle school exam".
If the video is a live video, it can be marked per time slice; if a knowledge point spans several time slices, its parts can be marked as knowledge point 1-1, knowledge point 1-2, etc., and merged into one video segment after recording finishes. Of course, when several knowledge points fall within the same time slice, the recorded-broadcast approach above can be applied directly, which is not repeated here.
For example, taking the duration of the lecture video as 40 minutes, after the recording is completed, the teacher may divide the video into 5 segments according to the knowledge points corresponding to the content in the lecture video, where the 5 segments correspond to the 2 nd to 8 th minutes, the 9 th to 17 th minutes, the 17 th to 20 th minutes, the 20 th to 27 th minutes, and the 28 th to 35 th minutes of the original lecture video, and the corresponding knowledge points are, in order, knowledge point 1, knowledge point 2, knowledge point 3, knowledge point 4, and knowledge point 5. Of course, the remaining 35-40 minutes may not correspond to any knowledge point, and may be deleted or otherwise marked for storage during uploading, which is not described in detail.
Therefore, by segmenting the learning video according to the statistical information, the video can be effectively cut into a plurality of meaningful video segments; subsequently determining, from the statistical information, target statistical information matching the user attribute information makes the resulting video segments more suitable for the user at the user side, improving the segmentation effect.
And 2, segmenting the learning video based on the text information.
Here, segmentation may be performed based on the text information of the learning video. Although a learning video itself does not contain text information directly, it contains audio and video frames, and text information corresponding to the learning video can be obtained indirectly by recognizing the speech or the video frames.
In the case that the learning video is a recorded video, the text information may be obtained in at least one of the following ways:
mode 1 is first text information obtained by recognizing voice information.
Here, the Speech in the recorded video may be automatically converted into the first text information by an Automatic Speech Recognition (ASR) technique. The voice information here may refer to voice information of a lecturer in a learning video, voice information of a student in an interaction, voice information in a comment, voice information of media information played by the lecturer, and the like.
And the mode 2 is second text information obtained by identifying the character information.
Here, the text content in the recorded video may be recognized through Optical Character Recognition (OCR), so as to obtain the second text information corresponding to the text content. The text information herein may refer to all text information displayed in the recorded video, such as text in a presentation (PowerPoint, PPT), a name of a live broadcast room, a student utterance, and so on.
And the mode 3 is third text information obtained by identifying the picture information.
Here, the third text information may be text contained in a picture, recognized through OCR or similar technology; or it may be text describing the objects in the picture, obtained by running object recognition on the picture with a trained neural network. The picture information here may refer to all image information displayed in the recorded video, such as images in the PPT, and so on.
For example, the content in the picture is a cart and a ball, and after the trained neural network performs object recognition on the picture, it can be recognized that the objects in the picture are a "cart" and a "ball", and the third text information corresponding to the picture is a "cart" and a "ball".
And the mode 4 is fourth text information obtained by recognizing the feature information.
Here, the feature information may be information about lecturers, classes, comments, disciplines, grades, regions, and the like, for example: "XX (name)", "junior high physics teacher", "student ID: 123DFG", "grade-three Chinese", "Beijing physics", "stable feeling trigonometric function", "wonderful idiom story spoken by the young teacher", and the like.
Specifically, when the feature information in the learning video is identified, the feature information may be labeled manually, so as to obtain fourth text information corresponding to the feature information.
In some possible embodiments, corresponding tag information may be added to each divided video segment, and the tag information may include the four pieces of text information, or similar information including the four pieces of text information, or cluster information including the four pieces of text information, and the like, which is not described herein again.
In addition, when the learning video is a live video, corresponding text information (at least one of the first text information, the second text information, the third text information, and the fourth text information) can be generated for the live video in real time through at least one of the above manners, for example, when the teaching live video is carried out for the 20 th minute, the first text information corresponding to the live video in the first 19 minutes can be generated.
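As a concrete illustration of mode 2 above, the following sketch extracts the second text information from an exported video frame with OCR. It assumes the pytesseract binding to the Tesseract engine and a frame already saved as an image; the disclosure does not prescribe any particular OCR engine, and the language setting is an assumption.

```python
# A sketch of mode 2: second text information via OCR on an exported frame.
from PIL import Image
import pytesseract

def ocr_frame(image_path: str) -> str:
    # lang="chi_sim" targets simplified-Chinese slides; adjust per deployment.
    return pytesseract.image_to_string(Image.open(image_path), lang="chi_sim")

# Hypothetical usage on one exported frame:
# print(ocr_frame("frame_0420.png"))
```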
Specifically, when the learning video is segmented based on the text information, any one of the following methods may be used:
and 2A, segmenting the learning video based on the change condition of the text information.
Here, as shown in fig. 3, the learning video may be segmented into a plurality of video segments by:
s301: and determining a target display area in the learning video, wherein the target display area is used for displaying learning resources.
Here, the target display area may be a predetermined area for displaying the learning resource, determined according to position parameters set in advance for it; the learning resource may be electronic courseware corresponding to the learning video (such as courseware displayed as Word, PDF or PPT documents), or physical teaching material (such as books and test papers).
For example, taking the learning video as a recorded and broadcast video as an example, a target display area of the learning video may be as shown in fig. 2b, the learning video includes a plurality of display areas, the target display area is an area for displaying teaching resources, and the first display area may be used for displaying a video live broadcast picture including a teacher image acquired by a teacher end; the second display area can be used for displaying character interaction information in the teaching live broadcast, such as character information sent by a student participating in the teaching live broadcast; the third display area can be used for displaying attribute information of a student end participating in the live teaching, such as identification information and a nickname of the student end.
S302: and segmenting the learning video into a plurality of video segments based on the change condition of the content displayed in the target display area in the learning video.
Here, when the learning video is divided, the learning video may be divided into a plurality of video segments according to the content displayed in the target display area, so that the content displayed in the target display area in each video segment is the same.
Specifically, the content displayed in the target display area can be identified through an OCR technology, the characters displayed in the target display area are determined, and then the video frames with the same identified characters are segmented into the same video segment; or, the change condition between the pixel values of two adjacent video frames can be calculated through an interframe difference algorithm, and then the adjacent video frames which are not changed are segmented into the same video segment.
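The inter-frame difference idea can be sketched as follows with OpenCV: sample frames from the target display area and start a new segment whenever the mean absolute pixel change exceeds a threshold. The threshold, sampling step and region coordinates are illustrative assumptions.

```python
# A sketch of segmenting by pixel change in the target display area.
import cv2
import numpy as np

def segment_by_region_change(video_path, region, threshold=12.0, step=30):
    x, y, w, h = region                       # target display area in pixels
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    boundaries, prev, idx = [0.0], None, 0    # segment start times in seconds
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:                   # sample roughly once per `step` frames
            crop = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2GRAY)
            if prev is not None and np.abs(crop.astype(np.int16) - prev).mean() > threshold:
                boundaries.append(idx / fps)  # displayed content changed: new segment
            prev = crop
        idx += 1
    cap.release()
    return boundaries
```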
And 2B, segmenting the learning video based on the keywords in the text information.
Here, in the case where the text information includes the first text information, as shown in fig. 4, the learning video may be segmented into a plurality of video segments by:
s401: and matching the first text information with a preset keyword library, and determining a target keyword contained in the first text information, wherein a plurality of keywords used for representing knowledge point switching are stored in the preset keyword library.
Here, the keywords in the keyword library may be preset, such as "next question", "next page", "next chapter" and the like. Keywords may also be obtained by recognizing teachers' habitual sentences: for example, after explaining a question a teacher may habitually say "Good, now let's look at the next question", indicating that the current question has been fully explained and the next one is about to begin. By recognizing the sentences a teacher commonly uses in live teaching, words such as "next" that the teacher frequently uses to signal knowledge point switching can be identified, so that video segments are cut according to each teacher's teaching habits, improving segmentation accuracy.
S402: and segmenting the learning video based on the timestamp corresponding to the target keyword to obtain a plurality of video segments.
For example, taking the learning video as a recorded broadcast video with a duration of 45 minutes as an example, if the timestamps corresponding to the target keywords are the 10 th minute, the 17 th minute, the 25 th minute and the 33 th minute after the start of the teaching live broadcast, 4 video segments can be generated, and the content of each video segment is the video content corresponding to the 10 th to 17 th minute, the 17 th to 25 th minute, the 25 th to 33 th minute and the 33 th to 45 th minute of the recorded broadcast video in sequence.
It should be noted that, in the period from the start of the recorded video to the 10 th minute, since the corresponding keyword is not matched, the content of the video may not be segmented at this time; alternatively, the segment of video content may be segmented in combination with other segmentation methods according to the embodiments of the present disclosure. Specifically, when video segment segmentation is performed, one or more segmentation methods provided by the embodiment of the present disclosure may be adopted according to actual needs, and the embodiment of the present disclosure does not limit this.
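A short sketch of turning matched keyword timestamps into video segments, reproducing the 45-minute example above, follows; as noted, the unmatched opening span is left for another segmentation method to handle.

```python
# A sketch converting keyword timestamps (in minutes) into segment boundaries.

def segments_from_keywords(cut_points_min, total_min):
    points = sorted(cut_points_min) + [total_min]
    return [(points[i], points[i + 1]) for i in range(len(points) - 1)]

print(segments_from_keywords([10, 17, 25, 33], 45))
# -> [(10, 17), (17, 25), (25, 33), (33, 45)]
```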
And 3, segmenting the learning video based on the segmentation instruction.
Here, when the video segment is cut, a cutting instruction input by a user may be received, and the learning video may be cut into a plurality of video segments based on the cutting instruction.
Specifically, in the live teaching process, a teacher can correspondingly input a segmentation instruction after each teaching content is finished according to the teaching content planned in advance in a teaching plan, so that after the live teaching process is finished, a complete recorded broadcast video can be segmented into a plurality of video segments according to the time corresponding to the segmentation instruction.
Illustratively, the teacher's lesson plan may be to explain 8 questions in the live broadcast, each estimated at 5 minutes. After finishing one question, the teacher may trigger a preset marking button to generate a segmentation instruction, so that after the live lesson ends, the complete recorded broadcast video can be segmented into 8 video segments according to the times of the segmentation instructions.
In addition, the segmentation instruction may also come from marks that a student makes on the live teaching content according to the student's own learning situation. For example, when the teacher is explaining question 2 and, based on the historical answer records, the student at the current student end is identified as having answered that question incorrectly, the student may be prompted through a pop-up window that the question being explained is one the student answered incorrectly; a mark button is displayed in the pop-up window and the student is prompted to trigger it, and after the student triggers the mark button, a segmentation instruction is generated and the pop-up window is closed. Furthermore, if the student does not trigger the mark button within a preset time period, the student is considered to be distracted during the live lesson, and corresponding information may be sent to the parent's end or teaching assistant's end associated with the student end, so as to urge the student to complete the live teaching course attentively.
S103: determining, based on the search information, at least one video segment that was segmented under the target segmentation rule and that matches the search information, from among a plurality of video segments corresponding to a plurality of learning videos; the video segments corresponding to the learning videos are obtained by segmenting each learning video in advance under a plurality of segmentation rules.
Specifically, after a learning video has been segmented into video segments by the methods above, video segments corresponding to each of the segmentation rules are obtained. Once the target segmentation rule has been determined based on the user attribute information, the video segments cut under that rule are retrieved and matched against the search information carried in the search request, so that video segments matching the search information can be found among the video segments corresponding to the target segmentation rule. Because the returned segments were pre-cut under the segmentation rule matched to the user attribute information, the user retrieves video segments that match both the search information and the user attribute information, which realizes personalized video-segment search, increases the user's interest in viewing the segments, and thereby improves learning efficiency.
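As a hedged sketch of S103 (again illustrative Python, not the patent's data model), segments can be assumed to be indexed by the rule that produced them, so a search first selects the segments cut under the target rule and then matches them against the search information:

```python
from typing import Dict, List

# Assumed index: segmentation-rule id -> segments cut under that rule,
# each segment carrying the text information extracted from it.
segments_by_rule: Dict[str, List[dict]] = {
    "by_keyword": [
        {"video": "lesson_1", "span": (10, 17), "text": "inverse trigonometric function"},
        {"video": "lesson_1", "span": (17, 25), "text": "sequence limits"},
    ],
    "by_instruction": [
        {"video": "lesson_1", "span": (0, 5), "text": "warm-up questions"},
    ],
}

def search_segments(target_rule: str, search_info: str) -> List[dict]:
    """Match the search information only against segments cut under the
    target segmentation rule determined from the user attribute information."""
    candidates = segments_by_rule.get(target_rule, [])
    return [seg for seg in candidates
            if search_info.lower() in seg["text"].lower()]

print(search_segments("by_keyword", "trigonometric"))
```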
In one possible implementation, as shown in fig. 5, the video segments in the at least one learning video matching the search information may be determined by:
S501: determining target text information matching the search information based on the text information of the learning video.
Here, the text information of the learning video may be the text information obtained at the video segmentation stage; several of the four types of text information corresponding to the learning video may be recognized in the manner described above, which increases the probability of finding target text information that matches the search information (that is, it improves the search hit rate).
Specifically, when determining the target text information matching the search information, the word in the text information with the smallest edit distance to the search information may be found, for example via a maximum matching algorithm, and taken as the target text information.
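The following sketch illustrates the edit-distance criterion of S501 with a plain dynamic-programming Levenshtein distance; the patent does not fix a particular implementation, and the "maximum matching" preprocessing it mentions is omitted here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def best_match(search_info: str, candidates: list) -> str:
    """Pick the candidate text with the smallest edit distance."""
    return min(candidates, key=lambda w: edit_distance(search_info, w))

print(best_match("trigonometric function",
                 ["trigonometric functions", "derivative", "sequence limit"]))
```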
In one possible implementation, when determining target text information matching the search information based on the text information of the learning video, a first keyword in the text information of the learning video may first be determined; a second keyword associated with the first keyword is then determined; and the target text information matching the search information is determined based on both the first keyword and the second keyword.
The second keyword associated with the first keyword may be a synonym or near-synonym of the first keyword.
For example, taking the first keyword "trigonometric function" as an example, the second keyword associated with "trigonometric function" may be "trigonometric transformation", so that target text information matching the search information may be determined from both "trigonometric function" and "trigonometric transformation".
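A small illustration of this first/second-keyword expansion follows; the association table here is a hand-written stand-in, while a real system might derive associations from a thesaurus or word embeddings.

```python
# Hypothetical association table: first keyword -> associated second keywords.
ASSOCIATIONS = {
    "trigonometric function": ["trigonometric transformation", "sine", "cosine"],
}

def expand_keywords(first_keyword: str) -> list:
    """Return the first keyword together with its associated second keywords."""
    return [first_keyword] + ASSOCIATIONS.get(first_keyword, [])

def match_text(text: str, first_keyword: str) -> bool:
    """Match the text against the expanded keyword set."""
    return any(kw in text for kw in expand_keywords(first_keyword))

print(match_text("today we study the trigonometric transformation",
                 "trigonometric function"))  # -> True
```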
S502: taking the video frame corresponding to the target text information as a target video frame.
Here, the target text information may be at least one of the first text information, the second text information, the third text information, and the fourth text information.
Specifically, when the target text information includes the first text information, the corresponding video frame may be the video frame at which the voice corresponding to the first text information is uttered; when it includes the second text information, the corresponding video frame may be a video frame containing the second text information; when it includes the third text information, the corresponding video frame may be a video frame containing the content corresponding to the third text information; and when it includes the fourth text information, the corresponding video frame may be a video frame containing the content corresponding to the fourth text information.
It should be noted that, after the target video frame is determined, corresponding marks may also be highlighted, such as highlighting the recognized text information or labeling the voice information; details are not repeated here.
S503: taking the video segment in which the target video frame is located as the at least one video segment matching the search information.
For example, take the target text information as "inverse trigonometric function" and the segmented video segments as the video contents corresponding to the 10th-17th, 17th-25th, 25th-33rd, and 33rd-45th minutes of the recorded video. If the target video frames corresponding to "inverse trigonometric function" are located at the 13th and 16th minutes of the recorded video, the video segment in which those target video frames are located (the video content corresponding to the 10th-17th minutes of the recorded video) may be taken as the video segment matching "inverse trigonometric function".
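A small sketch of S502-S503: once target video frames are located, the containing segment can be found by a binary search over the segment start times. Boundaries follow the example above (frames at minutes 13 and 16 both fall in the 10-17 minute segment).

```python
import bisect

segments = [(10, 17), (17, 25), (25, 33), (33, 45)]  # (start, end) in minutes
starts = [s for s, _ in segments]

def segment_for(timestamp: float):
    """Return the (start, end) segment containing the timestamp, or None."""
    i = bisect.bisect_right(starts, timestamp) - 1
    if 0 <= i < len(segments) and timestamp < segments[i][1]:
        return segments[i]
    return None  # e.g. a frame before the first cut point

print(segment_for(13))  # -> (10, 17)
print(segment_for(16))  # -> (10, 17)
```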
In a possible implementation manner, after the at least one video segment matching the search information is determined from the plurality of learning videos, the correspondence between the search request and the at least one video segment may further be stored; after another search request is received, whether a stored search request matches the new request is detected from the stored correspondences; and the video segments corresponding to the matching stored search request are presented on the user side interface as search results.
Illustratively, suppose that in a historical video search the search information in the request was "trigonometric function" and the video segment corresponding to that request was video segment 1; after the search finished, the correspondence between "trigonometric function" and video segment 1 was stored in the database. When another search request is later received and its search information is detected to also be "trigonometric function", querying the stored correspondences between search information and video segments shows that the current request matches the historical one, and video segment 1 corresponding to "trigonometric function" is taken as the search result for the current request.
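A hedged sketch of this stored correspondence, with a plain dict standing in for the database mentioned in the text: a repeated search is answered from the stored mapping instead of re-running the matching.

```python
# Stored correspondences: search information -> previously matched segments.
search_cache = {}

def search_with_cache(search_info: str, do_search):
    """Serve a repeat request from the stored correspondence; otherwise run
    the full matching (S103 / S501-S503) and store its result."""
    if search_info in search_cache:        # a matching earlier request exists
        return search_cache[search_info]
    result = do_search(search_info)        # full matching
    search_cache[search_info] = result     # store the correspondence
    return result

hits = search_with_cache("trigonometric function", lambda q: ["video segment 1"])
hits_again = search_with_cache("trigonometric function", lambda q: ["(not reached)"])
print(hits_again)  # -> ['video segment 1'], served from the stored correspondence
```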
S104: displaying the at least one video segment as a search result on the user side interface.
In a possible implementation manner, if the execution subject is a terminal device, the at least one video segment may be displayed on the user side interface according to a set arrangement mode.
In addition, when a video segment is displayed, the target text information may be used as the title of the video segment; when the search results are displayed, the title of each video segment is shown, and after a title is triggered, the corresponding video segment is displayed on the user side interface.
For example, the presentation page of the search results may be as shown in fig. 2c: the search information is "inverse trigonometric function", the title of the presented video segment is the target text information "inverse trigonometric function" matching the search information, the corresponding video duration is "2 minutes 30 seconds", and after the user triggers the title of the video segment, the video segment is played on the user side interface.
In another possible implementation, if the execution subject is a server, the set preview mode and the at least one video segment may be sent to the user side, so that the at least one video segment is displayed on the user side interface according to the set preview mode.
The preview mode may be at least one of a picture, a target video frame, a frame sequence, or a video of preset duration corresponding to the video segment. For example, the target video frame in the video segment may first be sent to the user side for display, and after a trigger operation by the user on that target video frame is received, the video segment itself is sent to the user side for display.
For example, the presentation page of the search results may also be as shown in fig. 2d: in fig. 2d there are 2 video segments matching the search information "inverse trigonometric function", with titles "inverse trigonometric function" and "trigonometric function" respectively, and after the title of each video segment, the number of matching target text entries and a preview button are shown.
Further, after the user triggers the preview button, a pop-up window/floating layer may be shown on the user side interface to display the target video frame corresponding to the target text information, and by previewing the target video frames the user can determine which target video segment to watch.
For example, the preview page of the target video frame may be as shown in fig. 2e: after the user triggers the preview button, the user side page shows a floating layer; the target video frame shown in the floating layer is the 1st of the 5 video frames contained in the video segment, and a switch button in the floating layer switches the currently shown target video frame when triggered. In addition, a confirm button is shown in the floating layer; when triggered, it plays the corresponding video segment on the user side page from the timestamp of the target video frame currently shown in the floating layer.
In this way, target video frames are shown on a preview page, and only the video segment corresponding to the target video frame selected by the user is then sent and displayed after the preview. Compared with sending all of the video data to the user side, this reduces the amount of data received by the terminal device and improves its processing speed.
In a possible implementation manner, when the at least one video segment is presented as a search result on the user side interface, the source and start time of each video segment may also be marked, so that after the mark corresponding to the source of any video segment is triggered, the learning video corresponding to that video segment is presented on the user side interface.
For example, the user side interface including the marks may be as shown in fig. 2f: fig. 2f shows "video segment 1" matching the search information "inverse trigonometric function", its source "learning video 1", and its start time of 9 minutes; after the user triggers the mark of the source for the video segment, "learning video 1" can be presented on the user side interface.
In some possible embodiments, in order to improve the accuracy of the pushed video segments, after the at least one video segment is presented as a search result on the user side interface, the method may further include:
when a supplementary search request is received, obtaining supplementary search information in the supplementary search request;
determining at least one accurate video segment matched with the supplementary search information from the at least one video segment based on the supplementary search information;
and displaying the at least one accurate video segment as a search result on the user side interface.
It should be noted that the number of supplementary searches may be set according to the actual situation; for example, when the number of determined video segments exceeds a set threshold, the user may be reminded to perform a further supplementary search, or a supplementary search may be performed when the user triggers one.
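A minimal sketch of the supplementary search, assuming each previously returned segment carries its matched text (the field names are illustrative): the supplementary search information is applied only to the earlier results, narrowing them to the "accurate" subset.

```python
def supplementary_search(previous_results: list, supplementary_info: str) -> list:
    """Keep only earlier results that also match the supplementary information."""
    return [seg for seg in previous_results
            if supplementary_info.lower() in seg["text"].lower()]

first_pass = [
    {"title": "inverse trigonometric function",
     "text": "inverse trigonometric function, definition"},
    {"title": "trigonometric function",
     "text": "trigonometric function, graph"},
]
print(supplementary_search(first_pass, "inverse"))  # only the first segment remains
```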
According to the video search method provided by the embodiments of the present disclosure, a search request is received and the search information in it is obtained; user attribute information is obtained and a target segmentation rule matching the user attribute information is determined; after the target segmentation rule is determined, the video segments corresponding to that rule are obtained and matched against the search information carried in the search request, and the matching video segments are presented on the user side interface as search results. In this way, video segments matching the search information are found among the segments pre-cut under the target segmentation rule matched to the user attribute information, so the user retrieves video segments that match both the search information and the user attribute information; this realizes personalized video-segment search, increases the user's interest in viewing the segments, and thereby improves learning efficiency.
It will be understood by those skilled in the art that, in the above method, the order in which the steps are written does not imply a strict order of execution or constitute any limitation on the implementation; the specific order of execution of the steps should be determined by their functions and possible inherent logic.
Based on the same inventive concept, a video search device corresponding to the video search method is also provided in the embodiments of the present disclosure, and because the principle of solving the problem of the device in the embodiments of the present disclosure is similar to the video search method in the embodiments of the present disclosure, the implementation of the device can refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 6, a schematic diagram of the architecture of a video search apparatus provided in an embodiment of the present disclosure is shown. The apparatus includes: a receiving module 601, a first determining module 602, a second determining module 603, and a displaying module 604; wherein:
a receiving module 601, configured to receive a search request and obtain search information in the search request;
a first determining module 602, configured to obtain user attribute information, and determine a target segmentation rule matching the user attribute information;
a second determining module 603, configured to determine, based on the search information, at least one video segment that is segmented based on the target segmentation rule and matches the search information, from among multiple video segments corresponding to multiple learning videos; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance;
a displaying module 604, configured to display the at least one video segment as a search result on the user-side interface.
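Purely as an illustration of the decomposition in fig. 6 (the method bodies below are stubs, not the patent's logic), the four modules can be wired together as follows:

```python
class VideoSearchApparatus:
    """Illustrative wiring of the modules in fig. 6; all logic is stubbed."""

    def receive(self, request: dict) -> str:            # receiving module 601
        return request["search_info"]

    def pick_rule(self, user_attributes: dict) -> str:  # first determining module 602
        return "by_keyword"                             # placeholder rule choice

    def match_segments(self, rule: str, info: str) -> list:  # second determining module 603
        return []                                       # placeholder matching

    def display(self, segments: list) -> None:          # displaying module 604
        for seg in segments:
            print(seg)

apparatus = VideoSearchApparatus()
info = apparatus.receive({"search_info": "trigonometric function"})
rule = apparatus.pick_rule({"grade": 9})
apparatus.display(apparatus.match_segments(rule, info))
```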
In a possible implementation manner, the first determining module 602, when determining the target segmentation rule matching the user attribute information, is configured to:
determining target statistical information matched with the user attribute information;
and determining a target segmentation rule for segmenting based on the target statistical information.
In one possible embodiment, the user attribute information includes at least one of class, grade, age, subject preference, teacher preference, segment length preference;
the target statistical information comprises at least one of historical answer conditions of the user, keywords in commonly used sentences of teachers in the class, the number of real-time watching persons in live teaching and the frequency of interactive information.
In a possible implementation manner, when receiving a search request and acquiring the search information in the search request, the receiving module 601 is configured to:
receive a voice search request, recognize the voice search request, and take the voice recognition result as the search information; and/or
receive a picture search request, identify the picture search request, and take the picture identification result as the search information; and/or
receive a character search request, identify the character search request, and take the character identification result as the search information.
In a possible implementation, the second determining module 603, when determining, based on the search information, at least one video segment matching the search information from a plurality of learning videos, is configured to:
determining target text information matched with the search information based on the text information of the plurality of learning videos, wherein the learning videos comprise live videos and recorded broadcast videos;
Taking the video frame corresponding to the target text information as a target video frame;
and taking the video segment where the target video frame is located as the at least one video segment matched with the search information.
In one possible embodiment, the text information of the learning video includes at least one of:
the learning video processing method comprises the steps of obtaining first text information by recognizing voice information in a learning video, obtaining second text information by recognizing character information in the learning video, obtaining third text information by recognizing picture information in the learning video, and obtaining fourth text information by recognizing characteristic information in the learning video.
In a possible implementation, the second determining module 603 is configured to obtain text information of the learning video according to the following steps:
when the learning video is a live video, generating the text information in real time;
and when the learning video is a recorded and broadcast video, generating the text information according to at least one of picture information, character information, voice information and feature information in the recorded and broadcast video.
In a possible implementation, the second determining module 603, when determining the target text information matching the search information based on the text information of the learning video, is configured to:
determining a first keyword in text information of the learning video;
determining a second keyword which has an association relation with the first keyword;
and determining target text information matched with the search information based on the first keyword and the second keyword.
In a possible implementation, the presenting module 604, when presenting the at least one video segment as a search result on the user side interface, is configured to:
display the at least one video segment on the user side interface according to a set preview mode.
In a possible implementation, the presenting module 604, when presenting the at least one video segment as a search result on the user side interface, is configured to:
display the at least one video segment on the user side interface according to a set arrangement mode.
In a possible embodiment, after determining at least one video segment matching the search information from the plurality of learning videos, the presentation module 604 is further configured to:
store the correspondence between the search request and the at least one video segment;
after another search request is received, detect whether a search request matching it exists in the stored correspondences;
and present the video segments corresponding to the matching stored search request on the user side interface as search results.
In a possible implementation, the presenting module 604, when presenting the at least one video segment as a search result on the user-side interface, is configured to:
present the at least one video segment as a search result on the user side interface, and mark the source and start time of each video segment, so that after the mark corresponding to the source of any video segment is triggered, the learning video corresponding to that video segment is presented on the user side interface.
The video search apparatus provided by the embodiments of the present disclosure receives a search request and obtains the search information in it; obtains user attribute information and determines a target segmentation rule matching the user attribute information; and, after the target segmentation rule is determined, obtains the video segments corresponding to that rule, matches them against the search information carried in the search request, and presents the matching video segments on the user side interface as search results. In this way, video segments matching the search information are found among the segments pre-cut under the target segmentation rule matched to the user attribute information, so the user retrieves video segments that match both the search information and the user attribute information; this realizes personalized video-segment search, increases the user's interest in viewing the segments, and thereby improves learning efficiency.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
Based on the same technical concept, an embodiment of the present disclosure also provides a computer device. Referring to fig. 7, a schematic structural diagram of a computer device 700 provided in an embodiment of the present disclosure is shown, comprising a processor 701, a memory 702, and a bus 703. The memory 702 is used for storing execution instructions and includes an internal memory 7021 and an external memory 7022; the internal memory 7021 temporarily stores operation data for the processor 701 and data exchanged with the external memory 7022, such as a hard disk, and the processor 701 exchanges data with the external memory 7022 through the internal memory 7021. When the computer device 700 runs, the processor 701 communicates with the memory 702 through the bus 703, causing the processor 701 to execute the following instructions:
receiving a search request and acquiring search information in the search request;
acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information;
determining at least one video segment which is segmented based on the target segmentation rule and matched with the search information from a plurality of video segments corresponding to a plurality of learning videos based on the search information; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance;
and displaying the at least one video segment as a search result on the user side interface.
In a possible implementation manner, the determining, in the instructions of the processor 701, a target segmentation rule matching the user attribute information includes:
determining target statistical information matched with the user attribute information;
and determining a target segmentation rule for segmenting based on the target statistical information.
In a possible embodiment, in the instructions of the processor 701, the user attribute information includes at least one of class, grade, age, subject preference, teacher preference, and segment length preference;
the target statistical information comprises at least one of historical answer conditions of the user, keywords in commonly used sentences of teachers in the class, the number of real-time watching persons in live teaching and the frequency of interactive information.
In a possible implementation manner, in the instructions of the processor 701, the receiving a search request and obtaining search information in the search request includes at least one of:
receiving a voice search request, recognizing the voice search request, and taking a voice recognition result as the search information;
receiving a picture search request, identifying the picture search request, and taking a picture identification result as the search information;
receiving a character search request, identifying the character search request, and taking the character identification result as the search information.
In a possible implementation manner, the determining, in the instructions of the processor 701, at least one video segment matching the search information from a plurality of learning videos based on the search information includes:
determining target text information matched with the search information based on the text information of the plurality of learning videos, wherein the learning videos comprise live videos and recorded broadcast videos;
Taking the video frame corresponding to the target text information as a target video frame;
and taking the video segment where the target video frame is located as the at least one video segment matched with the search information.
In a possible implementation manner, in the instructions of the processor 701, the text information of the learning video includes at least one of the following:
first text information obtained by recognizing voice information in the learning video, second text information obtained by recognizing character information in the learning video, third text information obtained by recognizing picture information in the learning video, and fourth text information obtained by recognizing feature information in the learning video.
In a possible implementation manner, in the instructions of the processor 701, the text information of the learning video is obtained by at least one of:
when the learning video is a live video, generating the text information in real time;
and when the learning video is a recorded and broadcast video, generating the text information according to at least one of picture information, character information, voice information and feature information in the recorded and broadcast video.
In a possible implementation manner, in the instructions of the processor 701, the determining, based on the text information of the learning video, target text information that matches the search information includes:
determining a first keyword in text information of the learning video;
determining a second keyword which has an association relation with the first keyword;
and determining target text information matched with the search information based on the first keyword and the second keyword.
In a possible implementation manner, in the instructions of the processor 701, presenting the at least one video segment as a search result on the user side interface includes:
displaying the at least one video segment on the user side interface according to a set preview mode.
In a possible implementation manner, in the instructions of the processor 701, presenting the at least one video segment as a search result on the user side interface includes:
displaying the at least one video segment on the user side interface according to a set arrangement mode.
In a possible implementation, the instructions of the processor 701, after determining at least one video segment matching the search information from a plurality of learning videos, further include:
storing the correspondence between the search request and the at least one video segment;
after another search request is received, detecting whether a search request matching it exists in the stored correspondences;
and presenting the video segments corresponding to the matching stored search request on the user side interface as search results.
In a possible implementation manner, in the instructions of the processor 701, presenting the at least one video segment as a search result on the user side interface includes:
displaying the at least one video segment as a search result on the user side interface, and marking the source and start time of each video segment, so that after the mark corresponding to the source of any video segment is triggered, the learning video corresponding to that video segment is presented on the user side interface.
The embodiments of the present disclosure also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the video search method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the video search method in the foregoing method embodiments, which may be referred to specifically in the foregoing method embodiments, and are not described herein again.
The computer program product may be implemented by hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiments and is not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical division, and there may be other divisions in actual implementation, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are merely specific embodiments of the present disclosure, used to illustrate the technical solutions of the present disclosure rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the art may still, within the technical scope of the present disclosure, modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions of some of their technical features; such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure and shall all be covered within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (15)

1. A video search method, comprising:
receiving a search request and acquiring search information in the search request;
acquiring user attribute information, and determining a target segmentation rule matched with the user attribute information;
determining at least one video segment which is segmented based on the target segmentation rule and matched with the search information from a plurality of video segments corresponding to a plurality of learning videos based on the search information; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance;
and displaying the at least one video segment as a search result on a user side interface.
2. The method of claim 1, wherein the determining a target segmentation rule matching the user attribute information comprises:
determining target statistical information matched with the user attribute information;
and determining a target segmentation rule for segmenting based on the target statistical information.
3. The method of claim 2, wherein the user attribute information includes at least one of class, grade, age, subject preference, teacher preference, segment length preference;
the target statistical information comprises at least one of historical answer conditions of the user, keywords in commonly used sentences of teachers in the class, the number of real-time watching persons in live teaching and the frequency of interactive information.
4. The method of claim 1, wherein receiving the search request and obtaining the search information in the search request comprises at least one of:
receiving a voice search request, recognizing the voice search request, and taking a voice recognition result as the search information;
receiving a picture search request, identifying the picture search request, and taking a picture identification result as the search information;
receiving a character search request, identifying the character search request, and taking the character identification result as the search information.
5. The method of claim 1, wherein determining at least one video segment from a plurality of learning videos that matches the search information based on the search information comprises:
determining target text information matched with the search information based on the text information of the plurality of learning videos; the learning video comprises a live broadcast video and a recorded broadcast video;
taking the video frame corresponding to the target text information as a target video frame;
and taking the video segment where the target video frame is located as the at least one video segment matched with the search information.
6. The method of claim 5, wherein the text information of the learning video comprises at least one of:
the learning video processing method comprises the steps of obtaining first text information by recognizing voice information in a learning video, obtaining second text information by recognizing character information in the learning video, obtaining third text information by recognizing picture information in the learning video, and obtaining fourth text information by recognizing characteristic information in the learning video.
7. The method of claim 5, wherein the text information of the learning video is obtained by at least one of:
when the learning video is a live video, generating the text information in real time;
and when the learning video is a recorded and broadcast video, generating the text information according to at least one of picture information, character information, voice information and feature information in the recorded and broadcast video.
8. The method of claim 5, wherein determining target text information matching the search information based on the text information of the learning video comprises:
determining a first keyword in text information of the learning video;
determining a second keyword which has an association relation with the first keyword;
and determining target text information matched with the search information based on the first keyword and the second keyword.
9. The method of claim 1, wherein presenting the at least one video clip as a search result on a user-side interface comprises:
and displaying the at least one video segment on a user side interface according to a set preview mode.
10. The method of claim 1, wherein presenting the at least one video clip as a search result on a user-side interface comprises:
and displaying the at least one video segment on a user side interface according to a set arrangement mode.
11. The method of claim 1, wherein after determining at least one video segment from the plurality of learning videos that matches the search information, the method further comprises:
storing the correspondence between the search request and the at least one video segment;
after another search request is received, detecting whether a search request matching the other search request exists in the stored correspondence;
and displaying the video segments corresponding to the search request that matches the other search request on a user side interface as search results.
12. The method according to claim 1, wherein said presenting the at least one video clip as a search result on a user-side interface comprises:
and displaying the at least one video segment on a user side interface as a search result, and marking the source and start time of each video segment, so that after the mark corresponding to the source of any video segment is triggered, a learning video corresponding to that video segment is displayed on the user side interface.
13. A video search apparatus, comprising:
the receiving module is used for receiving a search request and acquiring search information in the search request;
the first determining module is used for acquiring user attribute information and determining a target segmentation rule matched with the user attribute information;
the second determining module is used for determining at least one video segment which is segmented based on the target segmentation rule and matched with the search information from a plurality of video segments corresponding to a plurality of learning videos based on the search information; the video segments corresponding to the learning videos are obtained by segmenting the learning videos respectively based on a plurality of segmentation rules in advance;
and the display module is used for displaying the at least one video clip as a search result on the user side interface.
14. A computer device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when a computer device is running, the machine-readable instructions when executed by the processor performing the steps of the video search method of any of claims 1 to 12.
15. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the video search method according to any one of claims 1 to 12.
CN202110715993.0A 2021-06-28 2021-06-28 Video searching method and device, computer equipment and storage medium Pending CN113254708A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110715993.0A CN113254708A (en) 2021-06-28 2021-06-28 Video searching method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113254708A true CN113254708A (en) 2021-08-13

Family

ID=77189817


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255053A (en) * 2018-09-14 2019-01-22 北京奇艺世纪科技有限公司 Resource search method, device, terminal, server, computer readable storage medium
CN109492087A (en) * 2018-11-27 2019-03-19 北京中熙正保远程教育技术有限公司 A kind of automatic answer system and method for online course learning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505262A (en) * 2021-08-17 2021-10-15 深圳华声医疗技术股份有限公司 Ultrasonic image searching method and device, ultrasonic equipment and storage medium
CN114245203A (en) * 2021-12-15 2022-03-25 平安科技(深圳)有限公司 Script-based video editing method, device, equipment and medium
CN114245203B (en) * 2021-12-15 2023-08-01 平安科技(深圳)有限公司 Video editing method, device, equipment and medium based on script
CN114697763A (en) * 2022-04-07 2022-07-01 脸萌有限公司 Video processing method, device, electronic equipment and medium
CN114697763B (en) * 2022-04-07 2023-11-21 脸萌有限公司 Video processing method, device, electronic equipment and medium
CN115134660A (en) * 2022-06-27 2022-09-30 中国平安人寿保险股份有限公司 Video editing method and device, computer equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210813