CN115734045A - Video playing method, device, equipment and storage medium - Google Patents

Video playing method, device, equipment and storage medium

Info

Publication number
CN115734045A
CN115734045A (application CN202211425128.3A)
Authority
CN
China
Prior art keywords
picture, video, determining, detected, target video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211425128.3A
Other languages
Chinese (zh)
Inventor
詹澄海
韦玉善
陈辉
洪九英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Dongming Juchuang Electronics Co ltd
Original Assignee
Shenzhen Dongming Juchuang Electronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Dongming Juchuang Electronics Co ltd filed Critical Shenzhen Dongming Juchuang Electronics Co ltd
Priority to CN202211425128.3A
Publication of CN115734045A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a video playing method, apparatus, device and storage medium: acquiring a video clip to be confirmed; taking one frame of the video clip to be confirmed as a picture to be detected and performing feature extraction; determining the features of the picture to be detected from the feature extraction, the features comprising at least one of scene arrangement, costume features and actor information; and retrieving the picture to be detected according to the features, determining a target video and displaying the target video.

Description

Video playing method, device, equipment and storage medium
Technical Field
The present application relates to the field of video playing technologies, and in particular, to a video playing method, apparatus, device, and storage medium.
Background
With the rapid development of science and technology, more and more video playing platforms have emerged, and people increasingly rely on them to follow the news, browse interesting trivia, or watch television series and films they enjoy.
As these platforms diversify, users have ever more choices, and the clips they encounter become increasingly fragmented. For example, a user may browse an interesting clip on one platform but be unable to obtain the corresponding video resource because the clip is too fragmented, and therefore cannot watch it, which greatly degrades the user experience.
Disclosure of Invention
The application provides a video playing method, a video playing device, video playing equipment and a storage medium.
In a first aspect, the present application provides a video playing method, including:
acquiring a video clip to be confirmed;
taking one frame of the video clip to be confirmed as a picture to be detected and performing feature extraction;
determining the features of the picture to be detected from the feature extraction, the features comprising at least one of scene arrangement, costume features and actor information;
and retrieving the picture to be detected according to the features, determining a target video and displaying the target video.
Through the above steps, feature extraction is performed once the video clip to be confirmed is obtained, and a feature-based retrieval then yields the target video. After browsing an interesting clip, the user can promptly obtain the target video and a corresponding playing source by the method provided by the application. This solves the key problem that the user cannot find the target video, and improves the user experience.
Optionally, the performing feature extraction on a frame of picture in the video segment to be confirmed as a picture to be detected includes:
determining the number of feature types in each frame of the video clip to be confirmed, using preset feature types;
ranking the frames by their number of feature types and determining the frame with the most feature types;
and performing feature extraction with the frame having the most feature types as the picture to be detected.
In this way, the feature-type counts of the video clip to be confirmed are collected and ranked to find the frame with the most feature types. This narrows the search range and reduces the workload, and retrieving with that frame also improves retrieval accuracy.
Optionally, the retrieving the picture to be detected according to the features, determining a target video and displaying the target video includes:
determining a character image among the features;
determining, from the character image, information on the actor who plays the character;
determining, from that information, the works the actor has appeared in;
and comparing the picture to be detected with those works to determine and display the target video.
In this embodiment, the extracted character-image features are analyzed to identify the actor who plays the role, the actor's works are obtained from the actor information, and the video clip to be confirmed is compared with those works to confirm and display the target video. This matches the relevant actor as quickly as possible, improving both retrieval speed and accuracy.
Optionally, the retrieving the picture to be detected according to the features, determining a target video and displaying the target video includes:
retrieving the picture to be detected according to the features to obtain the target video;
acquiring the playback permissions of the target video;
and ranking and displaying the target videos using a preset priority and the permissions.
In this way, when the target video is retrieved and displayed, the playback permissions of different playing sources are acquired and the sources are ranked by them, so the user can pick the best playing source from the ranking. This offers a more convenient service and improves the user experience.
Optionally, the method further includes:
acquiring the network bandwidth at the current moment;
matching the network bandwidth to an image quality level, using the network bandwidth ranges preset for the image quality levels;
adjusting the image quality at the current moment according to the matching result;
and playing the target video at that image quality.
By acquiring the current network bandwidth and matching and adjusting the image quality level accordingly, the method of this embodiment keeps the video playing smoothly. The bandwidth can be sampled in real time during playback; when it changes, the image quality level is re-matched and the quality adjusted in time, guaranteeing smooth playback and avoiding stuttering caused by poor bandwidth that would hurt the user experience.
Optionally, the method further includes:
acquiring the ambient noise at the current moment;
comparing the ambient noise with preset noise thresholds and determining the level corresponding to the ambient noise;
and determining, from that level, the volume-adjustment range for the current moment and taking the minimum of that range as the playing volume.
This embodiment provides a volume-adjustment method: when the ambient noise changes, it is acquired in time and matched against the preset noise thresholds, the corresponding adjustable volume range is determined, and the volume is adjusted to the minimum of that range. This prevents sudden noise from drowning out the video, and because only the minimum of the adjustable range is used, the adjustment does not jar the user's hearing, giving a better volume-adjustment experience.
Optionally, the method further includes:
acquiring the timestamps of the sound track, the picture track and the subtitle track of the target video;
judging whether the timestamps of the sound track, the picture track and the subtitle track are consistent;
and if they are inconsistent, identifying the tracks with inconsistent timestamps and calibrating them according to a preset timestamp-calibration rule.
In this embodiment, the timestamps of the sound, picture and subtitle tracks are acquired and compared to confirm any inconsistency, and the tracks with inconsistent timestamps are calibrated by the preset rule, so the problem of audio and video being out of sync is resolved in the shortest time, improving the user experience.
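The calibration rule itself is left as a preset in this application; a minimal sketch of the consistency check can nevertheless be given. Here the picture track is treated as the reference and a 40 ms tolerance is used — both are illustrative assumptions, not stated in the source:

```python
def find_drifted_tracks(audio_ts, picture_ts, subtitle_ts, tolerance=0.04):
    """Compare per-frame track timestamps (seconds) against the picture track.

    Returns the names of tracks whose timestamp deviates from the picture
    track by more than `tolerance` and therefore needs recalibration.
    """
    drifted = []
    for name, ts in (("audio", audio_ts), ("subtitle", subtitle_ts)):
        if abs(ts - picture_ts) > tolerance:
            drifted.append(name)
    return drifted

# Audio runs 0.2 s ahead of the picture: flagged for calibration.
print(find_drifted_tracks(audio_ts=10.2, picture_ts=10.0, subtitle_ts=10.01))
```

A real player would run this check per frame and shift the drifted track's timestamps by the measured offset.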
In a second aspect, the present application provides a video playing apparatus, including:
the clip acquisition module, configured to acquire a video clip to be confirmed;
the feature extraction module, configured to take one frame of the video clip to be confirmed as a picture to be detected and perform feature extraction;
the feature determining module, configured to determine the features of the picture to be detected from the feature extraction, the features comprising at least one of scene arrangement, costume features and actor information;
and the clip retrieval and display module, configured to retrieve the picture to be detected according to the features, determine a target video and display the target video.
Optionally, the feature extraction module is specifically configured to:
determining the number of feature types in each frame of the video clip to be confirmed, using preset feature types;
ranking the frames by their number of feature types and determining the frame with the most feature types;
and performing feature extraction with the frame having the most feature types as the picture to be detected.
Optionally, the segment retrieving and displaying module is specifically configured to:
determining a character image among the features;
determining, from the character image, information on the actor who plays the character;
determining, from that information, the works the actor has appeared in;
and comparing the picture to be detected with those works to determine and display the target video.
Optionally, the segment retrieving and displaying module is further specifically configured to:
retrieving the picture to be detected according to the features to obtain the target video;
acquiring the playback permissions of the target video;
and ranking and displaying the target videos using a preset priority and the permissions.
Optionally, the apparatus further includes an image quality adjustment module, configured to:
acquiring the network bandwidth at the current moment;
matching the network bandwidth to an image quality level, using the network bandwidth ranges preset for the image quality levels;
adjusting the image quality at the current moment according to the matching result;
and playing the target video at that image quality.
Optionally, the apparatus further includes a volume adjustment module, configured to:
acquiring the ambient noise at the current moment;
comparing the ambient noise with preset noise thresholds and determining the level corresponding to the ambient noise;
and determining, from that level, the volume-adjustment range for the current moment and taking the minimum of that range as the playing volume.
Optionally, the apparatus further comprises a calibration module, configured to:
acquiring the timestamps of the sound track, the picture track and the subtitle track of the target video;
judging whether the timestamps of the sound track, the picture track and the subtitle track are consistent;
and if they are inconsistent, identifying the tracks with inconsistent timestamps and calibrating them according to a preset timestamp-calibration rule.
In a third aspect, the present application provides a video playing device, including a processor and a memory storing a computer program that can be loaded by the processor to perform the method of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the method of the first aspect.
Drawings
To illustrate the embodiments of the present application or the prior-art technical solutions more clearly, the drawings needed in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic view of an application scenario provided in an embodiment of the present application;
fig. 2 is a flowchart of a video playing method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a video playback device according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a video playing device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. It should be understood that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments derived by those skilled in the art from the given embodiments without creative effort fall within the protection scope of the present application.
In addition, the term "and/or" herein merely describes an association between objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" herein generally indicates an "or" relationship between the associated objects, unless otherwise specified.
The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.
As video playing platforms diversify, users have ever more choices, and the clips they encounter become increasingly fragmented. For example, a user may browse an interesting clip on one platform but be unable to obtain the corresponding video resource because the clip is too fragmented, and therefore cannot watch it, which greatly degrades the user experience.
Based on this, the application provides a video playing method, device, equipment and storage medium.
The method acquires a video clip to be confirmed, takes one frame of it as a picture to be detected, performs feature extraction, and retrieves by the resulting features to determine and display the target video of the clip. In this way, when a user browses an interesting video clip, the target video can be retrieved and watched promptly, efficiently and accurately, solving the clip-fragmentation problem and improving the user experience.
Fig. 1 is a schematic view of an application scenario provided in the present application. The server acquires, from the user equipment, a clip to be confirmed from another video playing platform. After acquisition succeeds, one frame is selected as the picture to be detected and feature extraction is performed. Retrieval based on the extraction result yields the target video, which is displayed on the user equipment. Specific implementations are given in the following embodiments.
Fig. 2 is a flowchart of a video playing method according to an embodiment of the present application, where the method of the present embodiment may be applied to a server in the above scenario. As shown in fig. 2, the method includes:
s201, obtaining a video clip to be confirmed.
The video clip to be confirmed is an interesting video clip uploaded by the user, or a screenshot corresponding to such a clip.
S202, taking one frame of the video clip to be confirmed as a picture to be detected and performing feature extraction.
The chosen frame can be the one with the most feature types, or a representative frame, for example one containing a character image or scene information.
Specifically, feature extraction can be performed on the picture to be detected with a deep learning model, confirming the key features of the picture.
S203, determining the features of the picture to be detected from the feature extraction; the features include at least one of scene arrangement, costume features and actor information.
After the feature extraction of step S202, the features present in the picture to be detected are determined, for example scene type (ancient or modern scenes), costume features (ancient or modern dress), and actor information (gender, name, age, works, activities, recent status, etc.).
S204, retrieving the picture to be detected according to the features, determining a target video and displaying the target video.
Specifically, after the features in the picture to be detected are determined in step S203, they can be matched with a deep learning model, and database contents can be queried to retrieve and display the target video.
A deep learning model is built and trained on a large amount of data so that it can classify picture features; the data are the features in individual frames of different dramas, such as character information, background features and makeup. The clip to be confirmed obtained in step S201 is fed to the trained model, which, through automatic recognition, carries out steps S201-S204: confirming the features of the picture to be detected in the video clip, then retrieving and displaying the target video.
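Steps S201-S204 can be sketched as a small retrieval pipeline. The feature extractor and video index below are toy stand-ins for the trained deep-learning model and database described above; all names and sample data are illustrative:

```python
def play_from_clip(clip_frames, extract_features, search_index):
    """S201-S204: pick a frame, extract features, retrieve the target video.

    `extract_features(frame)` and `search_index(features)` stand in for the
    trained deep-learning model and the video database.
    """
    # S202: here simply the first frame; a later embodiment instead picks
    # the frame with the most feature types.
    picture = clip_frames[0]
    # S203: features such as scene arrangement, costume, actor information.
    features = extract_features(picture)
    # S204: retrieve and return the target video for display.
    return search_index(features)

# Toy stand-ins for the model and the index.
videos = {("ancient scene", "costume"): "Drama A, Episode 3"}
result = play_from_clip(
    clip_frames=[["frame-0"]],
    extract_features=lambda frame: ("ancient scene", "costume"),
    search_index=lambda feats: videos.get(feats, "not found"),
)
print(result)  # Drama A, Episode 3
```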
Through the above steps, feature extraction is performed once the video clip to be confirmed is obtained, and feature-based retrieval then yields the target video. After browsing an interesting clip, the user can promptly obtain the target video and a corresponding playing source by the method provided by the application. This solves the key problem that the user cannot find the target video, and improves the user experience.
In some embodiments, the picture to be detected may be chosen by counting feature types, taking the frame with the most feature types as the picture to be detected for feature extraction. Specifically, this includes: determining the number of feature types in each frame of the video clip to be confirmed, using preset feature types; ranking the frames by their number of feature types and determining the frame with the most; and performing feature extraction with that frame as the picture to be detected.
The preset feature types can be adjusted and include scene type, makeup and hair type, costume type, and so on. For scene type, modern dramas mostly feature high-rise buildings, period dramas mostly feature courtyards and pavilions, and xianxia (immortal-hero) dramas mostly use dreamlike sets. For makeup and hair, modern dramas typically show long hair, curls and other contemporary styles, period dramas use historically styled looks, and xianxia dramas favor flowing, ethereal styling. These are only examples; those skilled in the art can adjust the preset feature types by consulting material or watching the corresponding dramas.
Specifically, after the video clip to be confirmed is obtained through the above steps, feature-type statistics are gathered for each of its frames, the number of feature types in each frame is confirmed, and the frames are ranked to find the one with the most feature types. Detailed feature extraction and recognition are then performed on that frame.
In some implementations, feature types can be prioritized, for example character image over makeup and hair style, and makeup and hair style over scene type. If several frames tie for the most feature types after the ranking, those frames are compared by priority: if more than half of the feature types in one frame outrank those of the other frames, that frame is taken as the picture to be detected for feature extraction.
In this way, the feature-type counts of the video clip to be confirmed are collected and ranked to find the frame with the most feature types. This narrows the search range and reduces the workload, and retrieving with that frame also improves retrieval accuracy.
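The frame-selection rule can be sketched as follows. `count_feature_types` and the `priority` ranks are hypothetical stand-ins for the preset feature types, and the tie-break uses a simplified sum-of-priorities rule rather than the "more than half" comparison described above:

```python
def pick_frame_to_detect(frames, count_feature_types, priority=None):
    """Pick the frame with the most distinct feature types.

    `count_feature_types(frame)` returns the set of feature types found in
    a frame; `priority` optionally maps a type to a rank used to break ties.
    """
    scored = [(frame, count_feature_types(frame)) for frame in frames]
    best_count = max(len(types) for _, types in scored)
    candidates = [(f, t) for f, t in scored if len(t) == best_count]
    if len(candidates) == 1 or priority is None:
        return candidates[0][0]
    # Tie-break: prefer the frame whose feature types rank highest overall.
    return max(candidates,
               key=lambda ft: sum(priority.get(t, 0) for t in ft[1]))[0]

frames = ["f1", "f2", "f3"]
types_by_frame = {"f1": {"scene"},
                  "f2": {"scene", "costume", "character"},
                  "f3": {"scene", "costume"}}
print(pick_frame_to_detect(frames, types_by_frame.get))  # f2
```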
In some embodiments, the target video to which the video clip to be confirmed belongs may be determined from the character image among the features. Specifically, this includes: determining a character image among the features; determining, from the character image, information on the actor who plays the character; determining, from that information, the works the actor has appeared in; and comparing the picture to be detected with those works to determine and display the target video.
The character image mainly comprises the character's facial features, from which the actor who plays the character is identified. The actor information may include the actor's basic personal details (such as gender, height, age and zodiac sign), works, and recent activities.
Specifically, after the features are extracted through the above steps, the character image is analyzed, the actor who plays the role is identified, and that actor's works are retrieved and compared with the video clip to be confirmed, thereby confirming and displaying its target video.
In this embodiment, the extracted character-image features are analyzed to identify the actor playing the role, the actor's works are obtained from the actor information, and the video clip to be confirmed is compared with those works to confirm and display the target video. This matches the relevant actor as quickly as possible, improving both retrieval speed and accuracy.
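The actor-based retrieval path can be sketched as below; `identify_actor`, `filmography`, `match_score`, and the 0.8 threshold are all hypothetical stand-ins for the face-recognition step, the works database, and the comparison:

```python
def retrieve_by_actor(picture_features, identify_actor, filmography,
                      match_score, threshold=0.8):
    """Identify the actor from the character image, then compare the picture
    to be detected against that actor's works to find the target video.
    """
    actor = identify_actor(picture_features["character_image"])
    works = filmography.get(actor, [])
    # Compare the picture against each work and keep the best match.
    best = max(works, key=lambda w: match_score(picture_features, w),
               default=None)
    if best is not None and match_score(picture_features, best) >= threshold:
        return best
    return None  # no work matched confidently enough

filmography = {"Actor X": ["Drama A", "Drama B"]}
scores = {"Drama A": 0.95, "Drama B": 0.40}
target = retrieve_by_actor(
    {"character_image": "face-embedding"},
    identify_actor=lambda img: "Actor X",
    filmography=filmography,
    match_score=lambda feats, work: scores[work],
)
print(target)  # Drama A
```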
In some embodiments, the target videos may be ranked by permission type so that the user can see at a glance which sources are playable. Specifically, this includes: retrieving the picture to be detected according to the features to obtain the target video; acquiring the playback permissions of the target video; and ranking and displaying the target videos using a preset priority and the permissions.
The permissions in this embodiment represent playback permissions, such as whether playback requires a membership or payment. The preset priority is set accordingly: sources requiring no membership rank above those requiring membership, which rank above those requiring payment. The target videos on different playing platforms are ranked in this order, so the user can immediately find the most suitable platform.
In some implementations, a click-count priority can also be set, ranking the target video's click counts on different platforms from high to low: the higher the click count, the higher the ranking. The user can then judge which platform offers the better experience and choose accordingly.
In this way, when the target video is retrieved and displayed, the playback permissions of different playing sources are acquired and the sources are ranked by them, so the user can pick the best playing source from the ranking. This offers a more convenient service and improves the user experience.
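A minimal sketch of the ranking, assuming three permission tiers and using click count as the secondary key described above (the field and tier names are illustrative):

```python
# Lower rank sorts first: free playback before member-only before paid,
# matching the preset priority described above.
PERMISSION_RANK = {"free": 0, "member": 1, "paid": 2}

def sort_play_sources(sources):
    """Sort retrieved play sources so the most accessible come first;
    ties within a permission tier are broken by click count, descending."""
    return sorted(sources,
                  key=lambda s: (PERMISSION_RANK[s["permission"]],
                                 -s["clicks"]))

sources = [
    {"platform": "P1", "permission": "paid", "clicks": 900},
    {"platform": "P2", "permission": "free", "clicks": 100},
    {"platform": "P3", "permission": "free", "clicks": 500},
]
print([s["platform"] for s in sort_play_sources(sources)])  # ['P3', 'P2', 'P1']
```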
In some embodiments, the network bandwidth at the current time may be obtained, and the image quality level may be determined according to the network bandwidth, so as to adjust the image quality at the current time. Specifically, the method further comprises: acquiring the network bandwidth at the current moment; matching the image quality level corresponding to the network bandwidth by using the network bandwidth range corresponding to the preset image quality level; according to the matching result, adjusting the image quality at the current moment; and playing the target video according to the image quality.
Network bandwidth refers to the amount of information that flows from one end of the network to the other within a given time, i.e., the data transmission rate. During playback, acquiring the bandwidth reveals the network condition of the current environment.
The preset image quality levels are set according to how smoothly video plays under different network bandwidths. A network bandwidth range means that, within it, video at the corresponding image quality plays smoothly.
In some implementations, the bandwidth range for each image quality can be set by building a deep learning model; alternatively, those skilled in the art may set the ranges from experience. The model is trained on a large amount of image quality data, i.e., the image qualities that play smoothly under different bandwidths, so that it can match a bandwidth to the corresponding image quality level. The current network bandwidth is input to the trained model, which automatically identifies the matching image quality level.
Specifically, the network bandwidth at the current moment is acquired and compared with the bandwidth ranges preset for the image quality levels. The range containing the current bandwidth is determined, and the corresponding image quality level is selected so that the video can play smoothly. According to this match, the selected level is adopted as the current image quality and the target video is played at it.
By acquiring the current network bandwidth and matching and adjusting the image quality level accordingly, the method of this embodiment keeps the video playing smoothly. The bandwidth can be sampled in real time during playback; when it changes, the image quality level is re-matched and the quality adjusted in time, guaranteeing smooth playback and avoiding stuttering caused by poor bandwidth that would hurt the user experience.
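A minimal sketch of the bandwidth-to-quality matching; the bandwidth ranges (in Mbps) and level names are illustrative assumptions, since the application leaves them to be preset or learned:

```python
# Illustrative minimum bandwidth (Mbps) per quality level, highest first.
QUALITY_LEVELS = [
    (8.0, "1080p"),
    (4.0, "720p"),
    (1.5, "480p"),
    (0.0, "360p"),
]

def match_quality(bandwidth_mbps):
    """Return the highest quality level whose bandwidth range contains the
    measured bandwidth, so playback stays smooth."""
    for minimum, level in QUALITY_LEVELS:
        if bandwidth_mbps >= minimum:
            return level
    return QUALITY_LEVELS[-1][1]  # fall back to the lowest level

print(match_quality(5.2))  # 720p
print(match_quality(0.8))  # 360p
```

During playback this function would be re-evaluated whenever a fresh bandwidth sample arrives, and the player would switch levels on a change.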
In some embodiments, ambient noise may be captured and the playback volume of the video adjusted. Specifically, the method further comprises: acquiring environmental noise at the current moment; comparing the environmental noise with a preset noise threshold value, and determining the level corresponding to the environmental noise; and according to the level, determining a volume adjustment range corresponding to the current moment and taking the minimum value in the volume adjustment range as the playing volume.
Ambient noise is any sound in the surroundings that may interfere with the listening environment, such as people talking nearby or the sounds of a pet.
The preset noise threshold can likewise be set via a deep learning model as in the previous embodiment: a new model is constructed and trained on a large amount of noise data so that it can determine a suitable volume range for each noise level, with different noise thresholds corresponding to different volume ranges.
Specifically, after the ambient noise at the current moment is obtained, it is compared with the preset noise thresholds to determine the volume adjustment range to which it belongs. The minimum value of that range is then used as the playing volume.
In some implementations, the user can choose whether the volume is adjusted automatically according to the environment. For example, if the environment becomes increasingly noisy during playback, the current volume may no longer let the user hear the video clearly. At this point the device checks whether automatic volume adjustment is enabled. If it is, then when the ambient noise at the current moment exceeds the previously matched noise threshold, the noise is re-matched and the volume adjusted according to the steps above. If automatic adjustment is not enabled, a voice reminder may be given instead, for example the voice assistant announces a prompt asking the user whether to adjust the volume.
This embodiment provides a volume adjustment method: when the ambient noise changes, it is acquired in time and matched against the preset noise thresholds, the corresponding adjustable volume range is determined, and the minimum value of that range is set as the adjustment target. This avoids the situation where sudden noise leaves the user unable to hear the video, and because the volume is raised only to the minimum of the adjustable range, the adjustment does not jar the user's hearing, giving a better volume adjustment experience.
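The noise-to-volume matching can be sketched as follows. The decibel thresholds and volume ranges are illustrative stand-ins; the embodiment derives them from a trained model rather than fixed constants.

```python
# Each entry: (noise_threshold_db, (min_volume, max_volume)). Ambient
# noise below a threshold maps to that volume range (0-100 scale).
# All numbers here are hypothetical examples.
NOISE_LEVELS = [
    (40, (10, 30)),   # quiet room
    (60, (30, 60)),   # conversation nearby
    (80, (60, 90)),   # loud environment
]

def playback_volume(noise_db: float) -> int:
    """Pick the volume range for the measured ambient noise and return
    its minimum value, which the method uses as the playing volume."""
    for threshold, (vol_min, _vol_max) in NOISE_LEVELS:
        if noise_db < threshold:
            return vol_min
    # noise above the highest threshold: use the loudest range's minimum
    return NOISE_LEVELS[-1][1][0]

print(playback_volume(55))  # conversation-level noise -> volume 30
```

Taking the minimum of the matched range, rather than its maximum, reflects the embodiment's stated goal of not startling the user with a sudden loud adjustment.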
In some embodiments, when the sound and picture are not synchronized, the corresponding adjustment can be made by calibrating the time stamps of the different tracks. Specifically, the method further comprises: acquiring a timestamp corresponding to a sound track, a timestamp corresponding to a picture track and a timestamp corresponding to a subtitle track of a target video; judging whether the time stamp corresponding to the sound track, the time stamp corresponding to the picture track and the time stamp corresponding to the subtitle track are consistent; if the timestamps corresponding to the sound tracks, the timestamps corresponding to the picture tracks and the timestamps corresponding to the subtitle tracks are inconsistent, the tracks corresponding to the inconsistent timestamps are judged, and calibration is carried out according to a preset timestamp calibration rule.
The timestamp records the time at which data was generated. In this embodiment, for example, the timestamp verifies the time at which the sound track produced the corresponding audio data. During video playback there are usually three tracks, sound, picture, and subtitle, so each has a corresponding timestamp proving when its data should appear.
In some implementations, the preset timestamp calibration rule takes the timestamp of the picture track as the reference. If the timestamps of the sound, picture, and subtitle tracks are inconsistent, then regardless of whether the sound and subtitle timestamps agree with each other, both are adjusted to the picture track's timestamp. If the sound track or subtitle track is ahead of the picture track, the faster track is paused, or its playback speed is reduced, until its timestamp matches the picture track's, after which normal playback resumes. If the sound track or subtitle track is behind the picture track, the slower track's playback speed is increased until its timestamp matches the picture track's, after which normal speed is restored.
In some implementations, if the timestamps of two tracks agree with each other but differ from the third, the relationship between the inconsistent track's timestamp and the others is determined. If the inconsistent track is ahead, it is paused or slowed in the manner described above until its timestamp matches the picture track's, and then played normally. If it is behind, its playback speed is increased until its timestamp matches the picture track's, and then restored to normal.
Specifically, the timestamps corresponding to the sound track, the picture track, and the subtitle track are first acquired, and it is determined whether they are consistent. If any track's timestamp is inconsistent, a second determination is made of whether that track is fast or slow, and the corresponding adjustment described above is applied to calibrate the timestamps.
In this embodiment, the timestamps of the sound, picture, and subtitle tracks are acquired and compared to detect any inconsistency, and the inconsistent tracks are calibrated according to the preset timestamp calibration rule, so that audio-video desynchronization is corrected in the shortest possible time, thereby improving the user experience.
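The picture-track-referenced calibration rule can be sketched as a per-track decision. The track names and the returned action labels are illustrative; the embodiment only specifies pausing or slowing a fast track and speeding up a slow one until its timestamp matches the picture track.

```python
def calibrate(timestamps_ms: dict) -> dict:
    """Given the current timestamp of each track in milliseconds,
    return the playback-rate action for each track, taking the
    picture track's timestamp as the reference."""
    reference = timestamps_ms["picture"]
    actions = {}
    for track, ts in timestamps_ms.items():
        if track == "picture" or ts == reference:
            actions[track] = "normal"          # already in sync
        elif ts > reference:
            actions[track] = "pause_or_slow"   # ahead of the picture
        else:
            actions[track] = "speed_up"        # behind the picture
    return actions

# Sound is 120 ms ahead of the picture, subtitles are 80 ms behind.
print(calibrate({"sound": 5120, "picture": 5000, "subtitle": 4920}))
```

A real player would apply these actions continuously, re-checking the timestamps until all tracks return to the "normal" state.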
In some embodiments, the comments of the user can be used for extracting keywords and retrieving to obtain and display the target video. Specifically, the method further comprises: obtaining user comments under the video clip to be confirmed; extracting key words in the user comments to perform preliminary retrieval; matching the video clip to be confirmed with the video obtained by the preliminary retrieval; and determining the target video according to the matching result and displaying the target video.
With the method of this embodiment, an information acquisition link can be provided on the short video platform. When a user browses a video clip of interest to be confirmed and clicks the link, the user comments under that clip are acquired. The content of each comment is compared with the drama titles in a database, keywords matching correct drama titles are extracted, and a related search is performed. The clip to be confirmed is then matched against the dramas returned by the search, and the drama with the highest matching degree is taken as the target video and displayed.
Specifically, the video optimization methods described in the foregoing embodiments, such as image quality adjustment, volume adjustment, and timestamp calibration, are all applicable in this embodiment and are not described again here.
With the method of this embodiment, when a user browses a video to be confirmed on a short video platform, drama keywords can be acquired directly through the information acquisition link and a related search performed. The clip to be confirmed is matched against the dramas corresponding to the retrieved keywords, and the best-matching drama is displayed as the target video. The user can thus find the source of the clip without downloading the video, which reduces device memory usage, makes the method more convenient to use, and improves user satisfaction.
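The comparison of comment text against drama titles in the database can be sketched as follows. The comment text, the title database, and the substring match are all hypothetical stand-ins for the platform's actual comment retrieval and search machinery.

```python
# Hypothetical database of known drama titles.
KNOWN_TITLES = {"Empresses in the Palace", "The Longest Day in Chang'an"}

def extract_keywords(comments: list) -> set:
    """Keep only the phrases in user comments that match a known drama
    title; these become the keywords for the preliminary retrieval."""
    found = set()
    for comment in comments:
        for title in KNOWN_TITLES:
            if title.lower() in comment.lower():
                found.add(title)
    return found

comments = [
    "This has to be from Empresses in the Palace!",
    "the costumes look amazing",
]
print(extract_keywords(comments))  # only the matching title survives
```

The retrieved titles would then seed the search whose results are matched against the clip to be confirmed, as described above.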
Fig. 3 is a schematic structural diagram of a video playing apparatus according to an embodiment of the present application. As shown in fig. 3, the video playing apparatus 300 of this embodiment includes: a segment acquisition module 301, a feature extraction module 302, a feature determination module 303, and a segment retrieval and display module 304.
A segment obtaining module 301, configured to obtain a video segment to be confirmed;
a feature extraction module 302, configured to extract features from a frame of picture in the video segment to be confirmed as a picture to be detected;
a feature determining module 303, configured to determine features of the to-be-detected picture by using the feature extraction; the characteristics comprise at least one of scene arrangement, dressing characteristics and actor information;
and the segment retrieval and display module 304 is configured to retrieve the to-be-detected picture according to the characteristics, determine a target video, and display the target video.
Optionally, the feature extraction module 302 is specifically configured to:
determining the number of feature types in each frame of the video clip to be confirmed by using preset feature types;
sorting the frames by the number of feature types they contain, and determining the frame with the most feature types;
and taking the frame with the most feature types as the picture to be detected for feature extraction.
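The frame-selection logic above can be sketched as follows. The feature detector is a stand-in: here each frame is pre-annotated with the feature types found in it, whereas the embodiment would run an actual extractor (scene arrangement, dressing, actor detection) on every frame.

```python
# Preset feature types from the embodiment: scene arrangement,
# dressing characteristics, and actor information.
PRESET_FEATURE_TYPES = {"scene", "costume", "actor"}

def pick_frame(frames: list) -> int:
    """Return the index of the frame containing the greatest number of
    preset feature types; that frame becomes the picture to be detected."""
    counts = [len(PRESET_FEATURE_TYPES & set(f)) for f in frames]
    return counts.index(max(counts))

frames = [
    {"scene"},                       # frame 0: one feature type
    {"scene", "costume", "actor"},   # frame 1: all three types
    {"actor", "costume"},            # frame 2: two types
]
print(pick_frame(frames))  # frame 1 has the most feature types
```

Using the richest frame as the picture to be detected gives the subsequent retrieval step the most cues to match against.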
Optionally, the segment retrieving and displaying module 304 is specifically configured to:
determining a character image in the characteristics according to the characteristics;
determining information corresponding to actors of the character image according to the character image;
determining a performance of the actor of the character image according to the information;
and comparing the picture to be detected with the reference works to determine a target video and displaying the target video.
Optionally, the segment retrieving and displaying module 304 is further specifically configured to:
retrieving the picture to be detected according to the characteristics to obtain the target video;
acquiring the authority of the target video;
and sequencing and displaying the target videos by utilizing the preset priority and the permission.
Optionally, the apparatus further includes an image quality adjustment module 305, configured to:
acquiring the network bandwidth at the current moment;
matching the image quality level corresponding to the network bandwidth by using the network bandwidth range corresponding to the preset image quality level;
according to the matching result, adjusting the image quality at the current moment;
and playing the target video by utilizing the image quality.
Optionally, the apparatus further includes a volume adjusting module 306, configured to:
acquiring environmental noise at the current moment;
comparing the environmental noise with a preset noise threshold value, and determining the level corresponding to the environmental noise;
and according to the level, determining a volume adjustment range corresponding to the current moment and taking the minimum value in the volume adjustment range as the playing volume.
Optionally, the apparatus further comprises a calibration module 307, configured to:
acquiring a timestamp corresponding to a sound track, a timestamp corresponding to a picture track and a timestamp corresponding to a subtitle track of the target video;
judging whether the time stamp corresponding to the sound track, the time stamp corresponding to the picture track and the time stamp corresponding to the subtitle track are consistent;
and if the timestamps corresponding to the sound tracks, the timestamps corresponding to the picture tracks and the timestamps corresponding to the subtitle tracks are inconsistent, judging the tracks corresponding to the inconsistent timestamps, and calibrating according to a preset timestamp calibration rule.
The apparatus of this embodiment may be configured to perform the method of any of the above embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 4 is a schematic structural diagram of a video playing device according to an embodiment of the present application, and as shown in fig. 4, the video playing device 400 according to this embodiment may include: a memory 401 and a processor 402.
The memory 401 has stored thereon a computer program that can be loaded by the processor 402 and executed to perform the method in the above-described embodiments.
Wherein the processor 402 is coupled to the memory 401, such as via a bus.
Optionally, the video playback device 400 may also include a transceiver. It should be noted that the transceiver in practical application is not limited to one, and the structure of the video playing device 400 is not limited to the embodiment of the present application.
The Processor 402 may be a CPU (Central Processing Unit), a general purpose Processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other Programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 402 may also be a combination of computing functions, e.g., comprising one or more microprocessors, a combination of a DSP and a microprocessor, or the like.
A bus may include a path that transfers information between the above components. The bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this is not intended to represent only one bus or type of bus.
The Memory 401 may be a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical Disc storage, optical Disc storage (including Compact Disc, laser Disc, optical Disc, digital versatile Disc, blu-ray Disc, etc.), a magnetic Disc storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
The memory 401 is used for storing application program codes for executing the scheme of the application, and the processor 402 is used for controlling the execution. The processor 402 is configured to execute application program code stored in the memory 401 to implement the aspects illustrated in the foregoing method embodiments.
The video playing device includes but is not limited to: mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. But also a server, etc. The video playing device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
The video playing device of this embodiment may be configured to execute the method of any of the above embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
The present application also provides a computer readable storage medium storing a computer program that can be loaded by a processor and executed to perform the method as in the above embodiments.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.

Claims (10)

1. A video playback method, comprising:
acquiring a video clip to be confirmed;
taking a frame of picture in the video clip to be confirmed as a picture to be detected for feature extraction;
determining the characteristics of the picture to be detected by utilizing the characteristic extraction; the characteristics comprise at least one of scene arrangement, dressing characteristics and actor information;
and according to the characteristics, retrieving the picture to be detected, determining a target video and displaying the target video.
2. The method according to claim 1, wherein the extracting features of one frame of picture in the video segment to be confirmed as the picture to be detected comprises:
determining the number of feature types in each frame of the video clip to be confirmed by using preset feature types;
sorting the frames by the number of feature types they contain, and determining the frame with the most feature types;
and taking the frame with the most feature types as the picture to be detected for feature extraction.
3. The method according to claim 1, wherein the retrieving and displaying the picture to be detected according to the characteristics comprises:
determining a character image in the characteristics according to the characteristics;
determining information corresponding to actors of the character image according to the character image;
determining a performance of the actor of the character image according to the information;
and comparing the picture to be detected with the reference works to determine a target video and displaying the target video.
4. The method according to claim 1, wherein the retrieving and displaying the picture to be detected according to the characteristics comprises:
retrieving the picture to be detected according to the characteristics to obtain the target video;
acquiring the authority of the target video;
and sequencing and displaying the target videos by using the preset priority and the permission.
5. The method of claim 1, further comprising:
acquiring the network bandwidth at the current moment;
matching the image quality level corresponding to the network bandwidth by using the network bandwidth range corresponding to the preset image quality level;
according to the matching result, adjusting the image quality at the current moment;
and playing the target video by utilizing the image quality.
6. The method of claim 1, further comprising:
acquiring environmental noise at the current moment;
comparing the environmental noise with a preset noise threshold value, and determining the level corresponding to the environmental noise;
and according to the level, determining a volume adjustment range corresponding to the current moment and taking the minimum value in the volume adjustment range as the playing volume.
7. The method of claim 1, further comprising:
acquiring a timestamp corresponding to a sound track, a timestamp corresponding to a picture track and a timestamp corresponding to a subtitle track of the target video;
judging whether the timestamp corresponding to the sound track, the timestamp corresponding to the picture track and the timestamp corresponding to the subtitle track are consistent;
and if the timestamps corresponding to the sound tracks, the timestamps corresponding to the picture tracks and the timestamps corresponding to the subtitle tracks are inconsistent, judging the tracks corresponding to the inconsistent timestamps, and calibrating according to a preset timestamp calibration rule.
8. A video playback apparatus, comprising:
the clip acquisition module is used for acquiring a video clip to be confirmed;
the characteristic extraction module is used for extracting characteristics of a frame of picture in the video clip to be confirmed as a picture to be detected;
the characteristic determining module is used for determining the characteristics of the picture to be detected by utilizing the characteristic extraction; the characteristics comprise at least one of scene arrangement, dressing characteristics and actor information;
and the segment retrieval display module is used for retrieving the picture to be detected according to the characteristics to determine a target video and displaying the target video.
9. A video playback device, comprising: a memory and a processor;
the memory to store program instructions;
the processor, which is used to call and execute the program instructions in the memory, executes the method of any one of claims 1-7.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium; the computer program, when executed by a processor, implements the method of any one of claims 1-7.
CN202211425128.3A 2022-11-15 2022-11-15 Video playing method, device, equipment and storage medium Pending CN115734045A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211425128.3A CN115734045A (en) 2022-11-15 2022-11-15 Video playing method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN115734045A true CN115734045A (en) 2023-03-03

Family

ID=85295682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211425128.3A Pending CN115734045A (en) 2022-11-15 2022-11-15 Video playing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115734045A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7466865B2 (en) * 2003-02-14 2008-12-16 Canon Europa, N.V. Method and device for analyzing video sequences in a communication network
CN103475935A (en) * 2013-09-06 2013-12-25 北京锐安科技有限公司 Method and device for retrieving video segments
CN109640062A (en) * 2018-12-19 2019-04-16 深圳市东明炬创电子有限公司 A kind of ultra high-definition video is without compression long distance transmitter
CN110019938A (en) * 2017-11-29 2019-07-16 深圳Tcl新技术有限公司 Video Information Retrieval Techniquess method, apparatus and storage medium based on RGB classification
CN110570841A (en) * 2019-09-12 2019-12-13 腾讯科技(深圳)有限公司 Multimedia playing interface processing method, device, client and medium
CN110913241A (en) * 2019-11-01 2020-03-24 北京奇艺世纪科技有限公司 Video retrieval method and device, electronic equipment and storage medium
WO2021232978A1 (en) * 2020-05-18 2021-11-25 Oppo广东移动通信有限公司 Video processing method and apparatus, electronic device and computer readable medium
CN114741553A (en) * 2022-03-31 2022-07-12 慧之安信息技术股份有限公司 Image feature-based picture searching method



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination