CN111615002A - Video background playing control method, device and system and electronic equipment - Google Patents

Video background playing control method, device and system and electronic equipment

Info

Publication number
CN111615002A
Authority
CN
China
Prior art keywords
video
image
target
information
playing
Prior art date
Legal status
Granted
Application number
CN202010367303.2A
Other languages
Chinese (zh)
Other versions
CN111615002B (en)
Inventor
邓朔
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010367303.2A priority Critical patent/CN111615002B/en
Publication of CN111615002A publication Critical patent/CN111615002A/en
Application granted granted Critical
Publication of CN111615002B publication Critical patent/CN111615002B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4333Processing operations in response to a pause request
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/8106Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The disclosure provides a video background playing control method, device and system, a computer storage medium and electronic equipment, and relates to the field of artificial intelligence. The method comprises the following steps: receiving a detection request, wherein the detection request is generated based on an operation of switching a currently played video to background playing and comprises video information and time information of the currently played video; acquiring a target image according to the video information and the time information, and acquiring an image sequence with the same image type based on the target image and a plurality of image frames adjacent to the target image; acquiring audio information corresponding to the image sequence, and classifying the audio information to obtain an audio type corresponding to the audio information; and determining a target playing mode according to the audio type, and controlling a background playing mode of the currently played video according to the target playing mode. The method and the device can control the video to continue playing or pause in the background, preventing the user from missing important pictures in the video.

Description

Video background playing control method, device and system and electronic equipment
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a video background playing control method, a video background playing control device, a video background playing control system, a computer-readable storage medium, and an electronic device.
Background
With the development of various application programs and the advent of intelligent electronic devices, people can perform various tasks through intelligent electronic devices, such as chatting, listening to music, watching videos and reading news. These tasks can be performed independently or simultaneously; for example, music or video can be switched to background playing while other operations, such as chatting or reading books, are performed in the graphical user interface of the intelligent electronic device.
Video background playing refers to a function whereby, when a user plays a video in an application and then switches the application to the background, the video continues to play normally: no picture is displayed, but the sound is played along the normal time axis, so the user can continue to follow the video through the sound. However, at present a video simply keeps playing after being switched to background playing, so the user may miss some important pictures, which affects the viewing experience; and if the user considers that the currently played content cannot be missed, the user has to manually pause the video before entering the background, which increases the operation cost.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The embodiments of the present disclosure provide a video background playing control method, a video background playing control device, a video background playing control system, a computer-readable storage medium and an electronic device, so that the importance of the current video content can be accurately determined at least to a certain extent and a decision made on whether to continue playing, thereby ensuring that the user does not miss important pictures in the video and further improving user experience.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the embodiments of the present disclosure, a method for controlling background playback of a video is provided, including: receiving a detection request, wherein the detection request is generated based on the operation of switching the currently played video into background playing and comprises video information and time information of the currently played video; acquiring a target image according to the video information and the time information, and acquiring an image sequence with the same image type based on the target image and a plurality of frames of images adjacent to the target image; acquiring audio information corresponding to the image sequence, and classifying the audio information to acquire an audio type corresponding to the audio information; and determining a target playing mode according to the audio type, and controlling a background playing mode of the currently played video according to the target playing mode.
According to an aspect of the embodiments of the present disclosure, there is provided a video background playing control device, including: the device comprises a request receiving module, a detection module and a processing module, wherein the request receiving module is used for receiving a detection request, the detection request is generated based on the operation of switching a currently played video into background playing, and the detection request comprises video information and time information of the currently played video; the first classification module is used for acquiring a target image according to the video information and the time information and acquiring an image sequence with the same image type based on the target image and a plurality of frames of images adjacent to the target image; the second classification module is used for acquiring audio information corresponding to the image sequence and classifying the audio information to acquire an audio type corresponding to the audio information; and the mode determining module is used for determining a target playing mode according to the audio type and controlling the background playing mode of the currently played video according to the target playing mode.
In some embodiments of the present disclosure, the time information is a current playing timestamp corresponding to the target image in the currently playing video, and the video information is identification information of the currently playing video; based on the foregoing solution, the first classification module is configured to: acquiring a video corresponding to the identification information from a database according to the identification information; and acquiring an image frame corresponding to the current playing time stamp from the video as the target image.
In some embodiments of the present disclosure, based on the foregoing, the first classification module is configured to: inputting the target image into a first classification model, and classifying the target image through the first classification model to obtain the image type of the target image; acquiring an Nth frame of image adjacent to the target image, and classifying the Nth frame of image through the first classification model to acquire the image type of the Nth frame of image, wherein N is a positive integer; comparing the image type corresponding to the Nth frame of image with the image type of the adjacent previous frame of image; when the image type corresponding to the N frame image is different from the image type of the adjacent previous frame image, the image sequence is formed according to the target image, the N-1 frame image and all the images between the target image and the N-1 frame image.
In some embodiments of the present disclosure, based on the foregoing scheme, the image type is the shot type corresponding to an image, and the image types include a first image type and a second image type, wherein the first image type is a long-range view, a panoramic view or a medium-range view, and the second image type is a close-range view or a close-up view.
In some embodiments of the present disclosure, based on the foregoing, the second classification module is configured to: and inputting the audio information into a second classification model, and classifying the audio information through the second classification model to acquire the audio type.
In some embodiments of the present disclosure, the audio types include a first audio type and a second audio type, and the target play mode includes a pause mode and a background play mode; based on the foregoing, the mode determination module is configured to: when the audio type is the first audio type, determining that the target playing mode is a background playing mode, and controlling the currently played video to continue background playing according to the background playing mode; and when the audio type is the second audio type, determining that the target playing mode is a pause mode, and controlling the currently played video to pause background playing according to the pause mode.
In some embodiments of the present disclosure, based on the foregoing scheme, the first audio type is dialogue and voice-over, and the second audio type is background sound.
In some embodiments of the present disclosure, based on the foregoing solution, the video background playing control device is further configured to: acquiring an image sample and an image classification sample corresponding to the image sample, and simultaneously acquiring an audio sample and an audio classification sample corresponding to the audio sample; training a first classification model to be trained according to the image sample and an image classification sample corresponding to the image sample to obtain the first classification model; and training a second classification model to be trained according to the audio samples and the audio classification samples corresponding to the audio samples to obtain the second classification model.
In some embodiments of the present disclosure, based on the foregoing solution, the video background playing control device is further configured to: and updating the data in the storage unit according to the video information of the currently played video, the time interval corresponding to the image sequence and the target playing mode corresponding to the time interval.
In some embodiments of the present disclosure, based on the foregoing solution, the video background playing control device is further configured to: after the video information of the currently played video is acquired, matching the video information with the video information in the storage unit; when target video information matched with the video information exists in the storage unit, comparing the time information with a time interval corresponding to the target video information in the storage unit; and when a target time interval with intersection with the time information exists in the storage unit, acquiring a target playing mode corresponding to the target time interval, and controlling a background playing mode of the currently played video according to the target playing mode corresponding to the target time interval.
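As an illustration of this lookup, the following is a minimal sketch assuming an in-memory cache that maps each video's identification information to a list of ([start, end], target play mode) entries; the cache layout, function name and label strings are illustrative assumptions, not part of the disclosure.

```python
from typing import Dict, List, Optional, Tuple

# Assumed cache layout: video id -> list of ((interval start, interval end), target play mode).
Cache = Dict[str, List[Tuple[Tuple[float, float], str]]]

def lookup_play_mode(cache: Cache, video_id: str, timestamp: float) -> Optional[str]:
    """Return the stored target play mode whose time interval covers the timestamp, if any."""
    entries = cache.get(video_id)            # match the video information first
    if entries is None:
        return None                          # no matching video: fall back to the detection unit
    for (start, end), mode in entries:       # compare the time information with each stored interval
        if start <= timestamp <= end:
            return mode
    return None

# Example with an assumed cache entry (times in seconds):
cache: Cache = {"35697as2": [((1515.0, 1560.0), "background_play"), ((1561.0, 1650.0), "pause")]}
print(lookup_play_mode(cache, "35697as2", 1530.0))   # -> "background_play"
```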
In some embodiments of the present disclosure, based on the foregoing solution, the video background playing control device is further configured to: and when the currently played video is switched to background playing, displaying prompt information in a graphical user interface of the terminal equipment so that a user can select a target function item according to the prompt information.
According to an aspect of the embodiments of the present disclosure, there is provided a video background playing control system, including: the video playing unit is used for displaying a video picture on a graphical user interface in the terminal equipment; the video control unit is connected with the video playing unit and used for responding to the triggering operation of a user on the currently played video, switching the currently played video into background playing and controlling the background playing mode of the currently played video according to a target playing mode; the data interaction unit is connected with the video control unit and used for receiving a detection request which is sent by the video control unit and generated based on the operation of switching the currently played video into background playing, and sending a target playing mode which is obtained by the detection unit in response to the detection request to the video control unit; the detection unit is connected with the data interaction unit and used for receiving the detection request, acquiring a target image according to video information and time information corresponding to the currently played video in the detection request, and acquiring an image sequence with the same image type based on the target image and a plurality of frames of images adjacent to the target image; acquiring audio information corresponding to the image sequence, classifying the audio information to acquire an audio type corresponding to the audio information, and determining the target playing mode according to the audio type; and the storage unit is connected with the data interaction unit and the detection unit and is used for storing the video information, a time interval corresponding to the image sequence corresponding to the video information and a target playing mode corresponding to the time interval.
According to an aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the video background playing control method according to the embodiments.
According to an aspect of an embodiment of the present disclosure, there is provided an electronic device including: one or more processors; and a storage device to store one or more programs that, when executed by the one or more processors, cause the one or more processors to perform the video background playing control method as described in the embodiments above.
In the technical solutions provided in some embodiments of the present disclosure, a detection request generated based on the operation of switching a currently played video to background playing is received first; then a target image is acquired according to the video information and time information in the detection request, and an image sequence with the same image type is acquired based on the target image and a plurality of image frames adjacent to the target image; next, the audio information corresponding to the image sequence is acquired and classified to obtain an audio type; finally, a target play mode is determined according to the audio type so that the terminal device controls the background play mode of the currently played video according to the target play mode. With this technical scheme, on one hand, important pictures in the video whose information cannot be conveyed by the audio alone can be accurately identified based on the audio information of the image sequence; on the other hand, the target playing mode can be determined according to the audio information of the image sequence, and the video can be controlled to continue playing or pause in the background according to the target playing mode, so that important pictures in the video are not missed while the user performs multi-task operations, further improving user experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:
fig. 1 shows a schematic diagram of an exemplary system architecture to which technical aspects of embodiments of the present disclosure may be applied;
fig. 2 schematically shows a flow diagram of a video background playback control method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates an interface diagram of a toggle control according to one embodiment of the present disclosure;
FIG. 4 schematically illustrates an interface diagram for a reminder message according to one embodiment of the present disclosure;
FIG. 5 schematically shows an interface diagram for setting a background play mode according to an embodiment of the present disclosure;
FIGS. 6A-6B schematically illustrate interface diagrams of a first image type and a second image type, according to one embodiment of the present disclosure;
FIG. 7 schematically shows a flow diagram of acquiring a sequence of images according to one embodiment of the present disclosure;
fig. 8 schematically shows a flowchart for finding a target playback mode corresponding to video information in a storage unit according to an embodiment of the present disclosure;
fig. 9 schematically shows an interaction flow diagram of a video background playing control method performed by a terminal device according to an embodiment of the present disclosure;
fig. 10 schematically shows an interaction flow diagram of a video background playing control method executed by a terminal device and a server according to an embodiment of the present disclosure;
fig. 11 schematically shows a block diagram of a video background playback control apparatus according to an embodiment of the present disclosure;
FIG. 12 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 includes a terminal device 101, a network 102, and a server 103, wherein the terminal device 101 may be any electronic device with a display screen, such as a tablet computer, a notebook computer, a desktop computer, a smart phone, a smart television, and the like; network 102 is the medium used to provide communication links between terminal devices 101 and server 103, and network 102 may include various connection types, such as wired communication links, wireless communication links, and so forth.
It should be understood that the number of terminal devices 101, networks 102, servers 103 in fig. 1 is merely illustrative. There may be any number of terminal devices 101, networks 102, servers 103, as desired.
Further, the terminal device 101 may specifically include a video playing unit, a video control unit and a data interaction unit, and the server 103 may specifically include a detection unit and a storage unit. The video background playing control method of the present disclosure is implemented mainly through interaction among these units, so that the video playing unit, the video control unit, the data interaction unit, the detection unit and the storage unit together form the video background playing control system in the embodiment of the present disclosure.
In an embodiment of the present disclosure, the video playing unit, the video control unit, the data interaction unit, the detection unit, and the storage unit may also be disposed in the terminal device. Meanwhile, another data interaction unit can be arranged in the server and is connected with the data interaction unit, the detection unit and the storage unit so as to realize the interaction transmission effect of data.
In an embodiment of the present disclosure, the video playing unit may be a video playing program loaded in the terminal device 101, and configured to display a video picture in a graphical user interface; the video control unit is connected with the video playing unit and is used for responding to the triggering operation of a user on the currently played video to control the playing and the pausing of the currently played video, also can respond to the triggering operation of the user on the currently played video to switch the currently played video to background playing, and can control the background playing mode of the currently played video according to the target playing mode after the currently played video is switched to the background playing; the data interaction unit is connected with the video control unit and used for receiving a detection request sent by the video control unit, the detection request is generated based on the operation of switching the currently played video into background playing, the detection request is sent to the detection unit, and meanwhile, a target playing mode sent by the detection unit in response to the detection request can be received and sent to the video control unit; the detection unit is connected with the data interaction unit and used for receiving a detection request sent by the data interaction unit, acquiring a target image according to video information and time information corresponding to a currently played video in the detection request, and acquiring an image sequence with the same image type based on the target image and a multi-frame image adjacent to the target image; acquiring audio information corresponding to the image sequence, classifying the audio information to acquire an audio type corresponding to the audio information, and determining a target playing mode according to the audio type; and the storage unit is connected with the data interaction unit and the detection unit and is used for storing the video information, a time interval corresponding to the image sequence corresponding to the video information and a target playing mode corresponding to the time interval.
Next, a processing flow of the video background playing control method in the embodiment of the present disclosure is roughly described based on the video background playing control system.
In one embodiment of the present disclosure, when a user plays and watches a video through a video playing unit in the terminal device 101, the currently playing video in the terminal device 101 may be switched to a background for playing when the user wants to perform other tasks while enjoying the video. After a user executes a trigger operation of switching a currently played video to a background for playing, the video control unit can respond to the trigger operation and switch the currently played video to the background for playing, meanwhile, the video control unit can acquire video information of the currently played video and time information of switching to the background, form a detection request according to the video information and the time information, and then send the detection request to the detection unit through the data interaction unit. After receiving the detection request, the detection unit may obtain a target image according to video information and time information corresponding to a currently played video in the detection request, and obtain an image sequence having the same image type based on the target image and a plurality of frames of images adjacent to the target image; and acquiring audio information corresponding to the image sequence, classifying the audio information to acquire an audio type corresponding to the audio information, and determining a target playing mode according to the audio type. When the target play mode is determined, the detection unit may send the target play mode to the data interaction unit, and send the target play mode to the video control unit through the data interaction unit, so that the video control unit controls a background play mode of a currently played video according to the target play mode, where the target play mode includes a pause mode and a background play mode, that is, the video control unit may control the currently played video to pause background play according to the target play mode, and may also control the currently played video to continue background play. The detection unit can also send the video information of the currently played video, the time interval corresponding to the image sequence corresponding to the video information and the target playing mode corresponding to different time intervals to the storage unit so as to update the data in the storage unit. Furthermore, the storage unit stores information of a plurality of videos, the data interaction unit can send the detection request to the storage unit after receiving the detection request, whether matched video data exists is judged by matching the video information and time information in the detection request with the video information and time information in the storage unit, if so, a corresponding target play mode is obtained, and the target play mode is returned to the video control unit through the data interaction unit; and if not, detecting through the detection unit to determine the corresponding target play mode.
According to the technical scheme of the embodiment of the disclosure, on one hand, segmenting the video also segments its audio, and the important images in the video whose information cannot be conveyed by the audio alone, together with their corresponding target playing modes, can then be accurately determined according to the classification of the audio segments; on the other hand, the background playing mode of the currently played video can be controlled according to the target playing mode, so that important pictures in the video are not missed while the user performs multi-task operations, further improving user experience.
It should be noted that the video background playing control method provided by the embodiment of the present disclosure may be executed by a terminal device, and accordingly, the video background playing control device may be disposed in the terminal device. However, in other embodiments of the present disclosure, the terminal device and the server may jointly execute the video background playing control method provided by the embodiments of the present disclosure.
In the related art in this field, when performing multi-task operations, a user may switch a video to the background for playing, that is, the user may continue to enjoy the video through its audio information while performing other operations on the terminal device. However, in the related art, after the video is switched to background playing it remains in the playing state; if the user considers that the current video content cannot be missed, for example a plot climax such as a battle scene, or a picture containing a large amount of information, the user has to manually pause the video and then switch to the background, which increases the operation cost and also reduces the user experience.
In view of the problems in the related art, the embodiment of the present disclosure provides a video background playing control method, which is implemented by a machine learning model and relates to the technical field of artificial intelligence. Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Computer Vision (CV) technology is a science that studies how to make a machine "see"; it uses cameras and computers instead of human eyes to identify, track and measure targets, and further performs image processing so that the processed image is more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, and also include common biometric technologies such as face recognition and fingerprint recognition.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specially studies how a computer can simulate or realize human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to endow computers with intelligence, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and formal education learning.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The scheme provided by the embodiment of the disclosure relates to an artificial intelligence video processing and machine learning technology, and is specifically explained by the following embodiments:
fig. 2 schematically shows a flowchart of a video background playing control method according to an embodiment of the present disclosure. The method may be executed by a terminal device, or jointly by the terminal device and a server, where the terminal device and the server are the terminal device 101 and the server 103 shown in fig. 1; here the video background playing control method executed by the terminal device is described as an example. Referring to fig. 2, the video background playing control method at least includes steps S210 to S240, which are described in detail as follows:
in step S210, a detection request is received, where the detection request is generated based on an operation of switching a currently playing video to a background playing video, and includes video information and time information of the currently playing video.
In an embodiment of the present disclosure, when a user wants to perform other tasks, such as replying to a message, querying information, and the like, through a terminal device while watching a video using the terminal device, the currently playing video being played in a graphical user interface needs to be switched to a background, and then the user enjoys the video according to the audio of the currently playing video while performing other tasks at the front end of the interface. It is noted that the video in the embodiments of the present disclosure is a formal post-production video, which includes complete storylines and pictures, such as movies, television shows, and the like.
In an embodiment of the present disclosure, after a user performs an operation of switching a currently played video to a background playing, the video playing unit may send a switching request corresponding to the operation to the video control unit, so that the video control unit switches the currently played video to a background playing mode. Meanwhile, the video control unit may further obtain time information corresponding to an image being played when the currently played video is switched to the background playing mode, and form a detection request according to the time information and video information of the currently played video, where the time information is a currently played timestamp corresponding to the currently played image in the video, and the video information is identification information of the currently played video, for example, a character string formed by processing a name of the currently played video according to a preset algorithm, such as a hash algorithm, or identification information formed according to a name pinyin of the currently played video, an acronym of the name pinyin, or the like, and for videos with the same name, the identification information may be distinguished by adding different numbers. After the detection request is formed, the video control unit may send the detection request to the data interaction unit, so that the data interaction unit sends the detection request to the detection unit for detection and determines the target play mode.
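A minimal sketch of how such a detection request might be assembled is given below; the MD5 hash, JSON envelope and field names are illustrative assumptions, since the disclosure only states that the identification information may be a character string derived from the video name by a preset algorithm such as a hash algorithm.

```python
import hashlib
import json

def build_detection_request(video_name: str, playback_position_s: float) -> str:
    # Form the identification information of the currently played video by hashing its name.
    video_id = hashlib.md5(video_name.encode("utf-8")).hexdigest()
    request = {
        "video_info": video_id,                       # identification information of the video
        "time_info": round(playback_position_s, 3),   # current playing timestamp, in seconds
    }
    return json.dumps(request)

# Example: the player was switched to the background 1531.2 s into the video.
print(build_detection_request("Inception", 1531.2))
```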
In an embodiment of the present disclosure, the currently played video may be switched to background playing through a switching control. Fig. 3 shows an interface schematic diagram of the switching control; as shown in fig. 3, the switching control "background playing" may be set in the upper right corner of the graphical user interface, and when the user clicks the switching control, the currently played video is switched to the background playing mode. The switching control may also be arranged at a position that does not affect video viewing, such as the lower right corner of the graphical user interface, and, like other controls in the interface, it may remain hidden when the user performs no operation and be displayed only when the user triggers an operation such as clicking the graphical user interface.
Furthermore, when the user switches the currently played video to the background playing mode by clicking the switching control, the video playing unit can generate prompt information according to the switching operation and display it in the graphical user interface of the terminal device. Fig. 4 shows an interface diagram of the prompt. As shown in fig. 4, a prompt message box 401 is displayed in the graphical user interface, and the specific content of the prompt is "the currently played video is switched to the background playing mode, please select whether to start the automatic pause/play function according to the played content". Two function item keys 402 and 403 are arranged below the prompt information box: the function item key 402 is "on" and the function item key 403 is "not on". If the user selects "not on", the audio information of the currently played video is always played in the background and is not paused because of changes in the video content. If the user selects "on", the importance of the video clip at the moment the currently played video is switched to the background playing mode can be detected to determine the target playing mode, and the background playing mode of the currently played video is then controlled according to the target playing mode. Specifically, when it is determined that the current clip contains important information and the user needs to watch the picture to obtain the related information, the target playing mode is determined to be pause, and background playing of the currently played video is stopped; when it is determined that the current clip does not contain important information or the user does not need to watch the picture to obtain the related information, the target playing mode is determined to be background playing, and background playing of the currently played video continues. It should be noted that, when the currently played video continues playing in the background, the target playing mode corresponding to the next video clip adjacent to the clip corresponding to the current playing timestamp continues to be detected, and the background playing mode of the currently played video is controlled according to the newly determined target playing mode.
Of course, besides setting the switching control, the currently played video may be switched to the background playing mode through the Home key or other modes of switching the video playing platform to the background, but the implementation of the function requires the user to perform related settings in the video playing platform, for example, selecting to turn on "switch the currently played video to the background playing mode through the Home key" on the attribute setting interface of the video playing platform, as shown in fig. 5.
In step S220, a target image is obtained according to the video information and the time information, and an image sequence having the same image type is obtained based on the target image and a plurality of frame images adjacent to the target image.
In an embodiment of the present disclosure, after receiving the detection request, the detection unit may parse the detection request to obtain the video information and the time information therein. After acquiring the video information and the time information, the target image may be acquired according to the video information and the time information, specifically, a video corresponding to the identification information may be acquired from a database according to the video information, and then an image frame corresponding to the time information may be acquired from the video as the target image. The database may be a database for locally storing video data in the terminal device, may be a cloud database, and may also be a server connected to the terminal device for storing video data, and so on. Further, an image sequence with the same image type can be acquired based on the target image and the multi-frame image adjacent to the target image, and the image sequence with the same image type is the video clip corresponding to the current playing time stamp.
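A minimal sketch of retrieving such a target image is given below, assuming the video file is available locally and using OpenCV purely for illustration (the disclosure does not name a decoding library).

```python
import cv2  # illustrative choice of decoder

def fetch_target_image(video_path: str, timestamp_s: float):
    """Return the image frame corresponding to the current playing timestamp as the target image."""
    capture = cv2.VideoCapture(video_path)
    if not capture.isOpened():
        raise IOError(f"cannot open video: {video_path}")
    # Seek to the playing timestamp carried by the detection request (milliseconds).
    capture.set(cv2.CAP_PROP_POS_MSEC, timestamp_s * 1000.0)
    ok, frame = capture.read()
    capture.release()
    if not ok:
        raise ValueError(f"no frame available at {timestamp_s} s")
    return frame  # HxWx3 BGR array, the target image
```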
In an embodiment of the present disclosure, a video such as a movie or a TV series is composed of a plurality of different shots; when shooting, a photographer uses different shots according to different scenes, such as a long shot, a full shot, a medium shot, a close shot and a close-up. Therefore, the image type of each image frame may be determined according to the shot type corresponding to that image frame, and the video may be divided into a plurality of image sequences based on the image type of each image frame. In order to improve data processing efficiency, the image types are divided into a first image type and a second image type: the type of an image whose shot type is a long shot, a full shot or a medium shot, used for representing the environment in which the plot takes place, is defined as the first image type; the type of an image whose shot type is a close shot or a close-up, used for describing a conversation or an object close-up, is defined as the second image type.
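As a concrete illustration of this grouping (see also the figure examples that follow), a minimal mapping sketch is given below; the key and value names are illustrative and not terms from the disclosure.

```python
# Shot types that establish the environment of the plot map to the first image type;
# shot types that describe conversations and object close-ups map to the second image type.
SHOT_TYPE_TO_IMAGE_TYPE = {
    "long_shot": "first_image_type",
    "full_shot": "first_image_type",
    "medium_shot": "first_image_type",
    "close_shot": "second_image_type",
    "close_up": "second_image_type",
}
```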
In one embodiment of the present disclosure, figs. 6A-6B illustrate schematic interface diagrams of the first image type and the second image type. As shown in fig. 6A, the image frame is taken from the opening of the movie "Inception" and is a panoramic shot, that is, the image type corresponding to this image frame is the first image type; as shown in fig. 6B, the image frame is a close-up shot in the movie "Inception", that is, the image type corresponding to this image frame is the second image type. Of course, the shot types corresponding to the first image type and the second image type may also be interchanged, which is not specifically limited in the embodiment of the present disclosure.
In one embodiment of the present disclosure, when acquiring an image sequence having the same image type based on a target image and a plurality of frame images adjacent to the target image, the image sequence may be acquired through a machine learning model, which may specifically be a classification network model, and may classify the images to determine the image type of each image, fig. 7 shows a schematic flow chart of acquiring the image sequence, as shown in fig. 7, in step S701, the target image is input to a first classification model, and the target image is classified through the first classification model to acquire the image type of the target image; in step S702, an nth frame of image adjacent to the target image is obtained, and the nth frame of image is classified by a first classification model to obtain an image type of the nth frame of image, where N is a positive integer; in step S703, comparing the image type corresponding to the nth frame image with the image type of the previous frame image; in step S704, when the image type corresponding to the nth frame image is different from the image type of the adjacent previous frame image, an image sequence is formed according to the target image, the N-1 th frame image, and all the images between the target image and the N-1 th frame image.
Next, a flow of acquiring an image sequence shown in fig. 7 will be described in detail.
The first classification model may be any classification network model, such as MobileNet or the like, as long as the image type can be determined from the input image. After the target image I1 corresponding to the current playing timestamp is determined, the target image I1 may be input into the trained first classification model; the first classification model analyzes the target image I1 to obtain the probabilities of the first image type and the second image type, and the image type of the target image I1 can then be determined according to these probabilities, for example, the image type of the target image I1 obtained through the first classification model is the first image type. Next, the first frame image I2 adjacent to the target image may be obtained and input into the first classification model to obtain the image type of the first frame image I2. The image type of the first frame image I2 is then compared with that of the target image I1; if the two image types are the same, the second frame image I3 adjacent to the target image I1, that is, the frame adjacent to the first frame image I2, continues to be obtained, and its image type is judged through the first classification model. If the image type of the second frame image I3 is the same as that of the first frame image I2, the third frame image I4 adjacent to the target image I1 continues to be obtained, and the above steps are repeated until the Nth adjacent frame image IN+1, whose image type differs from that of the target image I1, is obtained. That is, the target image I1 through the adjacent (N-1)th frame image IN all have the same image type, so an image sequence can be formed from the target image I1, the (N-1)th frame image IN, and all images between them, which is the video clip corresponding to the current playing timestamp.
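A minimal sketch of this frame-by-frame expansion is given below; the `classify` callable stands in for the trained first classification model, and the frame-list interface and label strings are illustrative assumptions.

```python
from typing import Callable, List, Sequence

def build_image_sequence(frames: Sequence, start_index: int,
                         classify: Callable[[object], str]) -> List:
    """Grow an image sequence from the target image until the image type changes.

    `frames` holds the decoded frames of the video, `start_index` points at the
    target image I1, and `classify` returns an image type label for one frame.
    """
    target_type = classify(frames[start_index])   # image type of the target image I1
    sequence = [frames[start_index]]
    n = start_index + 1
    # Append adjacent frames I2, I3, ... while they share the target image's type.
    while n < len(frames) and classify(frames[n]) == target_type:
        sequence.append(frames[n])
        n += 1
    # Frames start_index .. n-1 share one image type; frame n (if any) starts a new shot.
    return sequence
```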
In an embodiment of the present disclosure, in order to improve the accuracy of the classification result, before the image classification of the image is classified by using the first classification model, the first classification model to be trained needs to be trained to obtain a stable first classification model. Specifically, a plurality of image samples and image classification samples corresponding to the image samples may be obtained first; then inputting the plurality of image samples into a first classification model to be trained one by one for processing to obtain image classes corresponding to the image samples; the loss function can be constructed according to the image category output by the first classification model and the image classification sample corresponding to the input image sample, and the parameters of the first classification model are continuously adjusted until the loss function reaches the minimum value, namely, the image type output by the first classification model is the same as or basically the same as the image classification sample, so as to complete the training of the first classification model.
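A minimal training sketch under these assumptions is given below: a MobileNetV2 backbone with two output classes, cross-entropy as the constructed loss, and Adam as the optimiser (the text only specifies constructing a loss from the predicted and labelled image classes and adjusting the parameters until it is minimised; the remaining choices are illustrative). The second classification model described later can be trained with the same pattern on audio samples.

```python
import torch
from torch import nn, optim
from torchvision import models

def train_first_classifier(loader, epochs: int = 5, device: str = "cpu"):
    """Train the first classification model on image samples and their classification samples."""
    model = models.mobilenet_v2(num_classes=2).to(device)   # two classes: first / second image type
    criterion = nn.CrossEntropyLoss()                       # the constructed loss function
    optimizer = optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:                       # image samples and their classification samples
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()                                  # adjust the model parameters
            optimizer.step()
    return model
```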
In step S230, audio information corresponding to the image sequence is acquired, and the audio information is classified to acquire an audio type corresponding to the audio information.
In one embodiment of the present disclosure, the audio in a video also carries a large amount of effective information; once the video enters background playing, the picture cannot be seen, and the playing content can only be understood through the audio information. Continuing with the example in step S220, after an image sequence is formed from the target image I1, the (N-1)th frame image IN, and all images between them, the audio information corresponding to the image sequence can be obtained, the audio type corresponding to that audio information can be determined, and the background playing mode of the currently played video can then be determined based on the audio type. When the audio information corresponding to the image sequence is acquired, the corresponding audio can be intercepted from the audio track of the currently played video according to the playing times of the target image I1 and the (N-1)th frame image IN in the currently played video.
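A minimal sketch of intercepting this audio segment is given below, using pydub as an illustrative toolkit (the disclosure does not name one); the start and end times are the playing times of I1 and IN.

```python
from pydub import AudioSegment  # illustrative choice; requires ffmpeg for video containers

def extract_audio_segment(video_path: str, start_s: float, end_s: float) -> AudioSegment:
    """Cut the audio corresponding to the image sequence out of the video's audio track."""
    track = AudioSegment.from_file(video_path)               # decode the full audio track
    return track[int(start_s * 1000):int(end_s * 1000)]      # pydub slices in milliseconds
```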
Based on the characteristics of video content, the audio information may be divided into a first audio type and a second audio type, where the first audio type may be dialogue and voice-over, and the second audio type may be background sound; of course, the first audio type may also be background sound, with the second audio type being dialogue and voice-over. Dialogue and voice-over often carry a large amount of plot information and can stand in for the picture, whereas background sound usually cannot convey the video content: the user only hears various background music and cannot judge what is happening in the video, the state of the characters, and so on. Therefore, when the video is played in the background, if the audio type of the currently playing segment is dialogue or voice-over, background playing is kept without pausing, and the user can obtain the plot information through the dialogue and voice-over; if the audio type of the currently playing segment is background sound, playing needs to be paused, and the user obtains the plot information by viewing the picture after switching from background playing back to foreground playing.
In an embodiment of the present disclosure, after the audio information corresponding to the image sequence is obtained, the audio information may be input into the second classification model, which classifies the audio information to obtain its audio type.
In an embodiment of the present disclosure, to improve the accuracy of the audio type, before the audio type of the audio information is classified by using the second classification model, the second classification model to be trained needs to be trained to obtain a stable second classification model. Specifically, a plurality of audio samples and audio classification samples corresponding to the audio samples may be obtained first; then, inputting the plurality of audio samples into a second classification model to be trained one by one for processing so as to obtain audio types corresponding to the audio samples; and constructing a loss function according to the audio type output by the second classification model and the audio classification sample corresponding to the input audio sample, and continuously adjusting the parameters of the second classification model until the loss function reaches the minimum value, namely, the audio type output by the second classification model is the same as or basically the same as the audio classification sample so as to finish the training of the second classification model.
In step S240, a target play mode is determined according to the audio type, so that the terminal device controls a background play mode of the currently played video according to the target play mode.
In an embodiment of the present disclosure, after the audio type of the audio information corresponding to the image sequence is obtained in step S230, the target play mode may be determined according to the audio type, so as to control the background playing of the currently played video according to the target play mode. The target play mode includes a pause mode and a background play mode. Specifically, when the audio type is the first audio type, namely dialogue and voice-over, the target play mode is determined to be the background play mode, and the currently played video is controlled to continue background playing; when the audio type is the second audio type, namely background sound, the target play mode is determined to be the pause mode, and the currently played video is controlled to pause background playing. In the embodiment of the disclosure, the video is segmented by shot according to its own characteristics, and whether the audio of the currently played video segment can convey the video information is judged according to the audio type of the audio corresponding to the current video segment, so that the corresponding target play mode is determined. In this way it can be accurately decided whether the currently played video segment is suitable for background playing, the user is ensured not to miss a wonderful video segment while multitasking, the user's operation flow is reduced, and the user experience is improved.
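The mapping from audio type to target play mode reduces to a small decision function; the sketch below uses illustrative enum names that are not taken from the disclosure.

```python
from enum import Enum

class AudioType(Enum):
    DIALOGUE_VOICEOVER = 1   # first audio type
    BACKGROUND_SOUND = 2     # second audio type

class PlayMode(Enum):
    BACKGROUND_PLAY = 1      # keep playing in the background
    PAUSE = 2                # pause background playing

def target_play_mode(audio_type: AudioType) -> PlayMode:
    # Dialogue/voice-over carries the plot, so background playing may continue;
    # background sound cannot convey the plot, so playback is paused.
    if audio_type is AudioType.DIALOGUE_VOICEOVER:
        return PlayMode.BACKGROUND_PLAY
    return PlayMode.PAUSE
```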
In one embodiment of the present disclosure, after the detection unit determines the target play mode corresponding to the current video segment according to the video information and the time information in the detection request, the target play mode can be returned to the video control unit, so that the video control unit controls the background playing of the currently played video according to the target play mode. Meanwhile, to save computation and speed up the subsequent determination of the target play mode for each video, the video information of the currently played video, the time intervals corresponding to its image sequences, and the target play modes corresponding to those time intervals can be stored in the storage unit so as to update the data in the storage unit. A record in the storage unit has the form: <video information, {[t1, ti], target play mode}, {[ti+1, tj], target play mode}, ...>, where the video information is the identification information of the video, [t1, ti] and [ti+1, tj] denote time intervals in the video, and i, j are positive integers. If the storage unit already contains initial information with the same video information but whose time intervals and target play modes are not exactly the same, the initial information may be updated according to the newly added data, where updating may be replacement, integration, and so on. For example, the initial information is <35697as2, {[25'15", 26'00"], background play}, {[26'01", 27'30"], pause}, ...> and the newly added information is <35697as2, {[15'15", 16'30"], background play}, {[16'31", 18'00"], background play}, ..., {[25'15", 26'00"], background play}, {[26'01", 27'30"], pause}, ...>; it can be seen that the newly added information contains the initial information, so the initial information can simply be replaced with the newly added information.
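One way to picture this record format and the replace-or-integrate update is the sketch below; the field names, second-based timestamps, and merge rule are illustrative assumptions rather than the storage layout of the disclosure.

```python
from typing import Dict, List, Tuple

# A record: video identification -> list of (time interval in seconds, target play mode)
Segment = Tuple[Tuple[float, float], str]
Store = Dict[str, List[Segment]]

def update_record(store: Store, video_id: str, new_segments: List[Segment]) -> None:
    """Update the storage unit with newly detected segments for one video."""
    old = store.get(video_id, [])
    old_intervals = {seg[0] for seg in old}
    new_intervals = {seg[0] for seg in new_segments}
    if old_intervals <= new_intervals:
        # The new data contains all previously stored intervals: plain replacement.
        store[video_id] = sorted(new_segments)
    else:
        # Otherwise integrate: keep old entries and add the intervals not seen before.
        merged = {seg[0]: seg[1] for seg in old}
        merged.update({seg[0]: seg[1] for seg in new_segments})
        store[video_id] = sorted(merged.items())

# 25'15"=1515 s, 26'00"=1560 s, etc., mirroring the example above.
store: Store = {"35697as2": [((1515.0, 1560.0), "background play"),
                             ((1561.0, 1650.0), "pause")]}
update_record(store, "35697as2", [((915.0, 990.0), "background play"),
                                  ((1515.0, 1560.0), "background play"),
                                  ((1561.0, 1650.0), "pause")])
```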
In an embodiment of the present disclosure, because videos are numerous and varied, detecting the target play mode corresponding to the video information and time information of every detection request from scratch would consume a large amount of time and machines. Therefore, after a plurality of detection requests are received, the video information and time information in each detection request may first be matched against the video information and corresponding time information in the storage unit. If matching video information and time information exist, the corresponding target play mode is obtained and fed back; if not, the detection request is processed by the detection unit to obtain the target play mode, and the target play mode is stored in the storage unit, so that the corresponding target play mode can be obtained directly when the same detection request is received later.
Fig. 8 is a schematic flowchart of searching the storage unit for the target play mode corresponding to the video information. As shown in fig. 8, in step S801, after the video information of the currently played video is acquired, the video information of the currently played video is matched against the video information in the storage unit; in step S802, when target video information matching the video information exists in the storage unit, the time information is compared with the time intervals corresponding to the target video information in the storage unit; in step S803, when a target time interval intersecting the time information exists in the storage unit, the target play mode corresponding to the target time interval is acquired, and the background playing of the currently played video is controlled according to the target play mode corresponding to the target time interval. For example, when the user is watching video A and the current playing timestamp corresponding to the video image being played at the moment the user switches the video to background playing is tm, the video information of video A may be matched against the video information in the storage unit. If matching video information exists, the current playing timestamp tm is compared with the time intervals corresponding to the matched video information; if a time interval [tm, tn] intersecting the current playing timestamp tm exists, the target play mode corresponding to the time interval [tm, tn] is acquired and returned to the video control unit, so that the video control unit controls the background playing of video A according to the received target play mode. Of course, if no target play mode can be obtained according to the video information of video A and/or the current playing timestamp tm, the detection unit needs to perform image classification and audio classification according to the identification information of video A and the current playing timestamp tm and then determine the target play mode; the specific flow is the same as in the above embodiments and is not repeated here.
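A lookup against the storage unit can be sketched as below, reusing the Store type from the previous snippet; the helper name and the "interval contains timestamp" intersection test are illustrative assumptions.

```python
from typing import Optional

def lookup_play_mode(store: Store, video_id: str, timestamp: float) -> Optional[str]:
    """Return the stored target play mode whose interval intersects the playing timestamp,
    or None when the detection unit has to be invoked instead."""
    segments = store.get(video_id)
    if segments is None:
        return None                      # no matching video information in the storage unit
    for (start, end), mode in segments:
        if start <= timestamp <= end:    # the target time interval intersects the timestamp
            return mode
    return None

# Example: lookup_play_mode(store, "35697as2", 1520.0) -> "background play"
```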
In an embodiment of the present disclosure, several of the detection requests may contain the same video information and time information. For example, a popular drama A is updated at 21:00 every day, so users who follow the series log in to the video playing platform on time to watch it, and during viewing there is a high probability that multiple users switch to background playing at the same moment; that is, the video background playing control system simultaneously receives multiple detection requests containing the identification information of drama A and the same time information. To improve data processing efficiency and reduce resource consumption, only one of the identical detection requests may be detected, and the obtained target play mode is returned to the multiple video control units so that each controls the background playing of the video being played by its video playing unit.
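Deduplicating identical detection requests amounts to grouping them by (video information, time information) and detecting each distinct key once; the sketch below is illustrative, with the detect callback standing in for the detection unit and the reply callbacks standing in for the waiting video control units.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Request = Tuple[str, float]   # (video information, current playing timestamp)

def handle_batch(requests: Iterable[Tuple[Request, Callable[[str], None]]],
                 detect: Callable[[Request], str]) -> None:
    """Run detection once per distinct request and fan the result out to every waiting callback."""
    waiting: Dict[Request, List[Callable[[str], None]]] = defaultdict(list)
    for req, reply in requests:
        waiting[req].append(reply)
    for req, replies in waiting.items():
        mode = detect(req)        # the detection unit is invoked a single time per distinct request
        for reply in replies:
            reply(mode)           # each video control unit receives the same target play mode
```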
The video background playing control method in the embodiments of the present disclosure may be executed by the terminal device alone, or jointly by the terminal device and the server. For example, when the terminal device is offline, the video background playing control method may be executed by the terminal device; when the terminal device is connected to the network, the video background playing control method may be executed jointly by the terminal device and the server.
When the user's terminal device is offline and the video being played is a downloaded offline video, the background playing of the currently played video can be controlled by the video background playing control system built into the terminal device when the user switches the currently played video to background playing and performs other tasks.
Fig. 9 shows an interaction flowchart of the video background playing control method executed by the terminal device. In step S901, when the user switches the currently played video to background playing, the video playing unit sends a switching request to the video control unit; in step S902, the video control unit switches the currently played video to background playing, simultaneously obtains the video information of the currently played video and the time information corresponding to the video image at the moment of switching, i.e. the current playing timestamp, and forms a detection request from the time information and the video information; in step S903, the detection request is sent to the data interaction unit; in step S904, the data interaction unit sends the detection request to the storage unit; in step S905, the storage unit is queried according to the video information and the time information in the detection request to determine whether a target play mode corresponding to the video information and the time information exists; in step S906, if it exists, the target play mode is returned to the data interaction unit; in step S907, the data interaction unit returns the target play mode to the video control unit to control the background playing of the currently played video according to the target play mode; in step S908, if it does not exist, the detection request is sent to the detection unit; in step S909, the detection unit determines the target play mode according to the video information and the time information; in step S910, the target play mode is returned to the data interaction unit; in step S911, the data interaction unit returns the target play mode to the video control unit to control the background playing of the currently played video according to the target play mode. The method for determining the target play mode according to the video information and the time information in step S909 is the same as in the above embodiments and is not repeated here.
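On the terminal side the flow of fig. 9 is essentially "query the storage unit first, fall back to the detection unit on a miss"; the sketch below compresses steps S901-S911 under assumed component interfaces, reusing the Store type and lookup_play_mode helper sketched earlier, and is not the actual units of the disclosure.

```python
from typing import Callable

def on_switch_to_background(video_id: str, timestamp: float,
                            store: Store,
                            detect: Callable[[Tuple[str, float]], str]) -> str:
    """Video control unit logic when a switch-to-background request arrives (S901-S902)."""
    request = (video_id, timestamp)                      # detection request: video info + time info
    mode = lookup_play_mode(store, video_id, timestamp)  # S904-S906: query the storage unit
    if mode is None:
        mode = detect(request)                           # S908-S909: fall back to the detection unit
        # the S909 result would also be written back into the storage unit for later hits
    return mode                                          # S907/S911: control background playing
```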
In an embodiment of the present disclosure, the data in the storage unit may be updated automatically whenever the terminal device is connected to the network, ensuring that the data in the storage unit is up to date. In this way, when an offline video is played in the background while the device is offline, the hit rate of the offline video in the storage unit in step S905 is increased, the target play mode can be obtained quickly, and the target play mode does not need to be obtained through detection by the detection unit, which further improves data processing efficiency.
When the terminal device is connected to the network, in order to save storage space on the terminal device and improve data processing efficiency, the video playing unit, the video control unit, and the data interaction unit may be disposed in the terminal device while the detection unit and the storage unit are disposed in the server; the interaction flow of video background playing is the same as that shown in fig. 9 and is not repeated here.
Further, another data interaction unit may be disposed in the server to handle the data exchange between the data interaction unit in the terminal device and the detection unit and the storage unit.
Fig. 10 is a schematic diagram of the interaction flow when the terminal device and the server jointly execute the video background playing control method. As shown in fig. 10, steps S1001 to S1003 are the same as steps S901 to S903 and are not repeated here; in step S1004, the front-end data interaction unit sends the detection request to the back-end data interaction unit; in step S1005, the back-end data interaction unit sends the detection request to the storage unit; in step S1006, the storage unit is queried according to the video information and the time information in the detection request to determine whether a target play mode corresponding to the video information and the time information exists; in step S1007, if it exists, the target play mode is returned to the back-end data interaction unit; in step S1008, the back-end data interaction unit returns the target play mode to the front-end data interaction unit; in step S1009, the front-end data interaction unit returns the target play mode to the video control unit to control the background playing of the currently played video; in step S1010, if it does not exist, the detection request is sent to the detection unit; in step S1011, the detection unit determines the target play mode according to the video information and the time information; in step S1012, the target play mode is returned to the back-end data interaction unit; in step S1013, the back-end data interaction unit returns the target play mode to the front-end data interaction unit; in step S1014, the front-end data interaction unit returns the target play mode to the video control unit to control the background playing of the currently played video. The method for determining the target play mode according to the video information and the time information in step S1011 is the same as in the above embodiments and is not repeated here.
In an embodiment of the present disclosure, multiple videos may also be segmented and labelled manually to supplement the data in the storage unit. Specifically, a plurality of videos can be acquired, the images in each video are classified by manual labelling, and each video is divided into a plurality of image sequences, i.e. video segments, according to the image types of the images; then the audio information corresponding to each image sequence can be obtained and classified, and a target play mode is determined according to the audio type; finally, the identification information of each video, the time intervals corresponding to its image sequences, and the target play modes corresponding to those time intervals are written into the storage unit, so that the target play mode can be obtained quickly once the video information in a detection request is successfully matched against the video identification information in the storage unit, improving data processing efficiency and further improving user experience.
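Writing such manually labelled segments into the storage unit is just another call to the update helper sketched earlier; the video identifier and intervals below are purely illustrative.

```python
# Hypothetical manually labelled segments for one video: shot-level intervals with play modes.
labelled_segments = [
    ((0.0, 95.0), "background play"),     # opening dialogue scene
    ((95.0, 140.0), "pause"),             # background-music-only montage
    ((140.0, 260.0), "background play"),
]
update_record(store, "drama_a_ep01", labelled_segments)
```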
The video background playing control method in the embodiments of the present disclosure determines, based on shot segmentation detection, the target play mode to use when the currently played video is switched to background playing. First, a target image is obtained according to the video information and the time information in the received detection request, and an image sequence of the same image type is obtained based on the target image and the frames adjacent to it; then the audio information corresponding to the image sequence is obtained and classified to obtain the audio type; finally, a target play mode is determined according to the audio type, so that the terminal device controls the background playing of the currently played video according to the target play mode. The video background playing control system is also provided with a storage unit for storing the play modes corresponding to the time intervals of each image sequence in already-detected videos, and the storage unit can be queried to obtain the target play mode before the detection unit performs detection. With this technical solution, on the one hand, the video can be divided into a plurality of image sequences based on its shots, and the important pictures in the video whose information cannot be conveyed by the audio can be accurately identified from the audio information of each image sequence; on the other hand, the target play mode can be determined according to the audio information of the image sequence, and the video is controlled to continue or pause background playing according to the target play mode, so that important pictures in the video are not missed while the user multitasks, further improving the user experience; on yet another hand, the target play mode can be obtained quickly by matching against the data in the storage unit, which further improves data processing efficiency and saves computation.
The following describes an embodiment of an apparatus of the present disclosure, which may be used to execute a video background playing control method in the foregoing embodiment of the present disclosure. For details that are not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the video background playing control method described above in the present disclosure.
Fig. 11 schematically shows a block diagram of a video background playback control apparatus according to an embodiment of the present disclosure. The apparatus may be used to perform the corresponding steps in the methods provided by the embodiments of the present application.
Referring to fig. 11, a video background playback control apparatus 1100 according to an embodiment of the present disclosure includes: a request receiving module 1101, a first classification module 1102, a second classification module 1103, and a mode determination module 1104.
The request receiving module 1101 is configured to receive a detection request, where the detection request is generated based on an operation of switching a currently played video to a background playing, and includes video information and time information of the currently played video; a first classification module 1102, configured to obtain a target image according to the video information and the time information, and obtain an image sequence having the same image type based on the target image and a plurality of frames of images adjacent to the target image; a second classification module 1103, configured to obtain audio information corresponding to the image sequence, and classify the audio information to obtain an audio type corresponding to the audio information; and a mode determining module 1104, configured to determine a target play mode according to the audio type, and control a background play mode of the currently played video according to the target play mode.
In an embodiment of the present disclosure, the time information is a current playing timestamp corresponding to the target image in the currently playing video, and the video information is identification information of the currently playing video; the first classification module 1102 is configured to: acquiring a video corresponding to the identification information from a database according to the identification information; and acquiring an image frame corresponding to the current playing time stamp from the video as the target image.
In one embodiment of the present disclosure, the first classification module 1102 is configured to: inputting the target image into a first classification model, and classifying the target image through the first classification model to obtain the image type of the target image; acquiring an Nth frame of image adjacent to the target image, and classifying the Nth frame of image through the first classification model to acquire the image type of the Nth frame of image, wherein N is a positive integer; comparing the image type corresponding to the Nth frame of image with the image type of the adjacent previous frame of image; when the image type corresponding to the N frame image is different from the image type of the adjacent previous frame image, the image sequence is formed according to the target image, the N-1 frame image and all the images between the target image and the N-1 frame image.
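The frame-walking logic described above (classify the target image, keep classifying adjacent frames until the shot type changes, then take everything up to the last frame of the same type as the image sequence) can be sketched as follows; the classifier interface and frame-access convention are illustrative assumptions, not the concrete first classification model.

```python
from typing import Callable, List

def build_image_sequence(frames: List["Image"],
                         target_idx: int,
                         classify: Callable[["Image"], str]) -> List["Image"]:
    """Collect the target frame and the following frames that share its image (shot) type."""
    prev_type = classify(frames[target_idx])       # image type of the target image
    sequence = [frames[target_idx]]
    n = target_idx + 1
    while n < len(frames):
        cur_type = classify(frames[n])
        if cur_type != prev_type:                  # the Nth frame's type differs from the previous frame
            break                                  # the sequence therefore ends at the (N-1)th frame
        sequence.append(frames[n])
        prev_type = cur_type
        n += 1
    return sequence
```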
In some embodiments of the present disclosure, the image type is a shot type corresponding to an image, and the image type includes a first image type and a second image type, wherein the first image type is a long shot, a panorama, or a medium shot, and the second image type is a close shot or a close-up.
In one embodiment of the present disclosure, the second classification module 1103 is configured to: and inputting the audio information into a second classification model, and classifying the audio information through the second classification model to acquire the audio type.
In one embodiment of the present disclosure, the audio types include a first audio type and a second audio type, and the target play mode includes a pause mode and a background play mode; the mode determination module 1104 is configured to: when the audio type is the first audio type, determining that the target playing mode is a background playing mode, and controlling the currently played video to continue background playing according to the background playing mode; and when the audio type is the second audio type, determining that the target playing mode is a pause mode, and controlling the currently played video to pause background playing according to the pause mode.
In one embodiment of the present disclosure, the first audio type is dialogue and voice-over, and the second audio type is background sound.
In some embodiments of the present disclosure, the video background playing control device is further configured to: acquiring an image sample and an image classification sample corresponding to the image sample, and simultaneously acquiring an audio sample and an audio classification sample corresponding to the audio sample; training a first classification model to be trained according to the image sample and an image classification sample corresponding to the image sample to obtain the first classification model; and training a second classification model to be trained according to the audio samples and the audio classification samples corresponding to the audio samples to obtain the second classification model.
In an embodiment of the present disclosure, the video background playing control device 1100 is further configured to: and updating the data in the storage unit according to the video information of the currently played video, the time interval corresponding to the image sequence and the target playing mode corresponding to the time interval.
In an embodiment of the present disclosure, the video background playing control device 1100 is further configured to: after the video information of the currently played video is acquired, matching the video information with the video information in the storage unit; when target video information matched with the video information exists in the storage unit, comparing the time information with a time interval corresponding to the target video information in the storage unit; and when a target time interval with intersection with the time information exists in the storage unit, acquiring a target playing mode corresponding to the target time interval, and controlling a background playing mode of the currently played video according to the target playing mode corresponding to the target time interval.
In an embodiment of the present disclosure, the video background playing control device 1100 is further configured to: and when the currently played video is switched to background playing, displaying prompt information in a graphical user interface of the terminal equipment so that a user can select a target function item according to the prompt information.
According to this technical solution, on the one hand, the video can be divided into a plurality of image sequences based on its shots, and the important pictures in the video whose information cannot be conveyed by the audio can be accurately identified from the audio information of each image sequence; on the other hand, the target play mode can be determined according to the audio information of the image sequence, and the video is controlled to continue or pause background playing according to the target play mode, so that important pictures in the video are not missed while the user multitasks, further improving the user experience; on yet another hand, the target play mode can be obtained quickly by matching against the data in the storage unit, which further improves data processing efficiency and saves computation.
FIG. 12 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.
It should be noted that the computer system 1200 of the electronic device shown in fig. 12 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 12, the computer system 1200 includes a Central Processing Unit (CPU) 1201, which can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 1202 or a program loaded from a storage section 1208 into a Random Access Memory (RAM) 1203, and implements the video background playing control method described in the above embodiments. In the RAM 1203, various programs and data necessary for system operation are also stored. The CPU 1201, ROM 1202, and RAM 1203 are connected to each other by a bus 1204. An Input/Output (I/O) interface 1205 is also connected to the bus 1204.
The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output section 1207 including a Display device such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 1208 including a hard disk and the like; and a communication section 1209 including a network interface card such as a LAN (Local area network) card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the internet. A driver 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 1210 as necessary, so that a computer program read out therefrom is mounted into the storage section 1208 as necessary.
In particular, the processes described above with reference to the flowcharts may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1209, and/or installed from the removable medium 1211. When the computer program is executed by the Central Processing Unit (CPU) 1201, the various functions defined in the system of the present disclosure are performed.
It should be noted that the computer readable medium shown in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
As another aspect, the present disclosure also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method described in the above embodiments.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (15)

1. A video background playing control method is characterized by comprising the following steps:
receiving a detection request, wherein the detection request is generated based on the operation of switching the currently played video into background playing and comprises video information and time information of the currently played video;
acquiring a target image according to the video information and the time information, and acquiring an image sequence with the same image type based on the target image and a plurality of frames of images adjacent to the target image;
acquiring audio information corresponding to the image sequence, and classifying the audio information to acquire an audio type corresponding to the audio information;
and determining a target playing mode according to the audio type, and controlling a background playing mode of the currently played video according to the target playing mode.
2. The video background playing control method according to claim 1, wherein the time information is a current playing timestamp corresponding to the target image in the currently playing video, and the video information is identification information of the currently playing video;
the acquiring the target image according to the video information and the time information includes:
acquiring a video corresponding to the identification information from a database according to the identification information;
and acquiring an image frame corresponding to the current playing time stamp from the video as the target image.
3. The video background playing control method according to claim 1, wherein said obtaining a sequence of images having the same image type based on the target image and a plurality of frames of images adjacent to the target image comprises:
inputting the target image into a first classification model, and classifying the target image through the first classification model to obtain the image type of the target image;
acquiring an Nth frame of image adjacent to the target image, and classifying the Nth frame of image through the first classification model to acquire the image type of the Nth frame of image, wherein N is a positive integer;
comparing the image type corresponding to the Nth frame of image with the image type of the adjacent previous frame of image;
when the image type corresponding to the N frame image is different from the image type of the adjacent previous frame image, the image sequence is formed according to the target image, the N-1 frame image and all the images between the target image and the N-1 frame image.
4. The video background playing control method according to claim 3, wherein the image type is a shot type corresponding to an image, and the image type includes a first image type and a second image type, wherein the first image type is a long-range view, a full-range view or a medium-range view, and the second image type is a close-up view or a close-up view.
5. The video background playing control method according to claim 3, wherein the classifying the audio information to obtain the audio type corresponding to the audio information includes:
and inputting the audio information into a second classification model, and classifying the audio information through the second classification model to acquire the audio type.
6. The video background playing control method according to claim 1 or 5, wherein the audio types include a first audio type and a second audio type, and the target playing mode includes a pause mode and a background playing mode;
the determining a target play mode according to the audio type and controlling a background play mode of the currently played video according to the target play mode includes:
when the audio type is the first audio type, determining that the target playing mode is a background playing mode, and controlling the currently played video to continue background playing according to the background playing mode;
and when the audio type is the second audio type, determining that the target playing mode is a pause mode, and controlling the currently played video to pause background playing according to the pause mode.
7. The video background playback control method according to claim 6, wherein the first audio type is dialogue and voice-over, and the second audio type is background sound.
8. The video background playback control method of claim 5, further comprising:
acquiring an image sample and an image classification sample corresponding to the image sample, and simultaneously acquiring an audio sample and an audio classification sample corresponding to the audio sample;
training a first classification model to be trained according to the image sample and an image classification sample corresponding to the image sample to obtain the first classification model;
and training a second classification model to be trained according to the audio samples and the audio classification samples corresponding to the audio samples to obtain the second classification model.
9. The video background playback control method according to claim 1, wherein the method further comprises:
and updating the data in the storage unit according to the video information of the currently played video, the time interval corresponding to the image sequence and the target playing mode corresponding to the time interval.
10. The video background playback control method according to claim 9, wherein the method further comprises:
after the video information of the currently played video is acquired, matching the video information with the video information in the storage unit;
when target video information matched with the video information exists in the storage unit, comparing the time information with a time interval corresponding to the target video information in the storage unit;
and when a target time interval with intersection with the time information exists in the storage unit, acquiring a target playing mode corresponding to the target time interval, and controlling a background playing mode of the currently played video according to the target playing mode corresponding to the target time interval.
11. The video background playback control method of claim 1, the method further comprising:
and when the currently played video is switched to background playing, displaying prompt information in a graphical user interface of the terminal equipment so that a user can select a target function item according to the prompt information.
12. A video background playback control apparatus, comprising:
the device comprises a request receiving module, a detection module and a processing module, wherein the request receiving module is used for receiving a detection request, the detection request is generated based on the operation of switching a currently played video into background playing, and the detection request comprises video information and time information of the currently played video;
the first classification module is used for acquiring a target image according to the video information and the time information and acquiring an image sequence with the same image type based on the target image and a plurality of frames of images adjacent to the target image;
the second classification module is used for acquiring audio information corresponding to the image sequence and classifying the audio information to acquire an audio type corresponding to the audio information;
and the mode determining module is used for determining a target playing mode according to the audio type and controlling the background playing mode of the currently played video according to the target playing mode.
13. A video background playback control system, comprising:
the video playing unit is used for displaying a video picture on a graphical user interface in the terminal equipment;
the video control unit is connected with the video playing unit and used for responding to the triggering operation of a user on the currently played video, switching the currently played video into background playing and controlling the background playing mode of the currently played video according to a target playing mode;
the data interaction unit is connected with the video control unit and used for receiving a detection request which is sent by the video control unit and generated based on the operation of switching the currently played video into background playing, and sending a target playing mode which is obtained by the detection unit in response to the detection request to the video control unit;
the detection unit is connected with the data interaction unit and used for receiving the detection request, acquiring a target image according to video information and time information corresponding to the currently played video in the detection request, and acquiring an image sequence with the same image type based on the target image and a plurality of frames of images adjacent to the target image; and
acquiring audio information corresponding to the image sequence, classifying the audio information to acquire an audio type corresponding to the audio information, and determining the target playing mode according to the audio type;
and the storage unit is connected with the data interaction unit and the detection unit and is used for storing the video information, a time interval corresponding to the image sequence corresponding to the video information and a target playing mode corresponding to the time interval.
14. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the video background playback control method according to any one of claims 1 to 11.
15. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to perform the video background playing control method of any of claims 1-11.
CN202010367303.2A 2020-04-30 2020-04-30 Video background playing control method, device and system and electronic equipment Active CN111615002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010367303.2A CN111615002B (en) 2020-04-30 2020-04-30 Video background playing control method, device and system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010367303.2A CN111615002B (en) 2020-04-30 2020-04-30 Video background playing control method, device and system and electronic equipment

Publications (2)

Publication Number Publication Date
CN111615002A true CN111615002A (en) 2020-09-01
CN111615002B CN111615002B (en) 2021-07-27

Family

ID=72198055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010367303.2A Active CN111615002B (en) 2020-04-30 2020-04-30 Video background playing control method, device and system and electronic equipment

Country Status (1)

Country Link
CN (1) CN111615002B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738635A (en) * 2020-12-28 2021-04-30 上海掌门科技有限公司 Method and equipment for playing video information
CN112866809A (en) * 2020-12-31 2021-05-28 百度在线网络技术(北京)有限公司 Video processing method and device, electronic equipment and readable storage medium
CN112887778A (en) * 2021-01-15 2021-06-01 Vidaa美国公司 Switching method of video resource playing modes on display equipment and display equipment
CN113741994A (en) * 2021-06-26 2021-12-03 荣耀终端有限公司 Keep-alive method of video application and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106792075A (en) * 2017-01-04 2017-05-31 合网络技术(北京)有限公司 Video broadcasting method and device
CN107734376A (en) * 2017-10-16 2018-02-23 维沃移动通信有限公司 The method and device that a kind of multi-medium data plays
CN109889885A (en) * 2019-02-27 2019-06-14 努比亚技术有限公司 A kind of throwing screen control method, terminal and computer readable storage medium
US20200034015A1 (en) * 2014-07-15 2020-01-30 Google Llc Adaptive background playback behavior
CN110784750A (en) * 2019-08-13 2020-02-11 腾讯科技(深圳)有限公司 Video playing method and device and computer equipment
US20200068165A1 (en) * 2014-03-12 2020-02-27 Google Llc System and method for continuing playback in widget after app is backgrounded

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738635A (en) * 2020-12-28 2021-04-30 上海掌门科技有限公司 Method and equipment for playing video information
CN112738635B (en) * 2020-12-28 2023-05-05 上海掌门科技有限公司 Method and equipment for playing video information
CN112866809A (en) * 2020-12-31 2021-05-28 百度在线网络技术(北京)有限公司 Video processing method and device, electronic equipment and readable storage medium
CN112887778A (en) * 2021-01-15 2021-06-01 Vidaa美国公司 Switching method of video resource playing modes on display equipment and display equipment
CN113741994A (en) * 2021-06-26 2021-12-03 荣耀终端有限公司 Keep-alive method of video application and electronic equipment
CN113741994B (en) * 2021-06-26 2022-07-12 荣耀终端有限公司 Keep-alive method of video application and electronic equipment

Also Published As

Publication number Publication date
CN111615002B (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN111615002B (en) Video background playing control method, device and system and electronic equipment
WO2022121601A1 (en) Live streaming interaction method and apparatus, and device and medium
CN108776676B (en) Information recommendation method and device, computer readable medium and electronic device
WO2017124116A1 (en) Searching, supplementing and navigating media
CN112637622A (en) Live broadcasting singing method, device, equipment and medium
CN109493888B (en) Cartoon dubbing method and device, computer-readable storage medium and electronic equipment
CN103686344A (en) Enhanced video system and method
CN111432282B (en) Video recommendation method and device
CN113766299B (en) Video data playing method, device, equipment and medium
CN111695422A (en) Video tag acquisition method and device, storage medium and server
CN112287168A (en) Method and apparatus for generating video
CN112702613B (en) Live video recording method and device, storage medium and electronic equipment
CN114501064B (en) Video generation method, device, equipment, medium and product
JP2022537860A (en) Voice packet recommendation method, device, electronic device and program
CN113766268B (en) Video processing method and device, electronic equipment and readable medium
CN111294662B (en) Barrage generation method, device, equipment and storage medium
US20230030502A1 (en) Information play control method and apparatus, electronic device, computer-readable storage medium and computer program product
CN115689642A (en) Media content recommendation method and device, electronic equipment and storage medium
CN112287173A (en) Method and apparatus for generating information
CN114327033A (en) Virtual reality equipment and media asset playing method
CN115334367B (en) Method, device, server and storage medium for generating abstract information of video
WO2023207463A1 (en) Voting information generation method and apparatus, and voting information display method and apparatus
CN115499672B (en) Image display method, device, equipment and storage medium
CN113573128B (en) Audio processing method, device, terminal and storage medium
WO2022194084A1 (en) Video playing method, terminal device, apparatus, system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40028074

Country of ref document: HK

GR01 Patent grant
GR01 Patent grant