CN108845742B - Image picture acquisition method and device and computer readable storage medium

Image picture acquisition method and device and computer readable storage medium

Info

Publication number
CN108845742B
CN108845742B (application CN201810654362.0A)
Authority
CN
China
Prior art keywords
image
picture
feature
matching
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810654362.0A
Other languages
Chinese (zh)
Other versions
CN108845742A (en)
Inventor
张云
周俊清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201810654362.0A
Publication of CN108845742A
Application granted
Publication of CN108845742B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484: Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range

Abstract

The embodiment of the invention discloses a method and a device for acquiring an image picture, and a computer-readable storage medium, which are used to improve the screenshot effect and screenshot quality of an image picture and to increase the amount of effective information included in the cover content. The method for acquiring the image picture comprises the following steps: acquiring an image data stream, wherein the image data stream comprises a plurality of frames of image pictures; capturing a first image picture from the image data stream, wherein the first image picture belongs to the plurality of frames of image pictures; identifying a first user interface (UI) element from the first image picture, wherein the first UI element is a UI element which is located at an interface edge of the first image picture and whose position remains unchanged within a preset time period; and cropping away the interface edge area where the first UI element is located from the first image picture to obtain a second image picture, wherein the second image picture is the image area that remains in the first image picture after the interface edge area is cropped away.

Description

Image picture acquisition method and device and computer readable storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for acquiring an image frame, and a computer-readable storage medium.
Background
In the field of game video, it is often necessary to capture an image from the game video as a cover image of the game video.
The prior art provides the following video cover capture scheme: a frame is taken at random from the game video or live broadcast interface, the whole image is captured, scaled down proportionally, and the scaled-down whole image is used as the cover image of the video or live broadcast.
The above prior art has at least the following disadvantage: when the whole screenshot is used as the cover, it does not effectively help the user identify the video or live content, and may even make the cover image more cluttered, so a high-quality cover image cannot be obtained.
Disclosure of Invention
The embodiment of the invention provides an image picture acquiring method and device and a computer readable storage medium, which are used to improve the screenshot effect and screenshot quality of an image picture and to increase the amount of effective information included in the cover content.
The embodiment of the invention provides the following technical scheme:
in one aspect, an embodiment of the present invention provides an image frame acquiring method, including:
acquiring an image data stream, wherein the image data stream comprises a plurality of frames of image pictures;
intercepting a first image picture from the image data stream, wherein the first image picture belongs to the image pictures of the plurality of frames;
identifying a first user interface (UI) element from the first image picture, wherein the first UI element is a UI element which is located at an interface edge of the first image picture and whose position remains unchanged within a preset time period;
and cropping away the interface edge area where the first UI element is located from the first image picture to obtain a second image picture, wherein the second image picture is the image area that remains in the first image picture after the interface edge area is cropped away.
On the other hand, an embodiment of the present invention further provides an apparatus for acquiring an image frame, including:
the data flow acquisition module is used for acquiring an image data flow, and the image data flow comprises a plurality of frames of image pictures;
the picture intercepting module is used for intercepting a first image picture from the image data stream, wherein the first image picture belongs to the image pictures of the plurality of frames;
the picture identification module is used for identifying a first user interface (UI) element from the first image picture, wherein the first UI element is a UI element which is located at an interface edge of the first image picture and whose position remains unchanged within a preset time period;
and the area screenshot module is used for cropping away the interface edge area where the first UI element is located from the first image picture to obtain a second image picture, the second image picture being the image area that remains in the first image picture after the interface edge area is cropped away.
In the foregoing aspect, the constituent modules of the image picture acquiring apparatus may further perform the steps described in the foregoing aspect and its various possible implementations; for details, see the foregoing description of that aspect and its implementations.
In another aspect, an embodiment of the present invention provides an apparatus for acquiring an image frame, where the apparatus includes: a processor, a memory; the memory is used for storing instructions; the processor is configured to execute the instructions in the memory to cause the image frame acquisition device to perform the method according to any one of the preceding aspects.
In another aspect, the present invention provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to perform the method of the above aspects.
In the embodiment of the invention, an image data stream comprising a plurality of frames of image pictures is acquired, a first image picture is captured from the image data stream, and a first user interface (UI) element is identified from the first image picture, the first UI element being a UI element which is located at an interface edge of the first image picture and whose position remains unchanged within a preset time period; finally, the interface edge area where the first UI element is located is cropped away from the first image picture to obtain a second image picture, the second image picture being the image area that remains after the interface edge area is cropped away from the first image picture. Because the first UI element which is located at the interface edge and whose position remains unchanged within the preset time period can be identified from the first image picture, and the interface edge area where it is located can be cropped away, the resulting second image picture does not include that interface edge area and retains the image of the non-edge area of the interface. The second image picture can thus intuitively display screenshot content that is more closely related to the image data stream, which improves the screenshot effect and screenshot quality of the image picture and increases the amount of effective information included in the cover content.
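The steps summarized above can be sketched end to end in a few lines. This is a minimal illustration, not the patent's implementation: the frame is a plain 2-D list of pixels, and `identify_edge_margins` is a stand-in for the UI-element recognition step, assumed to return the (top, bottom, left, right) margins of the interface edge areas. All names here are illustrative.

```python
def acquire_cover(stream, identify_edge_margins):
    """Capture a frame from the stream, then crop away its edge areas.

    `stream` is any iterable of frames (2-D lists of pixel rows);
    `identify_edge_margins` stands in for recognizing the first UI
    elements and returns margins (top, bottom, left, right) in pixels.
    """
    first_picture = next(iter(stream))          # capture the first image picture
    top, bottom, left, right = identify_edge_margins(first_picture)
    rows, cols = len(first_picture), len(first_picture[0])
    # Keep only the non-edge area: this is the second image picture.
    return [row[left:cols - right]
            for row in first_picture[top:rows - bottom]]

# A 4x6 stand-in frame whose outermost one-pixel ring is the edge area.
frames = [[[1] * 6 for _ in range(4)]]
cover = acquire_cover(frames, lambda f: (1, 1, 1, 1))
```

Applied to the 4x6 frame, the cover keeps the inner 2x4 area; the union of the cropped margins and the cover reconstitutes the first image picture, as the summary describes.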
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present invention; those skilled in the art can derive other drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of a system architecture to which an image frame acquisition method according to an embodiment of the present invention is applied;
fig. 2 is a schematic flowchart of a method for acquiring an image frame according to an embodiment of the present invention;
fig. 3 is a schematic flowchart of another method for acquiring an image frame according to an embodiment of the present invention;
fig. 4 is a schematic diagram illustrating a comparison between a random frame capture effect provided in the prior art and an effect obtained after the area capture in the embodiment of the present invention;
fig. 5 is a schematic diagram of a system architecture for capturing a live cover page according to an embodiment of the present invention;
fig. 6 is a schematic diagram illustrating a matching process between an image frame and a feature picture according to an embodiment of the present invention;
FIG. 7-a is a diagram illustrating a feature picture in a feature database according to an embodiment of the present invention;
FIG. 7-b is a diagram illustrating another feature picture in a feature database according to an embodiment of the present invention;
FIG. 8-a is a schematic diagram of a method for labeling non-core elements from an image frame according to an embodiment of the present invention;
FIG. 8-b is a schematic diagram of a non-core element labeled from another image frame according to an embodiment of the present invention;
FIG. 8-c is a schematic diagram of a non-core element labeled from another image frame according to an embodiment of the present invention;
FIG. 9-a is a schematic diagram of a configuration of an apparatus for acquiring an image frame according to an embodiment of the present invention;
fig. 9-b is a schematic diagram illustrating a structure of a picture recognition module according to an embodiment of the present invention;
FIG. 9-c is a schematic diagram of a structure of an image matching unit according to an embodiment of the present invention;
FIG. 9-d is a schematic diagram illustrating a structure of a region matching unit according to an embodiment of the present invention;
FIG. 9-e is a schematic diagram of another image frame acquisition apparatus according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of a terminal to which an image frame acquiring method according to an embodiment of the present invention is applied;
fig. 11 is a schematic structural diagram of a server to which the method for acquiring an image frame according to the embodiment of the present invention is applied.
Detailed Description
The embodiment of the invention provides an image picture acquiring method and device and a computer readable storage medium, which are used to improve the screenshot effect and screenshot quality of an image picture and to increase the amount of effective information included in the cover content.
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one skilled in the art from the embodiments given herein are intended to be within the scope of the invention.
The terms "comprises" and "comprising," and any variations thereof, in the description and claims of this invention and the above-described drawings are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Please refer to fig. 1, which illustrates a schematic diagram of the system architecture to which the image frame acquiring method according to the embodiment of the present application is applied. The system may include: a data stream server 110 and an image picture acquiring device 120, wherein the data stream server can provide an image data stream to the image picture acquiring device 120, and the image data stream comprises a plurality of frames of image pictures. Data transmission between the image picture acquiring device 120 and the data stream server 110 is performed through a communication network. The image picture acquiring device 120 may specifically be the terminal 120 shown in fig. 1; alternatively, the image picture acquiring device 120 may also be an image picture acquiring server. The terminal may be a mobile phone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, a desktop computer, or the like.
In the embodiment of the present invention, the terminal 120 may obtain an image data stream from the data stream server 110 through a communication network, and capture a first image picture from the image data stream. Then, the terminal 120 may identify, from the first image picture, a first user interface (UI) element that is located at an interface edge and whose position remains unchanged within a preset time period, and crop away the interface edge area where the first UI element is located. The resulting second image picture therefore does not include the interface edge area and retains the image of the non-edge area of the interface, so that screenshot content more closely related to the image data stream can be visually displayed through the second image picture, thereby improving the screenshot effect and screenshot quality of the image picture and increasing the amount of effective information included in the cover content.
For example, the image data stream may be a video data stream or a live data stream. In the prior art, a full-frame capture includes a large number of game interface elements and anchor-related elements (such as the anchor avatar, anchor fan group information, and play information); these elements contribute little to identifying the content of a live broadcast or game video, yet make the cover screenshot more cluttered. Through the picture recognition and the screenshot of the non-edge area of the interface in the embodiment of the invention, the first UI elements which do not change within a certain time in the video and live broadcast interfaces and are arranged at the interface edge are removed, and a second image picture which is more core and has a clear interface is captured, so that a user can intuitively see clearer information related to the live broadcast and video content, such as the game, roles, modes, and props.
The following description is made in detail from the viewpoint of an image screen acquisition apparatus. Referring to fig. 2, the method for acquiring an image frame according to an embodiment of the present invention may be specifically applied to a cover image acquisition scene of an image data stream, and may include the following steps:
201. and acquiring an image data stream, wherein the image data stream comprises a plurality of frames of image pictures.
In an embodiment of the present invention, the data stream server provides an image data stream, which may be a video data stream or a live data stream, and the image data stream includes a plurality of frames of image pictures, for example, the image data stream is composed of a plurality of frames of continuous image pictures. The image frame may be an image in an application, for example, an image generated by a game application.
The acquiring means of the image screen may acquire the image data stream from the data stream server and then trigger execution of step 202.
202. A first image picture is intercepted from the image data stream, and the first image picture belongs to the image pictures of a plurality of frames.
In the embodiment of the present invention, the image picture acquiring device may capture a frame from the image data stream to obtain the first image picture. When acquiring the image data stream, the entire image frame may be captured from it; to distinguish it from the image pictures obtained by cropping in subsequent embodiments, the image picture captured from the image data stream is referred to as the first image picture, and the first image picture may be a certain frame among the plurality of frames of image pictures in the image data stream.
In some embodiments of the present invention, step 202 intercepts a first image frame from an image data stream, comprising the steps of:
and intercepting the first image picture from the image data stream according to a preset interception period.
When capturing image pictures from the image data stream, the capture may be timed according to a preset capture period. For example, the capture period may be set according to the type of application program to which the image data stream belongs; the application program may be a game application, and the capture period may be set differently for different game applications. After the capture period is set, captures are timed based on the period, so that a plurality of first image pictures are automatically captured from the image data stream. The subsequent steps in the embodiment of the present invention may be performed for each captured image picture to remove its interface edge area, as described in detail in the following embodiments.
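Periodic capture from a stream can be sketched with a simple generator. This is an illustrative model only: the stream is an iterable of frames, and the period is expressed as "every N-th frame" rather than wall-clock time, which the patent leaves unspecified.

```python
def capture_frames(stream, period):
    """Yield every `period`-th frame from an iterable of frames.

    `stream` and `period` are illustrative names; the patent only
    specifies that capture happens at a preset capture period.
    """
    for index, frame in enumerate(stream):
        if index % period == 0:
            yield frame

# A stand-in stream of 10 numbered frames, captured every 3rd frame.
frames = list(capture_frames(range(10), period=3))
```

For a real video or live stream, the period would instead be driven by a timer or frame timestamps, chosen per game application as the passage describes.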
203. And identifying a first UI element from the first image picture, wherein the first UI element is a UI element which is positioned on the edge of the interface in the first image picture and the position of the element is kept unchanged within a preset time period.
In the embodiment of the present invention, after the image picture acquiring device captures the first image picture, picture recognition is performed on it to recognize the first UI element it includes, where the first UI element refers to a UI element that is located at an interface edge of the first image picture and whose position remains unchanged within a preset time period. The first UI element in the embodiment of the present invention refers to a screen element displayed on a UI, and in different application scenarios the first UI element may represent different specific screen elements. Taking a game scene as an example, the first UI element may refer to a non-core element in the game image picture, such as an operation button, game data, or an avatar, which remains nearly immobile in the live/video data stream. In contrast, a core element in the game image picture may refer to a UI element such as a game character, prop, or skin.
In the embodiment of the present invention, the UI elements in the first image picture are subjected to picture recognition to determine which of them satisfy the above requirements for the first UI element. A UI element in the first image picture that is located at the interface edge and whose position remains unchanged within the preset time period is referred to as a first UI element. Here, the interface edge may refer to at least one of the upper edge, the lower edge, the left edge, or the right edge of the interface in the first image picture. For example, being located at the interface edge may mean that the first UI element is present at the periphery of the interface in the first image picture, that is, in the surrounding area that encloses the core elements.
204. And cutting off the interface edge area where the first UI element is located from the first image picture to obtain a second image picture, wherein the second image picture is the image area left after the interface edge area is cut off from the first image picture.
In the embodiment of the present invention, after the first UI element is identified from the first image picture, the image area where it is located is determined and referred to as the interface edge area. Then an area screenshot is performed on the first image picture, that is, the interface edge area where the non-core element is located is cropped away; the image area remaining in the first image picture after the interface edge area is cropped away is referred to as the second image picture. The second image picture may be an image area including the core elements, and the union of the second image picture and the interface edge area forms the first image picture.
For example, in the embodiment of the present invention, an area screenshot may be performed on the first image picture according to the interface edge area where the first UI element is located, where an area screenshot refers to capturing a certain part of the entire game screen, removing the non-core elements that a random full-frame cover would include. Through picture identification and area screenshot, the game UI elements which remain unchanged in the video and live broadcast interfaces and surround the interface are removed, and a cover which is more core and has a clear interface is captured. A video is content that a user can watch repeatedly, whereas a live broadcast is more time-sensitive and can only be watched at a specific time. By removing the first UI element from the first image picture, the embodiment of the invention enables the user to intuitively see clearer information related to the live broadcast and video content, such as the game, roles, modes, and props.
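An area screenshot, as opposed to a full-frame screenshot, is just a rectangular crop. A minimal sketch, assuming the frame is a 2-D list of pixel rows and that the rectangle has already been derived from the identified interface edge areas; the parameter names are illustrative, not from the patent.

```python
def area_screenshot(frame, x, y, width, height):
    """Crop a rectangular area from a frame (2-D list of pixel rows).

    (x, y) is the top-left screenshot coordinate and width/height the
    size of the area to keep; everything outside is discarded.
    """
    return [row[x:x + width] for row in frame[y:y + height]]

# An 8x10 stand-in frame; keep the central area, discarding edge bands.
frame = [[0] * 10 for _ in range(8)]
second_picture = area_screenshot(frame, x=2, y=1, width=6, height=5)
```

In a real implementation the crop rectangle would be the complement of the interface edge areas where the first UI elements were recognized.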
In the embodiment of the invention, an image data stream comprising a plurality of frames of image pictures is acquired, a first image picture is captured from the image data stream, and a first user interface (UI) element is identified from the first image picture, the first UI element being a UI element which is located at an interface edge of the first image picture and whose position remains unchanged within a preset time period; finally, the interface edge area where the first UI element is located is cropped away from the first image picture to obtain a second image picture, the second image picture being the image area that remains after the interface edge area is cropped away from the first image picture. Because the first UI element which is located at the interface edge and whose position remains unchanged within the preset time period can be identified from the first image picture, and the interface edge area where it is located can be cropped away, the resulting second image picture does not include that interface edge area and retains the image of the non-edge area of the interface. The second image picture can thus intuitively display screenshot content that is more closely related to the image data stream, which improves the screenshot effect and screenshot quality of the image picture and increases the amount of effective information included in the cover content.
Fig. 3 is a schematic flow chart illustrating another method for acquiring an image frame according to an embodiment of the present invention. The method mainly comprises the following steps:
301. and acquiring an image data stream, wherein the image data stream comprises a plurality of frames of image pictures.
302. A first image picture is intercepted from the image data stream, and the first image picture belongs to the image pictures of a plurality of frames.
Steps 301 to 302 are similar to steps 201 to 202 in the previous embodiment, and are not described again here.
After the first image screen is captured, the embodiment of the present invention may recognize the first UI element from the first image screen in the following manner from step 303 to step 306.
303. And acquiring a feature database, wherein the feature database comprises a plurality of feature pictures.
In the embodiment of the present invention, the image frame acquisition apparatus may be configured with a feature database in advance, where a plurality of feature pictures are stored in the feature database, where a feature picture refers to an image region including a core element of an image frame. For example, the core element in the game image screen may refer to a UI element such as a game character, a prop, and a skin in the game image screen.
In some embodiments of the present invention, the plurality of feature pictures included in the feature database respectively correspond to screenshot pictures of the application program in different application scenes.
For example, screenshot pictures of the application program in different application scenes can be stored in advance as feature pictures, and the feature pictures extracted in advance do not include non-core elements. For example, the feature database may be a game feature database, where the game feature database may include a screenshot of a clear portion of a game screen, and multiple screenshots may be taken as game feature pictures by the same game application.
304. And respectively matching the first image picture with a plurality of characteristic pictures in the characteristic database.
In the embodiment of the present invention, the first image picture captured from the image data stream may be matched against all the feature pictures in the feature database, so as to determine which feature picture in the feature database matches the first image picture. If the feature database does not contain a feature picture that matches the first image picture, or the matching degree is very low, a match failure is returned; in this case, step 302 may be executed again to capture the first image picture of the next frame and match it against the feature database.
In the embodiment of the present invention, there may be a plurality of ways to match the first image picture with a feature picture. For example, whether they match may be determined by an image similarity detection algorithm: similarity detection may use histograms, for example two kinds of histograms may be calculated, a statistical histogram and a cumulative histogram, and whether the first image picture and the feature picture match is determined based on the histogram result. For another example, the embodiment of the present invention may determine whether the first image picture and the feature picture match according to the Scale-Invariant Feature Transform (SIFT) algorithm; which image similarity detection algorithm to select may be determined in combination with the actual application scenario.
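The histogram route mentioned above can be illustrated with a grayscale statistical histogram compared by histogram intersection. This is one possible similarity measure, not necessarily the one the patent intends; pictures are modelled as flat lists of grayscale values, and the bin count is an arbitrary choice.

```python
def histogram(pixels, bins=16):
    """Normalized statistical histogram of grayscale values in [0, 256)."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts]

def histogram_similarity(a, b):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(x, y) for x, y in zip(histogram(a), histogram(b)))

# Two stand-in 'pictures' as flat grayscale pixel lists.
same = histogram_similarity([10, 200, 30, 40], [10, 200, 30, 40])
diff = histogram_similarity([0, 0, 0, 0], [255, 255, 255, 255])
```

A cumulative histogram variant would compare running sums of the bins instead, which is less sensitive to bin-boundary effects.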
In some embodiments of the present invention, the step 304 of matching the first image frame with the plurality of feature pictures in the feature database respectively includes:
extracting a first characteristic point from the first image picture, and extracting a second characteristic point from each characteristic picture in the characteristic database;
carrying out similarity matching on the first characteristic points and the second characteristic points to obtain similarity matching results;
and determining whether the first image picture is matched with the characteristic picture according to the similarity matching result.
Feature points can be respectively extracted from the first image picture and all feature pictures in the feature database. The feature points may also be referred to as key points, and for example, the key points in the image are respectively extracted according to the SIFT algorithm. For convenience of description, the feature points extracted from the first image picture are referred to as first feature points, and the feature points extracted from the feature picture are referred to as second feature points.
After the first feature points and the second feature points are respectively computed, similarity matching may be performed on them to obtain a similarity matching result, and whether the first image picture and the feature picture match is determined according to that result; step 305 is performed if the pictures are similar. For example, first SIFT features are extracted from the first image picture using the SIFT algorithm, and then matched against the SIFT features of the feature pictures in the game feature database.
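Feature-point similarity matching is commonly done by nearest-neighbour search with Lowe's ratio test, which is the standard companion to SIFT; the patent does not specify the matching rule, so this is a hedged sketch. Descriptors are plain tuples of floats standing in for SIFT vectors, and all names are illustrative.

```python
import math

def match_keypoints(desc_a, desc_b, ratio=0.75):
    """Match descriptors by nearest neighbour with Lowe's ratio test.

    A match (i, j) is kept only when the best neighbour of desc_a[i]
    in desc_b is clearly closer than the second best, which filters
    out ambiguous correspondences.
    """
    matches = []
    for i, a in enumerate(desc_a):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) > 1 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches

# Toy 2-D descriptors: the first two have clear counterparts, the
# third is ambiguous and should be rejected by the ratio test.
first = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
feature = [(0.1, 0.0), (5.0, 4.9), (9.0, 6.0)]
pairs = match_keypoints(first, feature)
```

Real SIFT descriptors are 128-dimensional, but the matching logic is the same; the matched pairs then feed the diagonal-coordinate computation of step 305.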
305. When a first feature picture in the feature database is matched with a first image picture, a feature matching area matched with the first feature picture is determined from the first image picture, and the first feature picture belongs to a plurality of feature pictures.
In the embodiment of the present invention, after the first image picture is matched against all the feature pictures in the feature database, if it is determined that a first feature picture matches the first image picture, a feature matching region may be determined from the first image picture, where the feature matching region is the image region in the first image picture that matches the first feature picture. For example, the feature matching region matching the first feature picture is outlined in the first image picture.
In some embodiments of the present invention, the step 305 determining a feature matching region matching the first feature picture from the first image picture includes:
determining two diagonal coordinates matched with the second feature point from a plurality of first feature points of the first image picture;
acquiring the length and the width of the feature matching area according to the pixel difference value between the two diagonal coordinates;
acquiring screenshot coordinates from a first image picture according to the two diagonal coordinates, the length and the width of the feature matching area and the length and the width of the first feature picture;
and determining the feature matching area according to the screenshot coordinate and the length and width of the feature matching area.
When the similarity matching algorithm is used for determining that the first feature picture is matched with the first image picture, two diagonal coordinates matched with the second feature point are determined from a plurality of first feature points of the first image picture according to the second feature point in the first feature picture. The two diagonal coordinates refer to two first feature points that constitute a diagonal relationship in the first image picture. After determining the two diagonal coordinates, the length and width of the feature matching region may be obtained according to the pixel difference between the two diagonal coordinates, for example, the pixel value of one diagonal coordinate is subtracted from the pixel value of the other diagonal coordinate, and the length and width of the feature matching region may be calculated according to the difference between the two pixel values, and the length and width may be expressed by pixels.
When the first feature picture is found from the feature database, the length and the width of the first feature picture can be obtained. The two diagonal coordinates and the length and width of the feature matching area are then obtained through the above calculation. Next, according to the relationship between the length and width of the feature matching area and the length and width of the first feature picture, the two diagonal coordinates are used as reference points to determine which pixel point in the first image picture can serve as the screenshot coordinate, where the screenshot coordinate refers to the starting point coordinate used when taking a screenshot of the first image picture. For example, the length of the feature matching area is subtracted from the length of the first feature picture, and the width of the feature matching area is subtracted from the width of the first feature picture, yielding difference results in the length and width directions; the screenshot coordinate in the first image picture is then determined by shifting from the two diagonal coordinates according to those differences. Finally, the feature matching area is determined from the screenshot coordinate and the length and width of the feature matching area, that is, the image area with that length and width is cut out of the first image picture starting from the screenshot coordinate.
For example, the first image frame is compared with the game feature picture in the game feature database, and if the pictures are similar, the feature matching area of the part similar to the game feature picture can be cut out from the first image frame. The feature matching region may be a rectangular region determined from two diagonal coordinates in the first image frame. After the rectangular area is determined, the picture in the rectangular area can be intercepted, namely the second image picture obtained in the subsequent embodiment is obtained, and the second image picture can be used as the current live cover of the live data stream.
306. A first UI element is identified from an image area in the first image screen that does not belong to the feature matching area.
In the embodiment of the present invention, after the feature matching region is detected from the first image screen in the above manner, the UI element is identified in the image region of the first image screen that does not belong to the feature matching region, and the UI element identified in the image region that does not belong to the feature matching region is the first UI element described in the foregoing embodiment.
307. And cutting off the interface edge area where the first UI element is located from the first image picture to obtain a second image picture, wherein the second image picture is the image area left after the interface edge area is cut off from the first image picture.
In the embodiment of the present invention, after the first UI element is identified from the first image picture, the image area where the first UI element is located is determined, and this area is referred to as the interface edge area. An area screenshot is then performed on the first image picture; that is, the interface edge area where the non-core elements are located is cut off from the first image picture. The image area remaining after the interface edge area is cut off is referred to as the second image picture, which may be an image area containing the core elements; the union of the second image picture and the interface edge area forms the first image picture.
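The cropping step above can be sketched as follows (the frame is modeled as a 2-D list of pixel rows and the edge margins are given per side; the function name and margin parameters are illustrative assumptions):

```python
def crop_edge_area(frame, top, bottom, left, right):
    """Remove the interface edge margins (where the first UI element
    sits) from a frame given as a 2-D list of pixel rows, returning the
    remaining core region as the second image picture."""
    height, width = len(frame), len(frame[0])
    return [row[left:width - right] for row in frame[top:height - bottom]]
```

The union of the cropped region and the removed margins reconstitutes the original frame, matching the relationship between the first and second image pictures described above.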
For example, fig. 4 is a schematic diagram comparing the random frame-capture effect of the prior art with the effect obtained after the area screenshot of the embodiment of the present invention. According to the embodiment of the invention, the area screenshot can be performed on the first image picture according to the interface edge area where the first UI element is located, where the area screenshot refers to capturing a certain part of the whole game picture and removing the non-core elements of a randomly captured live video cover. Through picture recognition and area screenshot, the game UI elements that remain unchanged around the video and live interfaces are removed, and a cover that is more core and has a clear interface is captured. A video is content the user can watch repeatedly, while a live broadcast has higher timeliness and can only be watched at a specific time. By removing the first UI element from the first image picture, the user can intuitively see clearer information related to the live broadcast and video content, such as the game, role, mode, and props.
In some embodiments of the present invention, after the step 307 cuts the interface edge area where the first UI element is located from the first image frame to obtain a second image frame, the method provided in an embodiment of the present invention further includes the following steps:
the second image frame is sent to an information server, which updates the second image frame to a cover image of the image data stream.
The information server can acquire the second image picture from the acquisition device of the image picture, and the information server updates the second image picture in the image list to be used as a cover image of the image data stream, so that the cover image of the image data stream can be acquired in real time.
As can be seen from the above description of the embodiment of the present invention, a feature database is configured in advance, and the plurality of feature pictures in the feature database are respectively matched with the first image picture to determine a first feature picture matching the first image picture and to determine, from the first image picture, a feature matching area matching the first feature picture. In the image area of the first image picture not belonging to the feature matching area, a first UI element that is located at the interface edge and whose position remains unchanged within a preset time period can then be detected, and the interface edge area where the first UI element is located is cut off from the first image picture. The second image picture therefore does not include the interface edge area where the first UI element is located, while the image of the non-edge area of the interface is retained. Through the second image picture, screenshot content more closely related to the image data stream can be displayed intuitively, the screenshot effect and quality are improved, and the amount of effective information included in the cover content is increased.
In order to better understand and implement the above-mentioned schemes of the embodiments of the present invention, the following description specifically illustrates corresponding application scenarios.
The embodiment provides a cover-capturing method that removes non-core elements from a randomly captured live video cover through picture recognition and automatic area screenshot, so that a cover that is more core and has a clear interface can be captured. Game UI elements and character information around the game interface are removed, the cluttered feeling of the cover is reduced, and the transmission of information about the game content is amplified. A clear interface means that matching is performed against a preset model interface, and the matched interface is the clear image part.
Fig. 5 is a schematic diagram of a system architecture for capturing a live cover page provided in the embodiment of the present invention. The system architecture can comprise: a live streaming or video streaming server, a cover screenshot server, a game feature database, a live cover database, a plurality of live information servers, and N users (user 1, user 2, …, user N-1, and user N, respectively). The method mainly comprises the following interactive processes:
step 1, a cover screenshot server acquires an image data stream from a live stream or video stream server, for example, a live stream or a video data stream.
And 2, the front cover screenshot server intercepts an image picture from the image data stream, then matches the image picture with a game feature picture in a game feature database, identifies non-core elements from the image picture, then removes an interface edge area where the non-core elements in the image picture are located, so as to obtain an image front cover of the image data stream, and stores the image front cover in a live broadcast front cover database.
And 3, the user can pull a live broadcast list from the live broadcast information server, wherein the live broadcast list comprises a live broadcast cover picture.
The cover data in the live broadcast cover database is periodically captured from the live/video stream by the cover screenshot server. The game feature database is preconfigured during product operation; it is a feature database of game scene images prepared in advance and is constructed from a number of game pictures.
The following illustrates an application scenario of the embodiment of the present invention: the scene intercepted by the cover page comprises a live broadcast source and a video library in the background. The scene presented by the cover is in product interfaces of personal computers, mobile terminals and the like for presenting live broadcast and video contents. Wherein, if the screenshot is from the live interface, the live source is needed, and if the screenshot is from the video picture, the video library is needed.
The functional characteristic of the method and device is that unchanging UI elements in the image picture, such as game UI elements and anchor-related information, are identified; the unchanging non-core elements are removed through automatic screenshots, and the core game interface is retained as the cover of the live broadcast and video content.
Fig. 6 is a schematic diagram of a matching process between an image frame and a feature picture according to an embodiment of the present invention, which mainly includes the following steps:
and S01, reading the live data stream or the video data stream.
Wherein the image pictures of a plurality of frames in a live data stream or a video data stream can be read.
And S02, regularly intercepting the image picture.
For example, the picture frames are cut at a preset cut cycle timing.
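The timed interception above can be sketched by mapping the preset capture period to frame indices of the stream (the fps value, period, and function name are illustrative assumptions; a real system would grab these frames from the live/video stream):

```python
def capture_frame_indices(fps, period_seconds, total_frames):
    """Indices of the frames to intercept when capturing one image
    picture every period_seconds from a stream running at fps."""
    step = int(fps * period_seconds)
    return list(range(0, total_frames, step))
```

For example, at 30 fps with a 2-second capture period, frames 0, 60, 120, ... would be intercepted as candidate cover pictures.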
And S03, acquiring a game feature database.
The game feature pictures in the game feature database are loaded. The feature pictures are screenshots of the clear parts of game pictures, and the same game may have multiple such screenshots. For example, fig. 7-a is a schematic diagram of one feature picture in the feature database provided in the embodiment of the present invention, and fig. 7-b is a schematic diagram of another feature picture in the feature database provided in the embodiment of the present invention. As illustrated in fig. 7-a and 7-b, the non-core elements have been removed from the feature picture, i.e. the interface edge area where the non-core elements are located has been removed from the feature picture.
S04. Has feature matching been completed?
The captured image picture is matched against the feature pictures in the game feature database, and it is determined whether the image picture has already been matched against all feature pictures in the game feature database; if so, step S05 is executed, and if not, step S06 is executed.
And S05, returning the matching failure.
And S06, taking the next feature picture.
And S07, matching the captured image picture with the characteristic picture.
For example, according to the SIFT algorithm, the captured image picture is compared with a game feature picture of the game feature database: SIFT features are extracted from the captured picture by using the SIFT algorithm and then matched with the SIFT features in the game feature database. If the pictures are similar, the area of the similar part may be cut out of the screenshot. Taking two diagonal coordinates in the feature matching area determines a rectangular area. After the rectangular area is determined, the picture of the rectangular area can be captured and used as the current live broadcast cover.
It should be noted that the captured image picture needs to be matched against each of the feature pictures in the game feature database in turn, and the matching process loops until all feature pictures of the game have been tried.
The SIFT algorithm finds key points (feature points) on the two pictures, calculates the directions of the key points, and determines whether the screenshot matches the feature picture according to the number of matched key points. For example, a key-point number threshold is set; if the number of matched key points is greater than the threshold, the pictures are considered matched, otherwise they are not.
S08, match?
And S09, returning the matched feature matching area.
If there is a match, the matched coordinates (x1, y1) and (x2, y2) are found; these are the two diagonal coordinates. The length and width of the matching region are calculated as length = x2 - x1 and width = y2 - y1. Assuming the feature picture's length and width are L and W, the screenshot coordinates should be X = x1 - (L - length)/2 and Y = y2 + (W - width)/2.
The screenshot coordinates (X, Y), together with the length L and width W, determine the feature matching region. If no matching feature picture exists in the game feature database, or the matching degree is low, a matching failure is returned.
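The coordinate arithmetic above can be sketched directly; variable names follow the formulas in this document, and pixel units are assumed:

```python
def feature_match_region(x1, y1, x2, y2, L, W):
    """Given the diagonal coordinates (x1, y1) and (x2, y2) of the
    matched key points, and the feature picture's length L and width W,
    return the screenshot start coordinates plus the matching region's
    length and width, per the formulas in the text."""
    length = x2 - x1             # length of the matching region
    width = y2 - y1              # width of the matching region
    X = x1 - (L - length) / 2    # shift back by half the length surplus
    Y = y2 + (W - width) / 2     # shift by half the width surplus
    return X, Y, length, width
```

With diagonal points (10, 20) and (110, 80) and a 120 x 70 feature picture, this yields a 100 x 60 matching region starting from (0.0, 85.0).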
In the embodiment of the present invention, the game feature database is a preconfigured database, and in the present embodiment, a feature database of a game scene image needs to be prepared in advance, as shown in fig. 7-a and 7-b, the game feature database is constructed by several game pictures. The characteristic picture is characterized in that: 1. standard game scene pictures, and static UI elements are removed; 2. a game may have multiple feature pictures to cover different game scenes.
Fig. 8-a is a schematic diagram illustrating the labeling of non-core elements from one image frame according to an embodiment of the present invention, fig. 8-b is a schematic diagram illustrating the labeling of non-core elements from another image frame according to an embodiment of the present invention, and fig. 8-c is a schematic diagram illustrating the labeling of non-core elements from another image frame according to an embodiment of the present invention. By matching the image picture and the characteristic picture in the embodiment of the invention, the non-core element can be identified from the image picture, the interface edge area where the non-core element is positioned is determined, and the interface edge area is removed to obtain the interface non-edge. The removal of the boundary region where the non-core element is located is performed for different game image frames in fig. 8-a to 8-c.
It should be noted that, in the embodiment of the present invention, the area screenshot may also be performed in a character-following manner: for example, a character picture is included in the feature database, and when a character in the screenshot is similar to the feature picture, a picture of that character is extracted. In addition, the area screenshot can also be completed by a fixed, area-centered geometric capture, that is, each screenshot retains only the part of the picture centered in the screen area.
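The fixed, area-centered geometric capture mentioned above can be sketched as computing a centered bounding box (the half-size ratio and function name are illustrative assumptions; the patent does not specify the retained proportion):

```python
def centered_crop_box(screen_width, screen_height, ratio=0.5):
    """Bounding box (left, top, right, bottom) of a centered region
    covering `ratio` of each screen dimension."""
    crop_w = int(screen_width * ratio)
    crop_h = int(screen_height * ratio)
    left = (screen_width - crop_w) // 2
    top = (screen_height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h
```

For a 1920 x 1080 frame at the default ratio, the retained box is (480, 270, 1440, 810), i.e. the central quarter of the screen area.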
According to the embodiment of the invention, the game feature data is matched against the real live pictures, for example by picture similarity matching with the SIFT algorithm, so that the core elements and scenes of the game process can be extracted automatically, the core elements of the anchor's live picture are better presented to the user, and users are attracted to watch. The embodiment of the invention can improve the quality of live game and video covers, including the effectiveness of picture information transmission and the aesthetics of the cover. No manual editing is required: screenshots are taken automatically based on the game feature pictures. With the cover effect guaranteed, the whole live broadcast and video list interface becomes more concise and beautiful.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
To facilitate a better implementation of the above-described aspects of embodiments of the present invention, the following also provides relevant means for implementing the above-described aspects.
Referring to fig. 9-a, an apparatus 900 for acquiring an image frame according to an embodiment of the present invention may include: a data flow acquisition module 901, a picture capture module 902, a picture recognition module 903, and an area capture module 904, wherein,
a data stream acquiring module 901, configured to acquire an image data stream, where the image data stream includes multiple frames of image frames;
a frame capture module 902, configured to capture a first image frame from the image data stream, where the first image frame belongs to the image frames of the multiple frames;
a picture identification module 903, configured to identify a first user interface UI element from the first image, where the first UI element is a UI element that is located at an interface edge in the first image and whose element position remains unchanged within a preset time period;
and an area screenshot module 904, configured to truncate the interface edge area where the first UI element is located from the first image to obtain a second image, where the second image is an image area remaining after the interface edge area is truncated in the first image.
In some embodiments of the present invention, referring to fig. 9-b, the picture recognition module 903 comprises:
the feature image loading unit 9031 is configured to acquire a feature database, where the feature database includes multiple feature images;
an image matching unit 9032, configured to match the first image with multiple feature images in the feature database, respectively;
a region matching unit 9033, configured to determine, when a first feature picture in the feature database matches the first image picture, a feature matching region that matches the first feature picture from the first image picture, where the first feature picture belongs to the multiple feature pictures;
an element identification unit 9034, configured to identify the first UI element from an image area in the first image that does not belong to the feature matching area.
In some embodiments of the present invention, referring to fig. 9-c, the image matching unit 9032 includes:
a feature extraction subunit 90321, configured to extract a first feature point from the first image picture, and extract a second feature point from each feature picture in the feature database;
a similarity calculation subunit 90322, configured to perform similarity matching on the first feature point and the second feature point, so as to obtain a similarity matching result;
a determining subunit 90323, configured to determine whether the first image picture and the feature picture match according to a similarity matching result.
In some embodiments of the present invention, referring to fig. 9-d, the region matching unit 9033 includes:
a diagonal coordinate determination subunit 90331, configured to determine two diagonal coordinates that match the second feature point from among the plurality of first feature points of the first image;
a length and width calculation subunit 90332, configured to obtain the length and width of the feature matching region according to a pixel difference between the two diagonal coordinates;
a screenshot coordinate obtaining subunit 90333, configured to obtain a screenshot coordinate from the first image according to the two diagonal coordinates, the length and the width of the feature matching region, and the length and the width of the first feature picture;
a matching region determining subunit 90334, configured to determine the feature matching region according to the screenshot coordinates, and the length and width of the feature matching region.
In some embodiments of the present invention, the plurality of feature pictures included in the feature database respectively correspond to screenshot pictures of the application program in different application scenes.
In some embodiments of the present invention, the frame clipping module 902 is specifically configured to clip the first image frame from the image data stream according to a preset clipping period.
In some embodiments of the present invention, referring to fig. 9-e, the apparatus 900 for acquiring an image frame, as compared to that shown in fig. 9-a, further includes:
a sending module 905, configured to, after the area screenshot module cuts the interface edge area where the first UI element is located from the first image frame to obtain a second image frame, send the second image frame to an information server, and the information server updates the second image frame to a cover image of the image data stream.
As can be seen from the above description of the embodiment of the present invention, an image data stream including multiple frames of image pictures is first obtained, a first image picture is captured from the image data stream, and a first user interface UI element is then identified from the first image picture, where the first UI element is a UI element that is located at the interface edge of the first image picture and whose position remains unchanged within a preset time period. Finally, the interface edge area where the first UI element is located is cut off from the first image picture to obtain a second image picture, which is the image area remaining after the interface edge area is cut off. Because the second image picture does not include the interface edge area where the first UI element is located, the image of the non-edge area of the interface is retained, and screenshot content more closely related to the image data stream can be displayed intuitively through the second image picture, improving the screenshot effect and quality and increasing the amount of effective information included in the cover content.
As shown in fig. 10, for convenience of description, only the parts related to the embodiment of the present invention are shown, and details of the specific technology are not disclosed, please refer to the method part of the embodiment of the present invention. The terminal may be any terminal device including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a POS (Point of sales), a vehicle-mounted computer, etc., taking the terminal as the mobile phone as an example:
fig. 10 is a block diagram showing a partial structure of a cellular phone related to a terminal provided by an embodiment of the present invention. Referring to fig. 10, the cellular phone includes: radio Frequency (RF) circuit 1010, memory 1020, input unit 1030, display unit 1040, sensor 1050, audio circuit 1060, wireless fidelity (WiFi) module 1070, processor 1080, and power source 1090. Those skilled in the art will appreciate that the handset configuration shown in fig. 10 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile phone in detail with reference to fig. 10:
RF circuit 1010 may be used for receiving and transmitting signals during information transmission and reception or during a call; in particular, after downlink information of a base station is received, it is forwarded to processor 1080 for processing, and uplink data is transmitted to the base station. In general, the RF circuit 1010 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 1010 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
The memory 1020 can be used for storing software programs and modules, and the processor 1080 executes various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 1020. The memory 1020 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 1020 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 1030 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the cellular phone. Specifically, the input unit 1030 may include a touch panel 1031 and other input devices 1032. The touch panel 1031, also referred to as a touch screen, may collect touch operations by a user (e.g., operations by a user on or near the touch panel 1031 using any suitable object or accessory such as a finger, a stylus, etc.) and drive corresponding connection devices according to a preset program. Alternatively, the touch panel 1031 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 1080, and can receive and execute commands sent by the processor 1080. In addition, the touch panel 1031 may be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 1030 may include other input devices 1032 in addition to the touch panel 1031. In particular, other input devices 1032 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a track ball, a mouse, a joystick, or the like.
The display unit 1040 may be used to display information input by a user or information provided to the user and various menus of the cellular phone. The Display unit 1040 may include a Display panel 1041, and optionally, the Display panel 1041 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 1031 can cover the display panel 1041, and when the touch panel 1031 detects a touch operation on or near the touch panel 1031, the touch operation is transmitted to the processor 1080 to determine the type of the touch event, and then the processor 1080 provides a corresponding visual output on the display panel 1041 according to the type of the touch event. Although in fig. 10, the touch panel 1031 and the display panel 1041 are two separate components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 1031 and the display panel 1041 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 1050, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1041 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1041 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The audio circuit 1060, speaker 1061, and microphone 1062 provide an audio interface between the user and the handset. The audio circuit 1060 can transmit the electrical signal converted from received audio data to the speaker 1061, where it is converted into a sound signal and output; conversely, the microphone 1062 converts a collected sound signal into an electrical signal, which the audio circuit 1060 receives and converts into audio data. The audio data is output to the processor 1080 for processing and then sent, for example, to another mobile phone via the RF circuit 1010, or output to the memory 1020 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 1070, the mobile phone can help the user send and receive e-mail, browse web pages, access streaming media, and so on, providing wireless broadband Internet access. Although fig. 10 shows the WiFi module 1070, it is understood that it is not an essential component of the handset and may be omitted as needed without changing the essence of the invention.
The processor 1080 is the control center of the mobile phone. It connects the various parts of the whole phone through various interfaces and lines, and performs the phone's functions and processes data by running or executing software programs and/or modules stored in the memory 1020 and calling data stored in the memory 1020, thereby monitoring the phone as a whole. Optionally, the processor 1080 may include one or more processing units. Preferably, the processor 1080 may integrate an application processor, which mainly handles the operating system, user interfaces, and applications, and a modem processor, which mainly handles wireless communication. It is to be appreciated that the modem processor need not be integrated into the processor 1080.
The handset also includes a power source 1090 (e.g., a battery) for powering the various components. Preferably, the power source is logically coupled to the processor 1080 via a power management system, which manages charging, discharging, and power consumption.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.
In the embodiment of the present invention, the processor 1080 included in the terminal also controls and executes the flow of the image picture acquisition method performed by the terminal, described above.
Fig. 11 is a schematic diagram of a server 1100 according to an embodiment of the present invention. The server 1100 may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 1122 (e.g., one or more processors), a memory 1132, and one or more storage media 1130 (e.g., one or more mass storage devices) storing applications 1142 or data 1144. The memory 1132 and the storage media 1130 may be transient or persistent storage. The program stored on a storage medium 1130 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 1122 may be configured to communicate with the storage medium 1130 and execute, on the server 1100, the series of instruction operations in the storage medium 1130.
The server 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input/output interfaces 1158, and/or one or more operating systems 1141, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
The steps of the image picture acquisition method performed by the server in the above embodiments may be based on the server structure shown in fig. 11.
It should be noted that the apparatus embodiments described above are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the present invention, the connection relationship between modules indicates a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement this without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software plus the necessary general-purpose hardware, or by special-purpose hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. Generally, any function performed by a computer program can be implemented by corresponding hardware, and the specific hardware structures used to implement the same function may vary, such as analog circuits, digital circuits, or dedicated circuits. For the present invention, however, a software implementation is the more preferable embodiment. Based on this understanding, the technical solutions of the present invention may be embodied in the form of a software product stored in a readable storage medium, such as a floppy disk, USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disk, including instructions that enable a computer device (which may be a personal computer, a server, or a network device) to execute the methods of the embodiments of the present invention.
In summary, the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art will understand that the technical solutions described in the above embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (16)

1. An image frame acquisition method, comprising:
acquiring an image data stream, wherein the image data stream comprises a plurality of frames of image pictures;
intercepting a first image picture from the image data stream, wherein the first image picture belongs to the image pictures of the plurality of frames;
identifying a first User Interface (UI) element from the first image picture, wherein the first UI element is a UI element which is positioned at the edge of an interface in the first image picture and the position of the element is kept unchanged within a preset time period;
and cutting off the interface edge area where the first user interface UI element is located from the first image picture to obtain a second image picture, wherein the second image picture is the image area left after the interface edge area is cut off from the first image picture.
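By way of a non-limiting illustration (not part of the claims), the overall flow of claim 1 — sample frames from the stream, find edge rows whose content stays unchanged across the sampled frames (the stationary UI element), and cut that edge region off to obtain the second picture — might be sketched as follows. The frame representation (a list of pixel rows) and the stability criterion are assumptions of this sketch, not taken from the patent.

```python
# Illustrative sketch of claim 1: detect top-edge rows that are
# identical in every sampled frame (a stationary edge UI element,
# e.g. a status bar) and return the image area left after cutting
# that edge region off.

def stable_edge_rows(frames):
    """Count top rows that are identical across all sampled frames."""
    first = frames[0]
    count = 0
    for r in range(len(first)):
        if all(f[r] == first[r] for f in frames[1:]):
            count += 1
        else:
            break
    return count

def crop_edge(frame, n_rows):
    """Return the image area left after cutting off the edge region."""
    return frame[n_rows:]

# Two sampled frames: the top row is constant between frames,
# while the content below changes.
f1 = [[9, 9, 9], [1, 2, 3], [4, 5, 6]]
f2 = [[9, 9, 9], [7, 8, 9], [0, 1, 2]]
n = stable_edge_rows([f1, f2])      # 1 stable edge row
second_picture = crop_edge(f1, n)   # [[1, 2, 3], [4, 5, 6]]
```

A real implementation would of course operate on decoded video frames and tolerate pixel noise; this sketch only illustrates the claimed structure of the steps.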
2. The method of claim 1, wherein the identifying a first User Interface (UI) element from the first image screen comprises:
acquiring a feature database, wherein the feature database comprises a plurality of feature pictures;
matching the first image picture with a plurality of characteristic pictures in the characteristic database respectively;
when a first feature picture in the feature database is matched with the first image picture, determining a feature matching area matched with the first feature picture from the first image picture, wherein the first feature picture belongs to the plurality of feature pictures;
and identifying the first user interface UI element from an image area which does not belong to the characteristic matching area in the first image picture.
3. The method according to claim 2, wherein the matching the first image frame with the plurality of feature pictures in the feature database respectively comprises:
extracting a first feature point from the first image picture, and extracting a second feature point from each feature picture in the feature database;
carrying out similarity matching on the first characteristic points and the second characteristic points to obtain similarity matching results;
and determining whether the first image picture and the characteristic picture are matched or not according to the similarity matching result.
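As a hypothetical sketch of claim 3 (again, illustrative rather than the patented implementation), feature points can be represented as coordinates plus a descriptor, matched by descriptor distance, and the two pictures declared matched when enough pairs fall under a threshold. The binary-descriptor format, the Hamming distance, and both thresholds are assumptions of this sketch.

```python
# Illustrative feature-point similarity matching in the spirit of
# claim 3: each point is (x, y, descriptor); points are paired by
# nearest descriptor, and the pictures "match" if enough good pairs
# are found.

def hamming(a, b):
    """Hamming distance between two equal-length binary descriptors."""
    return sum(x != y for x, y in zip(a, b))

def match_points(pts_a, pts_b, max_dist=1, min_matches=2):
    """Pair each point in pts_a with its nearest point in pts_b."""
    matches = []
    for xa, ya, da in pts_a:
        best = min(pts_b, key=lambda p: hamming(da, p[2]))
        if hamming(da, best[2]) <= max_dist:
            matches.append(((xa, ya), (best[0], best[1])))
    return matches, len(matches) >= min_matches

# First feature points (from the first image picture) and second
# feature points (from one feature picture in the database).
first_pts = [(10, 10, (0, 1, 1, 0)), (90, 40, (1, 1, 0, 0))]
feat_pts  = [(2, 2, (0, 1, 1, 0)), (30, 12, (1, 1, 0, 1))]
pairs, matched = match_points(first_pts, feat_pts)
```

In practice a feature extractor such as ORB or SIFT would supply the points and descriptors; the claim itself does not fix a particular extractor.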
4. The method according to claim 3, wherein the determining a feature matching region from the first image picture matching the first feature picture comprises:
determining two diagonal coordinates matched with the second feature point from a plurality of first feature points of the first image picture;
acquiring the length and the width of the feature matching area according to the pixel difference between the two diagonal coordinates;
acquiring screenshot coordinates from the first image picture according to the two diagonal coordinates, the length and the width of the feature matching area and the length and the width of the first feature picture;
and determining the feature matching area according to the screenshot coordinate and the length and width of the feature matching area.
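The geometry of claim 4 — derive the matching area's length and width from the pixel difference between two diagonal coordinates, then a screenshot (top-left) coordinate — reduces to simple arithmetic, sketched below. The variable names are illustrative, and the additional adjustment by the first feature picture's own length and width mentioned in the claim is omitted here for simplicity.

```python
# Illustrative arithmetic for claim 4: two diagonal coordinates of
# matched feature points define the feature matching area.

def feature_match_region(p_a, p_b):
    """Return (screenshot coordinate, length, width) from two
    diagonal coordinates in the first image picture."""
    x0, y0 = p_a
    x1, y1 = p_b
    width = abs(x1 - x0)              # pixel difference along x
    length = abs(y1 - y0)             # pixel difference along y
    screenshot_xy = (min(x0, x1), min(y0, y1))  # top-left corner
    return screenshot_xy, length, width

origin, length, width = feature_match_region((40, 20), (120, 80))
# origin == (40, 20), length == 60, width == 80
```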
5. The method according to claim 2, wherein the plurality of feature pictures in the feature database respectively correspond to screenshot pictures of the application program in different application scenes.
6. The method of any of claims 1 to 5, wherein said truncating a first image picture from the image data stream comprises:
and periodically intercepting a first image picture from the image data stream according to a preset interception period.
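A minimal sketch of claim 6's timed interception: take one frame from the stream per preset period. Here the "stream" is modeled as an iterable of frames and the period is expressed in frames; a wall-clock timer would be an equally valid reading of the claim.

```python
# Illustrative periodic interception: yield every period-th frame
# of the image data stream.

def intercept(stream, period):
    for i, frame in enumerate(stream):
        if i % period == 0:
            yield frame

frames = [f"frame{i}" for i in range(10)]
captured = list(intercept(frames, 3))  # frames 0, 3, 6, 9
```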
7. The method according to any one of claims 1 to 5, wherein after the interface edge region where the first UI element is located is cut off from the first image frame to obtain a second image frame, the method further comprises:
and sending the second image picture to an information server, wherein the information server updates the second image picture as a cover image of the image data stream.
8. An apparatus for acquiring an image frame, comprising:
the data flow acquisition module is used for acquiring an image data flow, and the image data flow comprises a plurality of frames of image pictures;
the picture intercepting module is used for intercepting a first image picture from the image data stream, wherein the first image picture belongs to the image pictures of the plurality of frames;
the picture identification module is used for identifying a first user interface UI element from the first image picture, wherein the first user interface UI element is a UI element which is positioned at the edge of an interface in the first image picture and the element position of which is kept unchanged within a preset time period;
and the area screenshot module is used for cutting off the interface edge area where the first user interface UI element is located from the first image picture to obtain a second image picture, wherein the second image picture is the image area left after the interface edge area is cut off from the first image picture.
9. The apparatus of claim 8, wherein the picture recognition module comprises:
the device comprises a feature picture loading unit, a feature database and a feature image processing unit, wherein the feature picture loading unit is used for acquiring a feature database which comprises a plurality of feature pictures;
the image matching unit is used for respectively matching the first image picture with a plurality of feature pictures in the feature database;
the region matching unit is used for determining a feature matching region matched with a first feature picture from the first image picture when the first feature picture in the feature database is matched with the first image picture, wherein the first feature picture belongs to the plurality of feature pictures;
an element identification unit configured to identify the first user interface UI element from an image area in the first image screen that does not belong to the feature matching area.
10. The apparatus of claim 9, wherein the image matching unit comprises:
the characteristic extraction subunit is used for extracting a first characteristic point from the first image picture and extracting a second characteristic point from each characteristic picture in the characteristic database;
a similarity calculation subunit, configured to perform similarity matching on the first feature point and the second feature point to obtain a similarity matching result;
and the determining subunit is used for determining whether the first image picture and the feature picture are matched or not according to the similarity matching result.
11. The apparatus of claim 10, wherein the region matching unit comprises:
a diagonal coordinate determination subunit configured to determine two diagonal coordinates that match the second feature point from among the plurality of first feature points of the first image screen;
the length and width calculation subunit is used for acquiring the length and the width of the feature matching area according to the pixel difference between the two diagonal coordinates;
a screenshot coordinate obtaining subunit, configured to obtain a screenshot coordinate from the first image frame according to the two diagonal coordinates, the length and the width of the feature matching area, and the length and the width of the first feature picture;
and the matching area determining subunit is used for determining the feature matching area according to the screenshot coordinate and the length and width of the feature matching area.
12. The apparatus according to claim 9, wherein the plurality of feature pictures in the feature database respectively correspond to screenshot pictures of the application program in different application scenes.
13. The apparatus according to any of claims 8 to 12, wherein the picture intercepting module is specifically configured to periodically intercept a first image picture from the image data stream according to a preset interception period.
14. The apparatus according to any one of claims 8 to 12, wherein the apparatus for acquiring the image frame further comprises:
and the sending module is used for sending the second image picture to an information server after the area screenshot module cuts off the interface edge area where the first user interface UI element is located from the first image picture to obtain the second image picture, wherein the information server updates the second image picture as a cover image of the image data stream.
15. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1 to 7.
16. An apparatus for acquiring an image frame, comprising: a processor and a memory;
the memory to store instructions;
the processor, configured to execute the instructions in the memory, to perform the method of any of claims 1 to 7.
CN201810654362.0A 2018-06-22 2018-06-22 Image picture acquisition method and device and computer readable storage medium Active CN108845742B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810654362.0A CN108845742B (en) 2018-06-22 2018-06-22 Image picture acquisition method and device and computer readable storage medium


Publications (2)

Publication Number Publication Date
CN108845742A CN108845742A (en) 2018-11-20
CN108845742B (en) 2020-05-05

Family

ID=64203514

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810654362.0A Active CN108845742B (en) 2018-06-22 2018-06-22 Image picture acquisition method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN108845742B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522085B (en) * 2018-11-28 2022-02-01 珠海金山网络游戏科技有限公司 Data processing method and device, computing equipment and storage medium
CN110782459B (en) * 2019-01-08 2021-02-19 北京嘀嘀无限科技发展有限公司 Image processing method and device
CN110677734B (en) * 2019-09-30 2023-03-10 北京达佳互联信息技术有限公司 Video synthesis method and device, electronic equipment and storage medium
CN111163138B (en) * 2019-12-18 2022-04-12 北京智明星通科技股份有限公司 Method, device and server for reducing network load during game
CN111491209A (en) * 2020-04-08 2020-08-04 咪咕文化科技有限公司 Video cover determining method and device, electronic equipment and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN107657583A (en) * 2017-08-29 2018-02-02 努比亚技术有限公司 A kind of screenshot method, terminal and computer-readable recording medium
CN107967339A (en) * 2017-12-06 2018-04-27 广东欧珀移动通信有限公司 Image processing method, device, computer-readable recording medium and computer equipment


Also Published As

Publication number Publication date
CN108845742A (en) 2018-11-20

Similar Documents

Publication Publication Date Title
CN108845742B (en) Image picture acquisition method and device and computer readable storage medium
CN109819179B (en) Video editing method and device
CN109977859B (en) Icon identification method and related device
US11178450B2 (en) Image processing method and apparatus in video live streaming process, and storage medium
CN111131884B (en) Video clipping method, related device, equipment and storage medium
US20200042148A1 (en) Screen capturing method and terminal, and screenshot reading method and terminal
CN106131627B (en) A kind of method for processing video frequency, apparatus and system
CN107729815B (en) Image processing method, image processing device, mobile terminal and computer readable storage medium
CN110662090B (en) Video processing method and system
CN105141496A (en) Instant communication message playback method and device
CN108710458B (en) Split screen control method and terminal equipment
CN109756767B (en) Preview data playing method, device and storage medium
CN107346200B (en) Interval screenshot method and terminal
CN107562539B (en) Application program processing method and device, computer equipment and storage medium
CN110166795B (en) Video screenshot method and device
CN109753202B (en) Screen capturing method and mobile terminal
CN106851345B (en) Information pushing method, system and server
CN111212316B (en) Video generation method and electronic equipment
CN110784672B (en) Video data transmission method, device, equipment and storage medium
CN110347858B (en) Picture generation method and related device
CN113453066A (en) Playing time determining method and related equipment
CN113421211B (en) Method for blurring light spots, terminal equipment and storage medium
CN108121583B (en) Screen capturing method and related product
CN108540649B (en) Content display method and mobile terminal
CN110223221B (en) Dynamic image playing method and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant