CN115599950A - Image-based multimedia file display method, device, equipment and storage medium

Image-based multimedia file display method, device, equipment and storage medium

Info

Publication number
CN115599950A
CN115599950A (application CN202110723291.7A)
Authority
CN
China
Prior art keywords
image
multimedia file
information
determining
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110723291.7A
Other languages
Chinese (zh)
Inventor
徐玉伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Anyun Century Technology Co Ltd
Original Assignee
Beijing Anyun Century Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Anyun Century Technology Co Ltd filed Critical Beijing Anyun Century Technology Co Ltd
Priority to CN202110723291.7A priority Critical patent/CN115599950A/en
Publication of CN115599950A publication Critical patent/CN115599950A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70: Information retrieval of video data
    • G06F 16/74: Browsing; Visualisation therefor
    • G06F 16/75: Clustering; Classification
    • G06F 16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/783: Retrieval using metadata automatically derived from the content
    • G06F 16/7834: Retrieval using audio features
    • G06F 16/7837: Retrieval using objects detected or recognised in the video content
    • G06F 16/7847: Retrieval using low-level visual features of the video content

Abstract

The invention discloses an image-based multimedia file display method, device, equipment and storage medium, wherein the method comprises the following steps: determining image category information according to a target image; extracting features of the target image according to the image category information to obtain image feature information; determining corresponding context information according to the image feature information; determining the emotional portrait of the current user according to the context information; and searching for a corresponding multimedia file to be displayed based on the emotional portrait, and displaying it. Compared with the prior art, in which an image is merely stored and not further processed, the method determines the corresponding context information according to the image feature information of the target image, then determines the emotional portrait of the current user according to the context information, and finally recommends a corresponding multimedia file according to that emotional portrait, so that the user experience is improved while the user's memories of the image are evoked, and the usage stickiness of the terminal device is increased.

Description

Image-based multimedia file display method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of image processing, and in particular to an image-based multimedia file display method, device, equipment and storage medium.
Background
In daily life, users capture many images. In the prior art, captured images are merely stored so that the user can browse or search them conveniently; the images are not further processed, so when a user later views an image, they may no longer recall the scene in which it was captured. How to evoke the user's memories of an image while increasing the usage stickiness of the terminal device is therefore a technical problem to be solved urgently.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide an image-based multimedia file display method, device, equipment and storage medium, so as to solve the technical problem of increasing the usage stickiness of the terminal device while prompting the user to recall the image.
In order to achieve the above object, the present invention provides a method for displaying a multimedia file based on an image, the method for displaying a multimedia file based on an image comprising the steps of:
determining image category information according to the target image;
extracting the features of the target image according to the image category information to obtain image feature information;
determining corresponding context information according to the image feature information;
determining the emotional portrait of the current user according to the context information;
and searching a corresponding multimedia file to be displayed based on the emotional portrait, and displaying the multimedia file to be displayed.
Optionally, before the step of determining the image category information according to the target image, the method further includes:
determining the display duration of the image to be processed in the preset display area;
judging whether the display duration is greater than a preset display threshold or not;
and when the display duration is greater than the preset display threshold, taking the image to be processed as a target image.
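As a rough illustration (not part of the patent text), this duration-based selection can be sketched in Python; the helper name `select_target_image` and the 40-second default threshold are assumptions:

```python
def select_target_image(image, display_duration_s, display_threshold_s=40):
    """Return the image as the target image only when its display
    duration exceeds the preset display threshold; otherwise None."""
    if display_duration_s > display_threshold_s:
        return image
    return None

# An image viewed for 55 s exceeds the default 40 s threshold.
print(select_target_image("IMG_0001.jpg", 55))   # IMG_0001.jpg
print(select_target_image("IMG_0002.jpg", 10))   # None
```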
Optionally, before the step of determining the image category information according to the target image, the method further includes:
determining the display duration of a video to be processed in a preset display area;
when the display duration is greater than a preset display threshold, extracting a plurality of frames of images from the video to be processed;
and determining a target image according to the multi-frame image.
Optionally, the step of extracting multiple frames of images from the video to be processed includes:
when the sliding operation behavior of a current user is received, determining a preset extraction rule according to the sliding operation behavior;
and processing the video to be processed according to the preset extraction rule to obtain a plurality of frames of images.
Optionally, the step of determining image category information according to the target image includes:
carrying out gray level processing on the target image to obtain a target gray level image;
determining image subject information of the target gray image;
and determining image category information according to the image subject information.
Optionally, the step of determining the image subject information of the target grayscale image includes:
determining position information of a plurality of objects in the target gray level image;
determining corresponding object area ratios according to the position information of the plurality of objects;
and determining image subject information according to the object area ratio.
Optionally, the step of extracting features of the target image according to the image category information to obtain image feature information includes:
acquiring pixel point information of the target image;
dividing the target image according to the image category information and the pixel point information to obtain a plurality of image areas;
and extracting the features of the target image according to the image areas to obtain image feature information.
Optionally, the step of determining corresponding context information according to the image feature information includes:
analyzing the image characteristic information to obtain a plurality of image labels;
and splicing the plurality of image labels to obtain the context information of the target image.
Optionally, the step of determining the emotional portrait of the current user according to the context information includes:
judging whether an audio signal exists in the video to be processed;
when audio signals exist in the video to be processed, extracting audio information according to the audio signals;
converting the audio information into character information;
analyzing the character information to obtain audio keywords;
and determining the emotional portrait of the current user according to the context information and the audio key words.
Optionally, the step of searching for a corresponding multimedia file to be displayed based on the emotion portrait and displaying the multimedia file to be displayed includes:
determining a corresponding multimedia file type according to the emotional portrait;
and determining a multimedia file to be displayed according to the multimedia file type, and displaying the multimedia file to be displayed.
Optionally, the step of determining the multimedia file to be displayed according to the multimedia file type includes:
matching a plurality of corresponding multimedia files from a preset multimedia file mapping relation table according to the multimedia file types, wherein the preset multimedia file mapping relation table comprises a plurality of multimedia file types and a plurality of multimedia files;
respectively obtaining the work scores corresponding to the multimedia files;
and selecting the multimedia files to be displayed from the plurality of multimedia files according to the work scores.
Optionally, the step of selecting a multimedia file to be displayed from the plurality of multimedia files according to the composition score includes:
judging whether the score of the work is larger than a preset threshold value or not;
and when the composition score is larger than the preset threshold value, selecting the multimedia file corresponding to the composition score from the plurality of multimedia files, and taking the multimedia file as the multimedia file to be displayed.
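As a rough illustration (not part of the patent text), this score-threshold selection can be sketched in Python; the function name, the score scale, and the threshold value of 8.0 are assumptions:

```python
def pick_files_to_display(candidates, score_threshold=8.0):
    """candidates: (file_name, work_score) pairs matched from the preset
    multimedia-file mapping table. Keep only files whose work score is
    greater than the preset threshold."""
    return [name for name, score in candidates if score > score_threshold]

files = [("song_a.mp3", 9.1), ("song_b.mp3", 7.4), ("clip_c.mp4", 8.6)]
print(pick_files_to_display(files))  # ['song_a.mp3', 'clip_c.mp4']
```

Files at or below the threshold are simply dropped from the candidate list before display.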
In addition, to achieve the above object, the present invention further provides an image-based multimedia file presentation apparatus, including:
the determining module is used for determining image category information according to the target image;
the extraction module is used for extracting the features of the target image according to the image category information to obtain image feature information;
the determining module is further configured to determine corresponding context information according to the image feature information;
the determining module is further used for determining the emotional portrait of the current user according to the context information;
and the display module is used for searching the corresponding multimedia file to be displayed based on the emotional portrait and displaying the multimedia file to be displayed.
Optionally, the determining module is further configured to determine a display duration of the to-be-processed image in the preset display area;
the determining module is further configured to determine whether the display duration is greater than a preset display threshold;
the determining module is further configured to use the image to be processed as a target image when the display duration is greater than the preset display threshold.
Optionally, the determining module is further configured to determine a display duration of the video to be processed in the preset display area;
the determining module is further configured to extract a plurality of frames of images from the video to be processed when the display duration is greater than a preset display threshold;
the determining module is further configured to determine a target image according to the multi-frame image.
Optionally, the determining module is further configured to determine a preset extraction rule according to the sliding operation behavior when the sliding operation behavior of the current user is received;
the determining module is further configured to process the video to be processed according to the preset extraction rule, so as to obtain a plurality of frames of images.
Optionally, the determining module is further configured to perform gray processing on the target image to obtain a target gray image;
the determining module is further configured to determine image subject information of the target grayscale image;
the determining module is further configured to determine image category information according to the image subject information.
Optionally, the determining module is further configured to determine position information of a plurality of objects in the target grayscale image;
the determining module is further configured to determine corresponding object area ratios according to the plurality of object position information;
the determining module is further used for determining image subject information according to the object area ratio.
In addition, in order to achieve the above object, the present invention further provides an image-based multimedia file display apparatus, including: a memory, a processor, and an image-based multimedia file presentation program stored on the memory and executable on the processor, the image-based multimedia file presentation program being configured to implement the steps of the image-based multimedia file presentation method described above.
In addition, to achieve the above object, the present invention further provides a storage medium, in which an image-based multimedia file presentation program is stored, and the image-based multimedia file presentation program implements the steps of the image-based multimedia file presentation method as described above when executed by a processor.
The method comprises the steps of firstly determining image category information according to a target image, extracting features of the target image according to the image category information to obtain image feature information, then determining corresponding context information according to the image feature information, then determining the emotional portrait of the current user according to the context information, and finally searching for a corresponding multimedia file to be displayed based on the emotional portrait and displaying it. Compared with the prior art, in which an image is merely stored and not further processed, the method determines the corresponding context information according to the image feature information of the target image, then determines the emotional portrait of the current user according to the context information, and finally recommends a corresponding multimedia file according to that emotional portrait, so that the user experience is improved while the user's memories of the image are evoked, and the usage stickiness of the terminal device is increased.
Drawings
FIG. 1 is a schematic diagram of an image-based multimedia file presentation apparatus in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a method for displaying a multimedia file based on an image according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a method for displaying a multimedia file based on an image according to the present invention;
FIG. 4 is a flowchart illustrating a method for displaying a multimedia file based on an image according to a third embodiment of the present invention;
FIG. 5 is a block diagram of a first embodiment of an image-based multimedia file presentation apparatus according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an image-based multimedia file presentation apparatus in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the image-based multimedia file presentation apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the image-based multimedia file presentation apparatus and may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.
As shown in fig. 1, the memory 1005, which is a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and an image-based multimedia file presentation program.
In the image-based multimedia file presentation apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The apparatus calls the image-based multimedia file presentation program stored in the memory 1005 through the processor 1001 and executes the image-based multimedia file presentation method according to the embodiment of the present invention.
An embodiment of the present invention provides a method for displaying a multimedia file based on an image, and referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the method for displaying a multimedia file based on an image according to the present invention.
In this embodiment, the image-based multimedia file display method includes the following steps:
step S10: and determining image category information according to the target image.
It is easy to understand that the execution subject of this embodiment may be a communication device having functions of image processing, data processing, network communication, program operation, and the like, the device may also be capable of displaying multimedia files, and may also be other computer devices having similar functions, and the present embodiment is not limited thereto.
The target image can be understood as an image in the album of the terminal device, captured by the user and needing processing, or as an image stitched from multiple frames extracted from such a captured video in the album.
In order to obtain an accurate target image from a plurality of images, before the step of determining image category information according to the target image, the method further comprises the following steps: determining the display duration of the image to be processed in the preset display area; judging whether the display duration is greater than a preset display threshold; and when the display duration is greater than the preset display threshold, taking the image to be processed as the target image.
The preset display area is an interface area for displaying an image; the image to be processed is a single image being browsed or viewed by the current user; the display duration is the length of time the image is displayed in the interface area; and the preset display threshold can be customized by the user, for example 40 s or 1 min, which this embodiment does not limit.
In order to obtain an accurate target image from a plurality of videos, before the step of determining image category information according to the target image, the method further includes: determining the display duration of a video to be processed in a preset display area; when the display duration is greater than a preset display threshold, extracting a plurality of frames of images from the video to be processed; and determining the target image according to the multi-frame images.
The video to be processed is a single video browsed or checked by a current user, and multiple frames of images exist in the video.
For example, the time length of the video B browsed by the current user displayed in the interface area is 10s, the preset display threshold value is 8s, the display time length of the video B is greater than the preset display threshold value, multi-frame images are extracted from the video B, and the target image is determined according to the multi-frame images.
The method for extracting the multi-frame image from the video to be processed comprises the following steps: when the sliding operation behavior of the current user is received, determining a preset extraction rule according to the sliding operation behavior; and processing the video to be processed according to a preset extraction rule to obtain a multi-frame image.
The sliding operation behavior may be understood as the user sliding the progress bar while watching a video. The preset extraction rule is to extract a preset number of frames starting from the point to which the user slid the progress bar; the preset frame number can be customized by the user, for example 3 or 5 frames.
For example, the total duration of the video B to be processed is 20s, and when the user slides the progress bar to 10s while watching the video, an image with a preset frame number is extracted from 10s of the video B to be processed, and the target image is determined according to the image with the preset frame number.
In this embodiment, when no sliding operation behavior of the current user is received, images of the preset frame number are acquired from the start point, the middle position, and the end point of the video to be processed, and the target image is then determined according to the acquired images.
The processing mode for determining the target image according to the multi-frame images may be stitching the multi-frame images to obtain the target image.
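The two extraction paths described above (slider-based and default start/middle/end) together with stitching-by-concatenation might be sketched as follows (an illustrative simplification, not the patent's implementation; real stitching would compose the extracted frames into one image):

```python
def extract_frames(video_frames, slider_end=None, n=3):
    """Extract n frames (the preset frame number) either from the point
    the user slid the progress bar to, or, absent a slide, from the
    start, middle, and end of the video to be processed."""
    if slider_end is not None:
        return video_frames[slider_end:slider_end + n]
    mid = len(video_frames) // 2
    return video_frames[:n] + video_frames[mid:mid + n] + video_frames[-n:]

frames = list(range(20))                       # stand-in for decoded frames
print(extract_frames(frames, slider_end=10))   # [10, 11, 12]
print(extract_frames(frames))                  # [0, 1, 2, 10, 11, 12, 17, 18, 19]
```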
In order to accurately obtain the image category information of the target image, the step of determining the image category information according to the target image comprises the following steps: carrying out gray-level processing on the target image to obtain a target grayscale image; determining image subject information of the target grayscale image; and determining image category information according to the image subject information.
It should be understood that a gray scale image is an image with only one sample color per pixel. Such images are typically displayed in gray scale from the darkest black to the brightest white. The image subject information is a main component of the image, and the image category information includes a human scene category, a scenery category and a character category.
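For illustration, a single-pixel grey-level conversion consistent with this description could look like the following sketch (the ITU-R BT.601 luma weights are an assumed choice; the patent does not name a specific formula):

```python
def to_grayscale(pixel_rgb):
    """Reduce one RGB pixel to a single grey sample using the common
    ITU-R BT.601 luma weights (an assumption for illustration)."""
    r, g, b = pixel_rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(to_grayscale((255, 255, 255)))  # 255, the brightest white
print(to_grayscale((0, 0, 0)))        # 0, the darkest black
```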
The step of determining the image subject information of the target gray image includes: determining position information of a plurality of objects in the target gray level image; determining corresponding object area occupation ratios according to the position information of the plurality of objects; and determining image subject information according to the area ratio of the object.
The object position information is the position of the object in the image, for example the centre, lower left, or upper right; the object area ratio is the ratio of the object's area to the image area, for example one third or one fourth.
For example, suppose the image contains three objects: sky, building, and person, and the image is divided into ten equal parts. The sky occupies two tenths of the area, the building four tenths, and the person four tenths, while the preset area-ratio threshold is one tenth. Since the sky, building, and person area ratios are all greater than the preset threshold, all three objects form the image subject, and the corresponding image category information is the scene category.
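The area-ratio subject determination in this example can be sketched as follows (the helper name is hypothetical; the one-tenth threshold follows the example above):

```python
def image_subjects(area_ratios, ratio_threshold=0.1):
    """area_ratios: mapping of detected object -> fraction of the image
    area it occupies. Objects whose ratio exceeds the preset area-ratio
    threshold together form the image subject."""
    return [obj for obj, ratio in area_ratios.items() if ratio > ratio_threshold]

ratios = {"sky": 0.2, "building": 0.4, "person": 0.4}
print(image_subjects(ratios))  # ['sky', 'building', 'person']
```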
Step S20: and extracting the features of the target image according to the image category information to obtain image feature information.
The method comprises the steps of obtaining pixel point information of a target image, dividing the target image according to image category information and the pixel point information to obtain a plurality of image areas, and extracting features of the target image according to the image areas to obtain image feature information.
For example, suppose the image category information is the scene category and a person, a building, and sky exist in the target image. Pixel-point information corresponding to the person, the building, and the sky is obtained; the target image is then divided according to this pixel-point information to obtain a person region, a building region, and a sky region. Finally, person feature information is extracted from the person region, building feature information from the building region, and sky feature information from the sky region, where the person feature information includes facial feature information, the building feature information includes building shape and colour information, and the sky feature information includes sky colour information.
Step S30: and determining corresponding situation information according to the image characteristic information.
For example, suppose the image feature information includes person feature information, building feature information, and sky feature information, where the person feature information includes facial feature information, the building feature information includes building shape and colour information, and the sky feature information includes sky colour information. The person's expression information can then be determined from the facial feature information, the building's name and location from the building shape and colour information, and the weather condition from the sky colour information. Finally, the context information corresponding to the target image is generated from the expression information, the building name and location, and the weather condition.
The step of determining corresponding context information according to the image feature information comprises the following steps: analyzing the image characteristic information to obtain a plurality of image labels; and splicing the plurality of image tags to obtain the situation information of the target image.
For example, the character feature information is character facial feature information, the character facial feature information is analyzed, character facial expression information is obtained, and if the character facial expression information is surprised, the image tag shows a surprised expression for the visitor; for example, if the building characteristic information is building shape information and building color information, and the building shape information and the building color information are analyzed to obtain a building name and location information, the image tag indicates that a certain building exists at a certain location, and the building name is a certain building; for example, the sky feature information is sky color information, the sky color information is analyzed to obtain weather condition information, and the image tag indicates that the weather condition is a cloudy day or a sunny day.
For example, if the image tags indicate that the visitor shows a surprised expression, that building C exists in the center of city A, and that the weather condition is cloudy, and the shooting time of the target image is 18:00 on June 11, 2019, then the context information of the target image is: at 18:00 on June 11, 2019, the weather condition was cloudy, and the visitor showed a surprised expression in front of building C in the center of city A.
Step S40: and determining the emotional portrait of the current user according to the context information.
The emotion portrait of the user may be an expression-state image of the user. Psychological state information of the user can be determined according to the context information, and the emotion portrait of the current user can then be determined according to the psychological state information; the emotion portrait may be a frightened portrait of the user, a cheerful portrait of the user, or the like.
The step of determining the emotional portrait of the current user according to the context information comprises the following steps: judging whether audio signals exist in the video to be processed or not; when audio signals exist in the video to be processed, extracting audio information according to the audio signals; converting the audio information into text information; analyzing the character information to obtain audio keywords; and determining the emotional portrait of the current user according to the context information and the audio key words.
For example, if an audio signal exists in video C to be processed, audio information is extracted according to the audio signal. If the audio information is "I am very afraid right now", the text information is "I am very afraid right now"; this text is analyzed to obtain the audio keyword "afraid". If the context information is "at 13:00 in August 2020, the weather condition was rainy, and the visitor at the door of mall B showed a frightened expression", then the emotion portrait of the current user is determined to be a frightened portrait according to the context information and the audio keyword "afraid".
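A minimal sketch of combining the audio keyword with the context information is shown below. The keyword table, function names, and the rule that the audio keyword takes precedence are assumptions for illustration only:

```python
# Hypothetical table mapping spoken keywords to emotion portraits.
EMOTION_KEYWORDS = {"afraid": "frightened", "happy": "cheerful", "surprised": "surprised"}

def extract_keyword(text):
    """Return the emotion portrait for the first keyword found in the speech text."""
    for word, emotion in EMOTION_KEYWORDS.items():
        if word in text.lower():
            return emotion
    return None

def emotion_portrait(context, audio_text=None):
    """Combine context info with the audio keyword; the audio keyword wins if present."""
    if audio_text:
        kw = extract_keyword(audio_text)
        if kw:
            return kw
    # Fall back to the expression mentioned in the context string itself.
    return next((e for e in EMOTION_KEYWORDS.values() if e in context), "neutral")

print(emotion_portrait("the visitor showed a frightened expression",
                       "I am very afraid right now"))  # frightened
```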
Step S50: and searching a corresponding multimedia file to be displayed based on the emotional portrait, and displaying the multimedia file to be displayed.
The multimedia file to be displayed may be a plurality of multimedia files or a single multimedia file. The multimedia files may be movie works, music, scenic-spot promotional films, and the like. Movies may be classified as documentary, war, history, biography, sports, science fiction, magic, fantasy, art, music, song-and-dance, animation, western, martial-arts, costume, action, romance, drama, comedy, family, ethics, horror, thriller, adventure, crime, and suspense films, among others.
For example, if the emotion portrait is a frightened portrait, the shooting time of the target image is acquired, a theatrical release time interval is determined according to the shooting time, a plurality of multimedia files released within that interval are acquired, corresponding thriller or horror films are searched from the plurality of multimedia files according to the frightened portrait, and the thriller or horror films are pushed to the user.
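The example above can be sketched as filtering a catalogue by genre and release date near the capture date. The catalogue entries, the portrait-to-genre mapping, and the 60-day window are invented for illustration:

```python
from datetime import date

# Hypothetical catalogue entries: (title, genre, release_date, work_score)
CATALOGUE = [
    ("Film A", "thriller", date(2019, 5, 20), 8.7),
    ("Film B", "comedy",   date(2019, 6, 1),  9.1),
    ("Film C", "horror",   date(2019, 6, 10), 9.3),
]

def recommend(portrait, shot_date, window_days=60):
    """Pick films whose genre matches the portrait and whose release date
    falls within window_days of the image capture date."""
    wanted = {"frightened": {"thriller", "horror"}}.get(portrait, set())
    return [title for title, genre, released, _ in CATALOGUE
            if genre in wanted and abs((shot_date - released).days) <= window_days]

print(recommend("frightened", date(2019, 6, 11)))  # ['Film A', 'Film C']
```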
The method comprises the steps of searching corresponding multimedia files to be displayed based on emotional portraits and displaying the multimedia files to be displayed, and comprises the following steps: determining a corresponding multimedia file type according to the emotional portrait; and determining the multimedia file to be displayed according to the type of the multimedia file, and displaying the multimedia file to be displayed.
In this embodiment, image category information is first determined according to a target image, and feature extraction is performed on the target image according to the image category information to obtain image feature information; corresponding context information is then determined according to the image feature information, and an emotion portrait of the current user is determined according to the context information; finally, a corresponding multimedia file to be displayed is searched based on the emotion portrait and is displayed. Compared with the prior art, in which the image is merely stored and not processed, this method determines the corresponding context information according to the image feature information of the target image, determines the emotion portrait of the current user according to the context information, and finally recommends the corresponding multimedia file according to the emotion portrait, so that the user is reminded of the image while the user experience is improved and the usage stickiness of the terminal device is increased.
Referring to fig. 3, fig. 3 is a flowchart illustrating a method for displaying a multimedia file based on an image according to a second embodiment of the present invention.
Based on the first embodiment, in this embodiment, before the step S10, the method further includes:
step S01: and determining the display duration of the image to be processed in the preset display area.
The preset display area is an interface area for displaying images; the image to be processed is a single image browsed or viewed by the current user; and the display duration is the length of time the image is displayed in the interface area, for example 3s or 2s.
Step S02: and judging whether the display duration is greater than a preset display threshold value.
The preset display threshold may be set by a user in a customized manner, and may be 40s, or 1min, and the embodiment is not limited.
Step S03: and when the display duration is greater than the preset display threshold, taking the image to be processed as a target image.
In a specific implementation, for example, when the duration for which image A browsed by the current user is displayed in the interface area is 41s and the preset display threshold is 40s, the display duration of image A is greater than the preset display threshold, and image A is taken as the target image.
It should be further understood that when the display duration of the image to be processed in the preset display area is greater than the preset display threshold, the user's viewing operation behavior for the image needs to be acquired, so as to avoid the case where the user merely leaves the image displayed in the interface area without actually viewing it. If the viewing operation behavior satisfies a preset viewing condition, the image to be processed is taken as the target image; the preset viewing condition is, for example, that the user zooms the image in or out.
For example, if the duration for which image A browsed by the current user is displayed in the interface area is 41s and the preset display threshold is 40s, the display duration of image A is greater than the preset display threshold; the user's viewing operation behavior is then acquired, and if no viewing operation behavior is detected, image A is not taken as the target image. If the display duration of image B is greater than the preset display threshold and a viewing operation behavior of the user is detected, image B is taken as the target image.
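The selection rule above can be sketched as a single check combining the display duration with the viewing behavior. The function name and parameters are illustrative assumptions:

```python
def select_target_image(image, display_seconds, viewed, threshold=40):
    """Treat the image as the target only if it was displayed longer than the
    preset threshold AND the user actually interacted with it (zoomed in/out)."""
    if display_seconds > threshold and viewed:
        return image
    return None

# Image A was shown long enough but never touched; image B was shown and zoomed.
print(select_target_image("image_A", 41, viewed=False))  # None
print(select_target_image("image_B", 41, viewed=True))   # image_B
```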
In this embodiment, the display duration of the image to be processed in the preset display area is first determined; whether the display duration is greater than a preset display threshold is then judged; and when the display duration is greater than the preset display threshold, the image to be processed is taken as the target image. Compared with the prior art, in which the image being viewed is directly taken as the target image, this embodiment acquires the display duration of the image to be processed and determines the target image according to that duration, so that the target image is acquired accurately and the user experience is improved.
Referring to fig. 4, fig. 4 is a flowchart illustrating a method for displaying a multimedia file based on an image according to a third embodiment of the present invention.
Based on the first embodiment, in this embodiment, the step S50 further includes:
step S501: and determining the corresponding multimedia file type according to the emotion portrait.
The processing mode for determining the corresponding multimedia file type according to the emotion portraits can be that emotion information is determined according to the emotion portraits, and then the corresponding multimedia file type is searched from a preset type mapping relation table according to the emotion information, wherein the preset type mapping relation table comprises a plurality of emotion information and a plurality of multimedia file types.
For example, if the emotion information is happy, the multimedia file type corresponding to the emotion information is a comedy type; if the emotion information is frightened, the multimedia file type corresponding to the emotion information is a thriller type, and so on.
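The preset type mapping relation table can be sketched as a simple lookup. The table contents and function name below are assumptions for illustration:

```python
# Hypothetical preset type mapping table from emotion information to file type.
TYPE_MAP = {"happy": "comedy", "frightened": "thriller", "sad": "drama"}

def file_type_for(emotion, table=TYPE_MAP):
    """Look up the multimedia file type for the given emotion information."""
    return table.get(emotion)

print(file_type_for("frightened"))  # thriller
```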
Step S502: and determining a multimedia file to be displayed according to the multimedia file type, and displaying the multimedia file to be displayed.
A plurality of corresponding multimedia files are matched from a preset multimedia file mapping relation table according to the multimedia file type, wherein the preset multimedia file mapping relation table includes a plurality of multimedia file types and a plurality of multimedia files; work scores corresponding to the plurality of multimedia files are respectively obtained, and the multimedia file to be displayed is selected from the plurality of multimedia files according to the work scores.
The step of selecting the multimedia files to be displayed from the plurality of multimedia files according to the work scores comprises the following steps: judging whether the score of the work is larger than a preset threshold value or not; and when the work score is larger than a preset threshold value, selecting a multimedia file corresponding to the work score from the plurality of multimedia files, and taking the multimedia file as the multimedia file to be displayed.
The preset threshold value can be set by a user in a self-defined way, and can be 9.0, 8.5 and the like.
For example, the plurality of multimedia files are A, B, and C; the work score of multimedia file A is 9.2, that of multimedia file B is 8.9, and that of multimedia file C is 9.4. The work scores of multimedia files A and C are greater than the preset threshold, so multimedia files A and C can be pushed, and so on.
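The score-threshold selection can be sketched in a few lines; the file list and threshold value mirror the example above, and the function name is an assumption:

```python
def files_to_show(files, threshold=9.0):
    """Keep only the files whose work score exceeds the preset threshold."""
    return [name for name, score in files if score > threshold]

candidates = [("A", 9.2), ("B", 8.9), ("C", 9.4)]
print(files_to_show(candidates))  # ['A', 'C']
```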
In this embodiment, the corresponding multimedia file type is determined according to the emotion portrait, the multimedia file to be displayed is determined according to the multimedia file type, and the multimedia file to be displayed is displayed, so that multimedia files matching the user's current emotion are recommended accurately.
Referring to fig. 5, fig. 5 is a block diagram illustrating a first embodiment of an image-based multimedia file presentation apparatus according to the present invention.
As shown in fig. 5, the image-based multimedia file display apparatus according to the embodiment of the present invention includes:
the determining module 5001 is configured to determine image category information according to the target image.
The target image may be understood as an image in the album of the terminal device, shot by the user and needing to be processed, or an image spliced from multiple frames extracted from a video in the album, shot by the user and needing to be processed.
In order to obtain an accurate target image from a plurality of images, before the step of determining image type information according to the target image, the method further comprises: determining the display duration of the image to be processed in the preset display area; judging whether the display duration is greater than a preset display threshold value or not; and when the display duration is greater than a preset display threshold, taking the image to be processed as a target image.
The preset display area is an interface area for displaying an image, the image to be processed is a single image browsed or viewed by a current user, the display duration is duration of displaying the image in the interface area, the preset display threshold value can be set by the user in a self-defined manner, can be 40s, can also be 1min, and the like, and the embodiment is not limited.
In a specific implementation, for example, when the duration for which image A browsed by the current user is displayed in the interface area is 41s and the preset display threshold is 40s, the display duration of image A is greater than the preset display threshold, and image A is taken as the target image.
It should be further understood that when the display duration of the image to be processed in the preset display area is greater than the preset display threshold, the user's viewing operation behavior for the image needs to be acquired, so as to avoid the case where the user merely leaves the image displayed in the interface area without actually viewing it. If the viewing operation behavior satisfies a preset viewing condition, the image to be processed is taken as the target image; the preset viewing condition is, for example, that the user zooms the image in or out.
In order to obtain an accurate target image from a plurality of videos, the method further includes, before the step of determining image type information according to the target image: determining the display duration of a video to be processed in a preset display area; when the display duration is greater than a preset display threshold, extracting a plurality of frames of images from the video to be processed; and determining a target image according to the multi-frame image.
The video to be processed is a single video browsed or checked by a current user, and a plurality of frames of images exist in the video.
For example, the time length of the video B browsed by the current user displayed in the interface area is 10s, the preset display threshold value is 8s, the display time length of the video B is greater than the preset display threshold value, multi-frame images are extracted from the video B, and the target image is determined according to the multi-frame images.
The method for extracting the multi-frame image from the video to be processed comprises the following steps: when the sliding operation behavior of the current user is received, determining a preset extraction rule according to the sliding operation behavior; and processing the video to be processed according to a preset extraction rule to obtain a multi-frame image.
The sliding operation behavior may be understood as an operation of sliding a progress bar when a user watches a video, the preset extraction rule is to extract an image with a preset frame number according to an end point of the user sliding progress bar, and the preset frame number may be set by the user in a user-defined manner, may be 3 frames, may also be 5 frames, and the like.
For example, the total duration of the video B to be processed is 20s, and when the user slides the progress bar to 10s while watching the video, an image with a preset frame number is extracted from 10s of the video B to be processed, and the target image is determined according to the image with the preset frame number.
In this embodiment, when no sliding operation behavior of the current user is received, images of the preset frame number are acquired from the start point, the middle position, and the end point of the video to be processed, and the target image is then determined according to the acquired images.
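The two extraction rules above (from the slide end point when the progress bar was dragged, otherwise from the start, middle, and end of the video) can be sketched as frame-index selection. Frame counts and the function name are illustrative assumptions:

```python
def frames_to_extract(total_frames, slide_frame=None, n=3):
    """Indices of frames to grab: n frames from the slide end point if the
    user dragged the progress bar, otherwise n frames each from the start,
    middle, and end of the video."""
    if slide_frame is not None:
        return list(range(slide_frame, min(slide_frame + n, total_frames)))
    mid = total_frames // 2
    return (list(range(n))
            + list(range(mid, mid + n))
            + list(range(total_frames - n, total_frames)))

# User slid the progress bar to frame 300 of a 600-frame video.
print(frames_to_extract(600, slide_frame=300))  # [300, 301, 302]
```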
The processing mode of determining the target image according to the multi-frame images can be splicing the multi-frame images to obtain the target image.
In order to accurately obtain the image category information of the target image, the step of determining the image category information according to the target image comprises the following steps: carrying out gray level processing on the target image to obtain a target gray level image; determining image subject information of the target gray image; and determining image category information according to the image subject information.
It should be understood that a gray scale image is an image with only one sample color per pixel. Such images are typically displayed in gray scale from the darkest black to the brightest white. The image subject information is a main component of the image, and the image category information includes a human scene category, a scenery category and a character category.
The step of determining the image subject information of the target gray image includes: determining position information of a plurality of objects in the target gray level image; determining corresponding object area occupation ratios according to the position information of the plurality of objects; and determining image main body information according to the area ratio of the object.
The object position information is the position of the object in the image, such as the center, the lower left, or the upper right; the object area ratio is the ratio of the object area to the image area, which may be one third, one quarter, or the like.
For example, the image has three objects, namely sky, building and person, and is divided into ten equal parts, the sky area ratio is two tenths, the building area ratio is four tenths, the person area ratio is four tenths, and the preset area ratio threshold is one tenth, so that the sky area ratio, the building area ratio and the person area ratio are all greater than the preset area ratio threshold, the sky, the building and the person are image subjects, and the corresponding image category information is a scene category.
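The subject determination in the example above reduces to comparing each object's area ratio against a preset threshold. The dictionary of areas and the function name are assumptions matching the worked example:

```python
def image_subjects(area_by_object, total_area, ratio_threshold=0.1):
    """Objects whose area ratio exceeds the preset threshold form the image subject."""
    return [obj for obj, area in area_by_object.items()
            if area / total_area > ratio_threshold]

# Areas in tenths of the image, as in the example: sky 2/10, building 4/10, person 4/10.
areas = {"sky": 2, "building": 4, "person": 4}
print(image_subjects(areas, total_area=10))  # ['sky', 'building', 'person']
```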
An extracting module 5002, configured to perform feature extraction on the target image according to the image category information to obtain image feature information.
The method comprises the steps of obtaining pixel point information of a target image, dividing the target image according to image category information and the pixel point information to obtain a plurality of image areas, and extracting features of the target image according to the image areas to obtain image feature information.
For example, the image type information is a scene type, and a person, a building and a sky exist in a target image, person pixel point information corresponding to the person, building pixel point information corresponding to the building and sky pixel point information corresponding to the sky are obtained, then the target image is divided according to the person pixel point information, the building pixel point information and the sky pixel point information to obtain a person region, a building region and a sky region, finally, person characteristic information is extracted from the person region, building characteristic information is extracted from the building region, and sky characteristic information is extracted from the sky region, wherein the person characteristic information includes face characteristic information corresponding to the person, the building characteristic information includes building shape information and building color information, and the sky characteristic information includes sky color information and the like.
The determining module 5001 is further configured to determine corresponding context information according to the image feature information.
For example, the image feature information includes character feature information, building feature information, and sky feature information, where the character feature information includes facial feature information corresponding to a character, the building feature information includes building shape information and building color information, and the sky feature information includes sky color information. Character expression information can be determined according to the facial feature information, the building name and position information can be determined according to the building shape information and building color information, and weather condition information can be determined according to the sky color information. Finally, the context information corresponding to the target image is generated according to the expression information corresponding to the character, the building name and position information, and the weather condition information.
The step of determining corresponding context information according to the image feature information comprises the following steps: analyzing the image characteristic information to obtain a plurality of image labels; and splicing the plurality of image tags to obtain the situation information of the target image.
For example, if the character feature information is character facial feature information, the facial feature information is analyzed to obtain facial expression information; if the expression is surprised, the image tag is "the visitor shows a surprised expression". If the building feature information is building shape information and building color information, these are analyzed to obtain the building name and location information, and the image tag is "a certain building exists at a certain location". If the sky feature information is sky color information, it is analyzed to obtain weather condition information, and the image tag is "the weather condition is cloudy" or "the weather condition is sunny".
For example, if the image tags indicate that the visitor shows a surprised expression, that building C exists in the center of city A, and that the weather condition is cloudy, and the shooting time of the target image is 18:00 on June 11, 2019, then the context information of the target image is: at 18:00 on June 11, 2019, the weather condition was cloudy, and the visitor showed a surprised expression in front of building C in the center of city A.
The determining module 5001 is further configured to determine an emotion image of the current user according to the context information.
The emotion portrait of the user may be an expression-state image of the user. Psychological state information of the user can be determined according to the context information, and the emotion portrait of the current user can then be determined according to the psychological state information; the emotion portrait may be a frightened portrait of the user, a cheerful portrait of the user, or the like.
A presentation module 5003 configured to search a corresponding multimedia file to be presented based on the emotion portrait, and present the multimedia file to be presented.
The multimedia file to be displayed may be a plurality of multimedia files or a single multimedia file. The multimedia files may be movie works, music, scenic-spot promotional films, and the like. Movies may be classified as documentary, war, history, biography, sports, science fiction, magic, fantasy, art, music, song-and-dance, animation, western, martial-arts, costume, action, romance, drama, comedy, family, ethics, horror, thriller, adventure, crime, and suspense films, among others.
For example, if the emotion portrait is a frightened portrait, the shooting time of the target image is acquired, a theatrical release time interval is determined according to the shooting time, a plurality of multimedia files released within that interval are acquired, corresponding thriller or horror films are searched from the plurality of multimedia files according to the frightened portrait, and the thriller or horror films are pushed to the user.
The method comprises the steps of searching corresponding multimedia files to be displayed based on emotional portraits and displaying the multimedia files to be displayed, and comprises the following steps: determining a corresponding multimedia file type according to the emotional portrait; and determining the multimedia file to be displayed according to the type of the multimedia file, and displaying the multimedia file to be displayed.
The processing mode for determining the corresponding multimedia file type according to the emotion portraits can be that emotion information is determined according to the emotion portraits, and then the corresponding multimedia file type is searched from a preset type mapping relation table according to the emotion information, wherein the preset type mapping relation table comprises a plurality of emotion information and a plurality of multimedia file types.
For example, if the emotion information is happy, the multimedia file type corresponding to the emotion information is a comedy type; if the emotion information is frightened, the multimedia file type corresponding to the emotion information is a thriller type, and so on.
The step of determining the multimedia file to be displayed according to the type of the multimedia file comprises the following steps:
matching a plurality of corresponding multimedia files from a preset multimedia file mapping relation table according to the multimedia file types, wherein the preset multimedia file mapping relation table comprises a plurality of multimedia file types and a plurality of multimedia files; respectively obtaining the work scores corresponding to the multimedia files; and selecting the multimedia to be displayed from the plurality of multimedia files according to the grading of the works.
The step of selecting the multimedia files to be displayed from the plurality of multimedia files according to the work scores comprises the following steps: judging whether the score of the work is larger than a preset threshold value or not; and when the work score is larger than a preset threshold value, selecting a multimedia file corresponding to the work score from the plurality of multimedia files, and taking the multimedia file as the multimedia file to be displayed.
The preset threshold may be set by a user in a customized manner, and may be 9.0, 8.5, or the like.
For example, the plurality of multimedia files are A, B, and C; the work score of multimedia file A is 9.2, that of multimedia file B is 8.9, and that of multimedia file C is 9.4. The work scores of multimedia files A and C are greater than the preset threshold, so multimedia files A and C can be pushed, and so on.
In this embodiment, image category information is first determined according to a target image, and feature extraction is performed on the target image according to the image category information to obtain image feature information; corresponding context information is then determined according to the image feature information, and an emotion portrait of the current user is determined according to the context information; finally, a corresponding multimedia file to be displayed is searched based on the emotion portrait and is displayed. Compared with the prior art, in which the image is merely stored and not processed, this method determines the corresponding context information according to the image feature information of the target image, determines the emotion portrait of the current user according to the context information, and finally recommends the corresponding multimedia file according to the emotion portrait, so that the user is reminded of the image while the user experience is improved and the usage stickiness of the terminal device is increased.
Further, the determining module 5001 is further configured to determine a display duration of the image to be processed in the preset display area;
the determining module 5001 is further configured to determine whether the display duration is greater than a preset display threshold;
the determining module 5001 is further configured to use the image to be processed as a target image when the display duration is greater than the preset display threshold.
Further, the determining module 5001 is further configured to determine a display duration of the video to be processed in the preset display area;
the determining module 5001 is further configured to extract a plurality of frames of images from the video to be processed when the display duration is greater than a preset display threshold;
the determining module 5001 is further configured to determine a target image according to the multi-frame image.
Further, the determining module 5001 is further configured to determine a preset extraction rule according to the sliding operation behavior when the sliding operation behavior of the current user is received;
the determining module 5001 is further configured to process the video to be processed according to the preset extraction rule, so as to obtain a plurality of frames of images.
Further, the determining module 5001 is further configured to perform gray processing on the target image to obtain a target gray image;
the determining module 5001 is further configured to determine image subject information of the target grayscale image;
the determining module 5001 is further configured to determine image category information according to the image subject information.
Further, the determining module 5001 is further configured to determine position information of a plurality of objects in the target grayscale image;
the determining module 5001 is further configured to determine corresponding object area ratios according to the plurality of object location information;
the determining module 5001 is further configured to determine the image subject information according to the object area ratio.
Further, the extracting module 5002 is further configured to obtain pixel point information of the target image;
the extracting module 5002 is further configured to divide the target image according to the image category information and the pixel point information to obtain a plurality of image areas;
the extracting module 5002 is further configured to perform feature extraction on the target image according to the plurality of image areas to obtain image feature information.
Further, the determining module 5001 is further configured to analyze the image feature information to obtain a plurality of image tags;
the determining module 5001 is further configured to combine the plurality of image tags to obtain context information of the target image.
Further, the determining module 5001 is further configured to determine whether an audio signal exists in the video to be processed;
the determining module 5001 is further configured to, when an audio signal exists in the video to be processed, extract audio information according to the audio signal;
the determining module 5001 is further configured to convert the audio information into text information;
the determining module 5001 is further configured to analyze the text information to obtain an audio keyword;
the determining module 5001 is further configured to determine an emotional portrait of the current user according to the context information and the audio keyword.
Further, the presentation module 5003 is further configured to determine a corresponding multimedia file type according to the emotional portrait;
the presentation module 5003 is further configured to determine a multimedia file to be presented according to the multimedia file type, and present the multimedia file to be presented.
Further, the presentation module 5003 is further configured to match a plurality of corresponding multimedia files from a preset multimedia file mapping relationship table according to the multimedia file types, where the preset multimedia file mapping relationship table includes a plurality of multimedia file types and a plurality of multimedia files;
the presentation module 5003 is further configured to obtain the work scores corresponding to the multiple multimedia files respectively;
the presentation module 5003 is further configured to select a multimedia file to be presented from the plurality of multimedia files according to the work score.
Further, the presentation module 5003 is further configured to determine whether the work score is greater than a preset threshold;
the presentation module 5003 is further configured to, when the work score is greater than the preset threshold, select the multimedia file corresponding to the work score from the plurality of multimedia files, and use the multimedia file as the multimedia file to be displayed.
Other embodiments or specific implementation manners of the image-based multimedia file display device of the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of another identical element in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are only for description, and do not represent the advantages and disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., a ROM/RAM, a magnetic disk, an optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
The invention also discloses A1, an image-based multimedia file display method, which comprises the following steps:
determining image category information according to the target image;
extracting the features of the target image according to the image category information to obtain image feature information;
determining corresponding situation information according to the image characteristic information;
determining the emotional portrait of the current user according to the situation information;
and searching a corresponding multimedia file to be displayed based on the emotional portrait, and displaying the multimedia file to be displayed.
A2, the method as in A1, further comprising, before the step of determining the image category information according to the target image:
determining the display duration of the image to be processed in the preset display area;
judging whether the display duration is greater than a preset display threshold or not;
and when the display duration is greater than the preset display threshold, taking the image to be processed as a target image.
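For illustration only, the dwell-time gate of A2 can be sketched as follows in Python; the function name, timestamp representation, and the 3-second threshold value are assumptions, since the disclosure only specifies a "preset display threshold":

```python
DISPLAY_THRESHOLD_S = 3.0  # hypothetical preset display threshold, in seconds

def select_target_image(image_id, shown_at, hidden_at, threshold=DISPLAY_THRESHOLD_S):
    """Return the image id as the target image only if its display duration
    in the preset display area exceeds the preset display threshold."""
    display_duration = hidden_at - shown_at
    if display_duration > threshold:
        return image_id
    return None
```

An image the user lingers on thus becomes the target image, while one scrolled past quickly is ignored.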
A3, the method as in A1, wherein before the step of determining the image category information according to the target image, the method further includes:
determining the display duration of a video to be processed in a preset display area;
when the display duration is greater than a preset display threshold, extracting a plurality of frames of images from the video to be processed;
and determining a target image according to the multi-frame image.
A4, the method as in A3, wherein the step of extracting multiple frames of images from the video to be processed includes:
when the sliding operation behavior of a current user is received, determining a preset extraction rule according to the sliding operation behavior;
and processing the video to be processed according to the preset extraction rule to obtain a plurality of frames of images.
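A4's mapping from a sliding operation to an extraction rule can be sketched as below; the speed cutoff and sampling steps are invented for illustration, as the disclosure does not fix a concrete rule:

```python
def extraction_step(swipe_speed_px_s):
    """Map a sliding speed to a frame-sampling step (hypothetical rule):
    a fast swipe samples sparsely, a slow swipe samples densely."""
    return 30 if swipe_speed_px_s > 500 else 10

def extract_frames(total_frames, swipe_speed_px_s):
    """Return the indices of the frames to extract from the video to be processed."""
    step = extraction_step(swipe_speed_px_s)
    return list(range(0, total_frames, step))
```

The returned indices would then be used to grab the corresponding frames from the decoded video.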
A5, the method according to any one of A1 to A4, wherein the step of determining the image category information according to the target image includes:
carrying out gray level processing on the target image to obtain a target gray level image;
determining image subject information of the target gray image;
and determining image category information according to the image subject information.
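The gray processing of A5 is not specified further in the disclosure; a minimal sketch using the common ITU-R BT.601 luma weights (an assumption, not stated in the patent) is:

```python
def to_gray(pixel_rgb):
    """Convert one RGB pixel to a gray value with ITU-R BT.601 weights."""
    r, g, b = pixel_rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def gray_image(rgb_rows):
    """Apply the conversion to every pixel of an image given as rows of RGB tuples."""
    return [[to_gray(p) for p in row] for row in rgb_rows]
```

The resulting target grayscale image is what the subject-detection step of A6 operates on.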
A6, as in the method of A5, the step of determining the image subject information of the target gray-scale image includes:
determining position information of a plurality of objects in the target gray level image;
determining corresponding object area occupation ratios according to the plurality of object position information;
and determining image subject information according to the object area ratio.
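A6's area-ratio rule can be sketched as follows; the bounding-box representation and the "largest ratio wins" choice are illustrative assumptions:

```python
def area_ratio(box, image_w, image_h):
    """Fraction of the image covered by an object's bounding box (x, y, w, h)."""
    _, _, w, h = box
    return (w * h) / (image_w * image_h)

def image_subject(objects, image_w, image_h):
    """Pick the object with the largest area ratio as the image subject."""
    return max(objects, key=lambda o: area_ratio(o["box"], image_w, image_h))["label"]
```

Under this sketch, an object occupying a quarter of the frame outweighs one occupying one percent, so the former's label becomes the image subject information.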
A7, the method according to any one of A1 to A4, wherein the step of performing feature extraction on the target image according to the image category information to obtain image feature information includes:
acquiring pixel point information of the target image;
dividing the target image according to the image category information and the pixel point information to obtain a plurality of image areas;
and extracting the features of the target image according to the image areas to obtain image feature information.
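A7 divides the target image using the image category information together with the pixel point information. One hypothetical realization (the category-to-grid mapping below is invented for illustration) splits the image into a category-dependent grid:

```python
def split_regions(width, height, category):
    """Divide the image into a category-dependent grid of (x, y, w, h) regions.
    The grid sizes per category are assumptions, not taken from the patent."""
    cols, rows = {"portrait": (1, 2), "landscape": (3, 1)}.get(category, (2, 2))
    rw, rh = width // cols, height // rows
    return [(c * rw, r * rh, rw, rh) for r in range(rows) for c in range(cols)]
```

Feature extraction would then run per region rather than over the whole frame.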
A8, the method according to any one of A1 to A4, wherein the step of determining the corresponding context information according to the image feature information includes:
analyzing the image characteristic information to obtain a plurality of image labels;
and splicing the plurality of image labels to obtain the situation information of the target image.
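The label-splicing step of A8 can be sketched in one line; the separator and the duplicate-removal choice are illustrative:

```python
def context_info(tags):
    """Splice the per-image tags into one context string, removing
    duplicates while preserving first-seen order."""
    return ", ".join(dict.fromkeys(tags))
```

The spliced string serves as the context information of the target image for the later emotion-portrait step.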
A9, the method as in A3 or A4, wherein the step of determining the emotional portrait of the current user according to the context information includes:
judging whether an audio signal exists in the video to be processed;
when audio signals exist in the video to be processed, extracting audio information according to the audio signals;
converting the audio information into character information;
analyzing the character information to obtain audio keywords;
and determining the emotion portrait of the current user according to the context information and the audio key words.
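A9 combines context information with audio keywords. A minimal sketch is a vote over a keyword-to-emotion lexicon; the lexicon contents and the majority-vote rule are invented for illustration, since the patent does not specify the combination mechanism:

```python
# Hypothetical keyword-to-emotion lexicon (not from the patent).
EMOTION_LEXICON = {
    "laugh": "happy", "sunset": "calm", "beach": "calm", "crying": "sad",
}

def emotion_portrait(context_tags, audio_keywords):
    """Vote over context tags and audio keywords; return the majority emotion."""
    votes = {}
    for word in list(context_tags) + list(audio_keywords):
        emotion = EMOTION_LEXICON.get(word)
        if emotion:
            votes[emotion] = votes.get(emotion, 0) + 1
    return max(votes, key=votes.get) if votes else "neutral"
```

A silent video simply contributes no audio keywords, so the vote falls back to the visual context alone.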
A10, the method as in any one of A1-A4, wherein the step of searching for the corresponding multimedia file to be displayed based on the emotional portrait and displaying the multimedia file to be displayed comprises:
determining a corresponding multimedia file type according to the emotional portrait;
and determining a multimedia file to be displayed according to the multimedia file type, and displaying the multimedia file to be displayed.
A11, the method as in A10, wherein the step of determining the multimedia file to be displayed according to the multimedia file type includes:
matching a plurality of corresponding multimedia files from a preset multimedia file mapping relation table according to the multimedia file types, wherein the preset multimedia file mapping relation table comprises a plurality of multimedia file types and a plurality of multimedia files;
respectively obtaining the work scores corresponding to the multimedia files;
and selecting the multimedia files to be displayed from the plurality of multimedia files according to the work scores.
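The lookup-and-rank flow of A10/A11 can be sketched as below; the table contents, file names, and scores are made-up placeholders for the preset multimedia file mapping relation table:

```python
# Hypothetical preset mapping table: multimedia file type -> (file, work score).
FILE_TABLE = {
    "calm": [("rain_sounds.mp3", 7.9), ("forest_walk.mp4", 9.1)],
    "happy": [("party_mix.mp3", 8.4)],
}

def file_to_present(file_type):
    """Match candidates for the file type and return the best-scored work."""
    candidates = FILE_TABLE.get(file_type, [])
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])[0]
```

Keeping the table keyed by file type makes the match an O(1) lookup followed by a scan over only the matched candidates.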
A12, the method as in A11, wherein the step of selecting the multimedia file to be displayed from the plurality of multimedia files according to the work score includes:
judging whether the work score is greater than a preset threshold;
and when the work score is greater than the preset threshold, selecting the multimedia file corresponding to the work score from the plurality of multimedia files, and taking the multimedia file as the multimedia file to be displayed.
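A12's threshold filter can be sketched as follows; the threshold value of 8.0 is an assumed stand-in for the preset threshold:

```python
SCORE_THRESHOLD = 8.0  # hypothetical preset threshold for the work score

def pick_above_threshold(scored_files, threshold=SCORE_THRESHOLD):
    """Keep only (file, score) pairs whose score exceeds the threshold,
    ordered best first, as candidates for the file to be displayed."""
    kept = [f for f in scored_files if f[1] > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```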
The invention also discloses B13, an image-based multimedia file display device, which includes:
the determining module is used for determining image category information according to the target image;
the extraction module is used for extracting the features of the target image according to the image category information to obtain image feature information;
the determining module is further configured to determine corresponding context information according to the image feature information;
the determining module is further used for determining the emotional portrait of the current user according to the context information;
and the display module is used for searching the corresponding multimedia file to be displayed based on the emotional portrait and displaying the multimedia file to be displayed.
B14, the device as in B13, wherein the determining module is further configured to determine a display duration of the image to be processed in a preset display area;
the determining module is further configured to determine whether the display duration is greater than a preset display threshold;
the determining module is further configured to use the image to be processed as a target image when the display duration is greater than the preset display threshold.
B15, the device as in B13, wherein the determining module is further configured to determine a display duration of the video to be processed in a preset display area;
the determining module is further configured to extract a plurality of frames of images from the video to be processed when the display duration is greater than a preset display threshold;
the determining module is further configured to determine a target image according to the multi-frame image.
B16, the device as in B15, wherein the determining module is further configured to determine a preset extraction rule according to the sliding operation behavior when the sliding operation behavior of the current user is received;
the determining module is further configured to process the video to be processed according to the preset extraction rule, so as to obtain a plurality of frames of images.
B17, the device as in any one of B13 to B16, wherein the determining module is further configured to perform gray processing on the target image to obtain a target grayscale image;
the determining module is further configured to determine image subject information of the target grayscale image;
the determining module is further configured to determine image category information according to the image subject information.
B18, the apparatus as in B17, where the determining module is further configured to determine position information of a plurality of objects in the target grayscale image;
the determining module is further configured to determine corresponding object area ratios according to the plurality of object position information;
the determining module is further configured to determine image subject information according to the object area ratio.
The invention also discloses C19, an image-based multimedia file display device, the device including: a memory, a processor, and an image-based multimedia file display program stored on the memory and executable on the processor, the image-based multimedia file display program being configured to implement the steps of the image-based multimedia file display method described above.
The invention also discloses D20, a storage medium on which an image-based multimedia file display program is stored; when executed by a processor, the image-based multimedia file display program implements the steps of the image-based multimedia file display method described above.

Claims (10)

1. An image-based multimedia file display method, characterized by comprising the following steps:
determining image category information according to the target image;
extracting the features of the target image according to the image category information to obtain image feature information;
determining corresponding situation information according to the image characteristic information;
determining the emotional portrait of the current user according to the situation information;
and searching a corresponding multimedia file to be displayed based on the emotional portrait, and displaying the multimedia file to be displayed.
2. The method of claim 1, wherein before the step of determining the image category information according to the target image, the method further comprises:
determining the display duration of the image to be processed in the preset display area;
judging whether the display duration is greater than a preset display threshold value or not;
and when the display duration is greater than the preset display threshold, taking the image to be processed as a target image.
3. The method of claim 1, wherein before the step of determining the image category information according to the target image, the method further comprises:
determining the display duration of a video to be processed in a preset display area;
when the display duration is greater than a preset display threshold, extracting a plurality of frames of images from the video to be processed;
and determining a target image according to the multi-frame image.
4. The method of claim 3, wherein the step of extracting a plurality of frames of images from the video to be processed comprises:
when the sliding operation behavior of a current user is received, determining a preset extraction rule according to the sliding operation behavior;
and processing the video to be processed according to the preset extraction rule to obtain a plurality of frames of images.
5. The method according to any one of claims 1-4, wherein the step of determining image category information from the target image comprises:
carrying out gray level processing on the target image to obtain a target gray level image;
determining image subject information of the target gray image;
and determining image category information according to the image subject information.
6. The method of any one of claims 1-4, wherein the step of searching for a corresponding multimedia file to be presented based on the emotion portrait and presenting the multimedia file to be presented comprises:
determining a corresponding multimedia file type according to the emotional portrait;
and determining a multimedia file to be displayed according to the multimedia file type, and displaying the multimedia file to be displayed.
7. The method of claim 6, wherein the step of determining the multimedia file to be presented based on the multimedia file type comprises:
matching a plurality of corresponding multimedia files from a preset multimedia file mapping relation table according to the multimedia file types, wherein the preset multimedia file mapping relation table comprises a plurality of multimedia file types and a plurality of multimedia files;
respectively obtaining the work scores corresponding to the multimedia files;
and selecting the multimedia files to be displayed from the plurality of multimedia files according to the work scores.
8. An image-based multimedia file presentation apparatus, comprising:
the determining module is used for determining image category information according to the target image;
the extraction module is used for extracting the features of the target image according to the image category information to obtain image feature information;
the determining module is further configured to determine corresponding context information according to the image feature information;
the determining module is further used for determining the emotional portrait of the current user according to the context information;
and the display module is used for searching the corresponding multimedia file to be displayed based on the emotional portrait and displaying the multimedia file to be displayed.
9. An image-based multimedia file presentation apparatus, the apparatus comprising: a memory, a processor, and an image-based multimedia file presentation program stored on the memory and executable on the processor, the image-based multimedia file presentation program being configured to implement the steps of the image-based multimedia file presentation method according to any one of claims 1 to 7.
10. A storage medium having stored thereon an image-based multimedia file presentation program, the image-based multimedia file presentation program, when executed by a processor, implementing the steps of the image-based multimedia file presentation method according to any one of claims 1 to 7.
CN202110723291.7A 2021-06-28 2021-06-28 Image-based multimedia file display method, device, equipment and storage medium Pending CN115599950A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110723291.7A CN115599950A (en) 2021-06-28 2021-06-28 Image-based multimedia file display method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110723291.7A CN115599950A (en) 2021-06-28 2021-06-28 Image-based multimedia file display method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115599950A true CN115599950A (en) 2023-01-13

Family

ID=84840529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110723291.7A Pending CN115599950A (en) 2021-06-28 2021-06-28 Image-based multimedia file display method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115599950A (en)

Similar Documents

Publication Publication Date Title
TWI777162B (en) Image processing method and apparatus, electronic device and computer-readable storage medium
CN113115099B (en) Video recording method and device, electronic equipment and storage medium
US9881229B2 (en) Apparatus, method and program for image search
CN111260545A (en) Method and device for generating image
US20180025215A1 (en) Anonymous live image search
WO2023051063A1 (en) Information display method and apparatus, and computer device and storage medium
CN108924381B (en) Image processing method, image processing apparatus, and computer readable medium
CN108197336B (en) Video searching method and device
CN110347866B (en) Information processing method, information processing device, storage medium and electronic equipment
CN110889379A (en) Expression package generation method and device and terminal equipment
CN107733769B (en) Method and device for displaying user information
CN112383830A (en) Video cover determining method and device and storage medium
CN111553372B (en) Training image recognition network, image recognition searching method and related device
US9081801B2 (en) Metadata supersets for matching images
CN110708593A (en) Method, device and storage medium for embedding advertisement in video content
US7869633B2 (en) Person image retrieval apparatus
CN113542833A (en) Video playing method, device and equipment based on face recognition and storage medium
US20220108063A1 (en) Efficient data entry system for electronic forms
CN115599950A (en) Image-based multimedia file display method, device, equipment and storage medium
JP2003330941A (en) Similar image sorting apparatus
CN109992697B (en) Information processing method and electronic equipment
CN112948630B (en) List updating method, electronic equipment, storage medium and device
CN113537127A (en) Film matching method, device, equipment and storage medium
CN111274436A (en) Label extraction method, server and readable storage medium
CN108287817B (en) Information processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination