WO2017071227A1 - Video processing method and system, video player and cloud server - Google Patents


Info

Publication number
WO2017071227A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
face
face image
classification database
cloud server
Prior art date
Application number
PCT/CN2016/085011
Other languages
English (en)
French (fr)
Inventor
马进
唐熊
Original Assignee
乐视控股(北京)有限公司
乐视移动智能信息技术(北京)有限公司
Priority date
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司, 乐视移动智能信息技术(北京)有限公司
Priority to US15/247,043 (published as US20170116465A1)
Publication of WO2017071227A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/432: Content retrieval operation from a local storage medium, e.g. hard-disk
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45: Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/458: Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules; time-related management operations
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845: Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456: Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/854: Content authoring
    • H04N21/8547: Content authoring involving timestamps for synchronizing content
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • the present invention relates to the field of video processing technologies, and in particular, to a video processing method and system, a video player, and a cloud server.
  • the embodiment of the invention provides a video processing method and system, a video player and a cloud server, to overcome the defect of low video positioning efficiency in the prior art, so as to locate all the video segments in which a given face appears in the video and thereby improve the positioning processing efficiency of the video.
  • An embodiment of the present invention provides a video processing method, where the method includes:
  • acquiring the video information corresponding to the selected face image in the video, where the video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image;
  • the embodiment of the invention further provides a video processing method, the method comprising:
  • receiving a video positioning request that carries the selected face image and is sent by the video player; the video positioning request is initiated by the user through a human interface module and received by the video player;
  • acquiring the video information corresponding to the selected face image; the video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image;
  • An embodiment of the present invention further provides a video player, including:
  • a receiving module configured to receive a video positioning request that is sent by the user through the human interface module and that carries the selected face image;
  • an acquiring module configured to acquire the video information corresponding to the selected face image in the video positioning request, where the video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image;
  • a display module configured to display the video information corresponding to the selected face image.
  • the embodiment of the invention further provides a cloud server, where the cloud server includes:
  • a receiving module configured to receive a video positioning request that is sent by the video player and that carries the selected face image; the video positioning request is initiated by the user through the human interface module and forwarded by the video player;
  • an obtaining module configured to obtain, from a pre-stored face classification database, the video information corresponding to the selected face image; the video information includes an identifier of the selected face image and at least one piece of video segment information of the selected face image;
  • a sending module configured to send the video information corresponding to the selected face image to the video player, so that the video player displays the video information corresponding to the selected face image to a user.
  • the embodiment of the present invention further provides a video playing system, where the video playing system includes a video player and a cloud server communicably connected with each other; the video player is a video player as described above, and the cloud server is a cloud server as described above.
  • the video processing method and system, the video player and the cloud server of the embodiments of the present invention receive a video positioning request that is sent by the user through the human interface module and carries the selected face image, obtain the video information corresponding to the selected face image in the video, and display that video information.
  • the technical solution of the embodiments of the present invention remedies the prior-art defect that all the video segments of a given face in a video cannot be located, which results in low video positioning efficiency, and achieves highly efficient positioning of all the video information of a selected face image in the video.
  • the technical solution also makes it convenient for the user to view all the performances, in the video, of the actor corresponding to the selected face image, providing a good user experience.
  • FIG. 1 is a flowchart of an embodiment of a video processing method according to an embodiment of the present invention
  • FIG. 2 is a PTS (presentation time stamp) distribution diagram of a face corresponding to a face identifier according to an embodiment of the present invention;
  • FIG. 3 is a flowchart of another embodiment of a video processing method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of still another embodiment of a video processing method according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of still another embodiment of a video processing method according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of still another embodiment of a video processing method according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an embodiment of a video player according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of another embodiment of a video player according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of still another embodiment of a video player according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of still another embodiment of a video player according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of an embodiment of a cloud server according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of another embodiment of a cloud server according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of an embodiment of a video playing system according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of an embodiment of a video processing method according to the present invention. As shown in FIG. 1 , the video processing method in this embodiment may specifically include the following steps:
  • the video player is the client of the video processing system.
  • the video player can be installed on a mobile terminal such as a mobile phone or a tablet computer, or on a non-mobile terminal such as an ordinary computer.
  • the client interacts with the user: the video player receives a video positioning request that is sent by the user through the human interface module and carries the selected face image, where the human interface module can be a keyboard, a stylus, the information detection and receiving module of a touch screen, and so on.
  • for example, the information detection and receiving module of the touch screen can detect the video positioning request sent by the user and obtain the selected face image carried in the request.
  • the selected face image in this embodiment may be a clear face photo of one of the actors selected by the user, or the face of the actor in a screen capture of the video; the face it contains must be clear enough to be easily identifiable.
  • the video information in this embodiment includes the identifier of the selected face image and at least one piece of video segment information of the selected face image in the video, and may further include the selected face image itself.
  • all the video information corresponding to the selected face image in the video positioning request may be obtained, where each piece of video information may include the identifier of the selected face image and at least one piece of video segment information. The identifier uniquely identifies the selected face image in the video and may be the name or stage name of the corresponding actor; when the actor's name or stage name is not unique in the video, another identifier (ID) may be used to uniquely identify the selected face image.
  • a video segment is a segment of the video in which the selected face image appears; the at least one piece of video segment information covers all the segments of the video in which the selected face image appears.
  • at least one piece of video segment information of this embodiment may include a start and end time of each video segment, that is, a start time and an end time of the video segment.
  • the video information corresponding to the selected face image may be displayed on the interface of the video player, so that the positioning of the video of the selected face image is completed.
  • the user can select a video of the selected face image that is positioned and viewed on the video player according to the displayed video information of the selected face image.
  • the video processing method of this embodiment can be applied to the positioning of all video information of any one of the actors in a video program, so that the user can view all the performances of the actor in the video.
  • the video processing method of this embodiment receives a video positioning request that is sent by the user through the human interface module and carries the selected face image, obtains the video information corresponding to the selected face image in the video, and displays that video information.
  • the method thus remedies the defect that all the video segments of a given face in a video cannot be located, which results in low video positioning efficiency, and achieves highly efficient positioning of all the video information of a selected face image in the video; it also makes it convenient for the user to view all the performances, in the video, of the actor corresponding to the selected face image, providing a good user experience.
  • acquiring the video information corresponding to the selected face image in the video positioning request may include: acquiring the video information corresponding to the selected face image from a pre-stored face classification database.
  • the face classification database is pre-stored on the video player, that is, the client side of the video playing system.
  • the video player can also perform the video processing of this embodiment by itself.
  • before "acquiring the video information corresponding to the selected face image from the pre-stored face classification database", the video processing method of this embodiment may further include: establishing the face classification database. The face classification database may include a plurality of face identifiers and, for each face identifier, the video information of the corresponding face in the video; for example, the video information may include the start and end time of each video segment of the face in the video.
  • the “establishing a face classification database” in the foregoing embodiment may specifically include the following steps:
  • (1) the video is formed by a sequence of frames; each frame is decoded to obtain a corresponding image. Taking RGB decoded images as an example, decoding all the frames of the video yields a set of RGB images.
  • (2) a face detection algorithm is applied to each RGB image in the set obtained in step (1). When a face is detected in an RGB image, the face in that image and the PTS (presentation time stamp) of the image in video playback are acquired.
  • (3) a face time stamp database is generated from the faces obtained by the face detection in step (2) and the PTS of each face; that is, the face time stamp database includes each detected face and the PTS of each face in the video.
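Steps (1) to (3) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the frame decoder and the face detector are replaced by synthetic data and a hypothetical `detect_faces` stub, whereas a real system would use a video decoder and a face detection library.

```python
def detect_faces(frame):
    """Hypothetical detector: returns the face crops visible in a frame.
    A 'frame' is simulated here as a dict listing the people it shows."""
    return frame["faces"]

def build_face_timestamp_db(frames):
    """frames: iterable of (pts, frame) pairs in playback order.
    Returns a list of (face, pts) records - the face time stamp database."""
    db = []
    for pts, frame in frames:
        for face in detect_faces(frame):   # step (2): per-image detection
            db.append((face, pts))         # step (3): store face with its PTS
    return db

# Simulated decoded video: one frame per PTS unit of playback.
frames = [
    (0, {"faces": ["actor_a"]}),
    (1, {"faces": ["actor_a", "actor_b"]}),
    (2, {"faces": []}),                    # no face in this frame
    (3, {"faces": ["actor_b"]}),
]
db = build_face_timestamp_db(frames)
print(db)  # [('actor_a', 0), ('actor_a', 1), ('actor_b', 1), ('actor_b', 3)]
```

Frames without any face contribute no records, which keeps the database limited to images that actually contain a human face.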
  • the face time stamp database saves, on a time basis, the faces detected in each image that contains a human face. Since a video is long (for example, 90 minutes), the number of decoded images is very large.
  • all the faces in the face time stamp database obtained in step (3) may include the faces of a plurality of actors, some of which are the faces of different actors at different PTSs; in this step, the faces are classified according to face identifiers.
  • each face in the face time stamp database can be processed in order of PTS from front to back. The first face is assigned a face identifier, which can be input by the user through the human interface module, for example the name or stage name of the corresponding actor, or another face ID; the face identifier, the face, and the PTS of the face are then stored.
  • for the second face in the face time stamp database, a feature value matching algorithm is used to determine whether it and a stored face belong to the same person: if so, its face identifier is set to the stored face identifier, so that the faces of the same person share the same identifier; if not, a new face identifier is assigned. Proceeding in this way, all the faces in the face time stamp database are classified by face identifier, so that the faces belonging to the same person correspond to the same face identifier.
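The identifier-assignment pass described above can be sketched like this. `same_person` stands in for the feature value matching algorithm, which the text does not specify, and the synthetic "feature" field is an assumption made purely for illustration.

```python
def same_person(face_a, face_b):
    """Stand-in for the feature value matching algorithm: here two faces
    match when their synthetic 'feature' values are equal."""
    return face_a["feature"] == face_b["feature"]

def classify_faces(timestamp_db):
    """Walk the face time stamp database in PTS order, assigning each face
    an identifier; a face matching a stored face reuses that identifier,
    otherwise a new identifier is created. Returns (face_id, pts) records."""
    identified = []   # classified records: (face_id, pts)
    known = []        # (face_id, representative face) for each identity
    next_id = 0
    for face, pts in sorted(timestamp_db, key=lambda rec: rec[1]):
        for fid, ref in known:
            if same_person(face, ref):
                identified.append((fid, pts))
                break
        else:  # no stored face matched: assign a fresh identifier
            known.append((next_id, face))
            identified.append((next_id, pts))
            next_id += 1
    return identified

db = [
    ({"feature": "A"}, 0),
    ({"feature": "B"}, 1),
    ({"feature": "A"}, 2),   # same person as the face at PTS 0
]
print(classify_faces(db))  # [(0, 0), (1, 1), (0, 2)]
```

After this pass, all records sharing a face identifier belong to the same person, which is the precondition for the segment estimation that follows.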
  • each piece of video segment information of the face corresponding to a face identifier includes the start and end time of the video segment. After all the faces in the face time stamp database have been classified by face identifier, a continuous run of PTS values corresponding to each face identifier is determined from the PTSs of the face corresponding to that identifier. Because a video segment of the face requires the face to appear over continuous PTSs, the continuous video segment corresponding to the face identifier can be determined from the continuous PTS run, and thus the video segment information of the face corresponding to the face identifier, that is, the start and end time of each video segment, can be estimated.
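Estimating segment boundaries from continuous PTS runs can be sketched as follows. This is a simplification under stated assumptions: the PTS values and the `max_gap` threshold are illustrative, and the patent does not specify the exact grouping rule.

```python
def pts_to_segments(pts_list, max_gap=1):
    """Group the PTS values at which a face appears into contiguous video
    segments; a gap larger than max_gap starts a new segment. Returns
    (start, end) pairs - the start and end time of each video segment."""
    segments = []
    for pts in sorted(pts_list):
        if segments and pts - segments[-1][1] <= max_gap:
            segments[-1][1] = pts          # extend the current segment
        else:
            segments.append([pts, pts])    # start a new segment
    return [tuple(seg) for seg in segments]

# The face appears at PTS 3-5 and again at PTS 9-10: two segments.
print(pts_to_segments([3, 4, 5, 9, 10]))  # [(3, 5), (9, 10)]
```

Raising `max_gap` tolerates short intervals in which detection misses the face, at the cost of merging genuinely separate appearances.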
  • FIG. 2 is a PTS distribution diagram of a face corresponding to a certain face identifier according to an embodiment of the present invention.
  • the abscissa is the PTS, and the ordinate indicates whether the face corresponding to the face identifier appears: 0 means it does not appear, and 1 means it appears.
  • a period in which the points with ordinate value 1 are densely distributed along the PTS axis, such as time periods 3 to 5, can be considered a period in which the face appears.
  • the points with ordinate value 1 in FIG. 2 can thus be divided into several runs, each representing one video segment of the actor corresponding to the face. From the distribution in FIG. 2, the video segment information shown in Table 1 below can be obtained.
  • the face classification database thus includes each face identifier and, for each face identifier, the start and end time of every video segment in which the corresponding face appears in the video. This makes video positioning based on any face in the face classification database very convenient.
  • the core structure of the face classification database of this embodiment may be expressed as follows:
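The structure itself is not reproduced in this extract. Purely as an illustration of what such a database could look like (an assumption, not the patent's actual layout), one possible shape is a mapping from face identifier to its stored face picture and segment list:

```python
# Illustrative layout only - the patent's actual structure is not
# reproduced here. Each face identifier maps to the start and end time
# of every video segment in which that face appears.
face_classification_db = {
    "actor_a": {
        "face_image": "actor_a.jpg",     # optional stored face picture
        "segments": [(3, 5), (9, 10)],   # (start_time, end_time) pairs
    },
    "actor_b": {
        "face_image": "actor_b.jpg",
        "segments": [(1, 2)],
    },
}

def lookup(db, face_id):
    """Video positioning: return the segment list for a face identifier."""
    return db[face_id]["segments"]

print(lookup(face_classification_db, "actor_a"))  # [(3, 5), (9, 10)]
```

With this shape, locating all appearances of a selected face is a single key lookup rather than a scan of the whole video.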
  • the technical solution of the present invention is described on the side of the video player, that is, the client of the video playing system.
  • the face classification database may also be on the cloud server side, as described in the following embodiments.
  • after the face classification database is established, the method may further include: arranging the face identifiers in the face classification database in descending order of their probability of occurrence in the video. This yields a probability distribution table of the faces corresponding to the face identifiers, from which the protagonist of the video can be determined directly.
  • faces whose probability of occurrence is small may also be discarded according to the probability of occurrence of the face corresponding to each face identifier. Low-probability faces may belong to extras, and the probability that such a face is selected for positioning by the user is very small, so discarding them saves storage space in the face classification database.
  • on the basis of the embodiment in which the face identifiers in the face classification database are arranged in descending order of probability of occurrence, before step 100 "receiving the video positioning request carrying the selected face image sent by the user through the human interface module", the method may further include: displaying the face images corresponding to the top N face identifiers in the face classification database, where N is an integer greater than or equal to 1.
  • the top N here refers to the N face identifiers with the highest probability of occurrence in the video; they correspond to the more important characters, and an important character is more likely to be selected for positioning by the user. The video player can therefore display the face image of each of these top N face identifiers, so that the user can select one of the N faces as the selected face image and locate its video.
  • the selected face image in step 100 may thus be chosen by the user from the face images corresponding to the N face identifiers; specifically, the user may select one of the N faces through the human interface module to initiate the video positioning request.
  • alternatively, the selected face image in "receiving the video positioning request carrying the selected face image sent by the user through the human interface module" may be input by the user through the human interface module. For example, if the user knows that an actor appears in the video and wants to locate all of that actor's video segments, the user may download from the network a picture containing the actor's face, or take a photo containing the actor's face, to initiate the video positioning request.
  • All the solutions of the above embodiments establish a face classification database on the client side of the video playing system, that is, the video player side, and perform video processing.
  • the function modules that operate on the face classification database can be deployed in the engine of the video player, with corresponding interfaces provided in the native layer and the Java layer, to be called when the video player executes the corresponding functions locally.
  • alternatively, after "establishing the face classification database", the face classification database may be sent to the cloud server for storage, so that the subsequent positioning of the video information of a selected face image in a video positioning request is performed on the cloud server side.
  • step 101 “Acquiring the video information corresponding to the selected face image in the video in the video positioning request” may include the following steps:
  • taking the case where the video positioning request is handled on the cloud server side as an example:
  • the video player receives the video positioning request that is sent by the user through the human interface module and carries the selected face image;
  • the video player sends the video location request carrying the selected face image to the cloud server.
  • the cloud server obtains the video information corresponding to the selected face image in the face classification database pre-stored on the cloud server side, and sends the video information to the video player.
  • the video player receives the video information sent by the cloud server.
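The four steps above can be simulated in-process as follows. This is only a sketch of the division of labor between player and cloud server, under stated assumptions: the real system communicates over a network, and the database layout here is invented for illustration.

```python
# Minimal in-process simulation of the player / cloud-server flow.
# The face classification database lives on the (simulated) cloud server.
CLOUD_DB = {"actor_a": [(3, 5), (9, 10)]}   # face_id -> segment list

def cloud_handle_positioning_request(selected_face_id):
    """Cloud server side: look up the pre-stored face classification
    database and return the video information for the selected face."""
    segments = CLOUD_DB.get(selected_face_id, [])
    return {"face_id": selected_face_id, "segments": segments}

def player_locate(selected_face_id):
    """Video player side: forward the user's request to the cloud server
    and hand back (in a real player: display) the returned information."""
    return cloud_handle_positioning_request(selected_face_id)

print(player_locate("actor_a"))
# {'face_id': 'actor_a', 'segments': [(3, 5), (9, 10)]}
```

The player remains a thin client in this variant; all of the database storage and lookup work is on the server side.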
  • the method may further include: combining, according to the at least one piece of video segment information of the selected face image, the at least one video segment into a positioning video corresponding to the selected face image. Specifically, the corresponding video segments are extracted from the video and merged together to form the positioning video of the selected face image.
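Merging the located segments into a positioning video can be sketched as follows; segments are represented only by their start and end times, since actual media cutting and concatenation are outside the scope of this illustration.

```python
def build_positioning_video(segments):
    """Merge the located video segments, in time order, into one
    'positioning video' for the selected face image. A real player would
    cut and concatenate the underlying media for each (start, end) pair."""
    playlist = sorted(segments)                         # play in time order
    total = sum(end - start for start, end in playlist) # combined duration
    return {"playlist": playlist, "duration": total}

video = build_positioning_video([(9, 10), (3, 5)])
print(video)  # {'playlist': [(3, 5), (9, 10)], 'duration': 3}
```

The resulting playlist lets the user watch every appearance of the selected face back to back instead of seeking through the full video.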
  • the video processing method of the foregoing embodiment establishes a face classification database and, after receiving a video positioning request carrying the selected face image sent by the user, locates the video of the selected face image according to the face classification database with very high efficiency. The technical solution of the above embodiment also makes it convenient for the user to view all the performances, in the video, of the actor corresponding to the selected face image, providing a good user experience.
  • FIG. 3 is a flowchart of another embodiment of a video processing method according to an embodiment of the present invention.
  • the video processing method of this embodiment describes a usage scenario of the present invention on the basis of the technical solutions of the foregoing embodiments.
  • the video processing method in this embodiment may specifically include:
  • the video player decodes each frame of video in the video to obtain a set of images
  • the usage scenario of this embodiment is that the user uses the video positioning function on the video player side through the human interface module while there is no communication connection between the video player and the cloud server; the face classification database is established, and the video positioning request is handled according to that database, entirely on the video player, that is, the client side of the video playing system. This scenario is taken as an example to describe the technical solution of the present invention.
  • the video player performs face detection on each image in a group of images, and acquires a face and a PTS of the face in each image;
  • the video player generates a face time stamp database according to the face and the PTS of the face.
  • the video player classifies all the faces in the face time stamp database according to face identifiers, so that the faces belonging to the same person correspond to the same face identifier.
  • the video player estimates, according to the PTS of the face corresponding to each face identifier, each piece of video segment information of the face corresponding to that face identifier.
  • the video segment information includes a start time and an end time of the video segment.
  • the video player establishes a face classification database according to each piece of video segment information corresponding to each face identifier; the face classification database may include each face identifier and the pieces of video segment information corresponding to it in the video.
  • the video player arranges each face identifier in the face classification database in descending order of probability of occurrence in the video;
  • the video player displays, on its interface, the face images corresponding to the top N face identifiers in the face classification database, where N is an integer greater than or equal to 1. Displaying the top N face identifiers informs the user that these N faces are the important actors with a high probability of occurrence in the video, so the user can learn the main and supporting roles in the video.
  • the user selects a face image from the face images corresponding to the top N face identifiers through the human interface module and initiates a video positioning request.
  • here, selecting a face image from the face images corresponding to the top N face identifiers displayed on the video player interface is taken as an example of obtaining the selected face image; the selected face image can also be obtained by taking a photo or downloading from the Internet, which is not described here one by one.
  • the video player receives a video positioning request that is sent by the user and carries the selected face image.
  • the video player acquires video information corresponding to the selected face image from the pre-stored face classification database
  • the video information includes an identifier of the selected face image and at least one piece of video segment information of the selected face image.
  • the video information pre-stored in the face classification database may further include the face image corresponding to each face identifier. The video player may perform face recognition between the selected face image and each face image in the face classification database, for example using a feature value matching algorithm, to obtain from the face classification database the video information corresponding to the selected face image.
  • the video player displays the video information corresponding to the selected face image on the interface.
  • the user can click, according to the start time and end time displayed on the video player interface, the video segments corresponding to the video information of the selected face image, view all the video segments corresponding to the selected face image in the video, and thus see the performance of the corresponding actor in the video.
  • the video player combines at least one video segment into the positioning video corresponding to the selected face image according to at least one segment of the video segment information in the video information corresponding to the selected face image.
  • the face classification database is established on the video player side, and after the video player receives the video positioning request carrying the selected face image sent by the user, the video of the selected face image is located according to the face classification database, so the video positioning efficiency is very high.
  • the video processing method of this embodiment can compensate for the defect in the prior art that all the video segments of a certain face in a video cannot be located, which results in low video positioning efficiency; it implements the positioning of all the video information of a selected face image in the video, the video positioning efficiency is very high, and with this method the user can view all the performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
  • FIG. 4 is a flowchart of still another embodiment of a video processing method according to an embodiment of the present invention. As shown in FIG. 4, the video processing method in this embodiment may specifically include the following steps:
  • the video positioning request in this embodiment is sent by the user to the video player through the human interface module; the video processing method in this embodiment describes the technical solution of the present invention on the cloud server side.
  • the video information of this embodiment includes the identifier of the selected face image and at least one piece of video segment information of the selected face image; optionally, the selected face image itself may also be included in the video information.
  • after the cloud server obtains the video information corresponding to the selected face image, it sends that video information to the video player, and the video player can display the video information corresponding to the selected face image to the user on the interface; based on the displayed video information, the user can view all the video segments corresponding to the selected face image in the video, and can further judge from those video segments the acting of the actor corresponding to the selected face image.
  • the difference between this embodiment and the embodiment shown in FIG. 1 is that the embodiment shown in FIG. 1 describes the video processing scheme of the present invention taking as an example the case in which there is no communication connection between the video player, that is, the client, and the cloud server, and all video processing is implemented on the video player side.
  • the cloud server and the video player have a communication connection.
  • after the video player receives the video positioning request sent by the user through the human interface module, the cloud server can obtain the video information corresponding to the selected face image from the pre-stored face classification database, and finally send the video information corresponding to the selected face image to the video player, so that the video player displays it to the user. That is, the technical solution of the present invention is described taking as an example a communication connection between the video player and the cloud server. The implementation principle of each step is similar; for details, refer to the description of the embodiment shown in FIG. 1, not repeated here.
  • the video processing method of this embodiment receives the video positioning request carrying the selected face image sent by the video player, obtains the video information corresponding to the selected face image from the pre-stored face classification database, and sends it to the video player, so that the video player displays the video information corresponding to the selected face image to the user; the positioning of the video of the selected face image is thus realized according to the face classification database, and the video positioning efficiency is very high.
  • the video processing method of this embodiment can compensate for the defect in the prior art that all the video segments of a certain face in a video cannot be located, which results in low video positioning efficiency; it implements the positioning of all the video information of a selected face image in the video, the video positioning efficiency is very high, and with this method the user can view all the performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
  • the method further includes: establishing a face classification database. That is, in this embodiment the face classification database is established on the cloud server side; its structure and the information it contains are the same as those of the face classification database established on the video player side in the above embodiment, for details of which reference may be made to the description of the above embodiments, not repeated here.
  • the “establishing a face classification database” in the foregoing embodiment may specifically include the following steps:
  • steps (a)-(f) of this embodiment are implemented in the same way as steps (1)-(6) in the optional technical solutions following the embodiment shown in FIG. 1 for establishing the face classification database; for details, refer to the description of the above embodiments, not repeated here.
  • the method of “establishing a face classification database according to each piece of video segment information corresponding to each face identifier” may further include: arranging the face identifiers in the face classification database in descending order of their probability of occurrence in the video.
  • the method may further include: sending the first N face identifiers in the face classification database to the video player, so that the video player displays to the user the face images corresponding to those N face identifiers, from which the user selects the selected face image; alternatively, the selected face image may be input by the user through the human interface module.
  • the face classification database pre-stored on the cloud server side may also be established on the video player side and sent to the cloud server once a communication connection exists between the cloud server side and the video player side.
  • before acquiring the video information corresponding to the selected face image from the pre-stored face classification database, the method further includes: receiving the face classification database sent by the video player.
  • the video processing method of the foregoing embodiment establishes a face classification database on the cloud server side; after receiving a video positioning request carrying the selected face image sent by the video player, the cloud server locates the video of the selected face image according to the face classification database and returns the positioning result to the video player, which displays it to the user. The video positioning efficiency is very high, and with the technical solution of the above embodiment the user can view all the performances of the actor corresponding to the selected face image in the video, so the user experience is very good.
  • FIG. 5 is a flowchart of still another embodiment of a video processing method according to an embodiment of the present invention; this embodiment describes still another usage scenario of the present invention. As shown in FIG. 5, the video processing method in this embodiment may specifically include:
  • the video player decodes each frame of video in the video to obtain a set of images
  • the usage scenario of this embodiment is that when the user invokes the video positioning function on the video player side through the human interface module, there is no communication connection between the video player and the cloud server, so the face classification database is established on the client side of the video playback system, that is, in the video player; once the communication connection between the video player and the cloud server is restored, the face classification database established by the video player is sent to the cloud server, and the cloud server subsequently processes video positioning requests according to the face classification database. The technical solution of the present invention is described taking this case as an example.
  • the video player performs face detection on each image in the group of images, and acquires the faces in each image and the PTS of each face;
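The decode-and-scan steps above can be sketched as follows, with the decoder and face detector abstracted behind simple interfaces; in practice the detector might be, for example, an OpenCV Haar cascade, but that choice and the function names here are assumptions for illustration.

```python
def build_face_timestamps(frames, detect_faces):
    """Scan decoded frames and record, per frame, the detected faces and
    the frame's presentation time stamp (PTS).

    frames       -- iterable of (pts, image) pairs from the decoder
    detect_faces -- pluggable detector returning a list of face crops
    Returns a list of (face, pts) entries, i.e. the raw material for
    the face time stamp database."""
    records = []
    for pts, image in frames:
        for face in detect_faces(image):
            records.append((face, pts))
    return records
```

The resulting (face, PTS) records correspond to the face time stamp database generated in the next step.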
  • the video player generates a face time stamp database according to the faces and their PTS values;
  • the video player classifies all the faces in the face time stamp database according to face identifiers, so that faces belonging to the same person correspond to the same face identifier.
  • the video player estimates, according to the PTS of the faces corresponding to each face identifier, each piece of video segment information of the face corresponding to that face identifier.
  • the video segment information includes a start time and an end time of the video segment.
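One plausible way to estimate segment start and end times from the PTS values of a face's detections, as described in the two steps above, is to treat detections separated by no more than a small gap as one continuous segment; the gap threshold and function name are illustrative assumptions, not details fixed by this disclosure.

```python
def estimate_segments(pts_list, max_gap=2.0):
    """Estimate the video segments in which one face appears from the
    PTS values of its detections: consecutive detections no more than
    max_gap seconds apart are treated as one continuous segment.
    Returns a list of (start_time, end_time) pairs."""
    segments = []
    for pts in sorted(pts_list):
        if segments and pts - segments[-1][1] <= max_gap:
            # Still within the current segment: push its end time forward.
            segments[-1][1] = pts
        else:
            segments.append([pts, pts])
    return [(start, end) for start, end in segments]
```

Each returned pair supplies the start time and end time stored as one piece of video segment information.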
  • the video player establishes a face classification database according to each piece of video segment information corresponding to each face identifier;
  • the face classification database may include a face identifier and pieces of video segment information corresponding to the face identifier in the video.
  • the video player arranges each face identifier in the face classification database in descending order of probability of occurrence in the video;
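The descending-order arrangement above can be sketched by ranking each face identifier by its total on-screen time, as one proxy for its probability of occurrence in the video; the database layout assumed here (a face identifier mapped to its (start, end) segment pairs) is an illustration, not a mandated format.

```python
def rank_faces(face_db):
    """Order face identifiers by how much of the video they appear in,
    descending -- a proxy for the probability of occurrence used to
    rank main and supporting roles.

    face_db maps face_id -> list of (start, end) segment times."""
    def total_duration(face_id):
        return sum(end - start for start, end in face_db[face_id])
    return sorted(face_db, key=total_duration, reverse=True)
```

Taking the first N entries of the returned list then yields the face identifiers whose images are shown to the user.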
  • the video player may send the face classification database to the cloud server.
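Sending the face classification database to the cloud server implies some serialization format; a minimal sketch using JSON is shown below, with the understanding that the actual wire format is not specified in this disclosure and that the face images themselves would travel separately (e.g. as encoded thumbnails).

```python
import json

def serialize_face_db(face_db):
    """Video player side: serialize the face classification database
    (face_id -> list of (start, end) segments) to JSON for upload
    once a connection to the cloud server exists."""
    return json.dumps({fid: [list(seg) for seg in segs]
                       for fid, segs in face_db.items()})

def deserialize_face_db(payload):
    """Cloud server side: restore the database from the JSON payload,
    turning segment lists back into tuples."""
    return {fid: [tuple(seg) for seg in segs]
            for fid, segs in json.loads(payload).items()}
```

A round trip through these two helpers leaves the database unchanged, so the cloud server can answer later positioning requests from the same data the player built.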
  • in this way, video processing can be performed on the cloud server side to reduce the resource consumption of the video player client and improve video processing efficiency.
  • the cloud server sends, to the video player, the face images corresponding to the first N face identifiers in the face classification database, where N is an integer greater than or equal to 1;
  • the video player displays, on the interface, the face images corresponding to the first N face identifiers in the face classification database to the user;
  • the user can identify the main and supporting roles in the video from the displayed faces, and further select one face from them to initiate a video positioning request to view all the video segments of the selected face image in the video.
  • the user selects a face image from the face images corresponding to the N face identifiers through the human interface module, and initiates a video positioning request.
  • the video player receives a video location request that is sent by the user and carries the selected face image, and forwards the request to the cloud server.
  • the cloud server receives the video location request, and obtains video information corresponding to the selected face image from the pre-stored face classification database.
  • the video information includes an identifier of the selected face image and at least one piece of video segment information of the selected face image.
  • the video information corresponding to the selected face image pre-stored in the face classification database may further include the face image itself.
  • the cloud server may perform face recognition between the selected face image and each face image in the face classification database, for example by using a feature value matching algorithm, so that the video information corresponding to the selected face image is obtained from the face classification database.
  • the cloud server may send the video information corresponding to the selected face image to the video player; the video player displays the video information corresponding to the selected face image on the interface.
  • the user can click on each video segment corresponding to the video information according to the start time and end time of the selected face image displayed on the video player interface, view all the video segments corresponding to the selected face image in the video, and learn about the acting of the actor corresponding to the selected face image in the video.
  • the method may further include the following steps:
  • the cloud server merges at least one video segment into a positioning video corresponding to the selected face image according to at least one piece of video segment information in the corresponding video information of the selected face image.
  • alternatively, the cloud server may directly send the video information corresponding to the selected face image to the video player, and the video player merges, according to at least one piece of video segment information in the video information corresponding to the selected face image, at least one video segment into the positioning video corresponding to the selected face image.
  • the cloud server sends a positioning video to the video player.
  • the video player displays, on the interface, a positioning video corresponding to the selected face image to the user.
  • the positioning video is a set of all the video segments of the selected face image in the video.
  • after the video player displays the positioning video corresponding to the selected face image to the user on the interface, the user can view all the video segments corresponding to the selected face image in the video, and learn about the acting of the corresponding actor in the video.
  • the face classification database is established on the video player side, and when the cloud server and the video player have a communication connection, the video player sends the face classification database to the cloud server.
  • the subsequent video location request processing is performed on the cloud server side.
  • after the cloud server receives the video positioning request carrying the selected face image sent by the video player, the video of the selected face image is located according to the face classification database, and the video positioning efficiency is very high.
  • the video processing method of this embodiment can compensate for the defect in the prior art that all the video segments of a certain face in a video cannot be located, which results in low video positioning efficiency; the video positioning efficiency is very high, and with this method the user can view all the performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
  • FIG. 6 is a flowchart of still another embodiment of a video processing method according to an embodiment of the present invention.
  • the video processing method of this embodiment describes another usage scenario of the present invention on the basis of the technical solutions of the foregoing embodiments.
  • the video processing method in this embodiment may specifically include:
  • the cloud server decodes each frame of video in the video to obtain a set of images.
  • the usage scenario of this embodiment is that when the user invokes the video positioning function on the video player side through the human interface module, the video player has a communication connection with the cloud server, and the face classification database is established on the cloud server side; the technical solution of the present invention is described taking as an example the cloud server processing the video positioning request according to the face classification database.
  • the cloud server performs face detection on each image in the group of images, and acquires the faces in each image and the PTS of each face;
  • the cloud server generates a face time stamp database according to the faces and their PTS values.
  • the cloud server classifies all the faces in the face time stamp database according to face identifiers, so that faces belonging to the same person correspond to the same face identifier.
  • the cloud server estimates, according to the PTS of the faces corresponding to each face identifier, each piece of video segment information of the face corresponding to that face identifier.
  • the video segment information includes a start time and an end time of the video segment.
  • the cloud server establishes a face classification database according to pieces of video segment information corresponding to each face identifier.
  • the face classification database may include a face identifier and pieces of video segment information corresponding to the face identifier in the video.
  • the cloud server arranges each face identifier in the face classification database in descending order of probability of occurrence in the video;
  • in this way, video processing can be performed on the cloud server side to reduce the resource consumption of the video player client and improve video processing efficiency.
  • the cloud server sends, to the video player, the face images corresponding to the first N face identifiers in the face classification database, where N is an integer greater than or equal to 1;
  • the video player displays, on the interface, the face images corresponding to the first N face identifiers in the face classification database.
  • the user can identify the main and supporting roles in the video from the displayed faces, and further select one face from them to initiate a video positioning request to view all the video segments of the selected face image in the video.
  • the user selects a face image from the face images corresponding to the N face identifiers through the human interface module, and initiates a video positioning request.
  • alternatively, the user can input the selected face image through the human interface module by taking a photo or downloading an image, and initiate a video positioning request.
  • the video player receives a video location request that is sent by the user and carries the selected face image, and forwards the request to the cloud server.
  • the cloud server receives the video location request, and obtains video information corresponding to the selected face image from the pre-stored face classification database.
  • the video information includes an identifier of the selected face image and at least one piece of video segment information of the selected face image.
  • the video information corresponding to the selected face image pre-stored in the face classification database may further include the face image itself.
  • the cloud server may send the video information corresponding to the selected face image to the video player; the video player displays the video information corresponding to the selected face image on the interface.
  • the user can click on each video segment corresponding to the video information according to the start time and end time of the selected face image displayed on the video player interface, view all the video segments corresponding to the selected face image in the video, and learn about the acting of the actor corresponding to the selected face image in the video.
  • the method may further include the following steps:
  • the cloud server merges at least one video segment into a positioning video corresponding to the selected face image according to at least one piece of video segment information in the corresponding video information of the selected face image.
  • the cloud server sends a positioning video to the video player.
  • the video player displays, on the interface, a positioning video corresponding to the selected face image to the user.
  • the positioning video is a set of all the video segments of the selected face image in the video.
  • after the video player displays the positioning video corresponding to the selected face image to the user on the interface, the user can view all the video segments corresponding to the selected face image in the video, and learn about the acting of the corresponding actor in the video.
  • in this embodiment, the face classification database is established on the cloud server side, and subsequent video positioning requests are also processed on the cloud server side; that is, after the cloud server receives the video positioning request carrying the selected face image sent by the video player, the video of the selected face image is located according to the face classification database, and the video positioning efficiency is very high.
  • the video processing method of this embodiment can compensate for the defect in the prior art that all the video segments of a certain face in a video cannot be located, which results in low video positioning efficiency; it implements the positioning of all the video information of a selected face image in the video, the video positioning efficiency is very high, and with this method the user can view all the performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
  • FIG. 7 is a schematic structural diagram of an embodiment of a video player according to an embodiment of the present invention. As shown in FIG. 7, the video player of this embodiment may specifically include: a receiving module 10, an obtaining module 11, and a display module 12.
  • the receiving module 10 is configured to receive a video positioning request that is sent by the user through the human interface module and that carries the selected face image.
  • the acquiring module 11 is connected to the receiving module 10 and is configured to acquire, in the video, the video information corresponding to the selected face image carried in the video positioning request received by the receiving module 10; the video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image;
  • the display module 12 is connected to the acquiring module 11 and is configured to display the video information corresponding to the selected face image acquired by the acquiring module 11.
  • by using the above modules, the video player of this embodiment implements the same mechanism as the method embodiment shown in FIG. 1; for details, refer to the description of the embodiment shown in FIG. 1, not repeated here.
  • the video player of this embodiment receives a video positioning request carrying the selected face image sent by the user through the human interface module, obtains the video information corresponding to the selected face image in the video, and displays the video information corresponding to the selected face image.
  • with the technical solution of this embodiment, it is possible to make up for the defect that all the video segments of a certain face in a video cannot be located, which results in low video positioning efficiency; the positioning of all the video information of a selected face image in the video is implemented with very high efficiency, and the user can view all the performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
  • FIG. 8 is a schematic structural diagram of another embodiment of a video player according to an embodiment of the present invention. As shown in FIG. 8, the video player of the present embodiment further describes the technical solution of the present invention in more detail on the basis of the technical solution of the embodiment shown in FIG.
  • the obtaining module 11 in the video player of the embodiment is specifically configured to obtain video information corresponding to the selected face image from the pre-stored face classification database.
  • the video player of the embodiment further includes an establishing module 13 for establishing a face classification database.
  • the obtaining module 11 is connected to the establishing module 13 , and the obtaining module 11 is specifically configured to obtain the video information corresponding to the selected face image from the face classification database established by the establishing module 13 .
  • the establishing module 13 may specifically include: a decoding unit 131, a face detecting unit 132, a face time stamp database generating unit 133, a categorizing unit 134, an estimating unit 135, and a face classification database generating unit 136.
  • the decoding unit 131 is configured to decode each frame of video in the video to obtain a set of images; the face detecting unit 132 is connected to the decoding unit 131 and is configured to perform face detection on each image in the set of images obtained by the decoding unit 131, acquiring the faces in each image and the PTS of each face; the face time stamp database generating unit 133 is connected to the face detecting unit 132 and is configured to generate a face time stamp database from the faces and face PTS values detected by the face detecting unit 132; the categorizing unit 134 is connected to the face time stamp database generating unit 133 and is configured to classify all the faces in the face time stamp database generated by the generating unit 133 according to face identifiers, so that faces belonging to the same person correspond to the same face identifier; the estimating unit 135 is connected to the categorizing unit 134 and is configured to estimate, according to the PTS of the faces corresponding to each face identifier after classification by the categorizing unit 134, each piece of video segment information of the face corresponding to that face identifier, where the video segment information includes the start and end times of the video segment; the face classification database generating unit 136 is connected to the estimating unit 135 and is configured to establish a face classification database according to each piece of video segment information corresponding to each face identifier obtained by the estimating unit 135.
  • the establishing module 13 in the video player of this embodiment further includes a sorting unit 137, which is connected to the face classification database generating unit 136 and is configured to arrange each face identifier in the face classification database generated by the face classification database generating unit 136 in descending order of probability of occurrence in the video.
  • the obtaining module 11 is connected to the face classification database generating unit 136, and the obtaining module 11 is specifically configured to obtain the video information corresponding to the selected face image from the face classification database established by the face classification database generating unit 136.
  • the display module 12 of the video player of this embodiment is further connected to the face classification database generating unit 136 and is configured to display the face images corresponding to the first N face identifiers in the sorted face classification database, where N is an integer greater than or equal to 1; further, the selected face image is selected by the user from the face images corresponding to the N face identifiers, or is input by the user through the human interface module.
  • the video player of this embodiment further includes: a merging module 14.
  • the merging module 14 is connected to the face classification database generating unit 136 and is configured to merge, according to at least one piece of video segment information of the selected face image in the face classification database generated by the face classification database generating unit 136, at least one video segment into the positioning video corresponding to the selected face image.
  • the above technical solution establishes a face classification database on the video player side and performs video processing according to the video positioning request carrying the selected face image sent by the user.
  • by using the above modules, the video player of this embodiment implements the same mechanism as the method embodiment shown in FIG. 3; for details, refer to the description of the embodiment shown in FIG. 3, not repeated here.
  • by using the above modules, the video player of this embodiment establishes the face classification database and, after receiving the video positioning request carrying the selected face image sent by the user, locates the video of the selected face image according to the face classification database; the video positioning efficiency is very high. With the technical solution of this embodiment, it is possible to make up for the defect that all the video segments of a certain face in a video cannot be located, which results in low video positioning efficiency; the positioning of all the video information of a selected face image in the video is implemented with very high efficiency, and the user can view all the performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
  • FIG. 9 is a schematic structural diagram of still another embodiment of a video player according to an embodiment of the present invention. As shown in FIG. 9, the video player of the present embodiment further describes the technical solution of the present invention in more detail on the basis of the technical solution of the embodiment shown in FIG.
  • the video player of the embodiment further includes a sending module 15 .
  • the sending module 15 is connected to the face classification database generating unit 136 and configured to send the face classification database generated by the face classification database generating unit 136 to the cloud server.
  • the sending module 15 of the video player of the embodiment is further connected to the receiving module 10, and the sending module 15 is further configured to send, to the cloud server, the video positioning request that is received by the receiving module 10 and carries the selected face image;
  • the receiving module 10 is further configured to receive the video information sent by the cloud server, where the video information is obtained by the cloud server from the face classification database pre-stored in the cloud server according to the selected face image.
  • the merging module 14 is further connected to the receiving module 10 and is configured to merge, according to at least one piece of video segment information of the selected face image in the video information received by the receiving module 10, at least one video segment into the positioning video corresponding to the selected face image.
• In the video player of this embodiment, the face classification database is established on the video player side and sent to the cloud server; after the video player receives the video positioning request carrying the selected face image, it sends the request to the cloud server, and the cloud server performs the video processing according to the video positioning request carrying the selected face image.
• The video player of this embodiment uses the above modules to implement the same mechanism as the method embodiment shown in FIG. 5; for details, reference may be made to the description of the embodiment shown in FIG. 5, which is not repeated here.
• With the above modules, the face classification database is established on the video player side and, when a communication connection exists between the cloud server and the video player, the video player sends the face classification database to the cloud server; subsequent video positioning requests are processed on the cloud server side, i.e., after the cloud server receives the video positioning request carrying the selected face image sent by the video player, the cloud server locates the video of the selected face image according to the face classification database, and the video-positioning efficiency is very high.
• The technical solution of this embodiment remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
• FIG. 10 is a schematic structural diagram of yet another embodiment of a video player according to an embodiment of the present invention. As shown in FIG. 10, the video player of this embodiment further includes the following technical solutions on the basis of the technical solution of the embodiment shown in FIG.
• The video player of this embodiment also includes the sending module 15.
• The sending module 15 is connected to the receiving module 10 and is further configured to send to the cloud server the video positioning request carrying the selected face image that is received by the receiving module 10.
• The receiving module 10 is further configured to receive the video information sent by the cloud server.
• The video information is acquired by the cloud server, according to the selected face image, from the face classification database pre-stored in the cloud server.
• The merging module 14 is connected to the acquiring module 11 and is configured to merge, according to the at least one piece of video segment information of the selected face image in the video information acquired by the acquiring module 11, the at least one video segment into the positioning video corresponding to the selected face image. The merging may also be configured on the cloud server side, in which case the acquiring module 11 may directly receive the positioning video corresponding to the selected face image sent by the cloud server.
• Relative to the above-described embodiment shown in FIG., some modules are omitted from the video player of this embodiment.
• In this embodiment, the face classification database is established on the cloud server side; after the video player receives the video positioning request carrying the selected face image, it sends the request to the cloud server, and the cloud server performs the video processing according to the video positioning request carrying the selected face image.
• The video player of this embodiment implements the video processing with the foregoing modules; for details, reference may be made to the description of the foregoing related method embodiments, which is not repeated here.
• After the video player of this embodiment receives the video positioning request carrying the selected face image, it sends the request to the cloud server, and the cloud server performs the video processing according to the video positioning request carrying the selected face image; the video-positioning efficiency is very high.
• The technical solution of this embodiment makes it convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
• FIG. 11 is a schematic structural diagram of an embodiment of a cloud server according to an embodiment of the present invention.
• The cloud server of this embodiment includes a receiving module 20, an acquiring module 21 and a sending module 22.
• The receiving module 20 is configured to receive a video positioning request carrying a selected face image sent by the video player; the video positioning request is received by the video player from a user through a human-machine interface module. The acquiring module 21 is connected to the receiving module 20 and is configured to acquire, from the pre-stored face classification database, the video information corresponding to the selected face image received by the receiving module 20; the video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image. The sending module 22 is configured to send to the video player the video information corresponding to the selected face image acquired by the acquiring module 21, so that the video player displays the video information corresponding to the selected face image to the user.
• The cloud server of this embodiment implements the video processing in the same way as the foregoing method embodiment shown in FIG. 4.
• With the foregoing modules, the cloud server of this embodiment receives the video positioning request carrying the selected face image sent by the video player, acquires the video information corresponding to the selected face image from the pre-stored face classification database, and sends that video information to the video player so that the video player displays it to the user, thereby locating the video of the selected face image according to the face classification database; the video-positioning efficiency is very high.
• The video processing method of this embodiment remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and the method makes it convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
• FIG. 12 is a schematic structural diagram of another embodiment of a cloud server according to an embodiment of the present invention. As shown in FIG. 12, the cloud server of this embodiment further elaborates the technical solution of the present invention on the basis of the technical solution of the embodiment shown in FIG.
• The cloud server of this embodiment further includes an establishing module 23 configured to establish the face classification database.
• The acquiring module 21 is further connected to the establishing module 23 and is configured to acquire, from the face classification database established by the establishing module 23, the video information corresponding to the selected face image received by the receiving module 20.
• The establishing module 23 includes a decoding unit 231, a face detecting unit 232, a face timestamp database generating unit 233, a categorizing unit 234, an estimating unit 235 and a face classification database generating unit 236.
• The decoding unit 231 is configured to decode each frame of the video to obtain a set of images. The face detecting unit 232 is connected to the decoding unit 231 and is configured to perform face detection on each image in the set obtained by the decoding unit 231, and to acquire the face in each image and the PTS of the face. The face timestamp database generating unit 233 is connected to the face detecting unit 232 and is configured to generate a face timestamp database from the faces detected by the face detecting unit 232 and the PTS of the faces. The categorizing unit 234 is connected to the face timestamp database generating unit 233 and is configured to classify all faces in the face timestamp database generated by the face timestamp database generating unit 233 by face identifier, so that faces belonging to the same person correspond to the same face identifier. The estimating unit 235 is connected to the categorizing unit 234 and is configured to estimate, from the PTS of the face corresponding to each face identifier classified by the categorizing unit 234, each piece of video segment information of that face; the video segment information includes the start and end time of the video segment. The face classification database generating unit 236 is connected to the estimating unit 235 and is configured to establish the face classification database from each piece of video segment information corresponding to each face identifier obtained by the estimating unit 235.
• The establishing module 23 in the cloud server of this embodiment further includes a sorting unit 237 connected to the face classification database generating unit 236; the sorting unit 237 is configured to arrange the face identifiers in the face classification database generated by the face classification database generating unit 236 in descending order of their probability of occurrence in the video.
• The acquiring module 21 is further connected to the face classification database generating unit 236 and is configured to acquire, from the face classification database established by the face classification database generating unit 236, the video information corresponding to the selected face image received by the receiving module 20.
• The sending module 22 in the cloud server of this embodiment is further configured to send the top N face identifiers in the face classification database to the video player, so that the video player displays the top N face identifiers to the user, N being an integer greater than or equal to 1.
• The selected face image in the video positioning request received by the receiving module 20 may be selected by the user from the face images corresponding to the N face identifiers; alternatively, the selected face image may be input by the user through the human-machine interface module.
• The cloud server of this embodiment establishes the face classification database on the cloud server side and, after receiving the video positioning request carrying the selected face image sent by the video player, performs the video processing according to that video positioning request.
• The cloud server of this embodiment implements the video processing in the same way as the method embodiment shown in FIG. 6.
• The receiving module 20 in the cloud server of this embodiment is further configured to receive the face classification database sent by the video player.
• With the above modules, the cloud server of this embodiment maintains the face classification database on the cloud server side, and subsequent video positioning requests are processed there: after the cloud server receives the video positioning request carrying the selected face image sent by the video player, the video of the selected face image is located according to the face classification database, and the video-positioning efficiency is very high.
• The technical solution of this embodiment remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
• FIG. 13 is a schematic structural diagram of an embodiment of a video playing system according to an embodiment of the present invention.
• The video playing system of this embodiment includes a video player 30 and a cloud server 40.
• The video player 30 and the cloud server 40 are communicatively connected.
• The video player 30 of this embodiment may be implemented as the video player shown in FIG., the cloud server 40 corresponds to the cloud server shown in FIG. 11, and the video processing method of the embodiment shown in FIG. 5 may specifically be used to implement the video processing.
• Alternatively, the video player 30 may adopt the video player of the embodiment shown in FIG. 10 and the cloud server 40 may adopt the cloud server shown in FIG. 12, and the video processing method of the embodiment shown in FIG. 6 may specifically be used to implement the video processing.
• The video player 30 and the cloud server 40 can locate the video of the selected face image according to the face classification database, and the video-positioning efficiency is very high.
• The technical solution of this embodiment remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
• The aforementioned program may be stored in a computer-readable storage medium.
• When executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk or an optical disk.
• The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over at least two network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solutions of the embodiments. A person of ordinary skill in the art can understand and implement them without creative effort.
• The video processing method and system, the video player and the cloud server of the present invention receive a video positioning request carrying a selected face image sent by a user through a human-machine interface module, acquire the video information corresponding, in the video, to the selected face image in the video positioning request, and display the video information corresponding to the selected face image.
• The technical solution of the present invention remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and the technical solution of the present invention makes it convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is very good.


Abstract

A video processing method and system, a video player and a cloud server. The video processing method includes: receiving a video positioning request carrying a selected face image sent by a user through a human-machine interface module; acquiring the video information corresponding, in a video, to the selected face image in the video positioning request, the video information including an identifier of the selected face image and at least one piece of video segment information of the selected face image; and displaying the video information corresponding to the selected face image. The method and system remedy the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, improving the user experience.

Description

Video processing method and system, video player and cloud server

Cross-Reference

This application claims the benefit of Chinese Patent Application No. 2015107020937, entitled "Video processing method and system, video player and cloud server", filed on October 26, 2015, which is incorporated herein by reference in its entirety.
Technical Field

The present invention relates to the technical field of video processing, and in particular to a video processing method and system, a video player and a cloud server.
Background

In recent years, with the development of technology, a wide variety of videos have emerged to provide users with richer cultural and entertainment services. For convenient viewing, a user can watch video programs of interest by downloading them or watching them online through a terminal such as a computer or a mobile phone.

In the prior art, as video programs have become more numerous, some clients provide video thumbnails so that users can quickly browse the approximate pictures of each time period of a video; through the thumbnails, a user can learn in advance what each time period of the video looks like. However, when the video is long there are many thumbnails, making it difficult for the user to quickly locate the video segment of interest, which may give the viewer a poor user experience. To help users quickly locate a segment of interest, some clients also provide plot hints for some time periods; by combining the thumbnails with the plot hints, a user can locate a video segment of interest more quickly.

However, in the course of implementing the present invention, the inventors found that in the prior art the user must combine video thumbnails and plot hints and perform manual operations to locate a video segment of interest, so the video-positioning efficiency is low.
Summary

Embodiments of the present invention provide a video processing method and system, a video player and a cloud server, to overcome the prior-art defect of low video-positioning efficiency, so that all video segments of a certain face in a video can be located and the video-positioning efficiency is improved.
An embodiment of the present invention provides a video processing method, the method including:

receiving a video positioning request carrying a selected face image sent by a user through a human-machine interface module;

acquiring the video information corresponding, in a video, to the selected face image in the video positioning request, the video information including an identifier of the selected face image and at least one piece of video segment information of the selected face image; and

displaying the video information corresponding to the selected face image.
An embodiment of the present invention further provides a video processing method, the method including:

receiving a video positioning request carrying a selected face image sent by a video player, the video positioning request being received by the video player from a user through a human-machine interface module;

acquiring, from a pre-stored face classification database, the video information corresponding to the selected face image, the video information including an identifier of the selected face image and at least one piece of video segment information of the selected face image; and

sending the video information corresponding to the selected face image to the video player, so that the video player displays the video information corresponding to the selected face image to the user.
An embodiment of the present invention further provides a video player, including:

a receiving module configured to receive a video positioning request carrying a selected face image sent by a user through a human-machine interface module;

an acquiring module configured to acquire the video information corresponding, in a video, to the selected face image in the video positioning request, the video information including an identifier of the selected face image and at least one piece of video segment information of the selected face image; and

a display module configured to display the video information corresponding to the selected face image.
An embodiment of the present invention further provides a cloud server, including:

a receiving module configured to receive a video positioning request carrying a selected face image sent by a video player, the video positioning request being received by the video player from a user through a human-machine interface module;

an acquiring module configured to acquire, from a pre-stored face classification database, the video information corresponding to the selected face image, the video information including an identifier of the selected face image and at least one piece of video segment information of the selected face image; and

a sending module configured to send the video information corresponding to the selected face image to the video player, so that the video player displays the video information corresponding to the selected face image to the user. An embodiment of the present invention further provides a video playing system, including a video player and a cloud server that are communicatively connected, the video player being the video player described above and the cloud server being the cloud server described above.
With the video processing method and system, the video player and the cloud server of the embodiments of the present invention, a video positioning request carrying a selected face image sent by a user through a human-machine interface module is received, the video information corresponding to the selected face image in the video is acquired, and the video information corresponding to the selected face image is displayed. The technical solutions of the embodiments of the present invention remedy the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is very good.
Brief Description of the Drawings

To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an embodiment of a video processing method according to an embodiment of the present invention;
FIG. 2 is a PTS distribution diagram of the face corresponding to a certain face identifier in an embodiment of the present invention;
FIG. 3 is a flowchart of another embodiment of a video processing method according to an embodiment of the present invention;
FIG. 4 is a flowchart of still another embodiment of a video processing method according to an embodiment of the present invention;
FIG. 5 is a flowchart of yet another embodiment of a video processing method according to an embodiment of the present invention;
FIG. 6 is a flowchart of a further embodiment of a video processing method according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an embodiment of a video player according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another embodiment of a video player according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of still another embodiment of a video player according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of yet another embodiment of a video player according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of an embodiment of a cloud server according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of another embodiment of a cloud server according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of an embodiment of a video playing system according to an embodiment of the present invention.
Detailed Description

To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a flowchart of an embodiment of the video processing method of the present invention. As shown in FIG. 1, the video processing method of this embodiment may specifically include the following steps:

100. Receive a video positioning request carrying a selected face image sent by a user through a human-machine interface module.

This embodiment describes the technical solution of the present invention on the video player side; the video player is the client of the video processing system. The video player may be installed on a mobile terminal such as a mobile phone or a tablet computer, or on an ordinary (non-mobile) terminal such as a computer. Specifically, the client interacts with the user: the video player receives the video positioning request carrying the selected face image sent by the user through the human-machine interface module, where the human-machine interface module may be a keyboard, a stylus, or the information detecting and receiving module of a touch screen, and so on. For example, when the user selects a face on the touch screen with a finger or a stylus and taps the button for sending a video positioning request, the information detecting and receiving module of the touch screen detects the video positioning request issued by the user and obtains the selected face image carried in it. For example, the selected face image in this embodiment may be a clear face photograph of a certain actor in the video selected by the user, or the actor's face in a screenshot of the video. In any case, the face included in the selected face image must be clear enough to be recognizable.
101. Acquire the video information corresponding, in the video, to the selected face image in the video positioning request.

The video information of this embodiment includes the identifier of the selected face image and at least one piece of video segment information corresponding to the selected face image in the video, and may further include the selected face image itself. Since a video is formed by concatenating the video segments of individual actors, in this embodiment all video information corresponding to the selected face image in the video positioning request can be acquired, where each piece of video information may include the identifier of the selected face image and at least one piece of video segment information. The identifier of the selected face image uniquely identifies the selected face image in the video and may be the name or stage name of the actor corresponding to the selected face image; when that name or stage name is not unique in the video, another identification (ID) may be used to uniquely identify the selected face image. A video segment is a clip of the video in which the selected face image appears; each clip in which it appears is one video segment, and the at least one piece of video segment information covers all clips of the video in which the selected face image appears. For example, the at least one piece of video segment information of this embodiment may include the start and end time of each video segment, i.e., the time at which the segment begins and the time at which it ends.
102. Display the video information corresponding to the selected face image.

For example, the video information corresponding to the selected face image may be displayed on the interface of the video player, which completes the positioning of the video of the selected face image. Based on the displayed video information of the selected face image, the user can choose to watch the located video of the selected face image on the video player. The video processing method of this embodiment is applicable, for example, to locating all video information of any actor in a video program, making it convenient for the user to watch all performances of that actor in the video.

With the video processing method of this embodiment, a video positioning request carrying a selected face image sent by a user through a human-machine interface module is received, the video information corresponding to the selected face image in the video is acquired, and the video information corresponding to the selected face image is displayed. The method remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image, so the user experience is also very good.
Further optionally, on the basis of the technical solution of the above embodiment, step 101, "acquiring the video information corresponding, in the video, to the selected face image in the video positioning request", may specifically include: acquiring the video information corresponding to the selected face image from a pre-stored face classification database.

Specifically, in this embodiment the face classification database is pre-stored on the video player side, i.e., the client side of the video playing system. In this way, when there is no network connection between the video player and the cloud server, the video player can still perform the video processing of this embodiment on its own.

Further optionally, before "acquiring the video information corresponding to the selected face image from the pre-stored face classification database" in the above embodiment, the video processing method of this embodiment may further include: establishing the face classification database. For example, the face classification database may include multiple face identifiers and, for each face identifier, the video information corresponding to that face in the video; for example, the video information may include the start and end time of each video segment of that face in the video.
Further optionally, "establishing the face classification database" in the above embodiment may specifically include the following steps:

(1) Decode each frame of the video to obtain a set of images.

A video is formed by concatenating images frame by frame, and decoding each frame yields the corresponding image; in this embodiment, the decoded images are taken to be RGB images as an example. Decoding all frames of the video yields a set of RGB images.

(2) Perform face detection on each image in the set, and acquire the face in each image and the presentation time stamp (PTS) of the face.

A face detection algorithm is applied to each RGB image in the set obtained in step (1). When a face is detected in an RGB image, the face in that image and the PTS of that image in the video playback are acquired.

(3) Generate a face timestamp database from the faces and the PTS of the faces.

A face timestamp database is generated from the faces detected in step (2) and the PTS of each face; that is, the database includes the faces and the PTS of each face in the video. The face timestamp database is time-based and stores, for each moment, the faces detected in the images that contain faces. Because a video is long, there would be too many decoded images: for a duration of 90 minutes at a frame rate of 30, a total of 90 × 60 × 30 = 162000 images would have to be examined. Such a computation load imposes a heavy computational burden and a heavy storage burden on the face timestamp database. In practice, considering that the picture changes little over a short time, the sampling frequency can be changed when performing face detection in step (2); for example, if one image is scanned for faces every 10 frames, only 3 images per second need to be scanned, i.e., only 90 × 60 × 3 = 16200 images in total.
(4) Classify all faces in the face timestamp database by face identifier, so that faces belonging to the same person correspond to the same face identifier.

Specifically, the faces in the face timestamp database obtained in step (3) may include the faces of many actors, some of which are the faces of one actor at different PTS. In this step, the faces can be classified by face identifier, for example by recognizing the faces in the face timestamp database in order of PTS from front to back. A face identifier can be set for the first face; it may be entered by the user through the human-machine interface module and may be, for example, the name or stage name of the actor corresponding to the face, or another face ID, and the face identifier, the face and the PTS of the face are stored. Then, still in PTS order, the second face in the database is recognized, and a feature-value matching algorithm is used to determine whether this face and an already stored face belong to the same person. If so, the identifier of this face is set to the stored face identifier, so that faces belonging to the same person correspond to the same face identifier; if not, a new face identifier is set. By analogy, all faces in the face timestamp database can be classified by face identifier, so that faces belonging to the same person correspond to the same face identifier.
(5) Estimate, from the PTS of the face corresponding to each face identifier, each piece of video segment information of that face; the video segment information includes the start and end time of the video segment.

After the processing of step (4), all faces in the face timestamp database have been classified by face identifier. Next, in this embodiment, the consecutive PTS corresponding to a face identifier can be determined from the PTS of the face corresponding to that identifier. Since a video segment of a face requires the face to appear at consecutive PTS, the consecutive video segments of the face can be determined from the consecutive PTS corresponding to its identifier, and thus each piece of video segment information of the face, i.e., the start and end time of each segment, can be estimated. For example, FIG. 2 is a PTS distribution diagram of the face corresponding to a certain face identifier in an embodiment of the present invention, where the horizontal axis is the PTS and the vertical axis is the probability that the face appears, 0 meaning absent and 1 meaning present. As can be seen from FIG. 2, the periods formed by the PTS of the densest points with vertical value 1, such as the period from 3 to 5, can be considered to satisfy the condition that the face appears. With a segmentation algorithm, the points with vertical value 1 in FIG. 2 can be divided into several segments, each representing one video clip in which the actor corresponding to the face appears intensively. In addition, segments with few PTS points, i.e., extremely short video clips, can be discarded. For example, the face distribution diagram of FIG. 2 yields the video segment information shown in Table 1 below.
Table 1
Segment    Start-End Time
1          3s-5s
2          8s-9s
(6) Establish the face classification database from each piece of video segment information corresponding to each face identifier.

From each face identifier obtained above and the video segment information corresponding to each face identifier, the face classification database is established; the database includes each face identifier and the start and end time, in the video, of each video segment of the face corresponding to that identifier. This makes it very convenient to perform video positioning for every face in the video based on the face classification database.

For example, the core structure of the face classification database of this embodiment may be expressed as follows:
typedef struct _humanFaceData
{
    int face_id;              // ID of the face
    char *face_name;          // name of the person the face belongs to
    double **face_timestamp;  // start and end time of each video segment
    int number_appear;        // number of video segments
    float percent_appear;     // probability of the face appearing
} humanFaceData;

typedef struct _humanFaceDataSet
{
    int number_face;                 // number of valid faces, <= N
    humanFaceData *human_face_data;  // segment data corresponding to all faces
    int SOURCE_ID;                   // data source: cloud server side or video player (client) side
} humanFaceDataSet;
This embodiment describes the technical solution of the present invention on the video player side, i.e., the client side of the video playing system. In practical applications, the face classification database may also reside on the cloud server side, as described in subsequent embodiments.
Further optionally, on the basis of the technical solution of the above embodiment, after the step "establishing the face classification database from each piece of video segment information corresponding to each face identifier", the method may further include: arranging the face identifiers in the face classification database in descending order of their probability of occurrence in the video.

Specifically, arranging the face identifiers in the face classification database in descending order of their probability of occurrence in the video yields a probability distribution table of the faces corresponding to the identifiers, from which the leading and supporting roles of the video can be determined directly. Optionally, faces that appear rarely can be discarded according to the probability of occurrence of the face corresponding to each face identifier; for example, faces with very small probability are likely extras, whom a user is very unlikely to try to locate, so such faces can be discarded to save storage space in the face classification database.
Further optionally, after the step "arranging the face identifiers in the face classification database in descending order of their probability of occurrence in the video" and before step 100, "receiving a video positioning request carrying a selected face image sent by a user through a human-machine interface module", the method may further include: displaying the face images corresponding to the top N face identifiers in the face classification database, N being an integer greater than or equal to 1.

The top N here refers to the N face identifiers with the highest probability of occurrence in the video; they correspond to the more important roles in the video, and the actors of important roles are more likely to be the target of positioning by the user. Therefore, the video player may display the face image corresponding to each of the top N face identifiers in the face classification database, so that the user can select one of the N faces as the selected face image to locate its video. Accordingly, the selected face image in step 100 may be selected by the user from the face images corresponding to the N face identifiers; specifically, the user may initiate a video positioning request by choosing one of the N faces through the human-machine interface module. Alternatively, the selected face image in step 100 may also be input by the user through the human-machine interface module: for example, if the user knows that a certain actor appears in the video and wants to locate all of that actor's video segments, the user may download from the network an image containing the actor's face and initiate a video positioning request with it, or may take a photograph containing the actor's face and initiate the request.
All of the schemes of the above embodiments establish the face classification database and perform the video processing on the client side of the video playing system, i.e., on the video player side. Such a scheme is needed when the client cannot connect to the cloud server; the functional modules that establish the face classification database can be deployed in the engine of the video player, with corresponding interfaces provided at the native layer and the Java layer for the video player to call when performing the corresponding functions locally.

It should be noted that placing the face classification database on the video player side and executing the corresponding functions there consumes a large amount of resources. Therefore, optionally, after the step "establishing the face classification database" of the above embodiment, once a communication connection is established between the video player and the cloud server, the face classification database may be sent to the cloud server, so that the cloud server stores the face classification database and, for subsequent video positioning requests, locates the video information of a selected face image on the cloud server side.
For example, further optionally, step 101 of the above embodiment, "acquiring the video information corresponding, in the video, to the selected face image in the video positioning request", may specifically include the following steps:

(A) sending the video positioning request carrying the selected face image to the cloud server;

(B) receiving the video information sent by the cloud server, the video information being acquired by the cloud server, according to the selected face image, from the face classification database pre-stored in the cloud server.

This embodiment takes handling the video positioning request on the cloud server side as an example. After the video player receives the video positioning request carrying the selected face image sent by the user through the human-machine interface module, the video player sends the request to the cloud server. The cloud server then acquires the video information corresponding to the selected face image from the face classification database pre-stored on its side and sends it to the video player; correspondingly, the video player receives the video information sent by the cloud server.
On the basis of the technical solution of the above embodiment, optionally, after step 102, "displaying the video information corresponding to the selected face image", the method may further include: merging, according to the at least one piece of video segment information of the selected face image, the at least one video segment into a positioning video corresponding to the selected face image.

For example, specifically, according to the start time and end time of each video segment in the at least one piece of video segment information, the corresponding video segments are extracted from the video and merged together to form the positioning video corresponding to the selected face image.
The various optional schemes in the above embodiments may be combined in any feasible manner to form optional embodiments of the present invention, which are not described one by one here.

With the video processing method of the above embodiments, a face classification database is established, and after a video positioning request carrying a selected face image sent by the user is received, the video of the selected face image is located according to the face classification database; the video-positioning efficiency is very high, and the technical solutions of the above embodiments make it convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is very good.
FIG. 3 is a flowchart of another embodiment of the video processing method according to an embodiment of the present invention. As shown in FIG. 3, the video processing method of this embodiment describes one usage scenario of the present invention on the basis of the technical solutions of the above embodiments, and may specifically include:

200. The video player decodes each frame of the video to obtain a set of images.

The usage scenario of this embodiment is that, when the user uses the video positioning function on the video player side through the human-machine interface module, there is no communication connection between the video player and the cloud server; both the establishment of the face classification database and the handling of video positioning requests based on it are performed on the video player side, i.e., the client side of the video playing system. The technical solution of the present invention is described with this case as an example.
201. The video player performs face detection on each image in the set, and acquires the face in each image and the PTS of the face.

202. The video player generates a face timestamp database from the faces and the PTS of the faces.

203. The video player classifies all faces in the face timestamp database by face identifier, so that faces belonging to the same person correspond to the same face identifier.

204. The video player estimates, from the PTS of the face corresponding to each face identifier, each piece of video segment information of that face.

For example, the video segment information includes the start time and end time of the video segment.

205. The video player establishes the face classification database from each piece of video segment information corresponding to each face identifier.

The face classification database may include the face identifiers and the video segment information corresponding to each face identifier in the video.

206. The video player arranges the face identifiers in the face classification database in descending order of their probability of occurrence in the video.

207. The video player displays on its interface the face images corresponding to the top N face identifiers in the face classification database.

N is an integer greater than or equal to 1. Displaying the top N face identifiers in the face classification database informs the user that these N faces are the important actors with a high probability of occurrence in the video, so the user knows the leading and supporting roles of the video.

208. The user selects a face image from the face images corresponding to the N face identifiers through the human-machine interface module, and initiates a video positioning request.

This embodiment takes as an example choosing, as the selected face image, one of the face images corresponding to the top N face identifiers of the face classification database displayed on the video player interface. In practice, the selected face image may also be obtained by taking a photograph or downloading an image from the network; the examples are not enumerated here.
209. The video player receives the video positioning request carrying the selected face image sent by the user.

210. The video player acquires the video information corresponding to the selected face image from the pre-stored face classification database.

The video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image. The video information pre-stored in the face classification database may also include the face images themselves.

Specifically, the video player may perform face recognition between the selected face image and each face image in the face classification database, for example by a feature-value matching algorithm, so as to acquire the video information corresponding to the selected face image from the face classification database.

211. The video player displays the video information corresponding to the selected face image on its interface.

According to the start and end times of the selected face image displayed on the video player interface, the user can click to watch each video segment of the video information, watch all video segments corresponding to the selected face image in the video, and learn about the acting of the corresponding actor in the video.

212. The video player merges, according to the at least one piece of video segment information in the video information corresponding to the selected face image, the at least one video segment into a positioning video corresponding to the selected face image.

For the implementation of the steps of this embodiment, reference may be made to the related embodiments above, and details are not repeated here.
With the video processing method of this embodiment, the face classification database is established on the video player side, and after a video positioning request carrying a selected face image is received from the user, the video of the selected face image is located according to the face classification database with very high efficiency. The method remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
FIG. 4 is a flowchart of still another embodiment of the video processing method according to an embodiment of the present invention. As shown in FIG. 4, the video processing method of this embodiment may specifically include the following steps:

300. Receive a video positioning request carrying a selected face image sent by a video player.

The video positioning request in this embodiment is received by the video player from a user through a human-machine interface module; the video processing method of this embodiment describes the technical solution of the present invention on the cloud server side.

301. Acquire the video information corresponding to the selected face image from a pre-stored face classification database.

The video information of this embodiment includes the identifier of the selected face image and at least one piece of video segment information of the selected face image; for example, the video information may further include the selected face image itself. For details, reference may be made to the above embodiments, which are not repeated here.

302. Send the video information corresponding to the selected face image to the video player, so that the video player displays the video information corresponding to the selected face image to the user.

Finally, after the cloud server acquires the video information corresponding to the selected face image, it sends that video information to the video player, and the video player can display it to the user on its interface. According to the displayed video information of the selected face image, the user can watch all video segments corresponding to the selected face image in the video, and can further judge from these video segments the acting of the corresponding actor in the video.
The difference between this embodiment and the embodiment shown in FIG. 1 above is that the embodiment of FIG. 1 describes the video processing scheme of the present invention for the case where there is no communication connection between the video player (the client) and the cloud server, and all video processing is implemented on the video player side.

In this embodiment, by contrast, there is a communication connection between the cloud server and the video player: after the video player receives the video positioning request sent by the user through the human-machine interface module, the video information corresponding to the selected face image can be acquired from the pre-stored face classification database, and finally the video information corresponding to the selected face image is sent to the video player, so that the video player displays it to the user. That is, the technical solution of the present invention is described by taking the case where the video player and the cloud server are communicatively connected as an example; the implementation principles of the steps are similar, and for details reference may also be made to the embodiment shown in FIG. 1, which is not repeated here.

With the video processing method of this embodiment, a video positioning request carrying a selected face image sent by the video player is received, the video information corresponding to the selected face image is acquired from the pre-stored face classification database, and the video information corresponding to the selected face image is sent to the video player for display to the user, so that the video of the selected face image is located according to the face classification database with very high efficiency. The method remedies the prior-art defect that all video segments of a certain face in a video cannot be located, which results in low video-positioning efficiency; all video information of a selected face image in the video can be located with very high efficiency, and it is convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is also very good.
Further optionally, on the basis of the technical solution of the above embodiment, before step 301, "acquiring the video information corresponding to the selected face image from the pre-stored face classification database", the method may further include: establishing the face classification database. That is, in this embodiment the face classification database is established on the cloud server side; its structure and contents are the same as those of the face classification database established on the video player side in the above embodiments, to which reference may be made for details, which are not repeated here.

Further optionally, "establishing the face classification database" in the above embodiment may specifically include the following steps:
(a) decoding each frame of the video to obtain a set of images;

(b) performing face detection on each image in the set, and acquiring the face in each image and the PTS of the face;

(c) generating a face timestamp database from the faces and the PTS of the faces;

(d) classifying all faces in the face timestamp database by face identifier, so that faces belonging to the same person correspond to the same face identifier;

(e) estimating, from the PTS of the face corresponding to each face identifier, each piece of video segment information of that face, the video segment information including the start and end time of the video segment; and

(f) establishing the face classification database from each piece of video segment information corresponding to each face identifier.

Steps (a)-(f) of this embodiment establish the face classification database in the same way as steps (1)-(6) of the optional technical solutions following the embodiment shown in FIG. 1 above; for details, reference may be made to the above embodiments, which are not repeated here.
Further optionally, after step (f) of the above embodiment, "establishing the face classification database from each piece of video segment information corresponding to each face identifier", the method may further include: arranging the face identifiers in the face classification database in descending order of their probability of occurrence in the video.

Or further optionally, after the step "arranging the face identifiers in the face classification database in descending order of their probability of occurrence in the video" and before step 300, "receiving a video positioning request carrying a selected face image sent by the video player", the method may further include: sending the top N face identifiers in the face classification database to the video player, so that the video player displays to the user the face images corresponding to the top N face identifiers, N being an integer greater than or equal to 1.

In this case, the corresponding selected face image is selected by the user from the face images corresponding to the N face identifiers; alternatively, the selected face image may also be input by the user through the human-machine interface module.

Or further optionally, the face classification database pre-stored on the cloud server side may be established on the video player side and sent to the cloud server after a communication connection becomes available between the cloud server side and the video player side. For example, before step 301 of the above embodiment, "acquiring the video information corresponding to the selected face image from the pre-stored face classification database", the method may further include: receiving the face classification database sent by the video player.
The various optional schemes of the above embodiments all describe the technical solution of the present invention on the cloud server side; for the specific implementation, reference may also be made to the implementation on the video player side, which is not repeated here. The various optional schemes of the above embodiments may be combined in any feasible manner to form optional embodiments of the present invention, which are not described one by one here.

With the video processing method of the above embodiments, the face classification database is established on the cloud server side, and after a video positioning request carrying a selected face image sent by the video player is received, the video of the selected face image is located according to the face classification database and the positioning result is returned to the video player, which displays it to the user; the video-positioning efficiency is very high, and the technical solutions of the above embodiments make it convenient for the user to watch all performances of the actor corresponding to the selected face image in the video, so the user experience is very good.
FIG. 5 is a flowchart of yet another embodiment of the video processing method according to an embodiment of the present invention. As shown in FIG. 5, the video processing method of this embodiment describes a further usage scenario of the present invention, and may specifically include:

400. The video player decodes each frame of the video to obtain a set of images.

The usage scenario of this embodiment is that, when the user uses the video positioning function on the video player side through the human-machine interface module, there is at first no communication connection between the video player and the cloud server, so the face classification database is established on the video player side, i.e., the client side of the video playing system; the communication connection between the video player and the cloud server is later restored, the video player sends the established face classification database to the cloud server, and the cloud server subsequently performs the video processing for video positioning requests according to the face classification database. The technical solution of the present invention is described with this case as an example.
401. The video player performs face detection on each image in the set, and acquires the face in each image and the PTS of the face.

402. The video player generates a face timestamp database from the faces and the PTS of the faces.

403. The video player classifies all faces in the face timestamp database by face identifier, so that faces belonging to the same person correspond to the same face identifier.

404. The video player estimates, from the PTS of the face corresponding to each face identifier, each piece of video segment information of that face.

For example, the video segment information includes the start time and end time of the video segment.

405. The video player establishes the face classification database from each piece of video segment information corresponding to each face identifier.

The face classification database may include the face identifiers and the video segment information corresponding to each face identifier in the video.

406. The video player arranges the face identifiers in the face classification database in descending order of their probability of occurrence in the video.

407. When a network link is established between the video player and the cloud server, the video player may send the face classification database to the cloud server.

In this way, subsequent video processing can be performed on the cloud server side, reducing the resource consumption of the video player client and improving the video processing efficiency.

408. The cloud server sends to the video player the face images corresponding to the top N face identifiers in the face classification database, N being an integer greater than or equal to 1.

409. The video player displays to the user, on its interface, the face images corresponding to the top N face identifiers in the face classification database.

In this way, the user can determine the leading and supporting roles of the video from the displayed faces, and can further select one of the faces as the selected face image to initiate a video positioning request, so as to request to view all video segments of the selected face image in the video.
410、用户通过人机接口模块从N个人脸标识对应的人脸图片中选择一个被选中人脸图片,并发起视频定位请求;
411、视频播放器接收用户发送的携带被选中人脸图片的视频定位请求,并转发给云服务器;
412、云服务器接收视频定位请求,并从预存储的人脸分类数据库中获取被选中人脸图片对应的视频信息;
视频信息中包括被选中人脸图片的标识和被选中人脸图片的至少一段视频段信息。人脸分类数据库中预存储的被选中人脸图片对应的视频信息中还可以包括各个被选中人脸图片。
具体地,云服务器可以将被选中人脸图片与人脸分类数据库中每一个人脸图片进行人脸识别,例如可以通过特征值匹配算法进行人脸识别,从而从人脸分类数据库中获取被选中人脸图片对应的视频信息。
The cloud server may then send the video information corresponding to the selected face image to the video player, and the video player displays that video information on its interface.
Based on the start and end times of the selected face image displayed on the video player interface, the user can click to watch each video segment in the video information, view all video segments of the selected face image in the video, and appreciate the acting of the corresponding actor in the video.
Alternatively, the method may further include the following steps:
413. The cloud server merges, according to the at least one piece of video segment information in the video information corresponding to the selected face image, the at least one video segment into a located video corresponding to the selected face image.
Alternatively, in this embodiment, the cloud server may directly send the video information corresponding to the selected face image to the video player, and the video player merges, according to the at least one piece of video segment information in that video information, the at least one video segment into the located video corresponding to the selected face image.
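Merging the at least one video segment into a located video (step 413, or its player-side variant above) reduces to an interval merge over the segment list. The sketch below computes only the merged timeline; a real implementation would additionally cut and concatenate the corresponding portions of the stream along that timeline.

```python
def merge_segments(segments):
    """Merge possibly overlapping (start, end) segments into the ordered,
    non-overlapping timeline of the located video."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:    # overlaps or touches the previous segment
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```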
414. The cloud server sends the located video to the video player.
415. The video player displays the located video corresponding to the selected face image to the user on its interface.
In this embodiment, the located video is the collection of all video segments of the selected face image in the video. When the video player displays the located video on its interface, the user can watch all video segments corresponding to the selected face image and appreciate the acting of the corresponding actor in the video.
For the implementation of each step in this embodiment, refer to the descriptions of the related embodiments above, which are not repeated here.
In the video processing method of this embodiment, the face classification database is built on the video player side and, once a communication connection exists between the cloud server and the video player, sent by the video player to the cloud server; subsequent video locating requests are handled on the cloud server side, i.e., after receiving a video locating request carrying a selected face image from the video player, the cloud server locates the video of the selected face image according to the face classification database, making video locating highly efficient. The method remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 6 is a flowchart of still another embodiment of the video processing method of the present invention. As shown in FIG. 6, on the basis of the technical solutions of the above embodiments, the video processing method of this embodiment describes a further usage scenario and may specifically include:
500. The cloud server decodes each frame of the video to obtain a set of images.
The usage scenario of this embodiment is as follows: when the user invokes the video locating function on the video player side through the human-machine interface module, a communication connection exists between the video player and the cloud server, so the face classification database is built on the cloud server side, and the cloud server also handles subsequent video locating requests according to that database. This scenario is used as an example to describe the technical solution of the present invention.
501. The cloud server performs face detection on each image in the set of images to obtain the faces in each image and the PTS of each face.
502. The cloud server generates a face timestamp database from the faces and the PTS of each face.
503. The cloud server classifies all faces in the face timestamp database by face identifier, so that faces belonging to the same person correspond to the same face identifier.
504. The cloud server estimates, from the PTS values of the faces corresponding to each face identifier, the video segment information of the faces corresponding to that face identifier.
For example, the video segment information includes the start time and end time of each video segment.
505. The cloud server establishes the face classification database according to the video segment information corresponding to each face identifier.
The face classification database may include each face identifier and the video segment information corresponding to that face identifier in the video.
506. The cloud server sorts the face identifiers in the face classification database in descending order of their probability of appearance in the video.
In this way, subsequent video processing can be performed on the cloud server side, reducing the resource consumption of the video player client and improving video processing efficiency.
507. The cloud server sends to the video player the face images corresponding to the first N face identifiers in the face classification database, N being an integer greater than or equal to 1.
508. The video player displays, on its interface, the face images corresponding to the first N face identifiers in the face classification database.
In this way, the user can identify the leading and supporting roles in the video from the displayed faces, and can further select one face as the selected face image and initiate a video locating request to view all video segments of that face in the video.
509. Through the human-machine interface module, the user selects one face image from the face images corresponding to the N face identifiers as the selected face image and initiates a video locating request.
Alternatively, the user may input the selected face image through the human-machine interface module, for example a photo the user has taken or a downloaded picture, and initiate the video locating request.
510. The video player receives the video locating request carrying the selected face image from the user and forwards it to the cloud server.
511. The cloud server receives the video locating request and obtains, from the pre-stored face classification database, the video information corresponding to the selected face image.
The video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image. The video information corresponding to the selected face image, pre-stored in the face classification database, may also include the face images themselves.
The cloud server may then send the video information corresponding to the selected face image to the video player, and the video player displays that video information on its interface.
Based on the start and end times of the selected face image displayed on the video player interface, the user can click to watch each video segment in the video information, view all video segments of the selected face image in the video, and appreciate the acting of the corresponding actor in the video.
Alternatively, the method may further include the following steps:
512. The cloud server merges, according to the at least one piece of video segment information in the video information corresponding to the selected face image, the at least one video segment into a located video corresponding to the selected face image.
513. The cloud server sends the located video to the video player.
514. The video player displays the located video corresponding to the selected face image to the user on its interface.
In this embodiment, the located video is the collection of all video segments of the selected face image in the video. When the video player displays the located video on its interface, the user can watch all video segments corresponding to the selected face image and appreciate the acting of the corresponding actor in the video.
For the implementation of each step in this embodiment, refer to the descriptions of the related embodiments above, which are not repeated here.
In the video processing method of this embodiment, the face classification database is built on the cloud server side, and subsequent video locating requests are also handled there: after receiving a video locating request carrying a selected face image from the video player, the cloud server locates the video of the selected face image according to the face classification database, making video locating highly efficient. The method remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 7 is a schematic structural diagram of an embodiment of the video player of the present invention. As shown in FIG. 7, the video player of this embodiment may specifically include a receiving module 10, an obtaining module 11 and a display module 12.
The receiving module 10 is configured to receive a video locating request carrying a selected face image, sent by a user through a human-machine interface module. The obtaining module 11 is connected to the receiving module 10 and is configured to obtain the video information corresponding, in the video, to the selected face image in the video locating request received by the receiving module 10; the video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image. The display module 12 is connected to the obtaining module 11 and is configured to display the video information corresponding to the selected face image obtained by the obtaining module 11.
The mechanism by which the video player of this embodiment implements video processing with the above modules is the same as that of the method embodiment shown in FIG. 1; for details, refer to the description of the embodiment shown in FIG. 1, which is not repeated here.
With the above modules, the video player of this embodiment receives a video locating request carrying a selected face image sent by the user through the human-machine interface module, obtains the video information corresponding in the video to the selected face image in the request, and displays the video information corresponding to the selected face image. This technical solution remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 8 is a schematic structural diagram of another embodiment of the video player of the present invention. As shown in FIG. 8, the video player of this embodiment further details the technical solution of the present invention on the basis of the embodiment shown in FIG. 7.
Further optionally, the obtaining module 11 of the video player of this embodiment is specifically configured to obtain the video information corresponding to the selected face image from a pre-stored face classification database.
As shown in FIG. 8, further optionally, the video player of this embodiment further includes an establishing module 13 configured to establish the face classification database. Correspondingly, the obtaining module 11 is connected to the establishing module 13 and is specifically configured to obtain the video information corresponding to the selected face image from the face classification database established by the establishing module 13.
As shown in FIG. 8, further optionally, the establishing module 13 of the video player of this embodiment specifically includes a decoding unit 131, a face detection unit 132, a face timestamp database generation unit 133, a classification unit 134, an estimation unit 135 and a face classification database generation unit 136.
The decoding unit 131 is configured to decode each frame of the video to obtain a set of images. The face detection unit 132 is connected to the decoding unit 131 and is configured to perform face detection on each image in the set of images obtained by the decoding unit 131, to obtain the faces in each image and the PTS of each face. The face timestamp database generation unit 133 is connected to the face detection unit 132 and is configured to generate a face timestamp database according to the faces and the PTS values detected by the face detection unit 132. The classification unit 134 is connected to the face timestamp database generation unit 133 and is configured to classify all faces in the face timestamp database generated by the face timestamp database generation unit 133 by face identifier, so that faces belonging to the same person correspond to the same face identifier. The estimation unit 135 is connected to the classification unit 134 and is configured to estimate, according to the PTS values of the faces corresponding to each face identifier after classification by the classification unit 134, the video segment information of the faces corresponding to that face identifier; the video segment information includes the start and end times of each video segment. The face classification database generation unit 136 is connected to the estimation unit 135 and is configured to establish the face classification database according to the video segment information corresponding to each face identifier obtained by the estimation unit 135.
Further optionally, as shown in FIG. 8, the establishing module 13 of the video player of this embodiment further includes a sorting unit 137 connected to the face classification database generation unit 136 and configured to sort the face identifiers in the face classification database generated by the face classification database generation unit 136 in descending order of their probability of appearance in the video.
Correspondingly, the obtaining module 11 is connected to the face classification database generation unit 136 and is specifically configured to obtain the video information corresponding to the selected face image from the face classification database established by the face classification database generation unit 136.
Further optionally, the display module 12 of the video player of this embodiment is also connected to the face classification database generation unit 136 and is configured to display the face images corresponding to the first N face identifiers in the sorted face classification database, N being an integer greater than or equal to 1. Further, the selected face image is either one the user selects from the face images corresponding to the N face identifiers or one the user inputs through the human-machine interface module.
Further optionally, the video player of this embodiment further includes a merging module 14 connected to the face classification database generation unit 136 and configured to merge, according to the at least one piece of video segment information of the selected face image in the face classification database generated by the face classification database generation unit 136, the at least one video segment into a located video corresponding to the selected face image.
In the above technical solution, the video player of this embodiment establishes the face classification database on the video player side and performs video processing according to the video locating request carrying the selected face image sent by the user.
The mechanism by which the video player of this embodiment implements video processing with the above modules is the same as that of the method embodiment shown in FIG. 3; for details, refer to the description of the embodiment shown in FIG. 3, which is not repeated here.
With the above modules, the video player of this embodiment establishes a face classification database and, after receiving a video locating request carrying a selected face image from the user, locates the video of the selected face image according to the face classification database, making video locating highly efficient. This technical solution remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 9 is a schematic structural diagram of still another embodiment of the video player of the present invention. As shown in FIG. 9, the video player of this embodiment further details the technical solution of the present invention on the basis of the embodiment shown in FIG. 8.
As shown in FIG. 9, further optionally, the video player of this embodiment further includes a sending module 15 connected to the face classification database generation unit 136 and configured to send the face classification database generated by the face classification database generation unit 136 to the cloud server.
Further optionally, the sending module 15 of the video player of this embodiment is also connected to the receiving module 10 and is further configured to send to the cloud server the video locating request carrying the selected face image received by the receiving module 10; the receiving module 10 is further configured to receive the video information sent by the cloud server, the video information being obtained by the cloud server, according to the selected face image, from the face classification database pre-stored in the cloud server.
Correspondingly, further optionally, the merging module 14 is connected to the receiving module 10 and is configured to merge, according to the at least one piece of video segment information of the selected face image in the video information received by the receiving module 10, the at least one video segment into a located video corresponding to the selected face image.
The video player of this embodiment establishes the face classification database on the video player side and sends it to the cloud server; after the video player receives a video locating request carrying a selected face image, it sends the request to the cloud server, and the cloud server performs video processing according to the video locating request carrying the selected face image.
The mechanism by which the video player of this embodiment implements video processing with the above modules is the same as that of the method embodiment shown in FIG. 5; for details, refer to the description of the embodiment shown in FIG. 5, which is not repeated here.
With the above modules, the video player of this embodiment establishes the face classification database on the video player side and, when a communication connection exists between the cloud server and the video player, sends the face classification database to the cloud server; subsequent video locating requests are handled on the cloud server side, i.e., after receiving a video locating request carrying a selected face image from the video player, the cloud server locates the video of the selected face image according to the face classification database, making video locating highly efficient. This technical solution remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 10 is a schematic structural diagram of yet another embodiment of the video player of the present invention. As shown in FIG. 10, on the basis of the technical solution of the embodiment shown in FIG. 7, the video player of this embodiment further includes the following.
The video player of this embodiment also includes a sending module 15. The sending module 15 is connected to the receiving module 10 and is configured to send to the cloud server the video locating request carrying the selected face image received by the receiving module 10; the receiving module 10 is further configured to receive the video information sent by the cloud server, the video information being obtained by the cloud server, according to the selected face image, from the face classification database pre-stored in the cloud server.
Correspondingly, the merging module 14 is connected to the obtaining module 11 and is configured to merge, according to the at least one piece of video segment information of the selected face image in the video information obtained by the obtaining module 11, the at least one video segment into a located video corresponding to the selected face image. Optionally, the merging module 14 may instead be deployed on the cloud server side, in which case the obtaining module 11 may be further configured to directly receive the located video corresponding to the selected face image sent by the cloud server.
Compared with the embodiment shown in FIG. 9, the video player of this embodiment omits the establishing module 13. In this embodiment the face classification database is established on the cloud server side; after the video player receives a video locating request carrying a selected face image, it sends the request to the cloud server, and the cloud server performs video processing according to the video locating request carrying the selected face image. For the mechanism by which the above modules of the video player of this embodiment implement video processing, refer to the descriptions of the related method embodiments above, which are not repeated here.
With the above modules, after receiving a video locating request carrying a selected face image, the video player of this embodiment sends the request to the cloud server, and the cloud server performs video processing according to the request, making video locating highly efficient. This technical solution remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 11 is a schematic structural diagram of an embodiment of the cloud server of the present invention. As shown in FIG. 11, the cloud server of this embodiment includes a receiving module 20, an obtaining module 21 and a sending module 22. The receiving module 20 is configured to receive a video locating request carrying a selected face image sent by the video player, the video locating request having been received by the video player from a user through a human-machine interface module. The obtaining module 21 is connected to the receiving module 20 and is configured to obtain, from a pre-stored face classification database, the video information corresponding to the selected face image received by the receiving module 20; the video information includes the identifier of the selected face image and at least one piece of video segment information of the selected face image. The sending module 22 is connected to the obtaining module 21 and is configured to send the video information corresponding to the selected face image obtained by the obtaining module 21 to the video player, so that the video player displays the video information corresponding to the selected face image to the user.
The mechanism by which the cloud server of this embodiment implements video processing with the above modules is the same as that of the method embodiment shown in FIG. 4; for details, refer to the description of the embodiment shown in FIG. 4, which is not repeated here.
With the above modules, the cloud server of this embodiment receives a video locating request carrying a selected face image from the video player, obtains from the pre-stored face classification database the video information corresponding to the selected face image, and sends that video information to the video player for display to the user, thereby locating the video of the selected face image according to the face classification database with high efficiency. This technical solution remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 12 is a schematic structural diagram of another embodiment of the cloud server of the present invention. As shown in FIG. 12, the cloud server of this embodiment further details the technical solution of the present invention on the basis of the embodiment shown in FIG. 11.
As shown in FIG. 12, the cloud server of this embodiment further includes an establishing module 23 configured to establish the face classification database. Correspondingly, the obtaining module 21 is also connected to the establishing module 23 and is configured to obtain, from the face classification database established by the establishing module 23, the video information corresponding to the selected face image received by the receiving module 20.
As shown in FIG. 12, further optionally, the establishing module 23 of the cloud server of this embodiment specifically includes a decoding unit 231, a face detection unit 232, a face timestamp database generation unit 233, a classification unit 234, an estimation unit 235 and a face classification database generation unit 236.
The decoding unit 231 is configured to decode each frame of the video to obtain a set of images. The face detection unit 232 is connected to the decoding unit 231 and is configured to perform face detection on each image in the set of images obtained by the decoding unit 231, to obtain the faces in each image and the PTS of each face. The face timestamp database generation unit 233 is connected to the face detection unit 232 and is configured to generate a face timestamp database according to the faces and the PTS values detected by the face detection unit 232. The classification unit 234 is connected to the face timestamp database generation unit 233 and is configured to classify all faces in the face timestamp database generated by the face timestamp database generation unit 233 by face identifier, so that faces belonging to the same person correspond to the same face identifier. The estimation unit 235 is connected to the classification unit 234 and is configured to estimate, according to the PTS values of the faces corresponding to each face identifier after classification by the classification unit 234, the video segment information of the faces corresponding to that face identifier; the video segment information includes the start and end times of each video segment. The face classification database generation unit 236 is connected to the estimation unit 235 and is configured to establish the face classification database according to the video segment information corresponding to each face identifier obtained by the estimation unit 235.
Further optionally, as shown in FIG. 12, the establishing module 23 of the cloud server of this embodiment further includes a sorting unit 237 connected to the face classification database generation unit 236 and configured to sort the face identifiers in the face classification database generated by the face classification database generation unit 236 in descending order of their probability of appearance in the video.
Correspondingly, the obtaining module 21 is also connected to the face classification database generation unit 236 and is configured to obtain, from the face classification database established by the face classification database generation unit 236, the video information corresponding to the selected face image received by the receiving module 20.
Further optionally, the sending module 22 of the cloud server of this embodiment is further configured to send the first N face identifiers in the face classification database to the video player, so that the video player displays the first N face identifiers to the user, N being an integer greater than or equal to 1. Correspondingly, the selected face image in the video locating request received by the receiving module 20 may be one the user selects from the face images corresponding to the N face identifiers, or one the user inputs through the human-machine interface module.
In this embodiment, the face classification database is established on the cloud server side, and after receiving a video locating request carrying a selected face image from the video player, the cloud server performs video processing according to the video locating request carrying the selected face image.
The mechanism by which the cloud server of this embodiment implements video processing with the above modules is the same as that of the method embodiment shown in FIG. 6; for details, refer to the description of the embodiment shown in FIG. 6, which is not repeated here.
Alternatively, when the face classification database is established on the video player side and sent by the video player to the cloud server, and the cloud server performs video processing according to the video locating request carrying the selected face image, the receiving module 20 of the cloud server of this embodiment is further configured to receive the face classification database sent by the video player.
With the above modules, the cloud server of this embodiment establishes the face classification database on the cloud server side and handles subsequent video locating requests there: after receiving a video locating request carrying a selected face image from the video player, the cloud server locates the video of the selected face image according to the face classification database, making video locating highly efficient. This technical solution remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
FIG. 13 is a schematic structural diagram of an embodiment of the video playback system of the present invention. As shown in FIG. 13, the video playback system of this embodiment includes a video player 30 and a cloud server 40 in communication connection. For example, the video player 30 adopts the video player of the embodiment shown in FIG. 9 and, correspondingly, the cloud server 40 adopts the cloud server shown in FIG. 11, and video processing may specifically be implemented with the video processing method of the embodiment shown in FIG. 5. Alternatively, the video player 30 adopts the video player of the embodiment shown in FIG. 10 and, correspondingly, the cloud server 40 adopts the cloud server shown in FIG. 12, and video processing may specifically be implemented with the video processing method of the embodiment shown in FIG. 6. For details, refer to the descriptions of the related embodiments above, which are not repeated here.
With the above video player 30 and cloud server 40, the video playback system of this embodiment can locate the video of a selected face image according to the face classification database, making video locating highly efficient. This technical solution remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.
Persons of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes any medium capable of storing program code, such as a ROM, a RAM, a magnetic disk or an optical disc.
The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over at least two network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Persons of ordinary skill in the art can understand and implement the embodiments without creative effort.
Finally, it should be noted that the above embodiments are merely intended to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some or all of the technical features therein, and such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Industrial Applicability
With the video processing method and system, video player and cloud server of the present invention, a video locating request carrying a selected face image, sent by a user through a human-machine interface module, is received; the video information corresponding, in the video, to the selected face image in the video locating request is obtained; and the video information corresponding to the selected face image is displayed. The technical solution of the present invention remedies the defect of the prior art that all video segments of a given face in a video cannot be located, which makes video locating inefficient; it locates all video information of a selected face image in the video with high efficiency, makes it convenient for the user to watch all performances, in the video, of the actor corresponding to the selected face image, and provides a very good user experience.

Claims (31)

  1. A video processing method, characterized in that the method comprises:
    receiving a video locating request carrying a selected face image, sent by a user through a human-machine interface module;
    obtaining video information corresponding, in a video, to the selected face image in the video locating request, the video information comprising an identifier of the selected face image and at least one piece of video segment information of the selected face image;
    displaying the video information corresponding to the selected face image.
  2. The method according to claim 1, characterized in that obtaining the video information corresponding, in the video, to the selected face image in the video locating request comprises:
    obtaining the video information corresponding to the selected face image from a pre-stored face classification database.
  3. The method according to claim 2, characterized in that before obtaining the video information corresponding to the selected face image from the pre-stored face classification database, the method further comprises:
    establishing the face classification database.
  4. The method according to claim 3, characterized in that establishing the face classification database comprises:
    decoding each frame of the video to obtain a set of images;
    performing face detection on each image in the set of images to obtain the faces in each image and the video playback times of the faces;
    generating a face timestamp database according to the faces and the video playback times of the faces;
    classifying all the faces in the face timestamp database by face identifier, so that faces belonging to a same person correspond to a same face identifier;
    estimating, according to the video playback times of the faces corresponding to each face identifier, each piece of the video segment information of the faces corresponding to the face identifier;
    establishing the face classification database according to each piece of the video segment information corresponding to each face identifier.
  5. The method according to claim 4, characterized in that after establishing the face classification database according to each piece of the video segment information corresponding to each face identifier, the method further comprises:
    sorting the face identifiers in the face classification database in descending order of probability of appearance in the video.
  6. The method according to claim 5, characterized in that after sorting the face identifiers in the face classification database in descending order of probability of appearance in the video and before receiving the video locating request carrying the selected face image sent by the user through the human-machine interface module, the method further comprises:
    displaying the first N face identifiers in the face classification database, N being an integer greater than or equal to 1;
    further, the selected face image is selected by the user from the face images corresponding to the N face identifiers; or the selected face image is input by the user through the human-machine interface module.
  7. The method according to claim 3, characterized in that after establishing the face classification database, the method further comprises:
    sending the face classification database to a cloud server.
  8. The method according to claim 1, characterized in that obtaining the video information corresponding, in the video, to the selected face image in the video locating request comprises:
    sending the video locating request carrying the selected face image to a cloud server;
    receiving the video information sent by the cloud server, the video information being obtained by the cloud server, according to the selected face image, from a face classification database pre-stored in the cloud server.
  9. The method according to any one of claims 1-8, characterized in that after displaying the video information corresponding to the selected face image, the method further comprises:
    merging, according to the at least one piece of video segment information of the selected face image, the at least one video segment into a located video corresponding to the selected face image.
  10. A video processing method, characterized in that the method comprises:
    receiving a video locating request carrying a selected face image, sent by a video player, the video locating request having been received by the video player from a user through a human-machine interface module;
    obtaining, from a pre-stored face classification database, video information corresponding to the selected face image, the video information comprising an identifier of the selected face image and at least one piece of video segment information of the selected face image;
    sending the video information corresponding to the selected face image to the video player, so that the video player displays the video information corresponding to the selected face image to the user.
  11. The method according to claim 10, characterized in that before obtaining, from the pre-stored face classification database, the video information corresponding to the selected face image, the method further comprises:
    establishing the face classification database.
  12. The method according to claim 11, characterized in that establishing the face classification database specifically comprises:
    decoding each frame of the video to obtain a set of images;
    performing face detection on each image in the set of images to obtain the faces in each image and the video playback times of the faces;
    generating a face timestamp database according to the faces and the video playback times of the faces;
    classifying all the faces in the face timestamp database by face identifier, so that faces belonging to a same person correspond to a same face identifier;
    estimating, according to the video playback times of the faces corresponding to each face identifier, each piece of the video segment information of the faces corresponding to the face identifier; and establishing the face classification database according to each piece of the video segment information corresponding to each face identifier.
  13. The method according to claim 12, characterized in that after establishing the face classification database according to each piece of the video segment information corresponding to each face identifier, the method further comprises:
    sorting the face identifiers in the face classification database in descending order of probability of appearance in the video.
  14. The method according to claim 13, characterized in that after sorting the face identifiers in the face classification database in descending order of probability of appearance in the video and before receiving the video locating request carrying the selected face image sent by the video player, the method further comprises:
    sending the first N face identifiers in the face classification database to the video player, so that the video player displays the first N face identifiers to the user, N being an integer greater than or equal to 1;
    further, the selected face image is selected by the user from the face images corresponding to the N face identifiers; or the selected face image is input by the user through the human-machine interface module.
  15. The method according to claim 10, characterized in that before obtaining, from the pre-stored face classification database, the video information corresponding to the selected face image, the method further comprises:
    receiving the face classification database sent by the video player.
  16. A video player, characterized by comprising:
    a receiving module, configured to receive a video locating request carrying a selected face image, sent by a user through a human-machine interface module;
    an obtaining module, configured to obtain video information corresponding, in a video, to the selected face image in the video locating request, the video information comprising an identifier of the selected face image and at least one piece of video segment information of the selected face image;
    a display module, configured to display the video information corresponding to the selected face image.
  17. The video player according to claim 16, characterized in that the obtaining module is specifically configured to obtain the video information corresponding to the selected face image from a pre-stored face classification database.
  18. The video player according to claim 17, characterized in that the video player further comprises:
    an establishing module, configured to establish the face classification database.
  19. The video player according to claim 18, characterized in that the establishing module specifically comprises:
    a decoding unit, configured to decode each frame of the video to obtain a set of images;
    a face detection unit, configured to perform face detection on each image in the set of images to obtain the faces in each image and the video playback times of the faces;
    a face timestamp database generation unit, configured to generate a face timestamp database according to the faces and the video playback times of the faces;
    a classification unit, configured to classify all the faces in the face timestamp database by face identifier, so that faces belonging to a same person correspond to a same face identifier;
    an estimation unit, configured to estimate, according to the video playback times of the faces corresponding to each face identifier, each piece of the video segment information of the faces corresponding to the face identifier; and a face classification database generation unit, configured to establish the face classification database according to each piece of the video segment information corresponding to each face identifier.
  20. The video player according to claim 19, characterized in that the establishing module further comprises:
    a sorting unit, configured to sort the face identifiers in the face classification database in descending order of probability of appearance in the video.
  21. The video player according to claim 20, characterized in that the display module is further configured to display the first N face identifiers in the face classification database, N being an integer greater than or equal to 1;
    further, the selected face image is selected by the user from the face images corresponding to the N face identifiers; or the selected face image is input by the user through the human-machine interface module.
  22. The video player according to claim 18, characterized in that the video player further comprises:
    a sending module, configured to send the face classification database to a cloud server.
  23. The video player according to claim 22, characterized in that the sending module is further configured to send the video locating request carrying the selected face image to the cloud server;
    the receiving module is further configured to receive the video information sent by the cloud server, the video information being obtained by the cloud server, according to the selected face image, from a face classification database pre-stored in the cloud server.
  24. The video player according to any one of claims 16-23, characterized in that the video player further comprises:
    a merging module, configured to merge, according to the at least one piece of video segment information of the selected face image, the at least one video segment into a located video corresponding to the selected face image.
  25. A cloud server, characterized in that the cloud server comprises:
    a receiving module, configured to receive a video locating request carrying a selected face image, sent by a video player, the video locating request having been received by the video player from a user through a human-machine interface module;
    an obtaining module, configured to obtain, from a pre-stored face classification database, video information corresponding to the selected face image, the video information comprising an identifier of the selected face image and at least one piece of video segment information of the selected face image;
    a sending module, configured to send the video information corresponding to the selected face image to the video player, so that the video player displays the video information corresponding to the selected face image to the user.
  26. The cloud server according to claim 25, characterized in that the cloud server further comprises:
    an establishing module, configured to establish the face classification database.
  27. The cloud server according to claim 26, characterized in that the establishing module specifically comprises:
    a decoding unit, configured to decode each frame of the video to obtain a set of images;
    a face detection unit, configured to perform face detection on each image in the set of images to obtain the faces in each image and the video playback times of the faces;
    a face timestamp database generation unit, configured to generate a face timestamp database according to the faces and the video playback times of the faces;
    a classification unit, configured to classify all the faces in the face timestamp database by face identifier, so that faces belonging to a same person correspond to a same face identifier;
    an estimation unit, configured to estimate, according to the video playback times of the faces corresponding to each face identifier, each piece of the video segment information of the faces corresponding to the face identifier; and a face classification database generation unit, configured to establish the face classification database according to each piece of the video segment information corresponding to each face identifier.
  28. The cloud server according to claim 27, characterized in that the establishing module further comprises:
    a sorting unit, configured to sort the face identifiers in the face classification database in descending order of probability of appearance in the video.
  29. The cloud server according to claim 28, characterized in that the sending module is further configured to send the first N face identifiers in the face classification database to the video player, so that the video player displays the first N face identifiers to the user, N being an integer greater than or equal to 1;
    further, the selected face image is selected by the user from the face images corresponding to the N face identifiers; or the selected face image is input by the user through the human-machine interface module.
  30. The cloud server according to claim 25, characterized in that the receiving module is further configured to receive the face classification database sent by the video player.
  31. A video playback system, characterized in that the video playback system comprises a video player and a cloud server in communication connection, the video player being the video player according to any one of claims 22-24, and the cloud server being the cloud server according to any one of claims 25-30.
PCT/CN2016/085011 2015-10-26 2016-06-06 Video processing method and system, video player and cloud server WO2017071227A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/247,043 US20170116465A1 (en) 2015-10-26 2016-08-25 Video processing method and system, video player and cloud server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510702093.7A 2015-10-26 2015-10-26 Video processing method and system, video player and cloud server
CN201510702093.7 2015-10-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/247,043 Continuation US20170116465A1 (en) 2015-10-26 2016-08-25 Video processing method and system, video player and cloud server

Publications (1)

Publication Number Publication Date
WO2017071227A1 true WO2017071227A1 (zh) 2017-05-04

Family

ID=56624361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/085011 WO2017071227A1 (zh) 2015-10-26 2016-06-06 Video processing method and system, video player and cloud server

Country Status (2)

Country Link
CN (1) CN105872717A (zh)
WO (1) WO2017071227A1 (zh)


Also Published As

Publication number Publication date
CN105872717A (zh) 2016-08-17


Legal Events

121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16858670; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 16858670; Country of ref document: EP; Kind code of ref document: A1)