CN109947988B - Information processing method and device, terminal equipment and server - Google Patents

Publication number
CN109947988B
Authority
CN
China
Prior art keywords: information, video frame, display interface, current video, video
Prior art date
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number
CN201910175790.XA
Other languages: Chinese (zh)
Other versions: CN109947988A (en)
Inventors: 芦斌, 于新卫, 陈明, 夏凡, 于天宝
Current Assignee: Beijing Baidu Netcom Science and Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Original Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910175790.XA priority Critical patent/CN109947988B/en
Publication of CN109947988A publication Critical patent/CN109947988A/en
Application granted granted Critical
Publication of CN109947988B publication Critical patent/CN109947988B/en

Abstract

The invention provides an information processing method and apparatus, a terminal device, and a server. The method is applied to the terminal device and comprises the following steps: when a video is played through a display interface, sending the current video frame or a frame identifier of the current video frame to a server; receiving return information returned by the server, the return information comprising object association information and object position information of the current video frame; and performing display processing on the display interface according to the object association information and the object position information. Compared with the prior art, the method and the device therefore allow a user to identify objects appearing in a video more conveniently.

Description

Information processing method and device, terminal equipment and server
Technical Field
The embodiment of the invention relates to the technical field of communication, in particular to an information processing method, an information processing device, terminal equipment and a server.
Background
At present, terminal devices such as mobile phones and tablet computers are increasingly widespread, and more and more users are accustomed to watching videos on them.
While watching a video on a terminal device, the following situation may arise: an object unknown to the user, such as an unfamiliar person or item, appears in the video, and the user wants to learn about it, for example the person's name. In this situation the user has to open a browser or a search engine and search manually to identify the object, so in the prior art, identifying an object appearing in a video is very cumbersome.
Disclosure of Invention
Embodiments of the present invention provide an information processing method and apparatus, a terminal device, and a server, so as to solve the prior-art problem that identifying an object appearing in a video is very cumbersome.
In a first aspect, an embodiment of the present invention provides an information processing method, which is applied to a terminal device, and the method includes:
when a video is played through a display interface, sending a current video frame or a frame identifier of the current video frame to a server;
receiving return information returned by the server; wherein the return information comprises object association information and object position information of the current video frame;
and performing display processing on the display interface according to the object association information and the object position information.
In a second aspect, an embodiment of the present invention provides an information processing method, which is applied to a server, and the method includes:
receiving a current video frame or a frame identifier of the current video frame sent by terminal equipment;
generating return information; wherein the return information comprises object association information and object position information of the current video frame;
and sending the return information to the terminal equipment.
In a third aspect, an embodiment of the present invention provides an information processing apparatus, which is applied to a terminal device, and the apparatus includes:
the sending module is used for sending the current video frame or the frame identifier of the current video frame to the server when the video is played through the display interface;
the receiving module is used for receiving the return information returned by the server; wherein the return information comprises object association information and object position information of the current video frame;
and the display processing module is used for performing display processing on the display interface according to the object association information and the object position information.
In a fourth aspect, an embodiment of the present invention provides an information processing apparatus, which is applied to a server, and includes:
the receiving module is used for receiving a current video frame or a frame identifier of the current video frame sent by the terminal equipment;
the generating module is used for generating return information; wherein the return information comprises object association information and object position information of the current video frame;
and the sending module is used for sending the return information to the terminal equipment.
In a fifth aspect, an embodiment of the present invention provides a terminal device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the information processing method provided in the first aspect.
In a sixth aspect, an embodiment of the present invention provides a server, including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the information processing method provided in the second aspect.
In a seventh aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the information processing method provided in the first aspect, or implements the steps of the information processing method provided in the second aspect.
In the embodiment of the invention, when a video is played through the display interface, the terminal device may send the current video frame, or a frame identifier of the current video frame, to the server and receive return information returned by the server; the return information may include object association information and object position information of the current video frame. The terminal device may then perform display processing on the display interface according to the object association information and the object position information in the return information, and based on the display processing result the user watching the video can easily identify an object appearing in the video, for example learning the name of a person shown in it. Thus, through the interaction between the terminal device and the server and the display processing performed by the terminal device according to the server's return information, the user can identify an object in the video without opening a browser or a search engine, which is more convenient than the prior art.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art based on these drawings without inventive effort.
FIG. 1 is a flowchart of an information processing method according to an embodiment of the present invention;
FIG. 2 is a first schematic interface diagram of a terminal device according to an embodiment of the present invention;
FIG. 3 is a second schematic interface diagram of a terminal device according to an embodiment of the present invention;
FIG. 4 is a third schematic interface diagram of a terminal device according to an embodiment of the present invention;
FIG. 5 is a fourth schematic interface diagram of a terminal device in an embodiment of the present invention;
FIG. 6 is a schematic position diagram corresponding to the object position coordinates in the object position information;
FIG. 7 is a schematic diagram of the location corresponding to normalized coordinates;
FIG. 8 is a schematic diagram of the display position of the video frame Z1 in the display interface;
FIG. 9 is a schematic diagram of interaction between a terminal device and a server according to an embodiment of the present invention;
FIG. 10 is a flow chart of another information processing method provided by an embodiment of the invention;
FIG. 11 is a block diagram of an information processing apparatus according to an embodiment of the present invention;
FIG. 12 is a block diagram of another information processing apparatus according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of a terminal device according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on these embodiments without creative effort shall fall within the protection scope of the present invention.
Referring to fig. 1, a flowchart of an information processing method according to an embodiment of the present invention is shown. As shown in fig. 1, the method is applied to a terminal device, and includes the following steps:
step 101, when playing a video through a display interface, sending a current video frame or a frame identifier of the current video frame to a server.
In step 101, the terminal device may play a video through a display interface, for example, play an online video or a local video through the display interface, and at this time, an identification button, for example, the identification button 200 in fig. 2, may be displayed in the display interface.
If the user's click operation on the identification button 200 in fig. 2 is detected, the terminal device may send the current video frame or the frame identifier of the current video frame to the server according to the instruction of the user. Here, the frame identification of the current video frame may include an identification of the currently playing video (e.g., a video name) and an identification of the current video frame in the currently playing video (e.g., timestamp information and/or frame bitmap information).
Of course, the action of sending the current video frame or the frame identifier of the current video frame to the server may also be automatically performed by the terminal device at regular intervals without being actively triggered by the user.
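The frame identifier described above — a video identifier plus the frame's position within that video — is compact enough to send instead of the frame itself. A minimal sketch of how the terminal device might serialize it (all field names and values are hypothetical, not specified by the patent):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class FrameIdentifier:
    """Identifies one frame of a known video (field names are illustrative)."""
    video_id: str      # identification of the currently playing video, e.g. a video name
    timestamp_ms: int  # timestamp information of the current frame within the video
    frame_index: int   # frame number ("frame bitmap information")

def build_request(frame_id: FrameIdentifier) -> str:
    # The terminal device would send this JSON body to the server
    # rather than uploading the whole current video frame.
    return json.dumps(asdict(frame_id))

payload = build_request(FrameIdentifier("movie_abc", 63250, 1581))
print(payload)
```

Sending the identifier saves bandwidth when the server has already analyzed the video; uploading the raw frame remains the fallback for unknown videos.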
Step 102, receiving return information returned by a server; the return information comprises object association information and object position information of the current video frame.
Here, the object association information may be face association information and the object position information may be face position information; in this case the object involved in the embodiment of the present invention is specifically a person, and the embodiment can identify people appearing in the video.
Of course, the object involved in the embodiment of the present invention may also be an animal, an article, or the like, so as to identify the animal, the article, or the like appearing in the video. In order to facilitate understanding of the present disclosure by those skilled in the art, the following embodiments are described by taking a case where an object is a person as an example.
It should be noted that, after the terminal device sends the current video frame or the frame identifier of the current video frame, the server may receive the current video frame or the frame identifier of the current video frame from the terminal device, and the server may generate the return information according to the received frame identifier; the return information may include object association information and object location information of the current video frame.
Specifically, the object association information is information associated with an object in the current video frame, which the user can use to identify that object; it may include, for example, the name, age, and other introductory information of a person in the current video frame. In addition, the object position information may be used to characterize the position of the object in the current video frame; for example, it may include the object position coordinates of the object in the current video frame.
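The return information can thus be modeled as a list of per-object records, each pairing association information with position coordinates. A sketch with assumed field names and example values (none of which come from the patent):

```python
from dataclasses import dataclass

@dataclass
class ObjectRecord:
    """One detected object in the current video frame (fields are illustrative)."""
    name: str          # object association information: e.g. the person's name
    introduction: str  # further association information: age, biography, ...
    x: float           # object position coordinates (x, y, w, h),
    y: float           # origin at the frame's top-left corner
    w: float
    h: float

# Example return information for a frame containing a single person.
return_info = [ObjectRecord("Jane Doe", "singer, born 1990", 120.0, 80.0, 64.0, 64.0)]
print(return_info)
```

A frame with several people (as in FIG. 3) would simply carry one record per person, preserving the per-object correspondence between index information, position information, and index result described later.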
And 103, performing display processing on a display interface according to the object association information and the object position information.
The following describes a specific implementation form of the display processing performed by the terminal device on the display interface by way of example.
In one implementation form, assuming that the object position information includes object position coordinates, the terminal device may directly determine the position in the display interface corresponding to those coordinates and display at least part of the object association information from the return information at the determined position, so that the user can identify the object in the video through that information.
In another implementation form, assuming that the object position information includes object position coordinates, the terminal device may determine the position in the display interface corresponding to those coordinates and display an operation button in the display interface. When a click operation on the operation button is received, the terminal device may display at least part of the object association information from the return information at the determined position, so that the user can identify the object in the video through that information.
Of course, the implementation form of the display processing performed on the display interface is not limited to the above two cases; it is only required that, based on the display processing result, at least part of the object association information can be conveniently displayed so that the user can learn about the object in the video. Other forms are not listed here.
It should be noted that the terminal device involved in the embodiment of the present invention may specifically be a computer, a mobile phone, a tablet personal computer, a laptop computer, a personal digital assistant (PDA), a mobile Internet device (MID), or the like.
In the embodiment of the invention, when a video is played through the display interface, the terminal device may send the current video frame, or a frame identifier of the current video frame, to the server and receive return information returned by the server; the return information may include object association information and object position information of the current video frame. The terminal device may then perform display processing on the display interface according to the object association information and the object position information in the return information, and based on the display processing result the user watching the video can easily identify an object appearing in the video, for example learning the name of a person shown in it. Thus, through the interaction between the terminal device and the server and the display processing performed by the terminal device according to the server's return information, the user can identify an object in the video without opening a browser or a search engine, which is more convenient than the prior art.
Optionally, the object association information includes object index information;
according to the object association information and the object position information, performing display processing on a display interface, wherein the display processing comprises the following steps:
and displaying the object index information on a display interface according to the object position information.
Here, the object index information may be information capable of indexing an object in the video, specifically, the object index information may be an identification of the object, and in the case where the object is a person, the object index information may be a name of the person, a nickname of the person, or the like.
In this embodiment, the server may crawl videos across the entire network in advance and analyze each video frame by frame to obtain object index information and object position information for each video frame. The server may then construct a correspondence between each video frame, its object index information, and its object position information according to the analysis results, and store the constructed correspondence in a database.
Under the condition of receiving a current video frame from the terminal equipment or a frame identifier of the current video frame, the server can determine object index information and object position information corresponding to the current video frame according to the corresponding relation stored in the database and generate return information comprising the determined object position information and object association information; wherein, the object association information comprises the determined object index information.
Of course, the manner of generating the return information is not limited to the above. For example, it is also possible that the correspondence is not pre-constructed and stored, and in the case of receiving a current video frame from the terminal device or a frame identifier of the current video frame, the server may analyze the current video frame to obtain object index information and object position information of the current video frame, and generate return information according to the object index information and the object position information.
No matter which generation method is adopted, the server can send the generated return information to the terminal device, and then the terminal device can display the object index information in the return information on the display interface according to the object position information in the return information, for example, the object index information is directly displayed at the position corresponding to the object position information, so that the user can recognize the object appearing in the video through the displayed object index information. Therefore, in the embodiment, through the display of the object index information, the user can conveniently know the objects appearing in the video.
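Either way, the server's job reduces to mapping a frame to its object index information and object position information. A minimal sketch of the database-lookup variant, with all keys and stored values invented for illustration:

```python
# Correspondence built ahead of time by frame-by-frame analysis
# (keys and values are illustrative, not from the patent):
# (video_id, frame_index) -> list of (object_index_info, object_position)
correspondence = {
    ("movie_abc", 1581): [("Jane Doe", (120, 80, 64, 64))],
}

def generate_return_info(video_id: str, frame_index: int) -> list:
    """Look up the analyzed results for the requested frame; empty if unknown."""
    records = correspondence.get((video_id, frame_index), [])
    return [{"object_index_info": name, "object_position": pos}
            for name, pos in records]

print(generate_return_info("movie_abc", 1581))
```

A production system would use a real database rather than an in-memory dictionary, and would fall back to on-the-fly frame analysis when the lookup misses, as the preceding paragraph describes.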
Optionally, the object association information further includes an object index result corresponding to the object index information;
according to the object association information and the object position information, display processing is carried out on the display interface, and the method further comprises the following steps:
displaying a control corresponding to the object index result in a preset area of a display interface;
and under the condition that a first input operation on the control is received, displaying an object index result in a preset area according to the first input operation.
In this embodiment, after determining the object index information and the object position information corresponding to the current video frame according to the correspondence stored in the database, the server may perform an information search according to the determined object index information to obtain a corresponding object index result. Specifically, when the determined object index information is a person's name, the server may use the name as a keyword, search for resource data related to that keyword, and use the retrieved resource data as the object index result. The object index result may include the person's profile, related videos, and the like; for example, when the person is a singer, the object index result may include the person's date of birth, place of birth, names of works, Music Videos (MVs) from released albums, concert videos, and so on. The return information sent by the server to the terminal device may include this object index result.
After receiving the return information sent by the server, the terminal device may display a control corresponding to the object index result in the return information in a preset area of the display interface. Specifically, the preset area may be located at the upper left portion, the lower left portion, the upper right portion, or the lower right portion of the display interface; the control corresponding to the object index result in the return information can be a button in a circular shape, a rectangular shape or other shapes.
Next, the terminal device may detect whether the first input operation to the control is received periodically or aperiodically. Specifically, the first input operation on the control includes, but is not limited to, a click operation, a press operation, a drag operation, and the like on the control.
If the first input operation on the control is received, the terminal device can update the display content of the preset area, so that the preset area displays the object index result in the return information, and a user watching the video can conveniently know the object appearing in the video through the object index result in the display interface.
In specific implementation, assuming that the current video frame is the video frame (including three people) shown in fig. 3, after the terminal device sends the current video frame or the frame identifier of the current video frame to the server, the return information sent by the server may include object index information 1, object position information 1, object index result 1, object index information 2, object position information 2, object index result 2, object index information 3, object position information 3, and object index result 3; the object index information 1, the object position information 1 and the object index result 1 have a correspondence, the object index information 2, the object position information 2 and the object index result 2 have a correspondence, and the object index information 3, the object position information 3 and the object index result 3 have a correspondence.
As shown in fig. 4, a drawer 400 may be disposed at the lower right corner of the display interface. The drawer 400 is opened when the terminal device receives the return information, and in the opened state it may occupy a preset region of the display interface; the first control 401 corresponding to the object index result 1, the second control 402 corresponding to the object index result 2, and the third control 403 corresponding to the object index result 3 may be dropped into the drawer 400 with an animation effect, so that the first control 401 to the third control 403 are displayed in the preset region. Optionally, the first control 401 may display the face at the position corresponding to the object position information 1, the second control 402 may display the face at the position corresponding to the object position information 2, and the third control 403 may display the face at the position corresponding to the object position information 3. In addition, the video may continue to play without pausing while the drawer 400 is open.
Thereafter, if the user performs a click operation on the first control 401 in fig. 4, the preset area may display the object index result 1, so that the user can further know the person corresponding to the face at the position corresponding to the object position information 1 according to the object index result 1.
Therefore, in the embodiment, the user can actually trigger the display of the object index result through the first input operation according to the requirement, so as to further know the objects appearing in the video.
Optionally, the object index result includes resource data of a video type;
after displaying the object index result in the preset area, the method further comprises:
and under the condition that a second input operation on the resource data of the video type is received, the resource data of the video type is played in a full screen mode through the display interface according to the second input operation.
It should be noted that the object index result displayed in the preset area may include text-type resource data, picture-type resource data, video-type resource data, and the like. The terminal device may periodically or aperiodically detect whether a second input operation for the resource data of the video type in the displayed object index result is received. Specifically, the second input operation on the resource data of the video type includes, but is not limited to, a click operation, a press operation, and the like on the resource data of the video type.
If a second input operation on the video-type resource data is received, the terminal device can update the display content of the whole display interface to enable the display interface to play the video-type resource data in a full screen mode, so that a user can watch the video-type resource data conveniently, and the user can further know objects appearing in the video.
In specific implementation, as shown in fig. 5, the object index result displayed in the preset area may include both text-type resource data (e.g., resource data D1) and video-type resource data (e.g., resource data D2 and resource data D3), and if the user clicks the play button 500 in the resource data D2, the terminal device may play the resource data D2 in a full screen manner through the display interface, so that the user can further know an object appearing in the video according to the resource data D2.
Therefore, in this embodiment, the user can trigger full-screen playing of the resource data of the video type through the second input operation according to the actual requirement, so as to further know the object appearing in the video.
Optionally, displaying the object index information on the display interface according to the object position information, including:
and displaying the object mark on a display interface according to the object position information, and displaying the object index information at the preset position of the object mark.
Here, the object mark may be a floating mark frame, and the mark frame may be rectangular, circular or other shapes; the preset position of the object marker may be at the top, bottom, or other position of the marker box.
In specific implementation, it is assumed that a current video frame is the video frame (including three people) displayed in fig. 3, and the return information sent by the server includes object index information 1, object position information 1, object index result 1, object index information 2, object position information 2, object index result 2, object index information 3, object position information 3, and object index result 3, as shown in fig. 4, the terminal device may display a mark frame 411 on the display interface according to the object position information 1, and display the object index information 1 on the top of the mark frame 411; displaying a mark frame 412 on the display interface according to the object position information 2, and displaying object index information 2 at the top of the mark frame 412; according to the object position information 3, a mark frame 413 is displayed on the display interface, and the object index information 3 is displayed on the top of the mark frame 413. In this way, the association relationship between each face in the video and each object index information in the return information can be clearly illustrated by the mark boxes 411 to 413.
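Displaying the object index information at the top of each marker box reduces to a small coordinate computation. A sketch assuming the marker frame is given as (x, y, w, h) with the origin at the frame's top-left corner (the edge-clamping behavior is an added assumption, not taken from the patent):

```python
def label_anchor(x: int, y: int, w: int, h: int, label_height: int = 20) -> tuple:
    """Top-left corner for object index information drawn at the top of a
    marker box (x, y, w, h).

    If the box touches the top edge there is no room above it, so the
    label falls back to the box's own top edge.
    """
    label_y = y - label_height
    if label_y < 0:
        label_y = y
    return (x, label_y)

print(label_anchor(120, 80, 64, 64))  # -> (120, 60): label sits above the box
print(label_anchor(120, 10, 64, 64))  # -> (120, 10): clamped to the box top
```

The same helper works for any of the marker boxes 411 to 413, keeping each label visually attached to its face.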
As can be seen, in this embodiment, based on the display of the object markers and the object index information, each object marker can mark a corresponding object in the video, and the user can conveniently know the association relationship between each object in the video and each object index information in the return information, so that the user can more accurately recognize the object appearing in the video.
Optionally, the return information further includes a first video size corresponding to the current video frame;
displaying the object mark on the display interface according to the object position information, wherein the method comprises the following steps:
acquiring the interface size of a display interface;
determining a target position in a display interface according to the object position information, the first video size and the interface size;
an object marker is displayed at the target location.
It should be noted that the database may store a correspondence among video frames, object index information, object position information, and video sizes; in this correspondence, the video size corresponding to any video frame may be the original size of that frame. After the server receives the current video frame or the frame identifier of the current video frame sent by the terminal device, the return information generated by the server may include not only the object index information and the object position information corresponding to the current video frame, but also the video size (i.e., the first video size) corresponding to the current video frame.
It should be noted that, before the terminal device plays a certain video, the interface size of its display interface needs to be compared with the original video size of the video to be played. If the interface size matches the video size (i.e., the two sizes are the same), the video can be played directly on the display interface; if they do not match, the video to be played is generally scaled in equal proportion and then displayed centered in the display interface, so as to ensure the video playing effect. Consequently, if the original video size of the video currently played by the terminal device does not match the interface size of the display interface in step 101, directly displaying the object marker at the position given by the object position information in the return information may cause the object marker to mark the object in the video incorrectly.
In view of this, in this embodiment, the terminal device may determine the target position in the display interface according to the object position information and the first video size in the return information, and the interface size of the display interface, where the target position may be an actual position where the object in the video is located. And then, the terminal equipment displays the object mark at the target position to ensure that the object mark can mark the object in the video correctly.
Optionally, the object position information includes object position coordinates. Here, the object position coordinates may be expressed in the form of (x, y, w, h).
Determining a target position in the display interface according to the object position information, the first video size and the interface size, wherein the determining comprises the following steps:
according to the first video size, carrying out normalization processing on the position coordinates of the object to obtain normalized coordinates;
determining a second video size of the video in the display interface and a blank size of the display interface according to the first video size and the interface size;
and determining the target position in the display interface according to the normalized coordinates, the second video size and the blank size.
Generally, in order to ensure a good playing effect in various network environments, different video resolutions may be provided for the user to select during video playing. For example, videos may be divided into high definition video and ultra high definition video; the two differ in resolution and, of course, in code rate as well.
It should be noted that the resolution of the video may be used to represent the size of the video, and the resolution of the display interface may be used to represent the size of the display interface, and a specific implementation process of this embodiment is described in detail below by using a specific example.
Assuming that, when constructing the correspondence to be stored in the database, the resolution of video 1 captured from the whole network is 100 × 100, the video size corresponding to each video frame of video 1 in the subsequently constructed correspondence may be represented as 100 × 100. That is, the original video width (i.e., video_width) of each video frame in video 1 may be regarded as 100, and the original video height (i.e., video_height) of each video frame in video 1 may be regarded as 100.
Further, assuming that the object position coordinate in the object position information corresponding to the video frame Z1 in the video 1 is (20, 20, 30, 40) in the constructed correspondence relationship, the position corresponding to the object position information may specifically be the position where the rectangular frame with the upper left end point coordinate of (20, 20) and the lower right end point coordinate of (50, 60) is located in fig. 6.
Assuming that the terminal device is playing a video 1 and the current video frame is a video frame Z1, at this time, the first video size may be represented by 100 × 100, and after the normalization processing is performed on the object position coordinates, the obtained normalized coordinates are:
x1=x/video_width=20/100=0.2
y1=y/video_height=20/100=0.2
x2=x1+w/video_width=0.2+30/100=0.5
y2=y1+h/video_height=0.2+40/100=0.6
thus, the position corresponding to the normalized coordinates is the position of the rectangular frame with the upper left end point coordinate of (0.2, 0.2) and the lower right end point coordinate of (0.5, 0.6) in fig. 7.
In addition, assuming that the resolution of the display interface of the terminal device is 160 × 90, the interface size of the display interface may be represented by 160 × 90; that is, the view width (i.e., view_width) of the display interface is 160, and the view height (i.e., view_height) of the display interface is 90. The aspect ratio of the view (i.e., view_aspect_ratio) and the aspect ratio of the video (i.e., video_aspect_ratio) may then be calculated by the following formulas:
view_aspect_ratio=view_width/view_height=160/90=1.777
video_aspect_ratio=video_width/video_height=100/100=1。
since view_aspect_ratio > video_aspect_ratio, as shown in fig. 8, the height needs to be adapted preferentially to ensure that all the content of video 1 falls within the displayable range of the display interface. The height ratio height_ratio can then be calculated, where:
height_ratio=view_height/video_height=90/100=0.9
thus, the rendered size of video 1 (which may be used to characterize the second video size described above) is:
draw_video_width=video_width*height_ratio=100*0.9=90
draw_video_height=video_height*height_ratio=100*0.9=90
where draw _ video _ width is a rendering width of video (i.e., an actual width of video 1 in the display interface), and draw _ video _ height is a rendering height of video (i.e., an actual height of video 1 in the display interface).
Since video 1 is displayed centered, the left margin size left_space is:
left_space=(view_width-draw_video_width)/2=(160-90)/2=35
thus, the face rendering coordinates can be obtained as:
draw_x1=x1*draw_video_width+left_space=0.2*90+35=53
draw_y1=y1*draw_video_height=0.2*90=18
draw_x2=x2*draw_video_width+left_space=0.5*90+35=80
draw_y2=y2*draw_video_height=0.6*90=54
thus, the target position in the display interface can be represented through (53, 18, 80, 54); the target position is a rectangle, the coordinates of the upper left end point of the rectangle are (53, 18), and the coordinates of the lower right end point of the rectangle are (80, 54).
As can be seen, in this embodiment, based on the normalization processing and the second video size and the blank size, the target position can be determined, so that the object marker can be displayed at the correct position, and thus the object marker can correctly mark the object in the video.
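The coordinate mapping worked through above can be sketched as one small function. This is an illustrative sketch, not code from the patent: the function name, argument layout, and the assumption that the video is uniformly scaled and centered (letterboxed) in the view are the author's own framing of the steps described in the text.

```python
def map_to_view(obj_box, video_size, view_size):
    """Map an object box (x, y, w, h) in original video pixels to
    display-interface coordinates, assuming the video is scaled
    uniformly and displayed centered (letterboxed) in the view.

    Returns (x1, y1, x2, y2) of the mark frame in view pixels.
    Names and structure are illustrative only.
    """
    x, y, w, h = obj_box
    video_width, video_height = video_size
    view_width, view_height = view_size

    # Normalize the box to [0, 1] using the original (first) video size.
    x1 = x / video_width
    y1 = y / video_height
    x2 = x1 + w / video_width
    y2 = y1 + h / video_height

    # Fit the video inside the view while preserving its aspect ratio.
    if view_width / view_height > video_width / video_height:
        scale = view_height / video_height   # view is wider: adapt height
    else:
        scale = view_width / video_width     # view is taller: adapt width

    draw_w = video_width * scale             # second video size (rendered)
    draw_h = video_height * scale
    left_space = (view_width - draw_w) / 2   # blank size (centered display)
    top_space = (view_height - draw_h) / 2

    return (x1 * draw_w + left_space, y1 * draw_h + top_space,
            x2 * draw_w + left_space, y2 * draw_h + top_space)
```

Feeding in the numbers from the example, `map_to_view((20, 20, 30, 40), (100, 100), (160, 90))` reproduces the target position (53, 18, 80, 54).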
Optionally, after determining the target position in the display interface, the method further includes:
determining whether a preset type of object is displayed at the target position;
displaying an object marker at a target location, comprising:
in the case where the target position is displayed with a preset type of object, an object mark is displayed at the target position.
In this embodiment, after determining the target position in the display interface, the terminal device may perform image recognition to determine whether a preset type of object (e.g., a person) is displayed at the target position. Specifically, the terminal device may determine whether the target position displays a human face.
If the target position displays the preset type of object, the object position information in the return information returned by the server can be considered to be correct, and the terminal equipment can display an object mark at the target position so as to correctly mark the object in the video.
If the target position does not display the preset type of object, the object position information in the return information returned by the server can be considered to be wrong, and the terminal equipment does not display the object mark at the target position.
Therefore, in the embodiment, the terminal device can verify whether the object position information in the returned information is correct, and mark the object in the video by using the object mark only according to the correct object position information, so that the marking accuracy is ensured.
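The verification step above can be sketched as follows. The `detect` and `draw_marker` callbacks are hypothetical placeholders (the patent does not name a detector API); the sketch only shows the control flow: crop the target region, run the preset-type detection, and display the marker only on success.

```python
def verify_and_mark(frame, target, detect, draw_marker):
    """Display the object marker only when an object of the preset type
    (e.g. a face) is actually detected at the target position.
    `frame` is a 2-D pixel grid; `detect` and `draw_marker` are
    illustrative callbacks, not APIs from the patent."""
    x1, y1, x2, y2 = target
    # Crop the region of the frame covered by the target position.
    region = [row[int(x1):int(x2)] for row in frame[int(y1):int(y2)]]
    if detect(region):
        # Position information from the server is considered correct.
        draw_marker(target)
        return True
    # Position information is considered wrong: no marker is shown.
    return False
```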
Optionally, when the video is played through the display interface, sending the current video frame or the frame identifier of the current video frame to the server, including:
when a video is played through a display interface, determining whether a preset type of object is displayed at a video playing position in the display interface;
and under the condition that the preset type of object is displayed at the video playing position, sending the current video frame or the frame identifier of the current video frame to the server.
In this embodiment, when a video is played through the display interface, it may be determined whether an object (e.g., a person) of a preset type is displayed at a video playing position in the display interface. Specifically, the terminal device may determine whether a face is displayed at the video playing position.
If the video playing position in the display interface displays an object of a preset type, which indicates that an identifiable object exists in the video, the terminal device may display an identification button 200 shown in fig. 2. In case that the user clicks the identification button 200, the terminal device may transmit the current video frame or the frame identifier of the current video frame to the server, so as to facilitate the execution of the subsequent operation.
Therefore, in the embodiment, only under the condition that the identifiable object exists in the currently played video, the terminal device interacts with the server, so that unnecessary resource loss and power consumption can be reduced.
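A minimal sketch of this conditional interaction, under the same caveat that the callback names are the author's own: the terminal contacts the server only when an identifiable object is on screen, and sends the frame identifier when one is available, otherwise the frame itself.

```python
def maybe_send_frame(frame, frame_id, has_preset_object, send):
    """Send the current video frame (or its frame identifier, when one
    exists) to the server only if an object of the preset type is
    displayed. All callables are illustrative placeholders."""
    if has_preset_object(frame):
        send(frame_id if frame_id is not None else frame)
        return True
    return False  # no identifiable object: avoid the round trip
```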
Optionally, after sending the current video frame or the frame identifier of the current video frame to the server, before performing display processing on the display interface, the method further includes:
and displaying a preset object recognition animation effect on a display interface.
In this embodiment, in the process of performing object identification, the terminal device may display a preset object identification animation effect, such as a flickering effect, a timing effect, and the like, on the display interface to notify the user that object identification is currently performed.
The following describes an interaction process between the terminal device and the server by using a specific example in conjunction with fig. 9.
As shown in fig. 9, when the terminal device plays a video on the display interface through a video player, clicking the identification button 200 in fig. 2 may trigger character identification. At this point, the terminal device pauses playback, acquires the frame identifier of the current video frame, and sends it to the server to request identification of the characters in the current video frame.
Next, the server may send return information to the terminal device; the return information may include object association information and object location information of the current video frame.
Thus, according to the object position information in the return information, the terminal equipment can display the object mark and the object index information in the return information so as to mark the face in the video; according to the object index result in the returned information, the terminal equipment can display the character introduction and the related recommendation of the characters in the video. In addition, the terminal device can continue playing the video.
Therefore, in this embodiment, when the user watches the video through the terminal device, analysis and understanding of the video content can be realized through interaction between the terminal device and the server, so as to assist the user in knowing the video content, and in addition, related resource data can be searched and recommended.
Generally, video content analysis refers to analyzing the time, characters, objects, scenes, actions and the like in video frames, as well as audio information such as dialogue and lines; video content understanding refers to attempting to interpret, through Artificial Intelligence (AI) technology, the information obtained from the analysis; video content search means that, after the video content is analyzed and understood, information related to the video content is searched across the whole network, for example star and character information or classic scene clips; video content recommendation means actively pushing the related information found by such a search.
In summary, compared with the prior art, in the embodiment, the user can more conveniently recognize the object appearing in the video.
Referring to fig. 10, a flowchart of an information processing method according to an embodiment of the present invention is shown. As shown in fig. 10, the method is applied to a server, and includes the following steps:
1001, receiving a current video frame or a frame identifier of the current video frame sent by a terminal device;
step 1002, generating return information; the return information comprises object association information and object position information of the current video frame;
step 1003, sending the return information to the terminal device.
Optionally, generating the return information includes:
determining object index information and object position information corresponding to the current video frame according to the preset corresponding relation among the video frame, the object index information and the object position information;
generating return information; the return information comprises object associated information and the determined object position information, and the object associated information comprises the determined object index information.
Optionally, before generating the return information, the method further includes:
according to the determined object index information, information search is carried out to obtain a corresponding object index result;
wherein, the return information also comprises the obtained object index result.
Optionally, determining object index information and object position information corresponding to the current video frame according to a preset correspondence between the video frame, the object index information, and the object position information, includes:
determining object index information, object position information and video size corresponding to the current video frame according to the preset corresponding relation among the video frame, the object index information, the object position information and the video size;
wherein, the return information also comprises the determined video size.
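The server-side steps above can be sketched as a lookup over the stored correspondence. The database layout, field names, and sample record here are entirely illustrative assumptions: the sketch only shows the flow of looking up object index information, position information, and video size for a frame, optionally running an information search to attach index results, and assembling the return information.

```python
# Hypothetical correspondence "database": frame identifier -> record.
CORRESPONDENCE = {
    "video1#Z1": {
        "objects": [
            {"index_info": "Person A", "position": (20, 20, 30, 40)},
        ],
        "video_size": (100, 100),  # original (first) video size
    },
}

def generate_return_info(frame_id, search=None):
    """Assemble the return information for a frame: object index
    information, object position information, the video size, and
    (when a `search` callable is supplied) object index results.
    Data and field names are illustrative only."""
    record = CORRESPONDENCE.get(frame_id)
    if record is None:
        return None
    objects = []
    for obj in record["objects"]:
        entry = dict(obj)
        if search is not None:
            # Information search keyed by the object index information.
            entry["index_result"] = search(obj["index_info"])
        objects.append(entry)
    return {"objects": objects, "video_size": record["video_size"]}
```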
Therefore, in the embodiment of the invention, when the terminal device plays the video, through the interaction between the terminal device and the server and the display processing performed on the display interface by the terminal device according to the return information from the server, the user can recognize the object in the video without opening a browser or a search engine for searching, so that compared with the prior art, the user can recognize the object appearing in the video more conveniently.
Referring to fig. 11, a block diagram of an information processing apparatus 1100 according to an embodiment of the present invention is shown. As shown in fig. 11, the information processing apparatus 1100 is applied to a terminal device, and the information processing apparatus 1100 includes:
a sending module 1101, configured to send a current video frame or a frame identifier of the current video frame to a server when a video is played through a display interface;
a receiving module 1102, configured to receive return information returned by the server; the return information comprises object association information and object position information of the current video frame;
and a display processing module 1103, configured to perform display processing on a display interface according to the object association information and the object location information.
Optionally, the object association information includes object index information;
the display processing module 1103 is specifically configured to:
and displaying the object index information on a display interface according to the object position information.
Optionally, the object association information further includes an object index result corresponding to the object index information;
the display processing module 1103 includes:
the first display unit is used for displaying a control corresponding to the object index result in a preset area of a display interface;
and the second display unit is used for displaying the object index result in a preset area according to the first input operation under the condition that the first input operation on the control is received.
Optionally, the object index result includes resource data of a video type;
the information processing apparatus 1100 further includes:
and the playing module is used for displaying the object index result in the preset area, and under the condition of receiving a second input operation on the video type resource data, according to the second input operation, the video type resource data is played in a full screen mode through the display interface.
Optionally, the display processing module 1103 is specifically configured to:
and displaying the object mark on a display interface according to the object position information, and displaying the object index information at the preset position of the object mark.
Optionally, the return information further includes a first video size corresponding to the current video frame;
the display processing module 1103 includes:
the obtaining unit is used for obtaining the interface size of the display interface;
the first determining unit is used for determining a target position in the display interface according to the object position information, the first video size and the interface size;
a third display unit for displaying the object marker at the target position.
Optionally, the object position information includes an object position coordinate;
a first determination unit comprising:
the obtaining subunit is used for carrying out normalization processing on the position coordinates of the object according to the size of the first video to obtain normalized coordinates;
the first determining subunit is used for determining a second video size of the video in the display interface and a blank size of the display interface according to the first video size and the interface size;
and the second determining subunit is used for determining the target position in the display interface according to the normalized coordinates, the second video size and the blank size.
Optionally, the information processing apparatus 1100 further includes:
the first determination module is used for determining whether a preset type of object is displayed at a target position after the target position in the display interface is determined;
the third display unit is specifically configured to:
in the case where an object of a preset type is displayed at the target position, an object mark is displayed at the target position.
Optionally, the sending module includes:
the second determining unit is used for determining whether the preset type of object is displayed at the video playing position in the display interface when the video is played through the display interface;
and the sending unit is used for sending the current video frame or the frame identifier of the current video frame to the server under the condition that the preset type of object is displayed at the video playing position.
Optionally, the information processing apparatus 1100 further includes:
and the display module is used for displaying a preset object recognition animation effect on the display interface after the current video frame or the frame identifier of the current video frame is sent to the server and before the display interface is subjected to display processing.
Optionally, the object related information is face related information, and the object position information is face position information.
Therefore, in the embodiment of the invention, when the terminal device plays the video, through the interaction between the terminal device and the server and the display processing performed on the display interface by the terminal device according to the return information from the server, the user can recognize the object in the video without opening a browser or a search engine for searching, so that compared with the prior art, the user can recognize the object appearing in the video more conveniently.
Referring to fig. 12, a block diagram of an information processing apparatus 1200 according to an embodiment of the present invention is shown. As shown in fig. 12, the information processing apparatus 1200 is applied to a server, and the information processing apparatus 1200 includes:
a receiving module 1201, configured to receive a current video frame or a frame identifier of the current video frame sent by a terminal device;
a generating module 1202, configured to generate return information; the return information comprises object association information and object position information of the current video frame;
a sending module 1203, configured to send the return information to the terminal device.
Optionally, the generating module 1202 includes:
the determining unit is used for determining object index information and object position information corresponding to the current video frame according to the corresponding relation among the preset video frame, the object index information and the object position information;
a generating unit configured to generate return information; the return information comprises object associated information and the determined object position information, and the object associated information comprises the determined object index information.
Optionally, the information processing apparatus 1200 further includes:
the searching module is used for searching information according to the determined object index information before generating the return information so as to obtain a corresponding object index result;
wherein, the return information also comprises the obtained object index result.
Optionally, the determining unit is specifically configured to:
determining object index information, object position information and video size corresponding to the current video frame according to the preset corresponding relation among the video frame, the object index information, the object position information and the video size;
wherein, the return information also comprises the determined video size.
Therefore, in the embodiment of the invention, when the terminal device plays the video, through the interaction between the terminal device and the server and the display processing performed on the display interface by the terminal device according to the return information from the server, the user can recognize the object in the video without opening a browser or a search engine for searching, so that compared with the prior art, the user can recognize the object appearing in the video more conveniently.
Referring to fig. 13, a schematic structural diagram of a terminal device 1300 according to an embodiment of the present invention is shown. As shown in fig. 13, terminal device 1300 includes, but is not limited to: a radio frequency unit 1301, a network module 1302, an audio output unit 1303, an input unit 1304, a sensor 1305, a display unit 1306, a user input unit 1307, an interface unit 1308, a memory 1309, a processor 1310, a power supply 1311, and the like. Those skilled in the art will appreciate that the terminal device architecture shown in fig. 13 does not constitute a limitation of the terminal device, and that terminal device 1300 may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
Wherein, the processor 1310 is configured to:
when a video is played through a display interface, a current video frame or a frame identifier of the current video frame is sent to a server;
receiving return information returned by the server; the return information comprises object association information and object position information of the current video frame;
and performing display processing on the display interface according to the object correlation information and the object position information.
Optionally, the object association information includes object index information;
the processor 1310 is specifically configured to:
and displaying the object index information on a display interface according to the object position information.
Optionally, the object association information further includes an object index result corresponding to the object index information;
the processor 1310 is specifically configured to:
displaying a control corresponding to the object index result in a preset area of a display interface;
and under the condition that a first input operation on the control is received, displaying an object index result in a preset area according to the first input operation.
Optionally, the object index result includes resource data of a video type;
a processor 1310, further configured to:
and after the object index result is displayed in the preset area, under the condition that a second input operation on the resource data of the video type is received, the resource data of the video type is played in a full screen mode through the display interface according to the second input operation.
Optionally, the processor 1310 is specifically configured to:
and displaying the object mark on the display interface according to the object position information, and displaying the object index information at the preset position of the object mark.
Optionally, the return information further includes a first video size corresponding to the current video frame;
the processor 1310 is specifically configured to:
acquiring the interface size of a display interface;
determining a target position in a display interface according to the object position information, the first video size and the interface size;
an object marker is displayed at the target location.
Optionally, the object position information includes an object position coordinate;
the processor 1310 is specifically configured to:
according to the first video size, carrying out normalization processing on the position coordinates of the object to obtain normalized coordinates;
determining a second video size of the video in the display interface and a blank size of the display interface according to the first video size and the interface size;
and determining the target position in the display interface according to the normalized coordinates, the second video size and the blank size.
Optionally, the processor 1310 is further configured to:
after determining the target position in the display interface, determining whether the target position displays an object of a preset type;
the processor 1310 is specifically configured to:
in the case where an object of a preset type is displayed at the target position, an object mark is displayed at the target position.
Optionally, the processor 1310 is further configured to:
when a video is played through a display interface, determining whether a preset type of object is displayed at a video playing position in the display interface;
and under the condition that the preset type of object is displayed at the video playing position, sending the current video frame or the frame identifier of the current video frame to the server.
Optionally, the processor 1310 is further configured to:
after the current video frame or the frame identifier of the current video frame is sent to the server, a preset object recognition animation effect is displayed on a display interface before the display interface is subjected to display processing.
Optionally, the object related information is face related information, and the object position information is face position information.
Therefore, in the embodiment of the present invention, when the terminal device 1300 plays the video, through the interaction between the terminal device 1300 and the server and the display processing performed by the terminal device 1300 on the display interface according to the return information from the server, the user can recognize the object in the video without opening a browser or a search engine to search.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 1301 may be configured to receive and transmit signals during a message transmission or call process. Specifically, it receives downlink data from a base station and forwards it to the processor 1310 for processing, and transmits uplink data to the base station. Generally, the radio frequency unit 1301 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 1301 can also communicate with a network and other devices through a wireless communication system.
The terminal device 1300 provides the user with wireless broadband internet access via the network module 1302, such as helping the user send and receive e-mails, browse web pages, and access streaming media.
The audio output unit 1303 can convert audio data received by the radio frequency unit 1301 or the network module 1302 or stored in the memory 1309 into an audio signal and output as sound. Also, the audio output unit 1303 can also provide audio output related to a specific function performed by the terminal apparatus 1300 (e.g., a call signal reception sound, a message reception sound, and the like). The audio output unit 1303 includes a speaker, a buzzer, a receiver, and the like.
The input unit 1304 is used to receive audio or video signals. The input unit 1304 may include a Graphics Processing Unit (GPU) 13041 and a microphone 13042, and the graphics processor 13041 processes image data of still pictures or video obtained by an image capturing apparatus (such as a camera) in a video capture mode or an image capture mode. The processed image frames may be displayed on the display unit 1306. The image frames processed by the graphics processor 13041 may be stored in the memory 1309 (or other storage medium) or transmitted via the radio frequency unit 1301 or the network module 1302. The microphone 13042 can receive sounds and process them into audio data. In a phone call mode, the processed audio data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 1301.
Terminal device 1300 also includes at least one sensor 1305, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 13061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 13061 and/or the backlight when the terminal device 1300 moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the terminal device posture (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration identification related functions (such as pedometer, tapping), and the like; the sensors 1305 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which will not be described in detail herein.
The display unit 1306 is used to display information input by the user or information provided to the user. The display unit 1306 may include a display panel 13061, which may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
The user input unit 1307 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the terminal device. Specifically, the user input unit 1307 includes a touch panel 13071 and other input devices 13072. The touch panel 13071, also referred to as a touch screen, can collect touch operations by a user on or near it (e.g., an operation performed on or near the touch panel 13071 using a finger, a stylus, or any suitable object or attachment). The touch panel 13071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position and the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch-point coordinates, sends the coordinates to the processor 1310, and receives and executes commands sent by the processor 1310. The touch panel 13071 may be implemented as a resistive, capacitive, infrared, or surface-acoustic-wave type, among others. In addition to the touch panel 13071, the user input unit 1307 may include other input devices 13072, which may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and a switch key), a trackball, a mouse, and a joystick; these are not described further herein.
Further, the touch panel 13071 can be overlaid on the display panel 13061. When the touch panel 13071 detects a touch operation on or near it, the operation is passed to the processor 1310 to determine the type of touch event, and the processor 1310 then provides a corresponding visual output on the display panel 13061 according to the type of touch event. Although in fig. 13 the touch panel 13071 and the display panel 13061 are two independent components that implement the input and output functions of the terminal device, in some embodiments the touch panel 13071 and the display panel 13061 may be integrated to implement these functions; no limitation is imposed here.
The interface unit 1308 is an interface through which an external device is connected to the terminal device 1300. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 1308 may be used to receive input (e.g., data information or power) from an external device and transmit the received input to one or more elements within the terminal device 1300, or may be used to transmit data between the terminal device 1300 and an external device.
The memory 1309 may be used to store software programs as well as various data. The memory 1309 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and application programs required by at least one function (such as a sound playing function or an image playing function); the data storage area may store data created according to the use of the terminal device (such as audio data or a phonebook). Further, the memory 1309 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The processor 1310 is the control center of the terminal device 1300. It connects the various parts of the terminal device using various interfaces and lines, and performs the various functions of the terminal device 1300 and processes data by running or executing software programs and/or modules stored in the memory 1309 and calling data stored in the memory 1309, thereby monitoring the terminal device 1300 as a whole. The processor 1310 may include one or more processing units. Preferably, the processor 1310 integrates an application processor, which mainly handles the operating system, user interface, and application programs, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 1310.
The terminal device 1300 may further include a power supply 1311 (e.g., a battery) for supplying power to the various components, and preferably, the power supply 1311 may be logically connected to the processor 1310 via a power management system, so that functions of managing charging, discharging, and power consumption are performed via the power management system.
In addition, the terminal device 1300 includes some functional modules that are not shown, and are not described herein again.
Preferably, an embodiment of the present invention further provides a terminal device including a processor 1310, a memory 1309, and a computer program stored in the memory 1309 and executable on the processor 1310. When executed by the processor 1310, the computer program implements each process of the above information processing method embodiment applied to the terminal device and achieves the same technical effect; to avoid repetition, details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements each process of the information processing method applied to the terminal device and achieves the same technical effect; to avoid repetition, details are not described here again. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Referring to fig. 14, a schematic structural diagram of a server 1400 provided by an embodiment of the present invention is shown. As shown in fig. 14, the server 1400 includes a processor 1401, a memory 1403, a user interface 1404, and a bus interface.
The processor 1401 is configured to read a program in the memory 1403 and perform the following processes:
receiving a current video frame, or a frame identifier of the current video frame, sent by a terminal device;
generating return information, where the return information comprises object association information and object position information of the current video frame; and
sending the return information to the terminal device.
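The three server-side steps above (receive, generate, send) can be sketched as follows. This is a minimal illustration only, not the patented implementation: the message field names and the in-memory frame database are assumptions, and the hash-based frame fingerprint is a stand-in for whatever frame identifier the terminal actually sends.

```python
import hashlib


def fingerprint(frame_bytes):
    # Stand-in frame identifier: a hash of the raw frame data
    # (hypothetical; the real identifier scheme is not specified here).
    return hashlib.sha256(frame_bytes).hexdigest()


def handle_request(request, frame_database):
    """Receive a current video frame (or its frame identifier),
    generate return information, and hand it back for sending."""
    frame_id = request.get("frame_id") or fingerprint(request["frame"])
    entry = frame_database.get(frame_id, {})
    # The return information comprises object association information
    # and object position information of the current video frame.
    return {
        "object_association_info": entry.get("index_info", []),
        "object_position_info": entry.get("positions", []),
    }
```

A frame whose identifier is unknown to the server simply yields empty association and position lists.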
In fig. 14, the bus architecture may include any number of interconnected buses and bridges, linking together one or more processors, represented by the processor 1401, and various circuits, represented by the memory 1403. The bus architecture may also link together various other circuits, such as peripherals, voltage regulators, and power management circuits; these are well known in the art and therefore are not described further herein. The bus interface provides an interface. For different user devices, the user interface 1404 may also be an interface capable of connecting externally to a desired device, including but not limited to a keypad, a display, a speaker, a microphone, and a joystick.
The processor 1401 is responsible for managing a bus architecture and general processing, and the memory 1403 may store data used by the processor 1401 in performing operations.
Optionally, the processor 1401 is specifically configured to:
determine object index information and object position information corresponding to the current video frame according to a preset correspondence among video frames, object index information, and object position information; and
generate the return information, where the return information includes object association information and the determined object position information, and the object association information includes the determined object index information.
Optionally, the processor 1401 is further configured to:
before generating the return information, perform an information search according to the determined object index information to obtain a corresponding object index result;
where the return information further includes the obtained object index result.
Optionally, the processor 1401 is specifically configured to:
determine object index information, object position information, and a video size corresponding to the current video frame according to a preset correspondence among video frames, object index information, object position information, and video size;
where the return information further includes the determined video size.
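Putting the optional refinements together, the refined generation step might be sketched as follows. The correspondence-table layout and the search function are assumptions for illustration; the actual preset correspondence and information search are not specified at this level of detail.

```python
def generate_return_info(frame_id, correspondence, search_fn):
    """Look up the preset correspondence among video frame, object index
    information, object position information and video size, run an
    information search per piece of index information, and assemble the
    return information."""
    entry = correspondence[frame_id]
    index_info = entry["index_info"]                    # e.g. recognized object names
    index_results = [search_fn(q) for q in index_info]  # object index results
    return {
        "object_association_info": {
            "index_info": index_info,
            "index_results": index_results,
        },
        "object_position_info": entry["positions"],
        "video_size": entry["video_size"],              # (width, height) of the source video
    }
```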
Therefore, in the embodiment of the present invention, when the terminal device plays a video, through the interaction between the terminal device and the server 1400 and the display processing performed by the terminal device on the display interface according to the return information from the server 1400, the user can recognize an object in the video without opening a browser or a search engine to search for it.
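The terminal-side half of this interaction can be sketched in the same spirit. The transport and field names below are assumptions mirroring the server sketch, and `render` is a hypothetical stand-in for the display processing performed on the display interface.

```python
def on_video_frame(frame_id, send_to_server, render):
    """Send the current frame identifier to the server, then display an
    object mark and its object index information for each returned object."""
    return_info = send_to_server(frame_id)      # request/response round trip
    assoc = return_info["object_association_info"]
    positions = return_info["object_position_info"]
    # Display processing: one mark per object, labelled with its index info.
    for pos, label in zip(positions, assoc["index_info"]):
        render(pos, label)
```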
Preferably, an embodiment of the present invention further provides a server including a processor 1401, a memory 1403, and a computer program stored in the memory 1403 and executable on the processor 1401. When executed by the processor 1401, the computer program implements each process of the above information processing method embodiment applied to the server and achieves the same technical effect; to avoid repetition, details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements each process of the information processing method applied to the server and achieves the same technical effect; to avoid repetition, details are not described here again. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the present invention has been described with reference to the embodiments shown in the drawings, the invention is not limited to those embodiments, which are illustrative rather than restrictive. It will be apparent to those skilled in the art that various changes and modifications can be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (15)

1. An information processing method, applied to a terminal device, the method comprising:
when a video is played through a display interface, sending a current video frame or a frame identifier of the current video frame to a server;
receiving return information returned by the server, wherein the return information comprises object association information and object position information of the current video frame; and
performing display processing on the display interface according to the object association information and the object position information;
wherein the object association information comprises object index information;
the performing display processing on the display interface according to the object association information and the object position information comprises:
displaying the object index information on the display interface according to the object position information;
the object association information further comprises an object index result corresponding to the object index information;
the performing display processing on the display interface according to the object association information and the object position information further comprises:
displaying a control corresponding to the object index result in a preset area of the display interface; and
when a first input operation on the control is received, displaying the object index result in the preset area according to the first input operation;
the sending a current video frame or a frame identifier of the current video frame to a server comprises:
sending the current video frame or the frame identifier of the current video frame to the server when a click operation by the user on an identification button is detected;
the displaying the object index information on the display interface according to the object position information comprises:
displaying an object mark on the display interface according to the object position information, and displaying the object index information at a preset position of the object mark;
the object position information represents the position, on the video frame picture, of an object in the current video frame; the preset area is at least partially different from the display area of the object index information; the video comprises at least two video frames, and the video continues to play without pausing while the control is displayed; and
the return information comprises at least one object index result of the current video frame, each object index result corresponds to one control, the control corresponding to the at least one object index result is placed into a drawer in the display interface with a throwing animation effect, and the drawer, when in an open state, is located in the preset area of the display interface.
2. The method according to claim 1, wherein the object index result comprises resource data of a video type; and
after the object index result is displayed in the preset area, the method further comprises:
when a second input operation on the resource data of the video type is received, playing the resource data of the video type in full screen through the display interface according to the second input operation.
3. The method according to claim 1, wherein the return information further includes a first video size corresponding to the current video frame; and
the displaying an object mark on the display interface according to the object position information comprises:
acquiring an interface size of the display interface;
determining a target position in the display interface according to the object position information, the first video size, and the interface size; and
displaying an object mark at the target position.
4. The method of claim 3, wherein the object position information includes object position coordinates; and
the determining a target position in the display interface according to the object position information, the first video size, and the interface size comprises:
normalizing the object position coordinates according to the first video size to obtain normalized coordinates;
determining a second video size of the video in the display interface and a margin size of the display interface according to the first video size and the interface size; and
determining the target position in the display interface according to the normalized coordinates, the second video size, and the margin size.
5. The method of claim 3, wherein after the determining the target position in the display interface, the method further comprises:
determining whether an object of a preset type is displayed at the target position;
wherein the displaying an object mark at the target position comprises:
displaying the object mark at the target position when the object of the preset type is displayed at the target position.
6. The method according to claim 1, wherein the sending a current video frame or a frame identifier of the current video frame to the server when the video is played through the display interface comprises:
when the video is played through the display interface, determining whether an object of a preset type is displayed at a video playing position in the display interface; and
sending the current video frame or the frame identifier of the current video frame to the server when the object of the preset type is displayed at the video playing position.
7. The method according to claim 1, wherein after the sending the current video frame or the frame identifier of the current video frame to the server and before the performing display processing on the display interface, the method further comprises:
displaying a preset object recognition animation effect on the display interface.
8. The method according to any one of claims 1 to 7, wherein the object association information is face association information, and the object position information is face position information.
9. An information processing method, applied to a server, the method comprising:
receiving a current video frame or a frame identifier of the current video frame sent by a terminal device;
generating return information, wherein the return information comprises object association information and object position information of the current video frame; and
sending the return information to the terminal device;
wherein the generating return information comprises:
determining object index information and object position information corresponding to the current video frame according to a preset correspondence among video frames, object index information, and object position information; and
generating the return information, wherein the return information comprises the object association information and the determined object position information, and the object association information comprises the determined object index information;
before the generating return information, the method further comprises:
performing an information search according to the determined object index information to obtain a corresponding object index result;
wherein the return information further comprises the obtained object index result;
the receiving a current video frame or a frame identifier of the current video frame sent by a terminal device comprises:
receiving the current video frame or the frame identifier of the current video frame sent by the terminal device when a click operation by the user on an identification button is detected;
the object position information represents the position, on the video frame picture, of an object in the current video frame; and
the return information comprises at least one object index result of the current video frame, each object index result corresponds to one control in the terminal device, the control corresponding to the at least one object index result is placed into a drawer in a display interface of the terminal device with a throwing animation effect, and the drawer, when in an open state, is located in a preset area of the display interface.
10. The method of claim 9, wherein the determining object index information and object position information corresponding to the current video frame according to the preset correspondence among video frames, object index information, and object position information comprises:
determining object index information, object position information, and a video size corresponding to the current video frame according to a preset correspondence among video frames, object index information, object position information, and video size;
wherein the return information further comprises the determined video size.
11. An information processing apparatus, applied to a terminal device, the apparatus comprising:
a sending module, used for sending a current video frame or a frame identifier of the current video frame to a server when a video is played through a display interface;
a receiving module, used for receiving return information returned by the server, wherein the return information comprises object association information and object position information of the current video frame; and
a display processing module, used for performing display processing on the display interface according to the object association information and the object position information;
wherein the display processing module is specifically used for displaying object index information on the display interface according to the object position information;
the object association information further comprises an object index result corresponding to the object index information;
the display processing module comprises:
a first display unit, used for displaying a control corresponding to the object index result in a preset area of the display interface; and
a second display unit, used for displaying the object index result in the preset area according to a first input operation when the first input operation on the control is received;
the sending module is specifically used for sending the current video frame or the frame identifier of the current video frame to the server when a click operation by the user on an identification button is detected;
the display processing module is specifically used for displaying an object mark on the display interface according to the object position information and displaying the object index information at a preset position of the object mark;
the object position information represents the position, on the video frame picture, of an object in the current video frame; the preset area is at least partially different from the display area of the object index information; the video comprises at least two video frames, and the video continues to play without pausing while the control is displayed; and
the return information comprises at least one object index result of the current video frame, each object index result corresponds to one control, the control corresponding to the at least one object index result is placed into a drawer in the display interface with a throwing animation effect, and the drawer, when in an open state, is located in the preset area of the display interface.
12. An information processing apparatus, applied to a server, the apparatus comprising:
a receiving module, used for receiving a current video frame or a frame identifier of the current video frame sent by a terminal device;
a generating module, used for generating return information, wherein the return information comprises object association information and object position information of the current video frame; and
a sending module, used for sending the return information to the terminal device;
wherein the generating module comprises:
a determining unit, used for determining object index information and object position information corresponding to the current video frame according to a preset correspondence among video frames, object index information, and object position information; and
a generating unit, used for generating the return information, wherein the return information comprises the object association information and the determined object position information, and the object association information comprises the determined object index information;
the information processing apparatus further comprises:
a searching module, used for performing an information search according to the determined object index information before the return information is generated, to obtain a corresponding object index result;
wherein the return information further comprises the obtained object index result;
the receiving module is specifically used for receiving the current video frame or the frame identifier of the current video frame sent by the terminal device when a click operation by the user on an identification button is detected;
the object position information represents the position, on the video frame picture, of an object in the current video frame; and
the return information comprises at least one object index result of the current video frame, each object index result corresponds to one control in the terminal device, the control corresponding to the at least one object index result is placed into a drawer in a display interface of the terminal device with a throwing animation effect, and the drawer, when in an open state, is located in a preset area of the display interface.
13. A terminal device, characterized by comprising a processor, a memory, a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the information processing method according to any one of claims 1 to 8.
14. A server, characterized by comprising a processor, a memory, a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the information processing method according to any one of claims 9 to 10.
15. A computer-readable storage medium, characterized in that a computer program is stored thereon, which, when being executed by a processor, implements the steps of the information processing method according to any one of claims 1 to 8, or implements the steps of the information processing method according to any one of claims 9 to 10.
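Claims 3 and 4 describe mapping an object's coordinates in the source frame to a target position on the display interface via normalization, a second (rendered) video size, and the margin size. One plausible reading of that computation, assuming the video is scaled to fit the interface while preserving its aspect ratio and then centered (the claims do not fix the scaling policy), is:

```python
def target_position(obj_xy, first_video_size, interface_size):
    """Map object position coordinates in the source video frame to a
    target position in the display interface (one reading of claims 3-4)."""
    x, y = obj_xy
    w0, h0 = first_video_size   # first video size (source frame)
    W, H = interface_size       # interface size of the display interface
    # Normalize the object position coordinates by the first video size.
    nx, ny = x / w0, y / h0
    # Second video size: the video as rendered inside the interface,
    # scaled to fit while preserving the aspect ratio (an assumption).
    scale = min(W / w0, H / h0)
    w1, h1 = w0 * scale, h0 * scale
    # Margin size: the blank border around the centered video area.
    mx, my = (W - w1) / 2, (H - h1) / 2
    # Target position from the normalized coordinates, second video size
    # and margin size.
    return (mx + nx * w1, my + ny * h1)
```

For example, an object at the center of a 1280x720 frame shown on a 1280x2340 portrait screen lands at the horizontal center of the letterboxed video area, offset vertically by the top margin.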
CN201910175790.XA 2019-03-08 2019-03-08 Information processing method and device, terminal equipment and server Active CN109947988B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910175790.XA CN109947988B (en) 2019-03-08 2019-03-08 Information processing method and device, terminal equipment and server

Publications (2)

Publication Number Publication Date
CN109947988A (en) 2019-06-28
CN109947988B (en) 2022-12-13

Family

ID=67008606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910175790.XA Active CN109947988B (en) 2019-03-08 2019-03-08 Information processing method and device, terminal equipment and server

Country Status (1)

Country Link
CN (1) CN109947988B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112822557A (en) * 2019-11-15 2021-05-18 中移物联网有限公司 Information processing method, information processing device, electronic equipment and computer readable storage medium
CN111225266B (en) 2020-02-25 2022-03-15 上海哔哩哔哩科技有限公司 User interface interaction method and system
CN111652678B (en) * 2020-05-27 2023-11-14 腾讯科技(深圳)有限公司 Method, device, terminal, server and readable storage medium for displaying article information

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013172738A1 (en) * 2012-05-15 2013-11-21 Obshestvo S Ogranichennoy Otvetstvennostyu "Sinezis" Method for video-data indexing using a map
CN105847998A (en) * 2016-03-28 2016-08-10 乐视控股(北京)有限公司 Video playing method, playing terminal, and media server

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101380777B1 (en) * 2008-08-22 2014-04-02 정태우 Method for indexing object in video
CN104105002B (en) * 2014-07-15 2018-12-21 百度在线网络技术(北京)有限公司 The methods of exhibiting and device of audio-video document
GB2528330B (en) * 2014-07-18 2021-08-04 Unifai Holdings Ltd A method of video analysis
CN106358092B (en) * 2015-07-13 2019-11-26 阿里巴巴集团控股有限公司 Information processing method and device
CN105072460B (en) * 2015-07-15 2018-08-07 中国科学技术大学先进技术研究院 A kind of information labeling and correlating method based on video content element, system and equipment
CN107515871A (en) * 2016-06-15 2017-12-26 北京陌上花科技有限公司 Searching method and device
CN106534944B (en) * 2016-11-30 2020-01-14 北京字节跳动网络技术有限公司 Video display method and device
CN107679156A (en) * 2017-09-27 2018-02-09 努比亚技术有限公司 A kind of video image identification method and terminal, readable storage medium storing program for executing
CN108255922A (en) * 2017-11-06 2018-07-06 优视科技有限公司 Video frequency identifying method, equipment, client terminal device, electronic equipment and server
CN108040280A (en) * 2017-12-08 2018-05-15 北京小米移动软件有限公司 Content item display methods and device, storage medium
CN108156508B (en) * 2017-12-28 2020-10-20 北京安云世纪科技有限公司 Barrage information processing method and device, mobile terminal, server and system
CN108491419A (en) * 2018-02-06 2018-09-04 北京奇虎科技有限公司 It is a kind of to realize the method and apparatus recommended based on video
CN108446385A (en) * 2018-03-21 2018-08-24 百度在线网络技术(北京)有限公司 Method and apparatus for generating information
CN108596095A (en) * 2018-04-24 2018-09-28 维沃移动通信有限公司 A kind of information processing method and mobile terminal
CN109218750B (en) * 2018-10-30 2022-01-04 百度在线网络技术(北京)有限公司 Video content retrieval method, device, storage medium and terminal equipment




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant