CN113191184A - Real-time video processing method and device, electronic equipment and storage medium - Google Patents

Real-time video processing method and device, electronic equipment and storage medium

Info

Publication number
CN113191184A
Authority
CN
China
Prior art keywords
real-time
gesture information
acquiring
image
preset background
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110231205.0A
Other languages
Chinese (zh)
Inventor
陈海波
权甲
潘志锐
赵昕
李珂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenlan Industrial Intelligent Innovation Research Institute Ningbo Co ltd
Original Assignee
Deep Blue Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Deep Blue Technology Shanghai Co Ltd filed Critical Deep Blue Technology Shanghai Co Ltd
Priority to CN202110231205.0A
Publication of CN113191184A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/28 - Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 - Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items, of sport video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application provides a real-time video processing method and apparatus, an electronic device, and a computer-readable storage medium. The method includes: acquiring real-time video information and a preset background, where the real-time video information is obtained by a camera shooting a person; for each real-time image frame of the real-time video information, acquiring a person contour region from the real-time image and acquiring gesture information from the real-time image; acquiring the partial image corresponding to the person contour region from the real-time image and recording it as the person part; displaying the person part in the preset background; and, based on the gesture information, either generating a control instruction or performing no operation, where the control instruction controls at least one of: playing the preset background; moving the person part within the preset background; and zooming the person part within the preset background. The method enables interaction between the person and the presentation content.

Description

Real-time video processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a real-time video processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
At present, the mouse is one of the main interfaces for interacting with a computer, but this mode of interaction no longer meets users' needs; a simpler, more intelligent and more natural mode of human-computer interaction is desired.
Gestures are a natural, intuitive and easy-to-learn means of human-computer interaction. Gesture recognition is an important research topic in human-computer interaction: the human hand serves as the direct input, no intermediate medium is needed between human and machine, and a user can define a set of suitable gestures to control a computer, thereby realizing contact-free mouse-style control. Compared with other input methods, gestures are natural, simple, rich and direct.
Existing video processing methods simulate mouse operation based on gesture recognition. Their disadvantage is that the user still has to operate a virtual mouse with gestures that mimic operating a physical mouse, so the user experience is poor.
In a conference or live-broadcast scene, because the live audience is far from the speaker or the anchor, the prior art usually simply projects the image of the speaker or anchor onto one screen and displays the presentation content on another. Since the speaker and the presentation content are shown on different display devices, the audience can hardly pay attention to both at the same time, and the viewing experience is poor. In addition, the speaker or anchor cannot interact directly with the presentation content: to control its playback, the speaker must use a mouse or a laser pointer, which is inconvenient.
Disclosure of Invention
The purpose of the present application is to provide a real-time video processing method and apparatus, an electronic device, and a computer-readable storage medium that have a wide range of application, can display the person part within a preset background, realize interaction between the person and the presentation content, and improve the audience's experience.
The purpose of the application is achieved by the following technical solutions:
in a first aspect, the present application provides a real-time video processing method, including: acquiring real-time video information and a preset background, where the real-time video information is obtained by a camera shooting a person; for each real-time image frame of the real-time video information, acquiring a person contour region from the real-time image and acquiring gesture information from the real-time image; acquiring the partial image corresponding to the person contour region from the real-time image and recording it as the person part; displaying the person part in the preset background; and, based on the gesture information, either generating a control instruction or performing no operation, where the control instruction controls at least one of: playing the preset background; moving the person part within the preset background; and zooming the person part within the preset background.
This technical solution has the following advantages. On the one hand, a real-time image can be obtained from the real-time video information, the person part in the real-time image can be extracted, and the person part can be displayed within a preset background. The preset background may be a picture, a slide deck (PPT), a video and so on, so the method can be applied to scenes such as large conference venues, product launches and live-broadcast rooms, giving it a wide range of application. The image of a reporter, speaker, anchor or lecturer can be extracted, embedded into the presentation content, and projected onto a screen in real time, so that the live audience can pay attention to the screen content and the person at the same time, achieving a better audio-visual effect. On the other hand, gesture information can be obtained from the real-time image and a control instruction generated from it, realizing interaction between the person and the presentation content. The control instruction may control playback of the preset background, so no device such as a mouse is needed and operation is more direct and convenient, improving the experience of the person and of the audience. The control instruction may also control the size and/or position of the person part within the preset background: the person part can be moved to a designated area that needs highlighting, and can be shrunk or enlarged in real time, avoiding the situation where an oversized person part blocks important content in the preset background or an undersized person part cannot be seen clearly by the audience. The degree of intelligence is therefore high.
In some optional embodiments, acquiring the gesture information from the real-time image includes: inputting the real-time image into a target detection model to obtain the gesture information; or inputting the person part into the target detection model to obtain the gesture information. The advantage of this technical solution is that the real-time image can be input directly into the target detection model to obtain the gesture information; alternatively, the person part can first be matted out of the real-time image and then input into the target detection model. Compared with inputting the whole real-time image, obtaining the gesture information from the matted person part involves less data and is therefore more computationally efficient.
In some optional embodiments, the gesture information includes left-hand gesture information and right-hand gesture information, and the control instructions include a first type and a second type, where the first type controls the playback of the preset background and the second type controls the moving and zooming of the person part within the preset background. Generating a control instruction based on the gesture information includes: generating the first type of control instruction from the left-hand gesture information and the second type from the right-hand gesture information; or generating the first type from the right-hand gesture information and the second type from the left-hand gesture information. The advantage of this technical solution is that the preset background and the person part can be controlled with the left and right hands separately. When the left hand is bound to the first type and the right hand to the second type, left-hand gestures control the playback of the preset background and right-hand gestures control the size and/or position of the person part; when the bindings are swapped, the roles are swapped accordingly. The operator can therefore assign the control options of the left and right hands according to personal habit.
In some optional embodiments, generating a control instruction based on the gesture information includes: receiving a user's customization operation; and determining the correspondence between gesture information and control instructions based on that customization operation, such that each type of gesture information corresponds to at most one type of control instruction. The advantage of this technical solution is that the correspondence between gestures and control instructions is determined by the user's own operation: the user can bind a chosen gesture to a chosen control instruction according to personal habit, which greatly improves the user experience; the degree of intelligence is high.
In some optional embodiments, the method further includes: performing stylization processing on the real-time image to obtain a stylized image. Acquiring the partial image corresponding to the person contour region then includes: acquiring the partial image corresponding to the person contour region from the stylized image as the person part. The advantage of this technical solution is that the real-time image can be stylized and the person part taken from the stylized image. When a user gives a presentation with this method, his or her own image can be stylized in real time; the style may be cartoon, sketch, oil painting and so on, and a style suited to the presentation content can be selected, further improving the visual effect and making the presentation more vivid.
In some optional embodiments, the following are performed simultaneously for each real-time image frame of the real-time video information: acquiring the person contour region from the real-time image; acquiring the gesture information from the real-time image; and performing stylization processing on the real-time image to obtain the stylized image. The advantage of this technical solution is that the three steps of acquiring the person part, acquiring the gesture information and performing the stylization can be carried out in parallel, which improves the efficiency of video processing.
In some optional embodiments, each piece of gesture information indicates one of the following gesture types: a non-control gesture, a fist, a V-sign, an OK sign, index finger up, index finger down, index finger left, index finger right, thumb and index finger spread apart, and thumb and index finger pinched together. The preset background includes at least one of: video material, picture material, document material, and the real-time image itself. The advantage of this technical solution is, on the one hand, that each piece of gesture information can correspond to any gesture type, and a gesture type may be the action of a single finger or the combined action of several fingers, which is convenient for the user, lowers the computer skills required, and improves the user experience; on the other hand, the preset background may include at least one of video material, picture material, document material (for example a PPT or WORD document) and the real-time image itself, so the user can select pictures, slides, videos and other material as the preset background, or directly use the real-time image itself as the preset background.
In a second aspect, the present application provides a real-time video processing apparatus, including: a data acquisition module configured to acquire real-time video information and a preset background, where the real-time video information is obtained by a camera shooting a person; an image processing module configured to acquire, for each real-time image frame of the real-time video information, a person contour region from the real-time image and gesture information from the real-time image; a person acquisition module configured to acquire the partial image corresponding to the person contour region from the real-time image and record it as the person part; a display control module configured to display the person part in the preset background; and an interaction control module configured to generate a control instruction based on the gesture information, or to perform no operation, where the control instruction controls at least one of: playing the preset background; moving the person part within the preset background; and zooming the person part within the preset background.
In some optional embodiments, the image processing module is configured to: input the real-time image into a target detection model to obtain the gesture information; or input the person part into the target detection model to obtain the gesture information.
In some optional embodiments, the gesture information includes left-hand gesture information and right-hand gesture information, and the control instructions include a first type and a second type, where the first type controls the playback of the preset background and the second type controls the moving and zooming of the person part within the preset background. The interaction control module is configured to: generate the first type of control instruction from the left-hand gesture information and the second type from the right-hand gesture information; or generate the first type from the right-hand gesture information and the second type from the left-hand gesture information.
In some optional embodiments, the interaction control module includes: an operation receiving unit configured to receive a user's customization operation; and a correspondence unit configured to determine the correspondence between gesture information and control instructions based on the customization operation, such that each type of gesture information corresponds to at most one type of control instruction.
In some optional embodiments, the image processing module is further configured to perform stylization processing on the real-time image to obtain a stylized image, and the person acquisition module is configured to acquire the partial image corresponding to the person contour region from the stylized image as the person part.
In some optional embodiments, the image processing module is configured to perform the following simultaneously for each real-time image frame of the real-time video information: acquiring the person contour region from the real-time image; acquiring the gesture information from the real-time image; and performing stylization processing on the real-time image to obtain the stylized image.
In some optional embodiments, each piece of gesture information indicates one of the following gesture types: a non-control gesture, a fist, a V-sign, an OK sign, index finger up, index finger down, index finger left, index finger right, thumb and index finger spread apart, and thumb and index finger pinched together. The preset background includes at least one of: video material, picture material, document material, and the real-time image itself.
In a third aspect, the present application provides an electronic device including a memory and a processor, where the memory stores a computer program and the processor, when executing the computer program, implements the steps of any of the methods described above.
In a fourth aspect, the present application provides a computer-readable storage medium on which a computer program is stored which, when executed by a processor, implements the steps of any of the methods described above.
Drawings
The present application is further described below with reference to the drawings and examples.
Fig. 1 is a schematic flowchart of a real-time video processing method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart illustrating a process for generating control commands according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of another real-time video processing method provided in the embodiment of the present application;
fig. 4 is a schematic structural diagram of a real-time video processing apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an interactive control module according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a program product for implementing a real-time video processing method according to an embodiment of the present application.
Detailed Description
The present application is further described below with reference to the accompanying drawings and the detailed description. It should be noted that, provided there is no conflict, the embodiments and technical features described below may be combined arbitrarily to form new embodiments.
Referring to fig. 1, an embodiment of the present application provides a real-time video processing method, which includes steps S101 to S105.
Step S101: acquire real-time video information and a preset background, where the real-time video information is obtained by a camera shooting a person.
Step S102: for each real-time image frame of the real-time video information, acquire a person contour region from the real-time image, and acquire gesture information from the real-time image.
In a specific embodiment, acquiring the gesture information from the real-time image in step S102 may include:
inputting the real-time image into a target detection model to obtain the gesture information; or
inputting the person part into the target detection model to obtain the gesture information.
In this way, the real-time image can be input directly into the target detection model to obtain the gesture information; alternatively, the person part can first be matted out of the real-time image and then input into the target detection model. Compared with inputting the whole real-time image, obtaining the gesture information from the matted person part involves less data and is therefore more computationally efficient.
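The second option, running detection on a crop around the matted person rather than on the full frame, can be sketched as follows. Only the mask-to-crop logic is concrete here; the target detection model itself is assumed, since the application does not name one, and all function names are illustrative:

```python
import numpy as np

def person_bbox(mask: np.ndarray, margin: int = 8) -> tuple:
    """Bounding box (x0, y0, x1, y1) of the non-zero person mask, padded by a margin."""
    ys, xs = np.nonzero(mask)
    h, w = mask.shape
    return (max(xs.min() - margin, 0), max(ys.min() - margin, 0),
            min(xs.max() + margin, w), min(ys.max() + margin, h))

def gesture_input(frame: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Crop the frame to the person region so the detection model sees fewer pixels."""
    x0, y0, x1, y1 = person_bbox(mask)
    return frame[y0:y1, x0:x1]
```

The crop would then be passed to whatever detection model the embodiment uses; feeding it the smaller array is what reduces the computation mentioned above.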
Step S103: acquire the partial image corresponding to the person contour region from the real-time image and record it as the person part.
Step S104: display the person part in the preset background.
Step S105: based on the gesture information, either generate a control instruction or perform no operation, where the control instruction controls at least one of: playing the preset background; moving the person part within the preset background; and zooming the person part within the preset background.
In this way, on the one hand, a real-time image is obtained from the real-time video information, the person part in the real-time image is extracted, and the person part is displayed within the preset background. The preset background can be a picture, a slide deck (PPT), a video and so on, so the method can be applied to scenes such as large conference venues, product launches and live-broadcast rooms, giving it a wide range of application; the image of a reporter, speaker, anchor or lecturer is extracted, embedded into the presentation content, and projected onto a screen in real time, so that the live audience can pay attention to the screen content and the person at the same time, achieving a better audio-visual effect. On the other hand, gesture information is obtained from the real-time image and a control instruction is generated from it, realizing interaction between the person and the presentation content: the control instruction can control playback of the preset background without a mouse or similar device, making operation more direct and convenient, and it can also control the size and/or position of the person part within the preset background, so the person part can be moved to an area that needs highlighting and shrunk or enlarged in real time, avoiding the situation where an oversized person part blocks important content or an undersized person part cannot be seen clearly. The degree of intelligence is high.
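The display step S104, pasting the matted person onto the preset background, might look like the following minimal sketch (NumPy only). The segmentation model producing `mask` is assumed, and the offset handling is illustrative; step S105's move operation would change `top_left`, and zooming would resize `person` and `mask` before compositing:

```python
import numpy as np

def composite(background: np.ndarray, person: np.ndarray,
              mask: np.ndarray, top_left=(0, 0)) -> np.ndarray:
    """Paste the matted person onto a copy of the preset background.

    `person` and `mask` share the same height/width; `top_left` is where the
    person region lands inside the background. Assumes the person region fits
    inside the background at this offset.
    """
    out = background.copy()
    y, x = top_left
    h, w = mask.shape
    region = out[y:y + h, x:x + w]          # view into the output frame
    region[mask > 0] = person[mask > 0]     # copy only the contour pixels
    return out
```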
In a specific embodiment, each piece of gesture information may indicate one of the following gesture types: a non-control gesture, a fist, a V-sign, an OK sign, index finger up, index finger down, index finger left, index finger right, thumb and index finger spread apart, and thumb and index finger pinched together; the preset background may include at least one of: video material, picture material, document material, and the real-time image itself. A non-control gesture is any gesture other than the listed control gestures (fist, V-sign, OK sign, index finger up/down/left/right, thumb and index finger spread apart, thumb and index finger pinched together); when the gesture information indicates a non-control gesture, no operation is performed based on it.
In this way, on the one hand, each piece of gesture information can correspond to any gesture type, and a gesture type may be the action of a single finger or the combined action of several fingers, which is convenient for the user, lowers the computer skills required, and improves the user experience; on the other hand, the preset background may include at least one of video material, picture material, document material and the real-time image itself, so the user can select pictures, slides, videos and other material as the preset background, or directly use the real-time image itself.
In a specific embodiment, the gesture information may include left-hand gesture information and right-hand gesture information, and the control instructions may include a first type and a second type, where the first type controls the playback of the preset background and the second type controls the moving and zooming of the person part within the preset background.
Generating a control instruction based on the gesture information in step S105 may include:
generating the first type of control instruction from the left-hand gesture information and the second type from the right-hand gesture information; or
generating the first type of control instruction from the right-hand gesture information and the second type from the left-hand gesture information.
In this way, the preset background and the person part can be controlled with the left and right hands separately. When the left hand is bound to the first type and the right hand to the second type, left-hand gestures control playback of the preset background and right-hand gestures control the size and/or position of the person part; when the bindings are swapped, the roles are swapped accordingly. The operator can assign the control options of the left and right hands according to personal habit.
In a specific embodiment, the right-hand gesture information may be bound by default to the first type of control instruction and the left-hand gesture information to the second type; the user may accept this default or change it, setting the correspondence between hands and control instructions according to personal habit.
For example, in one application the preset background is a PPT slide deck: the right-hand gesture information corresponds to the first type of control instruction and controls the playback of the PPT, while the left-hand gesture information corresponds to the second type and controls the size and/or position of the person part within the PPT.
When the right hand makes a V-sign, the PPT turns to the next page; when the right hand makes an OK sign, the PPT turns back one page; when the right hand makes a fist, the PPT is reset; and the pointing direction of the right index finger can simulate the direction of a mouse-drawn pointer line.
When the left thumb and index finger are spread apart, the person part is enlarged within the PPT; when they are pinched together, the person part is shrunk; and the pointing direction of the left index finger can indicate the direction in which the person part moves within the PPT.
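The example bindings above can be captured in a small dispatch table. The gesture and instruction names below are illustrative assumptions, not identifiers taken from the application:

```python
# Hypothetical names; the application fixes the behaviour, not an API.
RIGHT_HAND = {                 # first type: controls PPT playback
    "v_sign": "next_page",
    "ok_sign": "previous_page",
    "fist": "reset_deck",
}
LEFT_HAND = {                  # second type: controls the person part
    "thumb_index_spread": "zoom_in_person",
    "thumb_index_pinch": "zoom_out_person",
}

def to_instruction(hand: str, gesture: str):
    """Map (hand, gesture) to a control instruction.

    Returns None for unlisted (non-control) gestures, matching step S105's
    option of performing no operation."""
    table = RIGHT_HAND if hand == "right" else LEFT_HAND
    return table.get(gesture)
```

Swapping the two dictionaries is exactly the alternative binding described above, where the left hand controls playback instead.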
Referring to fig. 2, in a specific embodiment, generating a control instruction based on the gesture information in step S105 may include steps S201 and S202.
Step S201: receiving a user-defined operation from a user.
Step S202: determining, based on the user-defined operation, the correspondence between the gesture information and the control instructions, such that each type of gesture information corresponds to at most one type of control instruction.
In this way, the correspondence between gesture information and control instructions is determined by the user's own operation: the user can bind a designated gesture to a designated control instruction according to his or her habits, which greatly improves the user experience and raises the degree of intelligence.
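Steps S201-S202 amount to building a mapping from gestures to instructions while enforcing the at-most-one constraint. A minimal sketch, with all names assumed for illustration:

```python
# Minimal sketch of steps S201-S202: accept a user-defined list of
# (gesture, instruction) pairs and enforce that each gesture type
# corresponds to at most one control instruction.

def build_mapping(custom_pairs):
    """custom_pairs: iterable of (gesture, instruction) from the user's custom operation."""
    mapping = {}
    for gesture, instruction in custom_pairs:
        # Rebinding a gesture to a different instruction violates the constraint.
        if gesture in mapping and mapping[gesture] != instruction:
            raise ValueError(f"gesture {gesture!r} is already bound to {mapping[gesture]!r}")
        mapping[gesture] = instruction
    return mapping

mapping = build_mapping([("fist", "reset"), ("ok_sign", "previous_page")])
print(mapping["fist"])  # reset
```

A gesture left out of the mapping simply triggers no operation, matching the "or not executing any operation" branch of the method.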
Referring to fig. 3, in a specific embodiment, the method may further include step S106.
Step S106: performing stylization processing on the real-time image to obtain a stylized image.
In this case, step S103 may include: acquiring, from the stylized image, the partial image corresponding to the character outline region as the character part.
In this way, the real-time image is stylized to obtain the stylized image, and the character part is taken from the stylized image. When a user gives a speech using this method, his or her own image can be stylized in real time; the style may be animation, sketch, oil painting and the like, and a style suited to the speech content and the user's image can be selected. This further improves the visual effect presented, makes the speech more vivid, and gives the audience a better experience.
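Taking the character part from the stylized image and displaying it in the preset background reduces to a masked copy. The sketch below uses NumPy only and assumes the boolean contour mask comes from an upstream segmentation step; the function name and placement parameters are illustrative.

```python
# Sketch of compositing the character part (masked pixels of the stylized
# frame) into the preset background. The mask is assumed to be True inside
# the character outline region.

import numpy as np

def composite(stylized_frame, mask, background, top=0, left=0):
    """Paste masked pixels of the stylized frame onto a copy of the background."""
    out = background.copy()
    h, w = stylized_frame.shape[:2]
    region = out[top:top + h, left:left + w]   # view into the output image
    region[mask] = stylized_frame[mask]        # copy only the character pixels
    return out

frame = np.full((2, 2, 3), 255, dtype=np.uint8)   # stylized image (all white)
mask = np.array([[True, False], [False, True]])   # character outline region
bg = np.zeros((4, 4, 3), dtype=np.uint8)          # preset background (black)
result = composite(frame, mask, bg, top=1, left=1)
print(result[1, 1].tolist())  # [255, 255, 255]
```

Changing `top`/`left` moves the character part within the background, and resizing `stylized_frame` and `mask` together zooms it, which corresponds to the move and zoom operations of the second type of control instruction.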
In a specific embodiment, the following processing may be performed simultaneously for each frame of real-time image of the real-time video information: acquiring the figure outline area from the real-time image; acquiring the gesture information according to the real-time image; and carrying out stylization processing on the real-time image to obtain the stylized image.
In this way, the three steps of acquiring the character part, acquiring the gesture information and performing the stylization processing are carried out simultaneously, which improves the efficiency of video processing.
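The simultaneous per-frame processing can be sketched with a thread pool: the three workers below are stand-ins for real contour extraction, gesture detection, and stylization, and all names are assumed for illustration.

```python
# Sketch of running the three per-frame steps concurrently, as described above.
# The worker bodies are placeholders for the real image-processing steps.

from concurrent.futures import ThreadPoolExecutor

def get_contour(frame):  return f"contour({frame})"
def get_gesture(frame):  return f"gesture({frame})"
def stylize(frame):      return f"stylized({frame})"

def process_frame(frame):
    """Submit the three independent steps for one frame and gather the results."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        contour = pool.submit(get_contour, frame)
        gesture = pool.submit(get_gesture, frame)
        styled = pool.submit(stylize, frame)
        return contour.result(), gesture.result(), styled.result()

print(process_frame("f0"))  # ('contour(f0)', 'gesture(f0)', 'stylized(f0)')
```

Since the three steps read the same frame and do not depend on each other's output, they parallelize cleanly; in a real pipeline, GPU-bound stylization would typically run in its own worker while the lighter detection steps share another.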
The embodiment of the application also provides a real-time video processing method, which comprises the following steps: acquiring real-time video information and a preset background, wherein the real-time video information is obtained by shooting a person by a camera; acquiring a figure outline region from each frame of real-time image of the real-time video information, acquiring gesture information according to the real-time image, and performing stylization processing on the real-time image to obtain a stylized image; acquiring partial images corresponding to the figure outline areas from the stylized images and recording the partial images as figure parts; displaying the character part in the preset background; generating a control instruction based on the gesture information or not executing any operation, wherein the control instruction is used for controlling at least one of the following operations: playing the preset background; moving the character part in the preset background; and zooming the human figure part in the preset background.
Referring to fig. 4, an embodiment of the present application further provides a real-time video processing apparatus. Its specific implementation is consistent with the implementations and technical effects described in the embodiments of the real-time video processing method, and repeated details are omitted.
The real-time video processing apparatus includes: the data acquisition module 101 is configured to acquire real-time video information and a preset background, where the real-time video information is obtained by shooting a person with a camera; the image processing module 102 is configured to, for each frame of real-time image of the real-time video information, obtain a person contour region from the real-time image, and obtain gesture information according to the real-time image; a person obtaining module 103, configured to obtain, from the real-time image, a partial image corresponding to the person outline area, and mark the partial image as a person part; a display control module 104, configured to display the character part in the preset background; an interaction control module 105, configured to generate a control instruction based on the gesture information, where the control instruction is used to control at least one of the following operations: playing the preset background; moving the character part in the preset background; and zooming the human figure part in the preset background.
In a specific embodiment, the image processing module 102 may be configured to: inputting the real-time image into a target detection model to obtain the gesture information; or inputting the character part into the target detection model to obtain the gesture information.
In a specific embodiment, the gesture information may include left-hand gesture information and right-hand gesture information, the control instruction may include a first type of control instruction and a second type of control instruction, the first type of control instruction may be used to control a playing operation of the preset background, and the second type of control instruction may be used to control a moving operation and a zooming operation of the character part in the preset background; the interaction control module 105 may be configured to: generating the first type of control instruction based on the left-hand gesture information, and generating the second type of control instruction based on the right-hand gesture information; or generating the first type of control instruction based on the right-hand gesture information, and generating the second type of control instruction based on the left-hand gesture information.
Referring to fig. 5, in a specific embodiment, the interaction control module 105 may include: an operation receiving unit 1051, which can be used for receiving a user-defined operation; a corresponding relation unit 1052, configured to determine a corresponding relation between the gesture information and the control instruction based on the custom operation, and correspond each gesture information to at most one control instruction.
In a specific embodiment, the image processing module 102 may be further configured to perform stylization processing on the real-time image to obtain a stylized image; the person obtaining module 103 may be configured to obtain a partial image corresponding to the person outline region from the stylized image as the person part.
In a specific embodiment, the image processing module 102 may be configured to perform the following processing for each frame of the real-time video information: acquiring the figure outline area from the real-time image; acquiring the gesture information according to the real-time image; and carrying out stylization processing on the real-time image to obtain the stylized image.
In a particular embodiment, each piece of gesture information may be used to indicate at most one of the following gesture types: fist, V-sign, OK sign, index finger up, index finger down, index finger left, index finger right, thumb and index finger spread open, and thumb and index finger brought together; the preset background may include at least one of: video material, picture material, document material and the real-time image itself.
Referring to fig. 6, an embodiment of the present application further provides an electronic device 200, where the electronic device 200 includes at least one memory 210, at least one processor 220, and a bus 230 connecting different platform systems.
The memory 210 may include readable media in the form of volatile memory, such as a random access memory (RAM) 211 and/or a cache memory 212, and may further include a read-only memory (ROM) 213.
The memory 210 further stores a computer program executable by the processor 220, causing the processor 220 to execute the steps of the real-time video processing method in the embodiments of the present application. The specific implementation is consistent with the implementations and technical effects described in the embodiments of the method, and repeated details are omitted.
Memory 210 may also include a utility 214 having at least one program module 215, such program modules 215 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Accordingly, the processor 220 may execute the computer programs described above, and may execute the utility 214.
Bus 230 may represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, or a processor or local bus using any of a variety of bus architectures.
The electronic device 200 may also communicate with one or more external devices 240, such as a keyboard, pointing device, bluetooth device, etc., and may also communicate with one or more devices capable of interacting with the electronic device 200, and/or with any devices (e.g., routers, modems, etc.) that enable the electronic device 200 to communicate with one or more other computing devices. Such communication may be through input-output interface 250. Also, the electronic device 200 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 260. The network adapter 260 may communicate with other modules of the electronic device 200 via the bus 230. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 200, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms, to name a few.
The embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium is used for storing a computer program, and when the computer program is executed, the steps of the real-time video processing method in the embodiment of the present application are implemented, and a specific implementation manner of the method is consistent with the implementation manner and the achieved technical effect described in the embodiment of the real-time video processing method, and some contents are not described again.
Fig. 7 shows a program product 300 provided by the present embodiment for implementing the above real-time video processing method, which may employ a portable compact disc read-only memory (CD-ROM) including program code and may be run on a terminal device such as a personal computer. However, the program product 300 of the present invention is not so limited; in this application, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. Program product 300 may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the C language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
While the present application is described in terms of various aspects, including exemplary embodiments, the principles of the invention should not be limited to the disclosed embodiments, but are also intended to cover various modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for real-time video processing, the method comprising:
acquiring real-time video information and a preset background, wherein the real-time video information is obtained by shooting a person by a camera;
aiming at each frame of real-time image of the real-time video information, acquiring a figure outline region from the real-time image, and acquiring gesture information according to the real-time image;
acquiring a partial image corresponding to the figure outline region from the real-time image, and recording the partial image as a figure part;
displaying the character part in the preset background;
generating a control instruction based on the gesture information or not executing any operation, wherein the control instruction is used for controlling at least one of the following operations: playing the preset background; moving the character part in the preset background; and zooming the human figure part in the preset background.
2. The method for real-time video processing according to claim 1, wherein the acquiring gesture information according to the real-time image comprises:
inputting the real-time image into a target detection model to obtain the gesture information; or,
and inputting the character part into the target detection model to obtain the gesture information.
3. The real-time video processing method according to claim 1, wherein the gesture information includes left-hand gesture information and right-hand gesture information, the control command includes a first type of control command and a second type of control command, the first type of control command is used for controlling a playing operation of the preset background, and the second type of control command is used for controlling a moving operation and a zooming operation of the character part in the preset background;
the generating of the control instruction based on the gesture information comprises:
generating the first type of control instruction based on the left-hand gesture information, and generating the second type of control instruction based on the right-hand gesture information; or,
and generating the first type of control instruction based on the right-hand gesture information, and generating the second type of control instruction based on the left-hand gesture information.
4. The method of claim 1, wherein generating a control command based on the gesture information comprises:
receiving a user-defined operation from a user;
and determining, based on the user-defined operation, a correspondence between the gesture information and the control instructions, such that each type of gesture information corresponds to at most one type of control instruction.
5. The real-time video processing method according to claim 1, wherein the method further comprises:
performing stylization processing on the real-time image to obtain a stylized image;
the acquiring of the partial image corresponding to the figure outline region from the real-time image and recording it as the figure part comprises:
and acquiring a partial image corresponding to the human figure outline area from the stylized image as the human figure part.
6. The real-time video processing method according to claim 5, wherein the following processing is performed simultaneously for each frame of real-time image of the real-time video information:
acquiring the figure outline area from the real-time image;
acquiring the gesture information according to the real-time image;
and carrying out stylization processing on the real-time image to obtain the stylized image.
7. The real-time video processing method according to claim 1, wherein each piece of gesture information is used for indicating any one of the following gesture types: a non-control gesture, fist, V-sign, OK sign, index finger up, index finger down, index finger left, index finger right, thumb and index finger spread open, and thumb and index finger brought together;
the preset background comprises at least one of the following: video material, picture material, document material and the real-time image itself.
8. A real-time video processing apparatus, characterized in that the real-time video processing apparatus comprises:
the data acquisition module is used for acquiring real-time video information and a preset background, wherein the real-time video information is obtained by shooting a person by a camera;
the image processing module is used for acquiring a figure outline region from each frame of real-time image of the real-time video information and acquiring gesture information according to the real-time image;
the figure acquisition module is used for acquiring partial images corresponding to the figure outline area from the real-time image and recording the partial images as figure parts;
the display control module is used for displaying the character part in the preset background;
the interaction control module is used for generating a control instruction based on the gesture information or not executing any operation, and the control instruction is used for controlling at least one of the following operations: playing the preset background; moving the character part in the preset background; and zooming the human figure part in the preset background.
9. An electronic device, characterized in that the electronic device comprises a memory storing a computer program and a processor, wherein the processor implements the steps of the method according to any one of claims 1-7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202110231205.0A 2021-03-02 2021-03-02 Real-time video processing method and device, electronic equipment and storage medium Pending CN113191184A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110231205.0A CN113191184A (en) 2021-03-02 2021-03-02 Real-time video processing method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN113191184A true CN113191184A (en) 2021-07-30

Family

ID=76973060

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110231205.0A Pending CN113191184A (en) 2021-03-02 2021-03-02 Real-time video processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113191184A (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101489150A (en) * 2009-01-20 2009-07-22 北京航空航天大学 Virtual and reality mixed remote collaboration working method
CN106254784A (en) * 2016-09-29 2016-12-21 宇龙计算机通信科技(深圳)有限公司 A kind of method and device of Video processing
CN106502402A (en) * 2016-10-25 2017-03-15 四川农业大学 A kind of Three-Dimensional Dynamic Scene Teaching system and method
CN107741781A (en) * 2017-09-01 2018-02-27 中国科学院深圳先进技术研究院 Flight control method, device, unmanned plane and the storage medium of unmanned plane
CN108874136A (en) * 2018-06-13 2018-11-23 北京百度网讯科技有限公司 Dynamic image generation method, device, terminal and storage medium
CN110072067A (en) * 2019-05-07 2019-07-30 北京市华风声像技术中心 The TV and film production and sending method, system and equipment of interactive operation
CN110164440A (en) * 2019-06-03 2019-08-23 清华大学 Electronic equipment, method and medium are waken up based on the interactive voice for sealing mouth action recognition
CN111292337A (en) * 2020-01-21 2020-06-16 广州虎牙科技有限公司 Image background replacing method, device, equipment and storage medium
CN111596757A (en) * 2020-04-02 2020-08-28 林宗宇 Gesture control method and device based on fingertip interaction
CN111954024A (en) * 2020-08-27 2020-11-17 顾建亮 Course recording live broadcasting method and system


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114051172A (en) * 2022-01-11 2022-02-15 阿里巴巴达摩院(杭州)科技有限公司 Live broadcast interaction method and device, electronic equipment and computer program product
CN114051172B (en) * 2022-01-11 2024-03-22 杭州阿里云飞天信息技术有限公司 Live broadcast interaction method, live broadcast interaction device, electronic equipment and computer program product
CN114613215A (en) * 2022-03-09 2022-06-10 云学堂信息科技(江苏)有限公司 Online interactive system for lecturer video teaching
CN114786032A (en) * 2022-06-17 2022-07-22 深圳市必提教育科技有限公司 Training video management method and system
CN114786032B (en) * 2022-06-17 2022-08-23 深圳市必提教育科技有限公司 Training video management method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220104

Address after: 315000 No. 138-1, Zhongshan West Road, Fenghua District, Ningbo City, Zhejiang Province (self declaration)

Applicant after: Shenlan industrial intelligent Innovation Research Institute (Ningbo) Co.,Ltd.

Address before: Unit 1001, 369 Weining Road, Changning District, Shanghai, 200336 (9th floor of actual floor)

Applicant before: DEEPBLUE TECHNOLOGY (SHANGHAI) Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20210730