CN113535040A - Image processing method, image processing device, electronic equipment and storage medium


Info

Publication number: CN113535040A
Application number: CN202010291730.7A
Authority: CN (China)
Prior art keywords: magic expression, voice, information, control
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: Zhao Wei (赵伟), Wang Cong (王聪)
Original and current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date and filing date: 2020-04-14
Priority to: CN202010291730.7A
Publication of: CN113535040A

Classifications

    • G06F40/30: Handling natural language data; semantic analysis
    • G06F3/04845: Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, for image manipulation, e.g. dragging, rotation, expansion or change of colour
    • G06F3/04847: Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
    • G06T3/04
    • G10L15/22: Speech recognition; procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L17/22: Speaker identification or verification; interactive procedures; man-machine interfaces

Abstract

The application discloses an image processing method, an image processing apparatus, an electronic device, and a storage medium, which address the waste of display and processing resources caused by magic expressions that are inconvenient to control and require complex control operations. The method comprises the following steps: in response to a selection instruction for adding a specified magic expression, collecting and parsing voice information if the specified magic expression supports voice control, the specified magic expression being a special effect for adding image elements to an image; if the voice information is parsed to contain parameter control information, adjusting the parameters of the specified magic expression according to the parameter control information to obtain the adjusted magic expression; and performing image processing on the acquired image according to the adjusted magic expression. The application thus provides voice control of image special effects. The voice control operation is simple and convenient, the user does not need to switch back and forth to the control interface of the magic expression, and the processing resources consumed by displaying and switching the control interface are saved.

Description

Image processing method, image processing device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, an electronic device, and a storage medium.
Background
Terminals having a photographing function enable people to photograph images or record videos anytime and anywhere to record life or create works.
Most current shooting applications support magic expression functions with special effects, and some magic expressions support adjustment. For example, the position and size of the picture of some magic expressions can be changed. For such adjustable magic expressions, a corresponding adjustment page can be displayed on the shooting page, in which the user can manually adjust the shooting effect of the magic expression as needed.
However, the inventors found that most users hold the shooting terminal with one hand while shooting. A magic expression adjusted in advance may no longer meet the user's needs once external factors such as lighting change as the device moves and alter the display effect on the shooting picture; adjusting the magic expression while holding the terminal with one hand is inconvenient; and switching back and forth between the adjustment interface of the magic expression and the viewfinder consumes display and processing resources.
Disclosure of Invention
The application aims to provide an image processing method, an image processing apparatus, an electronic device, and a storage medium that solve the following problems: a magic expression adjusted in advance by the user may no longer produce the desired effect once external factors such as lighting change as the device moves and alter the display effect on the shooting picture; adjusting the magic expression is inconvenient when the terminal is held with one hand; and switching back and forth between the adjustment interface of the magic expression and the viewfinder consumes display and processing resources.
In a first aspect, an embodiment of the present application provides an image processing method, including:
responding to a selection instruction for adding a specified magic expression, and collecting and analyzing voice information if the specified magic expression supports voice control; the specified magic expression is a special effect used for adding image elements on the image;
if the voice information contains parameter control information, adjusting the parameters of the designated magic expression according to the parameter control information to obtain the adjusted magic expression;
and carrying out image processing on the acquired image according to the adjusted magic expression.
In one embodiment, prior to collecting and parsing the voice information, the method further comprises:
determining whether the specified magic expression supports voice control according to the following method:
analyzing the special effect configuration parameters of the specified magic expression;
if the special effect configuration parameters for representing support of voice control are analyzed, determining that the designated magic expression supports voice control;
and if the special effect configuration parameter for representing support of voice control is not analyzed, determining that the specified magic expression does not support voice control.
In one embodiment, after parsing the special effect configuration parameters of the specified magic expression, the method further includes:
if the special effect configuration parameter indicating support for voice control is not parsed, adjusting the parameters of the specified magic expression in response to a control instruction for controlling the specified magic expression to obtain the adjusted magic expression, wherein the control instruction is an instruction triggered by one of the following operation modes: a touch operation or a body-gesture operation;
and carrying out image processing on the acquired image according to the adjusted magic expression.
In one embodiment, after the special effect configuration parameter indicating support for voice control is parsed, the method further comprises:
and outputting prompt information that the special effect supports voice adjustment, wherein the prompt information includes the types of image processing parameters of the specified magic expression that support voice control and/or an example of controlling the specified magic expression by voice.
In one embodiment, the parameters specifying the magic expression include at least one of: a filter, selectable material elements, position information of the material elements, and sizes of the material elements.
In one embodiment, the parsing the voice information includes: converting the voice information into text information;
and performing semantic recognition on the text information.
In one embodiment, the performing semantic recognition on the text information includes:
parsing the keywords in the text information;
and the determining that the voice information contains image special effect control information includes:
the parsed keywords including preset voice control keywords.
In one embodiment, the performing semantic recognition on the text information includes:
performing word embedding on the text information to obtain word vectors of the text information;
extracting features from the word vectors;
and matching the extracted features against features in a semantic library, the features in the semantic library being pre-configured features for voice control of the special effect;
and the determining that the voice information contains image special effect control information includes:
the extracted features matching a feature in the semantic library.
In one embodiment, the image is an arbitrary frame image in a video acquired in real time.
In a second aspect, an embodiment of the present application further provides an image processing apparatus, including:
a voice parsing module configured to, in response to a selection instruction for adding a specified magic expression, collect and parse voice information if the specified magic expression supports voice control, where the specified magic expression is a special effect used for adding image elements on an image;
an adjusting module configured to, if the voice information is parsed to contain parameter control information, adjust the parameters of the specified magic expression according to the parameter control information to obtain the adjusted magic expression;
and an image processing module configured to perform image processing on the acquired image according to the adjusted magic expression.
In one embodiment, the apparatus further comprises:
a parameter parsing module configured to perform, before the voice parsing module collects and parses voice information, determining whether the specified magic expression supports voice control according to the following method:
analyzing the special effect configuration parameters of the specified magic expression;
if the special effect configuration parameters for representing support of voice control are analyzed, determining that the designated magic expression supports voice control;
and if the special effect configuration parameter for representing support of voice control is not analyzed, determining that the specified magic expression does not support voice control.
In one embodiment, the apparatus further comprises:
a control module configured to, after the parameter parsing module parses the special effect configuration parameters of the specified magic expression, if the special effect configuration parameter indicating support for voice control is not parsed, adjust the parameters of the specified magic expression in response to a control instruction for controlling the specified magic expression to obtain the adjusted magic expression, wherein the control instruction is an instruction triggered by one of the following operation modes: a touch operation or a body-gesture operation;
the image processing module is configured to perform image processing on the acquired image according to the adjusted magic expression.
In one embodiment, the apparatus further comprises:
and a prompt module configured to output, after the parameter parsing module parses the special effect configuration parameter indicating support for voice control, prompt information that the special effect supports voice adjustment, wherein the prompt information includes the types of image processing parameters of the specified magic expression that support voice control and/or an example of controlling the specified magic expression by voice.
In one embodiment, the parameters specifying the magic expression include at least one of: a filter, selectable material elements, position information of the material elements, and sizes of the material elements.
In one embodiment, the voice parsing module includes:
a text conversion unit configured to perform conversion of the voice information into text information;
an analysis unit configured to perform speech recognition on the text information.
In one embodiment, the text conversion unit is configured to perform parsing of a keyword in the text information;
the analysis unit is configured to determine that the voice information contains image special effect control information when the analyzed keywords comprise preset voice control keywords.
In one embodiment, the text conversion unit is configured to perform word embedding on the text information to obtain a word vector of the text information;
the analysis unit is configured to perform feature extraction on the word vectors; matching the extracted features with features in a semantic library; the features in the semantic library are pre-configured features for performing voice control on the special effect; and determining and analyzing that the voice information contains image special effect control information when the characteristics in the semantic library are matched.
In one embodiment, the image is an arbitrary frame image in a video acquired in real time.
In a third aspect, another embodiment of the present application further provides an electronic device, including at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute any image processing method provided by the embodiment of the application.
In a fourth aspect, another embodiment of the present application further provides a computer storage medium, where the computer storage medium stores a computer program, and the computer program is used to make a computer execute any image processing method in the embodiments of the present application.
The embodiment of the present application provides a voice control function for magic expressions, so that the user does not need to switch interfaces back and forth when adjusting a magic expression; this makes the operation convenient and saves processing resources.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic illustration of an application environment according to one embodiment of the present application;
FIG. 2 is a schematic diagram of an image processing flow according to an embodiment of the present application;
FIG. 3 is a schematic diagram of another image processing flow according to an embodiment of the present application;
FIG. 4 is a schematic diagram of outputting a prompt in accordance with one embodiment of the present application;
FIGS. 5A-5B are schematic diagrams of adjusting the position of the "heart" material by voice according to one embodiment of the present application;
FIGS. 6A-6B are schematic diagrams of adjusting the size of the "heart" material by voice according to one embodiment of the present application;
FIGS. 7A-7C are schematic diagrams of step-wise voice adjustment of the position of the "heart" material according to one embodiment of the present application;
FIG. 8 is a schematic diagram of an interface prompt for adjustable image processing parameters according to one embodiment of the present application;
FIG. 9 is a schematic diagram of an image processing apparatus according to one embodiment of the present application;
FIG. 10 is a schematic view of an electronic device according to one embodiment of the present application.
Detailed Description
As described above, in the process of using a magic expression, the user usually holds the shooting terminal with one hand to shoot. A magic expression adjusted in advance may no longer meet the user's needs once external factors such as lighting change as the device moves and alter the display effect on the shooting picture; adjusting the magic expression while holding the terminal with one hand is very inconvenient; and switching back and forth between the adjustment interface of the magic expression and the viewfinder also consumes display and processing resources.
In view of this, the embodiment of the present application provides an image processing method that allows the user to conveniently control a magic expression in real time, anywhere, according to the shooting posture and actual needs.
It should be noted that a magic expression is an image special effect applied to the image collected by the camera during video recording or photographing, realized by the special effect package corresponding to that magic expression. Of course, the method is applicable not only to magic expressions but to adjusting any image special effect added to an image. The special effect can be added to the image while it is being captured, or to images that have already been captured and stored.
In order to make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. It should be understood that in the following description, the technical solutions of the present application are explained in detail by taking "magic expressions" as an example.
FIG. 1 is a schematic diagram of an application environment according to one embodiment of the present application.
As shown in fig. 1, the application environment may include, for example, at least one server 20 and a plurality of terminal devices 30. Each terminal device 30 may be any suitable electronic device capable of network access, including but not limited to a computer, a laptop, a smart phone, a tablet, or another type of terminal. The server 20 is any server capable of providing information required for an interactive service through a network. The terminal device 30 can exchange information with the server 20 via the network 40, for example, downloading a magic expression package from the server 20. The server 20 can acquire and provide content required by the terminal device 30, such as a photographing application or multimedia resources, by accessing the database 50. Terminal devices (e.g., 30_1, 30_2, or 30_N) may also communicate with each other via the network 40. The network 40 may be a network for information transfer in a broad sense and may include one or more communication networks such as a wireless communication network, the Internet, a private network, a local area network, a metropolitan area network, a wide area network, or a cellular data network.
In the following description, only a single server or terminal device is described in detail, but it should be understood by those skilled in the art that the single server 20, terminal device 30, and database 50 shown are intended to represent that the technical solution of the present application involves the operation of terminal devices, servers, and databases. The detailed description of a single terminal device, server, and database is for convenience of description and does not imply limitations on the type or location of terminal devices and servers. It should be noted that the underlying concepts of the example embodiments of the present application are not altered if additional modules are added to or removed from the illustrated environment. In addition, although a bidirectional arrow from the database 50 to the server 20 is shown in the figure for convenience of explanation, it will be understood by those skilled in the art that the data transmission and reception described above may also be realized through the network 40.
Taking the example of the terminal device shooting a short video, the terminal device starts the shooting software according to the user's operation and waits for further instructions from the user. For example, in step 201, in response to a selection instruction for adding a specified magic expression, if the specified magic expression supports voice control, voice information is collected and parsed; the specified magic expression is a special effect for adding image elements on the image.
In step 202, if the voice information is parsed to contain parameter control information, the parameters of the specified magic expression are adjusted according to the parameter control information to obtain the adjusted magic expression. For example, when the magic expression includes various beauty filters, such as a natural-white filter and a Japanese-style filter, these filters provide selectable options for the filter parameter. When the magic expression also includes a map, the position and size of the map are likewise adjustable parameters.
If the user says "use the Japanese-style filter" by voice control, the filter is set to the Japanese-style filter. After the parameters of the specified magic expression have been adjusted according to the voice control, in step 203, image processing may be performed on the acquired image according to the adjusted magic expression.
It should be noted that the image acquired here may be an image acquired during the use of the specified magic expression, or may be an image acquired before the use of the specified magic expression. That is, the timing of image acquisition is not limited as long as the image is suitable. For example, for video, the image is any frame image in the video captured in real time.
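To make the flow of steps 201 to 203 concrete, the following is a minimal sketch in Python. The helper names (transcribe, parse_parameter_control, apply_effect) and the supports_voice_control flag are illustrative assumptions for this sketch, not the disclosed implementation:

```python
from dataclasses import dataclass, field

@dataclass
class MagicExpression:
    name: str
    supports_voice_control: bool   # would come from the effect's config file (steps 301-304)
    params: dict = field(default_factory=dict)

def transcribe(audio: bytes) -> str:
    """Stub for a speech-to-text engine; a real app would call an ASR service."""
    return "use the japanese-style filter"

def parse_parameter_control(text: str) -> dict:
    """Stub: map the recognized text to parameter control information
    (see the keyword and semantic matching modes described below)."""
    return {"filter": "japanese_style"} if "filter" in text else {}

def apply_effect(image, expr: MagicExpression):
    """Stub: render the (possibly adjusted) effect onto the captured frame."""
    return image, expr.params

def on_expression_selected(expr: MagicExpression, audio: bytes, image):
    # Step 201: collect and parse voice only if the effect supports voice control.
    if expr.supports_voice_control:
        control = parse_parameter_control(transcribe(audio))
        # Step 202: adjust the effect's parameters from the parsed control info.
        if control:
            expr.params.update(control)
    # Step 203: process the acquired image according to the adjusted effect.
    return apply_effect(image, expr)
```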
In one embodiment, not all magic expressions necessarily support voice control. For example, some special effects are used as picture frames, and the positions of the special effects are fixed and do not need to be adjusted. Therefore, in the embodiment of the present application, each magic expression may be provided with a corresponding configuration parameter, and before the voice information is collected and analyzed, in the embodiment of the present application, the configuration parameter of the special effect may be analyzed according to the image special effect instruction, and the operation of collecting and analyzing the voice information is executed only when the configuration parameter indicating that the voice control is enabled is analyzed. Therefore, useless voice information acquisition and analysis can be avoided, and energy consumption is saved. For example, whether the specified magic expression supports voice control may be determined according to the following method:
step 301: and starting the designated magic expression based on the user operation.
Step 302: and reading a configuration file of the designated magic expression and analyzing the special effect configuration parameters in the configuration file.
Step 303: and if the special effect configuration parameters for representing support of voice control are analyzed, determining that the designated magic expression supports voice control, and starting a voice acquisition module to acquire voice.
Step 304: and if the special effect configuration parameter for representing support of voice control is not analyzed, determining that the specified magic expression does not support voice control.
When the magic expression system is implemented, each magic expression can have a configuration file, and the special effect configuration parameters are recorded in the configuration file so as to be convenient to read and analyze.
Step 305: and if the special effect configuration parameters for representing the support of voice control are not analyzed, responding to a control instruction for controlling the specified magic expression to adjust the parameters of the specified magic expression, and obtaining the adjusted magic expression.
The control instruction is an instruction triggered by one of the following operation modes: a touch operation or a body-gesture operation.
Step 306: and carrying out image processing on the acquired image according to the adjusted magic expression.
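A minimal sketch of steps 302 to 305, assuming each magic expression ships a JSON configuration file; the key name supports_voice_control is an assumed example rather than the actual configuration format:

```python
import json
from pathlib import Path

def expression_supports_voice(config_path: str) -> bool:
    """Steps 302-304: read the effect's configuration file and check for
    the special effect configuration parameter that indicates voice control."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # If the flag is absent, the expression does not support voice control
    # (step 304) and the touch/gesture fallback of step 305 applies.
    return bool(config.get("supports_voice_control", False))
```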
In one embodiment, parsing the voice information may be performed as follows: convert the voice information into text information, and perform semantic recognition on the text information; the specified magic expression is then controlled according to the semantic recognition result.
For example, the following two ways can be implemented:
Mode 1: parse the keywords in the text information; if the parsed keywords include preset voice control keywords, it is determined that the voice information contains image special effect control information.
For example, if the keyword "Japanese-style filter" is included, it is determined that image special effect control information is contained. The specific keywords may be determined according to the parameters included in the actual magic expression, which is not limited in the present application.
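Mode 1 amounts to looking up preset keywords in the recognized text. A sketch, in which the keyword table entries are assumed examples:

```python
# Assumed keyword table: each preset voice control keyword maps to the
# parameter adjustment it should trigger.
VOICE_KEYWORDS = {
    "japanese-style filter": {"filter": "japanese_style"},
    "natural-white filter": {"filter": "natural_white"},
    "move up": {"move": (0, -1)},
    "enlarge": {"scale": 1.2},
}

def match_keywords(text: str) -> dict:
    """Mode 1: return parameter control info if any preset voice control
    keyword occurs in the recognized text; an empty dict means the voice
    information contains no image special effect control information."""
    control = {}
    for keyword, adjustment in VOICE_KEYWORDS.items():
        if keyword in text.lower():
            control.update(adjustment)
    return control

print(match_keywords("please use the Japanese-style filter"))  # {'filter': 'japanese_style'}
```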
Mode 2: the method can be implemented by performing word embedding on the text information to obtain a word vector of the text information and performing feature extraction on the word vector; then matching the extracted features with features in a semantic library; the features in the semantic library are pre-configured features for performing voice control on the special effect; and if the characteristics matched in the semantic library are determined, the voice information is analyzed to contain image special effect control information.
For example, if the semantic parsing result is "adjust the map to the upper left corner", it is determined that the control information for specifying the magic expression is included. Similarly, the implementation of semantic parsing can be realized according to natural semantic processing, and what semantic parsing result is correspondingly adopted by various magic expressions can be set based on the difference of the functions of the magic expressions, which is not limited in the application.
For example, in implementation, for each magic expression, various language expression modes for controlling the magic expression can be traversed as much as possible to serve as a sample. And then, extracting the characteristics of the sample to map the characteristics with the same semantics to a semantic space, thereby realizing the magic expression control based on the semantics.
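Mode 2 replaces exact keyword matching with matching in a feature space. The sketch below uses a toy word-embedding table and cosine similarity; a real system would use trained embeddings and a learned feature extractor, so every vector and threshold here is an illustrative assumption:

```python
import numpy as np

# Toy embedding table standing in for a trained word-embedding model.
EMBEDDINGS = {
    "move":  np.array([1.0, 0.0, 0.0]),
    "shift": np.array([0.9, 0.1, 0.0]),   # near-synonym of "move"
    "up":    np.array([0.0, 1.0, 0.0]),
    "heart": np.array([0.0, 0.0, 1.0]),
}

# Pre-configured semantic library: one feature vector per supported control.
SEMANTIC_LIBRARY = {
    "move_up": np.array([0.5, 0.5, 0.0]),
}

def sentence_feature(text: str) -> np.ndarray:
    """Embed each known word and average: a crude stand-in for feature extraction."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    return np.mean(vectors, axis=0) if vectors else np.zeros(3)

def match_semantic(text: str, threshold: float = 0.8):
    """Return the matched control if the text's feature is close enough
    to a feature in the semantic library, else None."""
    feature = sentence_feature(text)
    for control, ref in SEMANTIC_LIBRARY.items():
        cos = feature @ ref / (np.linalg.norm(feature) * np.linalg.norm(ref) + 1e-9)
        if cos >= threshold:
            return control
    return None

# Matches "move_up" even though no preset keyword appears verbatim,
# which is the point of the semantic mode.
print(match_semantic("shift the heart up"))
```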
In one embodiment, in order to help the user discover and use the voice control function of the magic expression, after the special effect configuration parameter indicating support for voice control is parsed, prompt information that the special effect supports voice adjustment can be output. For example, as shown in fig. 4, after the user selects a magic expression, a message "this magic expression supports voice adjustment" may pop up in the user interface. In another embodiment, the output mode of the prompt information is not limited: it may be displayed in the interface as shown in fig. 4, output by voice, or both.
In another embodiment, the prompt message includes an example of the specified magic expression and/or a type of image processing parameter for which the specified magic expression supports voice control.
For example, in order to guide the user to control the magic expression correctly and avoid the waste of processing resources caused by invalid operations, in the embodiment of the present application the prompt information may include an example of how to control the magic expression by voice, such as how to move the position of heart-shaped material by voice. FIG. 5A prompts the user to speak the voice "please move the heart shape to the upper left corner", and FIG. 5B shows the display effect of moving the heart shape to the upper left corner according to the voice.
Further, as shown in FIG. 6A, the user is prompted in the interface to say "enlarge the heart shape", and FIG. 6B then shows the display effect of enlarging the heart shape according to the voice.
In implementation, the display position and size of the material can be adjusted by a corresponding adjustment step. For example, each time the user utters the voice command "enlarge", the size is multiplied by 1.2, each enlargement being based on the current size. For another example, as shown in FIG. 7A, the user says "move up" and the heart shape moves up by a specified number of pixels (i.e., the adjustment step); the display effect is shown in FIG. 7B. If the user then says "move up" again, the heart shape continues to move up by the specified number of pixels from the position in FIG. 7B, with the display effect shown in FIG. 7C.
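The step-wise behavior can be modeled as multiplicative scaling and additive translation relative to the current state. A sketch; the 1.2 factor follows the example above, while the 40-pixel step and the frame clamping are illustrative assumptions:

```python
MOVE_STEP = 40     # assumed adjustment step in pixels per "move" command
SCALE_STEP = 1.2   # each "enlarge" multiplies the current size by 1.2

def adjust_material(x, y, size, command, frame_w=720, frame_h=1280):
    """Apply one voice command to the material's position and size.
    Each step is relative to the current state, so repeating a command
    keeps moving or growing the material (as in FIGS. 7A-7C)."""
    if command == "enlarge":
        size *= SCALE_STEP
    elif command == "move up":
        y = max(0, y - MOVE_STEP)          # clamp so the material stays on screen
    elif command == "move down":
        y = min(frame_h, y + MOVE_STEP)
    return x, y, size

# Two consecutive "move up" commands move the heart two steps from its start.
state = (360, 640, 100.0)
for cmd in ("move up", "move up", "enlarge"):
    state = adjust_material(*state, cmd)
print(state)  # (360, 560, 120.0)
```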
In another embodiment, when the same magic expression includes multiple image processing parameters, some of the parameters may support voice adjustment while others do not. The output prompt information may then include specification information for the image processing parameters that support voice control, i.e., it specifies which image processing parameters support voice adjustment. As shown in fig. 8, the image processing parameters that support voice adjustment may be shown in the interface as a list, and each parameter may be followed by a corresponding adjustment example for the user to learn from.
As shown in fig. 8, the adjustable image processing parameters may include at least one of: a filter, selectable material elements, and the position information and size of the material elements. For example, heart-shaped decorations, star-shaped decorations, and pink bubble decorations can serve as selectable material elements; the position and size of each material element can be adjusted by voice.
When the image collected by the camera is processed multiple times, each processing pass can be called a filter; for example, a makeup effect, an ear-sticker effect, and a background or foreground map can each be called a filter.
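Since each processing pass behaves as an independent filter, the passes can be composed as a simple pipeline. A sketch with hypothetical pass names standing in for real rendering code:

```python
from typing import Callable, Dict, List

Frame = Dict[str, list]          # stand-in for an image buffer
Filter = Callable[[Frame], Frame]

def makeup(frame: Frame) -> Frame:
    frame.setdefault("passes", []).append("makeup")
    return frame

def ear_sticker(frame: Frame) -> Frame:
    frame.setdefault("passes", []).append("ear_sticker")
    return frame

def background_map(frame: Frame) -> Frame:
    frame.setdefault("passes", []).append("background_map")
    return frame

def apply_filters(frame: Frame, filters: List[Filter]) -> Frame:
    """Run each processing pass ("filter") over the captured frame in order."""
    for f in filters:
        frame = f(frame)
    return frame

print(apply_filters({}, [makeup, ear_sticker, background_map]))
# {'passes': ['makeup', 'ear_sticker', 'background_map']}
```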
Based on the same conception, the embodiment of the application also provides an image processing device.
Fig. 9 is a schematic diagram of an image processing apparatus according to an embodiment of the present application.
As shown in fig. 9, the image processing apparatus 900 may include:
a voice parsing module 901 configured to, in response to a selection instruction for adding a specified magic expression, collect and parse voice information if the specified magic expression supports voice control, where the specified magic expression is a special effect used for adding image elements on an image;
an adjusting module 902 configured to, if the voice information is parsed to contain parameter control information, adjust the parameters of the specified magic expression according to the parameter control information to obtain the adjusted magic expression;
and an image processing module 903 configured to perform image processing on the acquired image according to the adjusted magic expression.
In one embodiment, the apparatus further comprises:
a parameter parsing module 904 configured to determine, before the voice parsing module collects and parses voice information, whether the specified magic expression supports voice control according to the following method:
analyzing the special effect configuration parameters of the specified magic expression;
if the special effect configuration parameters for representing support of voice control are analyzed, determining that the designated magic expression supports voice control;
and if the special effect configuration parameter for representing support of voice control is not analyzed, determining that the specified magic expression does not support voice control.
In one embodiment, the apparatus further comprises a control module configured to, after the parameter parsing module parses the special effect configuration parameters of the specified magic expression, if the special effect configuration parameter indicating support for voice control is not parsed, adjust the parameters of the specified magic expression in response to a control instruction for controlling the specified magic expression to obtain the adjusted magic expression, wherein the control instruction is an instruction triggered by one of the following operation modes: a touch operation or a body-gesture operation;
the image processing module is configured to perform image processing on the acquired image according to the adjusted magic expression.
In one embodiment, the apparatus further comprises:
a prompt module 905 configured to output, after the parameter parsing module parses the special effect configuration parameter indicating support for voice control, prompt information that the special effect supports voice adjustment, where the prompt information includes the types of image processing parameters of the specified magic expression that support voice control and/or an example of controlling the specified magic expression by voice.
In one embodiment, the parameters specifying the magic expression include at least one of: a filter, selectable material elements, position information of the material elements, and sizes of the material elements.
In one embodiment, the voice parsing module includes:
a text conversion unit configured to perform conversion of the voice information into text information;
an analysis unit configured to perform speech recognition on the text information.
In one embodiment, the text conversion unit is configured to perform parsing of a keyword in the text information;
the analysis unit is configured to determine that the voice information contains image special effect control information when the analyzed keywords comprise preset voice control keywords.
In one embodiment, the text conversion unit is configured to perform word embedding on the text information to obtain a word vector of the text information;
the analysis unit is configured to perform feature extraction on the word vectors; matching the extracted features with features in a semantic library; the features in the semantic library are pre-configured features for performing voice control on the special effect; and determining and analyzing that the voice information contains image special effect control information when the characteristics in the semantic library are matched.
In one embodiment, the image is an arbitrary frame image in a video acquired in real time.
The implementation and beneficial effects of the operations in the image processing apparatus can be referred to the description in the foregoing method, and are not described herein again.
Having described an image processing method and apparatus according to an exemplary embodiment of the present application, an electronic device according to another exemplary embodiment of the present application is described next.
As will be appreciated by one skilled in the art, aspects of the present application may be embodied as a system, method, or program product. Accordingly, various aspects of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
In some possible implementations, an electronic device according to the present application may include at least one processor, and at least one memory. Wherein the memory stores program code which, when executed by the processor, causes the processor to perform the steps in the image processing method according to various exemplary embodiments of the present application described above in the present specification. For example, the processor may perform the steps shown in fig. 2-3.
The electronic apparatus 130 according to this embodiment of the present application is described below with reference to fig. 10. The electronic device 130 shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 10, the electronic device 130 is represented in the form of a general electronic device. The components of the electronic device 130 may include, but are not limited to: the at least one processor 131, the at least one memory 132, and a bus 133 that connects the various system components (including the memory 132 and the processor 131).
Bus 133 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
The memory 132 may include readable media in the form of volatile memory, such as Random Access Memory (RAM)1321 and/or cache memory 1322, and may further include Read Only Memory (ROM) 1323.
Memory 132 may also include a program/utility 1325 having a set (at least one) of program modules 1324, such program modules 1324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The electronic device 130 may also communicate with one or more external devices 134 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with the electronic device 130, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 130 to communicate with one or more other electronic devices. Such communication may occur via input/output (I/O) interfaces 135. Also, the electronic device 130 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 136. As shown, network adapter 136 communicates with other modules for electronic device 130 over bus 133. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 130, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In some possible embodiments, aspects of an image processing method provided by the present application may also be implemented in the form of a program product including program code for causing a computer device to perform the steps of an image processing method according to various exemplary embodiments of the present application described above in this specification when the program product is run on the computer device, for example, the computer device may perform the steps as shown in fig. 2-3.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product of an embodiment of the present application may employ a portable compact disc read-only memory (CD-ROM), include program code, and be executable on an electronic device. However, the program product of the present application is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the consumer electronic device, partly on the consumer electronic device, as a stand-alone software package, partly on the consumer electronic device and partly on a remote electronic device, or entirely on the remote electronic device or server. In the case of remote electronic devices, the remote electronic device may be connected to the consumer electronic device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external electronic device (e.g., through the Internet using an Internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, the features and functions of two or more units described above may be embodied in one unit, according to embodiments of the application. Conversely, the features and functions of one unit described above may be further divided into embodiments by a plurality of units.
Further, while the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. An image processing method, characterized in that the method comprises:
responding to a selection instruction for adding a specified magic expression, and collecting and analyzing voice information if the specified magic expression supports voice control; the specified magic expression is a special effect used for adding image elements on the image;
if the voice information contains parameter control information, adjusting the parameters of the designated magic expression according to the parameter control information to obtain the adjusted magic expression;
and carrying out image processing on the acquired image according to the adjusted magic expression.
2. The method of claim 1, wherein prior to collecting and parsing the voice information, the method further comprises:
determining whether the specified magic expression supports voice control according to the following method:
analyzing the special effect configuration parameters of the specified magic expression;
if the special effect configuration parameters for representing support of voice control are analyzed, determining that the designated magic expression supports voice control;
and if the special effect configuration parameter for representing support of voice control is not analyzed, determining that the specified magic expression does not support voice control.
3. The method of claim 2, wherein after parsing the special effects configuration parameters for the specified magic expression, the method further comprises:
if the special effect configuration parameter indicating support for voice control is not parsed, adjusting the parameters of the specified magic expression in response to a control instruction for controlling the specified magic expression to obtain the adjusted magic expression, wherein the control instruction is an instruction triggered by one of the following operation modes: a touch operation or a body-gesture operation;
and carrying out image processing on the acquired image according to the adjusted magic expression.
4. The method of claim 2, wherein after the special effect configuration parameter indicating support for voice control is parsed, the method further comprises:
outputting prompt information that the special effect supports voice adjustment, wherein the prompt information includes the types of image processing parameters of the specified magic expression that support voice control and/or an example of controlling the specified magic expression by voice.
5. The method of claim 1, wherein the parameters of the specified magic expression comprise at least one of: a filter, selectable material elements, position information of the material elements, and sizes of the material elements.
6. The method of claim 1, wherein parsing the voice information comprises:
converting the voice information into text information;
and performing semantic recognition on the text information.
7. The method of claim 6, wherein the performing semantic recognition on the text information comprises:
parsing keywords in the text information;
and wherein determining that the voice information contains image special effect control information comprises:
the parsed keywords including preset voice control keywords.
8. An image processing apparatus, characterized in that the apparatus comprises:
a voice parsing module configured to, in response to a selection instruction for adding a specified magic expression, collect and parse voice information if the specified magic expression supports voice control, where the specified magic expression is a special effect used for adding image elements on an image;
an adjusting module configured to, if the voice information is parsed to contain parameter control information, adjust the parameters of the specified magic expression according to the parameter control information to obtain the adjusted magic expression;
and an image processing module configured to perform image processing on the acquired image according to the adjusted magic expression.
9. An electronic device comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A computer storage medium, characterized in that the computer storage medium stores a computer program for causing a computer to perform the method according to any one of claims 1-7.
CN202010291730.7A (priority date 2020-04-14, filing date 2020-04-14): Image processing method, image processing device, electronic equipment and storage medium. Status: Pending. Publication: CN113535040A.

Priority Applications (1)

Application Number: CN202010291730.7A; Priority Date: 2020-04-14; Filing Date: 2020-04-14; Title: Image processing method, image processing device, electronic equipment and storage medium


Publications (1)

Publication Number: CN113535040A; Publication Date: 2021-10-22

Family

ID=78120301

Family Applications (1)

Application Number: CN202010291730.7A; Title: Image processing method, image processing device, electronic equipment and storage medium; Status: Pending; Publication: CN113535040A

Country Status (1)

Country Link
CN (1) CN113535040A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113467735A * 2021-06-16 2021-10-01 Honor Device Co Ltd (荣耀终端有限公司) Image adjusting method, electronic device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339201A (en) * 2016-09-14 2017-01-18 北京金山安全软件有限公司 Map processing method and device and electronic equipment
CN106791370A (en) * 2016-11-29 2017-05-31 北京小米移动软件有限公司 A kind of method and apparatus for shooting photo
CN109543646A (en) * 2018-11-30 2019-03-29 深圳市脸萌科技有限公司 Face image processing process, device, electronic equipment and computer storage medium
CN109584879A (en) * 2018-11-23 2019-04-05 华为技术有限公司 A kind of sound control method and electronic equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination