CN115809006B - Method and device for controlling manual instructions through picture

Method and device for controlling manual instructions through picture

Info

Publication number
CN115809006B
CN115809006B
Authority
CN
China
Prior art keywords
instruction
data
instruction information
action
operator
Prior art date
Legal status
Active
Application number
CN202211548246.3A
Other languages
Chinese (zh)
Other versions
CN115809006A (en)
Inventor
袁潮
邓迪旻
肖占中
Current Assignee
Beijing Zhuohe Technology Co Ltd
Original Assignee
Beijing Zhuohe Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zhuohe Technology Co Ltd
Priority to CN202211548246.3A
Publication of CN115809006A
Application granted
Publication of CN115809006B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a method and a device for controlling manual instructions through a picture. The method comprises the following steps: acquiring first instruction information by means of image acquisition, wherein the first instruction information is issued by a person; performing action recognition on the first instruction information to obtain instruction action data; performing Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data; and outputting the local dispersion instruction data through an instruction analysis model to obtain the instruction operation intention. The invention solves the technical problem that, in the prior art, manual instruction control of a camera system relies only on fixed, real-time input of instruction information for identification and judgment, so that the user must enter instructions through finger-and-palm gestures; erroneous input then wastes unnecessary work and reduces the overall efficiency of picture control and image processing.

Description

Method and device for controlling manual instructions through picture
Technical Field
The invention relates to the fields of image recognition and image processing, and in particular to a method and a device for controlling manual instructions through a picture.
Background
With the continuous development of intelligent science and technology, intelligent equipment is used ever more widely in people's life, work, and study; intelligent technological means improve people's quality of life and increase the efficiency of learning and working.
At present, in the user instruction control process of a high-precision camera system, a manual instruction is generally identified and judged by inputting instruction information in real time and is converted into an electric signal according to the identification result, so as to achieve the technical effect of controlling image picture processing through the instruction. However, in the prior art, manual instruction control of the camera system relies only on timely input of instruction information for identification and judgment, so the user must enter instructions through finger-and-palm gestures; erroneous input causes unnecessary wasted work and reduces the overall efficiency of picture control and image processing.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiments of the invention provide a method and a device for controlling manual instructions through a picture, which at least solve the technical problem that, in the prior art, manual instruction control of an image pickup system relies only on fixed, real-time input of instruction information for identification and judgment, so that the user must enter instructions through finger-and-palm gestures; erroneous input wastes unnecessary work and reduces the overall efficiency of picture control and image processing.
According to one aspect of the embodiments of the present invention, there is provided a method for controlling manual instructions through a picture, comprising: acquiring first instruction information by means of image acquisition, wherein the first instruction information is issued by a person; performing action recognition on the first instruction information to obtain instruction action data; performing Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data; and outputting the local dispersion instruction data through an instruction analysis model to obtain the instruction operation intention.
Optionally, after the first instruction information is acquired by means of image acquisition, the method further includes: performing gray-scale processing on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255).
Optionally, performing Lagrangian operator calculation on the instruction action data to obtain the local dispersion instruction data includes: acquiring the closest Lagrangian operator according to the first instruction information; and generating the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types.
Optionally, after outputting the local dispersion instruction data through the instruction analysis model to obtain the instruction operation intention, the method further includes: executing the picture-processing operation according to the instruction operation intention.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for controlling manual instructions through a picture, comprising: an acquisition module, configured to acquire first instruction information by means of image acquisition, wherein the first instruction information is issued by a person; an identification module, configured to perform action recognition on the first instruction information to obtain instruction action data; a calculation module, configured to perform Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data; and an output module, configured to output the local dispersion instruction data through an instruction analysis model to obtain the instruction operation intention.
Optionally, the apparatus further includes: a processing module, configured to perform gray-scale processing on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255).
Optionally, the calculation module includes: an acquisition unit, configured to acquire the closest Lagrangian operator according to the first instruction information; and a computing unit, configured to generate the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types.
Optionally, the apparatus further includes: an execution module, configured to execute the picture-processing operation according to the instruction operation intention.
According to another aspect of the embodiments of the present invention, there is further provided a nonvolatile storage medium comprising a stored program, wherein, when the program runs, it controls a device in which the nonvolatile storage medium is located to execute the method for controlling manual instructions through a picture.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device comprising a processor and a memory; the memory stores computer readable instructions, and the processor is configured to run the computer readable instructions, which, when executed, perform the method for controlling manual instructions through a picture.
In the embodiments of the invention, first instruction information is acquired by means of image acquisition, wherein the first instruction information is issued by a person; action recognition is performed on the first instruction information to obtain instruction action data; Lagrangian operator calculation is performed on the instruction action data to obtain local dispersion instruction data; and the local dispersion instruction data are output through an instruction analysis model to obtain the instruction operation intention. This solves the technical problem that, in the prior art, manual instruction control of a camera system relies only on fixed, real-time input of instruction information for identification and judgment, so that the user must enter instructions through finger-and-palm gestures; erroneous input wastes unnecessary work and reduces the overall efficiency of picture control and image processing.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a flowchart of a method for controlling manual instructions through a picture according to an embodiment of the present invention;
FIG. 2 is a block diagram of an apparatus for controlling manual instructions through a picture according to an embodiment of the present invention;
FIG. 3 is a block diagram of a terminal device for performing the method according to an embodiment of the present invention;
FIG. 4 shows a memory unit for holding or carrying program code implementing the method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an embodiment of the present invention, a method embodiment of a method for controlling manual instructions through a picture is provided. It should be noted that the steps shown in the flowchart of the drawings may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from that described herein.
Example 1
Fig. 1 is a flowchart of a method for controlling manual instructions through a picture according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
step S102, acquiring first instruction information in an image acquisition mode, wherein the first instruction information is sent by people.
Specifically, in order to solve the technical problem that, in the prior art, manual instruction control of a camera system relies only on fixed, real-time input of instruction information for identification and judgment, forcing the user to enter instructions through finger-and-palm gestures, with erroneous input wasting unnecessary work and reducing the overall efficiency of picture control and image processing, the first instruction information issued by the user must first be acquired. The acquisition may be performed by an independent action-recognition camera that identifies and captures the user's instruction actions and converts the collected image data into the first instruction information for storage and processing.
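As an illustrative aid (not part of the original disclosure), the acquisition step could be sketched as follows in Python, assuming OpenCV as the capture library; the camera index and error handling are hypothetical choices, since the patent does not name an implementation.

```python
import cv2  # assumed capture library; the patent names no implementation


def acquire_first_instruction(camera_index: int = 0):
    """Grab one frame from a dedicated action-recognition camera and
    return it as the 'first instruction information' (illustrative)."""
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("action-recognition camera not available")
    ok, frame = cap.read()  # collected image data to store and process
    cap.release()
    if not ok:
        raise RuntimeError("failed to read an instruction frame")
    return frame
```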
Optionally, after the first instruction information is acquired by means of image acquisition, the method further includes: performing gray-scale processing on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255).
Specifically, in order to efficiently identify the user's instruction from the first instruction information collected by the camera, gray-scale processing must be performed on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255). This facilitates the subsequent action recognition of the instruction information, eliminates image interference data, and increases the accuracy of image action recognition.
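A minimal sketch of this gray-scale step, again assuming OpenCV; stretching the result over the full (0, 255) segment is one plausible reading of the "RGB segment value" described above, not a confirmed detail of the patent.

```python
import cv2
import numpy as np


def to_instruction_grayscale(frame: np.ndarray) -> np.ndarray:
    """Gray-scale the instruction image and stretch it over the full
    0-255 range to suppress interference data (assumed interpretation)."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)
```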
Step S104, performing action recognition on the first instruction information to obtain instruction action data.
Specifically, after the gray image data of the first instruction information are obtained, action recognition needs to be performed on them to obtain the instruction action data. For example, when the user performs a "zoom-in" instruction action, the gray-processed action image data can be recognized by the action recognition model, which outputs the instruction recognition result, namely that the imaging system should zoom in on the real-time image data it collects.
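The patent refers to an action recognition model without specifying its architecture. The sketch below assumes a generic classifier exposing a scikit-learn-style predict_proba interface and a hypothetical gesture label set.

```python
import numpy as np

GESTURES = ["zoom-in", "zoom-out", "pan-left", "pan-right"]  # hypothetical labels


def recognize_instruction_action(gray: np.ndarray, model) -> str:
    """Run a trained action-recognition model on the gray-scaled image
    and return the instruction action data as a gesture label."""
    features = gray.astype(np.float32).ravel() / 255.0  # naive flattening
    probs = model.predict_proba(features[None, :])[0]
    return GESTURES[int(np.argmax(probs))]
```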
Step S106, performing Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data.
Optionally, performing Lagrangian operator calculation on the instruction action data to obtain the local dispersion instruction data includes: acquiring the closest Lagrangian operator according to the first instruction information; and generating the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types.
Specifically, in order to further refine the instruction action data, the embodiment of the invention decomposes them into local dispersion instruction data, so that user instruction actions can be screened in a targeted manner and useless user instructions or actions can be eliminated. The Lagrangian screening algorithm serves as the method for screening the dispersion instruction data, and σ(N-1) serves as an instruction-action compensation quantity applied after the multiplication by the Lagrangian operator; it represents a correction increment to the screening result and improves the accuracy of the overall screening.
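The calculation formula itself appears in the original publication only as an image, so the sketch below is an assumed composition of the quantities the text does name: rounding up, the Lagrangian operator θ, the instruction action data N, the instruction type count L, and the σ(N-1) compensation term. It illustrates the screening idea, not the patent's exact formula.

```python
import math


def screen_dispersion_instructions(theta: float, n_actions: int,
                                   n_types: int, sigma: float = 0.1) -> int:
    """Illustrative dispersion-instruction screening. The main term is an
    assumption; only the rounding up and the sigma*(N-1) compensation
    applied after the operator multiplication are named in the text."""
    base = math.ceil(theta * n_actions / n_types)  # assumed main term
    compensation = sigma * (n_actions - 1)         # correction increment
    return base + round(compensation)
```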
Step S108, outputting the local dispersion instruction data through an instruction analysis model to obtain the instruction operation intention.
Specifically, after the locally accurate dispersion instruction data are obtained, they need to be input into a trained DNN neural network model as a feature input vector. Using the correspondence learned by its hidden layers, the model outputs the instruction operation intention, that is, which instruction data produce which user operation intention, so that the overall data of the user instruction are used to judge the processing operation the end user intends to perform.
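A minimal sketch of such an instruction analysis model, assuming PyTorch; the layer sizes, input dimension, and intent classes are hypothetical, since the patent states only that a trained DNN maps the dispersion instruction data, given as a feature vector, to an instruction operation intention.

```python
import torch
import torch.nn as nn


class IntentDNN(nn.Module):
    """Feed-forward sketch: dispersion instruction data in, instruction
    operation intention class out (all sizes are assumptions)."""

    def __init__(self, in_dim: int = 16, n_intents: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),  # hidden layers carry the
            nn.Linear(64, 32), nn.ReLU(),      # learned correspondence
            nn.Linear(32, n_intents),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = IntentDNN()
dispersion = torch.randn(1, 16)              # placeholder feature vector
intent_id = model(dispersion).argmax(dim=1)  # predicted operation intention
```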
Optionally, after outputting the local dispersion instruction data through the instruction analysis model to obtain the instruction operation intention, the method further includes: executing the picture-processing operation according to the instruction operation intention.
This embodiment solves the technical problem that, in the prior art, manual instruction control of a camera system relies only on fixed, real-time input of instruction information for identification and judgment, so that the user must enter instructions through finger-and-palm gestures, with erroneous input wasting unnecessary work and reducing the overall efficiency of picture control and image processing.
Example two
Fig. 2 is a block diagram of an apparatus for controlling manual instructions through a picture according to an embodiment of the present invention. As shown in Fig. 2, the apparatus comprises:
the acquiring module 20 is configured to acquire first instruction information by means of image acquisition, where the first instruction information is sent by a person.
Specifically, in order to solve the technical problem that, in the prior art, manual instruction control of a camera system relies only on fixed, real-time input of instruction information for identification and judgment, forcing the user to enter instructions through finger-and-palm gestures, with erroneous input wasting unnecessary work and reducing the overall efficiency of picture control and image processing, the first instruction information issued by the user must first be acquired. The acquisition may be performed by an independent action-recognition camera that identifies and captures the user's instruction actions and converts the collected image data into the first instruction information for storage and processing.
Optionally, the apparatus further includes: a processing module, configured to perform gray-scale processing on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255).
Specifically, in order to efficiently identify the user's instruction from the first instruction information collected by the camera, gray-scale processing must be performed on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255). This facilitates the subsequent action recognition of the instruction information, eliminates image interference data, and increases the accuracy of image action recognition.
The identification module 22 is configured to perform action recognition on the first instruction information to obtain instruction action data.
Specifically, after the gray image data of the first instruction information are obtained, action recognition needs to be performed on them to obtain the instruction action data. For example, when the user performs a "zoom-in" instruction action, the gray-processed action image data can be recognized by the action recognition model, which outputs the instruction recognition result, namely that the imaging system should zoom in on the real-time image data it collects.
The calculation module 24 is configured to perform Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data.
Optionally, the calculation module includes: an acquisition unit, configured to acquire the closest Lagrangian operator according to the first instruction information; and a computing unit, configured to generate the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types.
Specifically, in order to further refine the instruction action data, the embodiment of the invention decomposes them into local dispersion instruction data, so that user instruction actions can be screened in a targeted manner and useless user instructions or actions can be eliminated. The Lagrangian screening algorithm serves as the method for screening the dispersion instruction data, and σ(N-1) serves as an instruction-action compensation quantity applied after the multiplication by the Lagrangian operator; it represents a correction increment to the screening result and improves the accuracy of the overall screening.
The output module 26 is configured to output the local dispersion instruction data through an instruction analysis model to obtain the instruction operation intention.
Specifically, after the locally accurate dispersion instruction data are obtained, they need to be input into a trained DNN neural network model as a feature input vector. Using the correspondence learned by its hidden layers, the model outputs the instruction operation intention, that is, which instruction data produce which user operation intention, so that the overall data of the user instruction are used to judge the processing operation the end user intends to perform.
Optionally, the apparatus further includes: an execution module, configured to execute the picture-processing operation according to the instruction operation intention.
This embodiment solves the technical problem that, in the prior art, manual instruction control of a camera system relies only on fixed, real-time input of instruction information for identification and judgment, so that the user must enter instructions through finger-and-palm gestures, with erroneous input wasting unnecessary work and reducing the overall efficiency of picture control and image processing.
According to another aspect of the embodiments of the present invention, there is further provided a nonvolatile storage medium comprising a stored program, wherein, when the program runs, it controls a device in which the nonvolatile storage medium is located to execute the method for controlling manual instructions through a picture.
Specifically, the method comprises the following steps: acquiring first instruction information by means of image acquisition, wherein the first instruction information is issued by a person; performing action recognition on the first instruction information to obtain instruction action data; performing Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data; and outputting the local dispersion instruction data through an instruction analysis model to obtain the instruction operation intention. Optionally, after the first instruction information is acquired by means of image acquisition, the method further includes: performing gray-scale processing on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255). Optionally, performing Lagrangian operator calculation on the instruction action data to obtain the local dispersion instruction data includes: acquiring the closest Lagrangian operator according to the first instruction information; and generating the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types. Optionally, after outputting the local dispersion instruction data through the instruction analysis model to obtain the instruction operation intention, the method further includes: executing the picture-processing operation according to the instruction operation intention.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device comprising a processor and a memory; the memory stores computer readable instructions, and the processor is configured to run the computer readable instructions, which, when executed, perform the method for controlling manual instructions through a picture.
Specifically, the method comprises the following steps: acquiring first instruction information by means of image acquisition, wherein the first instruction information is issued by a person; performing action recognition on the first instruction information to obtain instruction action data; performing Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data; and outputting the local dispersion instruction data through an instruction analysis model to obtain the instruction operation intention. Optionally, after the first instruction information is acquired by means of image acquisition, the method further includes: performing gray-scale processing on the image data of the first instruction information, wherein the RGB segment value of the gray-scale processing is (0, 255). Optionally, performing Lagrangian operator calculation on the instruction action data to obtain the local dispersion instruction data includes: acquiring the closest Lagrangian operator according to the first instruction information; and generating the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types. Optionally, after outputting the local dispersion instruction data through the instruction analysis model to obtain the instruction operation intention, the method further includes: executing the picture-processing operation according to the instruction operation intention.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present invention, the description of each embodiment has its own emphasis; for a part not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology content may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, for example, may be a logic function division, and may be implemented in another manner, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, fig. 3 is a schematic hardware structure of a terminal device according to an embodiment of the present application. As shown in fig. 3, the terminal device may include an input device 30, a processor 31, an output device 32, a memory 33, and at least one communication bus 34. The communication bus 34 is used to enable communication connections between the elements. The memory 33 may comprise a high-speed RAM memory or may further comprise a non-volatile memory NVM, such as at least one magnetic disk memory, in which various programs may be stored for performing various processing functions and implementing the method steps of the present embodiment.
Alternatively, the processor 31 may be implemented as, for example, a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and the processor 31 is coupled to the input device 30 and the output device 32 through wired or wireless connections.
Alternatively, the input device 30 may include a variety of input devices, for example, may include at least one of a user-oriented user interface, a device-oriented device interface, a programmable interface of software, a camera, and a sensor. Optionally, the device interface facing the device may be a wired interface for data transmission between devices, or may be a hardware insertion interface (such as a USB interface, a serial port, etc.) for data transmission between devices; alternatively, the user-oriented user interface may be, for example, a user-oriented control key, a voice input device for receiving voice input, and a touch-sensitive device (e.g., a touch screen, a touch pad, etc. having touch-sensitive functionality) for receiving user touch input by a user; optionally, the programmable interface of the software may be, for example, an entry for a user to edit or modify a program, for example, an input pin interface or an input interface of a chip, etc.; optionally, the transceiver may be a radio frequency transceiver chip, a baseband processing chip, a transceiver antenna, etc. with a communication function. An audio input device such as a microphone may receive voice data. The output device 32 may include a display, audio, or the like.
In this embodiment, the processor of the terminal device may include functions for executing each module of the data processing apparatus in each device, and specific functions and technical effects may be referred to the above embodiments and are not described herein again.
Fig. 4 is a schematic hardware structure of a terminal device according to another embodiment of the present application. Fig. 4 is a specific embodiment of the implementation of fig. 3. As shown in fig. 4, the terminal device of the present embodiment includes a processor 41 and a memory 42.
The processor 41 executes the computer program code stored in the memory 42 to implement the methods of the above-described embodiments.
The memory 42 is configured to store various types of data to support operation at the terminal device. Examples of such data include instructions for any application or method operating on the terminal device, such as messages, pictures, and video. The memory 42 may include a random access memory (RAM) and may also include a non-volatile memory, such as at least one disk memory.
Optionally, a processor 41 is provided in the processing assembly 40. The terminal device may further include: a communication component 43, a power supply component 44, a multimedia component 45, an audio component 46, an input/output interface 47 and/or a sensor component 48. The components and the like specifically included in the terminal device are set according to actual requirements, which are not limited in this embodiment.
The processing component 40 generally controls the overall operation of the terminal device. The processing component 40 may include one or more processors 41 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 40 may include one or more modules that facilitate interactions between the processing component 40 and other components. For example, processing component 40 may include a multimedia module to facilitate interaction between multimedia component 45 and processing component 40.
The power supply assembly 44 provides power to the various components of the terminal device. Power supply components 44 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for terminal devices.
The multimedia component 45 comprises a display screen providing an output interface between the terminal device and the user. In some embodiments, the display screen may include a liquid crystal display (LCD) and a touch panel (TP). If the display screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation.
The audio component 46 is configured to output and/or input audio signals. For example, the audio component 46 includes a Microphone (MIC) configured to receive external audio signals when the terminal device is in an operational mode, such as a speech recognition mode. The received audio signals may be further stored in the memory 42 or transmitted via the communication component 43. In some embodiments, audio assembly 46 further includes a speaker for outputting audio signals.
The input/output interface 47 provides an interface between the processing assembly 40 and peripheral interface modules, which may be click wheels, buttons, etc. These buttons may include, but are not limited to: volume button, start button and lock button.
The sensor assembly 48 includes one or more sensors for providing status assessments of various aspects of the terminal device. For example, the sensor assembly 48 may detect the open/closed state of the terminal device, the relative positioning of components, and the presence or absence of user contact with the terminal device. The sensor assembly 48 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, including detecting the distance between the user and the terminal device. In some embodiments, the sensor assembly 48 may also include a camera or the like.
The communication component 43 is configured to facilitate wired or wireless communication between the terminal device and other devices. The terminal device may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one embodiment, the terminal device may include a SIM card slot for inserting a SIM card, so that the terminal device can log into a GPRS network and establish communication with a server through the Internet.
From the above, it will be appreciated that the communication component 43, the audio component 46, and the input/output interface 47, the sensor component 48 referred to in the embodiment of fig. 4 may be implemented as an input device in the embodiment of fig. 3.
In the several embodiments provided in the present application, it should be understood that the disclosed technology content may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, for example, may be a logic function division, and may be implemented in another manner, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make several improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also fall within the scope of the present invention.

Claims (4)

1. A method for controlling manual instructions through a picture, comprising:
acquiring first instruction information by means of image acquisition, wherein the first instruction information is issued by a person;
performing action recognition on the first instruction information to obtain instruction action data;
carrying out Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data;
outputting the local dispersion instruction data by using an instruction analysis model to obtain instruction operation intention;
wherein performing the Lagrangian operator calculation on the instruction action data to obtain the local dispersion instruction data comprises the following steps:
acquiring a closest Lagrangian operator according to the first instruction information;
and generating the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types;
after the first instruction information is acquired by means of image acquisition, the method further comprises:
performing gray scale processing on the image data of the first instruction information, wherein the RGB segment value of the gray scale processing is (0, 255);
after the outputting the local dispersion instruction data by using the instruction analysis model to obtain the instruction operation intention, the method further comprises:
and executing the picture-processing operation according to the instruction operation intention.
2. An apparatus for controlling manual instructions through a picture, comprising:
the acquisition module is used for acquiring first instruction information by means of image acquisition, wherein the first instruction information is issued by a person;
the identification module is used for carrying out action identification on the first instruction information to obtain instruction action data;
the calculation module is used for carrying out Lagrangian operator calculation on the instruction action data to obtain local dispersion instruction data;
the output module is used for outputting the local dispersion instruction data by utilizing an instruction analysis model to obtain an instruction operation intention;
wherein the computing module comprises:
the acquisition unit is used for acquiring the closest Lagrange operator according to the first instruction information;
the computing unit is used for generating the local dispersion instruction data by using the Lagrangian operator, wherein, in the Lagrangian operator calculation formula, ⌈·⌉ denotes rounding up, i is the dispersion instruction data set, θ is the Lagrangian operator, N is the instruction action data, and L is the number of instruction data types;
the apparatus further comprises:
a processing module, configured to perform gray-scale processing on the image data of the first instruction information, where an RGB segment value of the gray-scale processing is (0, 255);
the apparatus further comprises:
and the execution module is used for executing the picture-processing operation according to the instruction operation intention.
3. A non-volatile storage medium comprising a stored program, wherein the program when run controls a device in which the non-volatile storage medium resides to perform the method of claim 1.
4. An electronic device comprising a processor and a memory; the memory has stored therein computer readable instructions for execution by the processor, wherein the computer readable instructions when executed perform the method of claim 1.
CN202211548246.3A 2022-12-05 2022-12-05 Method and device for controlling manual instructions through picture Active CN115809006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211548246.3A CN115809006B (en) 2022-12-05 2022-12-05 Method and device for controlling manual instructions through picture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211548246.3A CN115809006B (en) 2022-12-05 2022-12-05 Method and device for controlling manual instructions through picture

Publications (2)

Publication Number Publication Date
CN115809006A CN115809006A (en) 2023-03-17
CN115809006B 2023-08-08

Family

ID=85484909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211548246.3A Active CN115809006B (en) 2022-12-05 2022-12-05 Method and device for controlling manual instructions through picture

Country Status (1)

Country Link
CN (1) CN115809006B (en)


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005044330A (en) * 2003-07-24 2005-02-17 Univ Of California San Diego Weak hypothesis generation device and method, learning device and method, detection device and method, expression learning device and method, expression recognition device and method, and robot device
WO2019149640A1 (en) * 2018-02-02 2019-08-08 Koninklijke Philips N.V. System and method for optimal sensor placement

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110959160A (en) * 2017-08-01 2020-04-03 华为技术有限公司 Gesture recognition method, device and equipment
CN109697394A (en) * 2017-10-24 2019-04-30 京东方科技集团股份有限公司 Gesture detecting method and gestures detection equipment
CN109032356A (en) * 2018-07-27 2018-12-18 深圳绿米联创科技有限公司 Sign language control method, apparatus and system
CN110866094A (en) * 2018-08-13 2020-03-06 珠海格力电器股份有限公司 Instruction recognition method, instruction recognition device, storage medium, and electronic device
CN109710062A (en) * 2018-12-11 2019-05-03 中国运载火箭技术研究院 It is a kind of based on brain electricity and hand signal merge across a body controlling means
CN111382644A (en) * 2018-12-29 2020-07-07 Tcl集团股份有限公司 Gesture recognition method and device, terminal equipment and computer readable storage medium
CN112908328A (en) * 2021-02-02 2021-06-04 安通恩创信息技术(北京)有限公司 Equipment control method, system, computer equipment and storage medium
CN113033398A (en) * 2021-03-25 2021-06-25 深圳市康冠商用科技有限公司 Gesture recognition method and device, computer equipment and storage medium
CN115170818A (en) * 2022-07-27 2022-10-11 北京拙河科技有限公司 Dynamic frame image feature extraction method and device

Also Published As

Publication number Publication date
CN115809006A (en) 2023-03-17


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant