CN113011259A - Operation method of electronic equipment - Google Patents

Operation method of electronic equipment

Info

Publication number
CN113011259A
Authority
CN
China
Prior art keywords
target object
video stream
electronic device
detection
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110181016.7A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Zhendi Intelligent Technology Co Ltd
Original Assignee
Suzhou Zhendi Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Zhendi Intelligent Technology Co Ltd
Priority to CN202110181016.7A
Publication of CN113011259A
Pending legal-status Critical Current

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30: Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31: User authentication
    • G06F21/32: User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides an operation method for an electronic device having a display screen. The operation method comprises the following steps: S101: acquiring a video stream of the display screen; S102: performing multi-region retrieval on the video stream, and executing step S103 when the video stream contains a plurality of regions; S103: performing target object detection on at least one of the plurality of regions and merging and outputting the detection results of the at least one region; or performing target object detection on a selected region among the plurality of regions and outputting the detection result. The preferred embodiments of the present invention provide an operation method that adopts per-region NMS, a combination of global NMS with partial-region NMS, or selected-region NMS to avoid or reduce missed detection of target objects when the display screen of the electronic device displays a plurality of windows containing a plurality of target objects.

Description

Operation method of electronic equipment
Technical Field
The present invention relates generally to the field of pattern recognition technology, and more particularly, to an operating method for an electronic device.
Background
When a plurality of application programs run simultaneously in an electronic device system (an embedded system), they often cannot all access the camera of the electronic device or simultaneously acquire the video stream captured by the camera, so some of them cannot perform their subsequent operations. For example, a mobile phone system may allow only one app to access the camera at a time. If a user is making a WeChat video call and another app also needs the camera's video stream for a specific operation such as face recognition, tracking or beautification, the camera is already occupied, so the later-started app cannot work normally unless the WeChat call is closed.
In addition, for the video stream of the display screen of an electronic device system (rather than the video stream captured by the camera), when a multi-party interactive app such as a WeChat call or a video conference is running, a plurality of windows are displayed on the display screen, and the video stream of the display screen contains a plurality of target objects.
The statements in this background section merely represent techniques known to the applicant and do not necessarily represent the prior art in the field.
Disclosure of Invention
In view of at least one of the drawbacks of the prior art, the present invention provides an operating method for an electronic device having a display screen, the operating method comprising:
S101: acquiring a video stream of the display screen;
S102: performing multi-region retrieval on the video stream, and executing step S103 when the video stream contains a plurality of regions;
S103: performing target object detection on at least one of the plurality of regions, and merging and outputting the detection results of the at least one region; or performing target object detection on a selected region among the plurality of regions and outputting the detection result.
According to an aspect of the invention, wherein step S103 further comprises:
performing global target object detection on the video stream, and merging and outputting the global detection result and the detection result of the at least one region.
According to an aspect of the invention, wherein step S102 is accomplished by hough detection.
According to an aspect of the invention, wherein the target object detection further comprises:
setting anchor points, and outputting a feature map through the convolutions of a neural network;
filtering a plurality of candidate boxes of the feature map by using a non-maximum suppression algorithm.
According to an aspect of the present invention, when step S103 performs target object detection on at least one of the plurality of regions and merges and outputs the detection results of the at least one region, the method further comprises:
setting anchor points respectively according to the sizes of the at least one region.
According to an aspect of the present invention, further comprising:
S104: performing feature recognition on at least one target object in the video stream, and selecting a tracking target from the at least one target object according to the recognition result.
According to an aspect of the invention, wherein step S104 further comprises:
performing recognition according to the skeleton features of the target object, and when a target object in the video stream is recognized as making a preset gesture, taking that target object as the tracking target.
According to an aspect of the present invention, further comprising:
S104: taking a target object selected by the user of the electronic device in the video stream as the tracking target.
According to an aspect of the invention, wherein the selection of the target object by the user of the electronic device is achieved by double-clicking and/or sliding the display screen.
According to an aspect of the invention, wherein step S103 further comprises:
performing feature recognition on the video stream, and determining the selected region according to the recognition result.
According to an aspect of the invention, wherein step S103 further comprises:
performing recognition according to pedestrian skeleton features, and when the video stream is recognized as containing a preset gesture, taking the region where the preset gesture is located as the selected region.
According to an aspect of the invention, wherein step S103 further comprises:
taking the region selected by the user of the electronic device in the video stream as the selected region.
According to an aspect of the invention, wherein the selection of the target area by the user of the electronic device is achieved by double clicking and/or sliding the display screen.
According to an aspect of the invention, the method of operation further comprises:
S104: taking the target object in the selected region as a tracking target.
According to an aspect of the invention, wherein step S103 further comprises:
S1031: performing global target object detection on the selected region, and outputting a detection result;
S1032: performing target object detection near the tracking target in the selected region, and outputting a detection result;
wherein step S1031 and step S1032 are performed alternately every other frame.
According to an aspect of the invention, the method of operation further comprises:
S105: acquiring first parameter information of the tracking target, and outputting updated pose parameters of the electronic device according to the first parameter information of the tracking target.
According to an aspect of the invention, wherein the first parameter information comprises a position parameter and/or a size parameter and/or an acceleration parameter.
According to an aspect of the present invention, wherein the electronic device further comprises a camera, wherein step S105 further comprises:
calculating the updated pose parameters of the electronic device according to the first parameter information of the tracking target, so that the position and/or size of the tracking target in the image acquired by the camera meets the preset requirement.
According to an aspect of the present invention, wherein the electronic device is mounted on a pan-tilt head, step S105 further includes: outputting the updated pose parameters of the electronic device to the pan-tilt head, and the operation method further comprises:
S106: adjusting, through the pan-tilt head, the pose of the electronic device according to the updated pose parameters of the electronic device.
According to one aspect of the invention, the electronic device comprises one or more of a mobile phone, a PAD, a motion camera, AR/VR glasses, and a home smart camera, and the target object comprises one or more of a human face, an iris, a heating element, and a dynamic target object.
The present invention also provides a computer-readable storage medium comprising computer-executable instructions stored thereon which, when executed by a processor, implement a method of operation as described above.
The preferred embodiments of the present invention provide an operation method of an electronic device that adopts per-region NMS, a combination of global NMS with partial-region NMS, or selected-region NMS to avoid or reduce missed detection of target objects when the display screen of the electronic device displays a plurality of windows containing a plurality of target objects. After the detection result of the target object is output, subsequent operations can be performed, which solves the problem that operations such as face recognition, beautification, human-body temperature measurement and pan-tilt-head following cannot be performed synchronously when another application program has already called the camera of the electronic device. The preferred embodiments of the invention do not need to rely on other application programs, so the user can complete the recognition and tracking of the target object while making WeChat calls or holding video conferences; practice has proved that the invention has beneficial effects.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 schematically illustrates the occurrence of a missed detection when detecting objects in a video stream of a display screen;
FIG. 2 is a diagram schematically illustrating multi-region object detection and merging of output detection results for a video stream of a display screen in accordance with a preferred embodiment of the present invention;
FIG. 3 schematically illustrates object detection of a selected area of a video stream on a display screen and outputting the detection results in accordance with a preferred embodiment of the present invention;
FIG. 4 is a diagram schematically illustrating global object detection and partial area object detection for a video stream of a display screen and merging of output detection results in accordance with a preferred embodiment of the present invention;
FIG. 5 illustrates a method of operation for an electronic device in accordance with a preferred embodiment of the present invention;
FIG. 6 schematically illustrates the recognition of a tracked target from pedestrian skeletal features in accordance with a preferred embodiment of the present invention;
FIG. 7 schematically illustrates the selection of a selected region based on pedestrian skeletal features in accordance with a preferred embodiment of the present invention;
FIG. 8 schematically illustrates displaying screen content on a display screen of an electronic device in accordance with a preferred embodiment of the present invention;
fig. 9 schematically illustrates an electronic device mounted on a pan/tilt head and tracking a target object through the pan/tilt head according to a preferred embodiment of the present invention.
Detailed Description
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like indicate orientations and positional relationships based on those shown in the drawings, are used only for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be considered as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the description of the present invention, it should be noted that, unless otherwise explicitly stated or limited, the terms "mounted," "connected," and "coupled" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; a mechanical connection or an electrical connection; a direct connection, or an indirect connection through an intervening medium, or a communication between the interiors of two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.
In the present invention, unless otherwise expressly stated or limited, a first feature being "above" or "below" a second feature means that the first and second features are in direct contact, or that the first and second features are not in direct contact but are in contact via another feature between them. Moreover, the first feature being "on," "above" or "over" the second feature includes the first feature being directly above or obliquely above the second feature, or merely indicates that the first feature is at a higher level than the second feature. The first feature being "under," "below" or "beneath" the second feature includes the first feature being directly below or obliquely below the second feature, or merely indicates that the first feature is at a lower level than the second feature.
The following disclosure provides many different embodiments or examples for implementing different features of the invention. To simplify the disclosure of the present invention, the components and arrangements of specific examples are described below. Of course, they are merely examples and are not intended to limit the present invention. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples, such repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. In addition, the present invention provides examples of various specific processes and materials, but one of ordinary skill in the art may recognize applications of other processes and/or uses of other materials.
The embodiments of the present invention will be described in conjunction with the accompanying drawings, and it should be understood that the embodiments described herein are only for the purpose of illustrating and explaining the present invention, and are not intended to limit the present invention.
In existing electronic devices, multiple application programs cannot be granted access to the camera at the same time; that is, when the camera is called by the process of one application program, other application programs cannot acquire the camera's video stream data. When the camera is called, in some cases the video content shot by the camera in real time is displayed on the display screen; as shown in fig. 1, the image containing the target object shot by the camera in real time is displayed in window B on the display screen. Here the target object is a human face, but those skilled in the art will readily understand that the target object may also be another object such as a human body, human eyes or a dynamic object. When the video stream of the camera of the electronic device cannot be acquired, the preferred embodiment of the present invention performs subsequent target detection by reading the video stream data of the display screen of the electronic device; one possible way of capturing such a screen stream is sketched below.
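As a rough illustration (not part of the patented method), a display-screen video stream can be approximated by periodically grabbing the screen contents. The sketch below assumes the third-party Python library mss as the capture backend; the library choice, the frame rate and the monitor index are illustrative assumptions only.

```python
import time
import numpy as np
import mss  # assumed third-party screen-capture library

def screen_video_stream(fps=15):
    """Yield BGR frames of the primary display as a pseudo video stream."""
    with mss.mss() as grabber:
        monitor = grabber.monitors[1]           # primary display
        while True:
            shot = grabber.grab(monitor)        # raw BGRA screenshot
            frame = np.array(shot)[:, :, :3]    # drop the alpha channel
            yield frame
            time.sleep(1.0 / fps)               # crude pacing
```

In practice, the screen-recording or system interfaces mentioned in the embodiments below would replace this capture step on a phone or embedded device.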
If the video stream of the display screen of the electronic device contains only the target object shot by the camera in real time, subsequent operations can be performed directly on that target object. In general, however, the video stream of the display screen includes not only the target object shot by the camera in real time but also other target objects. For example, when the camera of the electronic device is called by a multi-party interaction app such as a WeChat call or a video conference, the display screen of the electronic device contains a plurality of windows, each with at least one target object; as shown in FIG. 1, window A on the display screen displays a remotely interacting target object.
If target detection is performed on the video stream of the display screen as a whole, the target object in window B may be missed. The reason is that the Non-Maximum Suppression (NMS) algorithm used in target detection searches for local maxima and suppresses non-maximum elements. The main purpose of applying NMS in object detection is to eliminate redundant (overlapping) candidate windows and find the best detection position for each object. However, when two target objects belong to the same class and lie close together, their candidate boxes from the same anchor can overlap with similar confidences; during the probability sorting of NMS, the next-highest candidate box, which actually belongs to the other target object, may then be removed, directly losing the candidate detection box of that other target object. Fig. 1 schematically illustrates this situation: objects of the same class must keep a certain distance from each other, otherwise a missed detection can occur. The sketch below makes the mechanism explicit.
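For reference only, a minimal greedy NMS in Python (my own sketch, not taken from the patent) makes the failure mode concrete: once a candidate box overlaps a higher-scoring kept box beyond the IoU threshold it is discarded, even if it actually belongs to a second, nearby face.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, suppress boxes overlapping it."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # boxes overlapping the kept box beyond `thresh` are dropped here,
        # including boxes that belong to a different target close by
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep
```

The per-region strategies of the embodiments below keep such close-together targets apart by never letting boxes from different windows compete in the same suppression step.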
First embodiment
As shown in fig. 2, according to a preferred embodiment of the present invention, when the electronic device runs a multi-party interaction app, the display screen of the electronic device displays a plurality of windows containing a plurality of target objects. In order to prevent target objects in windows that occupy only a small part of the screen area from being suppressed (rejected during NMS filtering), the present invention provides an operation method 10 for an electronic device, where the electronic device has a display screen, and the operation method 10 comprises:
in step S101, a video stream of a display screen is acquired. Video streams of the display screen are derived from (1) screen recording images of the electronic equipment; (2) video streaming data with open interfaces of APP (application) such as video call and WeChat; (3) and the electronic equipment system opens video stream data through the interface. In the embodiment shown in fig. 2, the video stream of the display screen includes video streams of windows a-F, wherein the video streams of windows a-E have target objects therein, and the video stream of window F has text and/or picture content therein. Those skilled in the art will readily understand that all or part of the windows of the display screen of the electronic device having the target object are suitable for the operation method provided by the present invention, and all of them are within the protection scope of the present invention.
In step S102, a multi-region search is performed on the video stream, and when the video stream includes a plurality of regions, step S103 is performed. In the embodiment shown in fig. 2, the video stream of the display screen is subjected to a multi-region search and divided into a plurality of regions respectively containing windows A-F.
In step S103, target object detection is performed on at least one of the plurality of regions, and the detection results of the at least one region are merged and output. In the embodiment shown in fig. 2, target object detection is performed separately on each of the regions containing windows A-F, and the detection results of these regions are merged and output; a sketch of this per-region strategy follows.
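A minimal sketch (mine, not the patent's) of the per-region strategy: an assumed detector `detect` and the `nms` helper sketched above run independently inside each retrieved window region, and the surviving boxes are shifted back into screen coordinates before being concatenated.

```python
def detect_per_region(frame, regions, detect, nms_thresh=0.5):
    """Run detection and NMS inside each region, then merge in screen coordinates."""
    merged = []
    for (x, y, w, h) in regions:                # regions from multi-region retrieval
        crop = frame[y:y + h, x:x + w]
        boxes, scores = detect(crop)            # detector applied to this window only
        keep = nms(boxes, scores, nms_thresh)   # suppression is local to the region
        for i in keep:
            bx1, by1, bx2, by2 = boxes[i]
            merged.append((bx1 + x, by1 + y, bx2 + x, by2 + y, scores[i]))
    return merged
```

Because suppression never crosses a region boundary, a small face in window B cannot be suppressed by a large face in window A.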
Second embodiment
As shown in fig. 3, when the electronic device runs a multi-party interaction app, a plurality of windows containing a plurality of target objects are displayed on the display screen of the electronic device. In order to prevent target objects in windows that occupy only a small part of the screen area from being suppressed (a small window area tends to make the target object occupy a small area on the display screen, so it is easily rejected during NMS filtering), the present invention provides an operation method 10 for an electronic device, where the electronic device has a display screen, and the operation method 10 comprises:
in step S101, a video stream of a display screen is acquired. Video streams of the display screen are derived from (1) screen recording images of the electronic equipment; (2) video streaming data with open interfaces of APP (application) such as video call and WeChat; (3) and the electronic equipment system opens video stream data through the interface. In the embodiment shown in fig. 3, the video stream of the display screen includes video streams of windows a-F, wherein the video streams of windows a-E have target objects therein, and the video stream of window F has text and/or picture content therein. Those skilled in the art will readily understand that all or part of the windows of the display screen of the electronic device having the target object are suitable for the operation method provided by the present invention, and all of them are within the protection scope of the present invention.
In step S102, a multi-region search is performed on the video stream, and when the video stream includes a plurality of regions, step S103 is performed. In the embodiment shown in fig. 3, the video stream of the display screen is subjected to a multi-region search and divided into a plurality of regions respectively containing windows A-F.
In step S103, target object detection is performed on a selected region among the plurality of regions, and a detection result is output. In the embodiment shown in fig. 3, the selected region is the region containing window A; only the selected region is subjected to target object detection, and the detection result is output. Preferably, the selected region is determined through feature recognition (for example, recognition of a specific gesture from pedestrian skeleton features) or through selection by the user (for example, by clicking or sliding on the display screen). The window contained in the selected region displays the image shot in real time by the camera of the electronic device, which facilitates subsequent operations such as target tracking.
Third embodiment
As shown in fig. 4, when the electronic device runs a multi-party interaction app, a plurality of windows containing a plurality of target objects are displayed on the display screen of the electronic device. In order to prevent target objects in windows that occupy only a small part of the screen area from being suppressed (a small window area tends to make the target object occupy a small area on the display screen, so it is easily rejected during NMS filtering), the present invention provides an operation method 10 for the electronic device, where the electronic device has a display screen, and the operation method 10 comprises:
in step S101, a video stream of a display screen is acquired. Video streams of the display screen are derived from (1) screen recording images of the electronic equipment; (2) video streaming data with open interfaces of APP (application) such as video call and WeChat; (3) and the electronic equipment system opens video stream data through the interface. In the embodiment shown in fig. 4, the video stream of the display screen includes video streams of a window a and a window B, where the window a on the display screen displays an image containing a target object captured by a camera of the electronic device in real time, and the window B displays an image containing the target object interacted remotely.
In step S102, a multi-region search is performed on the video stream, and when the video stream includes a plurality of regions, step S103 is performed. In the embodiment shown in fig. 4, the video stream of the display screen is subjected to a multi-region search and divided into two regions respectively containing window A and window B.
In step S103, target object detection is performed on at least one of the plurality of regions, global target object detection is performed on the video stream, and the global detection result and the detection result of the at least one region are merged and output. When global target object detection is performed on the video stream of the display screen, missed detections may occur, so target object detection is additionally performed on one or more regions of small area to supplement the global detection result. After NMS filtering, this result is merged with the global detection result and output, thereby overcoming possible missed detections while keeping the algorithm economical.
In the embodiment shown in fig. 4, global target object detection is performed on the display screen of the electronic device, target object detection is performed separately on the region containing the smaller window B, and the detection results are merged and output, so that the target object in the small window B is not suppressed. Although fig. 4 shows an embodiment with only two windows, those skilled in the art will readily understand that, for cases with more windows, global detection is performed first, target detection is then performed separately on one or more windows with smaller areas, and the global detection result and the separate detection results of those windows are merged and output; these cases are all within the protection scope of the present invention. A sketch of this combined strategy follows.
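A rough sketch (my own, reusing the assumed `detect` and `nms` helpers from above) of the third embodiment: one global pass over the whole frame, separate passes over the small windows, and a final NMS over the union so that duplicate boxes found by both passes collapse into one.

```python
import numpy as np

def detect_global_plus_small(frame, small_regions, detect, nms_thresh=0.5):
    """Global detection supplemented by per-region detection on small windows."""
    boxes, scores = detect(frame)                        # global pass
    all_boxes = [tuple(b) for b in boxes]
    all_scores = list(scores)
    for (x, y, w, h) in small_regions:                   # windows likely to be missed
        crop = frame[y:y + h, x:x + w]
        b, s = detect(crop)
        all_boxes += [(x1 + x, y1 + y, x2 + x, y2 + y) for (x1, y1, x2, y2) in b]
        all_scores += list(s)
    keep = nms(all_boxes, np.asarray(all_scores), nms_thresh)   # collapse duplicates
    return [(*all_boxes[i], all_scores[i]) for i in keep]
```

Compared with running a separate pass in every region, only the small, easily-suppressed windows pay for the extra detection here.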
In the first embodiment, NMS is performed on each region independently, which avoids missed detections caused by overlapping candidate boxes when target objects contained in different regions are too close together. This embodiment solves, on the one hand, the technical problem that the video stream data of the camera of the electronic device cannot be acquired; on the other hand, it allows subsequent operations to be performed on all of the target objects appearing in the video stream of the display screen, rather than only on the target object shot by the camera in real time.
In the second embodiment, NMS is performed only on the selected region, which greatly reduces the required computation; this suits application scenarios in which only the target object contained in the selected region (usually the target object shot by the camera in real time) requires subsequent operations.
In the third embodiment, global NMS is performed on the video stream of the display screen, NMS is additionally performed on each partial region where missed detection is likely, and the detection results are merged. This embodiment improves the accuracy of target detection while, compared with the first embodiment, simplifying the algorithm.
Fig. 5 shows a flowchart of an operation method 10 for an electronic device according to a preferred embodiment of the present invention, where the electronic device has a display screen, and the operation method 10 includes:
in step S101, a video stream of a display screen is acquired;
in step S102, performing a multi-region search on the video stream, and if the video stream includes a plurality of regions, executing step S103;
in step S103, performing target object detection on at least one of the plurality of regions and merging and outputting the detection results of the at least one region; or performing target object detection on a selected region among the plurality of regions and outputting the detection result; or performing target object detection on at least one of the plurality of regions, performing global target object detection on the video stream, and merging and outputting the global detection result and the detection result of the at least one region.
According to a preferred embodiment of the present invention, step S102, performing multi-region retrieval on the video stream, is accomplished by Hough detection. Hough detection transforms points from rectangular coordinates into a polar-coordinate parameter space: each pixel is mapped to a curve, all pixels are traversed, and the intersection points of the parameter-space curves correspond to possible detected straight lines. One possible realization is sketched below.
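As one possible realization (mine, not prescribed by the patent), OpenCV's probabilistic Hough transform can pick out the long horizontal and vertical border lines of the windows; a separate grouping step would then assemble them into region rectangles. The cv2 functions used below exist in OpenCV, but the thresholds are illustrative assumptions.

```python
import cv2
import numpy as np

def find_window_borders(frame, min_len_ratio=0.3):
    """Return long horizontal/vertical line segments that likely delimit windows."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    h, w = gray.shape
    min_len = int(min(h, w) * min_len_ratio)     # only long borders are of interest
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                            minLineLength=min_len, maxLineGap=10)
    horizontals, verticals = [], []
    if lines is not None:
        for x1, y1, x2, y2 in lines.reshape(-1, 4):
            if abs(int(y1) - int(y2)) <= 2:      # nearly horizontal border
                horizontals.append((x1, y1, x2, y2))
            elif abs(int(x1) - int(x2)) <= 2:    # nearly vertical border
                verticals.append((x1, y1, x2, y2))
    return horizontals, verticals
```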
Step S102 can also be accomplished by the SIFT/SURF algorithms or the Haar feature extraction algorithm, but these algorithms alone cannot identify the sizes of the regions. Therefore, for the third embodiment, in which target detection is performed separately on the regions of small area and combined with the global target detection result of the video stream for output, if the SIFT/SURF or Haar feature extraction algorithm is adopted in step S102, it should be combined with edge detection in order to distinguish large and small regions.
According to a preferred embodiment of the present invention, wherein the target object detection further comprises:
setting anchor points, and outputting a feature map through the convolutions of a neural network;
a plurality of candidate boxes of the feature map are filtered with a non-maximum suppression algorithm (NMS).
According to a preferred embodiment of the present invention, when step S103 performs target object detection on a plurality of regions and merges and outputs the detection results, the operation method 10 further includes: setting anchors of different sizes according to the sizes of the plurality of regions.
Detection anchors of different sizes are set for regions of different sizes. Preferably, three sets of anchors are set during training (for example, large-, medium- and small-scale sets, spaced regularly, e.g. every 1000 or 2000 pixels, or irregularly); larger-scale anchors are selected for computation in larger regions, and more small-scale anchors are selected for computation in smaller regions, so as to speed up computation. In practical applications, the anchor size can be derived from the model according to the different region sizes, as illustrated by the toy example below.
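A toy illustration (my own; the scale cut-offs and anchor sizes are invented) of choosing which anchor set to use inside a region according to its size:

```python
# Three anchor sets, e.g. fixed at training time (all sizes are illustrative).
ANCHOR_SETS = {
    "small":  [(16, 16), (32, 32), (48, 48)],
    "medium": [(64, 64), (96, 96), (128, 128)],
    "large":  [(192, 192), (256, 256), (384, 384)],
}

def anchors_for_region(region_w, region_h):
    """Pick anchor scales roughly proportional to the region size (thresholds assumed)."""
    area = region_w * region_h
    if area < 200 * 200:                                   # small window: small anchors
        return ANCHOR_SETS["small"]
    if area < 600 * 600:                                   # medium window
        return ANCHOR_SETS["small"] + ANCHOR_SETS["medium"]
    return ANCHOR_SETS["medium"] + ANCHOR_SETS["large"]    # large region
```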
Steps S101-S103 of the operating method 10 enable all of the target objects in the video stream of the display screen to be detected; a target then needs to be selected from the plurality of target objects for subsequent operations such as target following. According to a preferred embodiment of the present invention, the operation method 10 further comprises:
in step S104, feature recognition is performed on at least one target object in the video stream, and a tracking target is selected from the at least one target object according to a recognition result.
As shown in fig. 6, according to a preferred embodiment of the present invention, recognition is performed according to the skeleton features of the target objects, and when a target object in the video stream is recognized as making a preset gesture, that target object is taken as the tracking target. From the human-body key-point information, pedestrian skeleton recognition can determine which pedestrian the current gesture belongs to, and the tracking target is thereby determined; a sketch of such a key-point test follows.
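As a hedged illustration (not the patent's algorithm), one simple preset gesture is a raised hand, which can be tested directly on pose key points. The key-point indices below follow the common COCO ordering, and the pose estimator producing the key points is assumed to exist.

```python
# COCO-style key-point indices (a common convention, assumed here).
L_SHOULDER, R_SHOULDER = 5, 6
L_WRIST, R_WRIST = 9, 10

def is_hand_raised(keypoints, min_conf=0.3):
    """True if either wrist is clearly above its shoulder (image y grows downward)."""
    def pt(i):
        x, y, c = keypoints[i]
        return (x, y) if c >= min_conf else None
    for wrist_i, shoulder_i in ((L_WRIST, L_SHOULDER), (R_WRIST, R_SHOULDER)):
        wrist, shoulder = pt(wrist_i), pt(shoulder_i)
        if wrist and shoulder and wrist[1] < shoulder[1]:
            return True
    return False

def pick_tracking_target(persons):
    """persons: list of (bbox, keypoints); return the box of the first person gesturing."""
    for bbox, keypoints in persons:
        if is_hand_raised(keypoints):
            return bbox
    return None
```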
According to a preferred embodiment of the present invention, the method of operation 10 further comprises:
in step S104, a target object selected by the user of the electronic device in the video stream is taken as a tracking target. Preferably, the selection of the target object by the user of the electronic device is achieved by clicking and/or sliding the display screen.
According to a preferred embodiment of the present invention, when step S103 performs target object detection on the selected region and outputs the detection result, the operation method 10 may determine the selected region by a method similar to the above method for selecting the tracking target; step S103 then further includes: performing feature recognition on the video stream and determining the selected region according to the recognition result. Preferably, as shown in fig. 7, recognition is performed according to pedestrian skeleton features, and when the video stream is recognized as containing a preset gesture, the region where the preset gesture is located is taken as the selected region.
According to a preferred embodiment of the present invention, the operation method 10 may select the selected area by a method similar to the above-mentioned method for selecting the tracking target, and the step S103 further includes: the area selected by the user of the electronic device in the video stream is taken as the selected area. Preferably, the selection of the target area by the user of the electronic device is achieved by clicking and/or sliding the display screen.
According to a preferred embodiment of the present invention, when the step S103 is performing target object detection on the selected area and outputting the detection result, the operation method 10 further includes:
The target object detected in the selected region is directly taken as the tracking target. In general, the target object detected in the selected region is the target object shot in real time by the camera of the electronic device, and subsequent operations such as target tracking can be performed on it. For more complex situations, for example when the selected region contains a plurality of target objects, the above methods for determining the target object (pedestrian skeleton feature recognition, or manual selection by the user) can further be used for selection; these are all within the scope of the present invention.
According to a preferred embodiment of the present invention, when the step S103 is performing target object detection on the selected area and outputting the detection result, the operation method 10 further includes:
in step S1031, global target object detection is performed on the selected region, and a detection result is output;
in step S1032, target object detection near the tracking target is performed on the selected area, and a detection result is output;
wherein step S1031 and step S1032 are alternately performed every other frame.
For the selected region, both global detection and local detection are used: in the non-tracking stage, global detection is performed; in the tracking stage, performing local detection increases the detection distance achievable in the tracking process and reduces the computational cost of the algorithm. To ensure detection reliability, a strategy of alternating global detection and local detection every other frame is adopted, which avoids missed detections caused by fast motion of the tracked target (the tracked target leaving the local detection window); a sketch of this alternation follows.
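A minimal sketch (mine) of the frame-alternation policy of steps S1031/S1032: even frames run detection over the whole selected region, odd frames only over a neighborhood of the last known tracking-target box; `detect` is again an assumed single-image detector.

```python
def expand_box(box, region_w, region_h, margin=0.5):
    """Grow a box by a relative margin, clipped to the region."""
    x1, y1, x2, y2 = box
    dw, dh = (x2 - x1) * margin, (y2 - y1) * margin
    return (max(0, int(x1 - dw)), max(0, int(y1 - dh)),
            min(region_w, int(x2 + dw)), min(region_h, int(y2 + dh)))

def detect_alternating(frames, selected_region, detect, init_target):
    """Alternate global detection (S1031) and local detection (S1032) frame by frame."""
    x, y, w, h = selected_region
    target = init_target                        # last known box of the tracked target
    for idx, frame in enumerate(frames):
        crop = frame[y:y + h, x:x + w]
        if idx % 2 == 0:                        # S1031: detect over the whole region
            boxes, scores = detect(crop)
        else:                                   # S1032: detect only near the target
            lx1, ly1, lx2, ly2 = expand_box(target, w, h)
            boxes, scores = detect(crop[ly1:ly2, lx1:lx2])
            boxes = [(bx1 + lx1, by1 + ly1, bx2 + lx1, by2 + ly1)
                     for (bx1, by1, bx2, by2) in boxes]
        if len(boxes):
            best = max(range(len(scores)), key=lambda i: scores[i])
            target = boxes[best]                # naive association: highest score wins
        yield target
```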
According to a preferred embodiment of the present invention, as shown in fig. 8, after tracking of the target object is started, all of the content of the current screen is displayed on the display screen of the electronic device, for example in the window C shown in fig. 8, and this display window may be shown or hidden.
According to a preferred embodiment of the present invention, the method of operation 10 further comprises:
in step S105, first parameter information of the tracking target is acquired, and an updated pose parameter of the electronic device is output according to the first parameter information of the tracking target. Preferably, the first parameter information comprises a position parameter and/or a size parameter, an acceleration parameter.
According to a preferred embodiment of the present invention, wherein the electronic device further comprises a camera, step S105 further comprises:
according to the first parameter information of the tracking target, the updated pose parameter of the electronic equipment is calculated, so that the position and/or the size of the tracking target in the image collected by the camera meet the preset requirement.
According to a preferred embodiment of the present invention, as shown in fig. 9, the electronic device 100 is mounted on a pan-tilt head 200, and step S105 further comprises: outputting the updated pose parameters of the electronic device 100 to the pan-tilt head 200. The operation method 10 further comprises:
In step S106, the pose of the electronic device 100 is adjusted through the pan-tilt head 200 according to the updated pose parameters of the electronic device 100.
According to a preferred embodiment of the present invention, in the operation method 10, the electronic device includes one or more of a mobile phone, a PAD, a motion camera, AR/VR glasses, and a home smart camera, and the target object includes one or more of a human face, an iris, a heating element, and a dynamic target object.
The present invention also provides, in accordance with a preferred embodiment thereof, a computer-readable storage medium including computer-executable instructions stored thereon which, when executed by a processor, implement the method of operation 10 as described above.
The preferred embodiments of the present invention provide an operation method of an electronic device that adopts per-region NMS, a combination of global NMS with partial-region NMS, or selected-region NMS to avoid or reduce missed detection of target objects when the display screen of the electronic device displays a plurality of windows containing a plurality of target objects. After the detection result of the target object is output, subsequent operations can be performed, which solves the problem that operations such as face recognition, beautification, human-body temperature measurement and pan-tilt-head following cannot be performed synchronously when another application program has already called the camera of the electronic device. The preferred embodiments of the invention do not need to rely on other application programs, so the user can complete the recognition and tracking of the target object while making WeChat calls or holding video conferences; practice has proved that the invention has beneficial effects.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (22)

1. An operating method for an electronic device, wherein the electronic device has a display screen, the operating method comprising:
S101: acquiring a video stream of the display screen;
S102: performing multi-region retrieval on the video stream, and executing step S103 when the video stream contains a plurality of regions;
S103: performing target object detection on at least one of the plurality of regions, and outputting a detection result of the at least one region.
2. The operating method of claim 1, wherein step S103 further comprises:
performing target object detection on a selected region among the plurality of regions and outputting a detection result.
3. The operating method of claim 1, wherein step S103 further comprises:
performing global target object detection on the video stream, and merging and outputting the global detection result and the detection result of the at least one region.
4. The method of operation of any of claims 1-3, wherein step S102 is accomplished by Hough detection.
5. The method of operation of any of claims 1-3, wherein the target object detection further comprises:
setting anchor points, and outputting a feature map through the convolutions of a neural network;
filtering a plurality of candidate boxes of the feature map by using a non-maximum suppression algorithm.
6. The operating method according to claim 5, wherein, when step S103 performs target object detection on at least one of the plurality of regions and merges and outputs the detection results of the at least one region, the method further comprises:
setting anchor points respectively according to the sizes of the at least one region.
7. The method of operation of any of claims 1-3, further comprising:
S104: performing feature recognition on at least one target object in the video stream, and selecting a tracking target from the at least one target object according to the recognition result.
8. The operating method of claim 7, wherein step S104 further comprises:
performing recognition according to skeleton features, and when a target object in the video stream is recognized as making a preset gesture, taking that target object as the tracking target.
9. The method of operation of any of claims 1-3, further comprising:
S104: taking a target object selected by a user of the electronic device in the video stream as a tracking target.
10. An operating method according to claim 9, wherein the selection of a target object by a user of the electronic device is effected by clicking and/or sliding the display screen.
11. The operating method of claim 2, wherein step S103 further comprises:
performing feature recognition on the video stream, and determining the selected region according to the recognition result.
12. The operating method of claim 11, wherein step S103 further comprises:
performing recognition according to skeleton features, and when the video stream is recognized as containing a preset gesture, taking the region where the preset gesture is located as the selected region.
13. The operating method of claim 2, wherein step S103 further comprises:
taking the region selected by the user of the electronic device in the video stream as the selected region.
14. An operating method according to claim 13, wherein the selection of a target area by a user of the electronic device is achieved by clicking and/or sliding the display screen.
15. The method of operation of any of claims 11-14, further comprising:
S104: taking the target object in the selected region as a tracking target.
16. The operating method of claim 15, wherein step S103 further comprises:
S1031: performing global target object detection on the selected region, and outputting a detection result;
S1032: performing target object detection near the tracking target in the selected region, and outputting a detection result;
wherein step S1031 and step S1032 are performed alternately every other frame.
17. The method of operation of claim 8 or 10, further comprising:
S105: acquiring first parameter information of the tracking target, and outputting updated pose parameters of the electronic device according to the first parameter information of the tracking target.
18. The operating method according to claim 17, wherein the first parameter information comprises a position parameter and/or a size parameter and/or an acceleration parameter.
19. The operating method of claim 17, wherein the electronic device further comprises a camera, wherein step S105 further comprises:
calculating the updated pose parameters of the electronic device according to the first parameter information of the tracking target, so that the position and/or size of the tracking target in the image acquired by the camera meets the preset requirement.
20. The operating method according to claim 17, wherein the electronic device is mounted on a pan-tilt head, step S105 further comprising: outputting the updated pose parameters of the electronic device to the pan-tilt head, the operating method further comprising:
S106: adjusting, through the pan-tilt head, the pose of the electronic device according to the updated pose parameters of the electronic device.
21. The method of operation of any of claims 1-3, wherein the electronic device includes one or more of a cell phone, a PAD, a motion camera, AR/VR glasses, a home smart camera, and the target object includes one or more of a human face, an iris, a heat emitting body, and a dynamic target object.
22. A computer-readable storage medium comprising computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1-21.
CN202110181016.7A 2021-02-09 2021-02-09 Operation method of electronic equipment Pending CN113011259A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110181016.7A CN113011259A (en) 2021-02-09 2021-02-09 Operation method of electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110181016.7A CN113011259A (en) 2021-02-09 2021-02-09 Operation method of electronic equipment

Publications (1)

Publication Number Publication Date
CN113011259A true CN113011259A (en) 2021-06-22

Family

ID=76401981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110181016.7A Pending CN113011259A (en) 2021-02-09 2021-02-09 Operation method of electronic equipment

Country Status (1)

Country Link
CN (1) CN113011259A (en)


Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150172599A1 (en) * 2013-12-13 2015-06-18 Blake Caldwell System and method for interactive animations for enhanced and personalized video communications
CN103870117A (en) * 2014-02-18 2014-06-18 联想(北京)有限公司 Information processing method and electronic equipment
CN109963187A (en) * 2017-12-14 2019-07-02 腾讯科技(深圳)有限公司 A kind of cartoon implementing method and device
CN108683855A (en) * 2018-07-26 2018-10-19 广东小天才科技有限公司 A kind of control method and terminal device of camera
CN109325967A (en) * 2018-09-14 2019-02-12 腾讯科技(深圳)有限公司 Method for tracking target, device, medium and equipment
CN109584213A (en) * 2018-11-07 2019-04-05 复旦大学 A kind of selected tracking of multiple target number
CN109785358A (en) * 2018-11-23 2019-05-21 山东航天电子技术研究所 It is a kind of that Tracking Method of IR Small Target is blocked based on circulation the anti-of affirmation mechanism
CN109669658A (en) * 2018-12-29 2019-04-23 联想(北京)有限公司 A kind of display methods, device and display system
CN109977952A (en) * 2019-03-27 2019-07-05 深动科技(北京)有限公司 Candidate target detection method based on local maximum
CN110147461A (en) * 2019-04-30 2019-08-20 维沃移动通信有限公司 Image display method, device, terminal device and computer readable storage medium
CN112073770A (en) * 2019-06-10 2020-12-11 海信视像科技股份有限公司 Display device and video communication data processing method
CN110348318A (en) * 2019-06-18 2019-10-18 北京大米科技有限公司 Image-recognizing method, device, electronic equipment and medium
CN110428449A (en) * 2019-07-31 2019-11-08 腾讯科技(深圳)有限公司 Target detection tracking method, device, equipment and storage medium
CN110428448A (en) * 2019-07-31 2019-11-08 腾讯科技(深圳)有限公司 Target detection tracking method, device, equipment and storage medium
CN111885303A (en) * 2020-07-06 2020-11-03 雍朝良 Active tracking recording and shooting visual method
CN112330715A (en) * 2020-10-09 2021-02-05 深圳英飞拓科技股份有限公司 Tracking method, tracking device, terminal equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wu Yuwei et al.: "Fundamentals and Applications of Deep Learning" (《深度学习基础及应用》), vol. 1, 30 November 2020, Beijing Institute of Technology Press, pages 233-235 *

Similar Documents

Publication Publication Date Title
CN108229277B (en) Gesture recognition method, gesture control method, multilayer neural network training method, device and electronic equipment
EP3648448B1 (en) Target feature extraction method and device, and application system
KR101760109B1 (en) Method and device for region extraction
CN108960067B (en) Real-time train driver action recognition system and method based on deep learning
CN108805900B (en) Method and device for determining tracking target
CN107438173A (en) Video process apparatus, method for processing video frequency and storage medium
EP2616993A1 (en) Smile detection systems and methods
GB2529943A (en) Tracking processing device and tracking processing system provided with same, and tracking processing method
JPH07168932A (en) Method for search of human being in video image
CN110163211B (en) Image recognition method, device and storage medium
CN103079034A (en) Perception shooting method and system
CN109002776B (en) Face recognition method, system, computer device and computer-readable storage medium
CN112207821A (en) Target searching method of visual robot and robot
JP2010057105A (en) Three-dimensional object tracking method and system
US9947106B2 (en) Method and electronic device for object tracking in a light-field capture
CN111259757B (en) Living body identification method, device and equipment based on image
CN114885119A (en) Intelligent monitoring alarm system and method based on computer vision
CN113065568A (en) Target detection, attribute identification and tracking method and system
CN113610865B (en) Image processing method, device, electronic equipment and computer readable storage medium
US9286707B1 (en) Removing transient objects to synthesize an unobstructed image
CN113225451A (en) Image processing method and device and electronic equipment
CN113569594A (en) Method and device for labeling key points of human face
CN113011259A (en) Operation method of electronic equipment
CN106780599A (en) A kind of circular recognition methods and system based on Hough changes
CN110472551A (en) A kind of across mirror method for tracing, electronic equipment and storage medium improving accuracy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination