WO2022078190A1 - Image acquisition method, apparatus, terminal and storage medium - Google Patents

Image acquisition method, apparatus, terminal and storage medium

Info

Publication number
WO2022078190A1
WO2022078190A1 PCT/CN2021/120652 CN2021120652W WO2022078190A1 WO 2022078190 A1 WO2022078190 A1 WO 2022078190A1 CN 2021120652 W CN2021120652 W CN 2021120652W WO 2022078190 A1 WO2022078190 A1 WO 2022078190A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target object
preset
target
image acquisition
Prior art date
Application number
PCT/CN2021/120652
Other languages
English (en)
French (fr)
Inventor
孙镇江
李辉
王桐
李军
张圣
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Priority to US18/249,160 (published as US20230394614A1)
Publication of WO2022078190A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0007 Image acquisition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S3/00 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
    • G01S3/80 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic or infrasonic waves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10141 Special mode during image acquisition
    • G06T2207/10148 Varying focus
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • the present disclosure relates to the technical field of image acquisition, and in particular, to an image acquisition method, device, terminal and storage medium.
  • image acquisition devices such as cameras are often required to capture and transmit video images.
  • Actual shooting scenarios are rich and varied, and current image acquisition devices do not adapt well to the needs of specific scenarios, which leads to problems such as low operating efficiency or poor shooting results.
  • the present disclosure provides an image acquisition method, device, terminal and storage medium.
  • the present disclosure adopts the following technical solutions.
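  • The present disclosure provides an image acquisition method for an image acquisition device, comprising: acquiring voice information; determining whether the voice information meets a first preset condition; if the voice information meets the first preset condition, determining the position of a target object; and, according to the position of the target object, photographing the target object to obtain an image of the target object.
  • The present disclosure also provides an image acquisition method for an image acquisition device, comprising: acquiring an image of a preset object; determining the position of a target object according to the image of the preset object; and, according to the position of the target object, photographing the target object to obtain an image of the target object.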
  • an image capture device comprising:
  • an identification unit for judging whether the voice information meets the first preset condition
  • a positioning unit configured to determine the position of the target object if the voice information meets the first preset condition
  • the photographing unit is used for photographing the target object according to the position of the target object to obtain an image of the target object.
  • an image capture device comprising:
  • the acquisition module is used to acquire the user's body image
  • a positioning module for determining the position of the target object according to the image of the preset object
  • the photographing module is used for photographing the target object according to the position of the target object to obtain an image of the target object.
  • the present disclosure provides a terminal comprising: at least one memory and at least one processor;
  • the memory is used for storing program codes
  • the processor is used for calling the program codes stored in the memory to execute the above method.
  • the present disclosure provides a storage medium for storing program code for performing the above-described method.
  • The image acquisition method provided by some embodiments of the present disclosure can determine the position of the target object when the acquired voice information meets the first preset condition, and photograph the target object to obtain an image of the target object. When the target object needs to be shown to other people, they can view it conveniently without the user manually operating the image acquisition device, which frees the user's hands and improves convenience.
  • In the image acquisition method provided by other embodiments of the present disclosure, an image of a preset object is acquired, the position of the target object is determined according to the image of the preset object, and the target object is photographed according to its position. When the target object needs to be displayed, the user can trigger the shooting naturally by controlling the preset object, and the whole process requires no pause and no manual operation of the image capture device, which improves convenience and the fluency of the display process.
  • FIG. 1 is a flowchart of an image acquisition method 100 according to an embodiment of the present disclosure.
  • FIG. 2 is a flowchart of an image acquisition method 200 according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an image acquisition method 300 according to an embodiment of the present disclosure.
  • FIG. 4 is a composition diagram of an image acquisition device according to an embodiment of the present disclosure.
  • FIG. 5 is a composition diagram of another image acquisition device according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • the term “including” and variations thereof are open-ended inclusions, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • For example, when a video conference or live broadcast is held, an image acquisition device is needed to acquire images.
  • Adapting the camera to different shooting scenarios, such as shooting a target object or taking a close-up of an item to be displayed, usually requires manual adjustment. The user's hands are therefore occupied, which is very inconvenient during meetings or live broadcasts.
  • some embodiments of the present disclosure propose an image acquisition method, which can be used in an image acquisition apparatus, and the image acquisition apparatus may be, for example, an image acquisition apparatus with a zoom camera.
  • FIG. 1 is a flowchart of an image acquisition method 100 according to an embodiment of the present disclosure.
  • the image acquisition method 100 includes the following steps S101-S104.
  • In some embodiments, the image capture device may be provided with a voice capture device, such as a microphone, to acquire voice information.
  • In other embodiments, the image capture device may acquire voice information from other devices through a network. For example, a voice capture device is communicatively connected to the image acquisition device, and the voice information is collected by the voice capture device and then transmitted to the image acquisition device.
  • S102 Determine whether the voice information meets the first preset condition.
  • The first preset condition may be, for example, that the voice information includes a specific word, or that the voice in the voice information is recognized as that of a specific user. The first preset condition is not specifically limited.
  • If the voice information meets the first preset condition, the position of the target object is determined.
  • The target object may be, for example, an object or a person that needs to be displayed or shown in close-up, such as a product introduced in a live broadcast, a user's face after using a specific beauty product, or a sample to be displayed in a video conference. The position of the target object can be represented by coordinates.
  • If the voice information does not meet the first preset condition, steps S101 and S102 are repeated until the acquired voice information meets the first preset condition.
  • S104 According to the position of the target object, photograph the target object to obtain an image of the target object.
  • The image acquisition device automatically adjusts the camera to capture the image of the target object, for example by zooming in on the target object or taking a close-up of it, so that a clearer image of the target object is captured for easy viewing.
  • In some cases, the focal length of the image acquisition device before the target object is photographed is already suitable for photographing the target object. In this case, the shooting angle of the image acquisition device can be adjusted directly to photograph the target object.
  • In other cases, the image acquisition device may have been taking a close-up of another, smaller object before the target object is photographed. Because the target object is larger than the object that was in close-up, the focal length needs to be adjusted to reduce the magnification and enlarge the current field of view.
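  • The flow of steps S101 to S104 can be sketched as a simple loop. This is only an illustrative sketch, not the disclosed implementation: the function name `run_method_100` and the four callables passed to it are hypothetical stand-ins for the voice source, the condition check, the positioning step and the camera control.

```python
from typing import Callable, Iterable, Optional, Tuple

Position = Tuple[float, float]


def run_method_100(
    voice_source: Iterable[str],                 # hypothetical: yields recognized utterances
    meets_condition: Callable[[str], bool],      # hypothetical: first preset condition check
    locate_target: Callable[[], Position],       # hypothetical: positioning step
    shoot: Callable[[Position], str],            # hypothetical: camera control, returns an image id
) -> Optional[str]:
    """Repeat S101/S102 until the condition is met, then run S103 and S104."""
    for voice in voice_source:                   # S101: acquire voice information
        if not meets_condition(voice):           # S102: check the first preset condition
            continue                             # not met: acquire voice information again
        position = locate_target()               # S103: determine the target position
        return shoot(position)                   # S104: photograph the target object
    return None                                  # voice source ended without a match


result = run_method_100(
    ["hello everyone", "now look here"],
    lambda v: "look here" in v,
    lambda: (120.0, 80.0),
    lambda pos: f"image@{pos[0]:.0f},{pos[1]:.0f}",
)
```

    Passing the steps in as callables keeps the sketch independent of any particular speech recognizer or camera API.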
  • In some embodiments, the proposed image acquisition method further includes: sending the captured image to a target terminal for playback. The target terminal may be, for example, a terminal that is communicatively connected to the image capturing apparatus and used to view the captured image.
  • For example, in a remote conference, the target terminal may be a terminal used by a participant of the remote conference to view the captured images.
  • In a live broadcast, the target terminal may be, for example, a terminal used by a viewer watching the live broadcast.
  • An embodiment of the present disclosure is introduced below, taking as an example the use of method 100 in a scenario of selling goods in a live video broadcast.
  • When a product is introduced, the camera is often adjusted to shoot the product so that the audience gets a clearer understanding of it.
  • Conventionally, the user is required to manually adjust the camera to shoot the product.
  • In this embodiment, when the host needs the product to be photographed, the host utters voice information. The image acquisition device obtains the voice information and determines whether the first preset condition is met. When the first preset condition is met, the image acquisition device acquires the position of the product and automatically adjusts the camera to capture an image of the product, so that the host does not need to manually adjust the image acquisition device during the live broadcast, which frees the host's hands and facilitates the introduction of the product.
  • In this way, the user can make the image acquisition device determine the position of the target object and shoot the target object simply by uttering voice information.
  • When the target object needs to be shown to others, they can view it conveniently without the user manually operating the image acquisition device.
  • In some embodiments, in step S102, judging whether the voice information meets the first preset condition includes: judging whether the voice information includes a preset keyword; if the voice information includes the keyword, the first preset condition is met; or, if the voice information does not include the keyword, the first preset condition is not met.
  • keywords are preset.
  • the keyword in this embodiment can be set by the user. For example, it can be words such as "physical display”, “look here", “look to the left”, “look to the right” and so on.
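  • A minimal sketch of the keyword check, assuming the voice information has already been converted to text by some speech recognizer (not shown). The function name `meets_first_preset_condition` and the case-insensitive substring match are illustrative assumptions.

```python
from typing import Iterable


def meets_first_preset_condition(transcript: str, keywords: Iterable[str]) -> bool:
    """Return True if the recognized text contains at least one preset keyword."""
    text = transcript.casefold()
    # Empty keywords are skipped so that "" does not match every transcript.
    return any(k.casefold() in text for k in keywords if k)


PRESET_KEYWORDS = ["physical display", "look here", "look to the left", "look to the right"]
```

    A plain substring match is the simplest choice; a deployed system might instead rely on keyword spotting in the audio itself.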
  • determining the position of the target object in step S103 includes: acquiring a body image of the user, and determining the position of the target object according to the user's body image.
  • the user's body image may be a part of the user's body image or the user's whole body image.
  • When showing the target object, the user's body will often make corresponding actions, and these actions indicate the position of the target object, so the position of the target object can be determined according to the user's body image. For example, the user usually points to the target object with a hand, or the user's eyes look at the target object, in which case the gaze direction can be used to determine the position of the target object.
  • In some embodiments, determining the position of the target object according to the user's body image includes: judging whether the body image includes feature points of the target limb; if the body image includes feature points of the target limb, determining the position of the target object according to the position of the feature points; or, if the body image does not include feature points of the target limb, reacquiring the user's body image.
  • The target limb is preset and may be a limb related to the target object, for example, the limb that operates the target object. For example, the target limb may be set to include at least one of a hand and an arm. Under normal circumstances, when the user needs to show the target object, he will point to or hold the target object with his hand, so the position of the target object can be determined by the position of the feature points of the target limb.
  • Determining the position of the target object according to the position of the feature point of the target limb may include, for example, taking the feature point of the target limb as the center of a circle, defining a target range with a preset distance as the radius, locating the target object within the target range, and then determining the position of the target object.
  • the position of the target object is usually located near the feature point of the target limb, so it can be searched in the vicinity of the feature point of the target limb and locate the target object. In this way, the speed of determining the position of the target object can be improved and computing resources can be saved.
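  • The radius search around a limb feature point can be sketched as follows. This is a sketch under assumptions: candidate object positions are taken to come from some object detector (not shown), and picking the candidate nearest to the feature point is an illustrative choice that the description does not mandate.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def locate_target(feature_point: Point,
                  candidates: Sequence[Point],
                  radius: float) -> Optional[Point]:
    """Return the candidate closest to feature_point within radius, or None."""
    best: Optional[Point] = None
    best_distance = radius
    for candidate in candidates:
        distance = math.dist(feature_point, candidate)
        if distance <= best_distance:        # inside the circular target range
            best, best_distance = candidate, distance
    return best
```

    Restricting the search to the circle means candidates far from the hand are never considered, which is the source of the speed and resource saving mentioned above.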
  • photographing the target object to obtain an image of the target object includes: adjusting the angle of view during photographing, and/or adjusting the focal length during photographing to photograph the target object.
  • The target object may not be within the current field of view before it is photographed, and the focal length may be unsuitable. Therefore, when the target object is photographed, the angle of view and/or the focal length need to be adjusted, thereby improving the shooting result.
  • In some embodiments, a communicatively connected controller may be pre-configured for the image capture device, and in response to control information sent by the controller, the angle of view and/or focal length may be adjusted according to the control information.
  • the user controls the angle of view and/or focal length through the controller and will not be photographed by the image acquisition device, which is convenient for the user to select a suitable angle of view and/or focal length.
  • the angle of view during shooting is adjusted so that the target object is located in the middle of the captured image.
  • the purpose of photographing the target object is to show the target object. Therefore, adjusting the viewing angle during photographing can show the target object more clearly.
  • For example, the angle of view can be adjusted so that the coordinates of the target object are located at the center of the captured image, so that the target object is in the middle of the captured image.
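  • One way to turn the target's pixel coordinates into a view-angle correction is sketched below. It assumes an ideal pinhole camera with square pixels and a known horizontal field of view; these assumptions, and the function name `pan_tilt_to_center`, are not taken from the disclosure.

```python
import math
from typing import Tuple


def pan_tilt_to_center(target_xy: Tuple[float, float],
                       image_size: Tuple[int, int],
                       hfov_deg: float) -> Tuple[float, float]:
    """Return (pan, tilt) in degrees that would move the target to the image center.

    Positive pan means the target is right of center; positive tilt means it is
    below center (image y grows downward).
    """
    width, height = image_size
    cx, cy = width / 2, height / 2
    # Focal length in pixels derived from the horizontal field of view.
    focal_px = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    pan = math.degrees(math.atan((target_xy[0] - cx) / focal_px))
    tilt = math.degrees(math.atan((target_xy[1] - cy) / focal_px))
    return pan, tilt
```

    For a target at the right edge of a frame with a 90 degree horizontal field of view, the pan correction is about 45 degrees, which matches half the field of view.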
  • the focal length during shooting is adjusted to increase the magnification during shooting.
  • the details of the target object need to be displayed when shooting the target object, so that other people can view the details and close-up of the target object at a close distance.
  • The magnification here refers to the magnification of the image acquisition device when the target object is photographed, which is greater than its magnification before the target object is photographed.
  • For example, if the magnification of the image capturing device before the target object is photographed is 1, the magnification when photographing the target object should be greater than 1.
  • For example, in a video conference, the camera captures the image of a participant and transmits it to other remote participants.
  • When the participant needs to display the details of an exhibit, the participant can control the image acquisition device to increase the magnification to 3 times by uttering voice information, so as to shoot the details of the exhibit.
  • the remote participants can see the details of the exhibits, and no manual operation is required by the participants, which frees the hands of the participants and improves the convenience.
  • adjusting the focal length during shooting includes: adjusting the focal length during shooting according to the size of the display screen; wherein the display screen is used to display the captured image.
  • the focal length of the image capturing device during shooting is related to the size of the display screen for display.
  • For example, the focal length during shooting can be adjusted so that the size of the image of the target object on the display screen is not smaller than a target size, and/or so that the ratio of the area of the image of the target object on the display screen to the area of the display screen is not smaller than a target ratio.
  • For example, it may be set that the size of the image of the target object on the display screen is not less than 10 cm in both the horizontal and vertical directions.
  • Alternatively, it may be set that the image of the target object occupies not less than 75% of the area of the display screen. In this way, when the display screen is small, the focal length is automatically adjusted to make the image of the target object large enough, and when the display screen is large, the focal length is automatically adjusted in proportion to the area of the display screen so that the image of the target object is not too small.
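  • The size and area constraints can be turned into a required zoom multiplier as sketched below. The sketch assumes the displayed width and height of the target scale linearly with the zoom multiplier (so its area scales with the square), and the function name `required_zoom` is hypothetical. It does not check whether the zoomed target still fits on the screen.

```python
import math
from typing import Optional


def required_zoom(obj_w: float, obj_h: float,
                  screen_w: float, screen_h: float,
                  min_size: Optional[float] = None,
                  min_area_ratio: Optional[float] = None) -> float:
    """Smallest zoom multiplier (>= 1.0) satisfying the given constraints.

    obj_w and obj_h are the target's displayed dimensions at the current zoom,
    in the same unit as the screen dimensions (for example, centimetres).
    """
    zoom = 1.0
    if min_size is not None:
        # Both the horizontal and the vertical extent must reach min_size.
        zoom = max(zoom, min_size / obj_w, min_size / obj_h)
    if min_area_ratio is not None:
        current_ratio = (obj_w * obj_h) / (screen_w * screen_h)
        # Area grows with zoom squared, hence the square root.
        zoom = max(zoom, math.sqrt(min_area_ratio / current_ratio))
    return zoom
```

    With a 5 cm by 4 cm image of the target and a 10 cm minimum size, the vertical extent is the binding constraint and the multiplier is 2.5.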
  • a voice instruction is acquired, and the angle of view and/or the focal length during shooting is adjusted according to the voice instruction.
  • In some embodiments, the viewing angle and/or focal length used to photograph the target object can be further adjusted by voice control.
  • The voice instruction can be included in the voice information; that is, the user can utter voice information that contains the voice instruction.
  • In some embodiments, the method further includes: obtaining voice information again; judging whether the voice information obtained again meets a second preset condition; and, if the voice information obtained again meets the second preset condition, adjusting the image acquisition device to a first state; wherein the first state is the state of the image acquisition device before the target object is photographed to obtain the image of the target object. In this embodiment, after the target object has been photographed, it may no longer be necessary to continue photographing it.
  • In this case, the image acquisition device can be controlled by voice information to return to the first state, that is, its state before step S104.
  • The second preset condition in this embodiment may be, for example, that the re-acquired voice information includes a preset target word. When it is recognized that the re-acquired voice information includes the target word, the image acquisition device exits the state of shooting the target object and returns to the first state.
  • For example, the viewing angle and focal length of the image capturing device before step S104 may be recorded, and the viewing angle and focal length of the image capturing device may then be adjusted back to the recorded values.
  • Alternatively, the user being photographed and the focal length used before step S104 may be recorded, and the image capture device is then controlled to photograph the recorded user again with the recorded focal length.
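  • Recording the state before step S104 and restoring it on a target word can be sketched as follows. The class and field names (`CameraState`, `CameraController`, `pan_deg`, `tilt_deg`, `zoom`) are hypothetical, and the target-word check is a plain substring match for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CameraState:
    pan_deg: float
    tilt_deg: float
    zoom: float


class CameraController:
    def __init__(self, state: CameraState) -> None:
        self.state = state
        self._first_state: Optional[CameraState] = None

    def shoot_target(self, target_state: CameraState) -> None:
        """Record the current state (the first state), then move to the target."""
        if self._first_state is None:
            self._first_state = self.state
        self.state = target_state

    def handle_voice(self, transcript: str, target_words: Iterable[str]) -> bool:
        """Restore the first state if the transcript contains a target word."""
        if self._first_state is None:
            return False                      # not currently shooting a target
        if any(word in transcript for word in target_words):
            self.state = self._first_state
            self._first_state = None
            return True
        return False


wide = CameraState(pan_deg=0.0, tilt_deg=0.0, zoom=1.0)
camera = CameraController(wide)
camera.shoot_target(CameraState(pan_deg=12.0, tilt_deg=-3.0, zoom=3.0))
restored = camera.handle_voice("ok, go back", ["go back"])
```

    Keeping the first state only while a target is being shot means a stray target word at other times has no effect.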
  • As shown in FIG. 2, another image acquisition method 200 is proposed.
  • the image acquisition method 200 is used in an image acquisition apparatus, and includes steps S201 to S203, and the details are as follows:
  • The preset object may be a preset item or part or all of the user's body, so the image of the preset object may be an image of a preset item or an image of the user's body; this is not limited here.
  • S202 Determine the position of the target object according to the image of the preset object.
  • The target object is located based on the image of the preset object, and its position can be represented by coordinates. The target object may be, for example, a product introduced in a live broadcast or a sample to be displayed in a video conference.
  • S203 According to the position of the target object, photograph the target object to obtain an image of the target object.
  • The image acquisition device automatically adjusts the camera to capture the image of the target object, for example by zooming in on the target object or taking a close-up of it, so that a clearer image of the target object is captured for easy viewing.
  • In some cases, the focal length of the image capture device before the target object is photographed is already suitable for photographing the target object. In this case, the shooting angle of the image capture device can be adjusted directly to capture the target object.
  • In other cases, the image capture device may have been taking a close-up of another, smaller object before the target object is photographed. Because the target object is larger than the object that was in close-up, the focal length needs to be adjusted to reduce the magnification and enlarge the current field of view.
  • In some embodiments, the proposed image acquisition method further includes: sending the captured image to a target terminal for playback. The target terminal may be, for example, a terminal that is communicatively connected to the image acquisition device.
  • For example, in a remote conference, the target terminal may be a terminal used by a participant of the remote conference.
  • In a live broadcast, the target terminal may be, for example, a terminal used by a viewer watching the live broadcast.
  • An embodiment of the present disclosure is introduced below, taking as an example the use of method 200 in a video conference.
  • The image acquisition device shoots the main conference venue, and remote participants take part in the conference through the captured images.
  • The participants in the main venue need to introduce exhibits.
  • When an exhibit is introduced, the camera is often adjusted to shoot the exhibit.
  • Conventionally, the participants in the main venue need to manually adjust the camera to shoot the exhibits, which is inconvenient.
  • In this embodiment, the image acquisition device acquires the user's body image, determines the position of the exhibit according to the user's body image, and automatically adjusts the camera to capture an image of the exhibit. There is therefore no need to manually adjust the image acquisition device while the exhibit is being presented in the conference, which facilitates the introduction of the exhibits.
  • the preset object image includes: a user's body image or an image of a preset item.
  • The user's body image may be a full-body image of the user or an image of part of the user's body. The number of users is not limited: there may be one or more users, and body images of multiple users may be captured.
  • the image of the preset item may be, for example, an image of a teaching pole, a demonstration pole, or the like.
  • In some embodiments, the method further includes: determining whether the image of the preset object meets a third preset condition; if the image of the preset object meets the third preset condition, determining the position of the target object according to the image of the preset object. In some embodiments, if the image of the preset object does not meet the third preset condition, acquiring the image of the preset object and determining whether it meets the third preset condition are repeated until the obtained image meets the third preset condition.
  • The third preset condition can be, for example, that the user has made a predetermined action. By setting the third preset condition, shooting is performed only when the target object needs to be photographed, which makes it convenient for the user to control when the target object is shot.
  • In some embodiments, the image of the preset object includes a body image of the user, and determining whether the image of the preset object meets the third preset condition includes: judging whether the body image includes a target limb performing a target action; if so, the third preset condition is met; or, if not, the third preset condition is not met.
  • A target limb is pre-specified (for example, the target limb may include at least one of a hand and an arm), and a target action is pre-specified. When the target limb is detected and its action is the target action, the third preset condition is met.
  • The target limb with the target action includes at least one of: a hand pointing at an object, a hand holding an object, a hand supporting an object, and an eye looking at an object.
  • These are actions that the user naturally performs when the target object needs to be displayed, so the user does not need to perform additional actions, and the whole process is natural and smooth and does not feel awkward.
  • the feature points of the target limb can be monitored in real time to determine the state of the target limb, thereby determining whether to photograph the target object.
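  • The third preset condition for a body image can be sketched as a lookup over detected (limb, action) pairs. The detections are assumed to come from some pose or gesture recognizer (not shown); the label strings and the function name `meets_third_preset_condition` are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical labels for "a target limb performing a target action".
TARGET_LIMB_ACTIONS = {
    ("hand", "pointing_at_object"),
    ("hand", "holding_object"),
    ("hand", "supporting_object"),
    ("eye", "looking_at_object"),
}


def meets_third_preset_condition(detections: Iterable[Tuple[str, str]]) -> bool:
    """Return True if any detected (limb, action) pair is a preset target pair."""
    return any(pair in TARGET_LIMB_ACTIONS for pair in detections)
```

    Calling this on every new frame corresponds to monitoring the target limb in real time and shooting only once a target action appears.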
  • In some embodiments, the image of the preset object includes an image of a preset item, and determining whether the image of the preset object meets the third preset condition includes: judging whether the preset item in the image of the preset item is held and pointed at an object; if so, the third preset condition is met; or, if not, the third preset condition is not met.
  • The preset item may be an item such as a demonstration pole, which can be used to point at the target object. When using the preset item, the user holds it and uses it to point to the target object.
  • The preset item pointing at an object may mean that an item is located within a distance threshold of a preset feature point of the preset item.
  • In some embodiments, determining the position of the target object according to the image of the preset object includes: acquiring the position of the feature point of the preset object in the image of the preset object; and determining the position of the target object according to the position of the feature point of the preset object.
  • the distance between the preset object and the target object is often relatively short, so the position of the target object can be determined according to the position of the feature point on the preset object.
  • In some embodiments, the image of the preset object includes the user's body image, and determining the position of the target object includes: determining the position of the target object according to the body image.
  • determining the position of the target object according to the body image includes: acquiring the position of the feature point of the target limb in the body image; and determining the position of the target object according to the position of the feature point of the target limb.
  • the target limb is preset, and the target limb may be a limb related to the target object, for example, the limb of the operation target object.
  • the target limb may be set to include at least one of a hand and an arm. Under normal circumstances, when the user needs to show the target object, he will point to or hold the target object with his hand, so the position of the target object can be determined by the position of the feature points of the target limb.
  • Determining the position of the target object according to the positions of the feature points of the preset object may include, for example, taking a feature point of the preset object as the center of a circle, defining a target range with a preset distance as the radius, and locating the target object within the target range.
  • The preset object may be the target limb of the user. Considering that the user usually uses the target limb (such as a hand) to point at or support the target object, the target object is usually located near the feature point of the target limb, so the target object can be found and located near the feature points of the target limb.
  • the preset object is a preset item, and at this time, the target object is located near the feature point of the preset item.
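The radius-based search described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes object detection has already produced candidate object centers in image coordinates, and the function name `locate_target` and its parameters are hypothetical.

```python
import math
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float]


def locate_target(feature_point: Point,
                  candidates: Iterable[Point],
                  radius: float) -> Optional[Point]:
    """Return the candidate closest to feature_point within radius, or None.

    feature_point: position of the hand (or preset item) feature point.
    candidates:    centers of detected objects (assumed to come from a detector).
    radius:        the preset distance that defines the circular target range.
    """
    best, best_dist = None, radius
    for cand in candidates:
        dist = math.dist(feature_point, cand)
        if dist <= best_dist:  # inside the target range and closer than before
            best, best_dist = cand, dist
    return best
```

Choosing the nearest candidate is one possible tie-breaking rule when several objects fall inside the target range; the source does not specify one.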
  • photographing the target object to obtain an image of the target object includes: adjusting the angle of view during photographing, and/or adjusting the focal length during photographing, to photograph the target object.
  • the target object may not be located in the current field of view before it is photographed, and the focal length may be inappropriate. Therefore, when the target object is photographed, the angle of view and/or the focal length need to be adjusted, thereby improving the shooting result.
  • the viewing angle when shooting is adjusted so that the target object is located in the middle of the shot image; in some embodiments, the purpose of shooting the target object is to show the target object, so adjusting the viewing angle when shooting can show the target object more clearly.
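One way to center the target object is to convert its pixel offset from the image center into pan and tilt angles. The sketch below assumes a simple pinhole camera model with square pixels and a known horizontal field of view; the function name `centering_offsets` and its parameters are hypothetical and do not come from the source.

```python
import math
from typing import Tuple


def centering_offsets(target_xy: Tuple[float, float],
                      frame_size: Tuple[int, int],
                      hfov_deg: float) -> Tuple[float, float]:
    """Return (pan_deg, tilt_deg) that would move the target to the image center.

    Positive pan means turn right; positive tilt means turn up
    (image y grows downward).
    """
    width, height = frame_size
    cx, cy = width / 2.0, height / 2.0
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = cx / math.tan(math.radians(hfov_deg) / 2.0)
    pan = math.degrees(math.atan((target_xy[0] - cx) / focal_px))
    tilt = math.degrees(math.atan((cy - target_xy[1]) / focal_px))
    return pan, tilt
```

A device without a motorized mount could instead apply the same offset as a crop window shift; that variant is not shown here.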
  • the focal length when shooting is adjusted to increase the magnification when shooting.
  • the details of the target object need to be displayed when shooting the target object, so that other people can view the details and close-up of the target object at a close distance. Therefore, it is necessary to adjust the focal length during shooting to increase the magnification during shooting.
  • the magnification refers to the magnification of the image acquisition device when shooting the target object, which is greater than the magnification of the image acquisition device before shooting the target object. For example, when acquiring voice information, the magnification of the image acquisition device is 1.
  • the magnification ratio when shooting the target object should be greater than 1. By increasing the magnification ratio, the image of the captured target object can be enlarged, so that the details of the target object can be captured and the target object can be close-up.
  • adjusting the focal length during shooting includes: adjusting the focal length during shooting according to the size of the display screen; wherein the display screen is used to display the captured image.
  • the focal length of the image capturing device during shooting is related to the size of the display screen used for display. For example, the focal length during shooting can be adjusted so that the size of the image of the target object on the display screen is not smaller than a target size, and/or so that the ratio of the area of the image of the target object on the display screen to the area of the display screen is not smaller than a target ratio.
  • for example, the size of the image of the target object in the horizontal and vertical directions of the display screen is set to be not less than 10 cm.
  • for example, the area of the image of the target object is set to account for not less than 75% of the area of the display screen, so that when the display screen is small, the focal length is automatically adjusted to make the image of the target object large enough, and when the display screen is large, the focal length is automatically adjusted in proportion to the area of the display screen so that the image of the target object is not too small.
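The two display-based criteria above (a minimum on-screen size and a minimum area ratio) can be turned into a magnification estimate. The sketch below is an assumption-laden illustration: it treats the target's on-screen image as a bounding box whose sides scale linearly with magnification, applies both criteria together (the source allows either or both), and uses the hypothetical name `required_magnification`.

```python
import math


def required_magnification(obj_w_cm: float, obj_h_cm: float,
                           screen_w_cm: float, screen_h_cm: float,
                           min_side_cm: float = 10.0,
                           min_area_ratio: float = 0.75) -> float:
    """Smallest magnification (>= 1) meeting both display-based criteria.

    obj_w_cm, obj_h_cm: on-screen size of the target at magnification 1.
    min_side_cm:        example threshold from the text (10 cm).
    min_area_ratio:     example threshold from the text (75% of the screen).
    """
    # Side lengths scale linearly with magnification.
    by_side = min_side_cm / min(obj_w_cm, obj_h_cm)
    # Area scales with the square of the magnification.
    by_area = math.sqrt(min_area_ratio * screen_w_cm * screen_h_cm
                        / (obj_w_cm * obj_h_cm))
    return max(1.0, by_side, by_area)
```

For a target whose aspect ratio differs strongly from the screen's, the area criterion could push the image past the screen edge; a real system would presumably cap the result, which this sketch does not do.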
  • it also includes acquiring a voice command, and adjusting the angle of view and/or the focal length during shooting according to the voice command.
  • the viewing angle and/or focal length of the target object can be controlled by voice to further adjust the shooting angle and/or focal length.
  • the voice command can be included in the voice information, that is, the user can send voice information that contains the voice command.
  • the method further includes: acquiring voice information; judging whether the acquired voice information complies with a fourth preset condition; if the acquired voice information complies with the fourth preset condition, adjusting the image capture device to a second state;
  • the second state is the state of the image acquisition device before the image of the target object is obtained by photographing the target object.
  • the fourth preset condition in this embodiment may be, for example, that the acquired voice information includes a preset target word; when it is recognized that the acquired voice information includes the target word, the image acquisition device exits the state of shooting the target object and returns to its state before step S203.
  • the viewing angle and focal length of the image capturing device before step S203 may be recorded, and the viewing angle and focal length of the image capturing device may be adjusted to the viewing angle and focal length recorded before step S203.
  • the user photographed by the image capturing device and the focal length in use may be recorded.
  • the image acquisition device is then controlled to shoot the recorded user again with the recorded focal length.
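Recording the device state before the close-up and restoring it afterwards can be sketched as below. The class and field names (`CameraState`, `CloseUpController`, `pan_deg`, `tilt_deg`, `zoom`) are hypothetical; the source only says that the viewing angle and focal length before step S203 are recorded and later restored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CameraState:
    pan_deg: float
    tilt_deg: float
    zoom: float


class CloseUpController:
    """Keeps the pre-close-up state so it can be restored later."""

    def __init__(self, initial: CameraState) -> None:
        self.current = initial
        self._saved: Optional[CameraState] = None

    def enter_close_up(self, close_up: CameraState) -> None:
        # Record the state only on the first transition into close-up,
        # so repeated adjustments do not overwrite the original state.
        if self._saved is None:
            self._saved = self.current
        self.current = close_up

    def exit_close_up(self) -> CameraState:
        # Restore the recorded state, if any, and clear the record.
        if self._saved is not None:
            self.current = self._saved
            self._saved = None
        return self.current
```

In this sketch `exit_close_up` would be called once the voice information meets the fourth preset condition.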
  • an image acquisition method 300 is also proposed.
  • the method in this embodiment is described by taking a video conference as an example.
  • the video conference system is activated, the image acquisition device captures the conference site, and all parties join the conference. Start the voice detection thread and monitor the voice information.
  • for the voice information sent by the user, determine whether a preset keyword is recognized in the voice information. If the preset keyword is not recognized, continue to monitor the voice information. If the preset keyword is recognized, it indicates that the user wants to display the target object. At this time, the user's body image is obtained, feature points such as hands and bones in the body image are identified, and it is judged whether the identified feature points include the target feature point.
  • the target feature point here can be a hand feature point.
  • the coordinates of the displayed object are located according to the coordinates of the target feature point, and a new viewing angle is calculated with the coordinates of the displayed object as the center and the display screen size as a reference.
  • adjust the direction and zoom of the image capture device: adjust the focal length to zoom in on the details of the displayed object, and send the detailed picture to the remote participants so that they can view those details.
  • when the user sends voice information again, if it includes a preset stop-close-up command, the detailed close-up of the displayed object is exited and the original image is output.
  • the original image may be, for example, a picture taken using the angle of view and focal length in use before the close-up of the displayed object.
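The video conference flow above amounts to a small state machine driven by recognized speech and the detected hand feature point. The sketch below assumes speech has already been transcribed to text and that hand detection returns a point or `None`; the keyword strings, the `Session` class, and the `step` function are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

START_KEYWORD = "show this"      # hypothetical preset keyword
STOP_KEYWORD = "stop close-up"   # hypothetical stop-close-up command


@dataclass
class Session:
    close_up: bool = False
    center: Optional[Tuple[float, float]] = None


def step(session: Session, utterance: str,
         hand_point: Optional[Tuple[float, float]]) -> str:
    """Advance the session by one utterance and return the action to take."""
    text = utterance.lower()
    if not session.close_up:
        if START_KEYWORD not in text:
            return "monitor"                  # keep listening
        if hand_point is None:
            return "reacquire_body_image"     # no target feature point found
        session.close_up = True
        session.center = hand_point           # close-up is centered here
        return "close_up"
    if STOP_KEYWORD in text:
        session.close_up = False
        session.center = None
        return "original"                     # output the original picture
    return "close_up"
```

A caller would run `step` once per recognized utterance and map the returned action onto the camera control used by the device.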
  • an image acquisition apparatus including: a voice unit 401, configured to acquire voice information;
  • the identification unit 402 is used to judge whether the voice information meets the first preset condition
  • a positioning unit 403 configured to determine the position of the target object if the voice information meets the first preset condition
  • the photographing unit 404 is configured to photograph the target object according to the position of the target object to obtain an image of the target object.
  • determining the position of the target object by the positioning unit 403 includes: acquiring a body image of the user, and determining the position of the target object according to the body image of the user.
  • the positioning unit 403 determines the position of the target object according to the user's body image, including: judging whether the body image includes the feature point of the target limb; if the body image includes the feature point of the target limb, then determining according to the position of the feature point of the target limb The position of the target object; or, if the body image does not include the feature points of the target limb, re-acquire the user's body image.
  • the positioning unit 403 determines the position of the target object according to the positions of the feature points of the target limb, including: taking the feature points of the target limb as the center, and determining the target range with a preset distance as the radius; locating the target within the target range object to determine the position of the target object; alternatively, find and locate the target object in the vicinity of the feature points of the target limb.
  • the target limb includes at least one of a hand and an arm.
  • the photographing unit 404 photographing the target object to obtain an image of the target object includes: adjusting the angle of view during photographing, and/or adjusting the focal length during photographing to photograph the target object.
  • the photographing unit 404 adjusts the angle of view when photographing, so that the target object is located in the middle of the photographed image. In some embodiments, the photographing unit 404 adjusts the focal length when photographing, so as to increase the magnification when photographing.
  • the photographing unit 404 adjusts the focal length during photographing, including: adjusting the focal length during photographing according to the size of the display screen; wherein the display screen is used to display the photographed image.
  • the voice unit 401 is also used to obtain voice instructions.
  • the photographing unit 404 is further configured to adjust the angle of view and/or the focal length during photographing according to the voice command.
  • the voice unit 401 is also used to obtain voice information again.
  • the identification unit 402 is further configured to determine whether the re-acquired speech information meets the second preset condition.
  • the photographing unit 404 is further configured to adjust the image acquisition device to a first state if the re-acquired voice information meets the second preset condition, wherein the first state is the state of the image acquisition device before the target object is photographed to obtain an image of the target object.
  • the identifying unit 402 determines whether the voice information meets the first preset condition, including: judging whether the voice information includes a preset keyword; if the voice information includes the keyword, the first preset condition is met; or, if the voice information does not include the keyword, the first preset condition is not met.
  • an image acquisition device including:
  • an acquisition module 501 configured to acquire an image of a preset object
  • the positioning module 502 is used for determining the position of the target object according to the image of the preset object;
  • the photographing module 503 is configured to photograph the target object according to the position of the target object to obtain an image of the target object.
  • the image of the preset object includes an image of a user's body or an image of a preset item.
  • the image acquisition apparatus further includes a determination module, which is configured to determine, after the acquisition module 501 acquires the image of the preset object and before the positioning module 502 determines the position of the target object according to the image of the preset object, whether the image of the preset object meets the third preset condition; the positioning module 502 is configured to determine the position of the target object according to the image of the preset object if the image of the preset object meets the third preset condition.
  • the image of the preset object includes an image of the user's body.
  • the determining module determines whether the image of the preset object meets the third preset condition, including: judging whether the body image includes the target limb with the target action; if so, the third preset condition is met; or, if not, the third preset condition is not met.
  • the image of the preset object includes: an image of a preset item; the determining module determines whether the image of the preset object meets the third preset condition, including: judging whether the preset item in the image of the preset item is held and points to any object; if so, the third preset condition is met, or, if not, the third preset condition is not met.
  • the target limb having the target action includes at least one of: a hand pointing at the object, a hand holding the object, a hand supporting the object, and an eye looking at the object.
  • the positioning module 502 determines the position of the target object according to the image of the preset object, including: acquiring the position of the feature point of the preset object in the image of the preset object; determining the target according to the position of the feature point of the preset object the location of the object.
  • the positioning module 502 determines the position of the target object according to the position of the feature point of the preset object, including: determining the target range with the feature point of the preset object as the center and the preset distance as the radius; within the target range Locate the target object to determine the position of the target object; or, find and locate the target object in the vicinity of the feature points of the preset object.
  • the target limb includes at least one of a hand and an arm.
  • the photographing module 503 photographing the target object to obtain an image of the target object includes: adjusting the angle of view during photographing, and/or adjusting the focal length during photographing to photograph the target object.
  • the photographing module 503 adjusts the angle of view when photographing, so that the target object is located in the middle of the photographed image; and/or, adjusts the focal length when photographing to increase the magnification when photographing.
  • the photographing module 503 adjusts the focal length during photographing, including: adjusting the focal length during photographing according to the size of the display screen; wherein the display screen is used to display the photographed image.
  • a voice module is also included for obtaining voice commands.
  • the positioning module 502 is further configured to determine the position of the target object according to the image of the preset object, or adjust the angle of view and/or the focal length during shooting according to the voice command when the voice command satisfies the preset condition.
  • the speech module is also used to obtain speech information.
  • the determining module is further configured to determine whether the acquired voice information meets the fourth preset condition.
  • the shooting module 503 is further configured to adjust the image acquisition device to a second state if the acquired voice information meets the fourth preset condition; wherein the second state is the state of the image acquisition device before the image of the target object is obtained by shooting the target object .
  • the present disclosure also provides a terminal and a storage medium, which are described below.
  • referring to FIG. 6, it shows a schematic structural diagram of an electronic device (e.g., a terminal device or a server) 800 suitable for implementing an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., in-vehicle navigation terminals), and stationary terminals such as digital TVs and desktop computers.
  • the electronic device shown in the figure is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • Electronic device 800 may include a processing device (e.g., central processing unit, graphics processor, etc.) 801 that may perform various appropriate actions and processes according to a program stored in read only memory (ROM) 802 or a program loaded from storage device 808 into random access memory (RAM) 803. The RAM 803 also stores various programs and data necessary for the operation of the electronic device 800.
  • the processing device 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • An input/output (I/O) interface 805 is also connected to bus 804 .
  • the following devices can be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 807 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 808 including, for example, a magnetic tape, hard disk, etc.; and a communication device 809. The communication device 809 may allow electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While the figures show electronic device 800 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 809, or from the storage device 808, or from the ROM 802.
  • when the computer program is executed by the processing device 801, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • the client and server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the aforementioned computer-readable medium carries one or more programs, which, when executed by the electronic device, cause the electronic device to execute the aforementioned method of the present disclosure.
  • Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner. Among them, the name of the unit does not constitute a limitation of the unit itself under certain circumstances.
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • an image acquisition method for an image acquisition device including:
  • the target object is photographed to obtain an image of the target object.
  • an image acquisition method, wherein judging whether the voice information meets a first preset condition includes:
  • if the voice information includes a keyword, the first preset condition is met; or,
  • if the voice information does not include the keyword, the first preset condition is not met.
  • an image acquisition method includes: acquiring a body image of a user, and determining the position of the target object according to the user's body image.
  • an image acquisition method, wherein determining the position of the target object according to the user's body image includes:
  • if the body image includes the feature points of the target limb, the position of the target object is determined according to the positions of the feature points of the target limb; or,
  • if the body image does not include the feature points of the target limb, the user's body image is acquired again.
  • an image acquisition method, wherein determining the position of the target object according to the position of the feature point of the target limb comprises: taking the feature point of the target limb as the center and a preset distance as the radius to determine the target range; locating the target object within the target range to determine the position of the target object; or finding and locating the target object near the feature points of the preset object.
  • an image acquisition method is provided, and the target limb includes at least one of a hand and an arm.
  • an image acquisition method wherein an image of a target object is obtained by photographing a target object, including:
  • an image acquisition method which adjusts the angle of view during shooting, so that the target object is located in the middle of the captured image
  • an image acquisition method which adjusts the focal length during shooting, including:
  • the display screen is used to display the captured image.
  • an image acquisition method further comprising:
  • an image acquisition method further comprising:
  • the first state is the state of the image acquisition device before the image of the target object is obtained by photographing the target object.
  • an image acquisition method for an image acquisition device including:
  • the target object is photographed to obtain an image of the target object.
  • an image acquisition method wherein the image of the preset object includes: an image of a user's body or an image of a preset item.
  • an image acquisition method, wherein after acquiring an image of a preset object, and before determining the position of the target object according to the image of the preset object, the method further includes: determining whether the image of the preset object meets the third preset condition; if the image of the preset object meets the third preset condition, the position of the target object is determined according to the image of the preset object.
  • an image acquisition method wherein the image of the preset object includes: a body image of a user; and determining whether the body image meets a third preset condition includes:
  • an image acquisition method, wherein the image of the preset object includes: an image of a preset item; determining whether the image of the preset object meets a third preset condition includes: judging whether the preset item in the image of the preset item is held and points to any object; if so, the third preset condition is met, or, if not, the third preset condition is not met.
  • an image acquisition method, wherein the target limb having the target action includes at least one of: a hand pointing at an object, a hand holding the object, a hand supporting the object, and an eye looking at the object.
  • an image acquisition method, wherein determining the position of the target object according to the image of the preset object includes: acquiring the position of the feature point of the preset object in the image of the preset object; determining the position of the target object according to the position of the feature point of the preset object.
  • an image acquisition method, wherein determining the position of the target object according to the positions of the feature points of the preset object includes: taking the feature points of the preset object as the center and the preset distance as the radius to determine the target range; locating the target object within the target range to determine the position of the target object; or finding and locating the target object in the vicinity of the feature points of the preset object.
  • an image acquisition method is provided, and the target limb includes at least one of a hand and an arm.
  • an image acquisition method is provided, wherein an image of a target object is obtained by photographing a target object, including:
  • an image acquisition method which adjusts the angle of view during shooting, so that the target object is located in the middle of the captured image
  • an image acquisition method which adjusts the focal length during shooting, including:
  • the display screen is used to display the captured image.
  • an image acquisition method further comprising:
  • an image acquisition method further comprising:
  • the second state is the state of the image acquisition device before the image of the target object is obtained by photographing the target object.
  • an image acquisition device comprising:
  • an identification unit for judging whether the voice information meets the first preset condition
  • a positioning unit configured to determine the position of the target object if the voice information meets the first preset condition
  • the photographing unit is used for photographing the target object according to the position of the target object to obtain an image of the target object.
  • an image acquisition device comprising:
  • the acquisition module is used to acquire the image of the preset object
  • a positioning module for determining the position of the target object according to the image of the preset object
  • the photographing module is configured to photograph the target object according to the position of the target object to obtain an image of the target object.
  • a terminal including: at least one memory and at least one processor;
  • the at least one memory is used for storing program codes
  • the at least one processor is used for calling the program codes stored in the at least one memory to execute any one of the methods described above.
  • a storage medium for storing a program code for executing the above-described method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Studio Devices (AREA)

Abstract

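The present disclosure provides an image acquisition method, apparatus, terminal, and storage medium. Some embodiments of the present disclosure provide an image acquisition method for an image acquisition apparatus, comprising: acquiring voice information; determining whether the voice information meets a first preset condition; if the voice information meets the first preset condition, determining the position of a target object; and photographing the target object according to its position to obtain an image of the target object. When the target object needs to be shown to other people, the method lets them view it without the user manually operating the image acquisition apparatus, which frees the user's hands and improves convenience.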

Description

图像采集方法、装置、终端和存储介质
相关申请的交叉引用
本申请基于申请号为202011102914.0、申请日为2020年10月15日,名称为“图像采集方法、装置、终端和存储介质”的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此引入本申请作为参考。
技术领域
本公开涉及图像采集技术领域,尤其涉及一种图像采集方法、装置、终端和存储介质。
背景技术
目前,经常需要使用诸如摄像机等图像采集装置,用于拍摄和传输视频图像。实际的拍摄应用场景丰富多样,目前的图像采集装置不能很好地适应特定场景的需求,存在操作效率较低、或拍摄效果较差等问题。
发明内容
本公开提供一种图像采集方法、装置、终端和存储介质。
本公开采用以下的技术方案。
在一些实施例中,本公开提供一种图像采集方法,用于图像采集装置,包括:
获取语音信息;
判断语音信息是否符合第一预设条件;
若语音信息符合第一预设条件,确定目标对象的位置;
根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
在一些实施例中,本公开提供一种图像采集方法,用于图像采集装置,包括:
获取预设对象的图像;
根据预设对象的图像确定目标对象的位置;
根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
在一些实施例中,本公开提供一种图像采集装置,包括:
语音单元,用于获取语音信息;
识别单元,用于判断语音信息是否符合第一预设条件;
定位单元,用于若语音信息符合第一预设条件,确定目标对象的位置;
拍摄单元,用于根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
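In some embodiments, the present disclosure provides an image acquisition apparatus, comprising:
an acquisition module, configured to acquire an image of a preset object;
a positioning module, configured to determine the position of a target object according to the image of the preset object;
a photographing module, configured to photograph the target object according to its position to obtain an image of the target object.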
在一些实施例中,本公开提供一种终端,包括:至少一个存储器和至少一个处理器;
其中,存储器用于存储程序代码,处理器用于调用所述存储器所存储的程序代码执行上述的方法。
在一些实施例中,本公开提供一种存储介质,所述存储介质用于存储程序代码,所述程序代码用于执行上述的方法。
本公开一些实施例提供的图像采集方法,在获取的语音信息符合第一预设条件时,能够定位目标对象的位置,并对目标对象进行拍摄得到目标对象的图像,在需要向其他人展示目标对象时,无需用户手动操作图像采集装置就可以方便其他人查看到目标对象,从而解放用户双手,提高便利性。在本公开的另一些实施例中,获取预设对象的图像,根据预设对象的图像确定目标对象的位置,根据目标对象的位置对目标对象进行拍摄,在需要对目标对象进行展示时,可以很自然的通过控制预设对象从而对目标对象进行拍摄,整个过程可以无需停顿,无需手动操作图像采集装置从而提高了便利性,又提高了展示过程的流畅性。
附图说明
结合附图并参考以下具体实施方式,本公开各实施例的上述和其他特征、优点及方面将变得更加明显。贯穿附图中,相同或相似的附图标记表示相同或相似的元素。应当理解附图是示意性的,元件和元素不一定按照比例绘制。
图1是本公开实施例的图像采集方法100的流程图。
图2是本公开实施例的图像采集方法200的流程图。
图3是本公开实施例的图像采集方法300的示意图。
图4是本公开实施例的一种图像采集装置的组成图。
图5是本公开实施例的另一种图像采集装置的组成图。
图6是本公开实施例的电子设备的结构示意图。
具体实施方式
下面将参照附图更详细地描述本公开的实施例。虽然附图中显示了本公开的某些实施例,然而应当理解的是,本公开可以通过各种形式来实现,而且不应该被解释为限于这里阐述的实施例,相反提供这些实施例是为了更加透彻和完整地理解本公开。应当理解的是,本公开的附图及实施例仅用于示例性作用,并非用于限制本公开的保护范围。
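It should be understood that the steps described in the method embodiments of the present disclosure may be performed in a different order and/or in parallel. In addition, method embodiments may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this respect.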
本文使用的术语“包括”及其变形是开放性包括,即“包括但不限于”。术语“基于”是“至少部分地基于”。术语“一个实施例”表示“至少一个实施例”;术语“另一实施例”表示“至少一个另外的实施例”;术语“一些实施例”表示“至少一些实施例”。其他术语的相关定义将在下文描述中给出。
需要注意,本公开中提及的“第一”、“第二”等概念仅用于对不同的装置、模块或单元进行区分,并非用于限定这些装置、模块或单元所执行的功能的顺序或者相互依存关系。
需要注意,本公开中提及的“一个”的修饰是示意性而非限制性的,本领域技术人员应当理解,除非在上下文另有明确指出,否则应该理解为“一个或多个”。
本公开实施方式中的多个装置之间所交互的消息或者信息的名称仅用于说明性的目的,而并不是用于对这些消息或信息的范围进行限制。
在工作和生活期间,有时需要使用图像采集装置进行图像采集,例如在开视频会议或者是在进行直播的时候,都需要使用图像采集装置进行图像采集,以进行视频会议或者直播为例,有时会有需要展示的展示物,某些情况下,还需要对需要展示的展示物进行特写以显示细节。在相关技术中,通常需要人工操作才能调节摄像头来适应不同的拍摄场景,例如拍摄目标对象、或对需要展示的物品进行特写,即需要用户自己手动操作,造成用户的双手被占用,当进行视频会议或者直播时,非常不便。
为了至少部分解决上述问题,本公开一些实施例中提出一种图像采集方法,可以用于图像采集装置,图像采集装置例如可以是具有变焦摄像头的图像采集装置。以下将结合附图,对本公开实施例提供的方案进行详细描述。
如图1所示,图1是本公开实施例的图像采集方法100的流程图,在本实施例中,图像采集方法100包括如下步骤S101-S104。
S101:获取语音信息。
在一些实施例中,图像采集装置上可以设置有语音采集装置,例如麦克风,从而获取语音信息,在另一些实施例中,图像采集装置可以是通过网络从其他装置获取语音信息,例如语音采集装置与图像采集装置通信连接,通过语音采集装置采集语音信息然后传输给图像采集装置。
S102:判断语音信息是否符合第一预设条件。
在一些实施例中,第一预设条件例如可以是语音信息中包括特定的词,也可以是对语音信息的口音进行识别,识别出的语音为特定用户的口音,在本实施例中对于第一预设条件不做具体限定。
S103:若语音信息符合第一预设条件,确定目标对象的位置。
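In some embodiments, once the voice information meets the first preset condition, the position of the target object is obtained. The target object may be, for example, an item or a person to be shown or shot in close-up, such as a product being introduced in a live stream, a user's face after applying a particular beauty product, or a sample to be shown in a video conference. The position of the target object may be expressed as coordinates. In some embodiments, if the voice information does not meet the first preset condition, steps S101 and S102 are repeated until the acquired voice information meets the first preset condition.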
S104:根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
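In some embodiments, after the position of the target object is obtained, the image acquisition apparatus automatically adjusts the camera to capture an image of the target object, for example by focusing and zooming in on the target object or shooting it in close-up, so that the target object is captured clearly and is easy to view. In other embodiments, the focal length of the image acquisition apparatus before shooting the target object already suits the target object, in which case the shooting angle of view can be adjusted directly to shoot the target object. In still other embodiments, the apparatus may be shooting a close-up of another, smaller item before shooting the target object; because the target object is larger than that item, the focal length needs to be adjusted to reduce the magnification and widen the current field of view. In the embodiments of the present disclosure, no manual operation by the user is required when the target object needs to be shot, which frees the user's hands and improves convenience. In some embodiments, the proposed image acquisition method further includes: sending the captured image to a target terminal for playback. The target terminal may be, for example, a terminal that is communicatively connected to the image acquisition apparatus and used to view the captured image. When the method is used for a remote video conference, the target terminal may be a terminal on which a conference participant views the captured image; when the method is used for live streaming, the target terminal may be, for example, a terminal used by a viewer of the live stream.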
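The following takes the use of method 100 in a live-stream selling scenario as an example to describe one embodiment of the present disclosure. During live-stream selling, the host films themselves with the image acquisition apparatus and introduces products. To give viewers a clearer view of a product, the camera is often adjusted to shoot the product. In the related art, the user has to adjust the camera manually, which is inconvenient. With the method proposed in some embodiments of the present disclosure, when the host needs to shoot a product, the host utters voice information; the image acquisition apparatus acquires the voice information and determines whether it meets the first preset condition; if it does, the apparatus obtains the position of the product and automatically adjusts the camera to capture an image of the product. The host therefore does not need to adjust the image acquisition apparatus manually during the live stream, which frees the host's hands and makes it easier to introduce the product.
As can be seen from the above, with the image acquisition method proposed in some embodiments of the present disclosure, the user can utter voice information to make the image acquisition apparatus determine the position of the target object and shoot it; when the target object needs to be shown to other people, they can view it without the user manually operating the image acquisition apparatus.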
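The flow of steps S101 to S104 can be sketched roughly as follows. This is only an illustrative sketch, not the implementation of the disclosure: the function name `acquire_image_by_voice`, the callback parameters, the `CaptureResult` type, and the `max_attempts` limit are all assumptions introduced here (the description itself simply repeats S101 and S102 until the condition is met).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Position = Tuple[float, float]  # target position expressed as coordinates


@dataclass
class CaptureResult:
    position: Position
    image: str  # placeholder standing in for real image data


def acquire_image_by_voice(
    get_voice: Callable[[], str],
    meets_first_condition: Callable[[str], bool],
    locate_target: Callable[[], Optional[Position]],
    shoot: Callable[[Position], str],
    max_attempts: int = 10,  # hypothetical safety limit, not in the source
) -> Optional[CaptureResult]:
    """Run S101-S104, repeating S101/S102 until the condition is met."""
    for _ in range(max_attempts):
        voice = get_voice()                   # S101: acquire voice information
        if not meets_first_condition(voice):  # S102: check first preset condition
            continue
        position = locate_target()            # S103: determine target position
        if position is None:
            continue
        return CaptureResult(position, shoot(position))  # S104: shoot target
    return None
```

Passing the four steps in as callbacks keeps the sketch independent of any particular microphone, detector, or camera API.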
在本公开的一些实施例中,步骤S102判断语音信息是否符合第一预设条件,包括:判断语音信息中是否包括预设的关键词;若语音信息中包括关键词,则符合第一预设条件;或者,若语音信息中不包括关键词,则不符合第一预设条件。在本实施例中,预先设置了关键词,当用户想要拍摄目标对象的图像时,可以发出语音信息说出关键词,从而拍摄目标对象的图像,本实施例中的关键词可以由用户设定,例如可以为“实物展示”、“看这里”、“看左侧”、“看右侧”等词语。
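A minimal sketch of the keyword check in step S102 is shown below. It assumes the voice information has already been transcribed to text by some speech recognizer, which the description does not specify; the function name `meets_first_preset_condition` is hypothetical, while the default keywords are the examples given in the paragraph above.

```python
from typing import Iterable

# Example keywords taken from the description; in practice the user may set them.
DEFAULT_KEYWORDS = ("实物展示", "看这里", "看左侧", "看右侧")


def meets_first_preset_condition(
    transcript: str, keywords: Iterable[str] = DEFAULT_KEYWORDS
) -> bool:
    """Return True if the transcribed voice information contains any preset keyword."""
    return any(keyword in transcript for keyword in keywords)
```

A plain substring match is the simplest possible choice here; a real system might instead match on recognizer tokens or confidence scores.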
在本公开的一些实施例中,步骤S103中的确定目标对象的位置,包括:获取用户的身体图像,根据用户的身体图像确定目标对象的位置。在本实施例中,用户的身体图像可以是用户的部分身体图像也可以是用户的全身身体图像,在用户想要介绍目标对象时,用户的身体往往会做出相应的动作,这些动作会标识目标对象的位置,因此可以根据用户的身体图像确定目标对象的位置,例如用户通常会用手指向目标对象,或者用户的双眼会看向目标对象,此时可以根据用户的手指的指向或者用户的视线方向确定到目标对象的位置。
在本公开的一些实施例中,根据用户的身体图像确定目标对象的位置,包括:判断身体图像中是否包括目标肢体的特征点;若身体图像中包括目标肢体的特征点,则根据目标肢体的特征点的位置确定目标对象的位置;或者,若身体图像中不包括目标肢体的特征点,则重新获取用户的身体图像。在本实施例中,可以先判断身体图像中是否包括目标肢体,在包括目标肢体的情况下确定目标肢体的特征点,本实施例中预先设定了目标肢体,目标肢体可以是与目标对象相关的肢体,例如可以是操作目标对象的肢体,例如可以设定目标肢体包括:手和胳膊中的至少一个。在通常情况下,用户在需要对目标对象进行展示的时候,会用手指向或者用手托起目标对象,因此可以通过目标肢体的特征点的位置确定目标对象的位置。
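In some embodiments, determining the position of the target object according to the position of the feature point of the target limb may include, for example: determining a target range with the feature point of the target limb as the center and a preset distance as the radius, locating the target object within the target range, and then determining the position of the target object. In this embodiment, because the user usually uses the target limb (for example, a hand) to point at or hold the target object, the target object is usually located near the feature point of the target limb, so the target object can be searched for and located near that feature point. This speeds up determination of the target object's position and saves computing resources.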
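The radius-based search can be illustrated with the sketch below. It assumes that candidate object positions have already been produced by some object detector and that all points share one 2D image coordinate system; the function name `locate_target_near` and the choice of returning the closest candidate are assumptions, since the description only says the target is located within the target range.

```python
import math
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float]


def locate_target_near(
    feature_point: Point, candidates: Iterable[Point], preset_distance: float
) -> Optional[Point]:
    """Search only inside the circle centered on the limb feature point."""
    in_range = [
        c for c in candidates if math.dist(feature_point, c) <= preset_distance
    ]
    if not in_range:
        return None  # nothing inside the target range
    # Assumption: when several candidates qualify, prefer the closest one.
    return min(in_range, key=lambda c: math.dist(feature_point, c))
```

Restricting the search to the target range is what saves computation: candidates far from the hand are never considered.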
在本公开的一些实施例中,对目标对象进行拍摄得到目标对象的图像,包括:调节拍摄时的视角,和/或,调节拍摄时的焦距从而对目标对象进行拍摄。在本实施例中,在拍摄目标对象之前目标对象可能并不位于当前的视野内,并且采用的焦距可能并不合适,因此在对目标对象进行图像拍摄时需要调节拍摄的视角和/或拍摄的焦距,从而改善拍摄时的拍摄效果。在一些实施例中,可以预先为图像采集配置通信连接的控制器,可以响应于控制器发出的控制信息,根据控制信息调节拍摄时的视角和/或焦距,在一些实施例中,由于对目标对象进行拍摄时视野的大部分区域被目标对象占据,因此此时用户通过控制器控制视角和/或焦距也不会被图像采集装置拍摄到,方便用户选取合适的视角和/或焦距。
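In some embodiments of the present disclosure, the angle of view during shooting is adjusted so that the target object is located in the middle of the captured image. In some embodiments, the purpose of shooting the target object is to show it, so adjusting the angle of view allows the target object to be shown more clearly. For example, the angle of view may be adjusted so that the coordinates of the target object lie at the center of the captured picture, which places the target object in the middle of the captured image: a target angle of view is first calculated with the coordinates of the target object as the center, and the angle of view of the image acquisition apparatus is then adjusted to the target angle of view. In some embodiments of the present disclosure, the focal length during shooting is adjusted to increase the magnification. In some embodiments, the details of the target object need to be shown so that other people can see its details and a close view of it; the focal length is therefore adjusted to increase the magnification and enlarge the image of the target object. Increasing the magnification here means that the magnification of the image acquisition apparatus when shooting the target object is greater than its magnification before shooting the target object. For example, if the magnification is 1 when the voice information is acquired, the magnification when shooting the target object should be greater than 1. Increasing the magnification enlarges the image of the target object, so that its details can be captured in a close-up. Taking a video conference as an example, the camera captures images of the participants and transmits them to other, remote participants. The image acquisition apparatus is currently shooting the participants at 1x magnification. When a participant needs to show the details of an exhibit, the participant utters voice information to make the image acquisition apparatus increase the magnification to 3x and shoot a close-up of the exhibit, so that the remote participants can see its details. No manual operation by the participant is needed, which frees the participant's hands and improves convenience.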
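One way to compute a target angle of view that centers the target object is sketched below. The description does not say how the target angle is calculated; this sketch assumes an ideal pinhole camera with known horizontal and vertical fields of view, and the function name `centering_angles` and its parameters are hypothetical.

```python
import math
from typing import Tuple


def centering_angles(
    target_xy: Tuple[float, float],
    image_size: Tuple[float, float],
    hfov_deg: float,
    vfov_deg: float,
) -> Tuple[float, float]:
    """Return (pan, tilt) in degrees that would move the target to the image center.

    Positive pan means the target is right of center; positive tilt means it is
    below center (image y grows downward). Pinhole model is an assumption.
    """
    (x, y), (w, h) = target_xy, image_size
    nx = (x - w / 2) / (w / 2)  # normalized horizontal offset in [-1, 1]
    ny = (y - h / 2) / (h / 2)  # normalized vertical offset in [-1, 1]
    pan = math.degrees(math.atan(nx * math.tan(math.radians(hfov_deg) / 2)))
    tilt = math.degrees(math.atan(ny * math.tan(math.radians(vfov_deg) / 2)))
    return pan, tilt
```

A target already at the image center yields zero pan and tilt, and a target on the right edge yields a pan of half the horizontal field of view.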
在本公开的一些实施例中,调节拍摄时的焦距,包括:根据显示屏的尺寸调节拍摄时的焦距;其中,显示屏用于显示拍摄的图像。在本实施例中,图像采集装置拍摄时的焦距与用于显示的显示屏的尺寸相关,例如可以设置调节拍摄时的焦距,从而使得被拍摄的目标对象的图像在显示屏上的尺寸不小于目标尺寸,和/或,使得被拍摄的目标对象的图像在显示屏上的面积与显示屏的面积比例不小于目标比例。例如通过调节焦距让被拍摄的目标对象在显示屏横向和纵向上的尺寸不小于10cm,并且,设置被拍摄的目标对象的图像的面积占显示屏的面积比例不小于75%,这样当显示屏尺寸较小时,会自动调节焦距让拍摄的目标对象的图像足够大,而当显示屏的尺寸较大时,可以自动随显示屏的面积调节焦距,也不会导致拍摄的目标对象的图像过小。
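The display-dependent zoom rule can be illustrated as follows. This is a sketch under stated assumptions: the target's current on-screen width and height are known in centimeters, scaling the magnification by a factor k scales on-screen lengths by k and on-screen area by k squared, and the apparatus never zooms out below the current magnification. The function name `required_zoom_factor` is hypothetical; the 10 cm and 75% defaults are the example values from the paragraph above. The sketch does not check that the enlarged target still fits on the display.

```python
import math


def required_zoom_factor(
    target_w_cm: float,
    target_h_cm: float,
    display_w_cm: float,
    display_h_cm: float,
    min_size_cm: float = 10.0,
    min_area_ratio: float = 0.75,
) -> float:
    """Smallest magnification multiplier meeting both the size and area targets."""
    # Lengths scale linearly with k: need k * width >= min and k * height >= min.
    size_factor = max(min_size_cm / target_w_cm, min_size_cm / target_h_cm)
    # Area scales with k**2: need k**2 * current_ratio >= min_area_ratio.
    current_ratio = (target_w_cm * target_h_cm) / (display_w_cm * display_h_cm)
    area_factor = math.sqrt(min_area_ratio / current_ratio)
    return max(1.0, size_factor, area_factor)
```

On a small display the 10 cm size constraint tends to dominate, while on a large display the area-ratio constraint dominates, which matches the behavior described in the paragraph.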
在本公开的一些实施例中,获取语音指令,根据所述语音指令调节拍摄时的视角和/或焦距。在一些实施例中,对目标对象进行拍摄时的视角和/或焦距可以使用语音的方式进行控制,以进一步调节拍摄的视角和/或焦距,语音指令可以是包括在语音信息中的,用户可以发出语音信息,并在语音信息中夹杂语音指令。
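In some embodiments of the present disclosure, after step S104 the method further includes: acquiring voice information again; determining whether the newly acquired voice information meets a second preset condition; and, if it does, adjusting the image acquisition apparatus to a first state, where the first state is the state of the image acquisition apparatus before the target object was shot to obtain the image of the target object. In this embodiment, once the target object has been shot, further shooting of it may no longer be needed; the user can then utter voice information to return the image acquisition apparatus to the first state it was in before step S104. The second preset condition may be, for example, that the newly acquired voice information includes a preset target word. When the target word is recognized, the apparatus exits the state of shooting the target object and returns to the first state before step S104. For example, the angle of view and focal length of the image acquisition apparatus before step S104 may be recorded, and the apparatus is adjusted back to the recorded angle of view and focal length. In other embodiments, the user being shot and the focal length used before step S104 may be recorded, and when the newly acquired voice information meets the second preset condition, the image acquisition apparatus is controlled to shoot the recorded user again at the recorded focal length.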
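Recording the first state and restoring it can be sketched as below. The class names `CameraState` and `Camera`, the representation of the state as a (pan, tilt) pair plus a magnification, and the target word "stop close-up" are all hypothetical; the description only requires that the angle of view and focal length before step S104 be recorded and later restored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CameraState:
    view_angle: Tuple[float, float]  # (pan, tilt) in degrees, assumed representation
    magnification: float


class Camera:
    def __init__(self, state: CameraState) -> None:
        self.state = state
        self._first_state: Optional[CameraState] = None

    def start_close_up(self, close_up: CameraState) -> None:
        """Record the state before shooting the target, then switch to the close-up."""
        self._first_state = self.state
        self.state = close_up

    def handle_voice(self, transcript: str, target_word: str = "stop close-up") -> bool:
        """Restore the first state if the second preset condition is met."""
        if self._first_state is None or target_word not in transcript:
            return False
        self.state = self._first_state
        self._first_state = None
        return True
```

Keeping the recorded state inside the camera object means the restore step needs no information beyond the new voice input.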
在本公开的一些实施例中,如图2所示,提出另一种图像采集方法200,图像采集方法200用于图像采集装置,包括步骤S201-步骤S203,具体如下:
S201:获取预设对象的图像。
在一些实施例中,预设对象可以为预设的物品也可以为用户的部分或全部身体,因此预设对象的图像可以是预设物品的图像或预设用户的身体图像,对此不作限定。
S202:根据预设对象的图像确定目标对象的位置。
在一些实施例中,在获取了预设对象的图像后,基于预设对象的图像定位目标对象,可以用坐标表示目标对象的位置,目标对象例如可以为需要进行展示或特写的物体,例如在直播中进行介绍的商品,或者是在视频会议中需要展示的样品等。
S203:根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
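In some embodiments, after the position of the target object is obtained, the image acquisition apparatus automatically adjusts the camera to capture an image of the target object, for example by focusing and zooming in on the target object or shooting it in close-up, so that the target object is captured clearly and is easy to view. In other embodiments, the focal length of the image acquisition apparatus before shooting the target object already suits the target object, in which case the shooting angle of view of the image acquisition apparatus can be adjusted directly to shoot the target object. In still other embodiments, the apparatus may be shooting a close-up of another, smaller item before shooting the target object; because the target object is larger than that item, the focal length needs to be adjusted to reduce the magnification and widen the current field of view. In the embodiments of the present disclosure, the user does not need to operate the image acquisition apparatus manually when the target object needs to be shot, so in scenarios such as live streaming or video conferencing there is no need to pause to control the apparatus, which makes the process smoother and more convenient. In some embodiments, the proposed image acquisition method further includes: sending the captured image to a target terminal for playback. The target terminal may be, for example, a terminal that is communicatively connected to the image acquisition apparatus. When the method is used for a remote video conference, the target terminal may be a terminal of a conference participant; when the method is used for live streaming, the target terminal may be, for example, a terminal of a viewer of the live stream.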
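The following takes the use of method 200 in a video conference scenario as an example to describe one embodiment of the present disclosure. During a remote video conference, the image acquisition apparatus shoots the main venue, and participants at branch venues take part in the conference through the captured images. When a participant at the main venue needs to introduce an exhibit, the camera is often adjusted to shoot the exhibit so that the participants at the branch venues get a clearer view of it. In the related art, a participant at the main venue has to adjust the camera manually, which is inconvenient. With the method proposed in the present disclosure, taking the user's body as the preset object, when a participant at the main venue needs to shoot an exhibit, the participant can make a certain body movement; the image acquisition apparatus acquires the user's body image, obtains the position of the exhibit from it, and automatically adjusts the camera to capture an image of the exhibit. There is thus no need to adjust the image acquisition apparatus manually during the conference, which makes it easier to introduce the exhibit.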
在本公开的一些实施例中,预设对象图像包括:用户的身体图像或者预设物品的图像。在一些实施例中,用户的身体图像可以是用户的全身图像,也可以是用户的部分身体的图像,其中用户的个数可以是一个也可以是多个,即可以不限定用户的数量,可以采集多个用户的身体图像。在一些实施例中,预设物品的图像例如可以是教杆、演示杆等物品的图像。
在本公开的一些实施例中,获取预设对象的图像之后,根据所述预设对象的图像确定目标对象的位置之前,还包括:确定所述预设对象的图像是否符合第三预设条件;若所述预设对象的图像符合第三预设条件,根据所述预设对象的图像确定目标对象的位置。一些实施例中,如果预设对象的图像不符合第三预设条件,则重复执行获取预设对象的图像并确定预设对象的图像是否符合第三预设条件,直到获取的身体图像符合第三预设条件。在一些实施例中,第三预设条件例如可以是用户做出了预定的动作,通过设定第三预设条件,可以在需要对目标对象进行拍摄时才进行拍摄,方便用户自主控制何时对目标对象进行拍摄。
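In some embodiments of the present disclosure, the image of the preset object includes the user's body image, and determining whether the image of the preset object meets the third preset condition includes: determining whether the body image includes a target limb performing a target action; if so, the third preset condition is met; otherwise, the third preset condition is not met. In this embodiment, a target limb is specified in advance (the target limb may include, for example, at least one of a hand and an arm), and a target action is also specified in advance. The third preset condition is met when the target limb is detected and its action is the target action. If the body image does not include the target limb, or the action of the target limb is not the target action, the third preset condition is not met. In practice, when a user needs to show the target object, the user tends to make certain body movements, such as pointing at the target object with a finger, holding it up on the palm, or looking toward it. These movements all suggest that the user wants to show the target object. In some embodiments, therefore, the target limb performing the target action includes at least one of: a hand pointing at an object, a hand holding up an object, a hand gripping an object, and eyes looking at an object. These actions may be set as target actions, and the limbs performing them as target limbs. Because the user performs these actions naturally when showing the target object, no extra action is required, and the whole process is natural and smooth rather than abrupt. In this embodiment, the feature points of the target limb may be monitored in real time to determine the state of the target limb and thus whether to shoot the target object.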
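In some embodiments, the image of the preset object includes an image of a preset item, and determining whether the image of the preset object meets the third preset condition includes: determining whether the preset item in the image is being held and is pointing at an object; if so, the third preset condition is met; otherwise, the third preset condition is not met. In some embodiments, the preset item may be an item such as a presentation pointer, which the user may use to point at the target object. When using the preset item, the user holds it and points it at the target object. Detecting that the item is held and pointing at some object therefore indicates that the user is about to show the object being pointed at, and the third preset condition is met. If the preset item is not held, the user is not using it; if it is held but not pointing at any object, the user may simply be holding it without using it. In some embodiments, the preset item pointing at an object may mean that there is an object within a distance threshold of a preset feature point of the preset item.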
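The check for a preset item that is held and pointing at an object can be sketched as follows. It assumes that an upstream detector already reports whether the item is held, the position of the item's preset feature point (for example the tip of a pointer), and the positions of nearby objects; the function name `meets_third_preset_condition` and its parameters are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def meets_third_preset_condition(
    is_held: bool,
    item_feature_point: Point,
    object_points: Iterable[Point],
    distance_threshold: float,
) -> bool:
    """True if the preset item is held and some object lies within the threshold."""
    if not is_held:
        return False  # the user is not using the preset item
    # "Pointing at an object" is taken to mean an object near the feature point.
    return any(
        math.dist(item_feature_point, p) <= distance_threshold
        for p in object_points
    )
```

The two-part test mirrors the description: being held is necessary, and pointing is reduced to a distance test around the item's preset feature point.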
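In some embodiments, determining the position of the target object according to the image of the preset object includes: obtaining the position of a feature point of the preset object in the image of the preset object; and determining the position of the target object according to the position of the feature point of the preset object. In some embodiments, the preset object and the target object are usually close to each other, so the position of the target object can be determined from the position of a feature point on the preset object. For example, in some embodiments the image of the preset object includes the user's body image, and determining the position of the target object includes: determining the position of the target object according to the body image. In some embodiments, when the user wants to introduce the target object, the user's body tends to make corresponding movements that indicate the position of the target object, so that position can be determined from the user's body image. For example, the user usually points at the target object or looks toward it, and the position of the target object can then be determined from the direction in which the user's finger points or the direction of the user's gaze. In some embodiments, determining the position of the target object according to the body image includes: obtaining the position of a feature point of the target limb in the body image; and determining the position of the target object according to the position of the feature point of the target limb. In this embodiment, the target limb is set in advance. It may be a limb related to the target object, for example a limb that handles the target object; the target limb may be set to include at least one of a hand and an arm. Usually, when the user needs to show the target object, the user points at it or holds it up with a hand, so the position of the target object can be determined from the position of the feature point of the target limb.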
在一些实施例中,根据预设对象的特征点的位置确定目标对象的位置,例如可以包括:以预设对象的特征点为圆心,以预设距离为半径定位目标范围,在目标范围内定位目标对象的位置。在一些实施例中,预设对象可以为用户的目标肢体,考虑到用户通常采用目标肢体(例如手)来指向或承托目标对象,因此目标对象的位置通常位于目标肢体的特征点附近,因此可以在目标肢体的特征点的附近寻找并定位目标对象,一些实施例中预设对象为预设物品,此时在预设物品的特征点附近定位目标对象。
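In some embodiments, photographing the target object to obtain the image of the target object includes: adjusting the angle of view during shooting and/or adjusting the focal length during shooting, so as to shoot the target object. In this embodiment, before the target object is shot it may not be within the current field of view, and the focal length in use may not be suitable, so the angle of view and/or the focal length need to be adjusted when shooting the target object to improve the shooting result.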
在一些实施例中,调节拍摄时的视角,以使目标对象位于拍摄的图像的中部;在一些实施例中,对目标对象进行拍摄的目的是为了展示目标对象,因此,调节拍摄时的视角能够更为清楚的展示目标对象。
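In some embodiments, the focal length during shooting is adjusted to increase the magnification. In some embodiments, the details of the target object need to be shown when it is shot, so that other people can see its details and a close view of it; the focal length is therefore adjusted to increase the magnification. Increasing the magnification here means that the magnification of the image acquisition apparatus when shooting the target object is greater than its magnification before shooting the target object. For example, if the magnification is 1 when the voice information is acquired, the magnification when shooting the target object should be greater than 1. Increasing the magnification enlarges the image of the target object, so that its details can be captured in a close-up.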
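In some embodiments, adjusting the focal length during shooting includes: adjusting the focal length according to the size of a display screen, where the display screen is used to display the captured image. In this embodiment, the focal length used by the image acquisition apparatus is related to the size of the display screen used for display. For example, the focal length may be adjusted so that the size of the target object's image on the display screen is not smaller than a target size, and/or so that the ratio of the area of the target object's image on the display screen to the area of the display screen is not smaller than a target ratio. For example, the focal length may be adjusted so that the target object is at least 10 cm in both the horizontal and vertical directions on the display screen, and so that the image of the target object occupies at least 75% of the area of the display screen. When the display screen is small, the focal length is adjusted automatically so that the image of the target object is large enough; when the display screen is large, the focal length is adjusted automatically with the area of the display screen so that the image of the target object does not become too small.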
在一些实施例中还包括获取语音指令,根据语音指令调节拍摄时的视角和/或焦距。在一些实施例中,对目标对象进行拍摄时的视角和/或焦距可以使用语音的方式进行控制,以进一步调节拍摄的视角和/或焦距,语音指令可以是包括在语音信息中的,用户可以发出语音信息,并在语音信息中夹杂语音指令。
在本公开的一些实施例,还包括:获取语音信息;判断获取的语音信息是否符合第四预设条件;若获取的语音信息符合第四预设条件,将图像采集装置调节至第二状态;其中,第二状态是对目标对象进行拍摄得到目标对象的图像之前图像采集装置的状态。本实施例中的第四预设条件例如可以是获取的语音信息中包括预设目标词,在识别到获取的语音信息中包括目标词时退出对目标对象的拍摄的状态,返回到步骤S203之前的第二状态,例如可以记录步骤S203之前图像采集装置的视角和焦距,将图像采集装置的视角和焦距调整到步骤S203之前记录的视角和焦距,在另一些实施例中,可以记录图像采集装置在步骤S203之前所拍摄的用户和采用的焦距,在获取的语音信息满足第四预设条件时,控制图像采集装置再次以记录的焦距拍摄所记录的用户。
在本公开的一些实施例中,还提出一种图像采集方法300,本实施例中的方法以用于视频会议为例进行说明,启动视频会议系统,图像采集装置拍摄会场,各方加入会议,开启语音检测线程,监听语音信息,在监听到用户发出语音信息时,确定语音信息中是否识别到预设关键词,如果没有识别到预设关键词,则继续监听语音信息,如果识别到预设关键词,则表明用户想要展示目标对象,此时获取用户的身体图像,并识别身体图像中手、骨骼等特征点,判断识别到的特征点中是否包括目标特征点,此处的目标特征点例如可以是手部特征点,如果包括目标特征点,则根据目标特征点的坐标定位展示物坐标,以展示物坐标为中心,以显示屏尺寸为参考,计算出新的视角,以此视角调节图像采集装置的方向并变焦,调节焦距放大展示物的细节,并将细节画面发给远程的参会者,这样远程的参会者就能够观看到展示物的细节之处。在无需对展示物的细节进行展示的时候,则再次发出语音信息,如果再次发出的语音信息中包括预先设定的停止特写命令,则退出对展示物的细节特写,输出原始画面,原始画面例如可以是采用对展示物进行特写前的视角和焦距拍摄的画面。
在本公开的一些实施例中,如图4所示,还提出一种图像采集装置,包括:语音单元401,用于获取语音信息;
识别单元402,用于判断语音信息是否符合第一预设条件;
定位单元403,用于若语音信息符合第一预设条件,确定目标对象的位置;
拍摄单元404,用于根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
在一些实施例中,定位单元403确定目标对象的位置,包括:获取用户的身体图像,根据用户的身体图像确定目标对象的位置。
定位单元403,根据用户的身体图像确定目标对象的位置,包括:判断身体图像中是否包括目标肢体的特征点;若身体图像中包括目标肢体的特征点,则根据目标肢体的特征点的位置确定目标对象的位置;或者,若身体图像中不包括目标肢体的特征点,则重新获取用户的身体图像。
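In some embodiments, the positioning unit 403 determining the position of the target object according to the position of the feature point of the target limb includes: determining a target range with the feature point of the target limb as the center and a preset distance as the radius, and locating the target object within the target range to determine its position; or searching for and locating the target object near the feature point of the target limb.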
在一些实施例中,目标肢体包括:手和胳膊中的至少一个。
在一些实施例中,拍摄单元404对目标对象进行拍摄得到目标对象的图像,包括:调节拍摄时的视角,和/或,调节拍摄时的焦距从而对目标对象进行拍摄。
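In some embodiments, the photographing unit 404 adjusts the angle of view during shooting so that the target object is located in the middle of the captured image. In some embodiments, the photographing unit 404 adjusts the focal length during shooting to increase the magnification.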
在一些实施例中,拍摄单元404调节拍摄时的焦距,包括:根据显示屏的尺寸调节拍摄时的焦距;其中,显示屏用于显示拍摄的图像。
在一些实施例中,语音单元401还用于获取语音指令。拍摄单元404还用于根据语音指令调节拍摄时的视角和/或焦距。
在一些实施例中,语音单元401还用于再次获取语音信息。识别单元402还用于判断再次获取的语音信息是否符合第二预设条件。拍摄单元404还用于若再次获取的语音信息符合第二预设条件,将图像采集装置调节至第一状态,其中,第一状态是对目标对象进行拍摄得到目标对象的图像之前图像采集装置的状态。
在一些实施例中,识别单元402判断语音信息是否符合第一预设条件,包括:判断语音信息中是否包括预设的关键词;若语音信息中包括关键词,则符合第一预设条件;或者,若语音信息中不包括关键词,则不符合第一预设条件。
在本公开的一些实施例中,如图5所示,还提出一种图像采集装置,包括:
获取模块501,用于获取预设对象的图像;
定位模块502,用于根据预设对象的图像确定目标对象的位置;
拍摄模块503,用于根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
在一些实施例中,预设对象的图像包括:用户的身体图像或者预设物品的图像。
在一些实施例中,图像采集装置还包括确定模块,确定模块用于在获取模块501获取预设对象的图像之后,定位模块502根据预设对象的图像确定目标对象的位置之前,确定预设对象的图像是否符合第三预设条件;定位模块502用于若预设对象的图像符合第三预设条件,根据预设对象的图像确定目标对象的位置。
在一些实施例中,预设对象的图像包括:用户的身体图像。确定模块确定预设对象的图像是否符合第三预设条件,包括:判断身体图像中是否包括:具有目标动作的目标肢体;若是,则符合第三预设条件;或者,若否,则不符合第三预设条件;
在一些实施例中,预设对象的图像包括:预设物品的图像;确定模块确定预设对象的图像是否符合第三预设条件,包括:判断预设物品的图像中预设物品是否被握持且指向任一物体;若是,则满足第三预设条件,或者,若否,则不满足第三预设条件。
在一些实施例中,具有目标动作的目标肢体,包括:指向物体的手、托起物体的手、握住物体的手和看向物体的眼睛中的至少一个。
在一些实施例中,定位模块502根据预设对象的图像确定目标对象的位置,包括:获取预设对象的图像中预设对象的特征点的位置;根据预设对象的特征点的位置确定目标对象的位置。
在一些实施例中,定位模块502根据预设对象的特征点的位置确定目标对象的位置,包括:以预设对象的特征点为中心,以预设距离为半径确定目标范围;在目标范围内定位目标对象以确定目标对象的位置;或者,在预设对象的特征点的附近寻找并定位目标对象。
在一些实施例中,目标肢体包括:手和胳膊中的至少一个。
在一些实施例中,拍摄模块503对目标对象进行拍摄得到目标对象的图像,包括:调节拍摄时的视角,和/或,调节拍摄时的焦距从而对目标对象进行拍摄。
在一些实施例中,拍摄模块503调节拍摄时的视角,以使目标对象位于拍摄的图像的中部;和/或,调节拍摄时的焦距,以提高拍摄时的放大倍数。
在一些实施例中,拍摄模块503调节拍摄时的焦距,包括:根据显示屏的尺寸调节拍摄时的焦距;其中,显示屏用于显示拍摄的图像。
在一些实施例中,还包括语音模块,用于获取语音指令。定位模块502还用于在语音指令满足预设条件的情况下,根据预设对象的图像确定目标对象的位置,或者根据语音指令调节拍摄时的视角和/或焦距。
在一些实施例中,语音模块还用于获取语音信息。确定模块还用于判断获取的语音信息是否符合第四预设条件。拍摄模块503还用于若获取的语音信息符合第四预设条件,将图像采集装置调节至第二状态;其中,第二状态是对目标对象进行拍摄得到目标对象的图像之前图像采集装置的状态。
对于装置的实施例而言,由于其基本对应于方法实施例,所以相关之处参见方法实施例的部分说明即可。以上所描述的装置实施例仅仅是示意性的,其中所述作为分离模块说明的模块可以是或者也可以不是分开的。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性劳动的情况下,即可以理解并实施。
以上,基于实施例和应用例说明了本公开的方法及装置。此外,本公开还提供一种终端及存储介质,以下说明这些终端和存储介质。
下面参考图6,其示出了适于用来实现本公开实施例的电子设备(例如终端设备或服务器)800的结构示意图。本公开实施例中的终端设备可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、PDA(个人数字助理)、PAD(平板电脑)、PMP(便携式多媒体播放器)、车载终端(例如车载导航终端)等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。图中示出的电子设备仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。
电子设备800可以包括处理装置(例如中央处理器、图形处理器等)801,其可以根据存储在只读存储器(ROM)802中的程序或者从存储装置808加载到随机访问存储器(RAM)803中的程序而执行各种适当的动作和处理。在RAM803中,还存储有电子设备800操作所需的各种程序和数据。处理装置801、ROM 802以及RAM 803通过总线804彼此相连。输入/输出(I/O)接口805也连接至总线804。
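Generally, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 807 including, for example, a liquid crystal display (LCD), speaker, and vibrator; storage devices 808 including, for example, a magnetic tape and hard disk; and a communication device 809. The communication device 809 may allow the electronic device 800 to communicate with other devices wirelessly or by wire to exchange data. Although the figure shows the electronic device 800 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or present. More or fewer devices may alternatively be implemented or present.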
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置809从网络上被下载和安装,或者从存储装置808被安装,或者从ROM 802被安装。在该计算机程序被处理装置801执行时,执行本公开实施例的方法中限定的上述功能。
需要说明的是,本公开上述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本公开中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:电线、光缆、RF(射频)等等,或者上述的任意合适的组合。
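In some implementations, the client and the server may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (for example, the Internet), and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.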
上述计算机可读介质可以是上述电子设备中所包含的;也可以是单独存在,而未装配入该电子设备中。
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备执行上述的本公开的方法。
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的操作的计算机程序代码,上述程序设计语言包括面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。
附图中的流程图和框图,图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
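The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.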
本文中以上描述的功能可以至少部分地由一个或多个硬件逻辑部件来执行。例如,非限制性地,可以使用的示范类型的硬件逻辑部件包括:现场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、片上系统(SOC)、复杂可编程逻辑设备(CPLD)等等。
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
根据本公开的一个或多个实施例,提供了一种图像采集方法,用于图像采集装置,包括:
获取语音信息;
判断语音信息是否符合第一预设条件;
若语音信息符合第一预设条件,确定目标对象的位置;
根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
根据本公开的一个或多个实施例,提供了一种图像采集方法,判断语音信息是否符合第一预设条件,包括:
判断语音信息中是否包括预设的关键词;
若语音信息中包括关键词,则符合第一预设条件;或者,
若语音信息中不包括关键词,则不符合第一预设条件。
根据本公开的一个或多个实施例,提供了一种图像采集方法,确定目标对象的位置,包括:获取用户的身体图像,根据用户的身体图像确定目标对象的位置。
根据本公开的一个或多个实施例,提供了一种图像采集方法,根据用户的身体图像确定目标对象的位置,包括:
判断身体图像中是否包括目标肢体的特征点;
若身体图像中包括目标肢体的特征点,则根据目标肢体的特征点的位置确定目标对象的位置;或者,
若身体图像中不包括目标肢体的特征点,则重新获取用户的身体图像。
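According to one or more embodiments of the present disclosure, an image acquisition method is provided, in which determining the position of the target object according to the position of the feature point of the target limb includes: determining a target range with the feature point of the target limb as the center and a preset distance as the radius, and locating the target object within the target range to determine its position; or searching for and locating the target object near the feature point of the target limb.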
根据本公开的一个或多个实施例,提供了一种图像采集方法,目标肢体包括:手和胳膊中的至少一个。
根据本公开的一个或多个实施例,提供了一种图像采集方法,对目标对象进行拍摄得到目标对象的图像,包括:
调节拍摄时的视角,和/或,调节拍摄时的焦距从而对目标对象进行拍摄。
根据本公开的一个或多个实施例,提供了一种图像采集方法,调节拍摄时的视角,以使目标对象位于拍摄的图像的中部;
和/或,调节拍摄时的焦距,以提高拍摄时的放大倍数。
根据本公开的一个或多个实施例,提供了一种图像采集方法,调节拍摄时的焦距,包括:
根据显示屏的尺寸调节拍摄时的焦距;
其中,显示屏用于显示拍摄的图像。
根据本公开的一个或多个实施例,提供了一种图像采集方法,还包括:
获取语音指令;根据所述语音指令调节拍摄时的视角和/或焦距。
根据本公开的一个或多个实施例,提供了一种图像采集方法,还包括:
再次获取语音信息;
判断再次获取的语音信息是否符合第二预设条件;
若再次获取的语音信息符合第二预设条件,将图像采集装置调节至第一状态;
其中,第一状态是对目标对象进行拍摄得到目标对象的图像之前图像采集装置的状态。
根据本公开的一个或多个实施例,提供了一种图像采集方法,用于图像采集装置,包括:
获取预设对象的图像;
根据预设对象的图像确定目标对象的位置;
根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
根据本公开的一个或多个实施例,提供了一种图像采集方法,预设对象的图像包括:用户的身体图像或者预设物品的图像。
根据本公开的一个或多个实施例,提供了一种图像采集方法,获取预设对象的图像之后,根据所述预设对象的图像确定目标对象的位置之前,还包括:确定所述预设对象的图像是否符合第三预设条件;若所述预设对象的图像符合第三预设条件,根据所述预设对象的图像确定目标对象的位置。
根据本公开的一个或多个实施例,提供了一种图像采集方法,预设对象的图像包括:用户的身体图像;确定身体图像是否符合第三预设条件,包括:
判断身体图像中是否包括:具有目标动作的目标肢体;
若是,则符合第三预设条件;或者,
若否,则不符合第三预设条件。
根据本公开的一个或多个实施例,提供了一种图像采集方法,所述预设对象的图像包括:预设物品的图像;确定所述预设对象的图像是否符合第三预设条件,包括:判断所述预设物品的图像中所述预设物品是否被握持且指向任一物体;若是,则满足所述第三预设条件,或者,若否,则不满足所述第三预设条件。
根据本公开的一个或多个实施例,提供了一种图像采集方法,所述具有目标动作的目标肢体,包括:指向物体的手、托起物体的手、握住物体的手和看向物体的眼睛中的至少一个。
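According to one or more embodiments of the present disclosure, an image acquisition method is provided, in which determining the position of the target object according to the image of the preset object includes: obtaining the position of a feature point of the preset object in the image of the preset object; and determining the position of the target object according to the position of the feature point of the preset object.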
根据本公开的一个或多个实施例,提供了一种图像采集方法,根据所述预设对象的特征点的位置确定所述目标对象的位置,包括:以所述预设对象的特征点为中心,以预设距离为半径确定目标范围;在所述目标范围内定位目标对象以确定目标对象的位置;或者,在所述预设对象的特征点的附近寻找并定位目标对象。
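According to one or more embodiments of the present disclosure, an image acquisition method is provided, in which the target limb includes at least one of a hand and an arm.
According to one or more embodiments of the present disclosure, an image acquisition method is provided, in which photographing the target object to obtain the image of the target object includes:
adjusting the angle of view during shooting and/or adjusting the focal length during shooting, so as to shoot the target object.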
根据本公开的一个或多个实施例,提供了一种图像采集方法,调节拍摄时的视角,以使目标对象位于拍摄的图像的中部;
和/或,
调节拍摄时的焦距,以提高拍摄时的放大倍数。
根据本公开的一个或多个实施例,提供了一种图像采集方法,调节拍摄时的焦距,包括:
根据显示屏的尺寸调节拍摄时的焦距;
其中,显示屏用于显示拍摄的图像。
根据本公开的一个或多个实施例,提供了一种图像采集方法,还包括:
获取语音指令,在所述语音指令满足预设条件的情况下,根据所述预设对象的图像确定目标对象的位置,或者根据所述语音指令调节拍摄时的视角和/或焦距。
根据本公开的一个或多个实施例,提供了一种图像采集方法,还包括:
获取语音信息;
判断获取的语音信息是否符合第四预设条件;
若获取的语音信息符合第四预设条件,将图像采集装置调节至第二状态;
其中,第二状态是对目标对象进行拍摄得到目标对象的图像之前图像采集装置的状态。
根据本公开的一个或多个实施例,提供了一种图像采集装置,包括:
语音单元,用于获取语音信息;
识别单元,用于判断语音信息是否符合第一预设条件;
定位单元,用于若语音信息符合第一预设条件,确定目标对象的位置;
拍摄单元,用于根据目标对象的位置,对目标对象进行拍摄得到目标对象的图像。
根据本公开的一个或多个实施例,提供了一种图像采集装置,包括:
获取模块,用于获取预设对象的图像;
定位模块,用于根据所述预设对象的图像确定目标对象的位置;
拍摄模块,用于根据所述目标对象的位置,对所述目标对象进行拍摄得到所述目标对象的图像。
根据本公开的一个或多个实施例,提供了一种终端,包括:至少一个存储器和至少一个处理器;
其中,所述至少一个存储器用于存储程序代码,所述至少一个处理器用于调用所述至少一个存储器所存储的程序代码执行上述中任一项所述的方法。
根据本公开的一个或多个实施例,提供了一种存储介质,所述存储介质用于存储程序代码,所述程序代码用于执行上述的方法。
以上描述仅为本公开的较佳实施例以及对所运用技术原理的说明。本领域技术人员应当理解,本公开中所涉及的公开范围,并不限于上述技术特征的特定组合而成的技术方案,同时也应涵盖在不脱离上述公开构思的情况下,由上述技术特征或其等同特征进行任意组合而形成的其它技术方案。例如上述特征与本公开中公开的(但不限于)具有类似功能的技术特征进行互相替换而形成的技术方案。
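In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.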
尽管已经采用特定于结构特征和/或方法逻辑动作的语言描述了本主题,但是应当理解所附权利要求书中所限定的主题未必局限于上面描述的特定特征或动作。相反,上面所描述的特定特征和动作仅仅是实现权利要求书的示例形式。

Claims (28)

  1. 一种图像采集方法,用于图像采集装置,其特征在于,包括:
    获取语音信息;
    判断所述语音信息是否符合第一预设条件;
    若所述语音信息符合第一预设条件,确定目标对象的位置;
    根据所述目标对象的位置,对所述目标对象进行拍摄得到所述目标对象的图像。
  2. 根据权利要求1所述的图像采集方法,其特征在于,所述确定目标对象的位置,包括:获取用户的身体图像,根据所述用户的身体图像确定所述目标对象的位置。
  3. 根据权利要求2所述的图像采集方法,其特征在于,根据所述用户的身体图像确定所述目标对象的位置,包括:
    判断所述身体图像中是否包括目标肢体的特征点;
    若所述身体图像中包括所述目标肢体的特征点,则根据所述目标肢体的特征点的位置确定所述目标对象的位置;或者,
    若所述身体图像中不包括所述目标肢体的特征点,则重新获取用户的身体图像。
  4. 根据权利要求3所述的图像采集方法,其特征在于,根据所述目标肢体的特征点的位置确定所述目标对象的位置,包括:
    以目标肢体的特征点为中心,以预设距离为半径确定目标范围;在所述目标范围内定位目标对象以确定目标对象的位置;
    或者,在目标肢体的特征点的附近寻找并定位目标对象。
  5. 根据权利要求3所述的图像采集方法,其特征在于,
    所述目标肢体包括:手和胳膊中的至少一个。
  6. 根据权利要求1所述的图像采集方法,其特征在于,对所述目标对象进行拍摄得到所述目标对象的图像,包括:
    调节拍摄时的视角,和/或,调节拍摄时的焦距从而对所述目标对象进行拍摄。
  7. 根据权利要求6所述的图像采集方法,其特征在于,
    调节拍摄时的视角,以使所述目标对象位于拍摄的图像的中部;
    和/或,调节拍摄时的焦距,以提高拍摄时的放大倍数。
  8. 根据权利要求6所述的图像采集方法,其特征在于,所述调节拍摄时的焦距,包括:
    根据显示屏的尺寸调节拍摄时的焦距;
    其中,所述显示屏用于显示拍摄的图像。
  9. 根据权利要求1所述的图像采集方法,其特征在于,还包括:
    获取语音指令;根据所述语音指令调节拍摄时的视角和/或焦距。
  10. 根据权利要求1所述的图像采集方法,其特征在于,还包括:
    再次获取语音信息;
    判断再次获取的语音信息是否符合第二预设条件;
    若再次获取的语音信息符合所述第二预设条件,将所述图像采集装置调节至第一状态;
    其中,所述第一状态是对所述目标对象进行拍摄得到所述目标对象的图像之前所述图像采集装置的状态。
  11. 根据权利要求1所述的图像采集方法,其特征在于,判断所述语音信息是否符合第一预设条件,包括:
    判断所述语音信息中是否包括预设的关键词;
    若所述语音信息中包括所述关键词,则符合所述第一预设条件;或者,
    若所述语音信息中不包括所述关键词,则不符合所述第一预设条件。
  12. 一种图像采集方法,用于图像采集装置,其特征在于,包括:
    获取预设对象的图像;
    根据所述预设对象的图像确定目标对象的位置;
    根据所述目标对象的位置,对所述目标对象进行拍摄得到所述目标对象的图像。
  13. 根据权利要求12所述的图像采集方法,其特征在于,
    所述预设对象的图像包括:用户的身体图像或者预设物品的图像。
  14. 根据权利要求12或13所述的图像采集方法,其特征在于,获取预设对象的图像之后,根据所述预设对象的图像确定目标对象的位置之前,还包括:
    确定所述预设对象的图像是否符合第三预设条件;
    若所述预设对象的图像符合第三预设条件,根据所述预设对象的图像确定目标对象的位置。
  15. 根据权利要求14所述的图像采集方法,其特征在于,
    所述预设对象的图像包括:用户的身体图像;
    确定所述预设对象的图像是否符合第三预设条件,包括:
    判断所述身体图像中是否包括:具有目标动作的目标肢体;若是,则符合所述第三预设条件;或者,若否,则不符合所述第三预设条件;
    或者,
    所述预设对象的图像包括:预设物品的图像;
    确定所述预设对象的图像是否符合第三预设条件,包括:
    判断所述预设物品的图像中所述预设物品是否被握持且指向物体;若是,则满足所述第三预设条件,或者,若否,则不满足所述第三预设条件。
  16. 根据权利要求15所述的图像采集方法,其特征在于,所述具有目标动作的目标肢体,包括:
    指向物体的手、托起物体的手、握住物体的手和看向物体的眼睛中的至少一个。
  17. 根据权利要求12所述的图像采集方法,其特征在于,
    根据所述预设对象的图像确定所述目标对象的位置,包括:获取所述预设对象的图像中预设对象的特征点的位置;根据所述预设对象的特征点的位置确定所述目标对象的位置。
  18. 根据权利要求17所述的图像采集方法,其特征在于,
    根据所述预设对象的特征点的位置确定所述目标对象的位置,包括:
    以所述预设对象的特征点为中心,以预设距离为半径确定目标范围;在所述目标范围内定位目标对象以确定目标对象的位置;
    或者,在所述预设对象的特征点的附近寻找并定位目标对象。
  19. 根据权利要求15所述的图像采集方法,其特征在于,
    所述目标肢体包括:手和胳膊中的至少一个。
  20. 根据权利要求12所述的图像采集方法,其特征在于,对所述目标对象进行拍摄得到所述目标对象的图像,包括:
    调节拍摄时的视角,和/或,调节拍摄时的焦距从而对所述目标对象进行拍摄。
  21. 根据权利要求20所述的图像采集方法,其特征在于,
    调节拍摄时的视角,以使所述目标对象位于拍摄的图像的中部;
    和/或,
    调节拍摄时的焦距,以提高拍摄时的放大倍数。
  22. 根据权利要求20所述的图像采集方法,其特征在于,所述调节拍摄时的焦距,包括:
    根据显示屏的尺寸调节拍摄时的焦距;
    其中,所述显示屏用于显示拍摄的图像。
  23. 根据权利要求12所述的图像采集方法,其特征在于,还包括:
    获取语音指令,在所述语音指令满足预设条件的情况下,根据所述预设对象的图像确定目标对象的位置,或者根据所述语音指令调节拍摄时的视角和/或焦距。
  24. 根据权利要求12所述的图像采集方法,其特征在于,还包括:
    获取语音信息;
    判断获取的语音信息是否符合第四预设条件;
    若获取的语音信息符合所述第四预设条件,将所述图像采集装置调节至第二状态;
    其中,所述第二状态是对所述目标对象进行拍摄得到所述目标对象的图像之前所述图像采集装置的状态。
  25. 一种图像采集装置,其特征在于,包括:
    语音单元,用于获取语音信息;
    识别单元,用于判断所述语音信息是否符合第一预设条件;
    定位单元,用于若所述语音信息符合第一预设条件,确定目标对象的位置;
    拍摄单元,用于根据所述目标对象的位置,对所述目标对象进行拍摄得到所述目标对象的图像。
  26. 一种图像采集装置,其特征在于,包括:
    获取模块,用于获取预设对象的图像;
    定位模块,用于根据所述预设对象的图像确定目标对象的位置;
    拍摄模块,用于根据所述目标对象的位置,对所述目标对象进行拍摄得到所述目标对象的图像。
  27. 一种终端,包括:
    至少一个存储器和至少一个处理器;
    其中,所述至少一个存储器用于存储程序代码,所述至少一个处理器用于调用所述至少一个存储器所存储的程序代码执行权利要求1至24中任一项所述的方法。
  28. 一种存储介质,所述存储介质用于存储程序代码,所述程序代码用于执行权利要求1至24中任一项所述的方法。
PCT/CN2021/120652 2020-10-15 2021-09-26 图像采集方法、装置、终端和存储介质 WO2022078190A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/249,160 US20230394614A1 (en) 2020-10-15 2021-09-26 Image collection method and apparatus, terminal, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011102914.0 2020-10-15
CN202011102914.0A CN114374815B (zh) 2020-10-15 2020-10-15 图像采集方法、装置、终端和存储介质

Publications (1)

Publication Number Publication Date
WO2022078190A1 true WO2022078190A1 (zh) 2022-04-21

Family

ID=81137967

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/120652 WO2022078190A1 (zh) 2020-10-15 2021-09-26 图像采集方法、装置、终端和存储介质

Country Status (3)

Country Link
US (1) US20230394614A1 (zh)
CN (1) CN114374815B (zh)
WO (1) WO2022078190A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160191812A1 (en) * 2014-12-24 2016-06-30 Canon Kabushiki Kaisha Zoom control device, control method of zoom control device, and recording medium
US20170308763A1 (en) * 2016-04-25 2017-10-26 Microsoft Technology Licensing, Llc Multi-modality biometric identification
CN107924392A (zh) * 2015-08-26 2018-04-17 微软技术许可有限责任公司 基于姿势的注释
CN109963073A (zh) * 2017-12-26 2019-07-02 浙江宇视科技有限公司 摄像机控制方法、装置、系统和云台摄像机
CN110213492A (zh) * 2019-06-28 2019-09-06 Oppo广东移动通信有限公司 设备成像方法、装置、存储介质及电子设备
CN110602391A (zh) * 2019-08-30 2019-12-20 Oppo广东移动通信有限公司 拍照控制方法、装置、存储介质及电子设备
CN111491212A (zh) * 2020-04-17 2020-08-04 维沃移动通信有限公司 视频处理方法及电子设备

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106385537A (zh) * 2016-09-19 2017-02-08 深圳市金立通信设备有限公司 一种拍照方法及终端
CN106803882A (zh) * 2017-02-27 2017-06-06 宇龙计算机通信科技(深圳)有限公司 聚焦方法及其设备
WO2019090717A1 (zh) * 2017-11-10 2019-05-16 深圳传音通讯有限公司 自动对焦方法与装置
CN107888833A (zh) * 2017-11-28 2018-04-06 维沃移动通信有限公司 一种图像拍摄方法及移动终端
CN110192168B (zh) * 2017-12-29 2022-06-10 深圳市大疆创新科技有限公司 一种无人机拍照方法、图像处理方法和装置
CN112771472B (zh) * 2018-10-15 2022-06-10 美的集团股份有限公司 提供实时产品交互协助的系统和方法
KR102664688B1 (ko) * 2019-02-19 2024-05-10 삼성전자 주식회사 가상 캐릭터 기반 촬영 모드를 제공하는 전자 장치 및 이의 동작 방법
CN109872297A (zh) * 2019-03-15 2019-06-11 北京市商汤科技开发有限公司 图像处理方法及装置、电子设备和存储介质
CN110418064B (zh) * 2019-09-03 2022-03-04 北京字节跳动网络技术有限公司 对焦方法、装置、电子设备及存储介质
CN110604579B (zh) * 2019-09-11 2024-05-17 腾讯科技(深圳)有限公司 一种数据采集方法、装置、终端及存储介质
CN110809115B (zh) * 2019-10-31 2021-04-13 维沃移动通信有限公司 拍摄方法及电子设备
CN111212226A (zh) * 2020-01-10 2020-05-29 Oppo广东移动通信有限公司 对焦拍摄方法和装置
KR102112517B1 (ko) * 2020-03-06 2020-06-05 모바일센 주식회사 실시간 영상 분석을 통한 카메라 위치 제어 및 영상 편집을 통한 무인 스포츠 중계 서비스 방법 및 이를 위한 장치

Also Published As

Publication number Publication date
CN114374815A (zh) 2022-04-19
CN114374815B (zh) 2023-04-11
US20230394614A1 (en) 2023-12-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21879236

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21879236

Country of ref document: EP

Kind code of ref document: A1