WO2019136636A1 - Image recognition method and system, electronic device, and computer program product - Google Patents


Publication number
WO2019136636A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
scene
recognition
focal length
recognition function
Application number
PCT/CN2018/072111
Other languages
French (fr)
Chinese (zh)
Inventor
刘兆祥
廉士国
王敏
Original Assignee
深圳前海达闼云端智能科技有限公司
Application filed by 深圳前海达闼云端智能科技有限公司
Priority to PCT/CN2018/072111 (WO2019136636A1)
Priority to CN201880000060.XA (CN108235816B)
Publication of WO2019136636A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/67 Focus control based on electronic image sensor signals

Definitions

  • the present application relates to the field of image recognition technologies, and in particular, to an image recognition method, system, electronic device, and computer program product.
  • China is the country with the most blind people in the world.
  • They live in darkness throughout their lives and therefore often encounter various difficulties.
  • Camera-based intelligent image recognition can enhance the convenience of blind people's life, and the quality of the captured image is crucial for subsequent recognition functions.
  • A fixed-focus camera can only capture sharp images within a certain depth of field, so its scope of application is limited, while an autofocus camera often fails to focus correctly without user intervention, producing images that cannot be reliably recognized afterwards.
  • Embodiments of the present application provide an image recognition method, system, electronic device, and computer program product.
  • In a first aspect, an embodiment of the present application provides an image recognition method, where the method includes:
  • recognizing the recognition object using the recognition focal length.
  • In a second aspect, an embodiment of the present application provides an electronic device, where the electronic device includes:
  • a memory and one or more processors, the memory being coupled to the processors via a communication bus and the processors being configured to execute instructions in the memory; the memory stores instructions for performing the steps of the method of the first aspect.
  • An embodiment of the present application provides a computer program product for use in conjunction with an electronic device including a display, the computer program product comprising a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism including instructions for performing the steps of the method of the first aspect described above.
  • An embodiment of the present application provides an image recognition system, where the image recognition system includes an image acquisition unit and a mobile computing processing unit;
  • the image acquisition unit is a camera with a controllable focal length, where the controllable range includes a telephoto focal length, a medium focal length, and a short focal length; or
  • the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera;
  • the mobile computing processing unit is the electronic device of the second aspect;
  • the mobile computing processing unit is coupled to the image acquisition unit via Universal Serial Bus (USB) or wireless communication.
  • A scene image of the scene in which the recognition object is located is collected at the medium focal length, a recognition focal length is determined according to the scene image, and the recognition object is recognized using the recognition focal length. The environment is thus recognized automatically, without user intervention or input, and a suitable focal length is chosen based on the environment to obtain the best capture quality, which improves recognition accuracy and makes daily life considerably more convenient for blind people.
  • FIG. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 2 is a schematic flow chart of an image recognition method in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of an implementation method of an image recognition method in an embodiment of the present application.
  • The embodiment of the present application provides an image recognition method that collects, at the medium focal length, a scene image of the scene in which the recognition object is located, determines a recognition focal length according to the scene image, and recognizes the recognition object using the recognition focal length. The environment is thus recognized automatically, without user intervention or input, and a suitable focal length is selected based on the environment to obtain the best capture quality, which improves recognition accuracy and makes daily life considerably more convenient for blind people.
  • the image recognition method provided by the present application can be used in an image recognition system as follows.
  • The image recognition system includes an image acquisition unit and a mobile computing processing unit.
  • The image acquisition unit is configured to acquire an image at the current focal length.
  • The focal length of the image acquisition unit in the present application can be controlled and modified by the mobile computing processing unit and can be adjusted over a wide range, for example from clearly capturing text a few centimeters away to capturing traffic lights or traffic signs tens of meters away. The image acquisition unit may therefore be implemented in various ways.
  • For example, the image acquisition unit is a camera with a controllable focal length, where the controllable range includes a telephoto focal length, a medium focal length, and a short focal length.
  • The number of cameras is not limited, but there is at least one camera with a controllable focal length.
  • Alternatively, the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera.
  • the image acquisition unit may be located on the wearable glasses, such as on the guide glasses.
  • the mobile computing processing unit can be connected to the image acquisition unit via USB (Universal Serial Bus) or wireless communication (such as Bluetooth).
  • The mobile computing processing unit is responsible for controlling the focal length of the image acquisition unit, image acquisition, coarse scene classification, specific image recognition, and voice broadcast output.
  • The mobile computing processing unit may control the focal length of the image acquisition unit to be the medium focal length, and acquire, through the image acquisition unit set to the medium focal length, an image of the scene in which the recognition object is located.
  • The mobile computing processing unit may further control the focal length of the image acquisition unit to be the recognition focal length, and acquire the first image of the recognition object through the image acquisition unit set to the recognition focal length.
  • The mobile computing processing unit may further control the focal length of the image acquisition unit to be the medium focal length after determining that recognition is complete.
  • the mobile computing processing unit may be an electronic device as shown in FIG. 1 in a specific application.
  • the electronic device can be a general purpose smartphone.
  • the electronic device includes: a memory 101, one or more processors 102; and a transceiver component 103.
  • The memory 101, the processors 102, and the transceiver component 103 communicate through a communication bus (in the embodiment of the present application, the communication bus is an I/O bus).
  • the storage medium stores instructions for performing the steps in the image recognition method shown in FIG.
  • Recognizing the recognition object in this way achieves automatic recognition of the environment without user intervention or input, and selects a suitable focal length based on the environment to obtain the best capture quality, thereby improving recognition accuracy and making daily life considerably more convenient for blind people.
  • The transceiver component 103 is not strictly required.
  • the image recognition method provided by this embodiment includes:
  • The mobile computing processing unit establishes a connection with the image acquisition unit via USB or a wireless communication method such as Bluetooth.
  • Through this connection, the mobile computing processing unit collects the scene image of the scene in which the recognition object is located at the medium focal length.
  • When the image acquisition unit is a camera with a controllable focal length, the mobile computing processing unit uses the connection to adjust the focal length of the image acquisition unit to the intermediate distance, so that the current focal length of the image acquisition unit is the medium focal length.
  • The image acquisition unit collects the scene image at the current focal length and transmits it to the mobile computing processing unit, so that the mobile computing processing unit obtains, through the image acquisition unit set to the medium focal length, the image of the scene in which the recognition object is located.
  • When the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera,
  • the mobile computing processing unit selects the medium-focus camera in the image acquisition unit through the connection.
  • The selected medium-focus camera captures the scene image and transmits it to the mobile computing processing unit, so that the mobile computing processing unit obtains the scene image through an image acquisition unit whose focal length is the medium focal length.
  • The scene image is coarsely classified to obtain the coarse scene to which the scene image belongs.
  • The recognition focal length used for the specific recognition is determined according to the associated coarse scene, so that the lens is adjusted to the corresponding recognition focal length, an image is collected at that focal length, and the corresponding image recognition function is then called.
  • The specific implementation process of step 202 is as follows:
  • The scene image is coarsely classified by the coarse scene classification model, and the coarse scene to which the scene image belongs is determined.
  • the coarse scene is a telephoto scene, a medium focus scene or a short focus scene.
  • the coarse scene corresponds to one or more image recognition functions.
  • the image recognition functions corresponding to different coarse scenes are different, and the number of corresponding image recognition functions may be the same or different.
  • For example, if the coarse scene is a telephoto scene, only one image recognition function corresponds to it, namely the traffic light recognition function.
  • Two image recognition functions correspond to the short-focus scene, namely the reading recognition function and the item recognition function.
  • Multiple image recognition functions correspond to the medium-focus scene, one of which is a face recognition function.
  • Further coarse scenes may be added in this embodiment according to actual conditions.
  • The number and specific functions of the image recognition functions corresponding to each coarse scene can also be adjusted according to actual conditions.
  • This embodiment does not limit the specific categories of coarse scene, the number and specific functions of the image recognition functions corresponding to each coarse scene, or when and how those categories, numbers, and functions are adjusted.
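The correspondence between coarse scenes and image recognition functions described above can be pictured as a small lookup table. The following is a minimal sketch, not the applicant's implementation: the names `CoarseScene`, `RECOGNITION_FUNCTIONS`, and `needs_fine_classification` are hypothetical, and the second medium-focus entry is a placeholder because the text names only face recognition for that scene.

```python
from enum import Enum


class CoarseScene(Enum):
    """The three coarse scenes named in the text."""
    TELEPHOTO = "telephoto"
    MEDIUM = "medium"
    SHORT = "short"


# Hypothetical registry: coarse scene -> names of its image recognition functions.
# "other_medium_focus_function" is a placeholder; the text only says the
# medium-focus scene has several functions, one of which is face recognition.
RECOGNITION_FUNCTIONS = {
    CoarseScene.TELEPHOTO: ["traffic_light"],
    CoarseScene.MEDIUM: ["face", "other_medium_focus_function"],
    CoarseScene.SHORT: ["reading", "item"],
}


def needs_fine_classification(scene: CoarseScene) -> bool:
    """A scene with more than one function needs a further choice among them."""
    return len(RECOGNITION_FUNCTIONS[scene]) > 1
```

Keeping the table as data rather than code matches the statement that scenes and functions may be added or adjusted according to actual conditions.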
  • The coarse scene classification model is obtained by deep learning on samples of telephoto scenes, samples of medium-focus scenes, and samples of short-focus scenes.
  • A CNN (Convolutional Neural Network) is trained on these samples to obtain the trained coarse scene classification model.
  • The trained coarse scene classification model and its weights can then be used to classify the scene image collected in step 201, and the coarse scene to which the image belongs is determined according to the magnitude of the output probabilities.
  • Once the coarse scene is known, the recognition focal length to be used when the recognition object is actually recognized is obtained from it. The focal length thus changes dynamically with the coarse scene, which improves image quality for the subsequent recognition, thereby improving recognition accuracy and making daily life considerably more convenient for blind people.
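Choosing the coarse scene "according to the magnitude of the output probability" amounts to taking the class with the largest probability. The sketch below assumes the classifier emits three raw scores (logits) in a fixed order; the network itself is omitted, and `SCENES`, `softmax`, and `pick_coarse_scene` are illustrative names rather than anything defined in the application.

```python
import math

# Assumed output order of the (omitted) coarse scene classifier.
SCENES = ("telephoto", "medium", "short")


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_coarse_scene(logits):
    """Return the most probable coarse scene and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SCENES[best], probs[best]
```

In this sketch the returned scene name doubles as the recognition focal length class, since each coarse scene is named after the focal length it calls for.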
  • The first image of the recognition object is collected at the recognition focal length.
  • The mobile computing processing unit controls the focal length of the image acquisition unit to be the recognition focal length, and the first image of the recognition object is acquired through the image acquisition unit set to the recognition focal length.
  • When the image acquisition unit is a camera with a controllable focal length, the mobile computing processing unit uses the connection established with the image acquisition unit to adjust its focal length to the recognition focal length, so that the current focal length of the image acquisition unit is the recognition focal length.
  • The image acquisition unit collects an image of the recognition object at the current focal length and transmits it to the mobile computing processing unit, so that the mobile computing processing unit obtains the first image of the recognition object through the image acquisition unit set to the recognition focal length.
  • When the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera,
  • the mobile computing processing unit uses the connection established with the image acquisition unit to select the camera in the image acquisition unit that corresponds to the recognition focal length.
  • The selected camera captures an image of the recognition object and transmits it to the mobile computing processing unit, so that the mobile computing processing unit obtains the first image of the recognition object through an image acquisition unit whose focal length is the recognition focal length.
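For the fixed-focus variant, "selecting the camera corresponding to the recognition focal length" is a lookup over the available cameras. This is a sketch under assumptions: `FixedFocusCamera`, its `device_id` field, and `select_camera` are hypothetical names, and real camera handles are replaced by plain records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedFocusCamera:
    device_id: int    # hypothetical identifier of the physical camera
    focus_class: str  # "telephoto", "medium" or "short"


def select_camera(cameras, recognition_focus):
    """Return the first camera whose fixed focus matches the recognition focal length."""
    for camera in cameras:
        if camera.focus_class == recognition_focus:
            return camera
    raise ValueError(f"no camera with focus class {recognition_focus!r}")
```

The same function can return the default camera for scene capture by passing `"medium"`.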
  • The image recognition function corresponding to the coarse scene is called to recognize the first image.
  • For example, if the coarse scene is a telephoto scene and the only image recognition function corresponding to the telephoto scene is the traffic light recognition function, the traffic light recognition function can be called directly.
  • The traffic light recognition function is called and determines, according to the first image, whether the recognition object is a red light, a green light, or a yellow light; or the traffic light recognition function is called and determines, according to the first image, whether the recognition object is a red light, a green light, a yellow light, or not a red, green, or yellow light.
  • For example, in the traffic light recognition function the first image is classified by a deep neural network to distinguish among red light, green light, and yellow light, or among red light, green light, yellow light, and not a red, green, or yellow light.
  • Alternatively, in the traffic light recognition function the three targets red light, green light, and yellow light are detected in the first image by target detection, or the four targets red light, green light, yellow light, and not a red, green, or yellow light are detected in the first image by target detection.
  • The target detection includes, but is not limited to, detection based on an SSD (Single Shot MultiBox Detector) target detection model.
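When target detection is used, the detector's outputs still have to be reduced to one of the four answers. The sketch below assumes detections arrive as `(label, score)` pairs from some detector such as an SSD model; the detector is omitted, and the function name, the `"none"` label, and the 0.5 score cut-off are assumptions made for illustration.

```python
# Labels the traffic light recognition function distinguishes.
LIGHTS = ("red", "green", "yellow")


def traffic_light_state(detections, min_score=0.5):
    """Reduce (label, score) detections to "red", "green", "yellow" or "none".

    The highest-scoring light at or above min_score wins; anything else,
    including an empty detection list, yields "none".
    """
    best_label, best_score = "none", min_score
    for label, score in detections:
        if label in LIGHTS and score >= best_score:
            best_label, best_score = label, score
    return best_label
```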
  • If the associated coarse scene corresponds to multiple image recognition functions, the first image is finely classified by the fine scene classification model to determine which image recognition function in the associated coarse scene corresponds to the first image, and that image recognition function is then called to recognize the first image.
  • That is, fine scene classification is performed first, and the specific image recognition function is called afterwards.
  • For example, the short-focus scene corresponds to two image recognition functions: a reading recognition function and an item recognition function.
  • Newspapers and periodicals include publications such as newspapers and magazines.
  • If the recognition object is a book or a newspaper or periodical, the image recognition function corresponding to the first image in the associated coarse scene is determined to be the reading recognition function, and the reading recognition function is called to recognize the first image, that is, OCR is performed.
  • The recognition result can also be output in the form of a voice broadcast.
  • After the OCR result has been output, an image of the recognition object must be collected again and analyzed to decide whether recognition needs to be repeated. A blind person may shift a book up, down, left, or right while reading, and in that case there is no need to recognize it again; OCR only needs to be repeated when the user turns the page. This avoids restarting the broadcast from the beginning and effectively improves the user experience.
  • Specifically, while the reading recognition function is being called to recognize the first image, second images of the recognition object are acquired continuously. Each time a second image is acquired, the content similarity between that second image and the first image is determined. If the content similarity is lower than the first threshold, continuous acquisition of second images stops and OCR is performed again.
  • The second image then becomes the new first image: the reading recognition function is called again to recognize the new first image, new second images of the recognition object are acquired continuously at the same time, and the content similarity between each new second image and the new first image is determined. These steps repeat until recognition of the recognition object is complete.
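The re-recognition rule can be read as a loop that keeps a reference frame and reruns OCR only when a new frame differs enough from it. This is a minimal sketch under stated assumptions: `similarity` and `ocr` are caller-supplied stand-ins for the real similarity measure and reading recognition function, frames are any objects those callbacks accept, the 0.5 default for the first threshold is arbitrary, and the completion check described later in the text is left out for brevity.

```python
def read_pages(frames, similarity, ocr, first_threshold=0.5):
    """Run OCR on the first frame, then again only when the page appears to change.

    frames: iterable of captured images, in capture order.
    similarity(first, second): content similarity, higher means more alike.
    ocr(image): recognition result for one image.
    """
    frames = iter(frames)
    first = next(frames, None)
    if first is None:
        return []
    outputs = [ocr(first)]
    for second in frames:
        if similarity(first, second) < first_threshold:
            # Content changed (e.g. a page turn): the second image becomes
            # the new first image and is recognized from scratch.
            first = second
            outputs.append(ocr(first))
        # Otherwise the book merely moved; nothing is re-read.
    return outputs
```

With string "frames", `read_pages(["p1", "p1", "p2"], lambda a, b: float(a == b), str.upper)` recognizes only two pages, so the repeated frame does not trigger a second broadcast.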
  • The content similarity between the second image and the first image may be determined by feature point matching.
  • Second feature points are extracted from the second image, first feature points are extracted from the first image, and the content similarity between the second image and the first image is determined according to the number of second feature points that match first feature points.
  • For example, ORB or SIFT feature points are extracted from the first frame image and the current frame image respectively and then matched; the similarity is judged by the number of successfully matched points, and the more points that match, the higher the similarity.
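To illustrate similarity by counting matched feature points without depending on an image library, the sketch below treats each feature point as a binary descriptor stored in an integer and matches descriptors by Hamming distance, which is how ORB-style descriptors are usually compared. Descriptor extraction is omitted, and the greedy one-to-one matching, the `max_distance` cut-off, and the normalization to the range 0 to 1 are assumptions rather than details given in the text.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def count_matches(first_desc, second_desc, max_distance=2):
    """Greedily pair each second-image descriptor with its nearest unused first-image one."""
    unused = list(first_desc)
    matches = 0
    for desc in second_desc:
        if not unused:
            break
        best = min(unused, key=lambda f: hamming(desc, f))
        if hamming(desc, best) <= max_distance:
            matches += 1
            unused.remove(best)
    return matches


def content_similarity(first_desc, second_desc, max_distance=2):
    """More matched points means higher similarity; normalized here to 0..1."""
    if not first_desc or not second_desc:
        return 0.0
    matched = count_matches(first_desc, second_desc, max_distance)
    return matched / max(len(first_desc), len(second_desc))
```

The result can be compared directly against the first threshold used to decide whether OCR must be repeated.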
  • If the recognition object is an item other than a book or newspaper, the image recognition function corresponding to the first image in the associated coarse scene is determined to be the item recognition function, and the item recognition function is called to recognize the first image.
  • The image recognition method provided in this embodiment further determines whether use of the image recognition function in the scene has ended, that is, whether recognition is complete. If recognition is complete, the mobile computing processing unit readjusts the focal length of the image acquisition unit (such as the camera) to the medium-focus position, that is, controls the focal length of the image acquisition unit to be the medium focal length, and enters the next working cycle.
  • For a camera with a controllable focal length, the mobile computing processing unit uses the connection established with the image acquisition unit to adjust its focal length to the intermediate distance, so that the current focal length of the image acquisition unit is the medium focal length. This completes the adjustment of the image acquisition unit to the medium-focus position.
  • For three or more fixed-focus cameras, the mobile computing processing unit uses the connection established with the image acquisition unit to select the medium-focus camera as the default camera for the next capture, which completes the adjustment of the image acquisition unit to the medium-focus position.
  • The manner of determining whether recognition is complete includes, but is not limited to, continuously acquiring images for processing and determining, from the output result of the image recognition function in the scene, whether the current recognition function has ended; once the output indicates this, the function is considered to have ended.
  • For example, for the traffic light recognition function: if the recognition object is determined, according to the first image, not to be a red, green, or yellow light, recognition is determined to be complete and the image recognition method ends.
  • If the recognition object is determined, according to the first image, to be a red light, a green light, or a yellow light, second images of the recognition object are collected continuously. Each time a second image is acquired, the traffic light recognition function is called to determine, according to that second image, whether the recognition object is a red light, a green light, a yellow light, or not a red, green, or yellow light. If the recognition object is determined, according to the second image, not to be a red, green, or yellow light, recognition is determined to be complete, continuous acquisition of second images stops, and the image recognition method ends.
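The completion rule for traffic light recognition is a loop that keeps reporting states until a frame no longer shows a red, green, or yellow light. A minimal sketch, assuming a caller-supplied `recognize` callback that returns `"red"`, `"green"`, `"yellow"`, or `"none"` for each frame (the function name and the `"none"` label are illustrative):

```python
def watch_traffic_light(frames, recognize):
    """Collect light states frame by frame and stop at the first "none".

    frames: iterable of captured images, starting with the first image.
    recognize(frame): traffic light recognition result for one frame.
    """
    states = []
    for frame in frames:
        state = recognize(frame)
        if state == "none":
            # No red, green or yellow light any more: recognition is complete.
            break
        states.append(state)
    return states
```

Because the first image goes through the same check, a first frame without a light ends the method immediately with an empty result.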
  • As another example, whether the reading recognition function in the short-focus scene has ended can be judged from the number of OCR characters output and their confidence. If there are few characters or the character confidence is low, the reading recognition function is considered to have ended.
  • Specifically, the reading recognition function is called to obtain the number of characters and the character confidence in the second image. If the number of characters in the second image is less than the second threshold and/or the character confidence in the second image is less than the third threshold, recognition is determined to be complete, continuous acquisition of second images stops, and the image recognition method ends.
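The reading completion test compares two numbers against two thresholds. The sketch below implements the "or" reading of "and/or" and uses the mean per-character confidence as the confidence figure; both choices, along with the default threshold values and the function name, are assumptions made for illustration.

```python
def reading_finished(text, confidences, second_threshold=5, third_threshold=0.6):
    """Return True when OCR output suggests there is nothing left to read.

    text: characters recognized in the second image.
    confidences: one confidence value (0..1) per recognized character.
    """
    too_few = len(text) < second_threshold
    mean_conf = sum(confidences) / len(confidences) if confidences else 0.0
    low_confidence = mean_conf < third_threshold
    return too_few or low_confidence
```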
  • As a further example, for the face recognition function, human body or face detection can be performed to determine whether the scene contains a human body or face of at least a certain size; if not, the face recognition function is considered to have ended.
  • Specifically, after the image recognition function corresponding to the first image in the associated coarse scene has been called to recognize the first image, second images of the recognition object are collected continuously. Each time a second image is collected, human body or face detection is performed on it. If the detection result is that there is no human body and no face, or that the human body size is smaller than the fourth threshold, or that the face size is smaller than the fifth threshold, recognition is determined to be complete, continuous acquisition of second images stops, and the image recognition method ends.
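The face recognition completion test can be sketched as a check on detected sizes. The detector is omitted; sizes are assumed to be box heights in pixels, the largest detection of each kind is assumed to be the one that counts, and the default thresholds and function name are illustrative only.

```python
def face_recognition_finished(body_sizes, face_sizes,
                              fourth_threshold=100, fifth_threshold=40):
    """Return True when no person of sufficient size remains in the frame.

    body_sizes / face_sizes: sizes (e.g. box heights in pixels) of the
    human bodies and faces detected in the second image.
    """
    if not body_sizes and not face_sizes:
        return True  # no human body and no face
    if body_sizes and max(body_sizes) < fourth_threshold:
        return True  # largest body is too small
    if face_sizes and max(face_sizes) < fifth_threshold:
        return True  # largest face is too small
    return False
```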
  • This completes the image recognition method provided by the embodiment.
  • The basic idea of the method is shown in FIG. 3. Specifically, the image recognition functions required by blind people are divided, according to the distance at which they are used, into telephoto (long distance), medium focus (intermediate distance), and short focus (close distance). For example, traffic light recognition requires telephoto capture, face recognition requires medium-focus capture, and OCR requires short-focus capture.
  • The mobile computing processing unit first controls the image acquisition unit (such as a camera with a controllable focal length) to focus at the medium-focus position and collects a scene image of the scene in which the recognition object is located. At this setting, images of short-focus and telephoto scenes are not sharp enough, or the target area lacks resolution, but the image can still be coarsely classified to determine whether the scene is a telephoto scene, a short-focus scene, or a medium-focus scene. The recognition focal length is then determined from the coarse scene type, the camera lens is adjusted to that recognition focal length, an image of the recognition object is collected at the recognition focal length, and the image recognition function of the corresponding scene is called. If the scene contains several subdivided functions, fine scene classification can be performed first and the specific image recognition function called afterwards. Finally, the recognition result can be fed back to the blind user by voice announcement.
  • In summary, a scene image of the scene in which the recognition object is located is collected at the medium focal length, a recognition focal length is determined according to the scene image, and the recognition object is recognized using the recognition focal length. The environment is thus recognized automatically, without user intervention or input, and a suitable focal length is chosen based on the environment to obtain the best capture quality, which improves recognition accuracy and makes daily life considerably more convenient for blind people.
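One working cycle of the method can be sketched end to end as follows. Everything here is a stand-in chosen for illustration: `FakeCamera` replaces the image acquisition unit, `classify_coarse` and `classify_fine` replace the two classification models, `functions` maps each coarse scene to its recognition callbacks, and the coarse scene name is reused as the recognition focal length setting.

```python
class FakeCamera:
    """Stand-in for the image acquisition unit; records every focus change."""

    def __init__(self):
        self.focus = None
        self.history = []

    def set_focus(self, focus):
        self.focus = focus
        self.history.append(focus)

    def capture(self):
        # A real unit would return pixels; this "image" only records the focus.
        return {"focus": self.focus}


def recognize_once(camera, classify_coarse, classify_fine, functions):
    """Run one cycle: scene capture, coarse scene, refocus, recognize, reset."""
    camera.set_focus("medium")              # scene image at the medium focal length
    scene_image = camera.capture()
    coarse = classify_coarse(scene_image)   # "telephoto", "medium" or "short"
    camera.set_focus(coarse)                # recognition focal length
    first_image = camera.capture()
    candidates = functions[coarse]          # name -> recognition callback
    if len(candidates) == 1:
        name = next(iter(candidates))
    else:
        name = classify_fine(first_image, candidates)
    result = candidates[name](first_image)
    camera.set_focus("medium")              # ready for the next working cycle
    return coarse, name, result
```

The continuous second-image acquisition and the voice broadcast are left out so that the control flow of a single cycle stays visible.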
  • Embodiments of the present application further provide a computer program product for use in conjunction with an electronic device including a display, the computer program product comprising a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism including instructions for performing the following steps:
  • recognizing the recognition object using the recognition focal length.
  • Determining the recognition focal length according to the scene image includes:
  • coarsely classifying the scene image with the coarse scene classification model to determine the coarse scene to which the scene image belongs;
  • where each coarse scene corresponds to one or more image recognition functions, the coarse scene is a telephoto scene, a medium-focus scene, or a short-focus scene, and the coarse scene classification model is obtained by deep learning on samples of telephoto scenes, samples of medium-focus scenes, and samples of short-focus scenes.
  • Recognizing the recognition object using the recognition focal length includes:
  • if the associated coarse scene corresponds to one image recognition function, calling the image recognition function corresponding to the coarse scene to recognize the first image;
  • if the associated coarse scene corresponds to multiple image recognition functions, finely classifying the first image with the fine scene classification model to determine which image recognition function in the associated coarse scene corresponds to the first image, and calling that image recognition function to recognize the first image.
  • the associated coarse scene is a telephoto scene
  • the image recognition function corresponding to the telephoto scene is a traffic light recognition function
  • the image recognition function corresponding to the coarse scene is called to identify the first image, including:
  • the traffic light recognition function is called, and the recognition object is determined, according to the first image, to be a red light, a green light, a yellow light, or not a red, green, or yellow light.
  • after the traffic light recognition function is called and the recognition object is determined, according to the first image, to be a red light, a green light, a yellow light, or not a red, green, or yellow light, the method further includes:
  • if the recognition object is determined, according to the first image, not to be a red, green, or yellow light, determining that recognition is complete and ending the image recognition method;
  • if the recognition object is determined, according to the first image, to be a red light, a green light, or a yellow light,
  • second images of the recognition object are captured continuously; each time a second image is captured, the traffic light recognition function is called and the recognition object is determined, according to that second image, to be a red light, a green light, a yellow light, or not a red, green, or yellow light. If the recognition object is determined, according to the second image, not to be a red, green, or yellow light, recognition is determined to be complete, continuous capture of second images is stopped, and the image recognition method ends.
  • the associated coarse scene is a short-focus scene
  • the image recognition function corresponding to the short-focus scene is a reading recognition function and an item identification function
  • the first image is finely classified by the scene fine classification model, and the corresponding image recognition function of the first image in the associated coarse scene is determined, including:
  • if the recognition object is a book or a newspaper, determining that the image recognition function corresponding to the first image in the associated coarse scene is the reading recognition function;
  • if the recognition object is an item other than a book or newspaper, determining that the image recognition function corresponding to the first image in the associated coarse scene is the item identification function.
  • the corresponding image recognition function of the first image in the associated coarse scene is a reading recognition function
  • the image recognition function corresponding to the first image in the associated coarse scene is called to identify the first image, including:
  • the content similarity between the second image and the first image is determined; if the content similarity is lower than the first threshold, continuous capture of second images is stopped, the second image is taken as the new first image, and the reading recognition function is called again to recognize the new first image. At the same time, new second images of the recognition object are captured continuously, and the step of determining the content similarity between the new second image and the first image, together with the subsequent steps, is repeated.
  • determining content similarity between the second image and the first image including:
  • Content similarity between the second image and the first image is determined according to the number of second feature points matching the first feature point.
  • the method further includes:
  • the reading recognition function is called to determine the number of characters and the character confidence in the second image
  • if the number of characters in the second image is less than the second threshold and/or the character confidence in the second image is less than the third threshold, determining that recognition is complete, stopping the continuous capture of second images, and ending the image recognition method.
  • the corresponding image recognition function of the first image in the associated coarse scene is a face recognition function in the medium focus scene
  • the image recognition function corresponding to the first image in the associated coarse scene is called, and after the first image is identified, the method further includes:
  • if the detection result is that there is no human body and no face, or that the human body size is smaller than the fourth threshold, or that the face size is smaller than the fifth threshold, recognition is determined to be complete, continuous capture of second images is stopped, and the image recognition method ends.
  • embodiments of the present application can be provided as a method, system, or computer program product.
  • the present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment in combination of software and hardware.
  • the application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction device.
  • the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing.
  • the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)

Abstract

Provided are an image recognition method and system, an electronic device, and a computer program product, which are applied in the technical field of image recognition. The method comprises: capturing, at a medium focal length, an image of the scene in which an object to be recognized is located; determining a recognition focal length according to the image of the scene; and recognizing the object to be recognized at the recognition focal length. The recognition focal length is determined dynamically based on the scene in which the object to be recognized is located and is then used to recognize that object. The environment is thus recognized automatically without intervention or input by the user, and a suitable focal length is selected based on the environment to achieve the best capture result, which improves recognition accuracy and greatly improves convenience of life for the blind.

Description

Image recognition method, system, electronic device, and computer program product
Technical field
The present application relates to the field of image recognition technologies, and in particular to an image recognition method, system, electronic device, and computer program product.
Background
China has the largest blind population of any country in the world. As a special group in society, blind people live their whole lives in boundless darkness and therefore often encounter all kinds of difficulties.
Camera-based intelligent image recognition can make life more convenient for blind people, and the quality of the captured image is critical to the subsequent recognition functions. A fixed-focus camera can capture sharp images only within a certain depth-of-field range, which limits its applicability. An autofocus camera often focuses inaccurately when the user does not intervene, so the captured images cannot support subsequent fine-grained image recognition.
Summary of the invention
Embodiments of the present application provide an image recognition method, system, electronic device, and computer program product.
In a first aspect, an embodiment of the present application provides an image recognition method, the method comprising:
capturing, at a medium focal length, a scene image of the scene in which a recognition object is located;
determining a recognition focal length according to the scene image;
recognizing the recognition object at the recognition focal length.
In a second aspect, an embodiment of the present application provides an electronic device, the electronic device comprising:
a memory and one or more processors, wherein the memory is connected to the processor through a communication bus, the processor is configured to execute instructions in the memory, and the storage medium stores instructions for performing the steps of the method of the first aspect.
In a third aspect, an embodiment of the present application provides a computer program product for use in conjunction with an electronic device that includes a display, the computer program product comprising a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing the steps of the method of the first aspect.
In a fourth aspect, an embodiment of the present application provides an image recognition system, the image recognition system comprising an image acquisition unit and a mobile computing processing unit;
the image acquisition unit is a camera with a controllable focal length, the controllable range of which includes a telephoto focal length, a medium focal length, and a short focal length; or
the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera;
the mobile computing processing unit is the electronic device of the second aspect;
the mobile computing processing unit is connected to the image acquisition unit through a Universal Serial Bus (USB) or wirelessly.
The beneficial effects are as follows:
In the embodiments of the present application, the scene image of the scene in which the recognition object is located is captured at a medium focal length, the recognition focal length is determined according to the scene image, and the recognition object is recognized at the recognition focal length. The environment is thus recognized automatically without user intervention or input, and a focal length suited to the environment is selected to achieve the best capture result, which improves recognition accuracy and greatly improves convenience of life for blind people.
Brief description of the drawings
Specific embodiments of the present application are described below with reference to the accompanying drawings, in which:
FIG. 1 is a schematic structural diagram of an electronic device in an embodiment of the present application;
FIG. 2 is a schematic flowchart of an image recognition method in an embodiment of the present application;
FIG. 3 is a schematic diagram of the implementation approach of an image recognition method in an embodiment of the present application.
Detailed description
To make the technical solutions and advantages of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not an exhaustive list of all embodiments. Where no conflict arises, the embodiments in the present application and the features in the embodiments may be combined with one another.
Camera-based intelligent image recognition can make life more convenient for blind people, and the quality of the captured image is critical to the subsequent recognition functions. A fixed-focus camera can capture sharp images only within a certain depth-of-field range, which limits its applicability. An autofocus camera often focuses inaccurately when the user does not intervene, so the captured images cannot support subsequent fine-grained image recognition.
To improve image capture quality and make life more convenient for blind people, an embodiment of the present application provides an image recognition method. The scene image of the scene in which the recognition object is located is captured at a medium focal length, the recognition focal length is determined according to the scene image, and the recognition object is recognized at the recognition focal length. The environment is thus recognized automatically without user intervention or input, and a focal length suited to the environment is selected to achieve the best capture result, which improves recognition accuracy and greatly improves convenience of life for blind people.
The image recognition method provided by the present application can be used in the following image recognition system. The image recognition system includes an image acquisition unit and a mobile computing processing unit.
1. Image acquisition unit
The image acquisition unit is used to capture images at the current focal length. The focal length of the image acquisition unit in the present application can be controlled and changed by the mobile computing processing unit, and its adjustable range is wide: for example, it can sharply capture anything from text a few centimeters away to traffic lights or traffic signs tens of meters away. The image acquisition unit in the present application can therefore be implemented in several forms.
For example, the image acquisition unit is a camera with a controllable focal length, whose controllable range includes a telephoto focal length, a medium focal length, and a short focal length. In this case the number of cameras is not limited, provided there is at least one camera with a controllable focal length.
As another example, the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera.
In practical applications, the image acquisition unit may be mounted on wearable glasses, such as guide glasses for the blind.
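The two forms of image acquisition unit described above can be sketched behind one common interface. This is only an illustrative sketch: the class names `ZoomCameraUnit` and `FixedCameraArray`, the `select` method, and the camera names are assumptions made for the example, not identifiers from the application.

```python
from dataclasses import dataclass
from enum import Enum


class FocalRange(Enum):
    """The three focal ranges named in the description."""
    SHORT = "short"
    MEDIUM = "medium"
    TELEPHOTO = "telephoto"


@dataclass
class ZoomCameraUnit:
    """Form 1 (hypothetical class): one camera with a controllable focal length."""
    current: FocalRange = FocalRange.MEDIUM

    def select(self, focal_range: FocalRange) -> str:
        # The processing unit changes the focal length of the single camera.
        self.current = focal_range
        return f"zoom camera set to {focal_range.value}"


@dataclass
class FixedCameraArray:
    """Form 2 (hypothetical class): three or more fixed-focus cameras."""
    cameras: dict  # maps FocalRange -> list of camera names

    def __post_init__(self):
        # The description requires at least one camera per focal range.
        missing = [r.value for r in FocalRange if not self.cameras.get(r)]
        if missing:
            raise ValueError(f"no fixed-focus camera for: {missing}")

    def select(self, focal_range: FocalRange) -> str:
        # The processing unit picks the camera that matches the focal range.
        return f"using fixed camera {self.cameras[focal_range][0]}"
```

Because both classes expose the same `select` call, the rest of the pipeline would not need to know which hardware form is in use; that design choice is an assumption of this sketch.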
2. Mobile computing processing unit
The mobile computing processing unit can be connected to the image acquisition unit through USB (Universal Serial Bus) or wirelessly (for example, by Bluetooth).
The mobile computing processing unit is responsible for controlling the focal length of the image acquisition unit, image capture, coarse scene classification, the specific image recognition, and voice broadcast output.
For example, the mobile computing processing unit can set the focal length of the image acquisition unit to the medium focal length and capture, through the image acquisition unit at that focal length, the scene image of the scene in which the recognition object is located.
As another example, the mobile computing processing unit can also set the focal length of the image acquisition unit to the recognition focal length and capture, through the image acquisition unit at that focal length, the first image of the recognition object.
As another example, the mobile computing processing unit can also set the focal length of the image acquisition unit to the medium focal length after determining that recognition is complete.
In a specific application, the mobile computing processing unit may be the electronic device shown in FIG. 1, which may be a general-purpose smartphone. The electronic device includes a memory 101, one or more processors 102, and a transceiver component 103. The memory, the processor, and the transceiver component 103 are connected by a communication bus (described in the embodiments of the present application as an I/O bus). The storage medium stores instructions for performing the steps of the image recognition method shown in FIG. 2, so that the scene image of the scene in which the recognition object is located is captured at the medium focal length, the recognition focal length is determined according to the scene image, and the recognition object is recognized at the recognition focal length. The environment is thus recognized automatically without user intervention or input, and a focal length suited to the environment is selected to achieve the best capture result, which improves recognition accuracy and greatly improves convenience of life for blind people.
It is easy to see that, in a specific implementation, the electronic device does not necessarily need to include the transceiver component 103 in order to achieve the basic purpose of the present application.
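The control responsibilities listed above (set the medium focal length, capture the scene image, classify, switch to the recognition focal length, capture the first image, recognize, and return to the medium focal length) can be sketched as a single pass. The function `recognize_once`, the `FakeCamera` stand-in, and the string focal-length labels are hypothetical and exist only to make the sketch runnable; the classifier and recognizer are passed in as plain callables.

```python
def recognize_once(camera, classify_scene, focal_for_scene, recognize):
    """Run one pass of the method and return the recognition result."""
    camera.set_focal("medium")                        # capture the scene image at medium focal length
    scene_image = camera.capture()
    coarse_scene = classify_scene(scene_image)        # coarse scene classification
    camera.set_focal(focal_for_scene[coarse_scene])   # switch to the recognition focal length
    first_image = camera.capture()
    result = recognize(coarse_scene, first_image)     # the specific image recognition
    camera.set_focal("medium")                        # medium focal length again once recognition is complete
    return result


class FakeCamera:
    """Stand-in camera that records focal-length changes (illustration only)."""

    def __init__(self):
        self.focal = None
        self.history = []

    def set_focal(self, focal):
        self.focal = focal
        self.history.append(focal)

    def capture(self):
        # A real unit would return pixel data; here the "image" only notes the focal length.
        return {"focal": self.focal}


camera = FakeCamera()
result = recognize_once(
    camera,
    classify_scene=lambda image: "telephoto",         # stub classifier
    focal_for_scene={"telephoto": "telephoto", "medium": "medium", "short": "short"},
    recognize=lambda scene, image: (scene, image["focal"]),
)
```

In this stubbed run the camera's focal-length history is medium, telephoto, medium, which mirrors the order of operations the description assigns to the mobile computing processing unit.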
Referring to FIG. 2, the image recognition method provided by this embodiment includes:
201. Capture, at the medium focal length, the scene image of the scene in which the recognition object is located.
201-1. The mobile computing processing unit establishes a connection with the image acquisition unit through USB or a wireless communication method such as Bluetooth.
201-2. Over this connection, the mobile computing processing unit captures, at the medium focal length, the scene image of the scene in which the recognition object is located.
Specifically:
1) If the image acquisition unit is a camera with a controllable focal length, then
(1) The mobile computing processing unit adjusts, over the connection, the focal length of the image acquisition unit to the intermediate distance; the current focal length of the image acquisition unit is then the medium focal length.
(2) The image acquisition unit captures, at the current focal length, the scene image of the scene in which the recognition object is located and sends it to the mobile computing processing unit, so that the mobile computing processing unit obtains the scene image through an image acquisition unit whose focal length is the medium focal length.
2) If the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera, then
(1) The mobile computing processing unit selects, over the connection, the medium-focus camera in the image acquisition unit.
(2) The selected medium-focus camera captures the scene image of the scene in which the recognition object is located and sends it to the mobile computing processing unit, so that the mobile computing processing unit obtains the scene image through an image acquisition unit whose focal length is the medium focal length.
202. Determine the recognition focal length according to the scene image.
After the scene image is captured in step 201, it is coarsely classified to obtain the coarse scene to which it belongs. The recognition focal length to use for the actual recognition is determined from this coarse scene, so that the lens can be adjusted to that focal length, an image captured there, and the corresponding image recognition function called.
Step 202 is implemented as follows:
202-1. Coarsely classify the scene image with the scene coarse classification model to determine the coarse scene to which the scene image belongs.
The coarse scene is a telephoto scene, a medium-focus scene, or a short-focus scene. Each coarse scene corresponds to one or more image recognition functions. Different coarse scenes correspond to different image recognition functions, and the number of corresponding functions may be the same or different.
For example, when the coarse scene is a telephoto scene, it corresponds to only one image recognition function: the traffic light recognition function.
As another example, when the coarse scene is a short-focus scene, it corresponds to two image recognition functions: the reading recognition function and the item recognition function.
As another example, when the coarse scene is a medium-focus scene, it corresponds to multiple image recognition functions, one of which is the face recognition function.
In addition to the telephoto, medium-focus, and short-focus scenes, further coarse scenes may be added in this embodiment as circumstances require. The number of image recognition functions corresponding to each coarse scene, and the functions themselves, may likewise be adjusted. This embodiment does not limit the specific categories of coarse scene, the number or nature of the image recognition functions corresponding to each coarse scene, or the timing and form of any adjustment to those categories, numbers, and functions.
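The correspondence between coarse scenes and image recognition functions, and the fact that both may be extended, can be illustrated with a small registry. This is a sketch under assumptions: the dictionary `scene_functions`, the helper `register`, and the function-name strings are hypothetical, and only the functions named in the description are listed (the medium-focus scene has further functions that the text does not name).

```python
# Coarse scene -> names of the image recognition functions it corresponds to.
scene_functions = {
    "telephoto": ["traffic_light_recognition"],
    "short": ["reading_recognition", "item_recognition"],
    # The description names only face recognition for the medium-focus scene.
    "medium": ["face_recognition"],
}


def register(scene, function_name):
    """Attach a recognition function to a coarse scene, creating the scene if needed."""
    functions = scene_functions.setdefault(scene, [])
    if function_name not in functions:
        functions.append(function_name)
    return functions


# Adding a function to an existing scene, and adding a new coarse scene.
# Both names below are invented for the example.
register("medium", "hypothetical_extra_function")
register("hypothetical_scene", "hypothetical_function")
```

Keeping the mapping as data rather than code is one way to allow the categories and functions to be adjusted later without touching the recognition logic.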
In addition, the scene coarse classification model is obtained by deep learning on samples of the telephoto scene, samples of the medium-focus scene, and samples of the short-focus scene.
Specifically:
1) Image samples for the three scenes are collected with the medium-focus lens. A telephoto scene is typically a traffic light recognition scene, so images of such scenes can be collected as samples of that scene category. A medium-focus scene is typically a face recognition scene, so images in which a pedestrian is close and facing the camera can be collected as samples of that scene category. A short-focus scene is typically an OCR (Optical Character Recognition) scene, so images of scenes such as reading a book can be collected as samples of that scene category.
2) Training is performed with a CNN (Convolutional Neural Network), for example a ResNet network.
Once training is complete, the trained scene coarse classification model is obtained. In step 202-1 the trained model and its weights can be used to classify the scene image, and the coarse scene is decided according to the magnitude of the output probabilities. In this way the scene image captured in step 201 is coarsely classified by a CNN based on deep learning.
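Deciding the coarse scene from the magnitude of the output probabilities amounts to taking the most probable class. The sketch below assumes the trained network emits one raw score (logit) per class in the order telephoto, medium, short; that ordering, the softmax normalization, and the function names are assumptions of the example, since the description does not specify the output layer.

```python
import math

# Assumed class order of the network output.
COARSE_SCENES = ("telephoto", "medium", "short")


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)                      # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_coarse_scene(logits):
    """Return the most probable coarse scene and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return COARSE_SCENES[best], probs[best]


# Example scores in which the third class clearly dominates.
scene, probability = pick_coarse_scene([0.2, 0.1, 2.5])
```

With these example scores the third class has the largest probability, so the scene image would be treated as a short-focus scene.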
202-2. Determine the focal length corresponding to the coarse scene as the recognition focal length.
At this point the recognition focal length to be used for the actual recognition of the recognition object has been obtained. The focal length is thus changed dynamically according to the coarse scene, which improves the quality of the images captured for subsequent recognition, and in turn improves recognition accuracy and greatly improves convenience of life for blind people.
203. Recognize the recognition object at the recognition focal length.
203-1. Capture the first image of the recognition object at the recognition focal length.
The mobile computing processing unit sets the focal length of the image acquisition unit to the recognition focal length and captures the first image of the recognition object through the image acquisition unit at that focal length.
Specifically:
1) If the image acquisition unit is a camera with a controllable focal length, then
(1) The mobile computing processing unit adjusts, over the connection established with the image acquisition unit, the focal length of the image acquisition unit to the recognition focal length; the current focal length of the image acquisition unit is then the recognition focal length.
(2) The image acquisition unit captures, at the current focal length, an image of the scene in which the recognition object is located and sends this image to the mobile computing processing unit, which thereby obtains the first image of the recognition object through an image acquisition unit whose focal length is the recognition focal length.
2) If the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera, then
(1) The mobile computing processing unit selects, over the connection established with the image acquisition unit, the camera in the image acquisition unit that corresponds to the recognition focal length.
(2) The selected camera captures an image of the scene in which the recognition object is located and sends this image to the mobile computing processing unit, which thereby obtains the first image of the recognition object through an image acquisition unit whose focal length is the recognition focal length.
203-2. If the coarse scene corresponds to one image recognition function, call the image recognition function corresponding to the coarse scene to recognize the first image.
If there is only one image recognition function under the coarse scene (for example, the coarse scene is a telephoto scene, whose only image recognition function is the traffic light recognition function), the traffic light recognition function can be called directly.
The traffic light recognition function is called and, according to the first image, the recognition object is determined to be a red light, a green light, or a yellow light; alternatively, the traffic light recognition function is called and, according to the first image, the recognition object is determined to be a red light, a green light, a yellow light, or not a red, green, or yellow light.
For example, in the traffic light recognition function, the first image is classified by a deep neural network to distinguish red, green, and yellow lights, or to distinguish red lights, green lights, yellow lights, and lights that are not red, green, or yellow.
For the specific implementation of this approach, refer to the coarse classification of the scene image in step 202-1.
As another example, the traffic light recognition function uses object detection to detect three kinds of target in the first image (red, green, and yellow lights), or four kinds of target (red lights, green lights, yellow lights, and lights that are not red, green, or yellow).
The object detection methods include, but are not limited to, detection based on the SSD (Single Shot MultiBox Detector) object detection model.
203-3: If the assigned coarse scene corresponds to multiple image recognition functions, the first image is finely classified by a scene fine-classification model to determine which image recognition function of the assigned coarse scene applies to the first image; that image recognition function is then invoked to recognize the first image.
If the assigned coarse scene corresponds to several subdivided image recognition functions, fine scene classification can be performed first, and the specific image recognition function invoked afterwards.
For example, if the assigned coarse scene is the short-focus scene, it corresponds to two image recognition functions: the reading recognition function and the item recognition function. In this case, a convolutional neural network (CNN) can first be used to decide, based on what the user is holding, whether this is an OCR scene or an object recognition scene (for the specific implementation, refer to the coarse classification of the scene image in step 202-1). If it is an OCR scene, the reading recognition function is invoked; otherwise, the item recognition function is invoked.
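The dispatch described in steps 203-2 and 203-3 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the scene names, the function names, the `SCENE_FUNCTIONS` and `FINE_LABEL_TO_FUNCTION` tables, and the `fine_classifier` callback are all assumptions introduced for the example.

```python
from typing import Callable, Dict, List

# Hypothetical table: coarse scene -> image recognition functions available in it.
SCENE_FUNCTIONS: Dict[str, List[str]] = {
    "telephoto": ["traffic_light"],
    "medium": ["face"],
    "short": ["reading", "item"],
}

# Hypothetical table: fine-classification label -> recognition function.
FINE_LABEL_TO_FUNCTION: Dict[str, str] = {
    "book": "reading",
    "periodical": "reading",
    "other_item": "item",
}


def select_function(
    coarse_scene: str,
    first_image: object,
    fine_classifier: Callable[[object], str],
) -> str:
    """Return the name of the recognition function to invoke for first_image."""
    functions = SCENE_FUNCTIONS[coarse_scene]
    if len(functions) == 1:
        # Step 203-2: only one function, so no fine classification is needed.
        return functions[0]
    # Step 203-3: several functions, so classify the first image more finely.
    label = fine_classifier(first_image)
    return FINE_LABEL_TO_FUNCTION[label]
```

With a single-function scene the fine classifier is never called, which mirrors the direct invocation described for the telephoto scene.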
Specifically:
1. The first image is recognized by the scene fine-classification model to determine whether the recognition object is a book, a newspaper or periodical, or an item other than a book or periodical.
Here, newspapers and periodicals include published material such as newspapers and magazines.
2. If the recognition object is a book, or a newspaper or periodical, the image recognition function of the assigned coarse scene that applies to the first image is determined to be the reading recognition function, and the reading recognition function is invoked to recognize the first image, that is, OCR is performed.
After OCR, the recognition result can also be output as a voice announcement.
In practice, the process of invoking the reading recognition function to recognize the first image can be further optimized as follows:
After one OCR result has been output, images of the recognition object are acquired again and analyzed to decide whether recognition needs to be repeated. A blind user may shift the book up, down, left, right, forward, or backward while reading; in that case no new recognition is needed. OCR needs to be repeated only when the user turns the page. This avoids restarting the announcement from the beginning and improves the user experience.
The specific implementation is as follows:
While the reading recognition function is invoked to recognize the first image, second images of the recognition object are acquired continuously. Each time a second image is acquired, the content similarity between that second image and the first image is determined. If the content similarity is below a first threshold, continuous acquisition of second images is stopped and OCR is performed again: that second image becomes the new first image, the reading recognition function is invoked again to recognize the new first image, and, at the same time, new second images of the recognition object are acquired continuously and compared with the new first image in the same way. This cycle repeats until recognition of the recognition object is complete.
For example, the image used for the first OCR pass is saved, and subsequent frames are continuously acquired and compared with it for content similarity. If the similarity is below a set threshold, the user is assumed to have turned the page and OCR must be performed again; otherwise the user is assumed to be moving the current page and no new OCR is needed.
The content similarity between a second image and the first image can be determined by feature point matching.
Specifically, second feature points are extracted from the second image and first feature points are extracted from the first image; the content similarity between the second image and the first image is determined from the number of second feature points that match first feature points.
For example, ORB or SIFT feature points are extracted from the first frame and from the current frame and matched; the decision is based on the number of successful matches, where more matched points indicate higher similarity.
3. If the recognition object is an item other than a book or periodical, the image recognition function of the assigned coarse scene that applies to the first image is determined to be the item recognition function, and the item recognition function is invoked to recognize the first image.
At this point, precise recognition of the recognition object is complete.
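The page-turn check described above for the reading recognition function can be sketched as follows. The sketch assumes that each feature point has already been reduced to a binary descriptor (as an ORB extractor would produce), represented here as a Python integer; the names `count_matches` and `page_turned`, the Hamming-distance matching rule, and the `max_distance` parameter are illustrative assumptions, and the match count stands in for the content similarity that is compared against the first threshold.

```python
from typing import Sequence


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def count_matches(
    first_desc: Sequence[int],
    second_desc: Sequence[int],
    max_distance: int = 10,
) -> int:
    """Count second-image descriptors whose nearest first-image descriptor
    lies within max_distance bits (brute-force nearest-neighbour matching)."""
    if not first_desc:
        return 0
    return sum(
        1
        for d in second_desc
        if min(hamming(d, f) for f in first_desc) <= max_distance
    )


def page_turned(
    first_desc: Sequence[int],
    second_desc: Sequence[int],
    min_matches: int,
    max_distance: int = 10,
) -> bool:
    """True when similarity (the match count) falls below the threshold,
    meaning OCR should be run again on the second image."""
    return count_matches(first_desc, second_desc, max_distance) < min_matches
```

When `page_turned` returns `True`, the second image would become the new first image and its descriptors would replace `first_desc` for subsequent comparisons.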
After recognition, the image recognition method of this embodiment also determines whether use of the image recognition function for the scene has ended, that is, whether recognition is complete. If recognition is complete, the mobile computing processing unit readjusts the focal length of the image acquisition unit (for example, a camera) to the medium-focus position, that is, sets the focal length of the image acquisition unit to the medium focal length, and enters the next work cycle.
Specifically:
1) If the image acquisition unit is a camera with a controllable focal length, the mobile computing processing unit, through its connection with the image acquisition unit, adjusts the focal length of the image acquisition unit to the intermediate distance. The current focal length of the image acquisition unit is then the medium focal length, which completes the adjustment to the medium-focus position.
2) If the image acquisition unit consists of three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera, the mobile computing processing unit, through its connection with the image acquisition unit, selects the medium-focus camera as the default camera for the next capture, which completes the adjustment to the medium-focus position.
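The two reset cases above can be sketched as follows. The classes `ZoomCamera` and `FixedCameraArray`, their methods, and the camera identifiers are hypothetical stand-ins for whatever interface the image acquisition unit actually exposes.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class ZoomCamera:
    """Case 1: a single camera whose focal length can be controlled."""
    focal_length: str = "medium"

    def set_focal_length(self, name: str) -> None:
        self.focal_length = name


@dataclass
class FixedCameraArray:
    """Case 2: several fixed-focus cameras, keyed by focal class."""
    cameras: Dict[str, str]  # focal class -> camera identifier
    active: str = "medium"

    def select(self, name: str) -> None:
        if name not in self.cameras:
            raise KeyError(f"no camera for focal class {name!r}")
        self.active = name


def reset_to_medium(unit: Union[ZoomCamera, FixedCameraArray]) -> None:
    """Return the image acquisition unit to the medium-focus default."""
    if isinstance(unit, ZoomCamera):
        unit.set_focal_length("medium")
    else:
        unit.select("medium")
```

Either way, the next work cycle starts from a medium-focus capture, so the caller does not need to know which kind of unit is attached.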
In addition, ways of determining whether recognition is complete include, but are not limited to: continuously acquiring and processing images, and deciding from the output of the image recognition function for the scene whether the current recognition function has ended.
For example, for the traffic light recognition function in the telephoto scene, the function is considered to have ended if the fourth class (non-red/green/yellow light) has the highest probability.
Specifically, after the traffic light recognition function has been invoked to determine from the first image whether the recognition object is a red light, a green light, a yellow light, or a non-red/green/yellow light: if the first image indicates a non-red/green/yellow light, recognition is determined to be complete and the image recognition method ends. If the first image indicates a red, green, or yellow light, second images of the recognition object are acquired continuously. Each time a second image is acquired, the traffic light recognition function is invoked to classify it as a red light, a green light, a yellow light, or a non-red/green/yellow light; if it is classified as a non-red/green/yellow light, recognition is determined to be complete, continuous acquisition of second images is stopped, and the image recognition method ends.
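The termination rule for traffic light recognition can be sketched as follows. The class order in `CLASSES`, the `classify` callback (assumed to return four class probabilities per frame), and the function names are assumptions made for the example.

```python
from typing import Callable, Iterable, List, Sequence

# Assumed class order; "none" stands for the non-red/green/yellow class.
CLASSES = ("red", "green", "yellow", "none")


def argmax_class(probs: Sequence[float]) -> str:
    """Return the class with the highest probability."""
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    return CLASSES[best]


def track_traffic_light(
    frames: Iterable[object],
    classify: Callable[[object], Sequence[float]],
) -> List[str]:
    """Classify frames in order and stop at the first frame whose most
    probable class is "none"; return the light states seen before that."""
    states: List[str] = []
    for frame in frames:
        label = argmax_class(classify(frame))
        if label == "none":
            break  # recognition complete: stop acquiring further frames
        states.append(label)
    return states
```

The first element of `frames` plays the role of the first image and the remaining elements play the role of the continuously acquired second images.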
As another example, for the reading recognition function in the short-focus scene, the decision can be based on the number of OCR characters output and their confidence: if there are very few characters or the character confidence is very low, the reading recognition function is considered to have ended.
Specifically, after second images of the recognition object have been acquired continuously, each time a second image is acquired the reading recognition function is also invoked to determine the number of characters in that second image and the character confidence. If the number of characters in that second image is less than a second threshold and/or the character confidence in that second image is less than a third threshold, recognition is determined to be complete, continuous acquisition of second images is stopped, and the image recognition method ends.
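The reading termination check can be sketched as follows. The text does not say how per-character confidences are aggregated, so the mean used here is an assumption, as are the function name and the `(character, confidence)` input format; the "and/or" condition is implemented as a logical OR.

```python
from typing import List, Tuple


def reading_finished(
    ocr_chars: List[Tuple[str, float]],
    min_chars: int,
    min_confidence: float,
) -> bool:
    """Decide whether reading recognition has ended for one second image.

    ocr_chars holds (character, confidence) pairs from the OCR pass.
    min_chars corresponds to the second threshold and min_confidence to
    the third threshold.
    """
    if not ocr_chars or len(ocr_chars) < min_chars:
        return True  # too few characters: the page is probably out of view
    # Assumption: aggregate character confidence as the arithmetic mean.
    mean_confidence = sum(conf for _, conf in ocr_chars) / len(ocr_chars)
    return mean_confidence < min_confidence
```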
As another example, for the face recognition function in the medium-focus scene, human body or face detection can be performed to determine whether the scene contains a human body or face of at least a certain size; if not, the face recognition function is considered to have ended.
Specifically, after the image recognition function of the assigned coarse scene that applies to the first image has been invoked to recognize the first image, second images of the recognition object are acquired continuously. Each time a second image is acquired, human body or face detection is performed on it. If neither a human body nor a face is detected, or the detected human body is smaller than a fourth threshold, or the detected face is smaller than a fifth threshold, recognition is determined to be complete, continuous acquisition of second images is stopped, and the image recognition method ends.
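One possible reading of this check is sketched below. It assumes that detector output has been reduced to a list of pixel heights per detection and that recognition continues as long as at least one body or face reaches its threshold; the handling of multiple detections, the use of height as the size measure, and all names are assumptions.

```python
from typing import Sequence


def face_recognition_finished(
    body_heights: Sequence[int],
    face_heights: Sequence[int],
    min_body_height: int,
    min_face_height: int,
) -> bool:
    """Return True when no detected body or face is large enough.

    min_body_height corresponds to the fourth threshold and
    min_face_height to the fifth threshold.
    """
    body_ok = any(h >= min_body_height for h in body_heights)
    face_ok = any(h >= min_face_height for h in face_heights)
    # With no detections at all, both any() calls are False, so this
    # also covers the "no human body and no face" case.
    return not (body_ok or face_ok)
```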
At this point, the image recognition method of this embodiment is complete. The basic idea of the method is shown in FIG. 3. The image recognition functions needed by blind users are divided by working distance into telephoto (long distance), medium-focus (intermediate distance), and short-focus (close distance) usage scenes; for example, traffic light recognition requires telephoto capture, face recognition requires medium-focus capture, and OCR requires short-focus capture. When the image recognition system is running, the mobile computing processing unit first sets the focal length of the image acquisition unit (for example, a camera with a controllable focal length) to the medium-focus position and captures an image of the scene in which the recognition object is located. Images of short-focus and telephoto scenes captured this way are not sharp enough, or the target region lacks resolution, but they are sufficient for a rough classification that decides whether the scene calls for telephoto, short-focus, or medium-focus capture. The recognition focal length is then determined from the coarse scene type, the camera lens is adjusted to that focal length, an image of the recognition object is captured at the recognition focal length, and the image recognition function for the scene is invoked. If the scene has several subdivided functions, fine scene classification is performed first and the specific image recognition function is invoked afterwards. Finally, the recognition result can be fed back to the blind user by voice announcement.
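The overall work cycle summarized above can be sketched as follows. The callbacks `capture`, `coarse_classifier`, `recognizers`, and `announce` are hypothetical interfaces supplied by the caller, and the sketch assumes that each coarse scene name doubles as the name of its recognition focal length.

```python
from typing import Callable, Dict


def recognize_once(
    capture: Callable[[str], object],
    coarse_classifier: Callable[[object], str],
    recognizers: Dict[str, Callable[[object], str]],
    announce: Callable[[str], None],
) -> str:
    """Run one cycle: medium-focus capture, coarse classification,
    capture at the recognition focal length, recognition, announcement."""
    scene_image = capture("medium")               # capture at medium focus
    coarse_scene = coarse_classifier(scene_image)  # telephoto / medium / short
    recognition_focal_length = coarse_scene        # assumption: one focal length per scene
    first_image = capture(recognition_focal_length)
    result = recognizers[coarse_scene](first_image)
    announce(result)                               # e.g. voice announcement
    return result
```

Fine classification inside a scene with several functions would live inside the corresponding entry of `recognizers`.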
Beneficial effects:
In the embodiments of the present application, a scene image of the scene in which the recognition object is located is acquired at the medium focal length, a recognition focal length is determined from the scene image, and the recognition object is recognized at the recognition focal length. The environment is thus recognized automatically, without user intervention or input, and a focal length suited to the environment is selected to obtain the best capture, which improves recognition accuracy and makes daily life considerably more convenient for blind users.
In another aspect, an embodiment of the present application further provides a computer program product for use with an electronic device that includes a display. The computer program product comprises a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing the following steps:
acquiring, at a medium focal length, a scene image of the scene in which a recognition object is located;
determining a recognition focal length according to the scene image;
recognizing the recognition object at the recognition focal length.
Optionally, determining the recognition focal length according to the scene image comprises:
coarsely classifying the scene image by a scene coarse-classification model to determine the coarse scene to which the scene image belongs (the assigned coarse scene);
determining the focal length corresponding to the assigned coarse scene as the recognition focal length;
wherein a coarse scene corresponds to one or more image recognition functions;
the coarse scene is a telephoto scene, a medium-focus scene, or a short-focus scene;
the scene coarse-classification model is obtained by deep learning on samples of telephoto scenes, samples of medium-focus scenes, and samples of short-focus scenes.
Optionally, recognizing the recognition object at the recognition focal length comprises:
acquiring a first image of the recognition object at the recognition focal length;
if the assigned coarse scene corresponds to a single image recognition function, invoking that image recognition function to recognize the first image;
if the assigned coarse scene corresponds to multiple image recognition functions, finely classifying the first image by a scene fine-classification model to determine which image recognition function of the assigned coarse scene applies to the first image, and invoking that image recognition function to recognize the first image.
Optionally, the assigned coarse scene is the telephoto scene, and the image recognition function corresponding to the telephoto scene is the traffic light recognition function;
invoking the image recognition function corresponding to the assigned coarse scene to recognize the first image comprises:
invoking the traffic light recognition function to determine, from the first image, whether the recognition object is a red light, a green light, or a yellow light; or
invoking the traffic light recognition function to determine, from the first image, whether the recognition object is a red light, a green light, a yellow light, or a non-red/green/yellow light.
Optionally, after the traffic light recognition function is invoked to determine from the first image whether the recognition object is a red light, a green light, a yellow light, or a non-red/green/yellow light, the steps further include:
if the first image indicates that the recognition object is a non-red/green/yellow light, determining that recognition is complete and ending the image recognition method;
if the first image indicates that the recognition object is a red light, a green light, or a yellow light, continuously acquiring second images of the recognition object; each time a second image is acquired, invoking the traffic light recognition function to determine from that second image whether the recognition object is a red light, a green light, a yellow light, or a non-red/green/yellow light; and, if that second image indicates a non-red/green/yellow light, determining that recognition is complete, stopping continuous acquisition of second images, and ending the image recognition method.
Optionally, the assigned coarse scene is the short-focus scene, and the image recognition functions corresponding to the short-focus scene are the reading recognition function and the item recognition function;
finely classifying the first image by the scene fine-classification model to determine which image recognition function of the assigned coarse scene applies to the first image comprises:
recognizing the first image by the scene fine-classification model to determine whether the recognition object is a book, a newspaper or periodical, or an item other than a book or periodical;
if the recognition object is a book, or a newspaper or periodical, determining that the applicable image recognition function is the reading recognition function;
if the recognition object is an item other than a book or periodical, determining that the applicable image recognition function is the item recognition function.
Optionally, the image recognition function of the assigned coarse scene that applies to the first image is the reading recognition function;
invoking that image recognition function to recognize the first image comprises:
invoking the reading recognition function to recognize the first image while continuously acquiring second images of the recognition object;
each time a second image is acquired, determining the content similarity between that second image and the first image; if the content similarity is below a first threshold, stopping continuous acquisition of second images, taking that second image as the new first image, invoking the reading recognition function again to recognize the new first image while continuously acquiring new second images of the recognition object, and determining the content similarity between each new second image and the new first image, followed by the subsequent steps.
Optionally, determining the content similarity between that second image and the first image comprises:
extracting second feature points from that second image and extracting first feature points from the first image;
determining the content similarity between that second image and the first image according to the number of second feature points that match first feature points.
Optionally, after continuously acquiring second images of the recognition object, the steps further include:
each time a second image is acquired, invoking the reading recognition function to determine the number of characters in that second image and the character confidence;
if the number of characters in that second image is less than a second threshold and/or the character confidence in that second image is less than a third threshold, determining that recognition is complete, stopping continuous acquisition of second images, and ending the image recognition method.
Optionally, the image recognition function of the assigned coarse scene that applies to the first image is the face recognition function of the medium-focus scene;
after that image recognition function is invoked to recognize the first image, the steps further include:
continuously acquiring second images of the recognition object;
each time a second image is acquired, performing human body or face detection on that second image;
if neither a human body nor a face is detected, or the detected human body is smaller than a fourth threshold, or the detected face is smaller than a fifth threshold, determining that recognition is complete, stopping continuous acquisition of second images, and ending the image recognition method.
Beneficial effects:
In the embodiments of the present application, a scene image of the scene in which the recognition object is located is acquired at the medium focal length, a recognition focal length is determined from the scene image, and the recognition object is recognized at the recognition focal length. The environment is thus recognized automatically, without user intervention or input, and a focal length suited to the environment is selected to obtain the best capture, which improves recognition accuracy and makes daily life considerably more convenient for blind users.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present application.

Claims (18)

  1. An image recognition method, characterized in that the method comprises:
    acquiring, at a medium focal length, a scene image of the scene in which a recognition object is located;
    determining a recognition focal length according to the scene image;
    recognizing the recognition object at the recognition focal length.
  2. The method according to claim 1, characterized in that determining the recognition focal length according to the scene image comprises:
    coarsely classifying the scene image by a scene coarse-classification model to determine the coarse scene to which the scene image belongs (the assigned coarse scene);
    determining the focal length corresponding to the assigned coarse scene as the recognition focal length;
    wherein a coarse scene corresponds to one or more image recognition functions;
    the coarse scene is a telephoto scene, a medium-focus scene, or a short-focus scene;
    the scene coarse-classification model is obtained by deep learning on samples of telephoto scenes, samples of medium-focus scenes, and samples of short-focus scenes.
  3. The method according to claim 2, characterized in that recognizing the recognition object using the recognition focal length comprises:
    capturing a first image of the recognition object at the recognition focal length;
    if the coarse scene corresponds to one image recognition function, invoking the image recognition function corresponding to the coarse scene to recognize the first image; and
    if the coarse scene corresponds to multiple image recognition functions, finely classifying the first image with a fine scene classification model to determine the image recognition function that corresponds to the first image within the coarse scene, and invoking that image recognition function to recognize the first image.
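
As an illustrative, non-limiting sketch, the flow of claims 1 to 3 can be written as follows. The `capture`, `coarse_classify`, and `fine_classify` callables, the focal-length labels, and the `SCENE_FUNCTIONS` registry are hypothetical names introduced only for this example; in particular, the single function listed for the medium-focus scene is an assumption, since the claims do not state how many functions that scene has.

```python
from typing import Callable, Dict, List

# Hypothetical registry: coarse scene -> image recognition functions available there.
SCENE_FUNCTIONS: Dict[str, List[str]] = {
    "telephoto": ["traffic_light"],   # one function (claim 4)
    "short": ["reading", "item"],     # several functions (claim 6)
    "medium": ["face"],               # assumed single function for this sketch
}


def recognize(capture: Callable[[str], object],
              coarse_classify: Callable[[object], str],
              fine_classify: Callable[[object, List[str]], str],
              recognizers: Dict[str, Callable[[object], str]]) -> str:
    """Run the claimed flow once and return the recognition result."""
    # Claim 1: the scene image is captured at the medium focal length.
    scene_image = capture("medium")
    # Claim 2: the coarse scene determines the recognition focal length.
    coarse_scene = coarse_classify(scene_image)
    # Claim 3: the first image is captured at the recognition focal length.
    first_image = capture(coarse_scene)
    functions = SCENE_FUNCTIONS[coarse_scene]
    if len(functions) == 1:
        chosen = functions[0]
    else:
        # Fine classification is needed only when several functions compete.
        chosen = fine_classify(first_image, functions)
    return recognizers[chosen](first_image)
```

Here the focal length is identified by the same label as the coarse scene, which keeps the mapping of claim 2 trivial; a real device would translate the label into a lens setting or a camera selection.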
  4. The method according to claim 3, characterized in that the coarse scene is a telephoto scene, and the image recognition function corresponding to the telephoto scene is a traffic light recognition function;
    invoking the image recognition function corresponding to the coarse scene to recognize the first image comprises:
    invoking the traffic light recognition function to determine, from the first image, that the recognition object is a red light, a green light, or a yellow light; or
    invoking the traffic light recognition function to determine, from the first image, that the recognition object is a red light, a green light, a yellow light, or not a red, green, or yellow light.
  5. The method according to claim 4, characterized in that, after invoking the traffic light recognition function to determine, from the first image, that the recognition object is a red light, a green light, a yellow light, or not a red, green, or yellow light, the method further comprises:
    if the recognition object is determined from the first image not to be a red, green, or yellow light, determining that recognition is complete and ending the image recognition method; and
    if the recognition object is determined from the first image to be a red light, a green light, or a yellow light, continuously capturing second images of the recognition object; each time a second image is captured, invoking the traffic light recognition function to determine, from that second image, whether the recognition object is a red light, a green light, a yellow light, or not a red, green, or yellow light; and if the recognition object is determined from that second image not to be a red, green, or yellow light, determining that recognition is complete, stopping the continuous capture of second images, and ending the image recognition method.
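
The loop of claim 5 can be sketched as below, purely for illustration. The `classify_light` callable and the state labels are hypothetical; any label outside `LIGHT_STATES` stands for "not a red, green, or yellow light".

```python
from typing import Callable, Iterable, List

LIGHT_STATES = {"red", "green", "yellow"}


def track_traffic_light(first_image: object,
                        second_images: Iterable[object],
                        classify_light: Callable[[object], str]) -> List[str]:
    """Return the light states observed until recognition completes."""
    history: List[str] = []
    state = classify_light(first_image)
    if state not in LIGHT_STATES:
        # Not a traffic light: recognition is complete immediately.
        return history
    history.append(state)
    # Continuous capture of second images, one classification per frame.
    for image in second_images:
        state = classify_light(image)
        if state not in LIGHT_STATES:
            # The light is no longer in view: stop capturing and finish.
            break
        history.append(state)
    return history
```

Passing the second images as a lazy iterable means no further frame is requested once the loop breaks, which mirrors "stopping the continuous capture".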
  6. The method according to claim 3, characterized in that the coarse scene is a short-focus scene, and the image recognition functions corresponding to the short-focus scene are a reading recognition function and an item recognition function;
    finely classifying the first image with the fine scene classification model to determine the image recognition function that corresponds to the first image within the coarse scene comprises:
    recognizing the first image with the fine scene classification model to determine that the recognition object is a book, a newspaper or periodical, or an item other than a book, newspaper, or periodical;
    if the recognition object is a book, or if the recognition object is a newspaper or periodical, determining that the image recognition function corresponding to the first image within the coarse scene is the reading recognition function; and
    if the recognition object is an item other than a book, newspaper, or periodical, determining that the image recognition function corresponding to the first image within the coarse scene is the item recognition function.
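
The routing rule of claim 6 reduces to a small lookup. The sketch below assumes the fine scene classification model emits string labels; the label and function names are hypothetical.

```python
# Hypothetical labels produced by the fine scene classification model.
READING_OBJECTS = {"book", "newspaper"}


def select_short_focus_function(object_class: str) -> str:
    """Map the fine classification result to a recognition function name."""
    # Books and newspapers go to reading recognition; everything else
    # in the short-focus scene goes to item recognition.
    return "reading" if object_class in READING_OBJECTS else "item"
```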
  7. The method according to claim 6, characterized in that the image recognition function corresponding to the first image within the coarse scene is the reading recognition function;
    invoking the image recognition function corresponding to the first image within the coarse scene to recognize the first image comprises:
    invoking the reading recognition function to recognize the first image while continuously capturing second images of the recognition object; and
    each time a second image is captured, determining the content similarity between that second image and the first image; if the content similarity is below a first threshold, stopping the continuous capture of second images, taking that second image as a new first image, and again performing the steps of invoking the reading recognition function to recognize the new first image while continuously capturing new second images of the recognition object, determining the content similarity between each new second image and the first image, and the subsequent steps.
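
One way to picture the restart logic of claim 7 is the sketch below. It is a simplification under stated assumptions: recognition and capture run sequentially rather than concurrently, and `read_text`, `similarity`, and `first_threshold` are hypothetical parameters supplied by the caller.

```python
from typing import Callable, Iterable, List


def read_pages(first_image: object,
               frames: Iterable[object],
               read_text: Callable[[object], str],
               similarity: Callable[[object, object], float],
               first_threshold: float) -> List[str]:
    """Recognize a page, then recognize again whenever the content changes."""
    texts = [read_text(first_image)]
    for frame in frames:
        # A low similarity suggests the page was turned or the text moved on.
        if similarity(frame, first_image) < first_threshold:
            first_image = frame  # the frame becomes the new first image
            texts.append(read_text(first_image))
    return texts
```

Frames that still resemble the current first image are skipped, so each distinct page is read once.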
  8. The method according to claim 7, characterized in that determining the content similarity between that second image and the first image comprises:
    extracting second feature points from that second image and first feature points from the first image; and
    determining the content similarity between that second image and the first image according to the number of second feature points that match first feature points.
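
The matching count of claim 8 could be computed along the following lines. This sketch assumes each feature point is represented by a binary descriptor stored as an integer and matched by Hamming distance; the descriptor format, the `max_distance` tolerance, and the normalization by the number of second feature points are all assumptions, since the claim specifies only that the similarity depends on the number of matches.

```python
from typing import Sequence


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def content_similarity(first_points: Sequence[int],
                       second_points: Sequence[int],
                       max_distance: int = 2) -> float:
    """Fraction of second feature points that match some first feature point."""
    if not first_points or not second_points:
        return 0.0
    matched = sum(
        1 for s in second_points
        if min(hamming(s, f) for f in first_points) <= max_distance
    )
    return matched / len(second_points)
```

The brute-force search is quadratic in the number of points, which is acceptable for a sketch but would normally be replaced by an indexed matcher.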
  9. The method according to claim 7 or 8, characterized in that, after continuously capturing second images of the recognition object, the method further comprises:
    each time a second image is captured, invoking the reading recognition function to determine the number of characters and the character confidence in that second image; and
    if the number of characters in that second image is less than a second threshold and/or the character confidence in that second image is less than a third threshold, determining that recognition is complete, stopping the continuous capture of second images, and ending the image recognition method.
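
The stop condition of claim 9 is a simple predicate. In the sketch below the parameter names follow the claim's "second" and "third" thresholds; the function name and the idea of a single scalar confidence per image are assumptions.

```python
def reading_finished(char_count: int, char_confidence: float,
                     second_threshold: int, third_threshold: float) -> bool:
    """True when the frame no longer appears to contain readable text."""
    # "and/or" in the claim is covered by a logical OR: either condition suffices.
    return char_count < second_threshold or char_confidence < third_threshold
```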
  10. The method according to claim 3, characterized in that the image recognition function corresponding to the first image within the coarse scene is a face recognition function in the medium-focus scene;
    after invoking the image recognition function corresponding to the first image within the coarse scene to recognize the first image, the method further comprises:
    continuously capturing second images of the recognition object;
    each time a second image is captured, detecting a human body or a human face in that second image; and
    if the detection result is that there is no human body and no face, or that the human body size is less than a fourth threshold, or that the face size is less than a fifth threshold, determining that recognition is complete, stopping the continuous capture of second images, and ending the image recognition method.
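
The termination test of claim 10 can be sketched as follows. It assumes the detector reports a size for each detected body or face and `None` when nothing is detected; the function name and the size unit are hypothetical.

```python
from typing import Optional


def face_session_finished(body_size: Optional[float],
                          face_size: Optional[float],
                          fourth_threshold: float,
                          fifth_threshold: float) -> bool:
    """True when the person has left the view or is too far away."""
    if body_size is None and face_size is None:
        return True  # neither a human body nor a face was detected
    if body_size is not None and body_size < fourth_threshold:
        return True  # the body appears too small
    if face_size is not None and face_size < fifth_threshold:
        return True  # the face appears too small
    return False
```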
  11. An electronic device, characterized in that the electronic device comprises:
    a memory and one or more processors, wherein the memory and the processor are connected by a communication bus; the processor is configured to execute instructions in the memory; and the storage medium stores instructions for performing the steps of the method according to any one of claims 1 to 10.
  12. The electronic device according to claim 11, characterized in that the electronic device is a smartphone.
  13. A computer program product for use in conjunction with an electronic device that includes a display, the computer program product comprising a computer-readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising instructions for performing the steps of the method according to any one of claims 1 to 10.
  14. An image recognition system, characterized in that the image recognition system comprises an image acquisition unit and a mobile computing processing unit;
    the image acquisition unit is a camera with a controllable focal length, wherein the controllable focal length range includes a telephoto focal length, a medium focal length, and a short focal length; or
    the image acquisition unit is three or more fixed-focus cameras, including at least one telephoto camera, at least one medium-focus camera, and at least one short-focus camera;
    the mobile computing processing unit is the electronic device according to claim 13 or 14;
    the mobile computing processing unit is connected to the image acquisition unit via a Universal Serial Bus (USB) or by wireless communication.
  15. The system according to claim 14, characterized in that the wireless communication is Bluetooth;
    the image acquisition unit is located on wearable glasses.
  16. The system according to claim 14 or 15, characterized in that
    the mobile computing processing unit is configured to set the focal length of the image acquisition unit to the medium focal length and to capture, through the image acquisition unit at the medium focal length, the scene image of the scene in which the recognition object is located.
  17. The system according to claim 16, characterized in that
    the mobile computing processing unit is further configured to set the focal length of the image acquisition unit to the recognition focal length and to capture the first image of the recognition object through the image acquisition unit at the recognition focal length.
  18. The system according to claim 17, characterized in that
    the mobile computing processing unit is further configured to set the focal length of the image acquisition unit to the medium focal length after determining that recognition is complete.
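
The focal-length control described in claims 16 to 18 amounts to a small state holder, sketched below for illustration only. `FocalController`, its method names, and the `set_focal_length` callback are hypothetical; the callback stands in for whatever USB or wireless command the image acquisition unit actually accepts.

```python
from typing import Callable, Optional


class FocalController:
    """Keeps the camera at the medium focal length between recognitions."""

    def __init__(self, set_focal_length: Callable[[str], None]) -> None:
        self._set = set_focal_length
        self.current: Optional[str] = None
        self.to_scene_mode()  # claim 16: start at the medium focal length

    def to_scene_mode(self) -> None:
        self._apply("medium")

    def to_recognition_mode(self, recognition_focal_length: str) -> None:
        # Claim 17: switch to the recognition focal length for the first image.
        self._apply(recognition_focal_length)

    def finish_recognition(self) -> None:
        # Claim 18: return to the medium focal length once recognition is complete.
        self.to_scene_mode()

    def _apply(self, focal_length: str) -> None:
        self._set(focal_length)
        self.current = focal_length
```

Returning to the medium focal length after each recognition leaves the system ready to capture the next scene image without an extra step.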

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/072111 WO2019136636A1 (en) 2018-01-10 2018-01-10 Image recognition method and system, electronic device, and computer program product
CN201880000060.XA CN108235816B (en) 2018-01-10 2018-01-10 Image recognition method, system, electronic device and computer program product


Publications (1)

Publication Number Publication Date
WO2019136636A1 (en) 2019-07-18

Family

ID=62657703


Country Status (2)

Country Link
CN (1) CN108235816B (en)
WO (1) WO2019136636A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111061898A (en) * 2019-12-13 2020-04-24 Oppo(重庆)智能科技有限公司 Image processing method, image processing device, computer equipment and storage medium

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110798608B (en) * 2018-08-02 2022-09-30 北京京东尚科信息技术有限公司 Method and device for identifying image
CN109271982B (en) * 2018-09-20 2020-11-10 西安艾润物联网技术服务有限责任公司 Method for identifying multiple identification areas, identification terminal and readable storage medium
CN109949359A (en) * 2019-02-14 2019-06-28 深兰科技(上海)有限公司 A kind of method and apparatus carrying out target detection based on SSD model
CN110059678A (en) * 2019-04-17 2019-07-26 上海肇观电子科技有限公司 A kind of detection method, device and computer readable storage medium
CN110232313A (en) * 2019-04-28 2019-09-13 南京览视医疗科技有限公司 A kind of eye recommended method, system, electronic equipment and storage medium
CN110213491B (en) * 2019-06-26 2021-06-29 Oppo广东移动通信有限公司 Focusing method, device and storage medium
CN110782692A (en) * 2019-10-31 2020-02-11 青岛海信网络科技股份有限公司 Signal lamp fault detection method and system
CN112601025B (en) * 2020-12-24 2022-07-05 深圳集智数字科技有限公司 Image acquisition method and device, and computer readable storage medium of equipment
CN114666501B (en) * 2022-03-17 2023-04-07 深圳市百泰实业股份有限公司 Intelligent control method for camera of wearable device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090102942A1 (en) * 2007-10-17 2009-04-23 Sony Corporation Composition determining apparatus, composition determining method, and program
CN101785306A (en) * 2007-07-13 2010-07-21 坦德伯格电信公司 Method and system for automatic camera control
CN103197491A (en) * 2013-03-28 2013-07-10 华为技术有限公司 Method capable of achieving rapid automatic focusing and image acquisition device
JP5788197B2 (en) * 2011-03-22 2015-09-30 オリンパス株式会社 Image processing apparatus, image processing method, image processing program, and imaging apparatus
CN106462766A (en) * 2014-06-09 2017-02-22 高通股份有限公司 Image capturing parameter adjustment in preview mode

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101783882B (en) * 2009-01-15 2011-08-10 华晶科技股份有限公司 Method and image capturing device for automatically determining scenario mode
TWI413846B (en) * 2009-09-16 2013-11-01 Altek Corp Continuous focus method of digital camera
CN101714262B (en) * 2009-12-10 2011-12-21 北京大学 Method for reconstructing three-dimensional scene of single image
US9286678B2 (en) * 2011-12-28 2016-03-15 Pelco, Inc. Camera calibration using feature identification
EP3286619B1 (en) * 2015-06-29 2024-08-21 Essilor International A scene image analysis module
CN105007431B (en) * 2015-07-03 2017-11-24 广东欧珀移动通信有限公司 A kind of picture shooting method and terminal based on a variety of photographed scenes
CN105357526B (en) * 2015-11-13 2016-10-26 西安交通大学 The mobile phone football video quality assessment device considering scene classification based on compression domain and method
CN106375448A (en) * 2016-09-05 2017-02-01 腾讯科技(深圳)有限公司 Image processing method, device and system


Also Published As

Publication number Publication date
CN108235816A (en) 2018-06-29
CN108235816B (en) 2020-10-16

Similar Documents

Publication Publication Date Title
WO2019136636A1 (en) Image recognition method and system, electronic device, and computer program product
CN106462766B (en) Image capture parameters adjustment is carried out in preview mode
WO2019233266A1 (en) Image processing method, computer readable storage medium and electronic device
US8314854B2 (en) Apparatus and method for image recognition of facial areas in photographic images from a digital camera
CN111626371B (en) Image classification method, device, equipment and readable storage medium
JP2020523665A (en) Biological detection method and device, electronic device, and storage medium
EP3236391A1 (en) Object detection and recognition under out of focus conditions
EP2336949B1 (en) Apparatus and method for registering plurality of facial images for face recognition
CN111967319B (en) Living body detection method, device, equipment and storage medium based on infrared and visible light
CN102055844A (en) Method for realizing camera shutter function by means of gesture recognition and
CN104281839A (en) Body posture identification method and device
JP5640621B2 (en) Method for classifying red-eye object candidates, computer-readable medium, and image processing apparatus
CN106874825A (en) The training method of Face datection, detection method and device
CN110443181A (en) Face identification method and device
Yanagisawa et al. Face detection for comic images with deformable part model
WO2015131571A1 (en) Method and terminal for implementing image sequencing
US10956788B2 (en) Artificial neural network
CN110121723B (en) Artificial neural network
CN102759953A (en) Automatic camera
Foysal et al. Advancing AI-based Assistive Systems for Visually Impaired People: Multi-Class Object Detection and Currency Classification
CN104794445A (en) ARM platform based dynamic facial iris acquisition method
CN102223472B (en) Method for indicating user to judge if the lens of image acquisition device parallel to the subject
CN111191519B (en) Living body detection method for user access of mobile power supply device
CN110287841B (en) Image transmission method and apparatus, image transmission system, and storage medium
CN112307244A (en) Photographic picture screening system based on image significance detection and human eye state detection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18899967

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18899967

Country of ref document: EP

Kind code of ref document: A1