WO2023102935A1 - Image data processing method, intelligent terminal and storage medium - Google Patents

Image data processing method, intelligent terminal and storage medium

Info

Publication number
WO2023102935A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
information
scene
optionally
semantic information
Prior art date
Application number
PCT/CN2021/137246
Other languages
English (en)
Chinese (zh)
Inventor
应贲
Original Assignee
深圳传音控股股份有限公司
Priority date
Filing date
Publication date
Application filed by 深圳传音控股股份有限公司
Priority to PCT/CN2021/137246
Publication of WO2023102935A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04845 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, for image manipulation, e.g. dragging, rotation, expansion or change of colour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 - 2D [Two Dimensional] image generation
    • G06T11/80 - Creating or modifying a manually drawn or painted image using a manual input device, e.g. mouse, light pen, direction keys on keyboard

Definitions

  • the present application relates to the technical field of image data processing, and in particular to an image data processing method, an intelligent terminal and a storage medium.
  • Computational photography refers to digital image capture and processing techniques that use digital computation rather than optical processing. Computational photography can increase the capabilities of camera equipment, or introduce more features than film-based photography, or reduce the cost or size of camera elements.
  • the present application provides an image data processing method, a smart terminal and a storage medium, so as to uniformly standardize the semantic information of computational photography in the data flow involved in computational photography.
  • the present application provides a method for processing image data, comprising the following steps:
  • S1 Determine or generate semantic information of an image based on image information.
  • S2 Save the semantic information in the image data stream based on a preset format.
  • the image information includes basic image information and image data.
  • the image data is the image itself.
  • basic image information can also be referred to as basic image description information, which can include an image description information identifier, basic description information length, image type identifier, image length, image width, image color space, bit width, and storage mode.
  • the image description information identifier is used to identify the "basic description information" field of the image.
  • the basic description information length indicates the total length of the basic description information field, including the image description information identifier.
  • the image type identifier is used to identify whether the image data type is a single-frame image, a multi-frame image or a video stream.
  • the image length is the length of the image data itself.
  • the image width is the width of the image data itself.
  • the image color space is the description of the color space of the image data, such as RGGB, RGBW, RYYB, etc.
  • the bit width is the number of bits per component of the image.
  • the storage mode refers to the arrangement of each pixel of each component in the image color space in a storage space (such as memory, flash memory, or hard disk, etc.).
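For illustration only, the basic image description information above could be laid out as a small fixed header preceding the image payload. The following Python sketch assumes field names, byte widths and an ordering that the application itself does not fix.

```python
import struct
from dataclasses import dataclass

BASIC_DESC_ID = 0x4249  # hypothetical value for the image description information identifier

@dataclass
class BasicImageDescription:
    image_type: int    # 0 = single-frame image, 1 = multi-frame image, 2 = video stream
    image_length: int  # length of the image data itself
    image_width: int   # width of the image data itself
    color_space: str   # e.g. "RGGB", "RGBW", "RYYB"
    bit_width: int     # number of bits per component
    storage_mode: int  # assumed enum describing the pixel arrangement in storage

    def pack(self) -> bytes:
        body = struct.pack("<BIIB", self.image_type, self.image_length,
                           self.image_width, self.bit_width)
        body += self.color_space.encode("ascii").ljust(8, b"\0")
        body += struct.pack("<B", self.storage_mode)
        # basic description information length = identifier + length fields + body
        return struct.pack("<HH", BASIC_DESC_ID, 4 + len(body)) + body
```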
  • semantic information is used to interpret the image.
  • the semantic information includes at least one of the following: depth information, scene classification information, instance segmentation information, and object detection information.
  • the scene classification information is used to characterize the scene represented by the image.
  • the instance segmentation information is used to characterize the segmentation information of the instance in the image.
  • the depth information includes at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel in the image and the camera used to capture the image;
  • Indication information of an infinity part contained in the image, where the infinity part is a distance beyond the farthest distance that the device can detect.
  • the semantic information includes depth information
  • the step S1 includes: obtaining the depth information based on the image information through a laser ranging radar and/or a depth information analysis network.
  • the depth information parsing network is used to parse image information to generate depth information.
  • Step S1 includes: extracting image scene features of the image based on the image information; determining or generating the scene classification information of the image according to the image scene features.
  • determining or generating the scene classification information of the image according to the image scene features includes: inputting the image scene features into the scene classification model, and obtaining the probability, output by the scene classification model, that the image corresponds to at least one scene; and, among the probabilities that the image corresponds to the at least one scene, determining the scene corresponding to the maximum probability as the scene classification information of the image.
  • the scene classification model is used to determine the probability that the image corresponds to at least one scene.
  • step S2 includes: based on a preset format, filling semantic information in a reserved field of the image data stream.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • the method further includes: determining the identification information corresponding to the semantic information according to the preset corresponding relationship.
  • the semantic information includes at least one of scene classification information, instance segmentation information, and target detection information;
  • the identification information includes at least one of a scene ID, an instance ID, and a target ID, and the preset correspondence includes at least one of the correspondence between scene names and scene IDs, the correspondence between instance names and instance IDs, and the correspondence between target names and target IDs.
  • the present application also provides an image data processing method, comprising the following steps:
  • S10 Obtain semantic information from an image data stream based on a preset format.
  • S20 Perform preset processing according to the semantic information.
  • semantic information is used to interpret the image.
  • the semantic information includes at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel in the image and the camera used to capture the image;
  • Indication information of an infinity part contained in the image, where the infinity part is a distance beyond the farthest distance that the device can detect;
  • the scene classification information is used to characterize the scene represented by the image.
  • the instance segmentation information is used to represent the instance segmentation information in the image.
  • step S10 includes: reading semantic information in a reserved field of the image data stream based on a preset format.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • step S20 includes at least one of the following:
  • When the semantic information includes scene classification information, the target parameters of the camera in the corresponding scene are adjusted according to the scene classification information;
  • When the semantic information includes target detection information, the auto-focus target of the camera is adjusted according to the target detection information;
  • When the semantic information includes instance segmentation information, the target instance in the image is obtained according to the instance segmentation information, and preset processing is performed on the image according to the target instance.
  • the preset processing includes at least one of instance blurring, instance deformation, instance color retention, performing lut mapping on instances, and performing lut mapping on backgrounds.
  • the target parameters include at least one of automatic exposure parameters, display lookup tables, automatic focus parameters and white balance parameters.
  • the present application also provides an image data processing device, including:
  • a processing module configured to determine or generate semantic information of the image based on the image information
  • the saving module is used for saving semantic information in the image data stream based on a preset format.
  • the image information includes basic image information and image data.
  • the image data is the image itself.
  • the basic image information can also be called the basic description information of the image, which can include image description information identification, basic description information length, image type identification, image length, image width, image color space, bit width and storage method, etc.
  • Image description information identifier: used to identify the "basic description information" field of the image
  • Basic description information length: indicates the total length of the basic description information field, including the image description information identifier
  • Image type identifier: used to identify whether the image data type is a single-frame image, a multi-frame image or a video stream
  • Image length: the length of the image data itself
  • Image width: the width of the image data itself
  • Image color space: the description of the color space of the image data, such as RGGB, RGBW, RYYB, etc.
  • Bit width: the number of bits per component of the image
  • Storage method: the arrangement of each pixel of each component in the image color space in the storage space (such as memory, flash memory, or hard disk, etc.).
  • semantic information is used to interpret the image.
  • the semantic information includes at least one of the following: depth information, scene classification information, instance segmentation information, and object detection information.
  • the scene classification information is used to characterize the scene represented by the image.
  • the instance segmentation information is used to represent the instance segmentation information in the image.
  • the depth information includes at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel in the image and the camera used when capturing the image;
  • the semantic information includes depth information
  • the processing module is specifically configured to: obtain the depth information through a laser ranging radar and/or a depth information analysis network based on image information.
  • the depth information parsing network is used to parse image information to generate depth information.
  • the semantic information includes scene classification information
  • the processing module is further configured to: extract the image scene features of the image based on the image information; determine or generate the scene classification information of the image according to the image scene features.
  • the processing module is also used to: input the image scene features into the scene classification model to obtain the probability, output by the scene classification model, that the image corresponds to at least one scene; and, among the probabilities that the image corresponds to the at least one scene, determine the scene corresponding to the maximum probability as the scene classification information of the image.
  • the scene classification model is used to determine the probability that the image corresponds to at least one scene.
  • the saving module is specifically configured to: fill semantic information in a reserved field of the image data stream based on a preset format.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • the saving module is further configured to: based on a preset format, before saving the semantic information in the image data stream, determine the identification information corresponding to the semantic information according to a preset correspondence.
  • the semantic information includes at least one of scene classification information, instance segmentation information, and target detection information;
  • the identification information includes at least one of a scene ID, an instance ID, and a target ID, and the preset correspondence includes at least one of the correspondence between scene names and scene IDs, the correspondence between instance names and instance IDs, and the correspondence between target names and target IDs.
  • the present application also provides an image data processing device, including:
  • An acquisition module configured to acquire semantic information from image data streams based on a preset format
  • the processing module is used for performing preset processing according to the semantic information.
  • semantic information is used to interpret the image.
  • the semantic information includes at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel in the image and the camera used to capture the image;
  • Indication information of an infinity part contained in the image, where the infinity part is a distance beyond the farthest distance that the device can detect;
  • the scene classification information is used to characterize the scene represented by the image.
  • the instance segmentation information is used to represent the instance segmentation information in the image.
  • the obtaining module is specifically configured to: read semantic information in a reserved field of the image data stream based on a preset format.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • processing module is specifically used for at least one of the following:
  • When the semantic information includes scene classification information, the target parameters of the camera in the corresponding scene are adjusted according to the scene classification information;
  • When the semantic information includes target detection information, the auto-focus target of the camera is adjusted according to the target detection information;
  • When the semantic information includes instance segmentation information, the target instance in the image is obtained according to the instance segmentation information, and preset processing is performed on the image according to the target instance.
  • the preset processing includes at least one of instance blurring, instance deformation, instance color retention, performing lut mapping on instances, and performing lut mapping on backgrounds.
  • the target parameters include at least one of automatic exposure parameters, display lookup tables, automatic focus parameters and white balance parameters.
  • the present application also provides an intelligent terminal, including: a memory and a processor, wherein an image data processing program is stored in the memory, and when the image data processing program is executed by the processor, the steps of any one of the above image data processing methods are implemented.
  • the present application also provides a computer-readable storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the steps of any one of the above-mentioned image data processing methods are realized.
  • the present application also provides a computer program product, the computer program product includes a computer program; when the computer program is executed, the steps of any one of the image data processing methods above are realized.
  • the image data processing method of the present application determines or generates the semantic information of the image based on the image information, and the semantic information is used to interpret the image; based on the preset format, the semantic information is stored in the data stream of the image.
  • the semantic information for interpreting the image is stored in the data stream of the image, so as to uniformly standardize the semantic information of computational photography in the data stream involved in computational photography.
  • FIG. 1 is a schematic diagram of a hardware structure of an intelligent terminal implementing various embodiments of the present application
  • FIG. 2 is a system architecture diagram of a communication network provided by an embodiment of the present application.
  • Fig. 3 is a schematic flowchart of an image data processing method according to a first embodiment
  • FIG. 4 is an example diagram of a depth image shown in an embodiment of the present application.
  • FIG. 5 is an example diagram of instance segmentation information shown in an embodiment of the present application.
  • Fig. 6 is a schematic flowchart of an image data processing method according to a second embodiment
  • Fig. 7 is a schematic structural diagram of an image data processing device according to a third embodiment
  • Fig. 8 is a schematic structural diagram of an image data processing device according to a fourth embodiment
  • Fig. 9 is a schematic structural diagram of a smart terminal according to a fifth embodiment.
  • first, second, third, etc. may be used herein to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this document, first information may also be called second information, and similarly, second information may also be called first information.
  • the word “if” as used herein may be interpreted as “at” or “when” or “in response to a determination”.
  • the singular forms "a”, “an” and “the” are intended to include the plural forms as well, unless the context indicates otherwise.
  • "A, B, C" means "any of the following: A; B; C; A and B; A and C; B and C; A and B and C".
  • "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A and B and C". Exceptions to this definition will only arise when combinations of elements, functions, steps or operations are inherently mutually exclusive in some way.
  • the word "if" as used herein may be interpreted as "at", "when", "in response to determining" or "in response to detecting".
  • the phrases "if determined" or "if detected (the stated condition or event)" could be interpreted as "when determined", "in response to the determination", "when detected (the stated condition or event)" or "in response to detection of (the stated condition or event)".
  • step codes such as S1, S2, S10, and S20 are used for the purpose of expressing the corresponding content more clearly and concisely, and do not constitute a substantive limitation on the order.
  • S2 may be executed first and then S1, or S20 may be executed first and then S10, etc., but these should be within the protection scope of the present application.
  • Smart terminals can be implemented in various forms.
  • the smart terminals described in this application may include mobile terminals such as mobile phones, tablet computers, notebook computers, palmtop computers, personal digital assistants (Personal Digital Assistant, PDA), portable media players (Portable Media Player, PMP), navigation devices, wearable devices, smart bracelets and pedometers, as well as fixed terminals such as digital TVs and desktop computers.
  • a smart terminal will be taken as an example, and those skilled in the art will understand that, in addition to elements specially used for mobile purposes, the configurations according to the embodiments of the present application can also be applied to fixed-type terminals.
  • FIG. 1 is a schematic diagram of the hardware structure of a smart terminal implementing various embodiments of the present application.
  • the smart terminal 100 may include: an RF (Radio Frequency) unit 101, a WiFi module 102, an audio output unit 103, an A/V (audio/video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, a power supply 111, and other components.
  • the radio frequency unit 101 can be used for sending and receiving information or receiving and sending signals during a call. Specifically, after receiving the downlink information of the base station, it is processed by the processor 110; in addition, the uplink data is sent to the base station.
  • the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
  • the radio frequency unit 101 can also communicate with the network and other devices through wireless communication.
  • the above wireless communication can use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication, Global System for Mobile Communications), GPRS (General Packet Radio Service, General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000 , Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access, Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access, Time Division Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution, frequency division duplex long-term evolution), TDD-LTE (Time Division Duplexing-Long Term Evolution, time-division duplex long-term evolution) and 5G, etc.
  • WiFi is a short-distance wireless transmission technology.
  • the smart terminal can help users send and receive emails, browse web pages, and access streaming media, etc., and it provides users with wireless broadband Internet access.
  • although Fig. 1 shows the WiFi module 102, it can be understood that it is not an essential component of the smart terminal, and can be omitted as required without changing the essence of the invention.
  • the audio output unit 103 can, when the smart terminal 100 is in a call signal receiving mode, a call mode, a recording mode, a voice recognition mode, a broadcast receiving mode, or the like, convert the audio data received by the radio frequency unit 101 or the WiFi module 102, or stored in the memory 109, into an audio signal and output it as sound.
  • the audio output unit 103 can also provide audio output related to specific functions performed by the smart terminal 100 (optionally, call signal receiving sound, message receiving sound, etc.).
  • the audio output unit 103 may include a speaker, a buzzer, and the like.
  • the A/V input unit 104 is used to receive audio or video signals.
  • the A/V input unit 104 may include a graphics processor (Graphics Processing Unit, GPU) 1041 and a microphone 1042, and the graphics processor 1041 is used to process image data of still images or video obtained by an image capture device (such as a camera).
  • the processed image can be displayed on the display unit 106 .
  • the image processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or sent via the radio frequency unit 101 or the WiFi module 102 .
  • the microphone 1042 can receive sound (audio data) in a phone call mode, a recording mode, a voice recognition mode, and similar operating modes, and can process such sound into audio data.
  • the processed audio (voice) data can be converted into a format transmittable to a mobile communication base station via the radio frequency unit 101 for output in case of a phone call mode.
  • the microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the process of receiving and transmitting audio signals.
  • the smart terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor includes an ambient light sensor and a proximity sensor.
  • the ambient light sensor can adjust the brightness of the display panel 1061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 1061 and/or the backlight when the smart terminal 100 moves to the ear.
  • the accelerometer sensor can detect the magnitude of acceleration in various directions (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications that recognize the posture of the mobile phone (such as horizontal and vertical screen switching, related games, and magnetometer attitude calibration) and for vibration-recognition-related functions (such as pedometer and tapping); other sensors that may also be configured on the mobile phone, such as fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers and infrared sensors, will not be described in detail here.
  • the display unit 106 is used to display information input by the user or information provided to the user.
  • the display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), or the like.
  • LCD Liquid Crystal Display
  • OLED Organic Light-Emitting Diode
  • the user input unit 107 can be used to receive input numbers or character information, and generate key signal input related to user settings and function control of the smart terminal.
  • the user input unit 107 may include a touch panel 1071 and other input devices 1072 .
  • the touch panel 1071, also referred to as a touch screen, can collect touch operations of the user on or near it (for example, operations performed by the user on or near the touch panel 1071 using any suitable object or accessory such as a finger or a stylus), and drive the corresponding connection device according to a preset program.
  • the touch panel 1071 may include two parts, a touch detection device and a touch controller.
  • the touch detection device detects the user's touch orientation, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 110, and can also receive commands sent by the processor 110 and execute them.
  • the touch panel 1071 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave.
  • the user input unit 107 may also include other input devices 1072 .
  • other input devices 1072 may include, but are not limited to, one or more of physical keyboards, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, etc., which are not specifically limited here.
  • the touch panel 1071 may cover the display panel 1061.
  • when the touch panel 1071 detects a touch operation on or near it, it transmits the operation to the processor 110 to determine the type of the touch event, and then the processor 110 provides a corresponding visual output on the display panel 1061 according to the type of the touch event.
  • although the touch panel 1071 and the display panel 1061 are used as two independent components to realize the input and output functions of the smart terminal, in some embodiments the touch panel 1071 and the display panel 1061 can be integrated.
  • the implementation of the input and output functions of the smart terminal is not specifically limited here.
  • the interface unit 108 is used as an interface through which at least one external device can be connected with the smart terminal 100 .
  • the external device may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, audio input/output (I/O) ports, video I/O ports, headphone ports, and more.
  • the interface unit 108 may be used to receive input from an external device (optionally, data information, power, etc.) and transmit the received input to one or more components within the smart terminal 100 or may be used to transfer data to and from external devices.
  • the memory 109 can be used to store software programs as well as various data.
  • the memory 109 can mainly include a storage program area and a storage data area.
  • the storage program area can store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), and the like.
  • the storage data area can store data (such as audio data, a phone book, etc.) created according to the use of the mobile phone.
  • the memory 109 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • the processor 110 is the control center of the smart terminal, and uses various interfaces and lines to connect the various parts of the whole smart terminal; by running or executing the software programs and/or modules stored in the memory 109 and calling the data stored in the memory 109, it executes the various functions of the smart terminal and processes data, so as to monitor the smart terminal as a whole.
  • the processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor.
  • the application processor mainly processes operating systems, user interfaces, and application programs, etc.
  • the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may not be integrated into the processor 110.
  • the smart terminal 100 can also include a power supply 111 (such as a battery) for supplying power to various components.
  • the power supply 111 can be logically connected to the processor 110 through a power management system, so as to manage functions such as charging, discharging, and power consumption through the power management system.
  • the smart terminal 100 may also include a Bluetooth module, etc., which will not be repeated here.
  • the following describes the communication network system on which the smart terminal of the present application is based.
  • FIG. 2 is a structure diagram of a communication network system provided by an embodiment of the present application.
  • the communication network system is an LTE system of general mobile communication technology.
  • the communication network system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203 and the operator's IP service 204.
  • the UE 201 may be the above-mentioned terminal 100, which will not be repeated here.
  • E-UTRAN 202 includes eNodeB 2021, other eNodeBs 2022, and so on.
  • the eNodeB 2021 can be connected to other eNodeB 2022 through a backhaul (for example, X2 interface), the eNodeB 2021 is connected to the EPC 203 , and the eNodeB 2021 can provide access from the UE 201 to the EPC 203 .
  • EPC203 may include MME (Mobility Management Entity, Mobility Management Entity) 2031, HSS (Home Subscriber Server, Home Subscriber Server) 2032, other MME2033, SGW (Serving Gate Way, Serving Gateway) 2034, PGW (PDN Gate Way, packet data Network Gateway) 2035 and PCRF (Policy and Charging Rules Function, Policy and Charging Functional Entity) 2036, etc.
  • MME2031 is a control node that handles signaling between UE201 and EPC203, and provides bearer and connection management.
  • HSS2032 is used to provide some registers to manage functions such as home location register (not shown in the figure), and save some user-specific information about service features and data rates.
  • PCRF 2036 is the policy and charging control policy decision point for service data flows and IP bearer resources; it selects and provides available policy and charging control decisions for the policy and charging enforcement function unit (not shown in the figure).
  • the IP service 204 may include Internet, Intranet, IMS (IP Multimedia Subsystem, IP Multimedia Subsystem) or other IP services.
  • although the LTE system is used as an example above, those skilled in the art should know that this application is not only applicable to the LTE system, but also applicable to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA and future new network systems (such as 5G), etc., which is not limited here.
  • the present application provides an image data processing method, an intelligent terminal and a storage medium.
  • the semantic information for interpreting images is stored based on a preset format, so that the semantic information of computational photography is uniformly standardized in the data stream involved in computational photography.
  • Fig. 3 is a schematic flowchart of an image data processing method according to the first embodiment.
  • An embodiment of the present application provides an image data processing method, which is optionally applied to a smart terminal such as the aforementioned smart terminal. As shown in Figure 3, the image data processing method includes the following steps:
  • S1 Based on the image information, determine or generate the semantic information of the image.
  • the image information includes basic image information and image data.
  • the image data is the image itself.
  • the basic image information can also be called the basic description information of the image, which can include image description information identification, basic description information length, image type identification, image length, image width, image color space, bit width and storage method, etc.
  • Image description information identifier: used to identify the "basic description information" field of the image
  • Basic description information length: indicates the total length of the basic description information field, including the image description information identifier
  • Image type identifier: used to identify whether the image data type is a single-frame image, a multi-frame image or a video stream
  • Image length: the length of the image data itself
  • Image width: the width of the image data itself
  • Image color space: the description of the color space of the image data, such as RGGB (Bayer filter, also known as RGBG or GRGB), RGBW (adding a white sub-pixel (W) to the original RGB three primary colors), RYYB (replacing the two green sub-pixels (G) with two yellow sub-pixels (Y)), etc.;
  • Bit width: the number of bits for each component of the image
  • Storage method: the arrangement of each pixel of each component in the image color space in the storage space (such as memory, flash memory, or hard disk, etc.).
  • semantic information is used to interpret the image.
  • common semantic information includes image depth information, scene classification information, instance segmentation information, and object detection information.
  • semantic information in this embodiment of the present application may include at least one of the following: depth information, scene classification information, instance segmentation information, and object detection information, but not limited thereto.
  • the depth information may include at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel in the image and the camera used to capture the image;
  • Indication information of an infinity part contained in the image, where the infinity part is a distance beyond the farthest distance that the device can detect. It can be understood that the device can detect information within a certain distance range, between a shortest distance and a farthest distance. A distance beyond the farthest detectable distance can be represented by the maximum value, which corresponds to infinity, while the next largest value is used to indicate the farthest distance that can currently be detected.
  • the range between the maximum value and the minimum value will be equally divided into 256 parts, and all pixels will be quantized into these 256 levels.
  • a depth image with the same resolution as the original image can be generated, as shown in Figure 4, and will be attached to the computational photography data stream as another channel of the image. It should be noted that, with the development of device performance, the 256 levels (2 to the 8th power) can also be expanded to 512 or more levels, in which case the distance accuracy that can be provided will be greatly improved.
  • Table 1 shows an expression method for the depth information of an image containing sky.
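The 256-level quantization described above can be sketched as follows; this assumes metric depth values, with the maximum code standing for infinity (beyond range) and the next largest code for the farthest currently detectable distance. It is an illustrative sketch, not a normative encoding.

```python
import numpy as np

def quantize_depth(depth, min_d, max_d):
    """Quantize depth to 8 bits: 255 = infinity (beyond detectable range),
    254 = farthest currently detectable distance, lower codes span [min_d, max_d]."""
    depth = np.asarray(depth, dtype=np.float32)
    beyond = depth > max_d                                   # pixels past the farthest range
    codes = np.round((np.clip(depth, min_d, max_d) - min_d) / (max_d - min_d) * 254)
    codes = codes.astype(np.uint8)
    codes[beyond] = 255                                      # maximum value means "infinity"
    return codes                                             # same resolution as the source image
```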
  • the scene classification information is used to characterize the scene represented by the image. It can be understood that an image can be divided into multiple scenes in most cases. For example, a cake image in a birthday party scene can be expressed as a party scene or a food scene. Therefore, when expressing the scene, the probability of the five scenes that the image most likely belongs to will be listed, such as:
  • {1:0.5} means that the probability that the image belongs to the scene whose scene ID is "1" is 0.5; {22:0.2} means that the probability that the image belongs to the scene whose scene ID is "22" is 0.2; {25:0.15} means that the probability that the image belongs to the scene whose scene ID is "25" is 0.15; {45:0.1} means that the probability that the image belongs to the scene whose scene ID is "45" is 0.1; {55:0.05} means that the probability that the image belongs to the scene whose scene ID is "55" is 0.05.
  • the scene classification information needs a dictionary, which is used to parse which scene the number expressed in digital form belongs to.
  • examples of different scenes are illustrated by Table 2.
  • the first and third columns in Table 2 are scene IDs, i.e. the scenes expressed as numbers in digital form; the second and fourth columns are the instances belonging to each scene.
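As an illustrative sketch only, the scene classification field could be carried as key-value pairs of {scene ID: probability}, resolved through a dictionary standing in for Table 2; the IDs and names below are invented placeholders.

```python
# Placeholder dictionary mapping scene IDs to scene names (stands in for Table 2).
SCENE_DICT = {1: "portrait", 22: "party", 25: "food", 45: "indoor", 55: "night"}

# Top-5 scene probabilities carried in the data stream, as {scene ID: probability}.
scene_info = {1: 0.5, 22: 0.2, 25: 0.15, 45: 0.1, 55: 0.05}

best_id = max(scene_info, key=scene_info.get)  # scene with the maximum probability
print(best_id, SCENE_DICT[best_id], scene_info[best_id])  # -> 1 portrait 0.5
```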
  • the instance segmentation information is used to characterize the segmentation information of the instance in the image.
  • a matrix consistent with the resolution of the image, such as 800*600, which can be expressed as a channel of the image, can be used to express the instance segmentation information in the image.
  • as shown in Figure 5, 0 is the background, and 2/15/35 are the instance IDs, that is, the IDs of the name information corresponding to the instances.
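A minimal sketch of the single-channel instance mask described above (the 4x4 values are invented; a real mask has the image's full resolution):

```python
import numpy as np

# Single-channel matrix at image resolution: 0 = background, nonzero values = instance IDs.
instance_mask = np.array([
    [0,  0,  2,  2],
    [0,  2,  2,  2],
    [15, 15, 0, 35],
    [15, 15, 0, 35],
], dtype=np.uint16)

present_ids = set(np.unique(instance_mask)) - {0}  # instance IDs present in the image
print(present_ids)  # -> {2, 15, 35}
```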
  • as for the target detection information of the image, it can be understood that, for each target that can be detected in the image, the form of {target ID: target coordinates} can be used to save the target list.
  • a dictionary is also required to resolve the target IDs.
  • the dictionary may be the dictionary used in the above instance segmentation.
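Likewise, the target list could be sketched as {target ID: target coordinates} entries resolved by the same dictionary; the IDs, names and (x, y, w, h) boxes below are assumptions for illustration.

```python
TARGET_DICT = {2: "person", 15: "dog"}                      # hypothetical ID-to-name dictionary
targets = {2: (120, 80, 64, 128), 15: (300, 220, 90, 60)}   # {target ID: (x, y, w, h)}

for target_id, box in targets.items():
    print(TARGET_DICT.get(target_id, "unknown"), box)
```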
| Combination plan | Depth information | Scene classification information | Instance segmentation information | Target detection information |
| --- | --- | --- | --- | --- |
| Combination example 1 | yes | yes | yes | no |
| Combination example 2 | yes | yes | no | no |
| Combination example 3 | yes | no | yes | yes |
| Combination example 4 | no | yes | yes | yes |
  • the semantic information includes depth information, scene classification information and instance segmentation information.
  • the semantic information includes depth information and scene classification information.
  • the semantic information includes depth information, instance segmentation information and object detection information.
  • the semantic information includes scene classification information, instance segmentation information and object detection information.
  • the semantic information includes depth information, scene classification information, instance segmentation information and object detection information.
  • the semantic information in the data stream may include instance segmentation information, so that the instance segmentation information is used to obtain specific instances in the image and a single camera can achieve the effect of instance blurring; or, the semantic information in the data stream may include target detection information, so that the camera can adjust the auto-focus target in a targeted manner, focus on the target to be photographed, and improve the imaging quality; or, the semantic information in the data stream may include scene classification information, so as to adjust the 3A parameters for the scene.
  • the 3A parameters are auto focus (AF), auto exposure (AE) and auto white balance (AWB).
  • 3A digital imaging technology utilizes an auto-focus algorithm, an auto-exposure algorithm and an auto-white-balance algorithm to maximize the image contrast, improve over-exposure or under-exposure of the main subject, and compensate for the chromatic aberration of the picture under different light conditions, so as to present higher-quality image information.
  • a camera adopting 3A digital imaging technology can well guarantee accurate color reproduction of the image, presenting a good monitoring effect both day and night.
  • the semantic information is not limited to the above four items; more semantic information of other types can also be included, so the information header needs to reserve a sufficient length.
  • step S1 may include: based on the image information, obtaining the depth information through a laser ranging radar and/or a depth information analysis network.
  • the depth information parsing network is used to parse image information to generate depth information.
  • step S1 may include: extracting image scene features of the image based on the image information; determining or generating scene classification information of the image according to the image scene features.
  • determining or generating the scene classification information of the image according to the image scene features includes: inputting the image scene features into the scene classification model, and obtaining the probability, output by the scene classification model, that the image corresponds to at least one scene; and, among the probabilities that the image corresponds to the at least one scene, determining the scene corresponding to the maximum probability as the scene classification information of the image.
  • the scene classification model is used to determine the probability that the image corresponds to at least one scene.
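A minimal sketch of this step, assuming the feature extractor and scene classification model are any callables returning scene features and per-scene probabilities respectively (both are placeholders, not components specified here):

```python
import numpy as np

def classify_scene(image, extract_features, scene_model):
    """Extract image scene features, get per-scene probabilities, return the most likely scene ID."""
    features = extract_features(image)   # image scene features
    probs = scene_model(features)        # probability for each candidate scene
    return int(np.argmax(probs))         # scene corresponding to the maximum probability
```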
  • S2 Preserve semantic information in the image data stream based on a preset format.
  • this step includes: based on a preset format, filling semantic information in a reserved field of the image data stream.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the table is shown in Table 1, and the matrix is shown in FIG. 5 .
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • the information header can be expressed as: Frame semantic info include: 0 0 0 1, that is, only the fourth item is included; the information header can also be expressed as: Frame semantic info include: 0 0 0 0, that is, no semantic information is included.
  • the information header specifically corresponds to a lookup table containing semantic information, such as:
  • the field of the information header is a variable-length field.
  • when the information header indicates that a certain piece of semantic information exists, that is, when it is expressed in the form of 0 0 0 1, this field reserves enough length for the corresponding semantic information, so that this semantic information can be expressed.
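A minimal sketch of filling the reserved field: four presence bits ("Frame semantic info include") followed by only the semantic items that exist. The bit order (depth, scene, instance, target) and the JSON serialization are assumptions chosen for illustration, not a format defined by this application.

```python
import json

SEMANTIC_KEYS = ["depth", "scene", "instance", "target"]  # assumed fixed bit order

def pack_semantic_field(semantic: dict) -> bytes:
    """Write the information header (presence bits) and then each present semantic item."""
    header = bytes(1 if key in semantic else 0 for key in SEMANTIC_KEYS)
    body = b"".join(json.dumps({key: semantic[key]}).encode("utf-8") + b"\n"
                    for key in SEMANTIC_KEYS if key in semantic)
    return header + body

# e.g. a stream carrying only target detection info: header bits 0 0 0 1
reserved_field = pack_semantic_field({"target": {"2": [120, 80, 64, 128]}})
```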
  • the image data processing method may further include: determining identification information corresponding to the semantic information according to a preset correspondence relationship.
  • the semantic information includes at least one of scene classification information, instance segmentation information, and target detection information;
  • the identification information includes at least one of scene ID, instance ID, and target ID, and the preset correspondence includes scene name and At least one of the corresponding relationship between scene IDs, the corresponding relationship between instance names and instance IDs, and the corresponding relationship between target names and target IDs.
  • the preset correspondence relationship may be specifically presented in the form of a dictionary, but this embodiment of the present application is not limited thereto, and may be set accordingly according to actual needs.
  • the image data processing method of the embodiment of the present application determines or generates semantic information of the image based on image information, and the semantic information is used to interpret the image; based on a preset format, the semantic information is stored in the data stream of the image.
  • the semantic information for interpreting the image is stored in the data stream of the image, so as to uniformly standardize the semantic information of computational photography in the data stream involved in computational photography.
  • the image data processing method may further include: acquiring semantic information from the image data stream based on a preset format; and performing preset processing according to the semantic information.
  • Fig. 6 is a schematic flowchart of an image data processing method according to a second embodiment.
  • An embodiment of the present application provides an image data processing method, which is applied to computational photography of a smart terminal such as the aforementioned smart terminal. As shown in Figure 6, the image data processing method includes the following steps:
  • S10 Obtain semantic information from image data streams based on a preset format.
  • semantic information is used to interpret the image.
  • common semantic information includes image depth information, scene classification information, instance segmentation information, and object detection information.
  • semantic information in this embodiment of the present application may include at least one of the following: depth information, scene classification information, instance segmentation information, and object detection information, but not limited thereto.
  • the depth information may include at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel in the image and the camera used to capture the image;
  • Indication information of an infinity part contained in the image, where the infinity part is a distance beyond the farthest distance that the device can detect. It can be understood that the device can detect information within a certain distance range, between a shortest distance and a farthest distance. A distance beyond the farthest detectable distance can be represented by the maximum value, which corresponds to infinity, while the next largest value is used to indicate the farthest distance that can currently be detected.
  • the range between the maximum value and the minimum value will be equally divided into 256 parts, and all pixels will be quantized into these 256 levels.
  • a depth image with the same resolution as the original image can be generated, as shown in Figure 4, and will be attached to the computational photography data stream as another channel of the image. It should be noted that, with the development of device performance, the 256 levels (2 to the 8th power) can also be expanded to 512 or more levels, in which case the distance accuracy that can be provided will be greatly improved.
  • Table 1 shows a method for expressing depth information of an image with sky.
  • the scene classification information is used to characterize the scene represented by the image. It can be understood that an image can be divided into multiple scenes in most cases. For example, a cake image in a birthday party scene can be expressed as a party scene or a food scene. Therefore, when expressing the scene, the probability of the five scenes that the image most likely belongs to will be listed, such as:
  • {1:0.5} means that the probability that the image belongs to the scene whose scene ID is "1" is 0.5; {22:0.2} means that the probability that the image belongs to the scene whose scene ID is "22" is 0.2; {25:0.15} means that the probability that the image belongs to the scene whose scene ID is "25" is 0.15; {45:0.1} means that the probability that the image belongs to the scene whose scene ID is "45" is 0.1; {55:0.05} means that the probability that the image belongs to the scene whose scene ID is "55" is 0.05.
  • the scene classification information requires a dictionary for parsing which scene the number expressed in digital form belongs to, as shown in Table 2 above.
  • the first and third columns in Table 2 are scene IDs, i.e. the scenes expressed as numbers in digital form; the second and fourth columns are the instances belonging to each scene.
  • the instance segmentation information is used to characterize the segmentation information of the instance in the image.
  • a matrix consistent with the resolution of the image, such as 800*600, which can be expressed as a channel of the image, can be used to express the instance segmentation information in the image.
  • as shown in Figure 5, 0 is the background, and 2/15/35 are the instance IDs, that is, the IDs of the name information corresponding to the instances.
  • as for the target detection information of the image, it can be understood that, for each target that can be detected in the image, the form of {target ID: target coordinates} can be used to save the target list.
  • a dictionary is also required to resolve the target IDs.
  • the dictionary may be the dictionary used in the above instance segmentation.
  • the semantic information includes depth information, scene classification information and instance segmentation information.
  • the semantic information includes depth information and scene classification information.
  • the semantic information includes depth information, instance segmentation information and object detection information.
  • the semantic information includes scene classification information, instance segmentation information and object detection information.
  • the semantic information includes depth information, scene classification information, instance segmentation information and object detection information.
  • the semantic information is not limited to the above four items; other semantic information may also be included, so the information header needs to reserve sufficient length.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the table is shown in Table 1, and the matrix is shown in FIG. 5 .
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • the information header can be expressed as: Frame semantic info include:0 0 0 1, that is, only the fourth item is included; the information header can also be expressed as: Frame semantic info include:0 0 0 0, that is, none of the items is included.
  • the information header specifically corresponds to a lookup table of the types of semantic information it can indicate.
  • the field of the information header is a variable-length field.
  • when the information header indicates that a certain item of semantic information exists, that is, when it is expressed in the form of 0 0 0 1, the corresponding field reserves enough length for that semantic information to be expressed. A sketch of parsing such a header is given below.
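  • The following sketch parses such a header into per-item flags; the item order is assumed to follow the four types of semantic information listed earlier.

      SEMANTIC_ITEMS = ("depth", "scene_classification", "instance_segmentation", "object_detection")

      def parse_semantic_header(header_line):
          # "Frame semantic info include:0 0 0 1" -> flags keyed by item name.
          flags = header_line.split(":", 1)[1].split()
          return {name: flag == "1" for name, flag in zip(SEMANTIC_ITEMS, flags)}

      print(parse_semantic_header("Frame semantic info include:0 0 0 1"))
      # {'depth': False, 'scene_classification': False, 'instance_segmentation': False, 'object_detection': True}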
  • the step S10 includes: reading semantic information in a reserved field of the image data stream based on a preset format.
  • S20 Perform preset processing according to the semantic information.
  • step S20 may include: adjusting the target parameters of the camera in the corresponding scene according to the scene classification information.
  • the target parameters include at least one of 3A parameters, a display lookup table, or other parameters related to imaging quality.
  • the 3A parameters are auto focus (AF) parameters, auto exposure (AE) parameters and auto white balance (AWB) parameters.
  • 3A digital imaging technology uses auto-focus, auto-exposure and auto-white-balance algorithms to maximize image contrast, correct over-exposure or under-exposure of the main subject, and compensate for the color deviation of the picture under different lighting conditions, so as to present high-quality image information.
  • a camera adopting 3A digital imaging technology can ensure accurate color reproduction of the image, providing a good day-and-night monitoring effect; a sketch of scene-driven parameter adjustment is given below.
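  • One way such scene-driven adjustment might look in code is sketched below; the preset table, parameter names and the set_param callback are illustrative assumptions rather than a real driver interface.

      # Hypothetical per-scene tuning table: scene ID -> camera parameter overrides.
      SCENE_PRESETS = {
          1: {"ae_ev_bias": 0.3, "awb_mode": "warm", "display_lut": "party"},
          22: {"ae_ev_bias": 0.0, "awb_mode": "auto", "display_lut": "food"},
      }

      def tune_camera(scene_probs, set_param):
          # Pick the most probable scene and push its presets through a setter callback.
          best_id = max(scene_probs, key=scene_probs.get)
          for name, value in SCENE_PRESETS.get(best_id, {}).items():
              set_param(name, value)

      tune_camera({1: 0.5, 22: 0.2}, lambda k, v: print("set", k, "=", v))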
  • the semantic information includes instance segmentation information.
  • step S20 may include: obtaining the target instance in the image according to the instance segmentation information; and performing preset processing on the image according to the target instance.
  • the preset processing may include at least one of processing such as instance blurring, instance deformation, instance color retention, mapping processing for the instance, and mapping processing for the background.
  • the semantic information includes instance segmentation information
  • the instance segmentation information can be used to obtain specific instances in the image, so as to achieve effects such as instance blurring, instance deformation, instance color retention, mapping processing for instances, and mapping processing for backgrounds; a sketch of one such effect is given below.
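  • As an example of one such effect, the sketch below blurs everything outside a chosen instance using the instance map from the segmentation information; the Gaussian filter and sigma value are arbitrary choices made only for illustration.

      import numpy as np
      from scipy.ndimage import gaussian_filter

      def blur_background(image, instance_map, keep_id, sigma=8.0):
          # Keep the pixels of the chosen instance sharp and blur the rest.
          blurred = gaussian_filter(image.astype(np.float32), sigma=(sigma, sigma, 0))
          keep = (instance_map == keep_id)[..., None]   # H x W x 1 boolean mask
          return np.where(keep, image, blurred).astype(image.dtype)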
  • step S20 may include: adjusting the camera's auto-focus target according to the target detection information, so that focus falls on the target to be photographed and the imaging quality is improved.
  • semantic information is obtained from an image data stream based on a preset format, and the semantic information is used to interpret the image; preset processing is performed according to the semantic information.
  • the semantic information used to interpret the image is stored in the image data stream based on the preset format.
  • the semantic information of computational photography can be uniformly standardized in the data stream involved in computational photography; when related processing is performed, different applications of the image only need to obtain the corresponding semantic information from the data stream, without repeating the same processing on the image, thereby avoiding wasted computing resources.
  • the image data processing method may further include: determining or generating the semantic information of the image based on the image information; and storing the semantic information in the data stream of the image based on a preset format.
  • Fig. 7 is a schematic structural diagram of an image data processing device according to a third embodiment.
  • An embodiment of the present application provides an image data processing device.
  • the image data processing device 70 includes:
  • a processing module 71 configured to determine or generate semantic information of the image based on the image information
  • the saving module 72 is configured to save semantic information in the image data stream based on a preset format.
  • the image information includes basic image information and image data.
  • the image data is the image itself.
  • the basic image information can also be called the basic description information of the image, which can include image description information identification, basic description information length, image type identification, image length, image width, image color space, bit width and storage method, etc.
  • the basic description information of the image can include: an image description information identifier, basic description information length, image type identification, image length, image width, image color space, bit width and storage method, etc., each of which is described below:
  • Image description information identifier: used to identify the "basic description information" field of the image;
  • Basic description information length: indicates the total length of the basic description information field, including the image description information identifier;
  • Image type identification: used to identify whether the image data type is a single-frame image, a multi-frame image or a video stream;
  • Image length: the length of the image data itself;
  • Image width: the width of the image data itself;
  • Image color space: a description of the image data color space, such as RGBGB (also called RGBG or GRGB), RGBW, RYYB, etc.;
  • Bit width: the number of bits per component of the image;
  • Storage method: the arrangement of each pixel of each component in the image color space within the storage space (such as memory, flash memory, or hard disk).
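  • Purely as an illustration, the basic description fields above could be grouped as follows; the field names, types and example values are assumptions, since the text does not specify widths or encodings.

      from dataclasses import dataclass

      @dataclass
      class BasicImageInfo:
          description_id: int       # identifies the "basic description information" field
          description_length: int   # total length of the field, including the identifier
          image_type: int           # single-frame image, multi-frame image or video stream
          image_length: int         # length of the image data itself
          image_width: int          # width of the image data itself
          color_space: str          # e.g. "RGBW" or "RYYB"
          bit_width: int            # bits per component
          storage_method: str       # pixel arrangement in memory, flash or disk

      info = BasicImageInfo(0x01, 32, 0, 800, 600, "RGBW", 10, "planar")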
  • semantic information is used to interpret the image.
  • the semantic information includes at least one of the following: depth information, scene classification information, instance segmentation information, and object detection information.
  • the scene classification information is used to characterize the scene represented by the image.
  • the instance segmentation information is used to represent the instance segmentation information in the image.
  • the depth information includes at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel point in the image and the camera used to capture the image;
  • An indication of whether the image contains an infinity portion, where the infinity portion is any distance beyond the farthest distance the device can detect. It can be understood that the range the device can detect is bounded by a shortest distance and a farthest distance; any distance beyond the farthest detectable distance can be represented by the maximum value, which corresponds to infinity, while the next largest value indicates the farthest distance that can currently be detected.
  • the semantic information includes depth information
  • the processing module 71 is specifically configured to: obtain the depth information based on the image information through a laser ranging radar and/or a depth information analysis network.
  • the depth information analysis network is used to parse image information to generate depth information.
  • the semantic information includes scene classification information
  • the processing module 71 is further configured to: extract the image scene features of the image based on the image information; determine or generate the scene classification information of the image according to the image scene features.
  • the processing module 71 is further configured to: input the image scene features into the scene classification model to obtain the probability, output by the scene classification model, that the image corresponds to at least one scene; and, among the probabilities that the image corresponds to at least one scene, determine the scene corresponding to the maximum probability as the scene classification information of the image.
  • the scene classification model is used to determine the probability that the image corresponds to at least one scene.
  • the saving module 72 is specifically configured to: fill semantic information in a reserved field of the image data stream based on a preset format.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • the saving module is further configured to: before saving the semantic information in the image data stream based on a preset format, determine the identification information corresponding to the semantic information according to a preset correspondence.
  • the semantic information includes at least one of scene classification information, instance segmentation information, and target detection information;
  • the identification information includes at least one of a scene ID, an instance ID and a target ID, and the preset correspondence includes at least one of the correspondence between scene names and scene IDs, the correspondence between instance names and instance IDs, and the correspondence between target names and target IDs.
  • the processing module 71 may also be configured to: acquire semantic information from the image data stream based on a preset format; and perform preset processing according to the semantic information.
  • Fig. 8 is a schematic structural diagram of an image data processing device according to a fourth embodiment.
  • An embodiment of the present application provides an image data processing device.
  • the image data processing device 80 includes:
  • An acquisition module 81 configured to acquire semantic information from the image data stream based on a preset format
  • the processing module 82 is configured to perform preset processing according to the semantic information.
  • semantic information is used to interpret the image.
  • the semantic information includes at least one of the following:
  • Depth image: the pixel value in the depth image is used to represent the distance between the pixel point in the image and the camera used to capture the image;
  • the image contains an indication of an infinity part, which is a distance beyond the furthest distance that the device can detect;
  • the scene classification information is used to characterize the scene represented by the image.
  • the instance segmentation information is used to represent the instance segmentation information in the image.
  • the obtaining module 81 is specifically configured to: read semantic information in a reserved field of the image data stream based on a preset format.
  • the preset format is any combination of at least one semantic information.
  • the preset formats corresponding to different types of semantic information are different.
  • the preset format includes at least one of table, single-channel bitmap, matrix and key-value pair.
  • the preset format further includes an information header, which is used to indicate whether the image data stream contains semantic information, and/or, the information header is used to indicate the type of semantic information contained in the image data stream.
  • processing module 82 is specifically used for at least one of the following:
  • if the semantic information includes scene classification information, adjust the target parameters of the camera in the corresponding scene according to the scene classification information;
  • if the semantic information includes target detection information, adjust the target of the camera's auto-focus according to the target detection information;
  • if the semantic information includes instance segmentation information, obtain the target instance in the image according to the instance segmentation information, and perform preset processing on the image according to the target instance.
  • the preset processing includes at least one of instance blurring, instance deformation, instance color retention, performing LUT mapping on instances, and performing LUT mapping on backgrounds.
  • the target parameters include at least one of automatic exposure parameters, display lookup tables, automatic focus parameters and white balance parameters.
  • the processing module 82 is further configured to: determine or generate semantic information of the image based on the image information; and save the semantic information in the data stream of the image based on a preset format.
  • FIG. 9 is a schematic structural diagram of a smart terminal according to a fifth embodiment.
  • An embodiment of the present application provides an intelligent terminal.
  • an intelligent terminal 90 includes a memory 91 and a processor 92.
  • An image data processing program is stored in the memory 91.
  • when the image data processing program is executed by the processor 92, the steps of the image data processing method in any of the above-mentioned embodiments are implemented; the implementation principles and beneficial effects are similar and will not be repeated here.
  • the above-mentioned smart terminal 90 further includes a communication interface 93 , and the communication interface 93 may be connected to the processor 92 through the bus 94 .
  • the processor 92 can control the communication interface 93 to implement the receiving and sending functions of the smart terminal 90 .
  • the above-mentioned integrated modules implemented in the form of software function modules can be stored in a computer-readable storage medium.
  • the above-mentioned software function modules are stored in a storage medium and include several instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute part of the steps of the methods of the various embodiments of the present application.
  • An embodiment of the present application further provides a computer-readable storage medium, on which an image data processing program is stored, and when the image data processing program is executed by a processor, the steps of the image data processing method in any of the foregoing embodiments are implemented.
  • An embodiment of the present application further provides a computer program product, the computer program product includes computer program code, and when the computer program code is run on the computer, the computer is made to execute the methods in the above various possible implementation manners.
  • the embodiment of the present application also provides a chip, including a memory and a processor.
  • the memory is used to store a computer program
  • the processor is used to call and run the computer program from the memory, so that the device in which the chip is installed executes the methods in the above various possible implementations.
  • Units in the device in the embodiment of the present application may be combined, divided and deleted according to actual needs.
  • the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in one of the above storage media (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to make a terminal device (which may be a mobile phone, computer, server, controlled terminal, or network device, etc.) execute the method of each embodiment of the present application.
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • when implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • a computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present application will be generated in whole or in part.
  • the computer can be a general purpose computer, special purpose computer, a computer network, or other programmable apparatus.
  • computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; alternatively, computer instructions may be transferred from one website, computer, server or data center to another website, computer, server or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server, a data center, etc. integrated with one or more available media.
  • usable media may be magnetic media (e.g., floppy disks, hard disks, or tapes), optical media (e.g., DVDs), or semiconductor media (e.g., a Solid State Disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to an image data processing method, an intelligent terminal and a storage medium. The image data processing method comprises the steps of: determining or generating semantic information of an image on the basis of image information; and storing the semantic information in a data stream of the image on the basis of a preset format. The semantic information of computational photography is thus uniformly standardized in the data stream involved in computational photography.
PCT/CN2021/137246 2021-12-10 2021-12-10 Procédé de traitement de données d'image, terminal intelligent et support de stockage WO2023102935A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/137246 WO2023102935A1 (fr) 2021-12-10 2021-12-10 Procédé de traitement de données d'image, terminal intelligent et support de stockage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/137246 WO2023102935A1 (fr) 2021-12-10 2021-12-10 Procédé de traitement de données d'image, terminal intelligent et support de stockage

Publications (1)

Publication Number Publication Date
WO2023102935A1 true WO2023102935A1 (fr) 2023-06-15

Family

ID=86729485

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/137246 WO2023102935A1 (fr) 2021-12-10 2021-12-10 Procédé de traitement de données d'image, terminal intelligent et support de stockage

Country Status (1)

Country Link
WO (1) WO2023102935A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103053166A (zh) * 2010-11-08 2013-04-17 索尼公司 立体图像数据发送设备、立体图像数据发送方法和立体图像数据接收设备
US20170131090A1 (en) * 2015-11-06 2017-05-11 Intel Corporation Systems, methods, and apparatuses for implementing maximum likelihood image binarization in a coded light range camera
CN111866032A (zh) * 2019-04-11 2020-10-30 阿里巴巴集团控股有限公司 一种数据处理方法、装置以及计算设备
CN113487705A (zh) * 2021-07-14 2021-10-08 上海传英信息技术有限公司 图像标注方法、终端及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103053166A (zh) * 2010-11-08 2013-04-17 索尼公司 立体图像数据发送设备、立体图像数据发送方法和立体图像数据接收设备
US20170131090A1 (en) * 2015-11-06 2017-05-11 Intel Corporation Systems, methods, and apparatuses for implementing maximum likelihood image binarization in a coded light range camera
CN111866032A (zh) * 2019-04-11 2020-10-30 阿里巴巴集团控股有限公司 一种数据处理方法、装置以及计算设备
CN113487705A (zh) * 2021-07-14 2021-10-08 上海传英信息技术有限公司 图像标注方法、终端及存储介质

Similar Documents

Publication Publication Date Title
US11941883B2 (en) Video classification method, model training method, device, and storage medium
WO2021036715A1 (fr) Procédé et appareil de fusion d'image-texte et dispositif électronique
WO2022166765A1 (fr) Procédé de traitement d'images, terminal mobile et support de stockage
CN113556492B (zh) 缩略图生成方法、移动终端及可读存储介质
CN111737520B (zh) 一种视频分类方法、视频分类装置、电子设备及存储介质
CN112181564A (zh) 生成壁纸的方法、移动终端及存储介质
CN114298883A (zh) 图像处理方法、智能终端及存储介质
WO2023010705A1 (fr) Procédé de traitement de données, terminal mobile et support de stockage
CN107743198B (zh) 一种拍照方法、终端及存储介质
WO2024098873A1 (fr) Procédé de traitement, dispositif de traitement et support de stockage
CN113347372A (zh) 拍摄补光方法、移动终端及可读存储介质
WO2023108444A1 (fr) Procédé de traitement d'image, terminal intelligent et support de stockage
WO2023102935A1 (fr) Procédé de traitement de données d'image, terminal intelligent et support de stockage
WO2023284218A1 (fr) Procédé de commande de photographie, terminal mobile et support de stockage
CN113286106B (zh) 录像方法、移动终端及存储介质
WO2022095752A1 (fr) Procédé de démultiplexage de trame, dispositif électronique et support de stockage
CN112532786B (zh) 图像显示方法、终端设备和存储介质
CN114723645A (zh) 图像处理方法、智能终端及存储介质
CN114092366A (zh) 图像处理方法、移动终端及存储介质
CN113901245A (zh) 图片搜索方法、智能终端及存储介质
WO2023108443A1 (fr) Procédé de traitement d'images, terminal intelligent et support de stockage
WO2023102934A1 (fr) Procédé de traitement de données, terminal intelligent et support de stockage
WO2023097446A1 (fr) Procédé de traitement vidéo,terminal intelligent et support de stockage
WO2023108442A1 (fr) Procédé de traitement d'image, terminal intelligent, et support de stockage
WO2023050413A1 (fr) Procédé de traitement d'image, terminal intelligent et support de stockage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21966853

Country of ref document: EP

Kind code of ref document: A1