WO2021068628A1 - An image processing method and system (一种图像处理方法和系统) - Google Patents


Info

Publication number
WO2021068628A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing
text
result
image data
image
Prior art date
Application number
PCT/CN2020/107107
Other languages
English (en)
French (fr)
Inventor
张凯隆
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Publication of WO2021068628A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 - Character recognition

Definitions

  • This specification relates to the field of images, in particular to a method and system for determining the text content of images.
  • One of the embodiments of this specification provides an image processing method, which is executed by a first processing device.
  • the method includes: acquiring image data; performing a first process on the image data within a preset time period to obtain a first result, where the first result includes a result obtained by performing the first process within the preset time period and/or progress information of executing the first process within the preset time period; and sending the first result to a second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result.
  • the system includes: an image acquisition module for acquiring image data; a first processing module for performing first processing on the image data within a preset time period to obtain a first result, where the first result includes a result obtained by executing the first processing within the preset time period and/or progress information of executing the first processing within the preset time period; and a transmission module configured to send the first result to a second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result.
  • the device includes at least one processor and at least one memory; the at least one memory is used to store computer instructions; the at least one processor is used to execute at least part of the computer instructions to implement the above-mentioned image processing method.
  • Another embodiment of this specification provides yet another image processing method, which is executed by a second processing device.
  • the method includes: obtaining a first result, the first result being obtained by a first processing device performing first processing on image data within a preset time period, where the first result includes the result obtained by the first processing device executing the first processing within the preset time period and/or progress information of the first processing device executing the first processing within the preset time period; and performing subsequent processing related to the image data based on the first result.
  • the system includes: an acquisition module for acquiring a first result, the first result being obtained by a first processing device performing first processing on image data within a preset time period, where the first result includes the result obtained by the first processing device executing the first processing within the preset time period and/or progress information of the first processing device executing the first processing within the preset time period; and a subsequent processing module configured to perform subsequent processing related to the image data based on the first result.
  • the device includes at least one processor and at least one memory; the at least one memory is used to store computer instructions; the at least one processor is used to execute at least part of the computer instructions to implement the above-mentioned image processing method.
  • Fig. 1 is an application scenario diagram of an exemplary image processing system according to some embodiments of this specification.
  • Fig. 2 is an exemplary flowchart of an image processing method according to some embodiments of the present specification
  • Fig. 3 is a block diagram of an exemplary image processing system according to some embodiments of the present specification.
  • Fig. 4 is a block diagram of another exemplary image processing system according to some embodiments of the present specification.
  • Fig. 5 is an exemplary flowchart of an image processing method executed by the first processing device according to some embodiments of this specification.
  • Fig. 6 is an exemplary flowchart of an image processing method executed by a second processing device according to some embodiments of this specification.
  • It should be understood that the terms "system", "device", "unit", and/or "module" used in this specification are a way of distinguishing different components, elements, parts, or assemblies at different levels; if other words can achieve the same purpose, these words can be replaced by other expressions.
  • Fig. 1 is an application scenario diagram of an exemplary image processing system according to some embodiments of this specification.
  • the image processing system 100 includes two processing devices capable of image processing: a first processing device and a second processing device.
  • the first processing device performs image processing first, and then the second processing device performs subsequent processing to obtain the final image processing result. Since the two processing devices can perform image processing cooperatively, the processing load can be distributed across both devices, avoiding the excessive pressure that would fall on a single processing device.
  • the image processing system 100 can be applied to various scenarios that may involve image processing, such as image processing scenarios involved in various applications.
  • the applications can include social applications, payment applications, photographing applications, information applications, shopping applications, and various mini-programs, etc.
  • the image processing system 100 may include a server 110, a network 120, a terminal 130, and a storage device 140.
  • the server 110 may be used as the second processing device to receive data and/or information from at least one other component of the image processing system 100, and/or send data and/or information to other components.
  • the server 110 may obtain image data from the terminal 130 and/or the storage device 140.
  • the server 110 may be used to process data and/or information from at least one component of the image processing system 100.
  • the server 110 may receive the image processing result from the terminal 130 and/or the image data from the storage device 140, and perform subsequent processing according to the image processing result of the terminal 130.
  • the server 110 may continue to perform the image detection and recognition processing.
  • the server 110 may also directly perform risk identification according to the processing result of the terminal 130.
  • the server 110 may be a single processing device or a group of processing devices.
  • the processing device group may be a centralized processing device group connected to the network 120 via an access point, or a distributed processing device group respectively connected to the network 120 via at least one access point.
  • the server 110 may be locally connected to the network 120 or remotely connected to the network 120.
  • the server 110 may access information and/or data stored in the terminal 130 and/or the storage device 140 via the network 120.
  • the storage device 140 may be used as a back-end data storage of the server 110.
  • the server 110 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-layer cloud, etc., or any combination thereof.
  • the server 110 may include a processing device 112.
  • the processing device 112 may process information and/or data related to at least one function described in this specification.
  • the processing device 112 may perform the main functions of the image processing system 100.
  • the processing device 112 may perform text detection and recognition on the image to determine the text content in the image.
  • the processing device 112 may include at least one processing unit (for example, a single-core processing device or a multi-core processing device).
  • the processing device 112 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physical processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction set computer (RISC), a microprocessor, etc., or any combination thereof.
  • the network 120 may facilitate the exchange of information and/or data.
  • at least one component in the image processing system 100 may send information and/or data to other components in the image processing system 100 via the network 120.
  • the server 110 may obtain image data from the storage device 140 via the network 120.
  • the recognition result may be sent to the terminal 130 via the network 120.
  • the network 120 may be any form of wired or wireless network, or any combination thereof.
  • the network 120 may include a cable network, a wired network, an optical fiber network, a telecommunication network, an internal network, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), public switched telephone network (PSTN), Bluetooth network, ZigBee network, near field communication (NFC) network, etc. or any combination thereof.
  • the network 120 may include at least one network access point.
  • the network 120 may include wired or wireless network access points, such as base stations and/or Internet exchange points 120-1, 120-2, ..., through which at least one component of the image processing system 100 may be connected to the network 120 to exchange data and/or information.
  • the user can access the image processing system 100 through the terminal 130.
  • the terminal 130 may serve as the first processing device.
  • the terminal 130 can obtain image data in various ways, and perform text detection and recognition on the image data. These image data can be uploaded to the server 110 or the storage device 140, and the text detection and recognition results can be sent to the server 110 for subsequent processing.
  • the terminal 130 may acquire image data through an image acquisition component (such as a camera).
  • the user takes an image through the terminal 130 and uploads the image data to the server 110 or the storage device 140 through the network.
  • the terminal 130 may obtain various image data from the Internet. For example, when the user browses network information through the terminal 130, the image information published on the network can be obtained.
  • image information sent by other users through their terminals can be obtained.
  • the terminal 130 can perform text detection and recognition on the acquired image data.
  • the terminal 130 may use optical character recognition technology to perform text detection and recognition on the image data.
  • a deep learning engine is deployed on the terminal 130 to perform text detection and text recognition.
  • the terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, etc., or any combination thereof.
  • the mobile device 130-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, etc., or any combination thereof.
  • smart home devices may include smart lighting devices, smart electrical appliance control devices, smart monitoring devices, smart TVs, smart cameras, walkie-talkies, etc., or any combination thereof.
  • the wearable device may include a smart bracelet, smart footwear, smart glasses, smart helmets, smart watches, smart clothes, smart backpacks, smart accessories, etc., or any combination thereof.
  • the smart mobile device may include a smart phone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS), etc., or any combination thereof.
  • the virtual reality device and/or augmented virtual reality device may include virtual reality helmets, virtual reality glasses, virtual reality patches, augmented reality helmets, augmented reality glasses, augmented reality patches, etc., or any combination thereof.
  • the virtual reality device and/or augmented reality device may include Google Glass™, Oculus Rift™, HoloLens™, Gear VR™, or the like.
  • the storage device 140 may store data and/or instructions. For example, image data acquired and sent by the terminal 130 may be stored. In some embodiments, the storage device 140 may store data and/or instructions that can be executed by the processing device 112, and the server 110 may execute or use the data and/or instructions to implement the exemplary methods described in this specification. In some embodiments, the storage device 140 may include mass memory, removable memory, volatile read-write memory, read-only memory (ROM), etc., or any combination thereof. Exemplary mass storage devices may include magnetic disks, optical disks, solid state disks, and the like. Exemplary removable storage may include flash drives, floppy disks, optical disks, memory cards, compact disks, magnetic tapes, and the like. An exemplary volatile read-write memory may include random access memory (RAM).
  • Exemplary RAM may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), static random access memory (SRAM), thyristor random access memory (T-RAM), zero-capacitor random access memory (Z-RAM), etc.
  • Exemplary read-only memory may include mask-type read-only memory (MROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM, digital versatile disk read-only memory, etc.
  • the storage device 140 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-layer cloud, etc., or any combination thereof.
  • the storage device 140 may be integrated in the server 110. In other embodiments, the storage device 140 may be integrated in another server different from the server 110.
  • the storage device 140 can be deployed on an image server that can be used to store image data, the terminal 130 can send the acquired image data to the image server for synchronous storage, and the server 110 can acquire the image data from the image server.
  • Fig. 2 is an exemplary flowchart of an image processing method according to some embodiments of the present specification.
  • the process 200 shown in FIG. 2 may be implemented in the image processing system 100 shown in FIG. 1.
  • at least a part of the process 200 may be stored in the storage device 140 as an instruction, and called and/or executed by the server 110 and the terminal 130.
  • Step 210 The first processing device receives image data.
  • the first processing device may be a terminal device, such as the terminal 130.
  • the terminal device may be a mobile terminal device, such as a mobile phone device, a tablet device, or a smart home device; an Internet of Things (IoT) device, such as a face-scanning device or a code-scanning device; or an edge device, such as a router, a switch, or a network access device.
  • the first processing device can acquire image data in various ways, including but not limited to capturing images through an image acquisition component (such as a camera), downloading images from the Internet, loading locally pre-stored images, and receiving images transmitted from other devices.
  • an application program (APP) is installed on the first processing device, and the first processing device can run the APP to obtain image data.
  • for example, it can obtain images posted by official accounts, lifestyle accounts, etc., in social or information APPs.
  • images appearing in individual or group chats in social apps, or images in status information posted by users can be received.
  • images (such as user avatars, etc.) input by the user using the image shooting or uploading function in the APP can be received.
  • the image data may contain text content, and it is necessary to perform text detection and recognition on the image data in order to make a risk judgment on the text content in the image data.
  • the image data may be a pure-text image containing only text content, or an image containing both text content and non-text content.
  • the detection and recognition of text content presented in image format is more difficult and demands more computation; therefore, the following adopts a technical solution in which two processing devices process the image cooperatively.
  • Step 220 The first processing device performs first processing on the image data within a preset time period to obtain a first result.
  • the first processing of the first processing device may include performing text detection on the image data, and may also include performing text recognition on the image data.
  • the text detection processing may include locating the area where the text exists in the image, and determining the bounding box of the text line.
  • the text recognition processing includes recognizing the positioned text and determining the text content.
  • the first processing device may use optical character recognition technology to detect and/or recognize the text in the image data.
  • Optical Character Recognition (OCR) technology uses optical means to determine the text shapes in an image based on detected dark and light patterns, and then converts the text shapes into a text format that can be processed by a computer through character recognition methods.
  • the first processing may also include preprocessing the image data before performing text detection.
  • the preprocessing includes but is not limited to grayscale, geometric transformation, image enhancement, and the like.
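The grayscale and binarization steps mentioned above can be sketched as follows. This is a minimal illustration only, not the implementation claimed in this specification; the pixel format and threshold are assumptions, and a real pipeline would likely use an image library such as OpenCV or Pillow.

```python
def to_grayscale(rgb_pixels):
    """Convert a 2-D grid of (R, G, B) tuples to grayscale values 0-255.

    Uses the common ITU-R BT.601 luminance weights.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_pixels
    ]


def binarize(gray_pixels, threshold=128):
    """Fixed-threshold binarization, often applied before text detection."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray_pixels]


# Tiny 2x2 example image: white, black, dark blue-gray, light gray.
image = [[(255, 255, 255), (0, 0, 0)], [(30, 60, 90), (200, 200, 200)]]
gray = to_grayscale(image)
binary = binarize(gray)
```

Geometric transformation and image enhancement would be further, independent steps in the same pipeline.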
  • the preset duration is used to limit the time for the first processing device to process the image data. It can be understood that, since the first processing device and the second processing device are used for collaborative processing in this specification, the second processing device will continue processing after the first processing device finishes processing. Therefore, it is necessary to set the preset duration for the first processing device. That is, the first processing device only processes the image data within the preset time period. After the preset time period is exceeded, the first processing device may not continue processing, and the second processing device will continue processing.
  • the preset duration may be preset according to related factors. For example, the preset duration may be set according to the performance of the first processing device.
  • the stronger the performance of the first processing device, the shorter the preset duration may be. The preset duration may also be determined according to the complexity of the image processing scene involved: the more complex the scene, the longer the preset duration.
  • the preset duration can be any numerical value, and as an example, it can be 10 seconds, 5 seconds, 4 seconds, 3 seconds, 2 seconds, 1 second, and so on.
  • the first result may include a result obtained by the first processing device executing the first processing within the preset time period, for example, it may include a text detection and recognition result, or only a text detection result.
  • the first result may also include progress information of executing the first process within the preset time period. There are four cases: text detection and text recognition have both been completed; text detection has been completed and no text was found; text detection has been completed but text recognition has not; and neither text detection nor text recognition has been completed. Details can be seen in FIG. 5, which will not be repeated here.
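As a hedged sketch (not the specification's own code), the first processing under a preset duration, together with its progress cases, can be modeled as follows; `detect_text` and `recognize_text` are hypothetical stand-ins for an on-device OCR engine.

```python
import time
from enum import Enum


class Progress(Enum):
    NONE_DONE = "detection and recognition not completed"
    DETECTED_NO_TEXT = "detection completed, no text found"
    DETECTED_ONLY = "detection completed, recognition not completed"
    ALL_DONE = "detection and recognition completed"


def first_process(image_data, detect_text, recognize_text, budget_s=3.0):
    """Run text detection then recognition, stopping once the budget is spent.

    Returns a (progress, payload) pair -- the "first result": recognized
    text, detected text-box positions, or nothing, plus progress info.
    """
    deadline = time.monotonic() + budget_s
    if time.monotonic() >= deadline:
        return Progress.NONE_DONE, None
    boxes = detect_text(image_data)  # bounding boxes of text lines
    if not boxes:
        return Progress.DETECTED_NO_TEXT, None
    if time.monotonic() >= deadline:
        return Progress.DETECTED_ONLY, boxes
    text = recognize_text(image_data, boxes)
    return Progress.ALL_DONE, text


# Demo with trivial stand-in detectors.
progress, payload = first_process(
    "raw image bytes",
    detect_text=lambda img: [(0, 0, 100, 20)],  # one text box found
    recognize_text=lambda img, boxes: "recognized text",
)
```

A production implementation would run detection and recognition on worker threads and interrupt them at the deadline rather than checking the clock between stages.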
  • Step 230 The first processing device sends the first result to the second processing device.
  • the second processing device may be a server, such as the server 110 shown in FIG. 1.
  • the first processing device (such as the terminal 130 shown in FIG. 1) may send the first processing result to the second processing device (such as the server 110) through a network (such as the network 120 shown in FIG. 1).
  • the first processing device may also send the first processing result to the storage device (the storage device 140 shown in FIG. 1) via the network, and the second processing device obtains the first processing result from the storage device.
  • the second processing device needs to use the image data in addition to the first result when performing the subsequent processing in step 240.
  • the first processing device may also send the image data to the second processing device.
  • the first processing device may directly send the image data to the second processing device, or may send an image identification.
  • the image identification includes but is not limited to the encoding of the image (such as a string of randomly generated character strings).
  • the first processing device may send the acquired image data and the corresponding image identifier to a storage device (the storage device 140 shown in FIG. 1) for storage.
  • when the second processing device needs to acquire the image data for subsequent processing, it may acquire the image data from the storage device according to the corresponding image identifier sent by the first processing device.
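A minimal sketch of this identifier-based handoff, with an in-memory dictionary standing in for the storage device (140 in FIG. 1); the class and method names are illustrative assumptions, not part of the specification.

```python
import uuid


class ImageStore:
    """In-memory stand-in for the shared storage device."""

    def __init__(self):
        self._images = {}

    def put(self, image_bytes):
        image_id = uuid.uuid4().hex  # randomly generated identifier string
        self._images[image_id] = image_bytes
        return image_id

    def get(self, image_id):
        return self._images.get(image_id)


# First processing device: store the image, send only the identifier.
store = ImageStore()
image_id = store.put(b"raw image bytes")
message_to_server = {"image_id": image_id, "first_result": "detected boxes"}

# Second processing device: fetch the image by identifier when needed.
image_bytes = store.get(message_to_server["image_id"])
```

Sending only the identifier keeps the first-result message small; the second processing device pays the cost of fetching the full image only when its subsequent processing actually requires it.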
  • Step 240 The second processing device performs subsequent processing related to the image data according to the first processing result.
  • the subsequent processing includes a part of the first processing that the first processing device fails to complete within a preset period of time.
  • in response to the first result indicating that there is no text, the second processing device may not perform subsequent processing.
  • the second processing device may perform different subsequent processing according to different first results. For example, in response to the first result including progress information indicating that text detection and text recognition have been completed, together with the recognized text content, the subsequent processing of the second processing device may be to perform risk judgment processing on the text content.
  • in response to the first result including progress information indicating that text detection has been completed but text recognition has not, together with position information of the text in the image data, the subsequent processing of the second processing device may be to acquire the image data, recognize the text content from the image data based on the position information, and perform risk judgment processing on the text content.
  • in response to the first result including progress information indicating that neither text detection nor text recognition has been completed, the subsequent processing of the second processing device may include acquiring the image data, performing text detection processing and text recognition processing on the image data, and performing risk judgment processing on the recognized text content.
  • a preset duration may also be set for the subsequent processing of the second processing device, such as 10 seconds, 5 seconds, 4 seconds, 3 seconds, 2 seconds, 1 second, etc. If no text result is obtained after the set duration is exceeded, the image data can be reported or other special processing can be performed. For more content about the subsequent processing performed by the second processing device, refer to FIG. 6 and its description, which will not be repeated here.
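The server-side dispatch described above can be sketched as follows, assuming progress is encoded as a simple string and that all callables (`fetch_image`, `detect`, `recognize`, `assess_risk`) are hypothetical stand-ins for the second processing device's own OCR and risk-judgment components.

```python
def subsequent_process(progress, payload, fetch_image, detect, recognize, assess_risk):
    """Pick up processing where the first device left off, then judge risk."""
    if progress == "no_text":
        return None                       # nothing to assess
    if progress == "all_done":
        return assess_risk(payload)       # payload is the recognized text
    image = fetch_image()                 # image needed for remaining OCR work
    if progress == "detected_only":
        text = recognize(image, payload)  # payload is the box positions
    else:                                 # nothing was completed on-device
        text = recognize(image, detect(image))
    return assess_risk(text)
```

Each branch repeats only the work the first device did not finish, which is the point of reporting progress information rather than a bare result.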
  • the subsequent processing may further include performing risk recognition on the text content in the image data to obtain a text risk recognition result.
  • the image data may contain bad text content involving pornography, gambling, drugs, violence, terror, vulgarity, and the like. Therefore, it is necessary to use technical means to identify image data with such bad text content, and then perform processing such as reminding or shielding.
  • Images that may be at risk can be processed in different ways according to the situation; for example, the image can be deleted, publication of the image can be prohibited, or the image can be made visible only to the publisher, and so on.
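As a toy illustration only (the specification's risk identification would rely on text-mining models rather than a fixed wordlist, and the keywords below are made up for the example), keyword-based screening of recognized text might look like:

```python
# Hypothetical blocklist; a production system would use text-mining / ML models.
RISK_KEYWORDS = {"gambling", "violence"}


def assess_text_risk(text):
    """Return the set of risk keywords found and a suggested action."""
    hits = {w for w in RISK_KEYWORDS if w in text.lower()}
    if not hits:
        return hits, "publish"
    return hits, "block"


hits, action = assess_text_risk("Join our Gambling site!")
```

The suggested action here is binary; a real system would map hits to graded risk levels and route each level to a different handling policy (delete, restrict visibility, manual review).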
  • the risk identification can be applied to all scenarios involving content risk prevention and control, such as social chat, content posted by online accounts, upload of user information (for example, avatar information, nickname information, etc.).
  • Fig. 3 is a block diagram of an exemplary image processing system according to some embodiments of the present specification.
  • the image processing system can be implemented on a first processing device (the terminal 130 shown in FIG. 1).
  • the image processing system 300 may include an image acquisition module 310, a first processing module 320, and a transmission module 330.
  • the image acquisition module 310 is used to acquire image data.
  • the image acquisition module 310 can acquire image data in various ways, including but not limited to capturing images through an image acquisition component (such as a camera), downloading images from the Internet, loading locally pre-stored images, and receiving images transmitted from other devices, etc.
  • the image acquisition module 310 can acquire image data by running an APP.
  • the image data may contain text content, and text detection and recognition need to be performed on the image data.
  • the image data may be a pure-text image containing only text content, or an image containing both text content and non-text content.
  • the first processing module 320 is configured to perform first processing on image data.
  • the first processing module 320 may perform text detection on image data, and may also perform text recognition on image data.
  • the first processing module 320 uses OCR technology to detect and/or recognize text in the image data.
  • before performing text detection and recognition on the image data, the first processing module 320 may also preprocess the image data.
  • a preset duration can be set for the first processing of the first processing module 320 to limit the time of the first processing.
  • the first processing module 320 may obtain the first result after performing the first processing on the image data.
  • the first result may include, for example, text detection and recognition results, or only text detection results; the first result may also include progress information of the first processing performed by the first processing module 320 within the preset time period.
  • the transmission module 330 can be used to transmit data.
  • the transmission module 330 may send the first result obtained by the first processing module 320 to perform the first processing within a preset period of time to the second processing device.
  • the transmission module 330 may also send image data to the second processing device.
  • the transmission module 330 may directly send the first result and/or image data to the second processing device, or may first send them to the storage device (the storage device 140 shown in FIG. 1), from which the second processing device obtains them.
  • Fig. 4 is a block diagram of an exemplary image processing system according to some embodiments of the present specification.
  • the image processing system 400 can be implemented on a second processing device (the server 110 shown in FIG. 1). As shown in FIG. 4, the image processing system 400 may include an acquisition module 410 and a subsequent processing module 420.
  • the obtaining module 410 is used to obtain information.
  • the obtaining module 410 may obtain the first result obtained by the first processing device (as shown in FIG. 3) performing the first processing within a preset time period.
  • the acquiring module 410 may also acquire the image data acquired by the first processing device (the image acquiring module 310 shown in FIG. 3).
  • the subsequent processing module 420 is configured to perform subsequent processing related to the image data according to the first result obtained by the first processing device (the first processing module 320 shown in FIG. 3) performing the first processing on the image data.
  • the post-processing module 420 may include a text processing unit 422 and a risk analysis unit 424.
  • the text processing unit 422 may execute the part of the first processing that the first processing module 320 failed to complete within the preset time period, for example, the text detection and/or text recognition processing that the first processing module 320 failed to complete, so as to determine the text content in the image data.
  • the risk analysis unit 424 is used to perform risk analysis on the identified text content.
  • the risk analysis unit 424 may perform risk analysis on the text content through text mining technology to determine whether the text content is risky or the risk level of the text content, so as to take measures on the corresponding image data, such as deleting the image and prohibiting publication. Images, images visible only to the publisher, etc.
  • It should be understood that the systems and modules shown in FIG. 3 and FIG. 4 can be implemented in various ways.
  • For example, the system and its modules may be implemented by hardware, software, or a combination of software and hardware.
  • The hardware part can be implemented with dedicated logic; the software part can be stored in a memory and executed by an appropriate instruction execution system, such as a microprocessor, or by specially designed hardware.
  • Those skilled in the art will understand that the methods and systems above may be implemented using computer-executable instructions and/or processor control code, for example provided on a carrier medium such as a disk, CD, or DVD-ROM, on programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier.
  • The system and its modules in this specification may be implemented not only by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices, but also by software executed by various types of processors, or by a combination of the above hardware circuits and software (for example, firmware).
  • It should be noted that the above description of the image processing systems 300 and 400 and their modules is only for convenience of description and does not limit this specification to the scope of the embodiments mentioned. It can be understood that, after understanding the principle of the system, those skilled in the art may arbitrarily combine the various modules, or form a subsystem connected with other modules, without departing from this principle.
  • For example, the image acquisition module 310, the first processing module 320, and the transmission module 330 disclosed in FIG. 3 may be different modules in one system, or one module may implement the functions of two or more of these modules.
  • Similarly, the modules disclosed in FIG. 4 may be different modules in one system, or one module may implement the functions of two or more of them.
  • As another example, the image processing system 300 and/or 400 may further include a communication module to communicate with other components.
  • The modules in the image processing system 300 and/or 400 may share one storage module, or each module may have its own storage module. All such variations are within the protection scope of this specification.
  • Fig. 5 is an exemplary flowchart of an image processing method executed by the first processing device according to some embodiments of the present specification.
  • In some embodiments, the process 500 shown in FIG. 5 may be implemented in the image processing system 100 shown in FIG. 1.
  • For example, at least a part of the process 500 may be stored in the storage device 140 in the form of instructions, and invoked and/or executed by the terminal 130.
  • Step 510: Obtain image data. This step has been described in detail in step 210 and is not repeated here.
  • Step 520: Perform text detection and text recognition on the image data within the preset period to obtain a first result.
  • In some embodiments, the first processing device may use OCR technology to perform text detection and recognition on the acquired image data.
  • In some embodiments, a set of deep learning engines may be deployed in the first processing device to perform the text detection and recognition processing.
  • In some embodiments, the first result includes text detection and recognition results. For example, it may be that there is no text in the image data, or it may be the text content recognized from the image. In some embodiments, the first result includes only text detection results and no text recognition results.
  • The text detection result may include the position information of the text in the image data, for example the bounding boxes of text lines, or the abscissa and ordinate of the text.
  • In some embodiments, the first result may also include progress information of the text detection and text recognition performed by the first processing device on the image data within the preset period. The progress information may reflect whether the first processing device completed the text detection and recognition of the image data within the preset period.
  • Merely by way of example, the progress information may cover four cases: text detection completed and no text found; both text detection and text recognition completed; text detection completed but text recognition not completed; and neither text detection nor text recognition completed. In some embodiments, the different first results may be represented in a specific form, for example by hasEdgeResult and its value: hasEdgeResult=0 means text detection is completed and there is no text; hasEdgeResult=1 means both text detection and text recognition are completed and text content was recognized; hasEdgeResult=2 means neither text detection nor text recognition was completed; hasEdgeResult=3 means text detection is completed but text recognition is not.
  • In some embodiments, the preset period is a preset time threshold, which can be any time value such as 0.25 s, 0.5 s, 1 s, 1.5 s, 2 s, 3 s, or 5 s. In some embodiments, the preset period may be adjusted according to related factors such as image size, image specification, image category, and image source.
  • In some embodiments, the first processing device may be a terminal device.
  • Using a terminal device to perform text detection and recognition on the image data can reduce the amount of computation the server spends on subsequent processing of the image data, relieving the computing pressure on the server to a certain extent.
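The step-520 behavior under the preset period can be sketched as follows. The hasEdgeResult codes 0 to 3 follow the convention described in the text; the `detect`/`recognize` callables and the dict field names are stand-ins for the on-device OCR engines, not an implementation from the specification:

```python
import time

def first_processing(image, detect, recognize, budget_s=1.0):
    """Sketch of the terminal-side first processing under a preset duration.

    `detect` returns None if it could not finish, an empty list if no text
    was found, or a list of text-line boxes. `recognize` turns boxes into
    text content. The budget (preset period) caps the total time spent.
    """
    deadline = time.monotonic() + budget_s
    boxes = detect(image)
    if boxes is None:                    # detection itself did not finish
        return {"hasEdgeResult": 2, "result": None}
    if not boxes:                        # detection finished, no text found
        return {"hasEdgeResult": 0, "result": None}
    if time.monotonic() >= deadline:     # no time left for recognition
        return {"hasEdgeResult": 3, "result": boxes}
    return {"hasEdgeResult": 1, "result": recognize(image, boxes)}
```

Whatever this returns is the "first result" handed to the second processing device in step 530.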
  • Step 530: Send the first result to the second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result.
  • As described in step 520, different first results can be represented in a specific form, such as hasEdgeResult and its value. For convenience, the following uses hasEdgeResult to describe how the first processing device sends the first result to the second processing device. If hasEdgeResult=0, text detection is completed and there is no text; in this case the result the first processing device sends to the second processing device may be empty. If hasEdgeResult=1, both text detection and text recognition are completed and text content was recognized; in this case the first processing device may send the recognized text content to the second processing device. If hasEdgeResult=2, neither text detection nor text recognition was completed, meaning both failed on the first processing device; in this case the first processing device may send the image data to the second processing device, which performs text detection and recognition itself. If hasEdgeResult=3, text detection is completed but text recognition is not; in this case the first processing device may send the text detection result together with the image data to the second processing device.
  • In some embodiments, the first processing device may transmit the image identifier, the hasEdgeResult value, and the result obtained by performing the first processing (such as the recognized text content or the detected text positions) to the second processing device, and upload the image data it received to the storage device (e.g., the storage device 140 in FIG. 1) for synchronization; the second processing device may then obtain the corresponding image data from the storage device based on the image identifier for subsequent processing.
  • Note that, besides the first processing device sending the first result and/or the image data directly to the second processing device as above, in some embodiments the first processing device may also upload these data to the storage device (e.g., the storage device 140 shown in FIG. 1), from which the second processing device obtains them.
  • Taking the transfer of image data as an example, in some embodiments the first processing device may generate an image data identifier, upload the image data and its identifier to the storage device, and send the identifier to the second processing device; the second processing device may then obtain the image data from the storage device according to the image data identifier.
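The identifier-based hand-off described above can be sketched as follows. The message field names (`image_id`, `result`) and the in-memory dict standing in for the storage device are assumptions made for illustration:

```python
import uuid

# In-memory stand-in for the shared storage device (the storage device 140).
storage = {}

def terminal_send(image_bytes, has_edge_result, result):
    """Terminal side: synchronize the image to storage and build the message.

    The terminal generates the identifier, uploads the image under it, and
    sends only the identifier, the hasEdgeResult value, and the first-processing
    result to the server.
    """
    image_id = uuid.uuid4().hex
    storage[image_id] = image_bytes
    return {"image_id": image_id, "hasEdgeResult": has_edge_result, "result": result}

def server_fetch_image(message):
    """Server side: retrieve the image from storage by its identifier."""
    return storage[message["image_id"]]
```

This keeps the (large) image bytes off the terminal-to-server message path whenever the server only needs the first result.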
  • Fig. 6 is an exemplary flowchart of an image processing method executed by a second processing device according to some embodiments of this specification.
  • In some embodiments, the process 600 shown in FIG. 6 may be implemented in the image processing system 100 shown in FIG. 1.
  • For example, at least a part of the process 600 may be stored in the storage device 140 in the form of instructions, and invoked and/or executed by the server 110.
  • Step 610: Obtain the first result sent by the first processing device.
  • For more about the first result and how the first processing device sends it to the second processing device, see steps 520 and 530 in FIG. 5; this is not repeated here.
  • Steps 620 to 650 are the subsequent processing that the second processing device performs on the image data according to the first result sent by the first processing device.
  • The second processing device can perform text detection and recognition on the image data, and can also perform risk analysis on the recognized text content.
  • In some embodiments, the second processing device may use OCR technology for text detection and recognition.
  • In some embodiments, a set of deep learning engines may be pre-deployed on the second processing device to perform text detection, text recognition, and/or risk judgment.
  • In some embodiments, the capability of the deep learning engines in the second processing device is greater than that of the deep learning engines in the first processing device, so the second processing device can detect and recognize image data that the first processing device failed to detect or recognize, and can perform other processing. The subsequent processing of the image data by the second processing device under the different first results is described below.
  • Step 620: In response to the first result being that there is no text, no subsequent processing is performed. In some embodiments, if the first processing device completes text detection on the image data within the preset period and the detection result is that there is no text (the case represented by hasEdgeResult=0), the second processing device performs no further processing.
  • With the method of the embodiments of this specification, the first processing device processes the image data before the second processing device does, which can effectively reduce the computing pressure on the second processing device; how much the pressure is reduced depends on the specific conditions of the image data. For example, images without text may account for more than 50% of all images. For such images, the first processing device may, after text detection and recognition, successfully determine that they contain no text, so that the second processing device needs no subsequent processing for them. If the first processing device successfully identifies all text-free images, then compared with a scheme in which the second processing device alone performs text detection and recognition on every image, the computing pressure on the second processing device can be reduced by at least 50%.
  • Step 630: In response to the first result including progress information that both text detection and text recognition have been completed, together with the recognized text content, perform risk judgment on the text content.
  • In some embodiments, if the first processing device successfully completes text detection and recognition of the image data within the preset period and recognizes the text content (the case represented by hasEdgeResult=1), it may send the recognized text content to the second processing device, which then performs risk judgment directly without detecting or recognizing again; this also reduces the computing pressure on the second processing device.
  • In some embodiments, the second processing device may perform risk judgment on the recognized text content through text mining. Text mining may use word segmentation to recombine the recognized text content into a word sequence according to certain specifications, and then judge whether undesirable content exists based on the segmentation result.
  • Word segmentation algorithms include, but are not limited to, algorithms based on dictionary matching, algorithms based on semantic analysis, and algorithms based on probabilistic statistical models.
  • In some embodiments, a text mining model may be obtained through training and used to make risk judgments on the recognized text content.
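As a minimal sketch of the dictionary-matching family of segmentation algorithms mentioned above, combined with a blocklist check, one might write the following. The vocabulary and blocklist contents are placeholders, and a real system would use a trained text mining model as described:

```python
def forward_max_match(text, vocab, max_word_len=4):
    """Toy dictionary-based segmenter (forward maximum matching).

    At each position, take the longest dictionary word starting there;
    if none matches, emit a single character and move on.
    """
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens

def judge_risk(text, vocab, blocklist):
    """Flag the recognized text if any segmented word is on the blocklist."""
    return any(token in blocklist for token in forward_max_match(text, vocab))
```

Dictionary matching is only one of the three segmentation families named; probabilistic or semantic segmenters would slot into `forward_max_match`'s place without changing `judge_risk`.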
  • Step 640: In response to the first result including progress information that text detection has been completed but text recognition has not, together with position information of the text in the image data, obtain the image data, recognize text content from the image data based on the position information, and perform risk judgment on the text content.
  • In some embodiments, if the first processing device successfully completes text detection within the preset period but does not complete text recognition, the second processing device needs to perform text recognition, but does not need to perform text detection again, which also reduces the computing pressure on the second processing device.
  • In some embodiments, the first processing device may send the text position information determined by text detection to the second processing device; the position information may be described by values such as the abscissa and the ordinate.
  • In addition, the second processing device may obtain the image data, which may be sent directly by the first processing device to the second processing device, or first sent by the first processing device to the storage device for storage and then obtained from the storage device by the second processing device.
  • Based on the image data and the text position information obtained by the first processing device, the second processing device can locate the text in the image data and perform text recognition to obtain the recognized text content.
  • The risk judgment performed on the text content is similar to that described in step 630 and is not repeated here.
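The recognize-only-at-detected-positions idea of step 640 can be sketched as follows, assuming a simplified image representation (a list of pixel rows) and boxes given as `(x0, y0, x1, y1)` coordinates, as hinted by the abscissa/ordinate description; the `recognize_line` callable stands in for the server's recognition engine:

```python
def crop(image, box):
    """Crop a text region from an image stored as a list of pixel rows.

    `box` is (x0, y0, x1, y1) with exclusive upper bounds; this flat
    representation is a simplification for illustration only.
    """
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

def recognize_at(image, boxes, recognize_line):
    """Run line recognition only inside the boxes found by text detection."""
    return [recognize_line(crop(image, box)) for box in boxes]
```

Because detection already happened on the terminal, the server touches only the cropped regions instead of re-scanning the whole image.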
  • Step 650: In response to the first result including progress information that neither text detection nor text recognition has been completed, obtain the image data, perform text detection and text recognition on it, and perform risk judgment on the recognized text content.
  • In some embodiments, if the first processing device does not complete text detection and text recognition within the preset period, the second processing device needs to perform text detection and recognition anew. It can be understood that the processing capability of the first processing device is limited and cannot handle all image data; the processing capability of the second processing device is greater, so the second processing device may be able to process the image data that the first processing device failed to process. For image data on which the first processing device failed to complete text detection and text recognition, the second processing device needs to obtain the image data and perform text detection and recognition again.
  • The risk judgment performed on the text content is similar to that described in step 630 and is not repeated here.
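Steps 620 to 650 together amount to a dispatch on the terminal's first result, which can be sketched as a single function. The dict field names and the callables standing in for the server's deep learning engines and risk model are illustrative assumptions:

```python
def server_followup(first_result, fetch_image, detect, recognize, judge_risk):
    """Dispatch the server-side follow-up on the terminal's first result.

    `first_result` carries a hasEdgeResult code as described in the text:
    0 = no text, 1 = text recognized, 2 = nothing completed,
    3 = detected but not recognized.
    """
    code = first_result["hasEdgeResult"]
    if code == 0:                                   # step 620: nothing to do
        return None
    if code == 1:                                   # step 630: judge directly
        return judge_risk(first_result["result"])
    image = fetch_image(first_result["image_id"])   # needed for codes 2 and 3
    if code == 3:                                   # step 640: recognize only
        boxes = first_result["result"]
        return judge_risk(recognize(image, boxes))
    return judge_risk(recognize(image, detect(image)))  # step 650: redo both
```

The load-shedding benefit discussed above is visible here: the server's engines run only on the branches the terminal could not finish.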
  • The possible beneficial effects of the embodiments of this specification include, but are not limited to: splitting the text detection and text recognition of image data so that the terminal device processes first and the server performs the subsequent processing, thereby making full use of terminal resources and reducing the computing pressure carried by the server. It should be noted that different embodiments may yield different beneficial effects; in different embodiments, the possible beneficial effects may be any one or a combination of the above, or any other beneficial effect that may be obtained.
  • The computer storage medium may contain a propagated data signal containing computer program code, for example on a baseband or as part of a carrier wave.
  • The propagated signal may take many forms, including electromagnetic forms, optical forms, etc., or a suitable combination.
  • The computer storage medium may be any computer-readable medium other than a computer-readable storage medium that, by connecting to an instruction execution system, apparatus, or device, can communicate, propagate, or transmit the program for use.
  • Program code on a computer storage medium may be transmitted through any suitable medium, including radio, cable, fiber-optic cable, RF, or similar media, or any combination of the above.
  • The computer program code required for the operation of each part of this specification can be written in any one or more programming languages, including object-oriented languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python; conventional procedural languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP; dynamic languages such as Python, Ruby, and Groovy; or other programming languages.
  • The program code may run entirely on the user's computer, as an independent software package on the user's computer, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or processing device.
  • The remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), connected to an external computer (for example, via the Internet), deployed in a cloud computing environment, or used as a service such as software as a service (SaaS).
  • Numbers describing quantities of components and attributes are used in places. It should be understood that such numbers used in describing the embodiments are, in some examples, modified by "about," "approximately," or "substantially." Unless otherwise stated, "about," "approximately," or "substantially" indicates that the number is allowed to vary by ±20%.
  • Accordingly, in some embodiments, the numerical parameters used in the description and claims are approximations, which may change according to the characteristics required by individual embodiments. In some embodiments, numerical parameters should take into account the specified significant digits and use a general digit-retention method. Although the numerical ranges and parameters used to confirm the breadth of the ranges in some embodiments of this specification are approximations, in specific embodiments such values are set as precisely as is feasible.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Character Discrimination (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

This specification provides an image processing method and system. The method includes: a first processing device obtains image data; the first processing device performs first processing on the image data within a preset period to obtain a first result, the first result including the result obtained by performing the first processing within the preset period and/or progress information of performing the first processing within the preset period; the first processing device sends the first result to a second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result; the second processing device obtains the first result; and the second processing device performs, based on the first result, the subsequent processing related to the image data.

Description

An image processing method and system

Technical field
This specification relates to the field of images, and in particular to a method and system for determining the text content of images.
Background
With the rapid development of Internet technology, every aspect of people's lives has become inseparable from the Internet. However, because the sources of information content on the Internet are complex, it is necessary to identify risks in that content through risk identification techniques, so as to guarantee safety during Internet use. This includes risk identification of the text information carried by images. At present, risk identification of text carried by images requires substantial computing power, so this work still mainly relies on the server side, which has stronger processing capability, and this places considerable computing pressure on the server side.
Therefore, it is necessary to provide an image processing method that relieves the computing pressure borne by the server side during image detection and recognition.
Summary
One embodiment of this specification provides an image processing method, the method being executed by a first processing device. The method includes: obtaining image data; performing first processing on the image data within a preset period to obtain a first result, the first result including the result obtained by performing the first processing within the preset period and/or progress information of performing the first processing within the preset period; and sending the first result to a second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result.
Another embodiment of this specification provides an image processing system. The system includes: an image acquisition module for obtaining image data; a first processing module for performing first processing on the image data within a preset period to obtain a first result, the first result including the result obtained by performing the first processing within the preset period and/or progress information of performing the first processing within the preset period; and a transmission module for sending the first result to a second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result.
Another embodiment of this specification provides an image processing apparatus. The apparatus includes at least one processor and at least one memory; the at least one memory is used to store computer instructions; the at least one processor is used to execute at least part of the computer instructions to implement the image processing method above.
Another embodiment of this specification provides yet another image processing method, the method being executed by a second processing device. The method includes: obtaining a first result, the first result being obtained by a first processing device performing first processing on image data within a preset period, and including the result obtained by the first processing device performing the first processing within the preset period and/or progress information of the first processing device performing the first processing within the preset period; and performing, based on the first result, subsequent processing related to the image data.
Another embodiment of this specification provides yet another image processing system. The system includes: an acquisition module for obtaining a first result, the first result being obtained by a first processing device performing first processing on image data within a preset period, and including the result obtained by the first processing device performing the first processing within the preset period and/or progress information of the first processing device performing the first processing within the preset period; and a subsequent processing module for performing, based on the first result, subsequent processing related to the image data.
Another embodiment of this specification provides yet another image processing apparatus. The apparatus includes at least one processor and at least one memory; the at least one memory is used to store computer instructions; the at least one processor is used to execute at least part of the computer instructions to implement the image processing method above.
Brief description of the drawings
This specification is further illustrated by way of exemplary embodiments, which are described in detail with reference to the accompanying drawings. These embodiments are not limiting; in these embodiments, the same reference number denotes the same structure, wherein:
Fig. 1 is an application scenario diagram of an exemplary image processing system according to some embodiments of this specification;
Fig. 2 is an exemplary flowchart of an image processing method according to some embodiments of this specification;
Fig. 3 is a block diagram of an exemplary image processing system according to some embodiments of this specification;
Fig. 4 is a block diagram of another exemplary image processing system according to some embodiments of this specification;
Fig. 5 is an exemplary flowchart of the image processing method executed by the first processing device according to some embodiments of this specification; and
Fig. 6 is an exemplary flowchart of the image processing method executed by the second processing device according to some embodiments of this specification.
Detailed description
To more clearly explain the technical solutions of the embodiments of this specification, the following briefly introduces the drawings used in describing the embodiments. Obviously, the drawings described below are only some examples or embodiments of this specification; those of ordinary skill in the art can, without creative effort, apply this specification to other similar scenarios based on these drawings. Unless obvious from the context or otherwise stated, the same reference number in the figures denotes the same structure or operation.
It should be understood that "system," "apparatus," "unit," and/or "module" as used herein are ways of distinguishing different components, elements, parts, portions, or assemblies at different levels. However, these words may be replaced by other expressions that achieve the same purpose.
As used in this specification and the claims, unless the context clearly indicates otherwise, words such as "a," "an," "one," and/or "the" do not specifically refer to the singular and may also include the plural. In general, the terms "include" and "comprise" only indicate that explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements.
Flowcharts are used in this specification to illustrate the operations performed by systems according to embodiments of this specification. It should be understood that the preceding or following operations are not necessarily executed precisely in order. Instead, the steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to these procedures, or one or more steps may be removed from them.
Fig. 1 is an application scenario diagram of an exemplary image processing system according to some embodiments of this specification.
The image processing system 100 includes two processing devices capable of image processing: a first processing device and a second processing device. The first processing device performs image processing first, and the second processing device then performs the subsequent processing to obtain the final image processing result. Because the two processing devices can cooperate in image processing, the processing pressure can be distributed between them, avoiding the excessive pressure of using a single processing device alone. The image processing system 100 can be applied to various scenarios that may involve image processing, for example the image processing scenarios involved in various applications, which may include social applications, payment applications, photography applications, news applications, shopping applications, various mini-programs, and so on. As shown in Fig. 1, the image processing system 100 may include a server 110, a network 120, a terminal 130, and a storage device 140.
The server 110 may serve as the second processing device, for receiving data and/or information from at least one other component of the image processing system 100 and/or sending data and/or information to other components. For example, the server 110 may obtain image data from the terminal 130 and/or the storage device 140. The server 110 may process data and/or information from at least one component of the image processing system 100. For example, the server 110 may receive an image processing result from the terminal 130 and/or image data from the storage device 140, and perform subsequent processing according to the image processing result of the terminal 130. Merely by way of example, if the processing result of the terminal 130 indicates that the terminal 130 has not completed the detection and recognition of the text in an image, the server 110 may continue the image detection and recognition processing. As another example, if the processing result of the terminal 130 indicates that the terminal 130 has completed the detection and recognition of the text in the image, the server 110 may directly perform risk identification based on the processing result of the terminal 130.
In some embodiments, the server 110 may be a single processing device or a group of processing devices. The processing device group may be a centralized group connected to the network 120 via an access point, or a distributed group whose members are connected to the network 120 via at least one access point respectively. In some embodiments, the server 110 may be locally or remotely connected to the network 120. For example, the server 110 may access information and/or data stored in the terminal 130 and/or the storage device 140 via the network 120. As another example, the storage device 140 may serve as back-end data storage for the server 110. In some embodiments, the server 110 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tier cloud, or the like, or any combination thereof.
In some embodiments, the server 110 may include a processing device 112. The processing device 112 may process information and/or data related to at least one function described in this specification. In some embodiments, the processing device 112 may perform the main functions of the image processing system 100. For example, the processing device 112 may perform text detection and recognition on an image to determine the text content in the image. In some embodiments, the processing device 112 may include at least one processing unit (for example, a single-core or multi-core processing device). Merely by way of example, the processing device 112 includes a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction-set computer (RISC), a microprocessor, or the like, or any combination thereof.
The network 120 may facilitate the exchange of information and/or data. In some embodiments, at least one component of the image processing system 100 (for example, the server 110, the terminal 130, or the storage device 140) may send information and/or data to other components of the image processing system 100 via the network 120. For example, the server 110 may obtain image data from the storage device 140 via the network 120. As another example, after completing image detection and recognition, the server 110 may send the recognition result to the terminal 130 via the network 120.
In some embodiments, the network 120 may be any form of wired or wireless network, or any combination thereof. Merely by way of example, the network 120 may include a cable network, a wired network, a fiber-optic network, a telecommunications network, an internal network, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near-field communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network 120 may include at least one network access point. For example, the network 120 may include wired or wireless network access points, such as base stations and/or Internet exchange points 120-1, 120-2, ..., through which at least one component of the image processing system 100 may connect to the network 120 to exchange data and/or information.
A user may access the image processing system 100 through the terminal 130. In some embodiments, the terminal 130 may serve as the first processing device. The terminal 130 may obtain image data in various ways and perform text detection and recognition on the image data. The image data may be uploaded to the server 110 or the storage device 140, and the text detection and recognition results may be sent to the server 110 for subsequent processing. For example, the terminal 130 may obtain image data through an image acquisition component (such as a camera). In some embodiments, after the user takes an image through the terminal 130, the image data is uploaded to the server 110 or the storage device 140 via the network. As another example, the terminal 130 may obtain various image data from the Internet. For example, when the user browses online information through the terminal 130, image information published on the network may be obtained. In some embodiments, when the user communicates through the terminal 130 with other users' terminals, image information sent by the other users through their terminals may be obtained.
The terminal 130 may perform text detection and recognition on the obtained image data. In some embodiments, the terminal 130 may use optical character recognition technology to perform text detection and recognition on the image data. In some embodiments, a deep learning engine is deployed on the terminal 130 to perform the text detection and text recognition.
The terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, etc., or any combination thereof. In some embodiments, the mobile device 130-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, etc., or any combination thereof. In some embodiments, the smart home device may include smart lighting devices, smart appliance control devices, smart monitoring devices, smart televisions, smart cameras, intercoms, etc., or any combination thereof. In some embodiments, the wearable device may include smart bracelets, smart footwear, smart glasses, smart helmets, smart watches, smart clothing, smart backpacks, smart accessories, etc., or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS), etc., or any combination thereof. In some embodiments, the virtual reality device and/or augmented reality device may include virtual reality helmets, virtual reality glasses, virtual reality patches, augmented reality helmets, augmented reality glasses, augmented reality patches, etc., or any combination thereof. For example, the virtual reality device and/or augmented reality device may include Google Glass TM, Oculus Rift TM, Hololens TM, Gear VR TM, etc.
The storage device 140 may store data and/or instructions. For example, it may store the image data obtained and sent by the terminal 130. In some embodiments, the storage device 140 may store data and/or instructions executable by the processing device 112, and the server 110 may execute or use the data and/or instructions to implement the exemplary methods described in this specification. In some embodiments, the storage device 140 may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), etc., or any combination thereof. Exemplary mass storage may include magnetic disks, optical disks, solid-state disks, etc. Exemplary removable storage may include flash drives, floppy disks, optical disks, memory cards, compact disks, magnetic tapes, etc. Exemplary volatile read-write memory may include random access memory (RAM). Exemplary RAM may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), static random access memory (SRAM), thyristor random access memory (T-RAM), zero-capacitance random access memory (Z-RAM), etc. Exemplary read-only memory may include mask read-only memory (MROM), programmable read-only memory (PROM), erasable programmable read-only memory (PEROM), electrically erasable programmable read-only memory (EEPROM), compact-disk read-only memory (CD-ROM), digital versatile disk read-only memory, etc. In some embodiments, the storage device 140 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tier cloud, or the like, or any combination thereof. In some embodiments, the storage device 140 may be integrated in the server 110. In other embodiments, the storage device 140 may be integrated in another server different from the server 110. For example, the storage device 140 may be deployed on an image server used to store image data; the terminal 130 may send the obtained image data to the image server for synchronized storage, and the server 110 may obtain image data from the image server.
It should be noted that the above description of the image processing system 100 is merely for illustration and explanation and does not limit the scope of application of this specification. Those skilled in the art may make various modifications and changes to the image processing system 100 under the guidance of this specification. However, such modifications and changes remain within the scope of this specification.
Fig. 2 is an exemplary flowchart of an image processing method according to some embodiments of this specification. In some embodiments, the process 200 shown in Fig. 2 may be implemented in the image processing system 100 shown in Fig. 1. For example, at least a part of the process 200 may be stored in the storage device 140 in the form of instructions, and invoked and/or executed by the server 110 and the terminal 130.
Step 210: The first processing device receives image data.
In some embodiments, the first processing device may be a terminal device, such as the terminal 130. In some embodiments, the terminal device may be a mobile terminal device, such as a mobile phone or a tablet; a smart home device; an Internet of Things (IoT) machine, such as a face-scanning machine or a code-scanning machine; or an edge device, such as a router, a switch, or a network access device.
The first processing device may obtain image data in various ways, including but not limited to taking images through an image acquisition component (such as a camera), downloading images from the Internet, loading locally pre-stored images, and receiving images transmitted by other devices. In some embodiments, an application (APP) is installed on the first processing device, which may run the APP to obtain image data. For example, it may receive images published by official accounts or lifestyle accounts in social or news APPs. As another example, it may receive images appearing in individual or group chats in social APPs, or images in status information published by users. As yet another example, it may receive images input by users through the image capture or upload functions in an APP (such as user avatars).
The image data may contain text content, and text detection and recognition need to be performed on the image data so that risk judgment can be made on the text content therein. For example, the image data may be a pure-text pattern containing only text content, or a pattern containing both text content and non-text content. Compared with text content presented in text format, text content presented in image format is harder to detect and recognize and demands more computation; hence the following technical solution, in which two processing devices cooperate.
Step 220: The first processing device performs first processing on the image data within a preset period to obtain a first result.
The first processing by the first processing device may include performing text detection on the image data, and may also include performing text recognition on the image data. Text detection processing may include locating the regions of the image where text exists and determining the bounding boxes of text lines. Text recognition processing includes recognizing the located text and determining the text content. In some embodiments, the first processing device may use optical character recognition technology to detect and/or recognize the text in the image data. Optical character recognition (OCR) technology determines the shapes of text in an image by optical means, based on the detected dark and bright areas, and then converts the text shapes into computer-processable text format through character recognition methods. In some embodiments, the first processing may also include preprocessing the image data before text detection, the preprocessing including but not limited to grayscale conversion, geometric transformation, and image enhancement.
The preset period is used to limit the time the first processing device spends processing the image data. It can be understood that, since this specification adopts cooperative processing by the first and second processing devices, with the second processing device continuing after the first finishes, the preset period needs to be set for the first processing device: the first processing device processes the image data only within the preset period, and beyond it the first processing device may stop and hand over to the second processing device. In some embodiments, the preset period may be set in advance according to relevant factors. For example, it may be set according to the performance of the first processing device: the stronger the performance, the shorter the preset period. It may also be determined according to the complexity of the image processing scenario involved: the more complex the scenario, the longer the preset period. The preset period may be any value; merely by way of example, it may be 10 seconds, 5 seconds, 4 seconds, 3 seconds, 2 seconds, 1 second, etc.
The first result may include the result obtained by the first processing device performing the first processing within the preset period, for example text detection and recognition results, or only text detection results. The first result may also include progress information of performing the first processing within the preset period, covering, for example, the four cases of: text detection completed and no text found; both text detection and text recognition completed; text detection completed but text recognition not completed; and neither text detection nor text recognition completed. For the detailed process of the first processing by the first device and a detailed description of the first result, see Fig. 5; this is not repeated here.
Step 230: The first processing device sends the first result to the second processing device.
In some embodiments, the second processing device may be a server, such as the server 110 shown in Fig. 1. The first processing device (e.g., the terminal 130 shown in Fig. 1) may send the first processing result to the second processing device (e.g., the server 110) through a network (e.g., the network 120 shown in Fig. 1). The first processing device may also send the first processing result through the network to a storage device (e.g., the storage device 140 shown in Fig. 1), from which the second processing device obtains it.
In some embodiments, when the second processing device performs the subsequent processing of step 240, it needs not only the first result but also the image data. Accordingly, besides sending the first result to the second processing device, the first processing device may also send the image data to it. For example, the first processing device may send the image data directly to the second processing device, or it may send an image identifier, which includes but is not limited to an encoding of the image (such as a randomly generated string). There is a correspondence between image identifiers and image data, and the second processing device may obtain the corresponding image data according to the image identifier. For example, after obtaining the image data in step 210, the first processing device may send the obtained image data and the corresponding image identifier to a storage device (e.g., the storage device 140 shown in Fig. 1) for storage; when the second processing device needs the image data for subsequent processing, it may obtain the image data from the storage device according to the corresponding image identifier sent by the first processing device.
Step 240: The second processing device performs subsequent processing related to the image data according to the first processing result.
In some embodiments, the subsequent processing includes the part of the first processing that the first processing device failed to complete within the preset period. In some embodiments, in response to the first result being that there is no text, the second device may not perform any subsequent processing. In some embodiments, the second processing device may perform different subsequent processing according to different first results. For example, in response to the first result including progress information that both text detection and text recognition have been completed, together with the recognized text content, the subsequent processing of the second processing device may be risk judgment on the text content. In response to the first result including progress information that text detection has been completed but text recognition has not, together with position information of the text in the image data, the subsequent processing of the second processing device may be obtaining the image data, recognizing text content from the image data based on the position information, and performing risk judgment on the text content. In response to the first result including progress information that neither text detection nor text recognition has been completed, the subsequent processing of the second processing device may be obtaining the image data, performing text detection and text recognition on it, and performing risk judgment on the recognized text content. In some embodiments, a preset period may also be set for the subsequent processing of the second processing device, for example 10 seconds, 5 seconds, 4 seconds, 3 seconds, 2 seconds, 1 second, etc.; if no text detection and/or recognition result is obtained within the set period, the image data may be reported or given other special handling. For more about the subsequent processing of the second processing device, see Fig. 6 and its description; this is not repeated here.
In some embodiments, the subsequent processing may also include performing risk identification on the text content in the image data to obtain a text risk identification result. Because image data on the Internet is complex and diverse, image data may serve as a carrier of undesirable text content involving pornography, gambling, drugs, violence, terror, vulgarity, and so on. Therefore, technical means are needed to identify such image data with undesirable text content, and then to issue reminders or to block it. Images that may carry risk can be handled in different ways depending on the situation, for example deleting the image, prohibiting its publication, or making it visible only to the publisher. In some embodiments, the risk identification can be applied to all scenarios involving content risk prevention and control, such as social chat, content published by network accounts, and uploading of user information (for example, avatar information, nickname information, etc.). For more about risk identification, see Fig. 6 and its description.
It should be noted that the above description of the process 200 is merely for illustration and explanation and does not limit the scope of application of this specification. Those skilled in the art may make various modifications and changes to the process 200 under the guidance of this specification. However, such modifications and changes remain within the scope of this specification.
Fig. 3 is a block diagram of an exemplary image processing system according to some embodiments of this specification. The image processing system may be implemented on the first processing device (e.g., the terminal 130 shown in Fig. 1). As shown in Fig. 3, the image processing system 300 may include an image acquisition module 310, a first processing module 320, and a transmission module 330.
The image acquisition module 310 is used to obtain image data. In some embodiments, the image acquisition module 310 may obtain image data in various ways, including but not limited to taking images through an image acquisition component (such as a camera), downloading images from the Internet, loading locally pre-stored images, and receiving images transmitted by other devices. In some embodiments, the image acquisition module 310 may obtain image data by running an APP. The image data may contain text content, and text detection and recognition need to be performed on it. For example, the image data may be a pure-text pattern containing only text content, or a pattern containing both text content and non-text content.
The first processing module 320 is used to perform first processing on the image data. In some embodiments, the first processing module 320 may perform text detection on the image data, and may also perform text recognition on it. In some embodiments, the first processing module 320 uses OCR technology to detect and/or recognize the text in the image data. In some embodiments, before performing text detection and recognition on the image data, the first processing module 320 may also preprocess it. In some embodiments, a preset period may be set for the first processing of the first processing module 320, to limit the time of the first processing. After performing the first processing on the image data, the first processing module 320 may obtain a first result. The first result may include, for example, text detection and recognition results, or only text detection results; the first result may also include progress information of the first processing performed by the first processing module 320 within the preset period.
It should be noted that the above description of the process 500 is merely for illustration and explanation and does not limit the scope of application of this specification. Those skilled in the art may make various modifications and changes to the process 500 under the guidance of this specification. However, such modifications and changes remain within the scope of this specification.
图6是根据本说明书一些实施例所示的第二处理设备执行的图像处理方法的示例性流程图。在一些实施例中,图6中所示的流程600可以在图1中所示的图像处理系统100中实现。例如,流程600的至少一部分可以作为指令的形式存储在存储设备140中,并且由服务器110调用和/或执行。
步骤610,获取第一处理设备发送来的第一结果。关于第一结果以及第一处理设备向第二处理设备发送第一结果的更多内容,可以参见图5中的步骤520和530,此处不再赘述。
步骤620~650为第二处理设备根据第一处理设备发送来的所述第一结果对所述图像数据进行的相关后续处理。第二处理设备可以对图像数据进行文本检测和识别处理,第二处理设备还可以对识别出的文本内容进行风险分析。在一些实施例中,第二处理设备可以采用OCR技术进行文本检测和识别。在一些实施例中,第二处理设备上可以预先部署一套深度学习引擎,用于执行文本检测、文本识别处理和/或风险判断处理。在一些实施例中,第二处理设备中的深度学习引擎的能力大于第一处理设备的中的深度学习引擎的能力,因此可以检测识别出第一处理设备未成功检测或识别的图像数据,并进行其他处理。下面分别说明不同第一结果下,第二处理设备对所述图像数据进行的后续处理。
步骤620,响应于所述第一结果为无文本,不执行后续处理。在一些实施例中,第一处理设备在所述预设时长内完成对所述图像数据的文本检测处理且文本检测结果为 无文本(如前述hasEdgeResult=0表示的情况),则第二处理设备不进行任何后续处理。
通过本说明书实施例的方法,在第二处理设备进行处理前先由第一处理设备进行处理,可以有效减轻第二处理的计算压力,计算压力减轻的具体大小可以根据图像数据的具体情况而定。例如,无文字图像占全部图像的占比在50%以上。对于无文字的图像,第一处理设备进行文本检测与识别后可能会成功确定这些图像中无文字,这样第二处理设备就无需对这些无文字图像进行任何后续处理。如果第一处理设备成功确定出所有的无文字图像,则第二处理设备就无需对这些无文字图像进行后续处理,相比于仅靠第二处理设备对所有图像进行文本检测和识别的方案,可以降低第二处理设备至少50%的计算压力。步骤630,响应于所述第一结果包括已完成文本检测处理与文本识别处理的进度信息以及识别出的文本内容,对所述文本内容进行风险判断处理。
在一些实施例中,第一处理设备在所述预设时长内成功完成对所述图像数据的文本检测与文字识别,且识别出所述图像数据中的文本内容(如前述hasEdgeResult=1对应的情况),则第一处理设备可以将识别出的文本内容发送至第二处理设备,第二处理设备无需再次进行检测与识别,直接进行风险判断处理,这也可以减少第二处理设备的计算压力。
In some embodiments, the second processing device may perform risk judgment on the recognized text content using text mining techniques. Text mining may use word segmentation to recombine the recognized text into a word sequence according to certain rules, and then judge whether objectionable content exists based on the segmentation result. Segmentation algorithms include, but are not limited to, algorithms based on dictionary matching, on semantic analysis, and on probabilistic or statistical models. In some embodiments, a text mining model may be obtained through training and used to perform risk judgment on the recognized text content.
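As a concrete illustration of the dictionary-matching approach mentioned above, the sketch below pairs a greedy forward-maximum-matching segmenter with a blocklist lookup. This is a toy stand-in under stated assumptions: the dictionary, blocklist, and function names are invented, and a production system would use a trained model as the paragraph notes:

```python
# Toy sketch: forward-maximum-matching segmentation over word tokens,
# followed by a blocklist check, standing in for the dictionary-based
# segmentation and risk judgment described above. Data is invented.
DICTIONARY = {"free", "money", "click", "here", "hello", "world"}
BLOCKLIST = {"free money"}  # multi-word phrases considered risky

def segment(text: str, max_len: int = 2) -> list:
    """Greedily match the longest known phrase (up to max_len words) at each position."""
    words = text.lower().split()
    out, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            cand = " ".join(words[i:i + n])
            # accept a known multi-word phrase, or fall back to a single word
            if n == 1 or cand in DICTIONARY or cand in BLOCKLIST:
                out.append(cand)
                i += n
                break
    return out

def is_risky(text: str) -> bool:
    """Flag the text if any segmented token appears on the blocklist."""
    return any(tok in BLOCKLIST for tok in segment(text))
```

Real segmenters for Chinese text operate on characters rather than whitespace-delimited words, but the matching logic is analogous.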
Step 640: in response to the first result including progress information indicating that text detection processing has been completed but text recognition processing has not, together with the position information of the text in the image data, obtain the image data, recognize the text content from the image data based on the position information, and perform risk judgment processing on the text content.
In some embodiments, if the first processing device completes text detection on the image data within the preset time period but does not complete text recognition, the second processing device needs to perform text recognition again but not text detection, which also reduces its computational load. In some embodiments, the first processing device may send the text position information determined by text detection to the second processing device; the position information may be described by horizontal and vertical coordinates, among other things. In addition, the second processing device may obtain the image data, which may be sent directly from the first processing device to the second processing device, or first sent by the first processing device to a storage device and then retrieved by the second processing device from that storage device. Based on the image data and the text position information obtained by the first processing device, the second processing device can locate the text in the image data and perform text recognition to obtain the recognized text content.
The risk judgment processing on the text content is similar to that described in step 630 and is not repeated here.
Step 650: in response to the first result including progress information indicating that neither text detection processing nor text recognition processing has been completed, obtain the image data, perform text detection processing and text recognition processing on the image data, and perform risk judgment processing on the recognized text content.
In some embodiments, if the first processing device completes neither text detection nor text recognition within the preset time period, the second processing device needs to perform both again. Understandably, the processing capability of the first processing device is limited and it cannot handle all image data; the second processing device has greater processing capability and thus may be able to handle image data that the first processing device failed to process. For image data on which the first processing device failed to complete text detection and text recognition, the second processing device needs to obtain the image data and perform text detection and text recognition itself.
The risk judgment processing on the text content is similar to that described in step 630 and is not repeated here.
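Steps 620 through 650 above can be sketched as a single dispatch on the second processing device over the four first-result cases. The function and field names below, and the injected engine callbacks (detect_text, recognize_text, risk_judge, fetch_image), are illustrative assumptions rather than APIs from this specification:

```python
# Illustrative sketch of the second device's dispatch over the four
# hasEdgeResult cases (steps 620-650). The engine callbacks are placeholders.
def handle_first_result(result, detect_text, recognize_text, risk_judge, fetch_image):
    code = result["hasEdgeResult"]
    if code == 0:                       # step 620: no text, nothing to do
        return None
    if code == 1:                       # step 630: text already recognized on device
        return risk_judge(result["text"])
    # Cases 2 and 3 both need the image, fetched from the storage device by id.
    image = fetch_image(result["image_id"])
    if code == 3:                       # step 640: reuse detected positions, re-recognize
        text = recognize_text(image, result["boxes"])
        return risk_judge(text)
    if code == 2:                       # step 650: redo detection and recognition
        boxes = detect_text(image)
        text = recognize_text(image, boxes)
        return risk_judge(text)
    raise ValueError(f"unknown hasEdgeResult value: {code}")
```

The dispatch makes the load-shedding explicit: the server's deep learning engine runs only the stages the terminal did not finish.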
It should be noted that the above description of process 600 is only for illustration and explanation and does not limit the scope of this specification. Those skilled in the art can make various modifications and changes to process 600 under the guidance of this specification; such modifications and changes remain within the scope of this specification.
The beneficial effects of the embodiments of this specification may include, but are not limited to: splitting text detection and text recognition of image data so that the terminal device processes first and the server performs subsequent processing, thereby making full use of terminal resources and reducing the computational load borne by the server. It should be noted that different embodiments may produce different beneficial effects; in different embodiments, the possible beneficial effects may be any one or a combination of the above, or any other beneficial effect that may be obtained.
The basic concepts have been described above. Obviously, for those skilled in the art, the above detailed disclosure is only an example and does not constitute a limitation of this specification. Although not explicitly stated here, those skilled in the art may make various modifications, improvements, and corrections to this specification. Such modifications, improvements, and corrections are suggested in this specification, and therefore still belong to the spirit and scope of the exemplary embodiments of this specification.
Meanwhile, this specification uses specific terms to describe its embodiments. Terms such as "one embodiment", "an embodiment", and/or "some embodiments" mean a certain feature, structure, or characteristic related to at least one embodiment of this specification. Therefore, it should be emphasized and noted that "an embodiment", "one embodiment", or "an alternative embodiment" mentioned two or more times in different places in this specification does not necessarily refer to the same embodiment. In addition, certain features, structures, or characteristics in one or more embodiments of this specification may be appropriately combined.
Furthermore, those skilled in the art will understand that aspects of this specification may be illustrated and described in terms of several patentable categories or situations, including any new and useful process, machine, product, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of this specification may be executed entirely by hardware, entirely by software (including firmware, resident software, microcode, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block", "module", "engine", "unit", "component", or "system". In addition, aspects of this specification may appear as a computer product located in one or more computer-readable media, the product including computer-readable program code.
A computer storage medium may contain a propagated data signal containing computer program code, for example on a baseband or as part of a carrier wave. The propagated signal may take various forms, including electromagnetic form, optical form, etc., or a suitable combination thereof. A computer storage medium may be any computer-readable medium other than a computer-readable storage medium that can communicate, propagate, or transmit a program for use by being connected to an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated through any suitable medium, including radio, cable, fiber-optic cable, RF, or similar media, or any combination of the above.
The computer program code required for the operation of the various parts of this specification may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, etc., conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP, dynamic programming languages such as Python, Ruby, and Groovy, or other programming languages. The program code may run entirely on the user's computer, run on the user's computer as a stand-alone software package, run partly on the user's computer and partly on a remote computer, or run entirely on a remote computer or processing device. In the latter case, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or connected to an external computer (e.g., through the Internet), or in a cloud computing environment, or used as a service such as Software as a Service (SaaS).
In addition, unless explicitly stated in the claims, the order of processing elements and sequences, the use of alphanumeric characters, or the use of other names described in this specification is not intended to limit the order of the processes and methods of this specification. Although the above disclosure discusses through various examples some embodiments of the invention currently considered useful, it should be understood that such details are for illustrative purposes only, and that the appended claims are not limited to the disclosed embodiments; on the contrary, the claims are intended to cover all modifications and equivalent combinations that conform to the essence and scope of the embodiments of this specification. For example, although the system components described above may be implemented by hardware devices, they may also be implemented solely by a software solution, such as installing the described system on an existing processing device or mobile device.
Similarly, it should be noted that, in order to simplify the presentation of this disclosure and thereby aid in the understanding of one or more embodiments of the invention, the foregoing description of the embodiments sometimes groups multiple features into a single embodiment, figure, or description thereof. However, this method of disclosure does not mean that the subject matter of this specification requires more features than are recited in the claims. In fact, the features of an embodiment are fewer than all the features of the single embodiment disclosed above.
Some embodiments use numbers describing quantities of components and attributes. It should be understood that such numbers used in describing the embodiments are, in some examples, modified by the words "about", "approximately", or "substantially". Unless otherwise stated, "about", "approximately", or "substantially" indicates that the number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may change depending on the characteristics required by individual embodiments. In some embodiments, numerical parameters should take into account the specified significant digits and adopt the general method of digit retention. Although the numerical ranges and parameters used to confirm the breadth of their scope in some embodiments of this specification are approximations, in specific embodiments such values are set as precisely as feasible.
For each patent, patent application, patent application publication, and other material cited in this specification, such as articles, books, specifications, publications, and documents, the entire contents thereof are hereby incorporated into this specification by reference. Excluded are application history documents that are inconsistent with or conflict with the contents of this specification, as well as documents (currently or later appended to this specification) that limit the broadest scope of the claims of this specification. It should be noted that if there is any inconsistency or conflict between the descriptions, definitions, and/or use of terms in the accompanying materials of this specification and the contents described in this specification, the descriptions, definitions, and/or use of terms in this specification shall prevail.
Finally, it should be understood that the embodiments described in this specification are only used to illustrate the principles of the embodiments of this specification. Other variations may also fall within the scope of this specification. Therefore, by way of example and not limitation, alternative configurations of the embodiments of this specification may be regarded as consistent with the teachings of this specification. Accordingly, the embodiments of this specification are not limited to the embodiments explicitly introduced and described herein.

Claims (24)

  1. An image processing method, the method being performed by a first processing device and comprising:
    obtaining image data;
    performing first processing on the image data within a preset time period to obtain a first result, the first result including a result obtained by executing the first processing within the preset time period and/or progress information of executing the first processing within the preset time period; and
    sending the first result to a second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result.
  2. The image processing method of claim 1, wherein the first processing includes text detection processing and text recognition processing on the image data; and
    the subsequent processing includes the part of the first processing that the first processing device failed to complete within the preset time period together with text risk judgment processing, or includes only text risk judgment processing.
  3. The image processing method of claim 1, wherein the result obtained by executing the first processing within the preset time period includes a text detection and recognition result, or includes only a text detection result; and
    the progress information includes: text detection processing and text recognition processing not completed; text detection processing completed but text recognition processing not completed; or text detection processing and text recognition processing completed.
  4. The image processing method of claim 3, wherein the text detection and recognition result includes no text or recognized text content, and the text detection result includes position information of the text in the image data.
  5. The image processing method of claim 1, wherein the first processing device is a terminal device and the second processing device is a server.
  6. An image processing system, comprising:
    an image acquisition module configured to obtain image data;
    a first processing module configured to perform first processing on the image data within a preset time period to obtain a first result, the first result including a result obtained by executing the first processing within the preset time period and/or progress information of executing the first processing within the preset time period; and
    a transmission module configured to send the first result to a second processing device, so that the second processing device performs subsequent processing related to the image data based on the first result.
  7. The image processing system of claim 6, wherein the first processing includes text detection processing and text recognition processing on the image data; and
    the subsequent processing includes the part of the first processing that the first processing device failed to complete within the preset time period together with text risk judgment processing, or includes only text risk judgment processing.
  8. The image processing system of claim 6, wherein the result obtained by executing the first processing within the preset time period includes a text detection and recognition result, or includes only a text detection result; and
    the progress information includes: text detection processing and text recognition processing not completed; text detection processing completed but text recognition processing not completed; or text detection processing and text recognition processing completed.
  9. The image processing system of claim 8, wherein the text detection and recognition result includes no text or recognized text content, and the text detection result includes position information of the text in the image data.
  10. The image processing system of claim 6, wherein the first processing device is a terminal device and the second processing device is a server.
  11. An image processing apparatus, the apparatus comprising at least one processor and at least one memory;
    the at least one memory being configured to store computer instructions; and
    the at least one processor being configured to execute at least part of the computer instructions to implement the image processing method of any one of claims 1 to 5.
  12. An image processing method, the method being performed by a second processing device and comprising:
    obtaining a first result, the first result being obtained by a first processing device performing first processing on image data within a preset time period, and including a result obtained by the first processing device executing the first processing within the preset time period and/or progress information of the first processing device executing the first processing within the preset time period; and
    performing, based on the first result, subsequent processing related to the image data.
  13. The image processing method of claim 12, wherein the first processing includes text detection processing and text recognition processing on the image data; and
    the subsequent processing includes the part of the first processing that the first processing device failed to complete within the preset time period together with text risk judgment processing, or includes only text risk judgment processing.
  14. The image processing method of claim 13, wherein the result obtained by executing the first processing within the preset time period includes a text detection and recognition result, or includes only a text detection result; and
    the progress information includes: text detection processing and text recognition processing not completed; text detection processing completed but text recognition processing not completed; or text detection processing and text recognition processing completed.
  15. The image processing method of claim 14, wherein the text detection and recognition result includes no text or recognized text content, and the text detection result includes position information of the text in the image data.
  16. The image processing method of claim 15, wherein performing, based on the first result, subsequent processing related to the image data includes:
    in response to the first result being no text, performing no subsequent processing;
    in response to the first result including progress information indicating that text detection processing and text recognition processing have been completed, together with recognized text content, performing risk judgment processing on the text content;
    in response to the first result including progress information indicating that text detection processing has been completed but text recognition processing has not, together with position information of the text in the image data, obtaining the image data, recognizing text content from the image data based on the position information, and performing risk judgment processing on the text content; and
    in response to the first result including progress information indicating that neither text detection processing nor text recognition processing has been completed, obtaining the image data, performing text detection processing and text recognition processing on the image data, and performing risk judgment processing on the recognized text content.
  17. The image processing method of claim 12, wherein the first processing device is a terminal device and the second processing device is a server.
  18. An image processing system, comprising:
    an acquisition module configured to obtain a first result, the first result being obtained by a first processing device performing first processing on image data within a preset time period, and including a result obtained by the first processing device executing the first processing within the preset time period and/or progress information of the first processing device executing the first processing within the preset time period; and
    a subsequent processing module configured to perform, based on the first result, subsequent processing related to the image data.
  19. The image processing system of claim 18, wherein the first processing includes text detection processing and text recognition processing on the image data; and
    the subsequent processing includes the part of the first processing that the first processing device failed to complete within the preset time period together with text risk judgment processing, or includes only text risk judgment processing.
  20. The image processing system of claim 19, wherein the result obtained by executing the first processing within the preset time period includes a text detection and recognition result, or includes only a text detection result; and
    the progress information includes: text detection processing and text recognition processing not completed; text detection processing completed but text recognition processing not completed; or text detection processing and text recognition processing completed.
  21. The image processing system of claim 20, wherein the text detection and recognition result includes no text or recognized text content, and the text detection result includes position information of the text in the image data.
  22. The image processing system of claim 21, wherein the subsequent processing module is further configured to:
    in response to the first result being no text, perform no subsequent processing;
    in response to the first result including progress information indicating that text detection processing and text recognition processing have been completed, together with recognized text content, perform risk judgment processing on the text content;
    in response to the first result including progress information indicating that text detection processing has been completed but text recognition processing has not, together with position information of the text in the image data, obtain the image data, recognize text content from the image data based on the position information, and perform risk judgment processing on the text content; and
    in response to the first result including progress information indicating that neither text detection processing nor text recognition processing has been completed, obtain the image data, perform text detection processing and text recognition processing on the image data, and perform risk judgment processing on the recognized text content.
  23. The image processing system of claim 18, wherein the first processing device is a terminal device and the second processing device is a server.
  24. An image processing apparatus, the apparatus comprising at least one processor and at least one memory;
    the at least one memory being configured to store computer instructions; and
    the at least one processor being configured to execute at least part of the computer instructions to implement the image processing method of any one of claims 12 to 17.
PCT/CN2020/107107 2019-10-11 2020-08-05 An image processing method and system WO2021068628A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910962922.3A CN110717484B (zh) 2019-10-11 2019-10-11 An image processing method and system
CN201910962922.3 2019-10-11

Publications (1)

Publication Number Publication Date
WO2021068628A1 true WO2021068628A1 (zh) 2021-04-15

Family

ID=69211426

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/107107 WO2021068628A1 (zh) 2019-10-11 2020-08-05 An image processing method and system

Country Status (3)

Country Link
CN (1) CN110717484B (zh)
TW (1) TWI793418B (zh)
WO (1) WO2021068628A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717484B (zh) * 2019-10-11 2021-07-27 Alipay (Hangzhou) Information Technology Co., Ltd. An image processing method and system
CN115019291B (zh) * 2021-11-22 2023-04-14 Honor Device Co., Ltd. Image text recognition method, electronic device, and storage medium
CN113936338A (zh) * 2021-12-15 2022-01-14 Beijing LLVision Technology Co., Ltd. Gesture recognition method, apparatus, and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004179851A (ja) * 2002-11-26 2004-06-24 Fuji Photo Film Co Ltd 文字認識システム
CN102855480A (zh) * 2012-08-07 2013-01-02 北京百度网讯科技有限公司 一种图像文字识别方法和装置
CN105260241A (zh) * 2015-10-23 2016-01-20 南京理工大学 集群系统中进程相互协作的方法
CN109918187A (zh) * 2019-03-12 2019-06-21 北京同城必应科技有限公司 任务调度方法、装置、设备和存储介质
CN110717484A (zh) * 2019-10-11 2020-01-21 支付宝(杭州)信息技术有限公司 一种图像处理方法和系统

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4885814B2 (ja) * 2007-09-13 2012-02-29 株式会社リコー 画像処理装置、画像形成装置、画像処理方法、画像データ処理プログラム、及び記録媒体
JP5171296B2 (ja) * 2008-02-07 2013-03-27 キヤノン株式会社 画像保存システム、画像処理装置、画像保存方法およびプログラム
CN101957911B (zh) * 2010-09-29 2012-11-28 汉王科技股份有限公司 一种人脸识别方法及系统
CN102546920B (zh) * 2011-01-04 2013-10-23 中国移动通信有限公司 一种运行进程的方法、系统及设备
CN104463790B (zh) * 2013-09-17 2018-08-10 联想(北京)有限公司 一种图像处理的方法、装置和系统
CN108334517A (zh) * 2017-01-20 2018-07-27 华为技术有限公司 一种网页渲染方法及相关设备
CN107609461A (zh) * 2017-07-19 2018-01-19 阿里巴巴集团控股有限公司 模型的训练方法、数据相似度的确定方法、装置及设备
KR102104397B1 (ko) * 2017-12-05 2020-04-24 이원 유해물 관리 방법, 이를 실행하는 컴퓨팅 장치 및 컴퓨터 프로그램
CN108074236B (zh) * 2017-12-27 2020-05-19 Oppo广东移动通信有限公司 植物浇灌提醒方法、装置、设备及存储介质
CN109086789A (zh) * 2018-06-08 2018-12-25 四川斐讯信息技术有限公司 一种图像识别方法及系统
CN109858420A (zh) * 2019-01-24 2019-06-07 国信电子票据平台信息服务有限公司 一种票据处理系统和处理方法
CN110263792B (zh) * 2019-06-12 2021-10-22 广东小天才科技有限公司 图像识读及数据处理方法、智能笔、系统及存储介质
CN110929525B (zh) * 2019-10-23 2022-08-05 三明学院 一种网贷风险行为分析检测方法、装置、设备和存储介质


Also Published As

Publication number Publication date
TW202115604A (zh) 2021-04-16
TWI793418B (zh) 2023-02-21
CN110717484A (zh) 2020-01-21
CN110717484B (zh) 2021-07-27


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20873800

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20873800

Country of ref document: EP

Kind code of ref document: A1