WO2017076149A1 - Image processing system and image processing method - Google Patents

Image processing system and image processing method

Info

Publication number
WO2017076149A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
message
image processing
module
processing
Prior art date
Application number
PCT/CN2016/101482
Other languages
English (en)
French (fr)
Inventor
张燕
夏正勋
杨庆平
汪峰来
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2017076149A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Definitions

  • Embodiments of the present invention relate to the field of intelligent image processing, and in particular, to an image processing system and a method for image processing.
  • Intelligent image processing algorithms, especially artificial intelligence algorithms such as deep learning, are computationally complex and generally require powerful computing resources.
  • The most important problem in this field is achieving high performance and acceleration for these algorithms.
  • Early on, it was natural to extend computing power with the massive processing resources of cloud computing. For example, Google's well-known cat face recognition experiment built a 9-layer deep neural network and ran it on a server cluster of 16,000 CPUs (central processing units), producing results after 3 days.
  • The disadvantage of this approach is that computation is slow.
  • The other approach is to use GPUs (Graphics Processing Units) for acceleration.
  • For example, in the cat face recognition experiment, a Stanford University researcher named Adam Coates came up with a better solution: using a different kind of microprocessor, the GPU, he linked three computers together so that they ran as a single system, with results equivalent to those of Google's thousands of computers. This is a remarkable achievement.
  • Hadoop, a software framework for distributed processing of large volumes of data that includes the Hadoop Distributed File System (HDFS), is now widely used, and Hadoop-based distributed image processing solutions have appeared.
  • Computationally intensive applications such as high-volume image processing also pose challenges for distributed system design.
  • Hadoop has its own shortcomings in such applications. For example, it targets full-dataset scenarios, runs serially within a task, and favors throughput with no guarantee on response time. Most critically, Hadoop is not suited to real-time analysis systems, which limits its application scenarios to some extent.
  • Traditional distributed image processing systems are mostly built on remote procedure calls and NFS (Network File System), and have inherent deficiencies in system communication and storage.
  • the technical problem to be solved by the embodiments of the present invention is to provide an image processing system and an image processing method to improve the efficiency of processing large-scale image files.
  • an image processing system including:
  • a display module configured to receive one or more images uploaded by the user, upload the images to the storage module, and send a first message to the processing module; and, after receiving the second message from the processing module, to download the corresponding processed image from the storage module according to the second message and display it;
  • the processing module is configured to: after receiving the first message, download an image from the storage module according to the first message, perform image processing, upload the processed image to the storage module, and send the second message to the display module;
  • the storage module is configured to store an image uploaded by the display module and the processing module.
  • the above image processing system further has the following features:
  • the processing module is further configured to parse the first message after receiving it, where the parsed information includes storage path information of the image, and to download the corresponding image from the storage module according to that storage path information.
  • the above image processing system further has the following features:
  • the processing module performs image processing, including: converting the image into a byte stream, and calling a corresponding image algorithm to process the byte stream.
  • the above image processing system further has the following features:
  • after performing image processing, the processing module is further configured to upload image processing log information to the storage module.
  • the image processing system further has the following feature: the display module, after receiving the second message of the processing module, is further configured to download and display corresponding image processing log information from the storage module.
  • the image processing system further has the following feature: the first message is a message of Kafka, a distributed publish-subscribe messaging system.
  • the image processing system further has the following feature: the display module sends a first message to the processing module every time an image uploaded by the user is received.
  • the image processing system further has the following feature: the first message carries storage path information of the image.
  • the image processing system further has the following feature: the first message further carries information of an image algorithm parameter and an algorithm type set by a user.
  • the image processing system further has the following features:
  • the algorithm type includes any one of the following:
  • image compression, text recognition, objectionable image detection, and search by image.
  • the image processing system further has the following feature: the second message carries storage path information of the processed image and log information.
  • the image processing system further has the following feature: the second message is a message of Kafka, a distributed publish-subscribe messaging system.
  • the image processing system further has the following feature: the storage module is a Hadoop distributed file system HDFS.
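  • an embodiment of the present invention further provides a method for image processing, which is applied to the image processing system described above, and includes:
  • receiving one or more images uploaded by a user, and storing the images;
  • downloading the images, performing image processing, and storing the processed images;
  • downloading the processed images and displaying them.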
  • the performing image processing includes:
  • the downloaded image data is converted into a byte stream, and the corresponding image algorithm is invoked to process the byte stream.
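  • the foregoing method further has the following feature: after performing the image processing, the method further includes storing image processing log information.
  • the foregoing method further has the following feature: the method further includes downloading and displaying the image processing log information.
  • the above method also has the following features: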
  • the storing the image includes: storing the image in a Hadoop distributed file system HDFS;
  • the storing the processed image includes: storing the processed image and image processing log information in the Hadoop distributed file system HDFS.
  • a computer storage medium is further provided, and the computer storage medium may store an execution instruction for performing the image processing in the above embodiment.
  • the embodiments of the present invention provide an image processing system and an image processing method, which can improve the efficiency of processing large-scale image files.
  • FIG. 1 is a schematic diagram of an image processing system according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of an image processing system according to an application example of the present invention.
  • FIG. 3 is a flowchart of a method for image processing according to an embodiment of the present invention.
  • the object of the present invention is to overcome the deficiencies of the prior art.
  • the embodiment provides an image processing system. As shown in FIG. 1, the image processing system of this embodiment includes:
  • a display module configured to receive one or more images uploaded by the user, upload the images to the storage module, and send a first message to the processing module; and, after receiving the second message from the processing module, to download the corresponding image from the storage module according to the second message and display it;
  • the processing module is configured to: after receiving the first message, download an image from the storage module according to the first message, perform image processing, upload the processed image to the storage module, and send the second message to the display module;
  • the storage module is configured to store an image uploaded by the display module and the processing module.
  • in a preferred embodiment, the processing module may be further configured to parse the first message after receiving it, where the parsed information includes storage path information of the image, and to download the corresponding image from the storage module according to that storage path information.
  • the processing module performing image processing may include converting the image into a byte stream, and invoking a corresponding image algorithm to process the byte stream.
  • the storage module in this embodiment may be an HDFS (Hadoop Distributed File System) storage module configured to store images.
  • this embodiment combines CPUs and GPUs to make full use of the scalability of a distributed cluster and the high speed of GPU computation, and studies this combination; the processing module may use Storm, an open source distributed real-time computing system with good real-time processing performance, with the aim of making data analysis more real-time and efficient.
  • in addition, in this embodiment the images to be processed and the processed images or results are all stored in HDFS; the image located by its path information is converted into a byte stream before the image processing algorithm is called, and the processed image or result is sent onward as a byte stream after image analysis completes.
  • all other stages pass only the path information of the image stored in HDFS, which greatly reduces the network load and speeds up processing.
  • for massive numbers of images, a message queue based on Kafka, a high-throughput distributed publish-subscribe messaging system, is used: each image processing task sends one Kafka message to the Storm system.
  • multiple messages are then processed concurrently, which avoids looping over messages one at a time in the Storm system.
  • the image processing system of the present embodiment includes the following modules: a visual UI (user interface, corresponding to the display module above), a Kafka message queue module, HDFS, a Storm module, and an intelligent image processing algorithm module, where:
  • Visual UI: on this UI, the user can select the image algorithm type, set the algorithm parameters, and upload local images (a single image or a folder).
  • when the user clicks the submit button, the selected image processing algorithm task is submitted to the system and the task starts. After the task starts, the uploaded images are first stored in HDFS and a Kafka message is sent to the Storm module.
  • Kafka message queue module: Kafka is a high-throughput distributed publish-subscribe messaging system that provides message persistence through O(1) (constant complexity) disk data structures and offers high throughput and long-term stable performance.
  • this module is responsible for sending and receiving messages for the entire image analysis system.
  • on the input side, the image path, algorithm type, algorithm parameters, and other information are sent to the Storm module;
  • on the output side, the result path computed by the Storm module, and even the trace log of the algorithm processing and other information, are sent to the visual interface, which then displays the image processing result.
  • HDFS: in this embodiment, besides storing user-uploaded images, HDFS also stores the results of image algorithm processing. This avoids the bandwidth consumption caused by passing large batches of image streams through the system.
  • Storm module: Storm is a scalable, fast, and fault-tolerant open source real-time distributed computing system that is highly focused on stream processing. Storm excels in event processing and incremental computing, enabling real-time processing of data streams based on changing parameters.
  • this module receives Kafka messages, parses and splits message fields, downloads images from and uploads images to HDFS, and schedules algorithm processing; it is the core processing module.
  • Intelligent image processing module: this module is the algorithm core, and all intelligent image algorithms are encapsulated in it.
  • the image algorithms of this embodiment are mostly implemented in C or C++; each algorithm needs to be packaged as a .so file and exposed through a callable Java interface so that the image algorithm can be invoked.
  • FIG. 3 is a flowchart of a method for image processing according to an embodiment of the present invention. As shown in FIG. 3, the method of the embodiment is applied to the image processing system described above, and includes the following steps:
  • Step 11 Receive one or more images uploaded by a user, and store the image.
  • Step 12 download the image, perform image processing, and store the processed image
  • Step 13 Download the processed image and display it.
  • the image processing method of the present embodiment processes images faster, while reducing network bandwidth consumption.
  • Step 101 The user uploads a to-be-processed image on the visual UI, sets an algorithm-related parameter, and clicks submit, that is, submits an image processing task to the system;
  • Step 102 The visual UI first stores all the images uploaded by the user to the HDFS, and records the storage path information of all the image files at the same time;
  • Step 103 The visual UI sends a Kafka message to the Storm module, where the message carries information such as the HDFS storage path of a single image, the image algorithm parameters set on the interface, and the algorithm type; one Kafka message is sent for each image to be processed, and multiple messages may be sent in succession;
  • the types of algorithms include, but are not limited to, image compression, text recognition (e.g., OCR (Optical Character Recognition)), objectionable image detection, and search by image.
  • Step 104 KafkaSpout (the Kafka message source) in the Storm module receives the Kafka message and sends it to the first bolt (message handler), which parses and splits the message fields.
  • the split fields are: message number (sessionid), the storage path of the image on HDFS, image parameters, algorithm type, and other information.
  • Step 105 The split field is sent to the second bolt: ReadHdfsBolt;
  • ReadHdfsBolt downloads images from HDFS and converts them into byte streams based on the storage path information for each image.
  • Step 106 The converted byte stream is sent to a third bolt: AlgorithmBlot;
  • AlgorithmBlot calls the corresponding intelligent image algorithm to process the byte stream according to the algorithm type message field, and obtains the corresponding processing result (image or text, etc.).
  • the AlgorithmBlot calls the encapsulated corresponding image algorithm java interface.
  • the byte stream is used as the input parameter of the interface, and the result obtained after the image processing is converted into a byte stream.
  • the image processing algorithms are mostly implemented in C or C++.
  • each algorithm needs to be packaged as a .so file and loaded into the project in advance; the encapsulated Java interface calls the algorithm's .so file through JNI.
  • Step 107 The image algorithm obtains the result (image or text, etc.) and sends it to the fourth bolt: WriteHdfsBlot in the form of a byte stream;
  • WriteHdfsBlot converts the byte stream into an image format or other type of processing result stored in HDFS.
  • Step 108 Finally, information such as the path of the processing result in HDFS is passed to the last bolt: KafkaBolt.
  • KafkaBolt sends this data information to the visual UI interface as a kafka message.
  • Step 109 After receiving the message, the visual interface downloads the processed image or other result from the HDFS according to the processing result path information stored on the HDFS, and displays it on the interface.
  • compared with the prior art, the image processing method according to the embodiment of the present invention can be applied to scenarios with high real-time requirements, has a wider application range and faster processing speed, and reduces both network bandwidth consumption and algorithm processing time.
  • in this embodiment, the user uploads the images to be processed on the visual interface and, after setting the algorithm type and parameters, submits the image processing task to the Storm module.
  • after the task starts, the images are first stored in HDFS, and the image path information is sent to the Storm module as a Kafka message.
  • the Storm module is responsible for downloading images from the HDFS storage module and converting them into byte streams.
  • it then calls the relevant algorithm interface according to the algorithm type.
  • the result of the image processing by the algorithm is first stored in HDFS.
  • finally, the Storm module sends the result path information, the processing log, and other information to the visual interface as a Kafka message so that the interface can display the results.
  • Embodiment 2 This embodiment takes an image compression algorithm as an example, and includes the following steps:
  • Step 201 The user uploads the images to be compressed (one or more) on the visual UI, selects image compression as the algorithm type, sets the compression factor parameter of the image compression algorithm (value range 0 to 1, default 0.85), and clicks the submit button, which submits the image compression task to the system;
  • Step 202 The visual UI first stores all the images uploaded by the user into the HDFS, and records the storage paths of all the image files at the same time;
  • Step 203 The visual UI sends a Kafka message to the Storm module.
  • in addition to the message number sessionid, the message carries information such as the HDFS storage path of a single image, the compression factor parameter set on the interface, and the algorithm type (image compression algorithm). One Kafka message is sent for each image to be processed, and multiple messages may be sent in succession;
  • Step 204 KafkaSpout in the Storm module receives the Kafka message and parses and splits the message fields.
  • the split fields are: message number sessionid, storage path information of the image on HDFS, compression factor parameter, and algorithm type (image compression algorithm);
  • Step 205 The split field is sent to the second bolt: ReadHdfsBolt; ReadHdfsBolt downloads the image from the HDFS according to the storage path information of each image and converts it into a byte stream;
  • Step 206 The converted byte stream is sent to the third bolt: AlgorithmBlot; when AlgorithmBlot determines that the algorithm type message field is image compression, it calls the encapsulated Java interface of the image compression algorithm.
  • the byte stream is used as the input parameter of the interface, and image compression yields a compressed image stream;
  • the image processing algorithms are mostly implemented in C or C++.
  • each algorithm needs to be packaged as a .so file and loaded into the project in advance; the encapsulated Java interface calls the algorithm's .so file through JNI.
  • Step 207 The compressed image is sent as a byte stream to a fourth bolt: WriteHdfsBlot;
  • WriteHdfsBlot converts the byte stream into an image format for storage in HDFS.
  • Step 208 Finally, information such as the path of the compressed image in HDFS is passed to the last bolt: KafkaBolt.
  • KafkaBolt sends this data information as a kafka message.
  • Step 209 After receiving the message, the visual UI downloads the compressed image from the HDFS according to the returned image path information, and displays it on the UI.
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be configured to store program code for performing the following steps:
  • S1, receiving one or more images uploaded by a user, and storing the images;
  • S2, downloading the images, performing image processing, and storing the processed images;
  • S3, downloading the processed images and displaying them.
  • the foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
  • this embodiment can be applied to scenarios with high real-time requirements, has a wider application range and faster processing speed, reduces network bandwidth consumption and algorithm processing time, and can improve the efficiency of processing large batches of image files.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Facsimiles In General (AREA)

Abstract

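An image processing system and an image processing method. The image processing system includes: a display module, configured to receive one or more images uploaded by a user, upload the images to a storage module, and send a first message to a processing module, and, after receiving a second message from the processing module, to download the corresponding processed image from the storage module according to the second message and display it; the processing module, configured to, after receiving the first message, download the image from the storage module according to the first message, perform image processing, upload the processed image to the storage module, and send the second message to the display module; and the storage module, configured to store the images uploaded by the display module and the processing module. The system can improve the efficiency of processing large batches of image files.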

Description

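Image processing system and image processing method
Technical Field
Embodiments of the present invention relate to the field of intelligent image processing, and in particular to an image processing system and an image processing method.
Background
Intelligent image processing algorithms, especially artificial intelligence algorithms such as deep learning, are computationally complex and generally require powerful computing resources. The most important problem in this field is achieving high performance and acceleration for these algorithms. Early on, it was natural to extend computing power with the massive processing resources of cloud computing. For example, Google's well-known cat face recognition experiment built a 9-layer deep neural network and ran it on a server cluster of 16,000 CPUs (central processing units), producing results after 3 days. The disadvantage of this approach is that computation is slow. The other approach is to use GPUs (Graphics Processing Units) for acceleration. For example, in the cat face recognition experiment, a Stanford University researcher named Adam Coates came up with a better solution: using a different kind of microprocessor, the GPU, he linked three computers together so that they ran as a single system, with results equivalent to those of Google's thousands of computers. This is a remarkable achievement.
Both approaches have advantages and disadvantages. Distributed computing is slow, but it scales easily, makes full use of the large pool of CPU resources already deployed, and requires little investment. GPUs compute quickly, but they are relatively new hardware, and using them at scale requires substantial investment.
At present, Hadoop, a software framework for distributed processing of large volumes of data that includes the Hadoop Distributed File System (HDFS), is widely used, and Hadoop-based distributed image processing solutions have appeared. Computationally intensive applications such as high-volume image processing also pose challenges for distributed system design, and Hadoop has its own shortcomings in such applications. For example, it targets full-dataset scenarios, runs serially within a task, and favors throughput with no guarantee on response time. Most critically, Hadoop is not suited to real-time analysis systems, which limits its application scenarios to some extent. Traditional distributed image processing systems are mostly built on remote procedure calls and NFS (Network File System), and have inherent deficiencies in system communication and storage.
In addition, when massive numbers of images are analyzed and processed, passing them through the system as streams severely consumes network bandwidth and increases response time.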
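Summary
The technical problem to be solved by the embodiments of the present invention is to provide an image processing system and an image processing method that improve the efficiency of processing large batches of image files.
To solve the above technical problem, an embodiment of the present invention provides an image processing system, including:
a display module, configured to receive one or more images uploaded by a user, upload the images to a storage module, and send a first message to a processing module; and, after receiving a second message from the processing module, to download the corresponding processed image from the storage module according to the second message and display it;
the processing module, configured to, after receiving the first message, download the image from the storage module according to the first message, perform image processing, upload the processed image to the storage module, and send the second message to the display module;
the storage module, configured to store the images uploaded by the display module and the processing module.
Optionally, the above image processing system further has the following feature:
the processing module is further configured to parse the first message after receiving it, where the parsed information includes storage path information of the image, and to download the corresponding image from the storage module according to the storage path information of the image.
Optionally, the above image processing system further has the following feature:
the image processing performed by the processing module includes: converting the image into a byte stream, and calling the corresponding image algorithm to process the byte stream.
Optionally, the above image processing system further has the following feature:
after performing image processing, the processing module is further configured to upload image processing log information to the storage module.
Optionally, the above image processing system further has the following feature: after receiving the second message from the processing module, the display module is further configured to download the corresponding image processing log information from the storage module and display it.
Optionally, the above image processing system further has the following feature: the first message is a message of Kafka, a distributed publish-subscribe messaging system.
Optionally, the above image processing system further has the following feature: the display module sends one first message to the processing module for each image uploaded by the user that it receives.
Optionally, the above image processing system further has the following feature: the first message carries the storage path information of the image.
Optionally, the above image processing system further has the following feature: the first message further carries information on the image algorithm parameters and the algorithm type set by the user.
Optionally, the above image processing system further has the following feature: the algorithm type includes any one of the following:
image compression, text recognition, objectionable image detection, and search by image.
Optionally, the above image processing system further has the following feature: the second message carries the storage path information of the processed image and of the log information.
Optionally, the above image processing system further has the following feature: the second message is a message of Kafka, a distributed publish-subscribe messaging system.
Optionally, the above image processing system further has the following feature: the storage module is the Hadoop Distributed File System (HDFS).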
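The two messages described above can be sketched as small records. This is only an illustration under stated assumptions: the patent names the content of each message (storage path, algorithm type, algorithm parameters, result and log paths) but not a wire format, so the JSON encoding, the class names, and every field name other than `sessionid` are hypothetical.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class FirstMessage:
    """Display module -> processing module (field names are hypothetical)."""
    sessionid: str
    image_path: str                       # storage path of one uploaded image
    algorithm_type: str                   # e.g. "image_compression"
    algorithm_params: dict = field(default_factory=dict)


@dataclass
class SecondMessage:
    """Processing module -> display module (field names are hypothetical)."""
    sessionid: str
    result_path: str                      # storage path of the processed image
    log_path: str                         # storage path of the processing log


def encode(message) -> bytes:
    """Serialize a message as UTF-8 JSON (JSON is an assumed wire format)."""
    return json.dumps(asdict(message), sort_keys=True).encode("utf-8")


def decode_first(payload: bytes) -> FirstMessage:
    """Parse an encoded first message back into its fields."""
    return FirstMessage(**json.loads(payload.decode("utf-8")))


request = FirstMessage("s-001", "/images/in/cat.jpg", "image_compression", {"factor": 0.85})
print(decode_first(encode(request)))
```

Note that neither message carries image bytes; both refer to stored data by path only.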
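To solve the above problem, an embodiment of the present invention further provides an image processing method, applied to the above image processing system, including:
receiving one or more images uploaded by a user, and storing the images;
downloading the images, performing image processing, and storing the processed images;
downloading the processed images and displaying them.
Optionally, the above method further has the following feature: performing image processing includes:
converting the downloaded image data into a byte stream, and calling the corresponding image algorithm to process the byte stream.
Optionally, the above method further has the following feature: after performing image processing, the method further includes:
storing image processing log information.
Optionally, the above method further has the following feature: the method further includes:
downloading the image processing log information and displaying it.
Optionally, the above method further has the following features:
storing the images includes: storing the images in the Hadoop Distributed File System (HDFS);
storing the processed images includes: storing the processed images and the image processing log information in the Hadoop Distributed File System (HDFS).
An embodiment of the present invention further provides a computer storage medium, which may store execution instructions for performing the image processing method of the above embodiments.
In summary, the embodiments of the present invention provide an image processing system and an image processing method that can improve the efficiency of processing large batches of image files.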
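Brief Description of the Drawings
FIG. 1 is a schematic diagram of an image processing system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image processing system according to an application example of the present invention;
FIG. 3 is a flowchart of an image processing method according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. Note that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.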
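The object of the present invention is to overcome the deficiencies of the prior art. This embodiment provides an image processing system. As shown in FIG. 1, the image processing system of this embodiment includes:
a display module, configured to receive one or more images uploaded by a user, upload the images to a storage module, and send a first message to a processing module; and, after receiving a second message from the processing module, to download the corresponding image from the storage module according to the second message and display it;
the processing module, configured to, after receiving the first message, download the image from the storage module according to the first message, perform image processing, upload the processed image to the storage module, and send the second message to the display module;
the storage module, configured to store the images uploaded by the display module and the processing module.
In a preferred embodiment, the processing module may be further configured to parse the first message after receiving it, where the parsed information includes storage path information of the image, and to download the corresponding image from the storage module according to the storage path information of the image.
In a preferred embodiment, the image processing performed by the processing module may include: converting the image into a byte stream, and calling the corresponding image algorithm to process the byte stream.
The storage module in this embodiment may be an HDFS (Hadoop Distributed File System) storage module configured to store images.
This embodiment combines CPUs and GPUs to make full use of the scalability of a distributed cluster and the high speed of GPU computation, and studies this combination. The processing module may use Storm, an open source distributed real-time computing system with good real-time processing performance, with the aim of making data analysis more real-time and efficient.
In addition, in this embodiment the images to be processed and the processed images or results are all stored in HDFS. The image located by its path information is converted into a byte stream before the image processing algorithm is called, and the processed image or result is sent onward as a byte stream after image analysis completes. All other stages pass only the path information of the image stored in HDFS, which greatly reduces the network load and speeds up processing.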
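A rough back-of-the-envelope sketch of why passing paths is cheaper than passing image bytes. The image size, the number of messaging hops, and the JSON message layout are all assumptions chosen for illustration; the patent gives no figures.

```python
import json


def path_message(sessionid: str, image_path: str) -> bytes:
    """Build a small message that refers to an image by its storage path."""
    return json.dumps({"sessionid": sessionid, "path": image_path}).encode("utf-8")


def total_transfer(payload_size: int, hops: int) -> int:
    """Bytes moved when a payload of `payload_size` bytes crosses `hops` links."""
    return payload_size * hops


image = bytes(2 * 1024 * 1024)            # stand-in for a 2 MiB image (assumed size)
message = path_message("s-001", "/images/in/cat.jpg")
hops = 3                                  # assumed number of messaging hops

by_value = total_transfer(len(image), hops)
# Passing the path still requires one download of the image from storage.
by_path = total_transfer(len(message), hops) + len(image)
print(by_value, by_path)
```

Under these assumed numbers the path-based variant moves roughly one image's worth of data instead of three.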
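For massive numbers of images, a message queue based on Kafka, a high-throughput distributed publish-subscribe messaging system, is used: each image processing task sends one Kafka message to the Storm system. Multiple messages are then processed concurrently, which avoids looping over messages one at a time in the Storm system.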
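The one-message-per-image idea can be sketched with plain Python stand-ins. Here `queue.Queue` plays the role of the Kafka queue and a thread pool plays the role of Storm's parallel workers; neither is the real system, and `handle` is a hypothetical placeholder for the download, processing, and upload of one image.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def enqueue_tasks(image_paths, task_queue):
    """Put one message per image on the queue (stand-in for one Kafka message each)."""
    for index, path in enumerate(image_paths):
        task_queue.put({"sessionid": f"s-{index:03d}", "path": path})


def handle(message):
    """Hypothetical placeholder for processing a single image."""
    return message["sessionid"], message["path"].upper()


def drain_concurrently(task_queue, workers=4):
    """Process all queued messages on a thread pool instead of a serial loop."""
    messages = []
    while True:
        try:
            messages.append(task_queue.get_nowait())
        except queue.Empty:
            break
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(handle, messages))


tasks = queue.Queue()
enqueue_tasks(["/in/a.jpg", "/in/b.jpg", "/in/c.jpg"], tasks)
results = drain_concurrently(tasks)
print(results)
```

Because each message describes exactly one image, the messages are independent and can be handled in any order or in parallel.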
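FIG. 2 is a schematic diagram of an image processing system according to an application example of the present invention. As shown in FIG. 2, the image processing system of this embodiment includes the following modules: a visual UI (user interface, corresponding to the display module above), a Kafka message queue module, HDFS, a Storm module, and an intelligent image processing algorithm module, where:
Visual UI: on this UI, the user can select the image algorithm type, set the algorithm parameters, and upload local images (a single image or a folder). When the user clicks the submit button, the selected image processing algorithm task is submitted to the system and the task starts. After the task starts, the images uploaded by the user are first stored in HDFS, and a Kafka message is sent to the Storm module.
Kafka message queue module: Kafka is a high-throughput distributed publish-subscribe messaging system that provides message persistence through O(1) (constant complexity) disk data structures and offers high throughput and long-term stable performance. This module is responsible for sending and receiving messages for the entire image analysis system. On the input side, it sends the image path, algorithm type, algorithm parameters, and other information to the Storm module. On the output side, it sends the result path computed by the Storm module, and even the trace log of the algorithm processing and other information, to the visual interface, which then displays the image processing result.
HDFS: in this embodiment, besides storing user-uploaded images, HDFS also stores the results of image algorithm processing. This avoids the bandwidth consumption caused by passing large batches of image streams through the system.
Storm module: Storm is a highly scalable, very fast, and fault-tolerant open source real-time distributed computing system that is highly focused on stream processing. Storm excels in event processing and incremental computing and can process data streams in real time according to changing parameters. In this system the module receives Kafka messages, parses and splits message fields, downloads images from and uploads images to HDFS, and schedules algorithm processing; it is the core processing module.
Intelligent image processing module: this module is the algorithm core, and all intelligent image algorithms are encapsulated in it. The image algorithms of this embodiment are mostly implemented in C or C++; each algorithm needs to be packaged as a .so file and exposed through a callable Java interface so that the image algorithm can be invoked.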
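How one module might expose many algorithms behind a uniform bytes-in, bytes-out interface can be sketched as a small registry keyed by algorithm type. This is a Python stand-in, not the Java-over-JNI interface the text describes; the registry, the decorator, and the toy `byte_reverse` algorithm are all hypothetical.

```python
from typing import Callable, Dict

# Every algorithm takes the image bytes plus a parameter dict and returns bytes.
Algorithm = Callable[[bytes, dict], bytes]
_REGISTRY: Dict[str, Algorithm] = {}


def register(algorithm_type: str):
    """Decorator that files an algorithm under its type name."""
    def decorator(func: Algorithm) -> Algorithm:
        _REGISTRY[algorithm_type] = func
        return func
    return decorator


@register("byte_reverse")
def byte_reverse(data: bytes, params: dict) -> bytes:
    """Toy stand-in algorithm; it is not a real image operation."""
    return data[::-1]


def run_algorithm(algorithm_type: str, data: bytes, params: dict) -> bytes:
    """Look up the algorithm named in the message and apply it to the byte stream."""
    try:
        algorithm = _REGISTRY[algorithm_type]
    except KeyError:
        raise ValueError(f"unknown algorithm type: {algorithm_type!r}") from None
    return algorithm(data, params)


print(run_algorithm("byte_reverse", b"abc", {}))
```

Keeping the interface uniform means the caller needs only the algorithm type field from the message to select an implementation.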
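FIG. 3 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in FIG. 3, the method of this embodiment is applied to the above image processing system and includes the following steps:
Step 11: receive one or more images uploaded by a user, and store the images;
Step 12: download the images, perform image processing, and store the processed images;
Step 13: download the processed images and display them.
The image processing method of this embodiment processes images faster and also reduces network bandwidth consumption.
The method of the embodiments of the present invention is described in detail below using two specific embodiments.
Embodiment 1
The image processing method of this embodiment includes the following steps:
Step 101: the user uploads the images to be processed on the visual UI, sets the algorithm-related parameters, and clicks submit, which submits an image processing task to the system;
Step 102: the visual UI first stores all the images uploaded by the user in HDFS and records the storage path information of all the image files;
Step 103: the visual UI sends a Kafka message to the Storm module, where the message carries information such as the HDFS storage path of a single image, the image algorithm parameters set on the interface, and the algorithm type; one Kafka message is sent for each image to be processed, and multiple messages may be sent in succession;
The algorithm types include, but are not limited to: image compression, text recognition (for example, OCR (Optical Character Recognition)), objectionable image detection, and search by image.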
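Step 104: KafkaSpout (the Kafka message source) in the Storm module receives the Kafka message and sends it to the first bolt (message handler), which parses and splits the message fields. The split fields are: message number (sessionid), the storage path of the image on HDFS, image parameters, algorithm type, and other information.
Step 105: the split fields are sent to the second bolt: ReadHdfsBolt;
ReadHdfsBolt downloads each image from HDFS according to its storage path information and converts it into a byte stream.
Step 106: the converted byte stream is sent to the third bolt: AlgorithmBlot;
AlgorithmBlot calls the corresponding intelligent image algorithm to process the byte stream according to the algorithm type message field, and obtains the corresponding processing result (an image, text, and so on).
In this embodiment, AlgorithmBlot calls the already encapsulated Java interface of the corresponding image algorithm. The byte stream is the input parameter of the interface, and the result obtained after image processing is converted into a byte stream.
The image processing algorithms are mostly implemented in C or C++. Each algorithm needs to be packaged as a .so file and loaded into the project in advance, and the encapsulated Java interface calls the algorithm's .so file through JNI.
Step 107: the result obtained by the image algorithm (an image, text, and so on) is sent as a byte stream to the fourth bolt: WriteHdfsBlot;
WriteHdfsBlot converts the byte stream into an image format or another type of processing result and stores it in HDFS.
Step 108: finally, information such as the path of the processing result in HDFS is passed to the last bolt: KafkaBolt.
KafkaBolt sends this information to the visual UI as a Kafka message.
Step 109: after receiving the message, the visual interface downloads the processed image or other result from HDFS according to the path information of the processing result stored on HDFS, and displays it on the interface.
Compared with the prior art, the image processing method according to the embodiment of the present invention can be applied to scenarios with high real-time requirements, has a wider application range and faster processing speed, and reduces both network bandwidth consumption and algorithm processing time.
In this embodiment, the user uploads the images to be processed on the visual interface and, after setting the algorithm type and parameters, submits the image processing task to the Storm module. After the task starts, the images are first stored in HDFS, and the image path information is sent to the Storm module as a Kafka message. The Storm module is responsible for downloading images from the HDFS storage module, converting them into byte streams, and calling the relevant algorithm interface according to the algorithm type. The result of the image processing by the algorithm is first stored in HDFS. Finally, the Storm module sends the result path information, the processing log, and other information to the visual interface as a Kafka message so that the interface can display the results.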
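The chain of stages in Steps 104 to 108 can be sketched as plain functions that each take and return a dictionary of fields. This is a minimal stand-in under stated assumptions: a dict replaces HDFS, ordinary function calls replace Storm's spout and bolts, JSON replaces the unspecified message format, and the `uppercase` algorithm and all function names are hypothetical.

```python
import json

STORE = {"/in/cat.img": b"raw-image-bytes"}   # path -> bytes, stand-in for HDFS


def parse_bolt(raw: bytes) -> dict:
    """First bolt: split the incoming message into its fields."""
    fields = json.loads(raw.decode("utf-8"))
    return {key: fields[key] for key in ("sessionid", "path", "params", "type")}


def read_bolt(tup: dict) -> dict:
    """Stand-in for ReadHdfsBolt: fetch the image by path and attach its bytes."""
    return {**tup, "data": STORE[tup["path"]]}


ALGORITHMS = {"uppercase": lambda data, params: data.upper()}  # toy algorithm


def algorithm_bolt(tup: dict) -> dict:
    """Stand-in for the algorithm bolt: dispatch on the algorithm type field."""
    return {**tup, "data": ALGORITHMS[tup["type"]](tup["data"], tup["params"])}


def write_bolt(tup: dict) -> dict:
    """Stand-in for the write bolt: store the result and keep only its path."""
    result_path = tup["path"] + ".out"
    STORE[result_path] = tup["data"]
    return {"sessionid": tup["sessionid"], "result_path": result_path}


def notify_bolt(tup: dict) -> bytes:
    """Stand-in for KafkaBolt: encode the reply sent back to the UI."""
    return json.dumps(tup, sort_keys=True).encode("utf-8")


def run_topology(raw: bytes) -> bytes:
    """Run one message through all stages in order."""
    tup = parse_bolt(raw)
    for stage in (read_bolt, algorithm_bolt, write_bolt):
        tup = stage(tup)
    return notify_bolt(tup)


incoming = json.dumps(
    {"sessionid": "s-001", "path": "/in/cat.img", "params": {}, "type": "uppercase"}
).encode("utf-8")
reply = json.loads(run_topology(incoming))
print(reply)
```

Image bytes appear only between the read, algorithm, and write stages; the incoming and outgoing messages carry paths.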
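Embodiment 2: this embodiment takes an image compression algorithm as an example and includes the following steps:
Step 201: the user uploads the images to be compressed (one or more) on the visual UI, selects image compression as the algorithm type, sets the compression factor parameter of the image compression algorithm (value range 0 to 1, default 0.85), and clicks the submit button, which submits an image compression task to the system;
Step 202: the visual UI first stores all the images uploaded by the user in HDFS and records the storage paths of all the image files;
Step 203: the visual UI sends a Kafka message to the Storm module. In addition to the message number sessionid, the message carries information such as the HDFS storage path of a single image, the compression factor parameter set on the interface, and the algorithm type (image compression algorithm); one Kafka message is sent for each image to be processed, and multiple messages may be sent in succession;
Step 204: KafkaSpout in the Storm module receives the Kafka message and parses and splits the message fields. The split fields are: message number sessionid, storage path information of the image on HDFS, compression factor parameter, and algorithm type (image compression algorithm);
Step 205: the split fields are sent to the second bolt: ReadHdfsBolt; ReadHdfsBolt downloads each image from HDFS according to its storage path information and converts it into a byte stream;
Step 206: the converted byte stream is sent to the third bolt: AlgorithmBlot; when AlgorithmBlot determines that the algorithm type message field is image compression, it calls the already encapsulated Java interface of the image compression algorithm. The byte stream is the input parameter of the interface, and image compression yields a compressed image stream;
The image processing algorithms are mostly implemented in C or C++. Each algorithm needs to be packaged as a .so file and loaded into the project in advance, and the encapsulated Java interface calls the algorithm's .so file through JNI.
Step 207: the compressed image is sent as a byte stream to the fourth bolt: WriteHdfsBlot;
WriteHdfsBlot converts the byte stream into an image format and stores it in HDFS.
Step 208: finally, information such as the path of the compressed image in HDFS is passed to the last bolt: KafkaBolt.
KafkaBolt sends this information out as a Kafka message.
Step 209: after receiving the message, the visual UI downloads the compressed image from HDFS according to the returned image path information and displays it on the UI.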
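A small sketch of handling the compression factor from Step 201. The patent does not say how the factor is interpreted or which compression algorithm is used, so everything beyond the 0 to 1 range and the 0.85 default is an assumption: lossless `zlib` stands in for the real image compressor, the factor is mapped linearly onto zlib levels 0 to 9 purely for illustration, and a dict stands in for HDFS.

```python
import zlib

DEFAULT_FACTOR = 0.85  # default value given in Step 201


def validate_factor(factor: float) -> float:
    """Reject factors outside the 0..1 range stated in Step 201."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError(f"compression factor must be in [0, 1], got {factor}")
    return factor


def compress_image_bytes(data: bytes, factor: float = DEFAULT_FACTOR) -> bytes:
    """Compress a byte stream with zlib (a stand-in, not a real image codec)."""
    level = round(validate_factor(factor) * 9)   # assumed mapping onto zlib levels
    return zlib.compress(data, level)


def handle_compression_task(store: dict, message: dict) -> dict:
    """Read by path, compress, write the result, and reply with the result path."""
    data = store[message["path"]]
    result_path = message["path"] + ".compressed"
    store[result_path] = compress_image_bytes(data, message.get("factor", DEFAULT_FACTOR))
    return {"sessionid": message["sessionid"], "result_path": result_path}


store = {"/in/photo.raw": b"\x00\x10\x20" * 4096}   # dict stand-in for HDFS
reply = handle_compression_task(
    store, {"sessionid": "s-001", "path": "/in/photo.raw", "factor": 0.85}
)
print(reply, len(store[reply["result_path"]]))
```

A real deployment would replace `compress_image_bytes` with the native image compression routine; the validation and path handling around it would stay the same.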
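Embodiments of the present invention further provide a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for performing the following steps:
S1, receiving one or more images uploaded by a user, and storing the images;
S2, downloading the images, performing image processing, and storing the processed images;
S3, downloading the processed images and displaying them.
Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
Optionally, for specific examples in this embodiment, refer to the examples described in the above embodiments and optional implementations; they are not repeated here.
A person of ordinary skill in the art will understand that all or some of the steps of the above method may be completed by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented using one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in hardware or as a software functional module. The present invention is not limited to any particular combination of hardware and software.
The above are only preferred embodiments of the present invention. The present invention may of course have various other embodiments, and those skilled in the art may make corresponding changes and modifications according to the present invention without departing from its spirit and essence, but all such changes and modifications shall fall within the protection scope of the claims appended to the present invention.
Industrial Applicability
This embodiment can be applied to scenarios with high real-time requirements, has a wider application range and faster processing speed, and reduces both network bandwidth consumption and algorithm processing time. It can improve the efficiency of processing large batches of image files.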

Claims (18)

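  1. An image processing system, comprising:
    a display module, configured to receive one or more images uploaded by a user, upload the images to a storage module, and send a first message to a processing module; and, after receiving a second message from the processing module, to download the corresponding processed image from the storage module according to the second message and display it;
    the processing module, configured to, after receiving the first message, download the image from the storage module according to the first message, perform image processing, upload the processed image to the storage module, and send the second message to the display module;
    the storage module, configured to store the images uploaded by the display module and the processing module.
  2. The image processing system according to claim 1, wherein:
    the processing module is further configured to parse the first message after receiving it, where the parsed information includes storage path information of the image, and to download the corresponding image from the storage module according to the storage path information of the image.
  3. The image processing system according to claim 1, wherein:
    the image processing performed by the processing module comprises: converting the image into a byte stream, and calling the corresponding image algorithm to process the byte stream.
  4. The image processing system according to claim 1, wherein:
    after performing image processing, the processing module is further configured to upload image processing log information to the storage module.
  5. The image processing system according to claim 4, wherein:
    after receiving the second message from the processing module, the display module is further configured to download the corresponding image processing log information from the storage module and display it.
  6. The image processing system according to claim 1, wherein:
    the first message is a message of Kafka, a distributed publish-subscribe messaging system.
  7. The image processing system according to claim 6, wherein:
    the display module sends one first message to the processing module for each image uploaded by the user that it receives.
  8. The image processing system according to claim 1, wherein:
    the first message carries the storage path information of the image.
  9. The image processing system according to claim 8, wherein:
    the first message further carries information on the image algorithm parameters and the algorithm type set by the user.
  10. The method according to claim 9, wherein the algorithm type comprises any one of the following:
    image compression, text recognition, objectionable image detection, and search by image.
  11. The image processing system according to claim 1, wherein:
    the second message carries the storage path information of the processed image and of the log information.
  12. The image processing system according to claim 11, wherein:
    the second message is a message of Kafka, a distributed publish-subscribe messaging system.
  13. The image processing system according to any one of claims 1 to 12, wherein:
    the storage module is the Hadoop Distributed File System (HDFS).
  14. An image processing method, applied to the image processing system according to any one of claims 1 to 13, comprising:
    receiving one or more images uploaded by a user, and storing the images;
    downloading the images, performing image processing, and storing the processed images;
    downloading the processed images and displaying them.
  15. The method according to claim 14, wherein performing image processing comprises:
    converting the downloaded image data into a byte stream, and calling the corresponding image algorithm to process the byte stream.
  16. The method according to claim 14, wherein, after performing image processing, the method further comprises:
    storing image processing log information.
  17. The method according to claim 16, further comprising:
    downloading the image processing log information and displaying it.
  18. The method according to any one of claims 14 to 17, wherein:
    storing the images comprises: storing the images in the Hadoop Distributed File System (HDFS);
    storing the processed images comprises: storing the processed images and the image processing log information in the Hadoop Distributed File System (HDFS).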
PCT/CN2016/101482 2015-11-02 2016-10-08 Image processing system and image processing method WO2017076149A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510731253.0 2015-11-02
CN201510731253.0A CN106649377A (zh) 2015-11-02 2015-11-02 Image processing system and image processing method

Publications (1)

Publication Number Publication Date
WO2017076149A1 true WO2017076149A1 (zh) 2017-05-11

Family

ID=58663182

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/101482 WO2017076149A1 (zh) 2015-11-02 2016-10-08 Image processing system and image processing method

Country Status (2)

Country Link
CN (1) CN106649377A (zh)
WO (1) WO2017076149A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110865974A (zh) * 2019-09-27 2020-03-06 苏州浪潮智能科技有限公司 Method for intelligently loading offline SQL table data based on kafka
CN114040223A (zh) * 2021-11-05 2022-02-11 湖北亿咖通科技有限公司 Image processing method and system

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
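CN108984614B (zh) * 2018-06-12 2022-01-25 成都三零凯天通信实业有限公司 Rapid visual image recognition method based on a big data environment
CN110308944B (zh) * 2019-05-15 2022-03-22 广东电网有限责任公司广州供电局 Configuration file processing method, system, computer device, and storage medium
CN111813574A (zh) * 2020-07-02 2020-10-23 Oppo(重庆)智能科技有限公司 Picture compression method and apparatus, storage medium, and electronic device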

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
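CN1787447A (zh) * 2005-11-16 2006-06-14 中国科学院上海技术物理研究所 Method and system for implementing interactive network consultation with medical images acquired through multipoint addressing
CN101950302A (zh) * 2010-09-29 2011-01-19 李晓耕 Method for managing a massive music library based on a mobile device
CN103533073A (zh) * 2013-10-23 2014-01-22 北京网秦天下科技有限公司 File management system and method for mobile device files
CN103560945A (zh) * 2013-11-14 2014-02-05 麦克奥迪(厦门)医疗诊断系统有限公司 Method for instantly browsing ultra-large images over a network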
US8683338B2 (en) * 2007-03-29 2014-03-25 Brother Kogyo Kabushiki Kaisha Information processing device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
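CN104036025A (zh) * 2014-06-27 2014-09-10 蓝盾信息安全技术有限公司 Distributed massive log collection system
CN104615777A (zh) * 2015-02-27 2015-05-13 浪潮集团有限公司 Real-time data processing method and apparatus based on a stream computing engine
CN104992147A (zh) * 2015-06-09 2015-10-21 中国石油大学(华东) License plate recognition method using deep learning based on a fast-slow combined cloud computing environment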

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110865974A (zh) * 2019-09-27 2020-03-06 苏州浪潮智能科技有限公司 Method for intelligently loading offline SQL table data based on kafka
CN114040223A (zh) * 2021-11-05 2022-02-11 湖北亿咖通科技有限公司 Image processing method and system
CN114040223B (zh) * 2021-11-05 2023-11-24 亿咖通(湖北)技术有限公司 Image processing method and system

Also Published As

Publication number Publication date
CN106649377A (zh) 2017-05-10

Similar Documents

Publication Publication Date Title
WO2017076149A1 (zh) Image processing system and image processing method
US10698625B2 (en) Data pipeline architecture for analytics processing stack
CN109327509B (zh) Low-coupling distributed stream computing system with a master/slave architecture
US10996878B2 (en) Data pipeline architecture for cloud processing of structured and unstructured data
WO2016206600A1 (zh) Method and apparatus for processing information stream data
US11188380B2 (en) Method and apparatus for processing task in smart device
Tan et al. An approach for fast and parallel video processing on Apache Hadoop clusters
US20120131088A1 (en) Multimedia information retrieval system with progressive feature selection and submission
WO2019047441A1 (zh) Communication optimization method and system
US20140149452A1 (en) Systems and methods for providing messages for a java message service
US20200019854A1 (en) Method of accelerating execution of machine learning based application tasks in a computing device
JP2017506373A (ja) Parallel access to data in a distributed file system
CN110781180B (zh) Data screening method and data screening apparatus
US11494437B1 (en) System and method for performing object-modifying commands in an unstructured storage service
Poojara et al. Serverless data pipeline approaches for IoT data in fog and cloud computing
WO2022104612A1 (zh) Data distribution flow configuration method and apparatus, electronic device, and storage medium
US20230275976A1 (en) Data processing method and apparatus, and computer-readable storage medium
WO2020228452A1 (zh) Unstructured data processing method and unstructured data processing system
WO2017031894A1 (zh) Data search method, apparatus, and terminal
US9111009B2 (en) Self-parsing XML documents to improve XML processing
CN111177237A (zh) Data processing system, method, and apparatus
WO2022104611A1 (zh) Data distribution system and data distribution method
WO2016008317A1 (zh) Data processing method and central node
US11741133B1 (en) System and method for information management
CN115378937A (zh) Distributed concurrent method, apparatus, and device for tasks, and readable storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 16861417; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 EP: PCT application non-entry in European phase
Ref document number: 16861417; Country of ref document: EP; Kind code of ref document: A1