CN115080531A - Distributed storage based image processing method, system, device and medium - Google Patents

Distributed storage based image processing method, system, device and medium

Info

Publication number
CN115080531A
CN115080531A (application number CN202210529580.8A)
Authority
CN
China
Prior art keywords
image
scene
file
data
target image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210529580.8A
Other languages
Chinese (zh)
Inventor
林杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd filed Critical Chongqing Unisinsight Technology Co Ltd
Priority to CN202210529580.8A priority Critical patent/CN115080531A/en
Publication of CN115080531A publication Critical patent/CN115080531A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/18: File system types
    • G06F 16/182: Distributed file systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/17: Details of further file system functions
    • G06F 16/172: Caching, prefetching or hoarding of files
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/17: Details of further file system functions
    • G06F 16/174: Redundancy elimination performed by the file system

Abstract

The application provides a distributed storage based image processing method, system, device, and medium applicable to the field of image processing. The method comprises: acquiring a scene image captured by an image device and a target image within the scene image; parsing the scene image to determine the structured data and image data of the scene image; aggregating and storing the image data of a plurality of scene images into a large file for distributed storage and generating a unique file identifier for the large file; and determining the file identifier of the scene image according to the correspondence between the image name of the scene image and the file identifier, combined with the region information of the target image within the scene image and the source identifier of the image data, and storing the target image by adding the region information to the file identifier of the scene image. Because the target image is stored by adding region information to an existing file identifier, duplicate images are not stored again and the utilization rate of the image storage space is greatly improved.

Description

Distributed storage based image processing method, system, device and medium
Technical Field
The present application relates to the fields of image processing and data storage, and in particular to a distributed storage based image processing method, system, device, and medium.
Background
Distributed storage is a data storage technology that uses the disk space on individual machines through a network and combines these scattered storage resources into a virtual storage device, so that data is distributed across the disks of many nodes.
In the related art, in order to maximize storage space utilization, the MD5 (Message-Digest Algorithm) values of files (images) are usually compared, and files with identical values are treated as duplicates and deleted. However, this approach ignores the repetition of content regions between files, so such files are still stored redundantly and considerable storage space is wasted.
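For context, whole-file deduplication of this kind can be sketched as follows (a minimal illustration of the related art described above, not code from this application; the file-path inputs and the in-memory index are hypothetical):

```python
import hashlib

def md5_of_file(path: str) -> str:
    """Compute the MD5 digest of a file's complete contents."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(paths):
    """Whole-file deduplication: files with identical MD5 values are duplicates.

    Hashing whole files only catches byte-identical images; a face crop whose
    pixels also exist inside a larger scene image is never detected, which is
    the shortcoming this application addresses.
    """
    seen = {}        # md5 value -> first path seen with that value
    duplicates = []  # (duplicate path, original path)
    for path in paths:
        key = md5_of_file(path)
        if key in seen:
            duplicates.append((path, seen[key]))
        else:
            seen[key] = path
    return duplicates
```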
Summary of the application
In view of the above-mentioned shortcomings in the prior art, the present application provides a method, system, device and medium for processing images based on distributed storage to solve the above-mentioned technical problems.
The application provides an image processing method based on distributed storage, which comprises the following steps:
acquiring a scene image acquired by image equipment and a target image in the scene image;
analyzing the scene image, and determining structured data and image data of the scene image, wherein the structured data comprises an image name of the image data, area information of the target image in the image data, and a source identifier of the image data corresponding to the target image;
aggregating and storing image data of a plurality of scene images into a large file for distributed storage, and generating a unique file identifier of the large file; storing the file identification in association with the image name of the scene image;
and determining the file identifier of the scene image associated with the target image by combining the area information of the target image and the source identifier according to the corresponding relation between the image name of the scene image and the file identifier, and storing the target image by adding the area information in the file identifier.
In one possible implementation, acquiring a scene image captured by an imaging device and a target image in the scene image includes:
acquiring a scene image and a target image of the same scene by using the same image device, and/or acquiring the scene image and the target image of the same scene by using different image devices.
In a possible implementation, parsing the scene image to determine structured data and image data of the scene image includes:
analyzing the scene image based on an analysis interface protocol to obtain structured data and image data of the scene image; the structured data comprises the image name of the image data, the region information of the target image in the image data, and the source identifier of the image data corresponding to the target image;
wherein the region information of the target image in the image data is determined by comparing the similarity of the target image and the image data; and determining the source identification of the image data corresponding to the target image according to the similarity result of the target image and the image data.
In a possible implementation manner, the aggregating and storing of image data of a plurality of scene images into a large file for distributed storage, and generating a file identifier of the large file, includes:
aggregating and storing the image data of a plurality of scene images into one large file, wherein each large file corresponds to a unique file identifier, and storing the structured data of the scene images in association with it;
storing each large file in a distributed manner according to a load balancing principle, wherein the file identifier comprises a file name, the offset of the image data in the large file, the image size, and the region information of the target image in the scene image.
In a possible implementation manner, after storing the target image by adding the area information in the file identifier, the method further includes:
positioning the storage position of the large file by using the file identification;
after the large file is determined, reading a scene image of the large file according to the file identification, and storing the scene image in a scanning line array mode;
and reading the corresponding row and column data from the scan line array by using the region information of the target image in the scene image, so as to restore the target image.
In a possible implementation manner, before the correspondence between the image name of the scene image and the file identifier is looked up, the method further includes:
judging the relationship between the parsed region information of the scene image and the target image;
if the target image falls within the parsed region information of the scene image, storing the target image in the corresponding scene image according to the file identifier;
and if the target image does not fall within the parsed region information of the scene image, storing the scene image and the target image in a distributed manner by adopting an aggregation storage mode.
In one possible embodiment, the distributed storage further includes:
when a distributed storage request is received, acquiring a writable area of a disk in a distributed storage system; the distributed storage request comprises the size of the large file to be written, and the disk is distributed and divided into a plurality of writable areas;
determining writing information according to the distributed storage request and the residual storage space of the writable area; the writing information comprises the length of each object obtained by segmenting the large file to be written;
and returning the writing information to the computing node so that the computing node writes the file to be written into the corresponding storage node according to the writing information.
The present application also provides a distributed storage based image processing system, the system comprising:
an acquisition module, configured to acquire a scene image captured by an image device and a target image within the scene image;
the analysis processing module is used for analyzing the scene image and determining structured data and image data of the scene image, wherein the structured data comprises an image name of the image data, area information of the target image in the image data and a source identifier of the image data corresponding to the target image;
an aggregation storage module, configured to aggregate and store the image data of a plurality of scene images into a large file for distributed storage, and to generate a unique file identifier for the large file; and to store the file identifier in association with the image name of the scene image;
and an image storage module, configured to determine the file identifier of the scene image associated with the target image by combining the region information of the target image and the source identifier according to the correspondence between the image name of the scene image and the file identifier, and to store the target image by adding the region information to the file identifier.
The application also provides an electronic device comprising a processor, a memory and a communication bus;
the communication bus is used for connecting the processor and the memory;
the processor is configured to execute the computer program stored in the memory to implement the method according to any one of the embodiments described above.
The present application also provides a computer-readable storage medium having stored thereon a computer program,
the computer program is for causing a computer to perform a method as in any one of the embodiments described above.
The beneficial effects of this application: the method obtains a scene image and a target image belonging to the scene image, parses the scene image, and determines the structured data and image data of the scene image. After the image data of a plurality of scene images is aggregated and stored into a large file, the file identifier of the scene image associated with the target image is determined according to the correspondence between the image name of the scene image and the file identifier, combined with the region information of the target image and the source identifier, and the target image is stored by adding the region information to the file identifier. On the one hand, the aggregation storage mode allows the storage space to be used to the maximum extent and reduces waste of storage space; on the other hand, storing the target image by adding region information to the file identifier avoids storing duplicate images and greatly improves the utilization rate of the image storage space.
Drawings
FIG. 1 is a schematic diagram of a network architecture provided in an embodiment of the present application;
FIG. 2 is a flow chart of a distributed storage based image processing method provided in an embodiment of the present application;
FIG. 3 is a flowchart of image restoration based on a distributed storage image provided in an embodiment of the present application;
FIG. 4 is a flow chart of distributed storage based on a distributed storage image provided in an embodiment of the present application;
FIG. 5 is a block diagram of a distributed storage based image processing system framework provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 7 is a schematic view of region repetition in an acquired image according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a distributed storage based image processing storage according to an embodiment of the present application;
fig. 9 is a schematic diagram of distributed storage based on a distributed storage image according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application is provided by way of specific examples, and other advantages and effects of the present application will be readily apparent to those skilled in the art from the disclosure herein. The present application is capable of other and different embodiments and its several details are capable of modifications and/or changes in various respects, all without departing from the spirit of the present application. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only intended to illustrate the basic idea of the present application. The drawings show only the components related to the present application and are not drawn according to the number, shape and size of the components in an actual implementation; the type, quantity and proportion of each component, as well as the layout of the components, may differ and may be more complicated in an actual implementation.
In the following description, numerous details are set forth to provide a more thorough explanation of the embodiments of the present application, however, it will be apparent to one skilled in the art that the embodiments of the present application may be practiced without these specific details, and in other embodiments, well-known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring the embodiments of the present application.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present disclosure. As shown in fig. 1, the network architecture may include a server 01 (or server cluster) and a user terminal cluster. The user terminal cluster may comprise one or more user terminals; the number of user terminals is not limited here. As shown in fig. 1, it may specifically include a user terminal 100a, a user terminal 100b, a user terminal 100c, ..., and a user terminal 100n. The user terminal 100a, the user terminal 100b, the user terminal 100c, ..., and the user terminal 100n may each be connected to the server 01 via a network, so that each user terminal may interact with the server 01 through the network. The specific connection mode of the network connection is not limited here; for example, the connection may be made directly or indirectly through wired communication, or directly or indirectly through wireless communication.
Each user terminal in the user terminal cluster may be an intelligent terminal with an image data processing function, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, a vehicle-mounted terminal, or a smart television. It should be understood that each user terminal in the user terminal cluster shown in fig. 1 may be installed with a target application (i.e., an application client), and when the application client runs in a user terminal, it may perform data interaction with the server 01 shown in fig. 1. The application client may include a social client, a multimedia client (e.g., a video client), an entertainment client (e.g., a game client), an education client, a live-streaming client, and the like. The application client may be an independent client, or may be an applet integrated in another client (for example, a social client, an education client, a multimedia client, and the like), which is not limited herein.
As shown in fig. 1, the server 01 in the embodiment of the present application may be a server corresponding to the application client. The server 01 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services.
For convenience of understanding, in the embodiment of the present application, one user terminal may be selected as the target user terminal from the plurality of user terminals shown in fig. 1. For example, the user terminal 100a shown in fig. 1 may be used as the target user terminal, and a target application (i.e., an application client) may be integrated in it. The target user terminal may then exchange data with the server 01 through the service data platform corresponding to the application client. The target application can run a trained target feature learning model; the hash feature of a currently acquired image to be queried can be accurately learned through this model, and whether a target image with higher similarity to the image to be queried exists in the image data processing system can then be quickly judged through the binary coding feature corresponding to that hash feature. For example, in the field of AI security applications, in order to support big data analysis functions, an original scene image shot by a front-end camera needs to undergo object detection for human faces, human figures, motor vehicles, non-motor vehicles and the like, and the detected objects are calibrated and stored. Fig. 7 is a schematic diagram of region repetition in an acquired image provided in an embodiment of the present application: the upper picture is the original scene image (a scene image, i.e., a first image) shot by the front-end camera, with a size of about 300 KB; the lower-left picture is a human-shaped image (second image) detected and calibrated from the original scene image by an AI algorithm, with a size of about 100 KB, corresponding to a rectangular area outlined in the original scene image; the lower-right picture is a face image (third image) detected and calibrated from the original image by the AI algorithm, with a size of about 15 KB, corresponding to another rectangular area outlined in the original scene image. It should be noted that the collected face images come from a legal source, with the user's authorization and under the user's agreement and contract.
In the related art, deduplication works in units of whole images and ignores the repetition of content regions between images; the first image, the second image and the third image are treated as non-overlapping and are stored separately by default, so the actual storage space utilization efficiency is only about 75% (roughly 300 KB of unique content out of about 415 KB stored).
It should be understood that the embodiment of the present application provides a distributed storage based image processing method capable of avoiding repeated storage of images, which may relate to the machine learning direction in the field of artificial intelligence. So-called Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. The basic artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics.
Distributed File System (DFS) means that physical storage resources managed by a File System are not necessarily directly connected to a local node, but are connected to a node through a computer network; or a complete hierarchical file system formed by combining several different logical disk partitions or volume labels. The distributed file system can effectively solve the storage and management problems of data, a certain file system fixed at a certain place is expanded to any multiple places/multiple file systems, and a file system network is formed by a plurality of nodes. Each node may be distributed at different locations, with communication and data transfer between nodes over the network.
The distributed storage image processing method of the present application improves the utilization rate of the storage space. The method acquires a scene image and a target image belonging to the scene image, parses the scene image, and determines the structured data and image data of the scene image, wherein the structured data comprises the image name of the image data, the region information of the target image in the image data, and the source identifier of the image data corresponding to the target image. The image data of a plurality of scene images is aggregated and stored into a large file, and the file identifier is stored in association with the image name of each scene image. The file identifier of the scene image associated with the target image is then determined according to the correspondence between the image name of the scene image and the file identifier, combined with the region information of the target image within the scene image and the source identifier of the image data, and the target image is stored by adding the region information to the file identifier of the scene image. On the one hand, the aggregation storage mode allows the storage space to be used to the maximum extent and reduces waste of storage space; on the other hand, storing the target image by adding region information to the file identifier avoids storing duplicate images and greatly improves the utilization rate of the image storage space; compared with current image data deduplication and compression methods, the storage space utilization is improved by about 25% in an AI security application scenario.
Referring to fig. 2, a schematic flow chart of a distributed storage based image processing method according to an embodiment of the present application is detailed as follows:
step S101, acquiring a scene image acquired by image equipment and a target image in the scene image;
specifically, a scene image and a target image of the same scene are acquired by the same image device, or/and a scene image and a target image of the same scene are acquired by different image devices, so that the degree of association between the scene image and the target image is ensured.
For example, the image devices include cameras, video cameras, and other electronic devices equipped with a camera. The scene image contains the target image, i.e., part of the content of the scene image is the same as the target image; in other words, the target image is a part of the scene image. The scene image mainly refers to a specific monitored scene, and images may be collected in specific scenes such as residential communities, factories and enterprises to form the scene image. The target includes, but is not limited to, a human body, a vehicle, an animal, a plant, and other objects. In addition, the acquired scene image or target image may be a two-dimensional image or a three-dimensional image, which is not limited herein.
Step S102, analyzing the scene image, and determining structured data and image data of the scene image, wherein the structured data comprises an image name of the image data, area information of the target image in the image data, and a source identifier of the image data corresponding to the target image;
specifically, the GA/T1400 protocol is utilized to analyze the scene image, and the structural data and the image data of the scene image are determined.
For example, the face recognition camera is connected with the personnel structured analysis system through SDK or GA/T1400 protocol communication of a manufacturer, and the vehicle recognition camera is connected with the vehicle structured analysis system through SDK or GA/T1400 protocol communication; the face recognition camera and the vehicle recognition camera are respectively in communication connection with a video image information database through a GA/T1400.4-2017 interface protocol; the video image information database is connected with the server platform through GA/T1400.4-2017 interface protocol communication.
Step S103, aggregating and storing the image data of the scene images into a large file for distributed storage, and generating a unique file identifier of the large file; storing the file identification in association with the image name of the scene image;
specifically, image data of a plurality of scene images are aggregated and stored into one large file, each large file corresponds to a unique file identifier, and structured data of the scene images are stored in an associated manner;
storing each large file in a distributed manner according to a load balancing principle, wherein the file identifier comprises a file name, the offset of the image data in the large file, the image size, and the region information of the target image in the scene image.
For example, image data of a plurality of scene images are aggregated and stored into one large file, each large file corresponds to a unique file identifier, adaptive configuration is performed according to a storage space and the size of a file of a scene image to be stored, and the large files are stored in a distributed manner according to a load balancing principle.
In this embodiment, for the image data of a scene image, a suitable compression algorithm is selected according to the type of the image data, and it is judged whether the size of the image data meets a preset size; if so, the data is written to a persistent storage device. That is, an aggregation storage manner is adopted: when the image data exceeds 4 MB (in some embodiments a smaller value such as 512 KB may be set; the preset size can be chosen according to actual business requirements, and 4 MB is used here for convenience of explanation), the image data of the scene images is compressed to form a large file, and the large file is saved to the persistent storage device; the large file corresponds to a unique file identifier, and the file identifier is stored in association with the image name of each scene image.
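A minimal sketch of such aggregation (the 4 MB threshold follows the example above; the compressor, the persistence callback, and all names are illustrative assumptions rather than the application's actual implementation):

```python
import zlib

AGGREGATE_THRESHOLD = 4 * 1024 * 1024  # 4 MB; a smaller value such as 512 KB may also be used

class AggregateWriter:
    """Buffer the image data of several scene images and flush them as one large file."""

    def __init__(self, persist):
        self.persist = persist      # assumed callback: bytes of the large file -> file id
        self.buffer = []            # list of (image_name, compressed_bytes)
        self.size = 0
        self.name_to_fid = {}       # image name -> (file_id, offset, length), i.e. the association store

    def add(self, image_name: str, image_bytes: bytes) -> None:
        data = zlib.compress(image_bytes)   # stand-in for a format-appropriate compression algorithm
        self.buffer.append((image_name, data))
        self.size += len(data)
        if self.size >= AGGREGATE_THRESHOLD:
            self.flush()

    def flush(self) -> None:
        """Concatenate buffered images into one large file and record each image's offset and length."""
        if not self.buffer:
            return
        blob = bytearray()
        entries = []
        for name, data in self.buffer:
            entries.append((name, len(blob), len(data)))
            blob.extend(data)
        file_id = self.persist(bytes(blob))  # write the large file to the persistent storage device
        for name, offset, length in entries:
            self.name_to_fid[name] = (file_id, offset, length)
        self.buffer, self.size = [], 0
```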
Step S104, determining the file identifier of the scene image associated with the target image by combining the region information of the target image and the source identifier according to the correspondence between the image name of the scene image and the file identifier, and storing the target image by adding the region information to the file identifier.
Specifically, the region information of the target image within the image data of the scene image is determined through the correspondence between the image name of the scene image and the file identifier stored in the comparison database (image comparison library); the file identifier of the scene image associated with the target image is determined by combining that region information with the source identifier; and the target image is stored, reusing the scene image, by adding the region information to the file identifier.
For example, the file identifier of the scene image associated with the target image is determined according to the region information of the image data of the target image in the scene image and the source identifier in the structured information of the target image, and the target image is stored by modifying the region information added to the file identifier.
That is, the region information in the file identifier is modified so that a duplicate target image does not need to be stored again in the distributed storage system; the target image can be stored simply by recording the region (position) information corresponding to it.
In this embodiment, scene image information collected by an image device and target image information within the scene image are received; the scene image information is parsed to determine the structured data and image data of the scene image; the image data of a plurality of scene images is aggregated and stored into a large file for distributed storage, and a file identifier of the scene image data is generated; according to the correspondence between the name of the scene image and the file identifier, combined with the name of the source scene image and the region information in the structured data of the target image, the target image content data is stored by adding region information to the file identifier of the source scene image. Because the file identifier of the target image is generated by adding region information to the file identifier of the source scene image, storage of duplicate image data is avoided and the utilization rate of the image storage space is greatly improved. On the one hand, the aggregation storage mode allows the storage space to be used to the maximum extent and reduces waste of storage space; on the other hand, storing the target image by adding region information to the file identifier avoids storing duplicate images and, compared with current image data deduplication and compression methods, improves storage space utilization by about 25% in an AI security application scenario.
Optionally, analyzing the scene image to determine the structured data and the image data of the scene image, including:
analyzing the scene image based on an analysis interface protocol to obtain structured data and image data of the scene image; the structured data comprises the image name of the image data, the region information of the target image in the image data, and the source identifier of the image data corresponding to the target image;
wherein the region information of the target image in the image data is determined by comparing the similarity of the target image and the image data; and determining the source identification of the image data corresponding to the target image according to the similarity result of the target image and the image data.
That is, the similarity between the target image and the image data is compared; for example, if the similarity between the target image and the corresponding image data of the scene image reaches a preset reliability threshold, the region content of the target image is determined to be the same as that of the scene image; otherwise, it is determined to be different.
Specifically, the similarity between the target image and the image data is determined in any one of the following manners: Euclidean distance, cosine distance, average hash algorithm, perceptual hash algorithm, difference hash algorithm, and the like.
Through the method, the similarity between the target image and the scene image area information can be determined, so that the target image can be stored in the original scene image according to the image name, the source identification and the like, and the problem of image repeated storage is avoided.
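As an illustration of one of the listed measures, an average-hash comparison might look like the following (a sketch only, using Pillow; the 8x8 hash size and the Hamming-distance threshold are assumptions, not values given by the application):

```python
from PIL import Image

def average_hash(img: Image.Image, size: int = 8) -> int:
    """64-bit average hash: downscale to size x size grayscale and threshold at the mean."""
    small = img.convert("L").resize((size, size), Image.LANCZOS)
    pixels = list(small.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def same_region_content(target: Image.Image, candidate_region: Image.Image, max_distance: int = 5) -> bool:
    """Treat the target image and a candidate region of the scene image as the same
    content when their average hashes differ in at most max_distance bits."""
    return hamming_distance(average_hash(target), average_hash(candidate_region)) <= max_distance
```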
Referring to fig. 3, in an embodiment of the present application, after the target image is stored by adding region information to the file identifier of the scene image, the image restoration flow based on distributed stored images further includes:
step S201, positioning the storage position of the large file by using the file identifier;
specifically, the storage position of the large file can be located in the distributed storage system by using the file identifier, so that the large file is convenient to locate and search.
Step S202, after the large file is determined, reading a scene image of the large file according to the file identification, and storing the scene image in a scanning line array mode;
specifically, the large file of the designated storage position is read and cached by using the memory, so that subsequent data recovery is facilitated.
Step S203, reading the corresponding row and column data from the scan line array by using the region information of the target image in the scene image, so as to restore the target image.
Specifically, the area information of the image data of the target image in the scene image is determined through the corresponding relation between the image name of the scene image and the file identifier stored in advance, and the corresponding row and column data is read by using the scanning line array to restore the target image.
By the mode, the corresponding area information in the scene image is read, and the data reading and recovery of the target image are realized.
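Steps S201 to S203 can be sketched as a simple crop over the decoded scan line array (the row-major layout and the coordinate convention, top-left inclusive and bottom-right exclusive, are assumptions made for illustration):

```python
def restore_target_image(scanlines, left_top, right_bottom):
    """Cut the target region out of a scene image held as a scan line array.

    scanlines    : list of rows read from the large file; each row is a list of (R, G, B) pixels
    left_top     : (x, y) pixel coordinates of the region's top-left corner
    right_bottom : (x, y) pixel coordinates just past the region's bottom-right corner
    """
    x0, y0 = left_top
    x1, y1 = right_bottom
    # Select the rows covered by the region, then the columns within each row.
    return [row[x0:x1] for row in scanlines[y0:y1]]
```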
Optionally, before the correspondence between the image name of the scene image and the file identifier is looked up, the method further includes:
judging the relationship between the parsed region information of the scene image and the target image;
if the target image falls within the parsed region information of the scene image, storing the target image in the corresponding scene image according to the file identifier;
and if the target image does not fall within the parsed region information of the scene image, storing the scene image and the target image in a distributed manner by adopting an aggregation storage mode.
In this embodiment, since the area information of the target image and the scene image stored in the distributed storage system is not necessarily completely in one-to-one correspondence, the target image can be stored quickly by judging the relationship between the two in advance.
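The pre-check described above can be sketched as a small routing function (the `target` fields and the `store` interface are assumed names introduced only for illustration):

```python
def store_target_image(target, store):
    """Route a target image before storage.

    If the target corresponds to a region of an already-stored scene image,
    only a region fid is recorded and no pixel data is written; otherwise the
    image falls back to aggregated distributed storage.
    """
    scene_fid = store.lookup_fid(target.source_id) if target.source_id else None
    if scene_fid is not None and target.region is not None:
        # The target lies within the parsed region information of a stored scene image.
        return store.make_region_fid(scene_fid, target.region)
    # No matching scene image region: store the image data itself.
    return store.aggregate_store(target.image_name, target.image_bytes)
```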
Optionally, referring to fig. 4, in an embodiment of the present application, the distributed storage flow further includes:
step S301, when a distributed storage request is received, a writable area of a disk in a distributed storage system is obtained; the distributed storage request comprises the size of the large file to be written, and the disk is distributed and divided into a plurality of writable areas;
for example, an object is a basic unit of data storage, consisting of meta information, user data, and a file name. In a storage node of a distributed storage system, a disk may be divided into a plurality of writable areas zones, where space management of the disk is granular with the size of one Zone (e.g., 256MB), and the Zone is used for storing objects.
Step S302, determining writing information according to the distributed storage request and the residual storage space of the writable area; the writing information comprises the length of each object obtained by segmenting the large file to be written;
step S303, returning the write information to a computing node, so that the computing node writes the file to be written into the corresponding storage node according to the write information.
In this embodiment, when a computing node requests to store a file, a large file to be written needs to be segmented into a plurality of objects, and then the objects are stored in a writable area Zone of a disk in a distributed object storage system. Specifically, the computing node first generates a distributed storage request according to the large file to be written, and then initiates the distributed storage request to the server. When a server receives a distributed storage request initiated by a computing node, a writable area Zone of a disk in a distributed storage system is obtained, so as to perform subsequent storage space allocation according to the writable area Zone. Alternatively, the computing node may be a desktop computer, a cell phone, a PDA, or the like.
In this way, in order to improve the space utilization of the writable area Zone, the write information may be determined according to the distributed storage request and the remaining storage space of the writable area Zone. Specifically, the size of the large file to be written may be determined according to the distributed storage request, the lengths of the objects obtained by segmenting the large file to be written are adaptively configured according to the size of the large file to be written and the remaining storage space of the writable area, and the objects with variable lengths are subjected to space management, so as to maximize the use of Zone space.
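A minimal sketch of this allocation step (the 256 MB Zone size is taken from the example above; the first-fit strategy and the data shapes are assumptions):

```python
ZONE_SIZE = 256 * 1024 * 1024  # example Zone granularity

def plan_write(file_size: int, zone_free_spaces):
    """Split a large file to be written into variable-length objects that fit the
    remaining space of the writable Zones (first-fit), returning the object lengths."""
    lengths = []
    remaining = file_size
    for free in zone_free_spaces:
        if remaining == 0:
            break
        piece = min(remaining, free)
        if piece > 0:
            lengths.append(piece)
            remaining -= piece
    if remaining > 0:
        raise RuntimeError("not enough writable Zone space for the file to be written")
    return lengths

# Example: a 600 MB file spread over Zones with 256 MB, 256 MB and 200 MB free.
# plan_write(600 * 2**20, [256 * 2**20, 256 * 2**20, 200 * 2**20])
# -> [268435456, 268435456, 92274688]
```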
In other embodiments, referring to fig. 8, which is a schematic diagram of distributed storage based image processing and storage according to an embodiment of the present application:
the computing nodes interact with the distributed storage system through a standard GA/T1400 protocol, and the protocol specification content comprises two parts: the image data comprises image structured data and image data, wherein key fields in the image structured data comprise: ImageID (image name), SourceID (source identification), region information (LeftTopXY, RightBomXY, for expressing location information). The structured data of the original scene image shot by the front-end camera does not carry the SourceID and the region information, and only the structured data of the specific target image, such as the face image and the human-shaped image, contains the SourceID and the region information, which indicates that the target image corresponds to the corresponding region of the SourceID original image, as shown in fig. 7.
The storage node can be divided into two layers according to logical function: the first layer serves the built-in GA/T1400 protocol and is mainly responsible for protocol parsing and the image comparison library; the second layer is the image storage and restoration service, mainly responsible for image aggregation storage, direct storage and direct retrieval, and restoring and computing the image data of a region.
Regarding the storage format of image data on disk, taking an 8 x 8 pixel image with 24-bit RGB color as an example, the image is stored from top to bottom with every 8 pixels forming a scan line; each pixel consists of three units, R (red), G (green) and B (blue), each occupying 8 bits, so the binary storage sequence corresponding to the image data is: R, G, B, R, G, B, R, G, B, .... To maximize storage performance, multiple pieces of image data are aggregated and stored in a single large file, identified by a unique fid in the format: fileid-offset-len-leftpoint-rightpoint. The fileid occupies 64 bits: the high 4 bits represent the image type (a value of 0 represents an original scene image and a value of 1 represents a region image), the middle 12 bits represent the storage node id (encoded from 0), and the low 48 bits represent the file id (encoded from 0). offset and len represent the offset of the image within the large file and the image size respectively, each occupying 32 bits. leftpoint and rightpoint represent the region information of the region image within the original scene image, each occupying 32 bits, with the high 16 bits holding the X coordinate and the low 16 bits holding the Y coordinate; all coordinate values are in pixels.
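The fid layout can be made concrete with a small pack/unpack helper (a sketch; the text describes the fid as the string fileid-offset-len-leftpoint-rightpoint, and the fixed 24-byte big-endian binary encoding chosen below is an assumption made only to illustrate the bit fields):

```python
import struct

def pack_point(x: int, y: int) -> int:
    """High 16 bits: X coordinate; low 16 bits: Y coordinate (both in pixels)."""
    return ((x & 0xFFFF) << 16) | (y & 0xFFFF)

def pack_fid(image_type: int, node_id: int, file_id: int,
             offset: int, length: int, leftpoint: int, rightpoint: int) -> bytes:
    """Pack the fid fields into 24 bytes:
    64-bit fileid (4-bit image type | 12-bit storage node id | 48-bit file id),
    then 32-bit offset, 32-bit len, 32-bit leftpoint and 32-bit rightpoint."""
    fileid = ((image_type & 0xF) << 60) | ((node_id & 0xFFF) << 48) | (file_id & 0xFFFFFFFFFFFF)
    return struct.pack(">QIIII", fileid, offset, length, leftpoint, rightpoint)

def unpack_fid(blob: bytes) -> dict:
    fileid, offset, length, leftpoint, rightpoint = struct.unpack(">QIIII", blob)
    return {
        "image_type": (fileid >> 60) & 0xF,        # 0 = original scene image, 1 = region image
        "node_id":    (fileid >> 48) & 0xFFF,      # storage node id
        "file_id":    fileid & 0xFFFFFFFFFFFF,
        "offset":     offset,                      # offset of the image in the large file
        "length":     length,                      # image size
        "left_xy":    ((leftpoint >> 16) & 0xFFFF, leftpoint & 0xFFFF),
        "right_xy":   ((rightpoint >> 16) & 0xFFFF, rightpoint & 0xFFFF),
    }
```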
The image comparison library stores the correspondence between the ImageID and the fid (file identifier) of the original scene images of the last 30 seconds. The computing node searches, through GA/T1400 and according to the SourceID, for the fid corresponding to the original image in the image comparison library, modifies it by adding the region information, and uses the result as the fid of the region image; the data of the region image does not need to be submitted and stored on disk, which achieves region deduplication and improves storage space utilization.
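A sketch of the comparison-library lookup and of deriving a region-image fid without writing any pixel data (the dictionary structure, the 30-second retention check, and the function names are illustrative assumptions):

```python
import time

class ImageCompareLibrary:
    """Keep the ImageID -> fid correspondence of original scene images for the last 30 s."""

    TTL_SECONDS = 30

    def __init__(self):
        self._entries = {}  # image_id -> (fid, timestamp)

    def record(self, image_id: str, fid: dict) -> None:
        self._entries[image_id] = (fid, time.monotonic())

    def lookup(self, source_id: str):
        """Look up the fid of the original scene image whose ImageID equals source_id."""
        entry = self._entries.get(source_id)
        if entry is None or time.monotonic() - entry[1] > self.TTL_SECONDS:
            return None
        return entry[0]

def region_fid_from_scene(scene_fid: dict, left_xy, right_xy) -> dict:
    """Derive a region-image fid by copying the scene image's fid and changing only
    the image type and the region coordinates; no image data is written to disk."""
    fid = dict(scene_fid)
    fid["image_type"] = 1          # 1 marks a region image in the fid layout above
    fid["left_xy"] = left_xy
    fid["right_xy"] = right_xy
    return fid
```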
When a computing node reads region image data, it must carry the fid of the region image. The distributed storage system first locates and reads the original scene image data according to the file identifier fid and holds it in memory as a scan line array (the actual picture compression and decompression, for example for the JPEG format, can be realized through the standard libjpeg library); the corresponding row and column data are then read according to the region information in the fid to restore the region image.
Referring to fig. 9, which is a schematic diagram of distributed storage based on distributed stored images according to an embodiment of the present application:
for example, taking a storage cluster formed by 3 storage nodes as an example, an image content area deduplication process is described:
(1) writing original scene images
The computing Node carries the structured data and the image data of the scene image through the GA/T1400 protocol, and initiates an image storage request to the storage Node1;
after receiving the request, the storage Node1 analyzes the image structured data and the image data according to the specification of the GA/T1400 protocol content, and selects and initiates an image storage request to the storage Node 2 according to the load balancing principle;
the storage Node 2 aggregates the stored image data and returns the fid of the scene image;
the storage Node1 records the correspondence between the image ImageID and the fid in the image comparison library, and returns the fid to the computing Node.
(2) Writing human shape, face image (image content area repeat deleting)
The computing Node carries the image structured data and the image data through a GA/T1400 protocol, and initiates an image storage request to the storage Node 1;
after receiving the request, the storage Node1 parses the image structured data information and the image data according to the GA/T1400 protocol content specification, searches the fid (identifier) of the original scene image in the image contrast library according to the SourceID (source identifier), modifies the added region information as the region image fid, and returns the region image fid to the computing Node.
(3) Reading image data
The computing Node carries the image fid through a GA/T1400 protocol and initiates an image reading request to the storage Node 1;
The storage Node1 extracts the storage node id from the fid (for example, Node 2) and initiates a request for reading the image data to that storage node;
After receiving the request, the storage Node 2 determines from the fid whether an original scene image or a region image is to be read, and then parses out the file id where the image data is located, the offset position, the image size, and, for a region image, the region information. If it is a region image, the region image data is restored from the scene image data according to the region information;
after receiving the image data, the storage Node1 directly returns the image data to the computing Node.
With the distributed storage based image processing method described above, a scene image and a target image belonging to the scene image are acquired, the scene image is parsed, and the structured data and image data of the scene image are determined, wherein the structured data comprises the image name of the image data, the region information of the target image in the image data, and the source identifier of the image data corresponding to the target image. After the image data of the scene images is aggregated and stored into a large file, the file identifier is stored in association with the image name of the scene image, and the target image can be stored simply by adding region information to the file identifier corresponding to its scene image. On the one hand, the aggregation storage mode allows the storage space to be used to the maximum extent and reduces waste of storage space; on the other hand, storing the target image by adding region information to the file identifier avoids storing duplicate images and greatly improves the utilization rate of the image storage space.
Referring to fig. 5, the present embodiment provides a distributed storage based image processing system 500, which includes:
an obtaining module 501, configured to obtain a scene image acquired by an image device and a target image in the scene image;
an analysis processing module 502, configured to perform analysis processing on the scene image, and determine structured data and image data of the scene image, where the structured data includes an image name of the image data, area information of the target image in the image data, and a source identifier of the image data corresponding to the target image;
an aggregation storage module 503, configured to aggregate and store the image data of the scene images into a large file for distributed storage according to the structured data and image data of the scene images, and to generate a unique file identifier for the large file; and to store the file identifier in association with the image name of the scene image;
a region information determining module 504, configured to determine, according to a corresponding relationship between an image name of the scene image and a file identifier, a file identifier of the scene image associated with the target image by combining the region information of the target image and the source identifier, and store the target image by adding region information in the file identifier.
In this embodiment, the system is substantially provided with a plurality of modules for executing the method in the above embodiments, and specific functions and technical effects may refer to the above method embodiments, which are not described herein again.
Referring to fig. 6, an embodiment of the present application further provides an electronic device 600, which includes a processor 601, a memory 602, and a communication bus 603;
a communication bus 603 is used to connect the processor 601 and the memory 602;
the processor 601 is adapted to execute the computer program stored in the memory 602 to implement the method according to one or more of the above-mentioned embodiments.
Embodiments of the present application also provide a computer-readable storage medium, having a computer program stored thereon,
the computer program is for causing a computer to perform the method of any one of the above embodiments.
Embodiments of the present application also provide a non-volatile readable storage medium, where one or more modules (programs) are stored in the storage medium, and when the one or more modules are applied to a device, the one or more modules may cause the device to execute instructions (instructions) included in an embodiment of the present application.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above embodiments are merely illustrative of the principles and utilities of the present application and are not intended to limit the application. Any person skilled in the art can modify or change the above-described embodiments without departing from the spirit and scope of the present application. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical concepts disclosed in the present application shall be covered by the claims of the present application.

Claims (10)

1. A method for processing an image based on distributed storage, the method comprising:
acquiring a scene image acquired by image equipment and a target image in the scene image;
analyzing the scene image, and determining structured data and image data of the scene image, wherein the structured data comprises an image name of the image data, area information of the target image in the image data, and a source identifier of the image data corresponding to the target image;
aggregating and storing image data of a plurality of scene images into a large file for distributed storage, and generating a unique file identifier of the large file; storing the file identification in association with the image name of the scene image;
and determining the file identifier of the scene image associated with the target image by combining the area information of the target image and the source identifier according to the corresponding relation between the image name of the scene image and the file identifier, and storing the target image by adding the area information in the file identifier.
2. The method of claim 1, wherein acquiring an image of a scene captured by an imaging device and an image of an object within the image of the scene comprises:
the method comprises the steps of utilizing the same image device to collect a scene image and a target image of the same scene, and/or utilizing different image devices to collect the scene image and the target image of the same scene.
3. The method of claim 2, wherein parsing the scene image to determine structured data and image data for the scene image comprises:
analyzing the scene image based on an analysis interface protocol to obtain structured data and image data of the scene image; the structured data comprises the image name of the image data, the region information of the target image in the image data, and the source identifier of the image data corresponding to the target image;
wherein the region information of the target image in the image data is determined by comparing the similarity of the target image and the image data; and determining the source identification of the image data corresponding to the target image according to the similarity result of the target image and the image data.
4. The method of claim 1, wherein aggregating and storing the image data of the plurality of scene images into the large file for distributed storage and generating the unique file identifier of the large file comprises:
aggregating and storing the image data of a plurality of scene images into a large file, wherein each large file corresponds to a unique file identifier, and storing the structured data of the scene images in association therewith;
and storing each large file in a distributed manner according to a load balancing principle, wherein the file identifier comprises a file name, an offset of the image data in the large file, an image size, and the area information of the target image in the scene image.
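Claim 4 writes many scene images back-to-back into one large file, records each image's offset and size for the file identifier, and places the large files across nodes by load balancing. The sketch below assumes local directories stand in for storage nodes and uses a least-used-bytes rule; append_scene_image and pick_node are hypothetical helpers, not names from the application.

```python
import os

def pick_node(node_dirs: list) -> str:
    """Simple load balancing: choose the node directory whose existing large
    files currently occupy the fewest bytes."""
    def used_bytes(d: str) -> int:
        return sum(os.path.getsize(os.path.join(d, f))
                   for f in os.listdir(d)
                   if os.path.isfile(os.path.join(d, f)))
    return min(node_dirs, key=used_bytes)

def append_scene_image(large_file_path: str, image_bytes: bytes):
    """Append one scene image to a large file and return (offset, size),
    the fields the file identifier carries besides the file name and area."""
    with open(large_file_path, "ab") as f:
        offset = f.tell()          # append mode positions the stream at end of file
        f.write(image_bytes)
    return offset, len(image_bytes)
```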
5. The method of any one of claims 1-4, further comprising, after storing the target image by adding the area information to the file identifier:
locating the storage position of the large file by using the file identifier;
after the large file is located, reading the scene image from the large file according to the file identifier, and storing the scene image as a scan-line array;
and reading corresponding row and column data from the scan-line array by using the area information of the target image in the scene image, so as to restore the target image.
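Claim 5 reverses the process: the file identifier locates the large file, a seek-and-read pulls out the scene image, and the area information selects rows and columns from the scan-line array. The sketch below assumes scene images are stored as raw 8-bit RGB scan lines of known width and height; read_target is a hypothetical helper.

```python
import numpy as np

def read_target(large_file_path: str, offset: int, size: int,
                width: int, height: int, area: tuple) -> np.ndarray:
    """Seek to the scene image inside the large file, view it as a row-major
    (scan-line) array, and crop the rows/columns given by the area
    information (x, y, w, h) to restore the target image."""
    with open(large_file_path, "rb") as f:
        f.seek(offset)
        raw = f.read(size)
    scanlines = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3)
    x, y, w, h = area
    return scanlines[y:y + h, x:x + w]   # rows (scan lines) first, then columns
```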
6. The method of claim 1, further comprising, before determining the file identifier according to the correspondence between the image name of the scene image and the file identifier:
determining a relationship between the target image and the area information obtained by analyzing the scene image;
if the target image matches the area information obtained by analyzing the scene image, storing the target image within the corresponding scene image according to the file identifier;
and if the target image does not match the area information obtained by analyzing the scene image, storing the scene image and the target image in a distributed manner by means of aggregated storage.
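Claim 6 only stores a target image by reference when it actually corresponds to an area parsed from a known scene image; otherwise both images go through the normal aggregated, distributed storage path. A compact, self-contained sketch of that branch is given below; store_images and the returned tuples are illustrative, and the fallback path reuses the same append-to-large-file idea as the earlier sketch.

```python
def store_images(scene_name: str, target_area, target_bytes: bytes,
                 scene_index: dict, large_file_path: str):
    """Store by reference when the target matches a parsed area of the scene
    image; otherwise append the target bytes to the aggregated large file."""
    if target_area is not None and scene_name in scene_index:
        # Reference-only storage: no duplicate pixel data is written.
        return ("reference", scene_index[scene_name], target_area)
    # Fallback: aggregate-store the target image itself.
    with open(large_file_path, "ab") as f:
        offset = f.tell()
        f.write(target_bytes)
    return ("aggregated", large_file_path, offset, len(target_bytes))
```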
7. The method of any one of claims 1-4, wherein the distributed storage further comprises:
when a distributed storage request is received, acquiring writable areas of a disk in a distributed storage system, wherein the distributed storage request comprises a size of the large file to be written, and the disk is divided into a plurality of writable areas;
determining write information according to the distributed storage request and a remaining storage space of the writable areas, wherein the write information comprises a length of each object obtained by segmenting the large file to be written;
and returning the write information to a computing node, so that the computing node writes the large file to be written into a corresponding storage node according to the write information.
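Claim 7 turns a distributed storage request into write information: the large file is cut into objects whose lengths fit the remaining space of the disk's writable areas, and the plan is handed back to the computing node. The splitting step might look like the sketch below; plan_write and the (area_id, remaining_bytes) representation are assumptions for illustration.

```python
def plan_write(file_size: int, writable_areas: list) -> list:
    """Split a large file of file_size bytes into objects that fit the
    remaining space of each writable area.  writable_areas is a list of
    (area_id, remaining_bytes); the result lists (area_id, object_length)."""
    write_info, remaining = [], file_size
    for area_id, free in writable_areas:
        if remaining == 0:
            break
        length = min(free, remaining)
        if length > 0:
            write_info.append((area_id, length))
            remaining -= length
    if remaining > 0:
        raise ValueError("not enough writable space for the large file")
    return write_info
```

For example, plan_write(10, [("a", 4), ("b", 8)]) yields [("a", 4), ("b", 6)], which the computing node can then use to write each object to its storage node.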
8. A distributed storage based image processing system, the system comprising:
an acquisition module, configured to acquire a scene image captured by an imaging device and a target image within the scene image;
an analysis processing module, configured to analyze the scene image and determine structured data and image data of the scene image, wherein the structured data comprises an image name of the image data, area information of the target image in the image data, and a source identifier of the image data corresponding to the target image;
an aggregation storage module, configured to aggregate and store the image data of a plurality of scene images into a large file for distributed storage, generate a unique file identifier of the large file, and store the file identifier in association with the image name of the scene image;
and an image storage module, configured to determine, according to the correspondence between the image name of the scene image and the file identifier, and in combination with the area information of the target image and the source identifier, the file identifier of the scene image associated with the target image, and to store the target image by adding the area information to the file identifier.
9. An electronic device comprising a processor, a memory, and a communication bus;
the communication bus is used for connecting the processor and the memory;
the processor is configured to execute a computer program stored in the memory to implement the method of any one of claims 1-7.
10. A computer-readable storage medium having a computer program stored thereon, wherein
the computer program is configured to cause a computer to perform the method of any one of claims 1-7.
CN202210529580.8A 2022-05-16 2022-05-16 Distributed storage based image processing method, system, device and medium Pending CN115080531A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210529580.8A CN115080531A (en) 2022-05-16 2022-05-16 Distributed storage based image processing method, system, device and medium

Publications (1)

Publication Number Publication Date
CN115080531A (en) 2022-09-20

Family

ID=83246734

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210529580.8A Pending CN115080531A (en) 2022-05-16 2022-05-16 Distributed storage based image processing method, system, device and medium

Country Status (1)

Country Link
CN (1) CN115080531A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination