CN113095286A - Big data image processing algorithm and system

Info

Publication number: CN113095286A
Application number: CN202110480856.3A
Authority: CN
Inventors: 汪知礼
Original assignee: Individual
Current assignee: Individual
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2021-07-09

Abstract

The invention relates to the technical field of image processing and discloses a big data image processing algorithm comprising the following steps: acquiring massive image data and preprocessing it by image graying and grayscale stretching; extracting features from the preprocessed images with an image description feature extraction algorithm; optimizing the data partitioning strategy of the big data platform with a partition optimization algorithm, storing the massive image data in the optimized big data platform, and using the image description features as the keys for image storage; adaptively segmenting the stored images with an adaptive image segmentation algorithm; and extracting the semantic information of the segmented image blocks with an image semantic feature extraction model, taking the semantic information of all segmented image blocks as the semantic information of the original image. The invention also provides a big data image processing system. The invention thereby realizes image processing based on big data.

Description

Big data image processing algorithm and system

Technical Field

The invention relates to the technical field of image processing, in particular to a big data image processing algorithm and a big data image processing system.

Background

With the rapid development of image big data technology, multi-dimensional, multi-scale and high-resolution image data is growing explosively. Traditional image processing software takes a long time to process such massive image data.

Meanwhile, the traditional image processing software has strong dependence on hardware resources of a server due to the problems of high data throughput, information redundancy and the like, cannot be managed in a centralized manner, is difficult to maintain, and has poor expansibility and low storage utilization rate.

In view of this, how to implement more efficient mass data processing by means of big data technology becomes a problem to be solved urgently by those skilled in the art.

Disclosure of Invention

The invention provides a big data image processing algorithm, which is characterized in that the description characteristics of an image are extracted to be used as a storage key value of the image, a data partitioning strategy of a big data platform is optimized by using a partitioning optimization algorithm, and mass data are stored in the optimized big data platform; the self-adaptive image segmentation algorithm is used for carrying out self-adaptive segmentation processing on the stored image, the image semantic feature extraction model is used for extracting the semantic information of the segmented image blocks, and the semantic information of all the segmented image blocks is used as the information of the original image, so that the segmentation and semantic information extraction processing of the image are realized.

In order to achieve the above object, the present invention provides a big data image processing algorithm, including:

acquiring mass image data, and performing image graying and grayscale stretching preprocessing on the image data to obtain preprocessed mass image data;

performing feature extraction on the preprocessed image by using an image description feature extraction algorithm to obtain description features of massive images;

optimizing a data partitioning strategy of the big data platform by using a partitioning optimization algorithm, storing mass image data into the optimized big data platform, and taking image description characteristics as key values of image storage;

carrying out self-adaptive segmentation processing on the stored image by using a self-adaptive image segmentation algorithm to obtain a plurality of image blocks;

and extracting the semantic information of the segmented image blocks by using an image semantic feature extraction model, and taking the semantic information of all the segmented image blocks as the semantic information of the original image.

Optionally, the preprocessing of performing image graying and grayscale stretching on the image data includes:

1) solving the maximum value of three components of each pixel in the acquired massive images, and setting the maximum value as the gray value of the pixel point to obtain a gray map of the image, wherein the formula of the gray processing is as follows:

G(i,j)＝max{R(i,j),G(i,j),B(i,j)}

wherein:

(i, j) is a pixel point in the image;

R(i, j), G(i, j) and B(i, j) on the right-hand side are respectively the values of the pixel point (i, j) in the R, G and B color channels;

G(i, j) on the left-hand side is the gray value of the pixel point (i, j);

2) for the gray-scale image, stretching the gray-scale of the image by using a piecewise linear transformation, wherein the formula of the gray-scale stretching is as follows:

wherein:

f(x, y) is the gray map;

MAX_f(x,y) and MIN_f(x,y) are respectively the maximum and minimum gray values of the gray map.
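The two preprocessing steps can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the stretching formula is not reproduced in this text, so the code assumes the common min-max linear stretch onto [0, 255], and the function names `to_gray_max` and `stretch_gray` are hypothetical.

```python
def to_gray_max(rgb_image):
    """Gray value of each pixel = maximum of its R, G, B components."""
    return [[max(pixel) for pixel in row] for row in rgb_image]


def stretch_gray(gray, out_max=255):
    """Linearly stretch gray values onto [0, out_max].

    ASSUMPTION: min-max stretch using the MAX and MIN gray values of the map.
    """
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: nothing to stretch
        return [[0 for _ in row] for row in gray]
    return [[round((v - lo) * out_max / (hi - lo)) for v in row] for row in gray]


# Tiny 2x2 RGB image as nested lists of (R, G, B) tuples.
rgb = [[(10, 50, 30), (200, 100, 150)],
       [(90, 90, 90), (0, 120, 60)]]
gray = to_gray_max(rgb)
stretched = stretch_gray(gray)
```

Taking the channel maximum keeps the brightest component of each pixel, and the stretch then spreads the observed gray range over the full output range so that later feature extraction sees more contrast.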

Optionally, the performing, by using an image description feature extraction algorithm, feature extraction on the preprocessed image includes:

1) constructing a Hessian matrix of the image:

H(x, sigma)＝[h_xx(x, sigma), h_xy(x, sigma); h_xy(x, sigma), h_yy(x, sigma)]

wherein:

h_xx(x, sigma) is the second derivative of the image at pixel point x (h_xy and h_yy are defined analogously), and sigma is the neighborhood standard deviation of the image pixels;

2) calculating the Hessian matrix extremum of each pixel: D(H(x))＝h_xx*h_yy-(0.9*h_xy)²; selecting the K pixel points with the largest D(H(x)) in the image as local feature points of the image;

3) constructing a Gaussian scale domain space:

wherein:

I(x, y) is the original image;

sigma is the standard deviation of pixels of the original image;

4) comparing each local feature point obtained from the Hessian matrix with all of its neighboring points in the image domain and the scale domain, and locating stable feature points by non-maximum suppression;

5) computing the Haar wavelet responses within a circular neighborhood of each feature point and determining the main direction of the feature point; then taking 4×4 rectangular sub-regions around the feature point and, for each sub-region, computing the sum of the horizontal responses, the sum of the vertical responses, the sum of the absolute horizontal responses and the sum of the absolute vertical responses of the Haar wavelet features, which yields a 64-dimensional feature vector (4×4 sub-regions × 4 values) used as the image description feature.
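The Hessian-based selection of candidate feature points in steps 1) and 2) can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the second derivatives are approximated by central finite differences at a single scale (a full SURF-style detector would typically use box filters over several scales), the determinant is taken in the standard form h_xx*h_yy-(0.9*h_xy)², and the names `hessian_response` and `top_k_points` are hypothetical.

```python
def hessian_response(img, w=0.9):
    """D(H) = h_xx * h_yy - (w * h_xy)^2 at every interior pixel."""
    rows, cols = len(img), len(img[0])
    resp = {}
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central finite differences stand in for Gaussian derivatives.
            h_xx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
            h_yy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
            h_xy = (img[y + 1][x + 1] - img[y + 1][x - 1]
                    - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
            resp[(y, x)] = h_xx * h_yy - (w * h_xy) ** 2
    return resp


def top_k_points(resp, k):
    """Return the k pixel positions (row, col) with the largest response."""
    return sorted(resp, key=resp.get, reverse=True)[:k]


# A single bright blob centered at row 2, column 2.
image = [[0, 0, 0, 0, 0],
         [0, 1, 2, 1, 0],
         [0, 2, 8, 2, 0],
         [0, 1, 2, 1, 0],
         [0, 0, 0, 0, 0]]
response = hessian_response(image)
candidates = top_k_points(response, k=1)
```

The blob center has strong curvature in both directions and no mixed derivative, so it receives the largest response and is the only candidate returned for k=1.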

Optionally, the optimizing the data partitioning policy of the big data platform by using the partition optimization algorithm includes:

in a specific embodiment of the present invention, the big data platform is a Hadoop platform;

the partition optimization algorithm flow comprises the following steps:

1) the big data platform receives massive image data, uses the image description features as the keys for image storage, and samples the stored images at a sampling rate s; in a specific embodiment of the invention, a sampling rate evaluation model is constructed and the image sampling rate is determined from it, the sampling rate evaluation model being as follows:

s＝argmin(αD_s+βT_s)

wherein:

cov_s,i is the error rate of the i-th sampling at sampling rate s, i.e. the difference between the sampled data distribution and the expected distribution, and cov_m,i is the error rate of the i-th sampling at a sampling rate of 100 percent;

t_s,i is the sampling time of the i-th sampling at sampling rate s;

α, β represent sampling rate evaluation model parameters, which are both set to 1;

2) obtaining the total load and the total number of distinct keys from the sampling result, and calculating the average load of the big data platform from the number of Reduce tasks started by the client;

traversing the key queue: if the load of a key is larger than the average load, the images corresponding to that key are called a large load, and the large load is split. Splitting covers two cases: 1. when the large load is equal to the average load, the key is assigned to a Reduce node whose load is 0, and the correspondence between the partition number and the Reduce node is recorded; 2. when the large load is several times the average load, the large load is split into pieces of the average load, and the correspondence between the partition numbers and the key is recorded. After the large loads have been processed, the small loads are assigned directly to Reduce nodes, and the correspondence between each key and its partition number is recorded.
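The two steps above can be sketched in plain Python, independent of any Hadoop API. Several points are assumptions rather than statements of the patented method: the definitions of D_s and T_s are not reproduced in this text, so the sketch takes D_s as the mean of |cov_s,i - cov_m,i| and T_s as the mean of t_s,i; small loads are placed greedily on the least-loaded reducer; and the names `choose_sampling_rate` and `plan_partitions` are hypothetical.

```python
import heapq


def choose_sampling_rate(trials, alpha=1.0, beta=1.0):
    """Return the rate s minimizing alpha * D_s + beta * T_s.

    trials maps s -> list of (cov_s_i, cov_m_i, t_s_i) tuples.
    ASSUMPTION: D_s = mean |cov_s_i - cov_m_i|, T_s = mean t_s_i.
    """
    def score(rate):
        rows = trials[rate]
        d_s = sum(abs(cs - cm) for cs, cm, _ in rows) / len(rows)
        t_s = sum(t for _, _, t in rows) / len(rows)
        return alpha * d_s + beta * t_s
    return min(trials, key=score)


def plan_partitions(key_loads, num_reducers):
    """Map each key to a list of (reducer, load) pieces."""
    avg = sum(key_loads.values()) / num_reducers
    # Min-heap of (current load, reducer id): least-loaded reducer first.
    heap = [(0.0, r) for r in range(num_reducers)]
    heapq.heapify(heap)
    plan = {}

    def assign(key, load):
        current, reducer = heapq.heappop(heap)
        plan.setdefault(key, []).append((reducer, load))
        heapq.heappush(heap, (current + load, reducer))

    by_load = sorted(key_loads.items(), key=lambda kv: -kv[1])
    # Large loads first: cut into pieces no bigger than the average load.
    for key, load in by_load:
        if load >= avg:
            remaining = float(load)
            while remaining > 1e-12:
                piece = min(avg, remaining)
                assign(key, piece)
                remaining -= piece
    # Small loads afterwards, each placed whole (greedy choice is an assumption).
    for key, load in by_load:
        if load < avg:
            assign(key, load)
    return plan, avg


trials = {0.1: [(0.30, 0.05, 0.1), (0.28, 0.05, 0.1)],
          0.3: [(0.10, 0.05, 0.2), (0.12, 0.05, 0.2)],
          1.0: [(0.05, 0.05, 1.0), (0.05, 0.05, 1.0)]}
best_rate = choose_sampling_rate(trials)

loads = {"a": 8, "b": 2, "c": 1, "d": 1}  # sampled load per key
plan, avg = plan_partitions(loads, num_reducers=4)
reducer_totals = {}
for pieces in plan.values():
    for reducer, load in pieces:
        reducer_totals[reducer] = reducer_totals.get(reducer, 0) + load
```

In this toy input the key "a" carries most of the load, so it is cut into pieces of at most the average load before the small keys fill the remaining capacity, and every reducer ends up with the same total. The `plan` dictionary plays the role of the recorded correspondence between keys and partition numbers.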

Optionally, the performing, by using an adaptive image segmentation algorithm, an adaptive segmentation process on the stored image includes:

1) determining the image segmentation quantity K, and performing random and uniform centroid distribution operation on all pixel points in the whole image, namely distributing corresponding K clustering centers, wherein the area of each image block is S multiplied by S, S is sqrt (N/K), and N is the total number of image pixels;

2) calculating a spatial distance function between the cluster centers and all pixel points in the search area

wherein (x_i,y_i) is the cluster center of the i-th image block, and (x_j,y_j) is a pixel point in the i-th image block;

3) converting the image into an LAB image and calculating the proportional function m(i, j) of the image:

wherein:

(x_ij,y_ij) is the distance difference between pixel point i and pixel point j;

L_ij, A_ij and B_ij are the correlations between pixel point i and pixel point j in luminance, red-green color and yellow-blue color;

4) respectively calculating a proportional function of each image block, and dividing the divided image into two parts by using the proportional function, namely an area larger than the proportional function and an area smaller than or equal to the proportional function; for the segmented small-size image, calculating the pixel difference between the small-size image and the adjacent large-size image:

t_q＝|L-L_q|+|A-A_q|+|B-B_q|

wherein:

t_q is the pixel difference between the small-size image and the adjacent large-size image q;

L, A and B are respectively the average luminance, red-green color value and yellow-blue color value of the small-size image;

L_q, A_q and B_q are respectively the average luminance, red-green color value and yellow-blue color value of the adjacent large-size image q;

if t_q < T, the small-size image is merged with the large-size image, wherein T is a preset image threshold;

5) iterating until the cluster centers and the areas of the clustered image blocks no longer change, and segmenting the original image to obtain a plurality of image blocks.
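Two pieces of this procedure are fully specified in the text and can be sketched directly: the block side length S＝sqrt(N/K) used to place the K initial cluster centers, and the merge test t_q < T on mean LAB values. The sketch below is illustrative only; it places the centers on a regular grid (an assumption standing in for the "random and uniform" placement described above), it does not implement the spatial distance or proportional functions, whose formulas are not reproduced in this text, and the helper names and the threshold value are hypothetical.

```python
import math


def grid_step(num_pixels, k):
    """Side length S of each initial block: S = sqrt(N / K)."""
    return math.sqrt(num_pixels / k)


def initial_centers(width, height, k):
    """Place cluster centers on a regular grid with spacing S (assumption)."""
    s = grid_step(width * height, k)
    centers = []
    y = s / 2
    while y < height:
        x = s / 2
        while x < width:
            centers.append((x, y))
            x += s
        y += s
    return centers


def pixel_difference(small_lab, large_lab):
    """t_q = |L - L_q| + |A - A_q| + |B - B_q| on mean LAB values."""
    return sum(abs(a - b) for a, b in zip(small_lab, large_lab))


def should_merge(small_lab, large_lab, threshold):
    """Merge the small region into its neighbor q when t_q < T."""
    return pixel_difference(small_lab, large_lab) < threshold


centers = initial_centers(width=8, height=8, k=4)
small_mean = (52.0, 10.0, -4.0)   # mean (L, A, B) of a small region
large_mean = (50.0, 12.0, -1.0)   # mean (L, A, B) of adjacent region q
t_q = pixel_difference(small_mean, large_mean)
merge = should_merge(small_mean, large_mean, threshold=10.0)  # T is hypothetical
```

For an 8×8 image and K＝4 the spacing is S＝4, giving one center in the middle of each quadrant; the two example regions differ by t_q＝7, so with the illustrative threshold T＝10 the small region would be merged into its neighbor.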

Optionally, the extracting semantic information of the segmented image block by using the image semantic feature extraction model includes:

in a specific embodiment of the present invention, the network structure of the image semantic feature extraction model is a VGG16 model;

the image semantic feature extraction process comprises the following steps:

performing feature training starting from initial weights transferred from ImageNet, and extracting the low-level and high-level features of the image respectively; the lower layers of the network extract basic feature information of the image, such as lines, edges and shapes, and because this information is generic, the lower-layer network parameters can directly use the pre-trained weights; the higher layers of the network form feature representations for the specific problem by combining and mapping the low-level basic features;

inputting the extracted features into the fully connected layers, and training and adjusting the weights of the fully connected layers with a fine-tuning strategy so as to maximize the accuracy of semantic classification of the image blocks; and selecting the output of the first fully connected layer as the deep learning semantic information of the image block.

Further, to achieve the above object, the present invention also provides a big data image processing system, comprising:

the image acquisition device is used for acquiring mass image data;

the data processor is used for preprocessing image graying and gray stretching to the image data to obtain preprocessed massive image data, and extracting the characteristics of the preprocessed image by using an image description characteristic extraction algorithm to obtain the description characteristics of the massive image;

the big data image processing device is used for optimizing a data partitioning strategy of the big data platform by utilizing a partitioning optimization algorithm, storing mass image data into the optimized big data platform, taking image description characteristics as key values of image storage, and performing self-adaptive segmentation processing on the stored image by utilizing a self-adaptive image segmentation algorithm to obtain a plurality of image blocks; and extracting the semantic information of the segmented image blocks by using an image semantic feature extraction model, and taking the semantic information of all the segmented image blocks as the semantic information of the original image.

Further, to achieve the above object, the present invention also provides a computer readable storage medium having stored thereon big data image processing program instructions executable by one or more processors to implement the steps of the implementation method of big data image processing as described above.

Compared with the prior art, the invention provides a big data image processing algorithm, which has the following advantages:

firstly, the invention provides a partition optimization algorithm of a Hadoop platform, wherein a big data platform receives mass image data, takes image description characteristics as key values key for image storage, constructs a sampling rate evaluation model, and comprehensively considers the influence of different sampling rates on data distribution and sampling cost, so as to determine the optimal image sampling rate according to the constructed sampling rate evaluation model, and sample the stored image based on the sampling rate s, wherein the sampling rate evaluation model is as follows:

s＝argmin(αD_s+βT_s)

wherein: cov_s,i is the error rate of the i-th sampling at sampling rate s, i.e. the difference between the sampled data distribution and the expected distribution; cov_m,i is the error rate of the i-th sampling at a sampling rate of 100 percent; t_s,i is the sampling time of the i-th sampling at sampling rate s; α and β are the sampling rate evaluation model parameters. The total load and the total number of distinct keys are obtained from the sampling result, and the average load of the big data platform is calculated from the number of Reduce tasks started by the client. The key queue is then traversed: if the load of a key is larger than the average load, the images corresponding to that key are treated as a large load, and the large load is split. Splitting covers two cases: 1. when the large load is equal to the average load, the key is assigned to a Reduce node whose load is 0, and the correspondence between the partition number and the Reduce node is recorded; 2. when the large load is several times the average load, the large load is split into pieces of the average load, and the correspondence between the partition numbers and the key is recorded. After the large loads have been processed, the small loads are assigned directly to Reduce nodes and the correspondence between each key and its partition number is recorded. In this way large-load images are placed on as few Reduce nodes as possible, which effectively addresses the overload caused by skewed data distribution in the big data storage platform and relieves the storage pressure of the image data.

Meanwhile, compared with the traditional image segmentation algorithm which needs to manually determine image segmentation parameters, the invention provides a self-adaptive image segmentation algorithm for carrying out self-adaptive segmentation processing on a stored image, firstly, the image segmentation quantity K is determined, and random and uniform centroid distribution operation is carried out on all pixel points in the whole image, namely, corresponding K clustering centers are distributed, the area of each image block is S multiplied by S, S is sqrt (N/K), wherein N is the total number of image pixels; calculating a spatial distance function between clustered centers and all pixel points in a search area

wherein (x_i,y_i) is the cluster center of the i-th image block, and (x_j,y_j) is a pixel point in the i-th image block; the image is converted into an LAB image and the proportional function m(i, j) of the image is calculated:

wherein: (x_ij,y_ij) is the distance difference between pixel point i and pixel point j; L_ij, A_ij and B_ij are the correlations between pixel point i and pixel point j in luminance, red-green color and yellow-blue color; a proportional function is calculated for each image block and used to divide the segmented image into two parts, namely the area larger than the proportional function and the area smaller than or equal to the proportional function; for each segmented small-size image, the pixel difference between the small-size image and the adjacent large-size image is calculated:

t_q＝|L-L_q|+|A-A_q|+|B-B_q|

wherein: t_q is the pixel difference between the small-size image and the adjacent large-size image q; L, A and B are respectively the average luminance, red-green color value and yellow-blue color value of the small-size image; L_q, A_q and B_q are respectively the average luminance, red-green color value and yellow-blue color value of the adjacent large-size image q; if t_q < T, the small-size image is merged with the large-size image, wherein T is a preset image threshold; the computation is iterated until the cluster centers and the areas of the clustered image blocks no longer change, and the original image is segmented to obtain a plurality of image blocks. Compared with the traditional algorithm, the method obtains, through several iterations, the proportional function between each pixel point in an image area and the cluster-center pixel point of that area; the proportional function contains luminance information, red-green color information and yellow-blue color information in the color space, together with the coordinate-space distance. The proportional function of each image block is obtained by iterative calculation and is used to divide the segmented image into two parts, namely the area larger than the proportional function and the area smaller than or equal to it, which achieves finer image segmentation. Meanwhile, for the small-size images that may appear, the invention sets a threshold on the pixel difference between a small-size image and the adjacent large-size image; if the pixel difference is below the threshold, the small-size image is merged with the large-size image, thereby realizing adaptive segmentation of the image based on its color space and coordinate space.

Drawings

FIG. 1 is a schematic flow chart of a big data image processing algorithm according to an embodiment of the present invention;

FIG. 2 is a block diagram of a big data image processing system according to an embodiment of the present invention;

the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Extracting description features of the image to serve as storage key values of the image, optimizing a data partitioning strategy of the big data platform by using a partitioning optimization algorithm, and storing mass data into the optimized big data platform; the self-adaptive image segmentation algorithm is used for carrying out self-adaptive segmentation processing on the stored image, the image semantic feature extraction model is used for extracting the semantic information of the segmented image blocks, and the semantic information of all the segmented image blocks is used as the information of the original image, so that the segmentation and semantic information extraction processing of the image are realized. Referring to fig. 1, a schematic diagram of a big data image processing algorithm according to an embodiment of the present invention is provided.

In this embodiment, the big data image processing algorithm includes:

S1, acquiring massive image data, and preprocessing the image data by image graying and grayscale stretching to obtain preprocessed massive image data.

Firstly, the invention acquires mass image data and carries out preprocessing of image graying and gray stretching on the image data, wherein the preprocessing flow of the image graying and gray stretching is as follows:

G(i,j)＝max{R(i,j),G(i,j),B(i,j)}

wherein:

(i, j) is a pixel point in the image;

G(i, j) is the gray value of the pixel point (i, j);

wherein:

f(x, y) is the gray map;

S2, performing feature extraction on the preprocessed image by using an image description feature extraction algorithm to obtain description features of the massive images.

Further, the method utilizes an image description feature extraction algorithm to extract features of the preprocessed image to obtain description features of massive images; the image description feature extraction algorithm flow is as follows:

1) constructing a Hessian matrix of the image:

wherein:

2) calculating the Hessian matrix extremum of each pixel: D(H(x))＝h_xx*h_yy-(0.9*h_xy)²; selecting the K pixel points with the largest D(H(x)) in the image as local feature points of the image;

3) constructing a Gaussian scale domain space:

wherein:

I(x, y) is the original image;

sigma is the standard deviation of pixels of the original image;

S3, optimizing the data partitioning strategy of the big data platform by using a partition optimization algorithm, storing the massive image data into the optimized big data platform, and using the image description features as the keys for image storage.

Further, the data partitioning strategy of the big data platform is optimized by using a partitioning optimization algorithm, and in a specific embodiment of the invention, the big data platform is a Hadoop platform;

the partition optimization algorithm flow comprises the following steps:

s＝argmin(αD_s+βT_s)

wherein:

cov_s,i is the error rate of the i-th sampling at sampling rate s, i.e. the difference between the sampled data distribution and the expected distribution, and cov_m,i is the error rate of the i-th sampling at a sampling rate of 100 percent;

S4, carrying out adaptive segmentation processing on the stored image by using an adaptive image segmentation algorithm to obtain a plurality of image blocks.

Further, the invention uses self-adaptive image segmentation algorithm to carry out self-adaptive segmentation processing on the stored image, and the flow of the self-adaptive image segmentation algorithm is as follows:

2) calculating a spatial distance function between clustered centers and all pixel points in a search area

wherein:

t_q＝|L-L_q|+|A-A_q|+|B-B_q|

wherein:

if t_q < T, the small-size image is merged with the large-size image, wherein T is a preset image threshold;

S5, extracting the semantic information of the segmented image blocks by using the image semantic feature extraction model, and taking the semantic information of all the segmented image blocks as the semantic information of the original image.

Furthermore, the invention utilizes an image semantic feature extraction model to extract semantic information of the segmented image blocks, and in a specific embodiment of the invention, the network structure of the image semantic feature extraction model is a VGG16 model;

the image semantic feature extraction process comprises the following steps:

The following describes an embodiment of the invention through an algorithm experiment that tests the processing method of the invention. The hardware test environment of the algorithm is an Intel(R) Core(TM) i7-6700K CPU, and the software is Matlab 2018a; the comparison methods are a big data image processing algorithm based on random forests and a big data image processing algorithm based on Bayes.

In the algorithm experiment of the invention, the data set is 10G of image data. In the experiment, the image data is input into the algorithm model, and the accuracy of image processing is used as an evaluation index of algorithm feasibility, wherein the higher the accuracy of image processing is, the higher the effectiveness and the feasibility of the algorithm are.

According to the experimental result, the image processing accuracy of the random forest-based big data image processing algorithm is 81.31%, the image processing accuracy of the Bayesian-based big data image processing algorithm is 86.38%, the image processing accuracy of the method is 87.92%, and compared with a comparison algorithm, the big data image processing algorithm provided by the invention can realize higher image processing accuracy.

The invention also provides a big data image processing system. Fig. 2 is a schematic diagram illustrating an internal structure of a big data image processing system according to an embodiment of the present invention.

In the present embodiment, the big data image processing system 1 includes at least an image acquisition device 11, a data processor 12, a big data image processing device 13, a communication bus 14, and a network interface 15.

The image acquisition device 11 may be a PC (Personal Computer), a terminal device such as a smart phone, a tablet computer, or a mobile computer, or may be a server.

The data processor 12 includes at least one type of readable storage medium, including flash memory, hard disks, multimedia cards, card-type memory (e.g., SD or DX memory), magnetic memory, magnetic disks, optical disks, and the like. In some embodiments the data processor 12 may be an internal storage unit of the big data image processing system 1, such as a hard disk of the big data image processing system 1. In other embodiments the data processor 12 may be an external storage device of the big data image processing system 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the big data image processing system 1. Further, the data processor 12 may include both an internal storage unit and an external storage device of the big data image processing system 1. The data processor 12 can be used not only to store application software installed in the big data image processing system 1 and various types of data, but also to temporarily store data that has been output or is to be output.

The big data image Processing apparatus 13 may be, in some embodiments, a Central Processing Unit (CPU), controller, microcontroller, microprocessor or other data Processing chip for running program codes stored in the data processor 12 or Processing data, such as big data image Processing program instructions 16.

The communication bus 14 is used to enable connection communication between these components.

The network interface 15 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), and is typically used to establish a communication link between the system 1 and other electronic devices.

Optionally, the big data image processing system 1 may further include a user interface, the user interface may include a Display (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface may further include a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the big data image processing system 1 and for displaying a visualized user interface.

While FIG. 2 only shows the components 11-15 and the big data image processing system 1, those skilled in the art will appreciate that the configuration shown in FIG. 2 does not constitute a limitation of the big data image processing system 1, which may include fewer or more components than shown, a combination of some components, or a different arrangement of components.

In the embodiment of the big-data image processing system 1 shown in fig. 2, big-data image processing program instructions 16 are stored in the data processor 12; the steps of the big-data image processing apparatus 13 executing the big-data image processing program instructions 16 stored in the data processor 12 are the same as the implementation method of the big-data image processing algorithm, and are not described here.

Furthermore, an embodiment of the present invention also provides a computer-readable storage medium having stored thereon big-data image processing program instructions executable by one or more processors to implement the following operations:

It should be noted that the above-mentioned numbers of the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments. And the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A big data image processing algorithm, the algorithm comprising:

2. The big data image processing algorithm according to claim 1, wherein the pre-processing of image graying and gray stretching for the image data comprises:

G(i, j) = max{R(i, j), G(i, j), B(i, j)}

wherein:

(i, j) is a pixel point in the image;

G(i, j) on the left-hand side is the gray value of the pixel point (i, j);

wherein:

f(x, y) is the grayscale image;

MAX_f(x,y) and MIN_f(x,y) are respectively the maximum and minimum gray values of the grayscale image.
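The two preprocessing steps of claim 2 can be illustrated with a minimal Python sketch. The graying rule follows the maximum-value formula given above. The stretching formula itself is not reproduced in this text, so the sketch assumes a common linear stretch of the range [MIN_f(x,y), MAX_f(x,y)] onto [0, 255]; the function names `to_gray_max` and `stretch_gray` and the sample pixel values are hypothetical.

```python
def to_gray_max(rgb):
    """Gray each pixel as the maximum of its R, G, B components."""
    return [[max(pixel) for pixel in row] for row in rgb]


def stretch_gray(gray, out_max=255):
    """Linearly stretch gray values onto [0, out_max] (assumed form)."""
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in gray]
    scale = out_max / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in gray]


# Hypothetical 2x2 RGB image, one (R, G, B) tuple per pixel.
rgb_image = [
    [(10, 40, 20), (90, 60, 30)],
    [(50, 50, 50), (5, 100, 70)],
]
gray = to_gray_max(rgb_image)   # gray == [[40, 90], [50, 100]]
stretched = stretch_gray(gray)  # darkest pixel maps to 0, brightest to 255
```

Under this assumed form the darkest pixel of the grayscale image maps to 0 and the brightest to 255, which widens the contrast of images whose gray values occupy only a narrow band.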

3. The big data image processing algorithm according to claim 2, wherein the performing the feature extraction on the preprocessed image by using the image description feature extraction algorithm comprises:

1) constructing a Hessian matrix of the image:

wherein:

h_xx(x, sigma) is the second derivative of the image at pixel point x, wherein sigma is the neighborhood standard deviation of the image pixel;

3) constructing a Gaussian scale domain space:

wherein:

I(x, y) is the original image;

sigma is the standard deviation of pixels of the original image;
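The Hessian matrix and scale-space formulas of claim 3 are not reproduced in this text. As a hedged illustration only, the sketch below assumes the usual forms: a scale-space image L(x, y, sigma) obtained by smoothing I(x, y) with a Gaussian of standard deviation sigma, and a Hessian built from the second derivatives h_xx, h_yy and h_xy of that smoothed image, whose determinant h_xx * h_yy - h_xy^2 responds strongly at blob-like points. The kernel truncation at 3 * sigma, the clamped borders, the central-difference derivatives and all function names are assumptions of this sketch.

```python
import math


def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3 * sigma and normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(image, sigma):
    """Separable Gaussian smoothing of a 2-D list; borders are clamped."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])
    offsets = range(-radius, radius + 1)

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[k + radius] * image[y][clamp(x + k, w)] for k in offsets)
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[k + radius] * rows[clamp(y + k, h)][x] for k in offsets)
             for x in range(w)] for y in range(h)]


def hessian_determinant(image, sigma):
    """det(H) = h_xx * h_yy - h_xy^2 at interior pixels of the smoothed image."""
    L = smooth(image, sigma)
    h, w = len(L), len(L[0])
    det = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            h_xx = L[y][x + 1] - 2 * L[y][x] + L[y][x - 1]
            h_yy = L[y + 1][x] - 2 * L[y][x] + L[y - 1][x]
            h_xy = (L[y + 1][x + 1] - L[y + 1][x - 1]
                    - L[y - 1][x + 1] + L[y - 1][x - 1]) / 4
            det[y][x] = h_xx * h_yy - h_xy * h_xy
    return det


# Hypothetical 7x7 image containing a single bright point at the center.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 255.0
det = hessian_determinant(image, sigma=1.0)
peak = max((det[y][x], (y, x)) for y in range(7) for x in range(7))
```

In this example the largest determinant response falls on the bright point itself, which is the behavior a Hessian-based detector relies on when selecting candidate feature points.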

4. The big data image processing algorithm according to claim 3, wherein the optimizing the data partitioning policy of the big data platform by using the partitioning optimization algorithm comprises:

1) the big data platform receives massive image data and uses the image description features as the key for image storage; a sampling rate evaluation model is built, the image sampling rate s is determined according to the built model, and the stored images are sampled at the sampling rate s, the sampling rate evaluation model being:

s = argmin(αD_s + βT_s)

wherein:

cov_{s,i} is the error rate of the i-th sampling at the sampling rate s, namely the difference between the data distribution after sampling and the expected distribution, and cov_{m,i} is the error rate of the i-th sampling when the sampling rate is 100 percent;
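The selection rule s = argmin(αD_s + βT_s) can be illustrated as follows. The definitions of D_s and T_s are not reproduced in this text, so the sketch treats them as values already measured for each candidate sampling rate; the function name `choose_sampling_rate`, the candidate rates and the measured values are all hypothetical.

```python
def choose_sampling_rate(candidates, alpha, beta):
    """Return the rate s that minimizes alpha * D_s + beta * T_s.

    candidates maps each candidate sampling rate s to a (D_s, T_s) pair.
    """
    def cost(s):
        d_s, t_s = candidates[s]
        return alpha * d_s + beta * t_s

    return min(candidates, key=cost)


# Hypothetical measurements: sampling rate -> (D_s, T_s).
measurements = {
    0.01: (0.30, 0.02),
    0.05: (0.10, 0.10),
    0.20: (0.04, 0.40),
    1.00: (0.00, 1.00),
}
best = choose_sampling_rate(measurements, alpha=0.5, beta=0.5)
accuracy_only = choose_sampling_rate(measurements, alpha=1.0, beta=0.0)
```

With equal weights the 5 percent rate gives the lowest combined cost in this example, while setting β to zero makes the rule fall back to full sampling; the weights α and β therefore set the trade-off between the two terms.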

5. The big data image processing algorithm of claim 4, wherein the adaptively segmenting the stored image using an adaptive image segmentation algorithm comprises:

wherein:

t_q = |L - L_q| + |A - A_q| + |B - B_q|

wherein:

L_q, A_q and B_q are respectively the averages of the brightness, red-green color value and yellow-blue color value of the adjacent large-size image block q;
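The distance t_q can be computed directly from the mean (L, A, B) values of two blocks. The sketch below evaluates t_q for each adjacent block and, as an assumption of this sketch (the surrounding steps of claim 5 are not reproduced in this text), selects the adjacent block with the smallest t_q as the most similar one. The function names, block identifiers and color values are hypothetical.

```python
def color_distance(block, neighbor):
    """t_q = |L - L_q| + |A - A_q| + |B - B_q| for mean (L, A, B) triples."""
    return sum(abs(c - c_q) for c, c_q in zip(block, neighbor))


def most_similar_neighbor(block, neighbors):
    """Return the identifier q of the adjacent block with the smallest t_q."""
    return min(neighbors, key=lambda q: color_distance(block, neighbors[q]))


# Hypothetical mean (L, A, B) values of a block and its adjacent blocks.
small_block = (52.0, 10.0, -4.0)
adjacent = {
    "q1": (50.0, 12.0, -3.0),
    "q2": (70.0, 9.0, -4.0),
    "q3": (52.0, -20.0, 30.0),
}
distances = {q: color_distance(small_block, lab) for q, lab in adjacent.items()}
closest = most_similar_neighbor(small_block, adjacent)
```

Here t_q is 5 for q1, 19 for q2 and 64 for q3, so q1 is selected under the assumed smallest-distance rule.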

6. The big data image processing algorithm according to claim 5, wherein the extracting semantic information of the segmented image block by using the image semantic feature extraction model comprises:

the image semantic feature extraction process comprises the following steps:

performing feature training with initial weights transferred from ImageNet, and extracting the low-level features and the high-level features of the image respectively;

7. A big data image processing system, the system comprising:

the image acquisition device is used for acquiring mass image data;

8. A computer-readable storage medium having stored thereon big data image processing program instructions executable by one or more processors to implement the steps of the big data image processing algorithm as described above.