CN116385811A - Multi-party collaborative image data analysis method and system

Info

Publication number: CN116385811A
Application number: CN202310658814.3A
Authority: CN
Inventors: 彭文芳
Original assignee: Shenzhen Chengwang Chuangshuo Technology Co., Ltd.
Current assignee: Shenzhen Chengwang Chuangshuo Technology Co., Ltd.
Priority date: 2023-06-06
Filing date: 2023-06-06
Publication date: 2023-07-04

Abstract

The invention provides a multi-party collaborative image data analysis method and system. The method comprises: acquiring an image sample set and performing missing value processing on the image samples in it; calculating the limit density of each image sample i and the interval distance from image sample i to an adjacent image sample j, where j is the image sample nearest to i among those whose limit density is greater than that of i; forming a decision graph from the limit density and interval distance of each image sample, and taking the image samples with both a large limit density and a large interval distance as group centers to obtain a group center set; calculating the separated image samples; and assigning the non-separated image samples other than the group centers to corresponding groups according to a first allocation strategy, then assigning the separated image samples and any non-separated image samples left unassigned by the first allocation strategy according to a second allocation strategy. The invention can group large image data sets, and the allocation is accurate and fast.

Description

Multi-party collaborative image data analysis method and system

Technical Field

The invention belongs to the field of image processing, and particularly relates to a multiparty collaborative image data analysis method and system.

Background

Before images can be used for related research, they must first be classified and grouped: without prior knowledge, the images are grouped by similarity according to their content, so that images within a group are highly similar and images in different groups have low similarity. Doing this manually involves a workload too large to be practical. Extracting features from the existing images and then classifying them is also difficult, because the huge number of images places heavy demands on the storage space and running memory of a computer.

Disclosure of Invention

The invention aims to overcome the defects in the prior art, and provides a multiparty collaborative image data analysis method.

The invention adopts the following technical scheme:

a multi-party collaborative image data analysis method comprises the following steps:

acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;

calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;

forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;

calculating a separation image sample;

non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.

Specifically, the limit density of the image sample i is calculated, and the distance between the image sample i and the adjacent image sample j is calculated, specifically:

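[Formula: limit density of image sample i]

[Formula: separation distance from image sample i to adjacent image sample j]

For the image sample i with the greatest limit density:

[Formula: separation distance of the image sample with the greatest limit density]

Wherein the symbols in the above formulas denote, in order: the limit density of image sample i; the Euclidean distance between image sample i and image sample j; the set of K nearest-neighbor image samples of image sample i; and the separation distance from image sample i to adjacent image sample j.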
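By way of non-limiting illustration, the two quantities defined above may be sketched in Python as follows. The density formula used here (the sum of exp(-distance) over the K nearest neighbors), the rule applied to the sample with the greatest density (its largest distance to any other sample), and the names `knn_density_and_distance`, `rho`, `delta` and `nearest_denser` are assumptions made for this sketch only; they are not taken from the formulas of this disclosure.

```python
import numpy as np

def knn_density_and_distance(X, k):
    """Return (rho, delta, nearest_denser) for the feature vectors in X."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances between all samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Exclude each sample from its own neighbour list.
    d_no_self = d.copy()
    np.fill_diagonal(d_no_self, np.inf)
    # Distances to the k nearest neighbours of each sample.
    knn_d = np.sort(d_no_self, axis=1)[:, :k]
    # ASSUMPTION: density is the sum of exp(-distance) over the k neighbours.
    rho = np.exp(-knn_d).sum(axis=1)

    delta = np.empty(n)
    nearest_denser = np.full(n, -1)
    for i in range(n):
        # Samples whose density is strictly greater than that of sample i.
        denser = np.flatnonzero(rho > rho[i])
        if denser.size == 0:
            # ASSUMPTION: a sample with no denser sample gets its largest
            # distance to any other sample.
            delta[i] = d[i].max()
        else:
            # Nearest sample among those with greater density.
            j = denser[np.argmin(d[i, denser])]
            delta[i] = d[i, j]
            nearest_denser[i] = j
    return rho, delta, nearest_denser
```

The sketch builds the full pairwise distance matrix for clarity; for a large image sample set, a dedicated nearest-neighbor index would likely be preferable.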

Specifically, the calculating the separation image sample specifically includes:

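[Formulas: definition of the separated image samples in terms of the KNN distance of each image sample and a defined threshold]

Wherein o is a separated image sample, N is the total number of image samples, and the remaining symbols denote, in order: the KNN distance of image sample i; the defined threshold; and the Euclidean distance between image sample i and image sample j.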
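By way of non-limiting illustration, one possible reading of the separated-sample computation is sketched below. Taking the KNN distance of a sample to be its largest distance to any of its K nearest neighbors, and the threshold to be the mean KNN distance over all N samples, are assumptions of this sketch; the function name `find_separated_samples` is likewise hypothetical.

```python
import numpy as np

def find_separated_samples(X, k):
    """Return (indices of separated samples, KNN distances, threshold)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances; a sample is not its own neighbour.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    # Distances to the k nearest neighbours of each sample.
    knn_d = np.sort(d, axis=1)[:, :k]
    # ASSUMPTION: KNN distance = largest of the k neighbour distances.
    kdist = knn_d.max(axis=1)
    # ASSUMPTION: threshold = mean KNN distance over all N samples.
    threshold = kdist.mean()
    # A sample is treated as separated when its KNN distance exceeds the threshold.
    return np.flatnonzero(kdist > threshold), kdist, threshold
```

Under these assumptions, a sample lying far from every compact group has a KNN distance well above the mean and is therefore flagged.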

Specifically, non-separated image samples other than the group centers are assigned to corresponding groups according to a first allocation strategy, wherein the first allocation strategy is as follows:

Step S11: select an unvisited sample point ci from the group center set CI as the group center of a new group, and mark ci as visited;

Step S12: merge the samples in the K-nearest-neighbor set KNN(ci) of point ci into the group to which ci belongs, initialize a queue Vq, and put the samples in KNN(ci) into the queue Vq in order;

Step S13: remove the head sample q from the queue Vq; then, for each sample r in the set KNN(q), if r satisfies all of the following conditions: i) r is unassigned, ii) r is not a separated sample, and iii) [Formula: third condition], merge r into the group to which q belongs and append r to the tail of the queue Vq;

Step S14: if the queue Vq is not empty, go to step S13;

Step S15: if CI still contains unvisited sample points, go to step S11; otherwise, end the first allocation strategy.
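By way of non-limiting illustration, steps S11 to S15 amount to a breadth-first expansion from each group center, which may be sketched as follows. Because condition iii) is not reproduced in this text, it is passed in as a caller-supplied predicate `accept(q, r)`; that predicate, the dictionary-based data layout, the guard that leaves already-assigned neighbors untouched in step S12, and the function name `first_allocation` are all assumptions of this sketch.

```python
from collections import deque

def first_allocation(centers, knn, separated, accept):
    """Assign samples to groups by expanding from each group center.

    centers   -- list of sample indices (the group center set CI)
    knn       -- dict mapping a sample index to its K nearest neighbours
    separated -- set of separated sample indices
    accept    -- hypothetical predicate standing in for condition iii)
    """
    labels = {}
    for group, ci in enumerate(centers):      # S11: next unvisited center
        labels[ci] = group
        queue = deque()
        for s in knn[ci]:                     # S12: absorb KNN(ci)
            if s not in labels:               # ASSUMPTION: keep earlier labels
                labels[s] = group
                queue.append(s)
        while queue:                          # S14: repeat until Vq is empty
            q = queue.popleft()               # S13: remove the head of Vq
            for r in knn[q]:
                if r not in labels and r not in separated and accept(q, r):
                    labels[r] = labels[q]
                    queue.append(r)
    return labels                             # S15: all centers visited
```

Samples that never satisfy the three conditions are simply absent from the returned mapping and remain available for the second allocation strategy.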

Specifically, the separated image samples and the non-separated image samples not assigned by the first allocation strategy are assigned to corresponding groups according to a second allocation strategy, wherein the second allocation strategy is as follows:

S21: for each unassigned image sample i, count the number of samples $M_b(i)$ among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …), obtaining a row vector with one entry per group; the vectors of all unassigned samples form an identification matrix S with nr rows and one column per group, where $S(i, j) = M_j(i)$, j = 1, 2, 3, …, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;

S22: acquire image sample p from the matrix S and assign it to the corresponding group, wherein sample p is obtained as follows: derive from the matrix S the group k to which sample i belongs, k = 1, 2, 3, …; let $N_k(i) = \max\{N_b(i),\ b = 1, 2, 3, \dots\}$, where $N_b(i)$ is the count stored in S for sample i and group b; the values $N_k(i)$ of all samples form a vector Vmax; let p be the sample corresponding to the largest component of Vmax, i.e. $N_k(p) = \max\{N_k(i),\ i = 1, 2, 3, \dots, nr\}$; sample p is assigned as follows:

a) if $N_k(p) = K$, all samples whose maximum value in the matrix S equals K are assigned to the group corresponding to that maximum value;

b) if $0 < N_k(p) < K$, the samples whose maximum value in the identification matrix S equals $N_k(p)$ are selected and assigned to the corresponding groups, and the assigned samples are marked as p;

c) otherwise, the second allocation strategy ends;

S23: update the identification matrix S: for each unassigned sample q in KNN(p), set $N_k(q) = N_k(q) + 1$, and set the row vector N(p) of sample p in the identification matrix S to 0;

S24: if no unassigned sample remains, end the second allocation strategy; otherwise, go to S22.
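By way of non-limiting illustration, steps S21 to S24 may be sketched as follows. The sketch treats cases a) and b) together, since both assign the samples whose row maximum equals the current largest value, and it encodes an unassigned sample with the label -1; these choices, the array layout, and the function name `second_allocation` are assumptions of this sketch rather than details of the disclosure.

```python
import numpy as np

def second_allocation(labels, knn, n_groups):
    """Assign remaining samples by majority of already-grouped neighbours.

    labels   -- 1-D int array, group index per sample, -1 if unassigned
    knn      -- 2-D int array, row i holds the K nearest neighbours of i
    n_groups -- number of groups
    """
    labels = np.asarray(labels).copy()
    knn = np.asarray(knn)
    # S21: counts[i, b] = number of neighbours of i that belong to group b.
    counts = np.zeros((len(labels), n_groups), dtype=int)
    for i in np.flatnonzero(labels == -1):
        for r in knn[i]:
            if labels[r] >= 0:
                counts[i, labels[r]] += 1
    while (labels == -1).any():               # S24: loop while samples remain
        unassigned = np.flatnonzero(labels == -1)
        row_max = counts[unassigned].max(axis=1)
        best = row_max.max()                  # S22: largest component of Vmax
        if best == 0:
            break                             # case c): nothing to go on
        for p in unassigned[row_max == best]:  # cases a) and b)
            k = int(counts[p].argmax())
            labels[p] = k
            counts[p] = 0                     # S23: clear the row of p
            for q in knn[p]:                  # S23: update unassigned q in KNN(p)
                if labels[q] == -1:
                    counts[q, k] += 1
    return labels
```

A sample none of whose neighbors is ever assigned keeps the label -1 when the loop ends through case c).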

In another aspect of the invention, a multi-party collaborative image data analysis system comprises:

an image preprocessing unit: acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;

limit density and separation distance calculation unit: calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;

group center set acquisition unit: forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;

a separation image sample calculation unit: calculating a separation image sample;

grouping unit: non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.

Specifically, in the limit density and interval distance calculating unit, a limit density of an image sample i is calculated, and an interval distance from the image sample i to an adjacent image sample j is calculated, specifically:

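[Formula: limit density of image sample i]

[Formula: separation distance from image sample i to adjacent image sample j]

For the image sample i with the greatest limit density:

[Formula: separation distance of the image sample with the greatest limit density]

Wherein the symbols in the above formulas denote, in order: the limit density of image sample i; the Euclidean distance between image sample i and image sample j; the set of K nearest-neighbor image samples of image sample i; and the separation distance from image sample i to adjacent image sample j.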

Specifically, in the isolated image sample calculation unit, an isolated image sample is calculated, specifically:

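[Formulas: definition of the separated image samples in terms of the KNN distance of each image sample and a defined threshold]

Wherein o is a separated image sample, N is the total number of image samples, and the remaining symbols denote, in order: the KNN distance of image sample i; the defined threshold; and the Euclidean distance between image sample i and image sample j.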

Specifically, in the grouping unit, non-separated image samples except for a group center are allocated to corresponding groups according to a first allocation policy, where the first allocation policy is:

step S14, if the queue Vq is not empty, turning to S13;

Specifically, in the grouping unit, the non-separated image samples and the separated image samples not assigned by the first assignment policy are assigned to corresponding groups according to a second assignment policy, where the second assignment policy is:

S21: for each unassigned image sample i, count the number of samples $M_b(i)$ among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …), obtaining a row vector with one entry per group; the vectors of all unassigned samples form an identification matrix S with nr rows and one column per group, where $S(i, j) = M_j(i)$, j = 1, 2, 3, …, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;

S22: acquire image sample p from the matrix S and assign it to the corresponding group, wherein sample p is obtained as follows: derive from the matrix S the group k to which sample i belongs, k = 1, 2, 3, …; let $N_k(i) = \max\{N_b(i),\ b = 1, 2, 3, \dots\}$;

c) Otherwise, ending the second allocation strategy;

As can be seen from the above description of the present invention, compared with the prior art, the present invention has the following advantages:

The invention provides a multi-party collaborative image data analysis method, comprising: acquiring an image sample set and performing missing value processing on the image samples in it; calculating the limit density of each image sample i and the interval distance from image sample i to an adjacent image sample j, where j is the image sample nearest to i among those whose limit density is greater than that of i; forming a decision graph from the limit density and interval distance of each image sample, and taking the image samples with both a large limit density and a large interval distance as group centers to obtain a group center set; calculating the separated image samples; and assigning the non-separated image samples other than the group centers to corresponding groups according to a first allocation strategy, then assigning the separated image samples and any non-separated image samples left unassigned by the first allocation strategy according to a second allocation strategy. The method can group large image data sets, and the allocation is accurate, simple, and fast.

Drawings

FIG. 1 is a flowchart of a multi-party collaborative image data analysis method according to an embodiment of the present invention;

FIG. 2 is a diagram of a multi-party collaborative image data analysis method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an embodiment of a computer readable storage medium according to an embodiment of the present invention.

Detailed Description

The invention is further described below by means of specific embodiments.

The invention provides a multi-party collaborative image data analysis method that can group large image data sets, with allocation that is accurate, simple, and fast.

FIG. 1 is a schematic illustration of a multi-party collaborative image data analysis method in accordance with aspects of the present invention; the method specifically comprises the following steps:

S101: acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;

The image data preprocessing comprises handling missing data and normalizing the data: missing values are replaced by the mean, and the data are normalized with the min-max method. Normalization eliminates the influence of differing feature scales on the experimental results and reduces the running time of the algorithm.
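By way of non-limiting illustration, the preprocessing described above may be sketched as follows, assuming that each image sample is already represented as a numeric feature vector, that missing values are encoded as NaN, and that the mean is taken per feature column. These assumptions and the function name `preprocess` are introduced for this sketch only.

```python
import numpy as np

def preprocess(X):
    """Mean-impute missing values, then min-max normalise each feature."""
    X = np.asarray(X, dtype=float).copy()
    # Replace each missing value (NaN) with the mean of its feature column.
    col_mean = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_mean[cols]
    # Min-max normalisation: rescale every feature to the range [0, 1].
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Avoid division by zero for constant features.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span
```

For example, `preprocess([[1, 10], [float("nan"), 20], [3, 30]])` fills the missing entry with 2 and returns rows `[0, 0]`, `[0.5, 0.5]` and `[1, 1]`.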

S102: calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;

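[Formula: limit density of image sample i]

[Formula: separation distance from image sample i to adjacent image sample j]

For the image sample i with the greatest limit density:

[Formula: separation distance of the image sample with the greatest limit density]

Wherein the symbols in the above formulas denote, in order: the limit density of image sample i; the Euclidean distance between image sample i and image sample j; the set of K nearest-neighbor image samples of image sample i; and the separation distance from image sample i to adjacent image sample j.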

In this embodiment of the invention, the limit density of sample point i is estimated from its K nearest neighbors only, rather than from the whole data set. The resulting density depends only on those K neighbors and therefore reflects the local information around sample point i; leaving aside the cost of searching for the K neighbors, this density calculation is also less time-consuming.

S103: forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;
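By way of non-limiting illustration, reading group centers off the decision graph may be sketched as a simple two-threshold selection. The text only states that samples with both a larger limit density and a larger interval distance are taken as group centers; the explicit thresholds `rho_min` and `delta_min` and the function name `select_group_centers` are assumptions of this sketch.

```python
import numpy as np

def select_group_centers(rho, delta, rho_min, delta_min):
    """Return indices of samples whose density and distance are both large."""
    rho = np.asarray(rho, dtype=float)
    delta = np.asarray(delta, dtype=float)
    # A group center lies in the upper-right region of the decision graph.
    return np.flatnonzero((rho > rho_min) & (delta > delta_min))
```

A sample with a large interval distance but a small limit density is not selected, which keeps isolated samples from becoming group centers.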

S104: calculating the separated image samples;

During grouping, separated image sample points strongly affect the result and can cause two groups of images to be merged into one; therefore, the separated sample points are removed before the image samples are assigned to the corresponding groups;

[Formulas: definition of the separated image samples in terms of the KNN distance of each image sample and a defined threshold]

Wherein o is a separated image sample, N is the total number of image samples, and the remaining symbols denote, in order: the KNN distance of image sample i; the defined threshold; and the Euclidean distance between image sample i and image sample j.

S105: non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.

step S14, if the queue Vq is not empty, turning to S13;

S21: for each unassigned image sample i, count the number of samples $M_b(i)$ among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …), obtaining a row vector with one entry per group; the vectors of all unassigned samples form an identification matrix S with nr rows and one column per group, where $S(i, j) = M_j(i)$, j = 1, 2, 3, …, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;

let $N_k(i) = \max\{N_b(i),\ b = 1, 2, 3, \dots\}$;

c) Otherwise, ending the second allocation strategy;

In another aspect, an embodiment of the invention provides a multi-party collaborative image data analysis system, which comprises the following units:

an image preprocessing unit 201: acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;

Limit density and separation distance calculation unit 202: calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;

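[Formula: limit density of image sample i]

[Formula: separation distance from image sample i to adjacent image sample j]

For the image sample i with the greatest limit density:

[Formula: separation distance of the image sample with the greatest limit density]

Wherein the symbols in the above formulas denote, in order: the limit density of image sample i; the Euclidean distance between image sample i and image sample j; the set of K nearest-neighbor image samples of image sample i; and the separation distance from image sample i to adjacent image sample j.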

Group center set acquisition unit 203: forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;

a separation image sample calculation unit 204: calculating a separation image sample;

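[Formulas: definition of the separated image samples in terms of the KNN distance of each image sample and a defined threshold]

Wherein o is a separated image sample, N is the total number of image samples, and the remaining symbols denote, in order: the KNN distance of image sample i; the defined threshold; and the Euclidean distance between image sample i and image sample j.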

Grouping unit 205: non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.

step S14, if the queue Vq is not empty, turning to S13;

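S21: for each unassigned image sample i, count the number of samples $M_b(i)$ among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …), obtaining a row vector with one entry per group; the vectors of all unassigned samples form an identification matrix S with nr rows and one column per group, where $S(i, j) = M_j(i)$, j = 1, 2, 3, …, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;

let $N_k(i) = \max\{N_b(i),\ b = 1, 2, 3, \dots\}$;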

c) Otherwise, ending the second allocation strategy;

As shown in fig. 3, an electronic device 300 is provided in an embodiment of the present invention, which includes a memory 310, a processor 320, and a computer program 311 stored in the memory 310 and capable of running on the processor 320, where the processor 320 implements a multi-party collaborative image data analysis method provided in the embodiment of the present invention when executing the computer program 311.

In a specific implementation, when the processor 320 executes the computer program 311, any implementation manner of the embodiment corresponding to fig. 1 may be implemented.

Since the electronic device described in this embodiment is a device for implementing the data processing apparatus of the embodiments of the present invention, those skilled in the art can, based on the method described in the embodiments, understand the specific implementation of the electronic device and its various modifications; how the electronic device implements the method is therefore not described in detail here. Any device used by those skilled in the art to implement the method in the embodiments of the present invention falls within the scope of protection of the present invention.

Referring to fig. 4, fig. 4 is a schematic diagram of an embodiment of a computer readable storage medium according to an embodiment of the invention.

As shown in fig. 4, the present embodiment provides a computer readable storage medium 400, on which a computer program 411 is stored, which when executed by a processor, implements a multi-party collaborative image data analysis method provided by an embodiment of the present invention;

in a specific implementation, the computer program 411 may implement any implementation of the embodiment corresponding to fig. 1 when executed by a processor.

In the foregoing embodiments, each embodiment is described with its own emphasis; for portions not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The foregoing is merely illustrative of specific embodiments of the present invention, but the design concept of the present invention is not limited thereto, and any insubstantial modification of the present invention by using the design concept shall fall within the scope of the present invention.

Claims

1. The multi-party collaborative image data analysis method is characterized by comprising the following steps of:

calculating a separation image sample;

2. The multi-party collaborative image data analysis method according to claim 1, wherein a limit density of an image sample i is calculated and a separation distance of the image sample i from an adjacent image sample j is calculated, specifically:

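[Formula: limit density of image sample i]

[Formula: separation distance from image sample i to adjacent image sample j]

For the image sample i with the greatest limit density:

[Formula: separation distance of the image sample with the greatest limit density]

Wherein the symbols in the above formulas denote, in order: the limit density of image sample i; the Euclidean distance between image sample i and image sample j; the set of K nearest-neighbor image samples of image sample i; and the separation distance from image sample i to adjacent image sample j.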

3. The multi-party collaborative image data analysis method according to claim 1, wherein the computing a separate image sample is specifically:

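[Formulas: definition of the separated image samples in terms of the KNN distance of each image sample and a defined threshold]

Wherein o is a separated image sample, N is the total number of image samples, and the remaining symbols denote, in order: the KNN distance of image sample i; the defined threshold; and the Euclidean distance between image sample i and image sample j.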

4. The multi-party collaborative image data analysis method according to claim 1, wherein non-separated image samples other than the group centers are assigned to respective groups according to a first allocation strategy, the first allocation strategy being:

step S14, if the queue Vq is not empty, turning to S13;

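5. The method of claim 4, wherein the separated image samples and the non-separated image samples not assigned by the first allocation strategy are assigned to respective groups according to a second allocation strategy, the second allocation strategy being:

S21: for each unassigned image sample i, count the number of samples $M_b(i)$ among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …), obtaining a row vector with one entry per group; the vectors of all unassigned samples form an identification matrix S with nr rows and one column per group, where $S(i, j) = M_j(i)$, j = 1, 2, 3, …, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;

let $N_k(i) = \max\{N_b(i),\ b = 1, 2, 3, \dots\}$;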

c) Otherwise, ending the second allocation strategy;

s24, if no unallocated sample exists, ending the second allocation policy, otherwise turning to S22.

6. A multi-party collaborative image data analysis system, comprising:

7. The multi-party collaborative image data analysis system according to claim 6, wherein the limit density and separation distance calculating unit calculates a limit density for an image sample i and calculates a separation distance from the image sample i to an adjacent image sample j by:

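[Formula: limit density of image sample i]

[Formula: separation distance from image sample i to adjacent image sample j]

For the image sample i with the greatest limit density:

[Formula: separation distance of the image sample with the greatest limit density]

Wherein the symbols in the above formulas denote, in order: the limit density of image sample i; the Euclidean distance between image sample i and image sample j; the set of K nearest-neighbor image samples of image sample i; and the separation distance from image sample i to adjacent image sample j.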

8. The multi-party collaborative image data analysis system according to claim 6, wherein the separate image sample computing unit computes separate image samples as follows:

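[Formulas: definition of the separated image samples in terms of the KNN distance of each image sample and a defined threshold]

Wherein o is a separated image sample, N is the total number of image samples, and the remaining symbols denote, in order: the KNN distance of image sample i; the defined threshold; and the Euclidean distance between image sample i and image sample j.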

9. The multi-party collaborative image data analysis system according to claim 6, wherein non-separated image samples other than a group center are assigned to respective groups in the grouping unit according to a first assignment policy, the first assignment policy being:

step S14, if the queue Vq is not empty, turning to S13;

10. The multi-party collaborative image data analysis system according to claim 9, wherein the grouping unit assigns the separated image samples and the non-separated image samples not assigned by the first allocation strategy to respective groups according to a second allocation strategy, the second allocation strategy being:

S21: for each unassigned image sample i, count the number of samples $M_b(i)$ among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …), obtaining a row vector with one entry per group; the vectors of all unassigned samples form an identification matrix S with nr rows and one column per group, where $S(i, j) = M_j(i)$, j = 1, 2, 3, …, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;

let $N_k(i) = \max\{N_b(i),\ b = 1, 2, 3, \dots\}$;

c) Otherwise, ending the second allocation strategy;