CN116385811A - Multi-party collaborative image data analysis method and system - Google Patents


Info

Publication number
CN116385811A
Authority
CN
China
Prior art keywords
sample
image
samples
image sample
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310658814.3A
Other languages
Chinese (zh)
Inventor
彭文芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Chengwang Chuangshuo Technology Co ltd
Original Assignee
Shenzhen Chengwang Chuangshuo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Chengwang Chuangshuo Technology Co ltd
Priority to CN202310658814.3A
Publication of CN116385811A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a multi-party collaborative image data analysis method and system. The method comprises: acquiring an image sample set and performing missing value processing on the image samples in the set; calculating the limit density of each image sample i and the interval distance from image sample i to an adjacent image sample j, where adjacent image sample j is the sample nearest to image sample i among those whose limit density is greater than that of image sample i; forming a decision graph from the limit density and interval distance of each image sample, and taking the image samples for which both values are large as group centers to obtain a group center set; determining the separated image samples; assigning the non-separated image samples other than the group centers to their groups according to a first allocation strategy, and assigning the separated image samples, together with the non-separated image samples left unassigned by the first allocation strategy, to their groups according to a second allocation strategy. The invention can group large image data sets, and the assignment is accurate and fast.

Description

Multi-party collaborative image data analysis method and system
Technical Field
The invention belongs to the field of image processing, and particularly relates to a multiparty collaborative image data analysis method and system.
Background
Before images can be used for related research, they must first be classified and grouped: without prior knowledge, the images are grouped by similarity according to their content, so that images within a group are highly similar and images in different groups are dissimilar. Doing this manually involves a prohibitive workload and is impractical, while extracting features from the existing images and then classifying them places heavy demands on a computer's storage space and running memory when the number of images is very large.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a multiparty collaborative image data analysis method.
The invention adopts the following technical scheme:
a multi-party collaborative image data analysis method comprises the following steps:
acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;
calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;
forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;
calculating a separation image sample;
non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.
Specifically, the limit density of image sample i and the interval distance from image sample i to adjacent image sample j are calculated as follows:
Figure SMS_1
Figure SMS_2
For the image sample i with the greatest limit density, the interval distance is:
Figure SMS_3
where the symbol shown in Figure SMS_4 is the limit density of image sample i, the symbol shown in Figure SMS_5 is the Euclidean distance between image sample i and image sample j, the symbol shown in Figure SMS_6 is the set of K-nearest-neighbor image samples of image sample i, and the symbol shown in Figure SMS_7 is the interval distance from image sample i to adjacent image sample j.
Specifically, the separated image samples are determined as follows:
Figure SMS_8
Figure SMS_9
Figure SMS_10
where o is a separated image sample, the symbol shown in Figure SMS_11 is the KNN distance of image sample i, the symbol shown in Figure SMS_12 is the defined threshold, N is the total number of image samples, and the symbol shown in Figure SMS_13 is the Euclidean distance between image sample i and image sample j.
Specifically, the non-separated image samples other than the group centers are assigned to their groups according to the first allocation strategy, which is as follows:
step S11, selecting an unvisited sample point ci from the group center set CI as the group center of a new group, and marking ci as visited;
step S12, merging the samples in the K-nearest-neighbor set KNN(ci) of point ci into the group of ci, initializing a queue Vq, and placing the samples in KNN(ci) into the queue Vq in order;
step S13, removing the head sample q from the queue Vq and, for each sample r in the set KNN(q), if r satisfies the conditions i) r is unassigned, ii) r is not a separated sample, and iii) the condition shown in Figure SMS_14, assigning r to the group to which q belongs and appending r to the tail of the queue Vq;
step S14, if the queue Vq is not empty, returning to step S13;
step S15, if CI still contains unvisited sample points, returning to step S11; otherwise ending the first allocation strategy.
Specifically, the separated image samples, together with the non-separated image samples left unassigned by the first allocation strategy, are assigned to their groups according to the second allocation strategy, which is as follows:
S21, for each unassigned image sample i, counting the number M_b(i) of samples in its neighbor set KNN(i) that belong to group b (b = 1, 2, 3, …, Figure SMS_15), obtaining a 1 × (Figure SMS_16) vector; the vectors of all unassigned samples form an nr × (Figure SMS_17) recognition matrix s, where s(i, j) = M_j(i), j = 1, 2, 3, …, Figure SMS_18, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;
S22, obtaining an image sample p from the matrix s and assigning it to the corresponding group, where sample p is obtained as follows: deriving from the matrix s the group k to which sample i belongs, k = 1, 2, 3, …, Figure SMS_19, by letting N_k(i) = max{N_b(i), b = 1, 2, 3, …, Figure SMS_20}; composing the vector Vmax from the N_k(i) of each sample; and letting p be the sample corresponding to the largest component of Vmax, i.e. N_k(p) = max{N_k(i), i = 1, 2, 3, …, nr}; sample p is assigned as follows:
a) if N_k(p) = K, all samples whose maximum value in the matrix s is K are assigned to the group corresponding to that maximum value;
b) if 0 < N_k(p) < K, the sample whose maximum value equals N_k(p) is selected from the recognition matrix s and assigned to its group, and the assigned sample is marked as p;
c) otherwise, the second allocation strategy ends;
S23, updating the recognition matrix s: for each unassigned sample q in KNN(p), setting N_k(q) = N_k(q) + 1, and setting the vector N(p) corresponding to sample p in the recognition matrix s to 0;
S24, if no unassigned samples remain, ending the second allocation strategy; otherwise returning to S22.
In another aspect of the invention, a multi-party collaborative image data analysis system comprises:
an image preprocessing unit: acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;
limit density and separation distance calculation unit: calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;
group center set acquisition unit: forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;
a separation image sample calculation unit: calculating a separation image sample;
grouping unit: non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.
Specifically, in the limit density and interval distance calculation unit, the limit density of image sample i and the interval distance from image sample i to adjacent image sample j are calculated as follows:
Figure SMS_21
Figure SMS_22
For the image sample i with the greatest limit density, the interval distance is:
Figure SMS_23
where the symbol shown in Figure SMS_24 is the limit density of image sample i, the symbol shown in Figure SMS_25 is the Euclidean distance between image sample i and image sample j, the symbol shown in Figure SMS_26 is the set of K-nearest-neighbor image samples of image sample i, and the symbol shown in Figure SMS_27 is the interval distance from image sample i to adjacent image sample j.
Specifically, in the separated image sample calculation unit, the separated image samples are determined as follows:
Figure SMS_28
Figure SMS_29
Figure SMS_30
where o is a separated image sample, the symbol shown in Figure SMS_31 is the KNN distance of image sample i, the symbol shown in Figure SMS_32 is the defined threshold, N is the total number of image samples, and the symbol shown in Figure SMS_33 is the Euclidean distance between image sample i and image sample j.
Specifically, in the grouping unit, the non-separated image samples other than the group centers are assigned to their groups according to the first allocation strategy, which is as follows:
step S11, selecting an unvisited sample point ci from the group center set CI as the group center of a new group, and marking ci as visited;
step S12, merging the samples in the K-nearest-neighbor set KNN(ci) of point ci into the group of ci, initializing a queue Vq, and placing the samples in KNN(ci) into the queue Vq in order;
step S13, removing the head sample q from the queue Vq and, for each sample r in the set KNN(q), if r satisfies the conditions i) r is unassigned, ii) r is not a separated sample, and iii) the condition shown in Figure SMS_34, assigning r to the group to which q belongs and appending r to the tail of the queue Vq;
step S14, if the queue Vq is not empty, returning to step S13;
step S15, if CI still contains unvisited sample points, returning to step S11; otherwise ending the first allocation strategy.
Specifically, in the grouping unit, the separated image samples, together with the non-separated image samples left unassigned by the first allocation strategy, are assigned to their groups according to the second allocation strategy, which is as follows:
S21, for each unassigned image sample i, counting the number M_b(i) of samples in its neighbor set KNN(i) that belong to group b (b = 1, 2, 3, …, Figure SMS_35), obtaining a 1 × (Figure SMS_36) vector; the vectors of all unassigned samples form an nr × (Figure SMS_37) recognition matrix s, where s(i, j) = M_j(i), j = 1, 2, 3, …, Figure SMS_38, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;
S22, obtaining an image sample p from the matrix s and assigning it to the corresponding group, where sample p is obtained as follows: deriving from the matrix s the group k to which sample i belongs, k = 1, 2, 3, …, Figure SMS_39, by letting N_k(i) = max{N_b(i), b = 1, 2, 3, …, Figure SMS_40}; composing the vector Vmax from the N_k(i) of each sample; and letting p be the sample corresponding to the largest component of Vmax, i.e. N_k(p) = max{N_k(i), i = 1, 2, 3, …, nr}; sample p is assigned as follows:
a) if N_k(p) = K, all samples whose maximum value in the matrix s is K are assigned to the group corresponding to that maximum value;
b) if 0 < N_k(p) < K, the sample whose maximum value equals N_k(p) is selected from the recognition matrix s and assigned to its group, and the assigned sample is marked as p;
c) otherwise, the second allocation strategy ends;
S23, updating the recognition matrix s: for each unassigned sample q in KNN(p), setting N_k(q) = N_k(q) + 1, and setting the vector N(p) corresponding to sample p in the recognition matrix s to 0;
S24, if no unassigned samples remain, ending the second allocation strategy; otherwise returning to S22.
As can be seen from the above description of the present invention, compared with the prior art, the present invention has the following advantages:
The invention provides a multi-party collaborative image data analysis method, which comprises: acquiring an image sample set and performing missing value processing on the image samples in the set; calculating the limit density of each image sample i and the interval distance from image sample i to an adjacent image sample j, where adjacent image sample j is the sample nearest to image sample i among those whose limit density is greater than that of image sample i; forming a decision graph from the limit density and interval distance of each image sample, and taking the image samples for which both values are large as group centers to obtain a group center set; determining the separated image samples; assigning the non-separated image samples other than the group centers to their groups according to a first allocation strategy, and assigning the separated image samples, together with the non-separated image samples left unassigned by the first allocation strategy, to their groups according to a second allocation strategy. The method can group large image data sets, and the assignment is accurate, simple and fast.
Drawings
FIG. 1 is a flowchart of a multi-party collaborative image data analysis method according to an embodiment of the present invention;
FIG. 2 is a diagram of a multi-party collaborative image data analysis method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an embodiment of a computer readable storage medium according to an embodiment of the present invention.
Detailed Description
The invention is further described below by means of specific embodiments.
The invention provides a multi-party collaborative image data analysis method that can group large image data sets, with assignment that is accurate, simple and fast.
FIG. 1 is a schematic illustration of a multi-party collaborative image data analysis method in accordance with aspects of the present invention; the method specifically comprises the following steps:
s101: acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;
Image data preprocessing comprises handling missing data and normalizing the data: missing values are replaced by the mean, and the data are normalized by the min-max method. Normalization eliminates the influence of differing dimensions (scales) on the experimental results and reduces the running-time cost of the algorithm.
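The preprocessing described above can be sketched as follows. This is a minimal illustration, assuming each image sample has already been converted into a numeric feature vector and that missing entries are marked as `None` or NaN; the function name `preprocess` and this data layout are hypothetical and do not come from the patent.

```python
import math

def preprocess(rows):
    """Mean-impute missing values, then min-max scale each column to [0, 1]."""
    out = [list(r) for r in rows]
    for c in range(len(rows[0])):
        # Column mean over the values that are actually present.
        present = [r[c] for r in rows
                   if r[c] is not None and not math.isnan(r[c])]
        mean = sum(present) / len(present) if present else 0.0
        for r in out:
            if r[c] is None or math.isnan(r[c]):
                r[c] = mean                      # replace missing value by the mean
        lo = min(r[c] for r in out)
        hi = max(r[c] for r in out)
        span = hi - lo
        for r in out:
            # Min-max normalisation; a constant column is mapped to 0.0.
            r[c] = (r[c] - lo) / span if span > 0 else 0.0
    return out

features = [[1.0, None], [3.0, 10.0], [None, 20.0]]
scaled = preprocess(features)
```

Imputing before scaling keeps the imputed value inside the observed range, so every column still lands in [0, 1].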
S102: calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;
Specifically, the limit density of image sample i and the interval distance from image sample i to adjacent image sample j are calculated as follows:
Figure SMS_41
Figure SMS_42
For the image sample i with the greatest limit density, the interval distance is:
Figure SMS_43
where the symbol shown in Figure SMS_44 is the limit density of image sample i, the symbol shown in Figure SMS_45 is the Euclidean distance between image sample i and image sample j, the symbol shown in Figure SMS_46 is the set of K-nearest-neighbor image samples of image sample i, and the symbol shown in Figure SMS_47 is the interval distance from image sample i to adjacent image sample j.
In this embodiment, the limit density of sample point i is estimated over the K nearest neighbors of sample i rather than over the entire data set. The resulting density therefore depends only on the K neighbor samples and reflects the local information around sample point i, and, setting aside the cost of searching for the K neighbors, the limit density calculation is less time-consuming.
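A rough sketch of a K-neighbor density and the interval distance is given below. The patent's own formulas are not recoverable from the text, so two assumptions are made and should be read as such: the density is taken as the sum of exp(-d_ij) over the K nearest neighbors (a form used in some KNN-based density-peak methods), and the sample with the greatest density receives its largest distance to any other sample. All function names are hypothetical.

```python
import math

def knn_indices(points, k):
    """Indices of the k nearest other points (Euclidean) for every point."""
    result = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j)
                        for j, q in enumerate(points) if j != i)
        result.append([j for _, j in ranked[:k]])
    return result

def limit_density(points, knn):
    # ASSUMPTION: rho_i = sum over the K neighbours of exp(-d_ij);
    # the patent's actual density formula is not shown in the text.
    return [sum(math.exp(-math.dist(points[i], points[j])) for j in nbrs)
            for i, nbrs in enumerate(knn)]

def interval_distance(points, rho):
    n = len(points)
    delta = []
    for i in range(n):
        # Distances to every sample whose density is strictly greater.
        higher = [math.dist(points[i], points[j])
                  for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            # ASSUMPTION: the densest sample gets its largest distance
            # to any other sample.
            delta.append(max(math.dist(points[i], points[j])
                             for j in range(n) if j != i))
    return delta

points = [(0, 0), (0, 1), (1, 0), (10, 10)]
knn = knn_indices(points, 2)
rho = limit_density(points, knn)
delta = interval_distance(points, rho)
```

Because the density only looks at K neighbors, its cost per sample is independent of the data set size once the neighbor lists are available; the brute-force neighbor search here is only for illustration.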
S103: forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;
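The text does not say how "larger" is judged on the decision graph, which is usually inspected visually. One possible automated stand-in, shown here purely as an assumption, ranks samples by the product of limit density and interval distance and keeps the top `n_centers`; both the product rule and the `n_centers` parameter are hypothetical.

```python
def pick_group_centers(rho, delta, n_centers):
    """Return indices of the n_centers samples with the largest rho * delta."""
    # ASSUMPTION: the product rho * delta is used as a single ranking score;
    # the patent only says both values should be large.
    gamma = [r * d for r, d in zip(rho, delta)]
    order = sorted(range(len(rho)), key=lambda i: gamma[i], reverse=True)
    return sorted(order[:n_centers])

rho = [5, 4, 1, 5, 3]
delta = [9, 1, 8, 7, 1]
centers = pick_group_centers(rho, delta, 2)   # samples 0 and 3 score highest
```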
s104: calculating a separation image sample;
Separated image sample points strongly affect grouping and can cause two groups to be merged into one; therefore, the separated sample points are removed before the image samples are assigned to their groups.
Specifically, the separated image samples are determined as follows:
Figure SMS_48
Figure SMS_49
Figure SMS_50
where o is a separated image sample, the symbol shown in Figure SMS_51 is the KNN distance of image sample i, the symbol shown in Figure SMS_52 is the defined threshold, N is the total number of image samples, and the symbol shown in Figure SMS_53 is the Euclidean distance between image sample i and image sample j.
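Since the three formulas are missing, the following is only a guess at their shape based on the symbol descriptions: the KNN distance of a sample is assumed to be its distance to the farthest of its K nearest neighbors, the threshold is assumed to be the mean KNN distance over all N samples, and a sample is treated as separated when its KNN distance exceeds that threshold. The function name is hypothetical.

```python
import math

def separated_samples(points, k):
    """Return (separated indices, KNN distances, threshold)."""
    kdist = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # ASSUMPTION: KNN distance = distance to the k-th nearest neighbour.
        kdist.append(d[k - 1])
    # ASSUMPTION: threshold = mean KNN distance over all N samples.
    threshold = sum(kdist) / len(points)
    separated = [i for i, v in enumerate(kdist) if v > threshold]
    return separated, kdist, threshold

points = [(0, 0), (0, 1), (1, 0), (10, 10)]
separated, kdist, threshold = separated_samples(points, 2)
```

In this toy set only the far-away point (10, 10) has a KNN distance above the mean, so it is the only sample flagged.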
S105: non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.
Specifically, the non-separated image samples other than the group centers are assigned to their groups according to the first allocation strategy, which is as follows:
step S11, selecting an unvisited sample point ci from the group center set CI as the group center of a new group, and marking ci as visited;
step S12, merging the samples in the K-nearest-neighbor set KNN(ci) of point ci into the group of ci, initializing a queue Vq, and placing the samples in KNN(ci) into the queue Vq in order;
step S13, removing the head sample q from the queue Vq and, for each sample r in the set KNN(q), if r satisfies the conditions i) r is unassigned, ii) r is not a separated sample, and iii) the condition shown in Figure SMS_54, assigning r to the group to which q belongs and appending r to the tail of the queue Vq;
step S14, if the queue Vq is not empty, returning to step S13;
step S15, if CI still contains unvisited sample points, returning to step S11; otherwise ending the first allocation strategy.
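Steps S11 to S15 amount to a breadth-first expansion from each group center, which might look like the sketch below. Condition iii) is not recoverable from the text, so it is replaced here by an assumed test: r joins the group only if its distance to q does not exceed the mean distance from r to its own K neighbors. Labeling every center before expansion and skipping already-assigned or separated samples in S12 are also design choices of this sketch, not statements from the patent.

```python
import math
from collections import deque

def first_allocation(points, knn, centers, separated):
    """Return a group label per sample; -1 means still unassigned."""
    labels = [-1] * len(points)
    for g, ci in enumerate(centers):
        labels[ci] = g                      # seed every centre with its own group
    for g, ci in enumerate(centers):        # S11 / S15: visit each centre once
        queue = deque()
        for s in knn[ci]:                   # S12: absorb the centre's neighbours
            if labels[s] == -1 and s not in separated:
                labels[s] = g
                queue.append(s)
        while queue:                        # S14: repeat until the queue is empty
            q = queue.popleft()             # S13: take the head sample
            for r in knn[q]:
                if labels[r] != -1 or r in separated:
                    continue                # conditions i) and ii)
                mean_r = sum(math.dist(points[r], points[j])
                             for j in knn[r]) / len(knn[r])
                # ASSUMED stand-in for condition iii), which is lost in the text.
                if math.dist(points[q], points[r]) <= mean_r:
                    labels[r] = g
                    queue.append(r)
    return labels

points = [(0,), (1,), (2,), (3,), (10,), (11,), (12,), (50,)]
knn = [[1, 2], [0, 2], [1, 3], [2, 1], [5, 6], [4, 6], [5, 4], [6, 5]]
labels = first_allocation(points, knn, centers=[1, 5], separated={7})
```

Sample 3 is reached only through the queue (via sample 2), which is the part of the strategy that lets a group grow beyond the immediate neighbors of its center; the separated sample 7 is left for the second strategy.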
Specifically, the separated image samples, together with the non-separated image samples left unassigned by the first allocation strategy, are assigned to their groups according to the second allocation strategy, which is as follows:
S21, for each unassigned image sample i, counting the number M_b(i) of samples in its neighbor set KNN(i) that belong to group b (b = 1, 2, 3, …, Figure SMS_55), obtaining a 1 × (Figure SMS_56) vector; the vectors of all unassigned samples form an nr × (Figure SMS_57) recognition matrix s, where s(i, j) = M_j(i), j = 1, 2, 3, …, Figure SMS_58, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;
S22, obtaining an image sample p from the matrix s and assigning it to the corresponding group, where sample p is obtained as follows: deriving from the matrix s the group k to which sample i belongs, k = 1, 2, 3, …, Figure SMS_59, by letting N_k(i) = max{N_b(i), b = 1, 2, 3, …, Figure SMS_60}; composing the vector Vmax from the N_k(i) of each sample; and letting p be the sample corresponding to the largest component of Vmax, i.e. N_k(p) = max{N_k(i), i = 1, 2, 3, …, nr}; sample p is assigned as follows:
a) if N_k(p) = K, all samples whose maximum value in the matrix s is K are assigned to the group corresponding to that maximum value;
b) if 0 < N_k(p) < K, the sample whose maximum value equals N_k(p) is selected from the recognition matrix s and assigned to its group, and the assigned sample is marked as p;
c) otherwise, the second allocation strategy ends;
S23, updating the recognition matrix s: for each unassigned sample q in KNN(p), setting N_k(q) = N_k(q) + 1, and setting the vector N(p) corresponding to sample p in the recognition matrix s to 0;
S24, if no unassigned samples remain, ending the second allocation strategy; otherwise returning to S22.
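Steps S21 to S24 describe a neighbor-vote loop, which could be sketched as below. The recognition matrix is held as a dictionary from unassigned sample index to its per-group vote counts, and removing a row stands in for setting it to 0; both are choices of this sketch. Case a) is tested with `>=` rather than `=` only as a guard, because the S23 update as worded can push a count past K. The function name and the `-1` "unassigned" label are hypothetical.

```python
def second_allocation(labels, knn, n_groups, k):
    """Assign leftover samples by neighbour votes; -1 marks unassigned."""
    labels = list(labels)
    # S21: one row of per-group neighbour counts for every unassigned sample.
    s = {i: [sum(1 for j in knn[i] if labels[j] == b) for b in range(n_groups)]
         for i, g in enumerate(labels) if g == -1}
    while s:                                          # S24: loop until none left
        best = max(max(row) for row in s.values())    # S22: largest entry of Vmax
        if best == 0:
            break                                     # case c): no votes at all
        if best >= k:                                 # case a): all full-vote samples
            chosen = [i for i, row in s.items() if max(row) >= k]
        else:                                         # case b): one best sample
            chosen = [next(i for i, row in s.items() if max(row) == best)]
        for p in chosen:
            row = s.pop(p)                            # row of p leaves the matrix
            g = row.index(max(row))
            labels[p] = g
            for q in knn[p]:                          # S23: update votes of
                if q in s:                            # unassigned neighbours of p
                    s[q][g] += 1
    return labels

knn = [[1, 2], [0, 2], [1, 3], [2, 1]]
result = second_allocation([0, -1, -1, 1], knn, n_groups=2, k=2)
```

In this small run sample 1 is assigned first with a single vote for group 0, and that assignment then adds a vote to sample 2, showing how S23 lets earlier decisions influence later ones.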
In another aspect, an embodiment of the invention provides a multi-party collaborative image data analysis system, which comprises the following units:
an image preprocessing unit 201: acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;
Image data preprocessing comprises handling missing data and normalizing the data: missing values are replaced by the mean, and the data are normalized by the min-max method. Normalization eliminates the influence of differing dimensions (scales) on the experimental results and reduces the running-time cost of the algorithm.
Limit density and separation distance calculation unit 202: calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is nearest to the image sample i;
Specifically, the limit density of image sample i and the interval distance from image sample i to adjacent image sample j are calculated as follows:
Figure SMS_61
Figure SMS_62
For the image sample i with the greatest limit density, the interval distance is:
Figure SMS_63
where the symbol shown in Figure SMS_64 is the limit density of image sample i, the symbol shown in Figure SMS_65 is the Euclidean distance between image sample i and image sample j, the symbol shown in Figure SMS_66 is the set of K-nearest-neighbor image samples of image sample i, and the symbol shown in Figure SMS_67 is the interval distance from image sample i to adjacent image sample j.
In this embodiment, the limit density of sample point i is estimated over the K nearest neighbors of sample i rather than over the entire data set. The resulting density therefore depends only on the K neighbor samples and reflects the local information around sample point i, and, setting aside the cost of searching for the K neighbors, the limit density calculation is less time-consuming.
Group center set acquisition unit 203: forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image sample with larger limit density and interval distance as a group center to obtain a group center set;
a separation image sample calculation unit 204: calculating a separation image sample;
Separated image sample points strongly affect grouping and can cause two groups to be merged into one; therefore, the separated sample points are removed before the image samples are assigned to their groups.
Specifically, the separated image samples are determined as follows:
Figure SMS_68
Figure SMS_69
Figure SMS_70
where o is a separated image sample, the symbol shown in Figure SMS_71 is the KNN distance of image sample i, the symbol shown in Figure SMS_72 is the defined threshold, N is the total number of image samples, and the symbol shown in Figure SMS_73 is the Euclidean distance between image sample i and image sample j.
Grouping unit 205: non-separated image samples except the group center are distributed to corresponding groups according to a first distribution strategy, and non-separated image samples which are not distributed by the first distribution strategy and separated image samples are distributed to corresponding groups according to a second distribution strategy.
Specifically, in the grouping unit, the non-separated image samples other than the group centers are assigned to their groups according to the first allocation strategy, which is as follows:
step S11, selecting an unvisited sample point ci from the group center set CI as the group center of a new group, and marking ci as visited;
step S12, merging the samples in the K-nearest-neighbor set KNN(ci) of point ci into the group of ci, initializing a queue Vq, and placing the samples in KNN(ci) into the queue Vq in order;
step S13, removing the head sample q from the queue Vq and, for each sample r in the set KNN(q), if r satisfies the conditions i) r is unassigned, ii) r is not a separated sample, and iii) the condition shown in Figure SMS_74, assigning r to the group to which q belongs and appending r to the tail of the queue Vq;
step S14, if the queue Vq is not empty, returning to step S13;
step S15, if CI still contains unvisited sample points, returning to step S11; otherwise ending the first allocation strategy.
Specifically, in the grouping unit, the separated image samples, together with the non-separated image samples left unassigned by the first allocation strategy, are assigned to their groups according to the second allocation strategy, which is as follows:
S21, for each unassigned image sample i, counting the number M_b(i) of samples in its neighbor set KNN(i) that belong to group b (b = 1, 2, 3, …, Figure SMS_75), obtaining a 1 × (Figure SMS_76) vector; the vectors of all unassigned samples form an nr × (Figure SMS_77) recognition matrix s, where s(i, j) = M_j(i), j = 1, 2, 3, …, Figure SMS_78, i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;
S22, obtaining an image sample p from the matrix s and assigning it to the corresponding group, where sample p is obtained as follows: deriving from the matrix s the group k to which sample i belongs, k = 1, 2, 3, …, Figure SMS_79, by letting N_k(i) = max{N_b(i), b = 1, 2, 3, …, Figure SMS_80}; composing the vector Vmax from the N_k(i) of each sample; and letting p be the sample corresponding to the largest component of Vmax, i.e. N_k(p) = max{N_k(i), i = 1, 2, 3, …, nr}; sample p is assigned as follows:
a) if N_k(p) = K, all samples whose maximum value in the matrix s is K are assigned to the group corresponding to that maximum value;
b) if 0 < N_k(p) < K, the sample whose maximum value equals N_k(p) is selected from the recognition matrix s and assigned to its group, and the assigned sample is marked as p;
c) otherwise, the second allocation strategy ends;
S23, updating the recognition matrix s: for each unassigned sample q in KNN(p), setting N_k(q) = N_k(q) + 1, and setting the vector N(p) corresponding to sample p in the recognition matrix s to 0;
S24, if no unassigned samples remain, ending the second allocation strategy; otherwise returning to S22.
As shown in fig. 3, an electronic device 300 is provided in an embodiment of the present invention, which includes a memory 310, a processor 320, and a computer program 311 stored in the memory 310 and capable of running on the processor 320, where the processor 320 implements a multi-party collaborative image data analysis method provided in the embodiment of the present invention when executing the computer program 311.
In a specific implementation, when the processor 320 executes the computer program 311, any implementation manner of the embodiment corresponding to fig. 1 may be implemented.
The electronic device described in this embodiment is a device used to implement the data processing apparatus of the embodiment of the present invention. Based on the method described in the embodiments of the present invention, those skilled in the art can understand the specific implementation of this electronic device and its variations, so how the electronic device implements the method is not described in detail here. Any device used by those skilled in the art to implement the method of the embodiments of the present invention falls within the intended scope of protection of the present invention.
Fig. 4 is a schematic diagram of an embodiment of a computer-readable storage medium according to an embodiment of the present invention.
As shown in fig. 4, this embodiment provides a computer-readable storage medium 400 on which a computer program 411 is stored. When executed by a processor, the computer program 411 implements the multi-party collaborative image data analysis method provided by the embodiment of the present invention.
In a specific implementation, the computer program 411, when executed by a processor, may realize any implementation of the embodiment corresponding to fig. 1.
In the foregoing embodiments, each embodiment is described with its own emphasis. For portions of one embodiment that are not described in detail, reference may be made to the related descriptions of the other embodiments.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The invention provides a multi-party collaborative image data analysis method, comprising: acquiring an image sample set and performing missing value processing on the image samples in the set; calculating the limit density of each image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, where the adjacent image sample j is the sample nearest to the image sample i among those whose limit density is larger than that of the image sample i; forming a decision graph from the limit density and interval distance of each image sample, and taking the image samples for which both the limit density and the interval distance are large as group centers to obtain a group center set; calculating the separated image samples; assigning the non-separated image samples other than the group centers to their corresponding groups according to a first allocation strategy; and assigning the non-separated image samples left unassigned by the first allocation strategy, together with the separated image samples, to their corresponding groups according to a second allocation strategy. The method provided by the invention can group large image data sets, and its allocation is accurate, simple, and fast.
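The density, interval distance, and group-center steps summarized above can be sketched in Python as follows. This is an illustration under stated assumptions, not the formulas of the embodiment, which are given only as figures: the density is assumed here to be the sum of exp(-d) over the K nearest neighbours, the densest sample is assumed to receive its largest distance as its interval distance, and group centers are assumed to be the samples with the largest product of density and interval distance rather than being read off a decision graph. All function names and the toy data are hypothetical.

```python
import numpy as np


def knn_indices(dist, k):
    """Indices of the k nearest neighbours of every sample, excluding itself."""
    masked = dist.copy()
    np.fill_diagonal(masked, np.inf)  # a sample is never its own neighbour
    return np.argsort(masked, axis=1)[:, :k]


def limit_density(dist, knn):
    """ASSUMED density: sum of exp(-d_ij) over the K nearest neighbours."""
    neighbour_dist = np.take_along_axis(dist, knn, axis=1)
    return np.exp(-neighbour_dist).sum(axis=1)


def interval_distance(dist, rho):
    """Distance from each sample to its nearest sample of higher density."""
    delta = np.empty(len(rho))
    for i in range(len(rho)):
        higher = np.flatnonzero(rho > rho[i])
        if higher.size:
            delta[i] = dist[i, higher].min()
        else:
            # ASSUMPTION: the densest sample receives its largest distance.
            delta[i] = dist[i].max()
    return delta


def pick_centers(rho, delta, n_groups):
    """ASSUMED selection rule: the n_groups largest products rho * delta."""
    return np.argsort(-(rho * delta))[:n_groups]


# Hypothetical feature vectors: two small blobs, each with one central sample
# (index 4 in the first blob, index 9 in the second).
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5],
                   [10.0, 10.0], [10.0, 11.1], [11.1, 10.0], [11.1, 11.1],
                   [10.55, 10.55]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
knn = knn_indices(dist, k=3)
rho = limit_density(dist, knn)
delta = interval_distance(dist, rho)
centers = pick_centers(rho, delta, n_groups=2)
print(sorted(centers.tolist()))  # [4, 9]
```

In this toy data the two central samples have both a high density and a large interval distance, so they stand apart from the other samples in the decision graph and are selected as group centers.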
The foregoing is merely illustrative of specific embodiments of the present invention, but the design concept of the present invention is not limited thereto, and any insubstantial modification of the present invention by using the design concept shall fall within the scope of the present invention.

Claims (10)

1. A multi-party collaborative image data analysis method, characterized by comprising the following steps:
acquiring an image sample set, and performing missing value processing on the image samples in the image sample set;
calculating the limit density of an image sample i, and calculating the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is the nearest such sample to the image sample i;
forming a decision graph according to the limit density and the interval distance of each image sample, and taking the image samples with larger limit density and interval distance as group centers to obtain a group center set;
calculating separated image samples;
assigning the non-separated image samples other than the group centers to corresponding groups according to a first allocation strategy, and assigning the non-separated image samples not assigned by the first allocation strategy and the separated image samples to corresponding groups according to a second allocation strategy.
2. The multi-party collaborative image data analysis method according to claim 1, wherein the limit density of an image sample i and the separation distance from the image sample i to an adjacent image sample j are calculated as follows:
[Figure QLYQS_1]
[Figure QLYQS_2]
for the image sample i with the greatest limit density,
[Figure QLYQS_3]
wherein [Figure QLYQS_4] is the limit density of the image sample i, [Figure QLYQS_5] is the Euclidean distance between image sample i and image sample j, [Figure QLYQS_6] is the set of K nearest-neighbor image samples of image sample i, and [Figure QLYQS_7] is the separation distance from image sample i to the adjacent image sample j.
3. The multi-party collaborative image data analysis method according to claim 1, wherein the separated image samples are calculated as follows:
[Figure QLYQS_8]
[Figure QLYQS_9]
[Figure QLYQS_10]
wherein o is a separated image sample, [Figure QLYQS_11] is the KNN distance of image sample i, [Figure QLYQS_12] is a defined threshold value, N is the total number of image samples, and [Figure QLYQS_13] is the Euclidean distance between image sample i and image sample j.
4. The multi-party collaborative image data analysis method according to claim 1, wherein the non-separated image samples other than the group centers are assigned to corresponding groups according to a first allocation strategy, the first allocation strategy being:
step S11, selecting an unvisited sample point ci from the group center set CI as the group center of a new group, and marking ci as visited;
step S12, merging the samples in the K-nearest-neighbor set KNN(ci) of ci into the group where ci is located, initializing a queue Vq, and putting the samples in KNN(ci) into the queue Vq in order;
step S13, removing the head sample q from the queue Vq and, for each sample r in the set KNN(q), if the following conditions are satisfied: i) r is unassigned, ii) r is not a separated sample, iii) [Figure QLYQS_14], assigning r to the group to which q belongs and appending r to the tail of the queue Vq;
step S14, if the queue Vq is not empty, returning to step S13;
step S15, if CI still contains unvisited sample points, returning to step S11; otherwise ending the first allocation strategy.
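By way of illustration only, steps S11 to S15 amount to a breadth-first expansion from each group center, which can be sketched in Python as follows. Condition iii) is given only as a figure, so the sketch assumes it to be "the distance from q to r does not exceed the mean distance from q to its K nearest neighbours"; this is an assumption, as are the function name first_allocation, the -1 label for unassigned samples, the pre-labelling of all centers, and the guard that keeps S12 from overwriting samples that are already assigned.

```python
from collections import deque

import numpy as np


def first_allocation(dist, knn, centers, is_separated):
    """Grow each group outwards from its center through K-nearest neighbours.

    dist         : (n, n) pairwise distance matrix
    knn          : (n, K) integer array of neighbour indices
    centers      : iterable of center indices (the group center set CI)
    is_separated : (n,) boolean array, True for separated samples
    """
    labels = np.full(dist.shape[0], -1, dtype=int)
    for g, ci in enumerate(centers):
        labels[ci] = g                    # every center seeds its own group
    for g, ci in enumerate(centers):      # S11 / S15: visit each center once
        queue = deque()
        for s in knn[ci]:                 # S12: absorb the center's neighbours
            if labels[s] == -1:           # guard added for this sketch
                labels[s] = g
                queue.append(s)
        while queue:                      # S14: repeat until the queue is empty
            q = queue.popleft()           # S13: take the head of the queue
            mean_d = dist[q, knn[q]].mean()
            for r in knn[q]:
                # i) unassigned, ii) not separated, iii) ASSUMED distance test
                if labels[r] == -1 and not is_separated[r] and dist[q, r] <= mean_d:
                    labels[r] = labels[q]
                    queue.append(r)
    return labels


# Hypothetical 1-D data: two runs of four samples plus one distant sample.
x = np.array([0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0, 30.0])
dist = np.abs(x[:, None] - x[None, :])
masked = dist.copy()
np.fill_diagonal(masked, np.inf)          # exclude each sample from its own KNN
knn = np.argsort(masked, axis=1)[:, :2]
is_separated = np.array([False] * 8 + [True])
labels = first_allocation(dist, knn, centers=[1, 5], is_separated=is_separated)
print(labels.tolist())  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The distant sample stays unassigned here, which is the case that the second allocation strategy is meant to handle.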
5. The multi-party collaborative image data analysis method according to claim 4, wherein the non-separated image samples not assigned by the first allocation strategy and the separated image samples are assigned to corresponding groups according to a second allocation strategy, the second allocation strategy being:
S21, for each unassigned image sample i, counting the number of samples M_b(i) among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …, [Figure QLYQS_15]), obtaining a 1 × [Figure QLYQS_16] vector; the vectors of all unassigned samples form an nr × [Figure QLYQS_17] matrix s, where s(i, j) = M_j(i), j = 1, 2, 3, …, [Figure QLYQS_18], i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;
S22, acquiring an image sample p from the matrix s and assigning it to the corresponding group, wherein the sample p is obtained as follows: deriving from the matrix s the group k to which each sample i belongs, k = 1, 2, 3, …, [Figure QLYQS_19], such that N_k(i) = max{N_b(i), b = 1, 2, 3, …, [Figure QLYQS_20]}; the N_k(i) of all samples form a vector Vmax; letting p be the sample corresponding to the largest component of Vmax, i.e. N_k(p) = max{N_k(i), i = 1, 2, 3, …, nr}; the sample p is assigned as follows:
a) if N_k(p) = K, all samples in the matrix s whose maximum value is K are assigned to the group corresponding to that maximum value;
b) if 0 < N_k(p) < K, the samples whose maximum value equals N_k(p) are selected from the identification matrix s, group allocation is performed, and the allocated samples are marked as p;
c) otherwise, ending the second allocation strategy;
S23, updating the identification matrix s: for each unassigned sample q in KNN(p), setting N_k(q) = N_k(q) + 1, and setting the vector N(p) corresponding to sample p in the identification matrix s to 0;
S24, if no unassigned sample remains, ending the second allocation strategy; otherwise returning to S22.
6. A multi-party collaborative image data analysis system, comprising:
an image preprocessing unit, configured to acquire an image sample set and perform missing value processing on the image samples in the image sample set;
a limit density and separation distance calculation unit, configured to calculate the limit density of an image sample i and calculate the interval distance from the image sample i to an adjacent image sample j, wherein the limit density of the adjacent image sample j is larger than that of the image sample i, and the adjacent image sample j is the nearest such sample to the image sample i;
a group center set acquisition unit, configured to form a decision graph according to the limit density and the interval distance of each image sample, and take the image samples with larger limit density and interval distance as group centers to obtain a group center set;
a separated image sample calculation unit, configured to calculate separated image samples;
a grouping unit, configured to assign the non-separated image samples other than the group centers to corresponding groups according to a first allocation strategy, and assign the non-separated image samples not assigned by the first allocation strategy and the separated image samples to corresponding groups according to a second allocation strategy.
7. The multi-party collaborative image data analysis system according to claim 6, wherein the limit density and separation distance calculation unit calculates the limit density of an image sample i and the separation distance from the image sample i to an adjacent image sample j as follows:
[Figure QLYQS_21]
[Figure QLYQS_22]
for the image sample i with the greatest limit density,
[Figure QLYQS_23]
wherein [Figure QLYQS_24] is the limit density of the image sample i, [Figure QLYQS_25] is the Euclidean distance between image sample i and image sample j, [Figure QLYQS_26] is the set of K nearest-neighbor image samples of image sample i, and [Figure QLYQS_27] is the separation distance from image sample i to the adjacent image sample j.
8. The multi-party collaborative image data analysis system according to claim 6, wherein the separated image sample calculation unit calculates the separated image samples as follows:
[Figure QLYQS_28]
[Figure QLYQS_29]
[Figure QLYQS_30]
wherein o is a separated image sample, [Figure QLYQS_31] is the KNN distance of image sample i, [Figure QLYQS_32] is a defined threshold value, N is the total number of image samples, and [Figure QLYQS_33] is the Euclidean distance between image sample i and image sample j.
9. The multi-party collaborative image data analysis system according to claim 6, wherein the grouping unit assigns the non-separated image samples other than the group centers to corresponding groups according to a first allocation strategy, the first allocation strategy being:
step S11, selecting an unvisited sample point ci from the group center set CI as the group center of a new group, and marking ci as visited;
step S12, merging the samples in the K-nearest-neighbor set KNN(ci) of ci into the group where ci is located, initializing a queue Vq, and putting the samples in KNN(ci) into the queue Vq in order;
step S13, removing the head sample q from the queue Vq and, for each sample r in the set KNN(q), if the following conditions are satisfied: i) r is unassigned, ii) r is not a separated sample, iii) [Figure QLYQS_34], assigning r to the group to which q belongs and appending r to the tail of the queue Vq;
step S14, if the queue Vq is not empty, returning to step S13;
step S15, if CI still contains unvisited sample points, returning to step S11; otherwise ending the first allocation strategy.
10. The multi-party collaborative image data analysis system according to claim 9, wherein the grouping unit assigns the non-separated image samples not assigned by the first allocation strategy and the separated image samples to corresponding groups according to a second allocation strategy, the second allocation strategy being:
S21, for each unassigned image sample i, counting the number of samples M_b(i) among its neighbors KNN(i) that belong to group b (b = 1, 2, 3, …, [Figure QLYQS_35]), obtaining a 1 × [Figure QLYQS_36] vector; the vectors of all unassigned samples form an nr × [Figure QLYQS_37] matrix s, where s(i, j) = M_j(i), j = 1, 2, 3, …, [Figure QLYQS_38], i = 1, 2, 3, …, nr, and nr is the number of unassigned samples;
S22, acquiring an image sample p from the matrix s and assigning it to the corresponding group, wherein the sample p is obtained as follows: deriving from the matrix s the group k to which each sample i belongs, k = 1, 2, 3, …, [Figure QLYQS_39], such that N_k(i) = max{N_b(i), b = 1, 2, 3, …, [Figure QLYQS_40]}; the N_k(i) of all samples form a vector Vmax; letting p be the sample corresponding to the largest component of Vmax, i.e. N_k(p) = max{N_k(i), i = 1, 2, 3, …, nr}; the sample p is assigned as follows:
a) if N_k(p) = K, all samples in the matrix s whose maximum value is K are assigned to the group corresponding to that maximum value;
b) if 0 < N_k(p) < K, the samples whose maximum value equals N_k(p) are selected from the identification matrix s, group allocation is performed, and the allocated samples are marked as p;
c) otherwise, ending the second allocation strategy;
S23, updating the identification matrix s: for each unassigned sample q in KNN(p), setting N_k(q) = N_k(q) + 1, and setting the vector N(p) corresponding to sample p in the identification matrix s to 0;
S24, if no unassigned sample remains, ending the second allocation strategy; otherwise returning to S22.
CN202310658814.3A 2023-06-06 2023-06-06 Multi-party collaborative image data analysis method and system Pending CN116385811A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310658814.3A CN116385811A (en) 2023-06-06 2023-06-06 Multi-party collaborative image data analysis method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310658814.3A CN116385811A (en) 2023-06-06 2023-06-06 Multi-party collaborative image data analysis method and system

Publications (1)

Publication Number Publication Date
CN116385811A true CN116385811A (en) 2023-07-04

Family

ID=86981019

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310658814.3A Pending CN116385811A (en) 2023-06-06 2023-06-06 Multi-party collaborative image data analysis method and system

Country Status (1)

Country Link
CN (1) CN116385811A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170193078A1 (en) * 2016-01-06 2017-07-06 International Business Machines Corporation Hybrid method for anomaly Classification
CN110232414A (en) * 2019-06-11 2019-09-13 西北工业大学 Density peaks clustering algorithm based on k nearest neighbor and shared nearest neighbor
CN111079650A (en) * 2019-12-17 2020-04-28 国网江苏省电力工程咨询有限公司 Laser point cloud split conductor extraction method based on improved KNN-DPC algorithm
CN115964662A (en) * 2021-10-08 2023-04-14 哈尔滨工业大学(威海) Complex equipment parameter anomaly detection method based on improved density peak clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination