CN108596338A - Method and system for acquiring a neural network training set - Google Patents
- Publication number
- CN108596338A; CN201810438759.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The present invention provides a method and system for acquiring a neural network training set. The method includes: S100, obtaining the category information of an image set to be screened; S200, screening target images to obtain the target image data set corresponding to the image set to be screened, where a target image is image data whose similarity to a sample image reaches a preset similarity threshold, and the sample image is the template image corresponding to the category information of the image set to be screened; and S300, performing adversarial training on the target image data to obtain a sample data set. The invention reduces manual screening of the sample data set, improves screening efficiency and reliability, and improves the accuracy of the neural network.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a method and system for acquiring a neural network training set.
Background
In recent years, with the continued development of computer vision technology, and of neural network models in particular, the demand for the image data needed to train computer vision models, especially image data with accurate label information, has kept growing.
The convolutional neural network (Convolutional Neural Network, CNN) is a type of deep learning algorithm and an important analysis tool in fields such as image recognition; it has become a research hotspot in many scientific fields in recent years. Its advantage is that training requires no manually annotated features: the model discovers the features implicit in the input variables automatically. In addition, weight sharing greatly reduces the complexity of the model and the number of weights. These advantages are especially evident when the network input is an image: the original image can be fed to the network directly, avoiding the complex feature extraction and data reconstruction of traditional recognition algorithms.
The most convenient way to obtain the large image sample data set needed to train a neural network model is over the network, using a web crawler. Given a set of conditions, a web crawler extracts the information that satisfies them from the vast amount of information on the Internet.
The current practice is to crawl a large volume of data with a web crawler and then screen and clean it manually. The resulting problems are that the workload is enormous, the screening results are highly subjective and error-prone, and training a neural network on an image sample data set that contains errors produces incorrect classification results. With conventional web crawler technology, the quality of the crawled images tends to decline as the number of crawled images grows, so the crawled data contain considerable noise, which degrades the training results of subsequent image recognition based on the neural network model.
Summary of the invention
The object of the present invention is to provide a method and system for acquiring a neural network training set that reduce manual screening of the sample data set, improve screening efficiency and reliability, and improve the accuracy of the neural network.
The technical solution provided by the invention is as follows:
The present invention provides a method for acquiring a neural network training set, including the steps:
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample data set.
Further, the following steps precede step S100:
S010: measure the test accuracy of the image set of each category;
S020: determine whether the test accuracy of the current image set is below the preset performance threshold; if so, execute step S030; otherwise, execute step S040;
S030: mark the current image set as the image set to be screened;
S040: switch to the test accuracy of the next image set and repeat the judgment until all image sets have been judged.
Further, step S200 includes the steps:
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: determine, based on the sample image, whether each item of image data satisfies the preset crawler strategy, and mark the image data according to the result;
S240: collect all image data marked "yes" to obtain the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, preset keywords, and a preset similarity.
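As an illustration only, the digest and keyword parts of such a crawler strategy might be checked as in the Python sketch below. The `Candidate` structure, the function names, and the choice of MD5 are assumptions made for the example. The text also does not state whether a digest match accepts or rejects an image; the sketch assumes a match against a set of known digests identifies a duplicate to be skipped.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Candidate:
    """A crawled item: raw image bytes plus nearby text (hypothetical structure)."""
    data: bytes
    caption: str


def md5_digest(data: bytes) -> str:
    """Hex MD5 digest of the raw bytes."""
    return hashlib.md5(data).hexdigest()


def passes_strategy(item: Candidate, keywords: list[str], known_digests: set[str]) -> bool:
    """Mark an item "yes" when its caption contains a preset keyword and
    its digest is not already known (assumed here to mean a duplicate)."""
    text = item.caption.lower()
    has_keyword = any(k.lower() in text for k in keywords)
    is_duplicate = md5_digest(item.data) in known_digests
    return has_keyword and not is_duplicate


def screen(items: list[Candidate], keywords: list[str], known_digests: set[str]) -> list[Candidate]:
    """Keep only the items marked "yes", as in step S240."""
    return [it for it in items if passes_strategy(it, keywords, known_digests)]
```

The similarity part of the strategy needs image content rather than bytes or captions, so it is left out of this sketch.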
Further, step S300 includes the steps:
S310: apply pixel transformations to the target image data and take the transformed image data as the sample data set;
S320: apply geometric transformations to the target image data and take the transformed image data as the sample data set.
Further, the following steps follow step S300:
S400: expand the data set of the neural network model with the sample data set;
S500: measure the performance of the neural network model after expansion and determine whether it reaches the preset performance threshold; if so, end; otherwise, return to step S100.
The present invention also provides a system for acquiring a neural network training set, including:
an information acquisition module, which obtains the category information of the image set to be screened;
an image screening module, which screens target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a data set acquisition module, which performs adversarial training on the target image data to obtain a sample data set.
Further, the system also includes:
an accuracy detection module, which measures the test accuracy of the image set of each category;
an accuracy judgment module, which determines whether the test accuracy of the current image set is below the preset performance threshold;
an image marking module, which marks the current image set as the image set to be screened;
the accuracy judgment module also switches to the test accuracy of the next image set and repeats the judgment until all image sets have been judged.
Further, the image screening module includes:
an image data acquisition unit, which acquires image data belonging to the category information of the image set to be screened;
a sample image acquisition unit, which obtains the sample image corresponding to the category information of the image set to be screened;
an image data screening unit, which determines, based on the sample image, whether each item of image data satisfies the preset crawler strategy, marks the image data according to the result, and collects all image data marked "yes" to obtain the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, preset keywords, and a preset similarity.
Further, the data set acquisition module includes:
a pixel transformation data augmentation unit, which applies pixel transformations to the target image data and takes the transformed image data as the sample data set;
a geometric transformation data augmentation unit, which applies geometric transformations to the target image data and takes the transformed image data as the sample data set.
Further, the system also includes:
a data processing module, which expands the data set of the neural network model with the sample data set;
a performance judgment module, which measures the performance of the neural network model after expansion and determines whether it reaches the preset performance threshold;
the information acquisition module also reacquires the category information of the image set to be screened when the performance does not reach the preset performance threshold; the image screening module and the data set acquisition module then obtain a new target image data set and perform adversarial training on it to obtain a new sample data set.
The method and system for acquiring a neural network training set provided by the invention bring at least one of the following beneficial effects:
1) By using the category information of the image set to be screened, the invention augments the data set in a targeted way, which significantly reduces manual workload and the screening errors caused by the subjectivity of manual screening. Adversarial training on the target image data introduces random variables and improves the robustness of the neural network model.
2) After a neural network model is obtained, the invention takes the image data sets whose test accuracy is below the preset accuracy as the image sets to be screened and augments the data set in a targeted way based on their category information, avoiding indiscriminate, aimless, untargeted data set augmentation.
3) After a neural network model is obtained, the invention takes the image data sets whose test accuracy is below the preset accuracy as the image sets to be screened and augments the data set in a targeted way based on their category information. This increases the proportion that the image set to be screened for each such category occupies in the overall data set of the neural network model, so that network training and parameter tuning can be carried out quickly, efficiently, and purposefully.
4) The invention augments data using any one or a combination of pixel transformations and geometric transformations, which introduces random variables. This increases the amount of data without changing the image category and can improve the generalization ability of the neural network model. Applying adversarial processing to the samples introduces random variables into the data set and can improve the robustness, accuracy, and fault tolerance of the neural network.
Description of the drawings
The preferred embodiments are described below in a clear and easily understood manner with reference to the drawings, to further explain the above characteristics, technical features, advantages, and implementation of the method and system for acquiring a neural network training set.
Fig. 1 is the flow chart of the first embodiment of the invention;
Fig. 2 is the flow chart of the second embodiment of the invention;
Fig. 3 is the flow chart of the third embodiment of the invention;
Fig. 4 is the flow chart of the fourth embodiment of the invention;
Fig. 5 is the flow chart of the fifth embodiment of the invention;
Fig. 6 is the flow chart of the sixth embodiment of the invention;
Fig. 7 is the structural schematic diagram of the seventh embodiment of the invention;
Fig. 8 is the structural schematic diagram of the eighth embodiment of the invention.
Detailed description
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, specific embodiments of the invention are described below with reference to the drawings. Evidently, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can derive other drawings and other embodiments from them without creative effort.
For simplicity, each figure shows only the parts relevant to the invention schematically; the figures do not represent the actual structure of a product. Also for ease of understanding, where several components in a figure share the same structure or function, only one of them is drawn or labeled. Herein, "one" means not only "only one" but can also mean "more than one".
The first embodiment of the present invention, as shown in Fig. 1:
A method for acquiring a neural network training set, including the steps:
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample data set.
Specifically, this embodiment addresses how to improve the performance (recognition rate, classification accuracy) of a neural network model when handling image recognition, image classification, or other machine learning tasks. The larger the amount of data available to the neural network model and the higher the similarity of the data, the better the model performs. To improve model performance, the prior art generally screens data manually. Manual screening not only involves a heavy workload but also leaves the judgment and classification of data similarity to the user; because of this subjectivity, the neural network model may learn from erroneous data, which harms its performance.
The present invention obtains the image set to be screened from the neural network model, obtains its category information, and obtains the corresponding sample image according to that category information. Each acquired item of image data is then compared with the sample image: when the similarity between the image data and the sample image reaches the preset similarity threshold, that image data is a target image. All image data undergo this comparison, and all target images are selected according to the results; the target images obtained for a category form the target image data set of the image set to be screened for that category. Adversarial training is then performed on the target image data to introduce random variables and improve the robustness of the neural network model.
By using the category information of the image set to be screened, the invention augments the data set in a targeted way, which significantly reduces manual workload and the screening errors caused by the subjectivity of manual screening; adversarial training on the target image data introduces random variables and improves the robustness of the neural network model.
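The similarity comparison against the template image can be sketched as follows. The text does not specify a similarity measure, so the example assumes cosine similarity between grey-level histograms; the function names, the 8-bin histogram, and the 0.9 threshold are all illustrative choices. Images are represented as flat lists of 0-255 grey values to keep the sketch dependency-free.

```python
import math


def histogram(pixels: list[int], bins: int = 8) -> list[float]:
    """Normalised intensity histogram of a flat list of 0-255 grey values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels) or 1  # guard against an empty image
    return [c / total for c in counts]


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def screen_targets(candidates: list[list[int]], template: list[int],
                   threshold: float = 0.9) -> list[list[int]]:
    """Keep candidates whose similarity to the template reaches the threshold."""
    ref = histogram(template)
    return [img for img in candidates
            if cosine_similarity(histogram(img), ref) >= threshold]
```

A histogram ignores spatial layout, so a practical system would more likely compare learned feature vectors; the thresholding logic stays the same either way.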
The second embodiment of the present invention, as shown in Fig. 2:
A method for acquiring a neural network training set, including the steps:
S010: measure the test accuracy of the image set of each category;
S020: determine whether the test accuracy of the current image set is below the preset performance threshold; if so, execute step S030; otherwise, execute step S040;
S030: mark the current image set as the image set to be screened;
S040: switch to the test accuracy of the next image set and repeat the judgment until all image sets have been judged;
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample data set.
Specifically, this embodiment is a preferred embodiment of the first embodiment above. In this embodiment, the test accuracy of the image set of each category is measured, and the test accuracy of the current image set is compared with the preset performance threshold. If the test accuracy of the current image set is below the preset performance threshold, the current image set is marked as an image set to be screened. If it is not below the threshold (that is, greater than or equal to it), the method switches to the test accuracy of the next image set, determines whether that accuracy is below the preset performance threshold, and marks the next image set according to the result. Once the image sets of all categories in the neural network model have been judged and marked, the category information of the image sets to be screened, whose test accuracy is below the preset performance threshold, is obtained.
After a neural network model is obtained, the invention takes the image data sets whose test accuracy is below the preset accuracy as the image sets to be screened and augments the data set in a targeted way based on their category information, avoiding indiscriminate, aimless, untargeted data set augmentation. Using the prediction results to decide whether a given input source needs data set expansion greatly reduces the workload. Targeted augmentation also increases the proportion that the image set to be screened for each such category occupies in the overall data set of the neural network model. Because changes in these proportions affect the network parameters finally obtained during training, this also improves the robustness of the neural network and allows network training and parameter tuning to be carried out quickly, efficiently, and purposefully.
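Steps S010 to S040 amount to computing a per-category test accuracy and flagging every category that falls below the threshold. A minimal Python sketch, with hypothetical function names and with plain label lists standing in for the model's test results:

```python
def per_class_accuracy(labels: list[str], predictions: list[str]) -> dict[str, float]:
    """Test accuracy of each category, computed from true and predicted labels."""
    totals: dict[str, int] = {}
    correct: dict[str, int] = {}
    for truth, pred in zip(labels, predictions):
        totals[truth] = totals.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (1 if truth == pred else 0)
    return {c: correct[c] / totals[c] for c in totals}


def categories_to_screen(accuracy_by_class: dict[str, float], threshold: float) -> list[str]:
    """Categories whose test accuracy is below the threshold (steps S020 and S030)."""
    return sorted(c for c, acc in accuracy_by_class.items() if acc < threshold)
```

For example, with true labels `["cat", "cat", "dog", "dog"]` and predictions `["cat", "dog", "dog", "dog"]`, only `"cat"` falls below a threshold of 0.8 and would be marked for screening.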
The third embodiment of the present invention, as shown in Fig. 3:
A method for acquiring a neural network training set, including the steps:
S100: obtain the category information of the image set to be screened;
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: determine, based on the sample image, whether each item of image data satisfies the preset crawler strategy, and mark the image data according to the result;
S240: collect all image data marked "yes" to obtain the target image data set corresponding to the image set to be screened;
S310: apply pixel transformations to the target image data and take the transformed image data as the sample data set;
wherein the preset crawler strategy includes a preset digest value, preset keywords, and a preset similarity.
Specifically, this embodiment is a preferred embodiment of the first and second embodiments above. In this embodiment, image data belonging to the category information of the image set to be screened are acquired, together with the sample image corresponding to that category information. Based on the sample image, each item of image data is checked against the preset crawler strategy, and the image data are screened by that strategy to obtain the target images. For example, a search comparison is carried out according to the preset keywords; a verification comparison is carried out according to the preset digest value (a digest is computed with a digest algorithm such as MD5, SHA1, or CRC32 and compared with the preset digest value); and a matching comparison is carried out according to the preset similarity. Each item of image data is marked according to the comparison results, and all image data marked "yes" are collected to obtain the target image data set corresponding to the image set to be screened. Unwanted image data are thereby filtered out, which improves screening efficiency and reduces manual workload.
In addition, pixel transformations are applied to the target image data, and the transformed image data are taken as the sample data set. The pixel transformations include: 1. adding noise and filtering, including but not limited to salt-and-pepper noise, Gaussian noise, and median filtering; 2. channel transformation, that is, reordering the three RGB channels; 3. adjusting contrast, brightness, and saturation (color jitter). The invention can augment data using any one or a combination of these pixel transformations, which introduces random variables and increases the amount of data without changing the image category, and can improve the generalization ability of the neural network model. Applying pixel transformations to the target image data subjects the samples to adversarial processing and introduces random variables into the data set, which can improve the robustness, accuracy, and fault tolerance of the neural network.
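Some of the pixel transformations listed above can be sketched in plain Python. This is an illustration under stated assumptions rather than the patent's implementation: an image is a list of rows of `(r, g, b)` tuples with 0-255 values, all function names and parameters are hypothetical, and median filtering and saturation adjustment are omitted for brevity.

```python
import random


def _clip(value: float) -> int:
    """Round and clamp a channel value to the 0-255 range."""
    return max(0, min(255, int(round(value))))


def salt_and_pepper(image, amount: float, rng: random.Random):
    """Replace roughly `amount` of the pixels with pure white or pure black."""
    out = []
    for row in image:
        new_row = []
        for px in row:
            if rng.random() < amount:
                v = 255 if rng.random() < 0.5 else 0
                new_row.append((v, v, v))
            else:
                new_row.append(px)
        out.append(new_row)
    return out


def gaussian_noise(image, sigma: float, rng: random.Random):
    """Add zero-mean Gaussian noise to every channel."""
    return [[tuple(_clip(c + rng.gauss(0, sigma)) for c in px) for px in row]
            for row in image]


def reorder_channels(image, order=(2, 1, 0)):
    """Permute the colour channels, e.g. RGB to BGR."""
    return [[tuple(px[i] for i in order) for px in row] for row in image]


def adjust_brightness_contrast(image, brightness: float = 0, contrast: float = 1.0):
    """Scale each channel around mid-grey (contrast), then shift it (brightness)."""
    return [[tuple(_clip(contrast * (c - 128) + 128 + brightness) for c in px)
             for px in row] for row in image]
```

Passing a seeded `random.Random` instance makes the augmentation reproducible, which helps when comparing training runs.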
The fourth embodiment of the present invention, as shown in Fig. 4:
A method for acquiring a neural network training set, including the steps:
S100: obtain the category information of the image set to be screened;
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: determine, based on the sample image, whether each item of image data satisfies the preset crawler strategy, and mark the image data according to the result;
S240: collect all image data marked "yes" to obtain the target image data set corresponding to the image set to be screened;
S320: apply geometric transformations to the target image data and take the transformed image data as the sample data set;
wherein the preset crawler strategy includes a preset digest value, preset keywords, and a preset similarity.
Specifically, this embodiment is a preferred embodiment of the first and second embodiments above. In this embodiment, image data belonging to the category information of the image set to be screened are acquired, together with the sample image corresponding to that category information. Based on the sample image, each item of image data is checked against the preset crawler strategy, and the image data are screened by that strategy to obtain the target images. For example, a search comparison is carried out according to the preset keywords; a verification comparison is carried out according to the preset digest value (a digest is computed with a digest algorithm such as MD5, SHA1, or CRC32 and compared with the preset digest value); and a matching comparison is carried out according to the preset similarity. Each item of image data is marked according to the comparison results, and all image data marked "yes" are collected to obtain the target image data set corresponding to the image set to be screened. Unwanted image data are thereby filtered out, which improves screening efficiency and reduces manual workload.
In addition, geometric transformations are applied to the target image data, and the transformed image data are taken as the sample data set. The geometric transformations include: 1. flipping, for example horizontal or vertical flipping, chosen according to the actual situation (flipping a face vertically produces an upside-down face, so that flip has no practical meaning); 2. translation, which shifts the subject's position to simulate real-life pictures in which the subject is not centered; 3. rotation; 4. blacking out, which simulates partially occluded data samples; 5. cropping; 6. scaling. The invention can augment data using any one or a combination of these geometric transformations, which introduces random variables and increases the amount of data without changing the image category, and can improve the generalization ability of the neural network model. Applying geometric transformations to the target image data subjects the samples to adversarial processing and introduces random variables into the data set, which can improve the robustness, accuracy, and fault tolerance of the neural network.
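The geometric transformations listed above can be sketched on a small single-channel image stored as a list of rows. The function names, the zero fill value, the 90-degree clockwise rotation, and the nearest-neighbour scaling are illustrative assumptions; the text does not prescribe any of them.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def translate(img, dx: int, dy: int, fill=0):
    """Shift the content by (dx, dy); uncovered pixels take the fill value."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def black_out(img, top: int, left: int, height: int, width: int):
    """Set a rectangle to zero to simulate a partially occluded sample."""
    out = [row[:] for row in img]
    for y in range(top, min(top + height, len(img))):
        for x in range(left, min(left + width, len(img[0]))):
            out[y][x] = 0
    return out


def crop(img, top: int, left: int, height: int, width: int):
    """Cut out a rectangular region."""
    return [row[left:left + width] for row in img[top:top + height]]


def scale_nearest(img, factor: float):
    """Resize by nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    nh, nw = max(1, int(h * factor)), max(1, int(w * factor))
    return [[img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(nw)] for y in range(nh)]
```

Whether a given transformation preserves the category depends on the data, as the vertical-flip example for faces shows, so the set of transformations applied would normally be chosen per category.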
The fifth embodiment of the present invention, as shown in Fig. 5:
A method for acquiring a neural network training set, including the steps:
S100: obtain the category information of the image set to be screened;
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: determine, based on the sample image, whether each item of image data satisfies the preset crawler strategy, and mark the image data according to the result;
S240: collect all image data marked "yes" to obtain the target image data set corresponding to the image set to be screened;
S310: apply pixel transformations to the target image data and take the transformed image data as the sample data set;
S320: apply geometric transformations to the target image data and take the transformed image data as the sample data set;
wherein the preset crawler strategy includes a preset digest value, preset keywords, and a preset similarity.
Specifically, this embodiment is a preferred embodiment of the first and second embodiments above; for its specific effects, see the third and fourth embodiments, which are not repeated here.
The sixth embodiment of the present invention, as shown in Fig. 6:
A method for acquiring a neural network training set, including the steps:
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample data set;
S400: expand the data set of the neural network model with the sample data set;
S500: measure the performance of the neural network model after expansion and determine whether it reaches the preset performance threshold; if so, end; otherwise, return to step S100.
Specifically, in this embodiment, when the data in a neural network model are consistent in similarity (here, "consistent" means within a certain similarity range), a larger data volume generally gives the model better performance. The sample data set is therefore obtained by targeted data-set augmentation based on the category information of the image set to be screened, which greatly reduces manual work. The obtained sample data set is added to the neural network model to expand its data set, and the performance of the expanded model is then tested. If the performance reaches the preset performance threshold, the model trained with the augmented data is a qualified neural network model and can be used for subsequent image recognition and classification. If the performance does not reach the preset performance threshold, the category information of the image set to be screened is acquired again, target images are screened according to that category information to obtain the corresponding target image data set, and adversarial training is performed on the target image data to obtain a sample data set. By testing the performance of the model expanded with the sample data set, and deciding from the result whether to continue data-augmentation training, the present invention improves the robustness of the neural network model.
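The loop formed by steps S100 to S500 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name, the four callables and the `max_rounds` safeguard are hypothetical (the text itself simply repeats until the threshold is reached).

```python
from typing import Callable, List, Sequence


def augment_until_qualified(
    get_categories: Callable[[], Sequence[str]],            # S100 (hypothetical callable)
    screen_targets: Callable[[str], List[object]],          # S200 (hypothetical callable)
    make_samples: Callable[[List[object]], List[object]],   # S300 (hypothetical callable)
    expand_and_evaluate: Callable[[List[object]], float],   # S400 + S500 (hypothetical callable)
    performance_threshold: float,
    max_rounds: int = 10,  # assumption: an upper bound added only to keep the sketch finite
) -> int:
    """Return the number of rounds needed to reach the performance threshold."""
    for round_no in range(1, max_rounds + 1):
        samples: List[object] = []
        for category in get_categories():
            targets = screen_targets(category)      # target image data set for this category
            samples.extend(make_samples(targets))   # sample data set derived from the targets
        performance = expand_and_evaluate(samples)  # expand the data set, then test the model
        if performance >= performance_threshold:
            return round_no                         # qualified model: stop
    raise RuntimeError("performance threshold not reached within max_rounds")
```

In this sketch a round that falls short of the threshold simply starts again from the category information, which mirrors the "otherwise, return to step S100" branch.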
The seventh embodiment of the present invention, as shown in Figure 7:
A system for acquiring a neural network training set, comprising:
an information acquisition module 110, which acquires the category information of the image set to be screened;
an image screening module 120, which screens target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to a sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a data set acquisition module 130, which performs adversarial training on the target image data to obtain a sample data set.
Specifically, this embodiment is the system embodiment corresponding to the method embodiment above; its specific effects are described in the method embodiment and are not repeated here.
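The three modules of this embodiment can be pictured as a simple pipeline. The following is a minimal sketch, assuming each module is a plain callable; the class name, field names and signatures are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TrainingSetAcquisitionSystem:
    """Hypothetical wiring of modules 110, 120 and 130."""
    info_module: Callable[[], str]                    # 110: returns category information
    screening_module: Callable[[str], List[str]]      # 120: returns the target image data set
    dataset_module: Callable[[List[str]], List[str]]  # 130: returns the sample data set

    def run(self) -> List[str]:
        category = self.info_module()
        targets = self.screening_module(category)
        return self.dataset_module(targets)
```

Keeping each module behind a callable means any one of them can be replaced without touching the other two.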
The eighth embodiment of the present invention, as shown in Figure 8:
A system for acquiring a neural network training set, comprising:
an information acquisition module 110, which acquires the category information of the image set to be screened;
an image screening module 120, which screens target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to a sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a data set acquisition module 130, which performs adversarial training on the target image data to obtain a sample data set.
Preferably, the system further comprises:
an accuracy detection module 010, which detects the test accuracy of the image set of each category;
an accuracy judgment module 020, which judges whether the test accuracy of the current image set is below the preset performance threshold;
an image marking module 030, which marks the current image set as the image set to be screened;
the accuracy judgment module 020 also switches to the test accuracy of the next image set and judges it, until all image sets have been judged.
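The interplay of modules 010, 020 and 030 amounts to a per-category threshold check. A minimal sketch, assuming the test accuracies are already available as a mapping from category name to accuracy; the function name and the data layout are hypothetical.

```python
from typing import Dict, List


def mark_sets_to_screen(test_accuracy: Dict[str, float], threshold: float) -> List[str]:
    """Return the categories whose image sets should be marked as 'to be screened'."""
    to_screen: List[str] = []
    for category, accuracy in test_accuracy.items():  # move on to the next image set each pass
        if accuracy < threshold:                      # below the preset performance threshold
            to_screen.append(category)                # mark this image set
    return to_screen
```

For example, with accuracies of 0.95, 0.62 and 0.88 and a threshold of 0.9, only the second and third categories would be marked for augmentation.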
Preferably, the image screening module 120 comprises:
an image data acquiring unit 121, which acquires image data belonging to the category information of the image set to be screened;
a sample image acquiring unit 122, which acquires the sample image corresponding to the category information of the image set to be screened;
an image data screening unit 123, which judges, according to the sample image, whether each item of image data satisfies a preset crawler strategy, marks the image data according to the judgment result, and collects all image data marked as satisfying the strategy to obtain the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword and a preset similarity.
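One way the three parts of the crawler strategy might be combined is sketched below. The text does not say how the digest value, keyword and similarity are evaluated, so everything here is an assumption: SHA-256 digests are used to skip images already seen, the keyword is matched against a caption, and similarity is the cosine similarity of feature vectors produced by some unspecified extractor. All names are hypothetical.

```python
import hashlib
import math
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass
class Candidate:
    raw: bytes                 # encoded image bytes
    caption: str               # text found alongside the image (assumption)
    features: Sequence[float]  # feature vector from an unspecified extractor (assumption)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def screen(candidates: List[Candidate], sample_features: Sequence[float],
           keyword: str, min_similarity: float, known_digests: Set[str]) -> List[Candidate]:
    """Keep candidates that pass the digest, keyword and similarity checks."""
    seen = set(known_digests)
    selected: List[Candidate] = []
    for c in candidates:
        digest = hashlib.sha256(c.raw).hexdigest()
        is_new = digest not in seen                           # digest check: skip duplicates
        has_keyword = keyword.lower() in c.caption.lower()    # keyword check
        is_similar = cosine_similarity(c.features, sample_features) >= min_similarity
        if is_new and has_keyword and is_similar:
            selected.append(c)                                # marked as satisfying the strategy
            seen.add(digest)
    return selected
```

Requiring all three checks to pass is a design choice of this sketch; a real system could weight or order them differently.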
Preferably, the data set acquisition module 130 comprises:
a pixel transform data augmentation unit 131, which applies a pixel transform to the target image data and takes the transformed image data as the sample data set;
a geometric transform data augmentation unit 132, which applies a geometric transform to the target image data and takes the transformed image data as the sample data set.
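Units 131 and 132 can be illustrated with a few elementary transforms. The text names only "pixel transform" and "geometric transform", so the specific operations below (a brightness shift, a horizontal flip and a 90-degree rotation) are assumptions chosen for illustration, applied to grayscale images stored as nested lists.

```python
from typing import List

Image = List[List[int]]  # grayscale image, pixel values in 0..255


def adjust_brightness(image: Image, delta: int) -> Image:
    """Pixel transform: shift every pixel by delta, clamped to 0..255."""
    return [[max(0, min(255, p + delta)) for p in row] for row in image]


def flip_horizontal(image: Image) -> Image:
    """Geometric transform: mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Geometric transform: rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(images: List[Image]) -> List[Image]:
    """Build a sample data set containing three transformed copies of each target image."""
    samples: List[Image] = []
    for img in images:
        samples.append(adjust_brightness(img, 40))
        samples.append(flip_horizontal(img))
        samples.append(rotate_90_clockwise(img))
    return samples
```

Each target image yields several variants that keep its category label, which is how the sample data set grows without additional manual labelling.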
Preferably, the system further comprises:
a data processing module 140, which expands the data set of the neural network model according to the sample data set;
a performance judging module 150, which tests the performance of the expanded neural network model and judges whether the performance reaches the preset performance threshold;
the information acquisition module 110 also, when the performance does not reach the preset performance threshold, acquires the category information of the image set to be screened again, so that a new target image data set is obtained through the image screening module 120 and the data set acquisition module 130 and adversarial training is performed on it to obtain a new sample data set.
Specifically, this embodiment is the system embodiment corresponding to the method embodiment above; its specific effects are described in the method embodiment and are not repeated here.
It should be noted that the above embodiments can be freely combined as needed. The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A method for acquiring a neural network training set, characterized by comprising the steps of:
S100: acquire the category information of the image set to be screened;
S200: screen target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to a sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample data set.
2. The method for acquiring a neural network training set according to claim 1, characterized in that the following steps are performed before step S100:
S010: detect the test accuracy of the image set of each category;
S020: judge whether the test accuracy of the current image set is below the preset performance threshold; if so, execute step S030; otherwise, execute step S040;
S030: mark the current image set as the image set to be screened;
S040: switch to the test accuracy of the next image set and judge it, until all image sets have been judged.
3. The method for acquiring a neural network training set according to claim 1, characterized in that step S200 comprises the steps of:
S210: acquire image data belonging to the category information of the image set to be screened;
S220: acquire the sample image corresponding to the category information of the image set to be screened;
S230: judge, according to the sample image, whether each item of image data satisfies a preset crawler strategy, and mark the image data according to the judgment result;
S240: collect all image data marked as satisfying the strategy to obtain the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword and a preset similarity.
4. The method for acquiring a neural network training set according to claim 1, characterized in that step S300 comprises the steps of:
S310: apply a pixel transform to the target image data and take the transformed image data as the sample data set;
S320: apply a geometric transform to the target image data and take the transformed image data as the sample data set.
5. The method for acquiring a neural network training set according to any one of claims 1-4, characterized in that the following steps are performed after step S300:
S400: expand the data set of the neural network model according to the sample data set;
S500: test the performance of the expanded neural network model and judge whether the performance reaches the preset performance threshold; if so, end; otherwise, return to step S100.
6. A system for acquiring a neural network training set, characterized by comprising:
an information acquisition module, which acquires the category information of the image set to be screened;
an image screening module, which screens target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to a sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a data set acquisition module, which performs adversarial training on the target image data to obtain a sample data set.
7. The system for acquiring a neural network training set according to claim 6, characterized by further comprising:
an accuracy detection module, which detects the test accuracy of the image set of each category;
an accuracy judgment module, which judges whether the test accuracy of the current image set is below the preset performance threshold;
an image marking module, which marks the current image set as the image set to be screened;
the accuracy judgment module also switches to the test accuracy of the next image set and judges it, until all image sets have been judged.
8. The system for acquiring a neural network training set according to claim 6, characterized in that the image screening module comprises:
an image data acquiring unit, which acquires image data belonging to the category information of the image set to be screened;
a sample image acquiring unit, which acquires the sample image corresponding to the category information of the image set to be screened;
an image data screening unit, which judges, according to the sample image, whether each item of image data satisfies a preset crawler strategy, marks the image data according to the judgment result, and collects all image data marked as satisfying the strategy to obtain the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword and a preset similarity.
9. The system for acquiring a neural network training set according to claim 6, characterized in that the data set acquisition module comprises:
a pixel transform data augmentation unit, which applies a pixel transform to the target image data and takes the transformed image data as the sample data set;
a geometric transform data augmentation unit, which applies a geometric transform to the target image data and takes the transformed image data as the sample data set.
10. The system for acquiring a neural network training set according to any one of claims 6-9, characterized by further comprising:
a data processing module, which expands the data set of the neural network model according to the sample data set;
a performance judging module, which tests the performance of the expanded neural network model and judges whether the performance reaches the preset performance threshold;
the information acquisition module also, when the performance does not reach the preset performance threshold, acquires the category information of the image set to be screened again, so that a new target image data set is obtained through the image screening module and the data set acquisition module and adversarial training is performed on it to obtain a new sample data set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810438759.6A CN108596338A (en) | 2018-05-09 | 2018-05-09 | A kind of acquisition methods and its system of neural metwork training collection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810438759.6A CN108596338A (en) | 2018-05-09 | 2018-05-09 | A kind of acquisition methods and its system of neural metwork training collection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108596338A true CN108596338A (en) | 2018-09-28 |
Family
ID=63636043
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810438759.6A Pending CN108596338A (en) | 2018-05-09 | 2018-05-09 | A kind of acquisition methods and its system of neural metwork training collection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108596338A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109284729A (en) * | 2018-10-08 | 2019-01-29 | 北京影谱科技股份有限公司 | Method, apparatus and medium based on video acquisition human face recognition model training data |
CN109793491A (en) * | 2018-12-29 | 2019-05-24 | 维沃移动通信有限公司 | A kind of colour blindness detection method and terminal device |
CN109934275A (en) * | 2019-03-05 | 2019-06-25 | 深圳市商汤科技有限公司 | Image processing method and device, electronic equipment and storage medium |
CN110059647A (en) * | 2019-04-23 | 2019-07-26 | 杭州智趣智能信息技术有限公司 | A kind of file classification method, system and associated component |
CN111241969A (en) * | 2020-01-06 | 2020-06-05 | 北京三快在线科技有限公司 | Target detection method and device and corresponding model training method and device |
CN111612133A (en) * | 2020-05-20 | 2020-09-01 | 广州华见智能科技有限公司 | Internal organ feature coding method based on face image multi-stage relation learning |
CN111680683A (en) * | 2019-03-30 | 2020-09-18 | 上海铼锶信息技术有限公司 | ROI parameter acquisition method and system |
CN113255711A (en) * | 2020-02-13 | 2021-08-13 | 阿里巴巴集团控股有限公司 | Confrontation detection method, device and equipment |
CN114548192A (en) * | 2020-11-23 | 2022-05-27 | 千寻位置网络有限公司 | Sample data processing method and device, electronic equipment and medium |
CN114638322A (en) * | 2022-05-20 | 2022-06-17 | 南京大学 | Full-automatic target detection system and method based on given description in open scene |
CN116433939A (en) * | 2023-04-18 | 2023-07-14 | 北京百度网讯科技有限公司 | Sample image generation method, training method, recognition method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106408564A (en) * | 2016-10-10 | 2017-02-15 | 北京新皓然软件技术有限责任公司 | Depth-learning-based eye-fundus image processing method, device and system |
WO2017091833A1 (en) * | 2015-11-29 | 2017-06-01 | Arterys Inc. | Automated cardiac volume segmentation |
CN106919920A (en) * | 2017-03-06 | 2017-07-04 | 重庆邮电大学 | Scene recognition method based on convolution feature and spatial vision bag of words |
CN107423815A (en) * | 2017-08-07 | 2017-12-01 | 北京工业大学 | A kind of computer based low quality classification chart is as data cleaning method |
CN107590156A (en) * | 2016-07-09 | 2018-01-16 | 北京至信普林科技有限公司 | A kind of polytypic method of text based on training set cyclic extension |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109284729B (en) * | 2018-10-08 | 2020-03-03 | 北京影谱科技股份有限公司 | Method, device and medium for acquiring face recognition model training data based on video |
CN109284729A (en) * | 2018-10-08 | 2019-01-29 | 北京影谱科技股份有限公司 | Method, apparatus and medium based on video acquisition human face recognition model training data |
CN109793491B (en) * | 2018-12-29 | 2021-11-23 | 维沃移动通信有限公司 | Terminal equipment for color blindness detection |
CN109793491A (en) * | 2018-12-29 | 2019-05-24 | 维沃移动通信有限公司 | A kind of colour blindness detection method and terminal device |
CN109934275A (en) * | 2019-03-05 | 2019-06-25 | 深圳市商汤科技有限公司 | Image processing method and device, electronic equipment and storage medium |
CN111680683B (en) * | 2019-03-30 | 2023-06-02 | 上海铼锶信息技术有限公司 | ROI parameter acquisition method and system |
CN111680683A (en) * | 2019-03-30 | 2020-09-18 | 上海铼锶信息技术有限公司 | ROI parameter acquisition method and system |
CN110059647A (en) * | 2019-04-23 | 2019-07-26 | 杭州智趣智能信息技术有限公司 | A kind of file classification method, system and associated component |
CN111241969A (en) * | 2020-01-06 | 2020-06-05 | 北京三快在线科技有限公司 | Target detection method and device and corresponding model training method and device |
CN113255711A (en) * | 2020-02-13 | 2021-08-13 | 阿里巴巴集团控股有限公司 | Confrontation detection method, device and equipment |
CN113255711B (en) * | 2020-02-13 | 2024-05-28 | 阿里巴巴集团控股有限公司 | Countermeasure detection method, device and equipment |
CN111612133B (en) * | 2020-05-20 | 2021-10-19 | 广州华见智能科技有限公司 | Internal organ feature coding method based on face image multi-stage relation learning |
CN111612133A (en) * | 2020-05-20 | 2020-09-01 | 广州华见智能科技有限公司 | Internal organ feature coding method based on face image multi-stage relation learning |
CN114548192A (en) * | 2020-11-23 | 2022-05-27 | 千寻位置网络有限公司 | Sample data processing method and device, electronic equipment and medium |
CN114638322A (en) * | 2022-05-20 | 2022-06-17 | 南京大学 | Full-automatic target detection system and method based on given description in open scene |
CN114638322B (en) * | 2022-05-20 | 2022-09-13 | 南京大学 | Full-automatic target detection system and method based on given description in open scene |
CN116433939A (en) * | 2023-04-18 | 2023-07-14 | 北京百度网讯科技有限公司 | Sample image generation method, training method, recognition method and device |
CN116433939B (en) * | 2023-04-18 | 2024-02-20 | 北京百度网讯科技有限公司 | Sample image generation method, training method, recognition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20180928 |