CN108596338A - Method and system for acquiring a neural network training set

Info

Publication number
CN108596338A
Authority
CN
China
Prior art keywords
image
data
screened
collection
obtains
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810438759.6A
Other languages
Chinese (zh)
Inventor
罗培元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Feixun Information Technology Co Ltd
Original Assignee
Sichuan Feixun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Feixun Information Technology Co Ltd filed Critical Sichuan Feixun Information Technology Co Ltd
Priority to CN201810438759.6A priority Critical patent/CN108596338A/en
Publication of CN108596338A publication Critical patent/CN108596338A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a method and system for acquiring a neural network training set. The method comprises: S100, obtaining the category information of an image set to be screened; S200, screening target images to obtain the target image dataset corresponding to the image set to be screened, where a target image is image data whose similarity to a sample image reaches a preset similarity threshold, and the sample image is the template image corresponding to the category information of the image set to be screened; S300, performing adversarial training on the target image data to obtain a sample dataset. The invention reduces manual screening of the sample dataset, improves screening efficiency and reliability, and improves the accuracy of the neural network.

Description

Method and system for acquiring a neural network training set
Technical field
The present invention relates to the field of data processing, and in particular to a method and system for acquiring a neural network training set.
Background art
In recent years, with the continued development of computer vision technology, and in particular the rapid development of neural network models, the demand for the image data needed to train computer vision models, especially image data with accurate label information, has grown steadily.
A convolutional neural network (CNN) is a deep learning algorithm and an important processing and analysis tool in fields such as image recognition; in recent years it has become a research focus in many scientific fields. An advantage of the algorithm is that training requires no manually annotated features: the model automatically discovers the features implicit in the input variables. At the same time, weight sharing in the network greatly reduces model complexity and the number of weights. These advantages are especially evident when the network input is an image: the original image can be fed directly to the network, which avoids the complicated feature extraction and data reconstruction of traditional recognition algorithms.
To obtain the large image sample dataset needed to train a neural network model, the most convenient route is the network: a web crawler can, according to set conditions, crawl the information that satisfies those conditions out of the mass of information on the Internet.
The current practice is to crawl massive amounts of data with a web crawler and then screen and clean it manually. The resulting problems are that the workload is enormous, the screening results are highly subjective and error-prone, and training a neural network on an erroneous image sample dataset produces wrong classification results. When image data is obtained with conventional web crawler technology, the quality of the crawled images generally declines as the number crawled grows, so the data crawled by a conventional web crawler contains considerable noise, which affects the training results of subsequent image recognition based on a neural network model.
Summary of the invention
The object of the present invention is to provide a method and system for acquiring a neural network training set that reduce manual screening of the sample dataset, improve screening efficiency and reliability, and improve the accuracy of the neural network.
The technical solution provided by the invention is as follows:
The present invention provides a method for acquiring a neural network training set, comprising the steps of:
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image dataset corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample dataset.
Further, the following steps precede step S100:
S010: detect the test accuracy of the image set of each category;
S020: judge whether the test accuracy of the current image set is below a preset performance threshold; if so, execute step S030; otherwise, execute step S040;
S030: mark the current image set as the image set to be screened;
S040: switch to the test accuracy of the next image set and judge it, until all image sets have been judged.
Further, step S200 comprises the steps of:
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: judge, against the sample image, whether each item of image data satisfies a preset crawler strategy, and mark the image data according to the result;
S240: obtain all image data marked "yes" to form the target image dataset corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword, and a preset similarity.
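The keyword and digest checks of the preset crawler strategy can be illustrated with a minimal Python sketch. The text does not say whether a digest match means "keep" or "discard"; the sketch assumes digests are used to reject duplicates and known unwanted files. The function names and the (title, bytes) candidate layout are hypothetical.

```python
import hashlib


def md5_digest(data):
    """Return the hex MD5 digest of raw image bytes."""
    return hashlib.md5(data).hexdigest()


def passes_keyword(title, keywords):
    """True if any preset keyword occurs in the title (case-insensitive)."""
    lowered = title.lower()
    return any(k.lower() in lowered for k in keywords)


def filter_candidates(candidates, keywords, blocked_digests):
    """Keep candidates that match a keyword and whose digest is new.

    candidates: iterable of (title, image_bytes) pairs (assumed layout).
    blocked_digests: digests of files already seen or known to be unwanted
    (an assumption; the source only says digests are compared).
    """
    seen = set(blocked_digests)
    kept = []
    for title, data in candidates:
        digest = md5_digest(data)
        if passes_keyword(title, keywords) and digest not in seen:
            kept.append((title, data))
            seen.add(digest)  # later byte-identical copies are dropped
    return kept
```

The similarity check, the third element of the strategy, is omitted here because it needs a concrete similarity measure, which the text leaves open.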
Further, step S300 comprises the steps of:
S310: apply pixel transformations to the target image data and take the transformed image data as the sample dataset;
S320: apply geometric transformations to the target image data and take the transformed image data as the sample dataset.
Further, the following steps follow step S300:
S400: expand the dataset of the neural network model with the sample dataset;
S500: detect the performance of the neural network model after expansion and judge whether it reaches the preset performance threshold; if so, end; otherwise, return to step S100.
The present invention also provides a system for acquiring a neural network training set, comprising:
an information acquisition module, which obtains the category information of the image set to be screened;
an image screening module, which screens target images to obtain the target image dataset corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a dataset acquisition module, which performs adversarial training on the target image data to obtain a sample dataset.
Further, the system also comprises:
an accuracy detection module, which detects the test accuracy of the image set of each category;
an accuracy judgment module, which judges whether the test accuracy of the current image set is below the preset performance threshold;
an image marking module, which marks the current image set as the image set to be screened;
the accuracy judgment module further switches to the test accuracy of the next image set and judges it, until all image sets have been judged.
Further, the image screening module comprises:
an image data acquisition unit, which acquires image data belonging to the category information of the image set to be screened;
a sample image acquisition unit, which obtains the sample image corresponding to the category information of the image set to be screened;
an image data screening unit, which judges, against the sample image, whether each item of image data satisfies the preset crawler strategy, marks the image data according to the result, and obtains all image data marked "yes" to form the target image dataset corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword, and a preset similarity.
Further, the dataset acquisition module comprises:
a pixel transformation data augmentation unit, which applies pixel transformations to the target image data and takes the transformed image data as the sample dataset;
a geometric transformation data augmentation unit, which applies geometric transformations to the target image data and takes the transformed image data as the sample dataset.
Further, the system also comprises:
a data processing module, which expands the dataset of the neural network model with the sample dataset;
a performance judgment module, which detects the performance of the neural network model after expansion and judges whether it reaches the preset performance threshold;
the information acquisition module is further configured to reacquire the category information of the image set to be screened when the performance does not reach the preset performance threshold; a new target image dataset is then obtained by the image screening module, and the dataset acquisition module performs adversarial training on it to obtain a new sample dataset.
The method and system for acquiring a neural network training set provided by the invention bring at least one of the following advantages:
1) The invention performs targeted dataset augmentation based on the category information of the image set to be screened, which greatly reduces manual workload and the data screening errors caused by the subjectivity of manual screening. Adversarial training on the target image data introduces random variation and improves the robustness of the neural network model.
2) After a neural network model has been obtained, the invention takes the image datasets whose test accuracy is below the preset accuracy as the image sets to be screened and performs targeted dataset augmentation based on their category information, avoiding undifferentiated, aimless, untargeted dataset augmentation.
3) The same targeted augmentation increases the share of the image set to be screened for a given category within the overall dataset of the neural network model, so that network training and parameter tuning can be carried out quickly, efficiently, and purposefully.
4) The invention performs data augmentation with any one or a combination of pixel and geometric transformations to introduce random variation. This increases the amount of data without changing the image category and can improve the generalization ability of the neural network model. Adversarial processing of the samples introduces random variation into the dataset and can improve the robustness, accuracy, and fault tolerance of the neural network.
Description of the drawings
The preferred embodiments are described below in a clear and understandable manner with reference to the drawings, to further explain the above characteristics, technical features, advantages, and implementation of the method and system for acquiring a neural network training set.
Fig. 1 is the flow chart of the first embodiment of the invention;
Fig. 2 is the flow chart of the second embodiment of the invention;
Fig. 3 is the flow chart of the third embodiment of the invention;
Fig. 4 is the flow chart of the fourth embodiment of the invention;
Fig. 5 is the flow chart of the fifth embodiment of the invention;
Fig. 6 is the flow chart of the sixth embodiment of the invention;
Fig. 7 is the structural schematic diagram of the seventh embodiment of the invention;
Fig. 8 is the structural schematic diagram of the eighth embodiment of the invention.
Detailed description of the embodiments
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, specific embodiments of the invention are described below with reference to the drawings. Evidently, the drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings and other embodiments from them without creative effort.
For simplicity, each figure shows only the parts relevant to the invention schematically; they do not represent the actual structure of a product. In addition, to keep the figures easy to understand, where several components in a figure have the same structure or function, only one of them is drawn or labeled. Herein, "one" means not only "only one" but can also mean "more than one".
The first embodiment of the present invention, as shown in Figure 1:
A method for acquiring a neural network training set, comprising the steps of:
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image dataset corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample dataset.
Specifically, in this embodiment, the question when handling image recognition, image classification, or other machine learning tasks is how to improve the performance (recognition rate, classification accuracy) of the neural network model. The larger the amount of data and the higher the similarity of the data, the better the model performs. To improve performance, the prior art generally screens data manually. Manual screening is not only laborious; because the similarity of the data is judged and classified by the user, the results are highly subjective, and the neural network model may be given wrong data that degrade its performance. The present invention obtains the image set to be screened from the neural network model, obtains its category information, and obtains the corresponding sample image according to that category information. The acquired image data is then compared with the sample image for similarity; when the similarity between an item of image data and the sample image reaches the preset similarity threshold, that image data is a target image. All image data undergoes this comparison, and all target images are selected according to the results; the target images obtained for a category form the target image dataset of the image set to be screened for that category. Adversarial training is then performed on the target image data to introduce random variation and improve the robustness of the neural network model. By performing targeted dataset augmentation based on the category information of the image set to be screened, the invention greatly reduces manual workload and the data screening errors caused by the subjectivity of manual screening.
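The threshold comparison in step S200 can be sketched as follows. The text does not name a similarity measure, so cosine similarity over flattened feature vectors is used purely as an assumption; `cosine_similarity` and `screen_targets` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_targets(candidates, template, threshold):
    """Return the candidates whose similarity to the template reaches the threshold."""
    return [c for c in candidates if cosine_similarity(c, template) >= threshold]
```

For example, with template `[1, 0, 1, 0]` and threshold 0.9, the vector `[2, 0, 2, 1]` is kept (similarity about 0.94) while `[0, 1, 0, 1]` is rejected (similarity 0). In practice the vectors would come from a feature extractor rather than raw pixels.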
The second embodiment of the present invention, as shown in Figure 2:
A method for acquiring a neural network training set, comprising the steps of:
S010: detect the test accuracy of the image set of each category;
S020: judge whether the test accuracy of the current image set is below a preset performance threshold; if so, execute step S030; otherwise, execute step S040;
S030: mark the current image set as the image set to be screened;
S040: switch to the test accuracy of the next image set and judge it, until all image sets have been judged;
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image dataset corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample dataset.
Specifically, this embodiment is a preferred embodiment of the first embodiment above. In this embodiment, the test accuracy of the image set of each category is detected, and the test accuracy of the current image set is compared with the preset performance threshold. If it is below the threshold, the current image set is marked as an image set to be screened. If it is not below the threshold (that is, at or above it), the method switches to the next image set and repeats the judgment, marking that image set according to the result. Once the image sets of all categories in the neural network model have been judged and marked, the category information of every image set whose test accuracy is below the preset performance threshold is available. After a neural network model has been obtained, the invention takes the image datasets whose test accuracy is below the preset accuracy as the image sets to be screened and performs targeted dataset augmentation based on their category information, avoiding undifferentiated, aimless, untargeted augmentation. Whether to expand the dataset from the input source is decided by the prediction results, which greatly helps reduce the workload. In addition, targeted augmentation increases the share of the image set to be screened for a given category within the overall dataset of the neural network model; because changes in this share affect the network parameters finally obtained in training, this also improves the robustness of the neural network and allows network training and parameter tuning to be carried out quickly, efficiently, and purposefully.
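Steps S010 to S040 amount to computing a per-category test accuracy and selecting the categories that fall below the threshold. A minimal sketch, assuming the test results are available as parallel lists of true labels and predictions (the function names are hypothetical):

```python
from collections import defaultdict


def per_class_accuracy(labels, predictions):
    """Map each true category to its test accuracy (S010)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, predicted in zip(labels, predictions):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    return {category: correct[category] / total[category] for category in total}


def classes_to_screen(accuracy, threshold):
    """Return the categories whose accuracy is below the threshold (S020 to S040)."""
    return sorted(c for c, acc in accuracy.items() if acc < threshold)
```

The loop over image sets in S020 to S040 collapses into a single comprehension here; the returned categories are the category information passed to step S100.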
The third embodiment of the present invention, as shown in Figure 3:
A method for acquiring a neural network training set, comprising the steps of:
S100: obtain the category information of the image set to be screened;
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: judge, against the sample image, whether each item of image data satisfies a preset crawler strategy, and mark the image data according to the result;
S240: obtain all image data marked "yes" to form the target image dataset corresponding to the image set to be screened;
S310: apply pixel transformations to the target image data and take the transformed image data as the sample dataset;
wherein the preset crawler strategy includes a preset digest value, a preset keyword, and a preset similarity.
Specifically, this embodiment is a preferred embodiment of the first and second embodiments above. In this embodiment, image data belonging to the category information of the image set to be screened is acquired, and the sample image corresponding to that category information is obtained. Each item of image data is then judged against the sample image to see whether it satisfies the preset crawler strategy, and the image data is screened by that strategy to obtain the target images. For example, a search comparison is made with the preset keyword; a verification comparison is made with the preset digest value (a digest is computed with a digest algorithm such as MD5, SHA1, or CRC32 and compared with the preset digest value); and a matching comparison is made with the preset similarity. Each item of image data is marked according to the comparison results, and all image data marked "yes" forms the target image dataset corresponding to the image set to be screened. Unwanted image data is filtered out, which improves screening efficiency and reduces manual workload. In addition, pixel transformations are applied to the target image data, and the transformed image data is taken as the sample dataset. Pixel transformations include: 1. adding noise and filtering, including but not limited to salt-and-pepper noise, Gaussian noise, and median filtering; 2. channel transformation, adjusting the order of the three RGB channels; 3. adjusting contrast, brightness, and saturation, and color jitter. The invention can perform data augmentation with any one or a combination of these pixel transformations to introduce random variation. This increases the amount of data without changing the image category and can improve the generalization ability of the neural network model. Applying pixel transformations to the target image data subjects the samples to adversarial processing and introduces random variation into the dataset, which can improve the robustness, accuracy, and fault tolerance of the neural network.
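Three of the pixel transformations listed above can be sketched in plain Python. The image layout (a list of rows of (R, G, B) tuples with values 0 to 255), the function names, and the parameters are assumptions for illustration; a production pipeline would normally use an image library instead.

```python
import random


def swap_channels(image, order=(2, 1, 0)):
    """Reorder the colour channels of every pixel, e.g. RGB -> BGR."""
    return [[tuple(px[i] for i in order) for px in row] for row in image]


def adjust_brightness(image, factor):
    """Scale every channel by factor and clamp the result to 0..255."""
    return [
        [tuple(min(255, max(0, round(v * factor))) for v in px) for px in row]
        for row in image
    ]


def salt_and_pepper(image, amount, rng):
    """Replace roughly a fraction `amount` of pixels with pure white or black."""
    noisy = []
    for row in image:
        new_row = []
        for px in row:
            if rng.random() < amount:
                new_row.append((255, 255, 255) if rng.random() < 0.5 else (0, 0, 0))
            else:
                new_row.append(px)
        noisy.append(new_row)
    return noisy
```

Passing a seeded `random.Random` instance as `rng` keeps the noise reproducible, which helps when comparing training runs. None of the functions modifies its input, so one target image can yield several augmented samples.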
The fourth embodiment of the present invention, as shown in Figure 4:
A method for acquiring a neural network training set, comprising the steps of:
S100: obtain the category information of the image set to be screened;
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: judge, against the sample image, whether each item of image data satisfies a preset crawler strategy, and mark the image data according to the result;
S240: obtain all image data marked "yes" to form the target image dataset corresponding to the image set to be screened;
S320: apply geometric transformations to the target image data and take the transformed image data as the sample dataset;
wherein the preset crawler strategy includes a preset digest value, a preset keyword, and a preset similarity.
Specifically, this embodiment is a preferred embodiment of the first and second embodiments above. In this embodiment, image data belonging to the category information of the image set to be screened is acquired, and the sample image corresponding to that category information is obtained. Each item of image data is then judged against the sample image to see whether it satisfies the preset crawler strategy, and the image data is screened by that strategy to obtain the target images. For example, a search comparison is made with the preset keyword; a verification comparison is made with the preset digest value (a digest is computed with a digest algorithm such as MD5, SHA1, or CRC32 and compared with the preset digest value); and a matching comparison is made with the preset similarity. Each item of image data is marked according to the comparison results, and all image data marked "yes" forms the target image dataset corresponding to the image set to be screened. Unwanted image data is filtered out, which improves screening efficiency and reduces manual workload. In addition, geometric transformations are applied to the target image data, and the transformed image data is taken as the sample dataset. Geometric transformations include: 1. flipping, such as horizontal and vertical flipping, chosen according to the actual situation (for example, flipping a face vertically produces an upside-down face, so that flip has no practical meaning); 2. translation, which shifts the subject to simulate real-life pictures in which it is not centered; 3. rotation; 4. blacking out, which simulates partially occluded data samples; 5. cropping; 6. scaling. The invention can perform data augmentation with any one or a combination of these geometric transformations to introduce random variation. This increases the amount of data without changing the image category and can improve the generalization ability of the neural network model. Applying geometric transformations to the target image data subjects the samples to adversarial processing and introduces random variation into the dataset, which can improve the robustness, accuracy, and fault tolerance of the neural network.
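Most of the geometric transformations listed above reduce to index manipulation. The sketch below treats an image as a list of rows of pixel values; the function names, the fill value of 0 for exposed or occluded pixels, and the restriction of rotation to 90 degrees are assumptions made to keep the example short. Scaling is left out because it requires interpolation.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows."""
    return [row[:] for row in image[::-1]]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def translate(image, dx, dy, fill=0):
    """Shift the content by (dx, dy); pixels shifted out of frame are lost."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out


def occlude(image, top, left, height, width, fill=0):
    """Black out a rectangle to simulate a partially occluded sample."""
    out = [row[:] for row in image]
    for y in range(top, min(top + height, len(out))):
        for x in range(left, min(left + width, len(out[0]))):
            out[y][x] = fill
    return out


def crop(image, top, left, height, width):
    """Return the sub-image starting at (top, left) with the given size."""
    return [row[left:left + width] for row in image[top:top + height]]
```

As the text notes for faces, whether a given transformation preserves the category depends on the data, so the set of transformations applied should be chosen per category.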
The fifth embodiment of the present invention, as shown in Figure 5:
A method for acquiring a neural network training set, comprising the steps of:
S100: obtain the category information of the image set to be screened;
S210: acquire image data belonging to the category information of the image set to be screened;
S220: obtain the sample image corresponding to the category information of the image set to be screened;
S230: judge, against the sample image, whether each item of image data satisfies a preset crawler strategy, and mark the image data according to the result;
S240: obtain all image data marked "yes" to form the target image dataset corresponding to the image set to be screened;
S310: apply pixel transformations to the target image data and take the transformed image data as the sample dataset;
S320: apply geometric transformations to the target image data and take the transformed image data as the sample dataset;
wherein the preset crawler strategy includes a preset digest value, a preset keyword, and a preset similarity.
Specifically, this embodiment is a preferred embodiment of the first and second embodiments above. For its effects, see the third and fourth embodiments above; they are not repeated here.
The sixth embodiment of the present invention, as shown in Figure 6:
A method for acquiring a neural network training set, comprising the steps of:
S100: obtain the category information of the image set to be screened;
S200: screen target images to obtain the target image dataset corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: perform adversarial training on the target image data to obtain a sample dataset;
S400: expand the dataset of the neural network model with the sample dataset;
S500: detect the performance of the neural network model after expansion and judge whether it reaches the preset performance threshold; if so, end; otherwise, return to step S100.
Specifically, in this embodiment, when the similarity of the data in the neural network model is consistent (here meaning within a certain similarity range), a larger amount of data gives relatively better model performance. Targeted dataset augmentation based on the category information of the image set to be screened yields the sample dataset and greatly reduces manual workload. The sample dataset is added to the neural network model, expanding its dataset, and the performance of the expanded model is then detected. If the performance reaches the preset performance threshold, the model trained with this data augmentation is a qualified neural network model and can be used for subsequent image recognition and classification. If the performance does not reach the preset performance threshold, the category information of the image set to be screened is reacquired, target images are screened according to that category information to obtain the corresponding target image dataset, and adversarial training is performed on the target image data to obtain a sample dataset. By detecting the performance of the model after it has been expanded with the sample dataset, the invention decides from the result whether to continue data augmentation training, thereby improving the robustness of the neural network model.
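The control flow of steps S100 to S500 is a loop that keeps expanding the dataset until the performance threshold is reached. In the sketch below, `acquire` stands in for steps S100 to S300 and `evaluate` for the performance detection in S500; both callables, the function name, and the `max_rounds` cap are assumptions (the text describes no iteration limit, but a cap prevents an endless loop if the threshold is never reached).

```python
def expand_until_threshold(dataset, acquire, evaluate, threshold, max_rounds=10):
    """Repeat acquire -> expand -> evaluate until the score reaches the threshold.

    acquire(data) returns a list of new samples (stand-in for S100 to S300).
    evaluate(data) returns a performance score (stand-in for S500).
    Returns the expanded data, the last score, and the number of rounds run.
    """
    data = list(dataset)
    score = None
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        data.extend(acquire(data))  # S400: expand the dataset
        score = evaluate(data)      # S500: detect performance
        if score >= threshold:
            break                   # threshold reached: end
    return data, score, rounds
```

In a real system `evaluate` would retrain or fine-tune the model on the expanded data and measure test accuracy, which is by far the most expensive step in the loop.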
The seventh embodiment of the present invention, as shown in Figure 7:
An acquisition system for a neural network training set, comprising:
an information acquisition module 110, which obtains the category information of the image set to be screened;
an image screening module 120, which screens target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a data set acquisition module 130, which performs adversarial training on the target image data to obtain the sample data set.
Specifically, this embodiment is the system embodiment corresponding to the above method embodiment; for its specific effects, refer to the method embodiment. They are not repeated here.
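The data flow between modules 110, 120 and 130 can be sketched as a small Python class. This is only an illustration under stated assumptions: the class name `AcquisitionSystem` and its three callable fields are hypothetical, and the actual behavior of each module is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AcquisitionSystem:
    """Hypothetical wiring of the three modules of the embodiment."""

    get_category: Callable[[], str]        # stands in for module 110
    screen_images: Callable[[str], List]   # stands in for module 120
    build_samples: Callable[[List], List]  # stands in for module 130

    def run(self) -> List:
        category = self.get_category()          # category of the set to be screened
        targets = self.screen_images(category)  # target image data set
        return self.build_samples(targets)      # sample data set


# Usage with stub callables in place of real modules.
system = AcquisitionSystem(
    get_category=lambda: "cat",
    screen_images=lambda category: [category + "_1", category + "_2"],
    build_samples=lambda images: [name + "_aug" for name in images],
)
samples = system.run()
```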
The eighth embodiment of the present invention, as shown in Figure 8:
An acquisition system for a neural network training set, comprising:
an information acquisition module 110, which obtains the category information of the image set to be screened;
an image screening module 120, which screens target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a data set acquisition module 130, which performs adversarial training on the target image data to obtain the sample data set.
Preferably, the system further comprises:
an accuracy detection module 010, which detects the test accuracy corresponding to the image set of each category;
an accuracy judgment module 020, which judges whether the test accuracy of the current image set is lower than the preset performance threshold;
an image marking module 030, which marks the current image set as the image set to be screened;
the accuracy judgment module 020 further switches to the test accuracy of the next image set and judges it, until all image sets have been judged.
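The work of modules 010, 020 and 030 can be sketched in Python as follows. The function names and the representation of test results as parallel lists of true and predicted labels are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def per_class_accuracy(labels: Sequence[str], predictions: Sequence[str]) -> Dict[str, float]:
    """Module 010 (sketch): test accuracy of the image set of each category."""
    total: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for truth, predicted in zip(labels, predictions):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    return {category: correct[category] / total[category] for category in total}


def select_sets_to_screen(accuracy: Dict[str, float], threshold: float) -> List[str]:
    """Modules 020 and 030 (sketch): mark every category below the threshold."""
    to_screen = []
    for category, value in accuracy.items():  # switch to the next image set
        if value < threshold:                 # module 020: compare with threshold
            to_screen.append(category)        # module 030: mark as to be screened
    return to_screen
```

For example, with true labels `["cat", "cat", "dog", "dog"]`, predictions `["cat", "dog", "dog", "dog"]` and a threshold of 0.9, only the `"cat"` image set is marked as to be screened.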
Preferably, the image screening module 120 comprises:
an image data acquisition unit 121, which acquires image data belonging to the category information of the image set to be screened;
a sample image acquisition unit 122, which obtains the sample image corresponding to the category information of the image set to be screened;
an image data screening unit 123, which judges, according to the sample image, whether each piece of image data satisfies the preset crawler strategy, marks the image data according to the judgment result, and obtains all image data marked as satisfying the strategy, thereby obtaining the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword and a preset similarity.
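A screening step based on a digest value, keywords and a similarity threshold could look like the Python sketch below. The text here does not state how the three criteria are combined, so several points are assumptions: the digest is used to skip images already known (SHA-256 is chosen arbitrarily), at least one keyword must appear in a textual description of the image, and similarity is the cosine similarity between feature vectors of the candidate and the sample image. The names `CrawlerStrategy`, `Candidate`, `meets_strategy` and `screen` are hypothetical.

```python
import hashlib
import math
from dataclasses import dataclass, field
from typing import List, Sequence, Set


@dataclass
class CrawlerStrategy:
    known_digests: Set[str] = field(default_factory=set)  # assumed: digests to skip
    keywords: Sequence[str] = ()                           # preset keywords
    min_similarity: float = 0.8                            # preset similarity


@dataclass
class Candidate:
    raw_bytes: bytes            # encoded image content
    description: str            # assumed: text found alongside the image
    features: Sequence[float]   # assumed: feature vector of the image


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def meets_strategy(candidate: Candidate, sample_features: Sequence[float],
                   strategy: CrawlerStrategy) -> bool:
    digest = hashlib.sha256(candidate.raw_bytes).hexdigest()
    if digest in strategy.known_digests:  # digest check
        return False
    text = candidate.description.lower()
    if strategy.keywords and not any(k.lower() in text for k in strategy.keywords):
        return False                      # keyword check
    similarity = cosine_similarity(candidate.features, sample_features)
    return similarity >= strategy.min_similarity  # similarity check


def screen(candidates: Sequence[Candidate], sample_features: Sequence[float],
           strategy: CrawlerStrategy) -> List[Candidate]:
    """Keep only the candidates marked as satisfying the strategy."""
    return [c for c in candidates if meets_strategy(c, sample_features, strategy)]
```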
Preferably, the data set acquisition module 130 comprises:
a pixel transformation data augmentation unit 131, which performs pixel transformation on the target image data and takes the image data after pixel transformation as the sample data set;
a geometric transformation data augmentation unit 132, which performs geometric transformation on the target image data and takes the image data after geometric transformation as the sample data set.
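The two kinds of augmentation handled by units 131 and 132 can be illustrated in plain Python. The specific transformations chosen here (a brightness shift as the pixel transformation, a horizontal flip and a 90-degree rotation as geometric transformations) and the representation of an image as rows of grayscale values are assumptions; the text only names the two categories of transformation.

```python
from typing import List

Image = List[List[int]]  # assumed: grayscale image as rows of values in 0..255


def adjust_brightness(image: Image, delta: int) -> Image:
    """Pixel transformation: shift every pixel value, clamped to 0..255."""
    return [[min(255, max(0, pixel + delta)) for pixel in row] for row in image]


def flip_horizontal(image: Image) -> Image:
    """Geometric transformation: mirror each row."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image: Image) -> Image:
    """Geometric transformation: rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def augment(images: List[Image], delta: int = 30) -> List[Image]:
    """Build a sample data set from the target image data."""
    samples: List[Image] = []
    for image in images:
        samples.append(adjust_brightness(image, delta))  # unit 131 (sketch)
        samples.append(flip_horizontal(image))           # unit 132 (sketch)
        samples.append(rotate_90_clockwise(image))       # unit 132 (sketch)
    return samples
```

Each target image yields three augmented images in this sketch, so the sample data set is three times the size of the target image data set.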
Preferably, the system further comprises:
a data processing module 140, which expands the data set of the neural network model according to the sample data set;
a performance judgment module 150, which tests the performance of the expanded neural network model and judges whether the performance reaches the preset performance threshold;
the information acquisition module 110 further reacquires the category information of the image set to be screened when the performance does not reach the preset performance threshold; through the image screening module 120 and the data set acquisition module 130, a new target image data set is then obtained and adversarial training is performed on it to obtain a new sample data set.
Specifically, this embodiment is the system embodiment corresponding to the above method embodiment; for its specific effects, refer to the method embodiment. They are not repeated here.
It should be noted that the above embodiments can be freely combined as needed. The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for acquiring a neural network training set, characterized by comprising the steps of:
S100: obtaining the category information of the image set to be screened;
S200: screening target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
S300: performing adversarial training on the target image data to obtain the sample data set.
2. The method for acquiring a neural network training set according to claim 1, characterized in that the following steps are included before step S100:
S010: detecting the test accuracy corresponding to the image set of each category;
S020: judging whether the test accuracy of the current image set is lower than the preset performance threshold; if so, executing step S030; otherwise, executing step S040;
S030: marking the current image set as the image set to be screened;
S040: switching to the test accuracy of the next image set and judging it, until all image sets have been judged.
3. The method for acquiring a neural network training set according to claim 1, characterized in that step S200 comprises the steps of:
S210: acquiring image data belonging to the category information of the image set to be screened;
S220: obtaining the sample image corresponding to the category information of the image set to be screened;
S230: judging, according to the sample image, whether each piece of image data satisfies the preset crawler strategy, and marking the image data according to the judgment result;
S240: obtaining all image data marked as satisfying the strategy, thereby obtaining the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword and a preset similarity.
4. The method for acquiring a neural network training set according to claim 1, characterized in that step S300 comprises the steps of:
S310: performing pixel transformation on the target image data, and taking the image data after pixel transformation as the sample data set;
S320: performing geometric transformation on the target image data, and taking the image data after geometric transformation as the sample data set.
5. The method for acquiring a neural network training set according to any one of claims 1-4, characterized in that the following steps are included after step S300:
S400: expanding the data set of the neural network model according to the sample data set;
S500: testing the performance of the expanded neural network model and judging whether the performance reaches the preset performance threshold; if so, ending; otherwise, returning to step S100.
6. An acquisition system for a neural network training set, characterized by comprising:
an information acquisition module, which obtains the category information of the image set to be screened;
an image screening module, which screens target images to obtain the target image data set corresponding to the image set to be screened; a target image is image data whose similarity to the sample image reaches a preset similarity threshold; the sample image is the template image corresponding to the category information of the image set to be screened;
a data set acquisition module, which performs adversarial training on the target image data to obtain the sample data set.
7. The acquisition system for a neural network training set according to claim 6, characterized by further comprising:
an accuracy detection module, which detects the test accuracy corresponding to the image set of each category;
an accuracy judgment module, which judges whether the test accuracy of the current image set is lower than the preset performance threshold;
an image marking module, which marks the current image set as the image set to be screened;
the accuracy judgment module further switches to the test accuracy of the next image set and judges it, until all image sets have been judged.
8. The acquisition system for a neural network training set according to claim 6, characterized in that the image screening module comprises:
an image data acquisition unit, which acquires image data belonging to the category information of the image set to be screened;
a sample image acquisition unit, which obtains the sample image corresponding to the category information of the image set to be screened;
an image data screening unit, which judges, according to the sample image, whether each piece of image data satisfies the preset crawler strategy, marks the image data according to the judgment result, and obtains all image data marked as satisfying the strategy, thereby obtaining the target image data set corresponding to the image set to be screened;
wherein the preset crawler strategy includes a preset digest value, a preset keyword and a preset similarity.
9. The acquisition system for a neural network training set according to claim 6, characterized in that the data set acquisition module comprises:
a pixel transformation data augmentation unit, which performs pixel transformation on the target image data and takes the image data after pixel transformation as the sample data set;
a geometric transformation data augmentation unit, which performs geometric transformation on the target image data and takes the image data after geometric transformation as the sample data set.
10. The acquisition system for a neural network training set according to any one of claims 6-9, characterized by further comprising:
a data processing module, which expands the data set of the neural network model according to the sample data set;
a performance judgment module, which tests the performance of the expanded neural network model and judges whether the performance reaches the preset performance threshold;
the information acquisition module further reacquires the category information of the image set to be screened when the performance does not reach the preset performance threshold; through the image screening module and the data set acquisition module, a new target image data set is then obtained and adversarial training is performed on it to obtain a new sample data set.

Priority Applications (1)

Application Number | Publication Number | Priority Date | Filing Date | Title
CN201810438759.6A | CN108596338A (en) | 2018-05-09 | 2018-05-09 | Method and system for acquiring a neural network training set

Publications (1)

Publication Number | Publication Date
CN108596338A (en) | 2018-09-28

Family

ID=63636043

Family Applications (1)

Application Number | Status | Publication Number | Priority Date | Filing Date | Title
CN201810438759.6A | Pending | CN108596338A (en) | 2018-05-09 | 2018-05-09 | Method and system for acquiring a neural network training set

Country Status (1)

Country Link
CN (1) CN108596338A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180928