CN110162649A - Sample data acquisition method, acquisition system, server and computer-readable medium - Google Patents

Sample data acquisition method, acquisition system, server and computer-readable medium

Info

Publication number
CN110162649A
Authority
CN
China
Prior art keywords
picture
sub-sample
sample picture
mother sample
mother
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910441621.6A
Other languages
Chinese (zh)
Other versions
CN110162649B (en)
Inventor
杨大陆
孙旭
杨叶辉
王磊
许言午
黄艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910441621.6A
Publication of CN110162649A
Application granted
Publication of CN110162649B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a sample data acquisition method, comprising: constructing a mother sample picture database; sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets; for each mother sample picture set, extracting a plurality of sub-sample pictures from each mother sample picture in the set using a selection box of predetermined size, and assigning a preliminary class label to each sub-sample picture, so as to obtain the sub-sample picture set corresponding to each mother sample picture set; for each sub-sample picture set, training the sample classification model corresponding to that set, using all sub-sample pictures in the set as training sample data; and, for each sub-sample picture, inputting the sub-sample picture into each of the sample classification models and selecting the most frequent classification result as the calibrated class label of the sub-sample picture.

Description

Sample data acquisition method, acquisition system, server and computer-readable medium
Technical field
The present disclosure relates to the field of deep learning, and in particular to a sample data acquisition method, an acquisition system, a server, and a computer-readable medium.
Background art
When a detection model for a particular task is trained using deep learning techniques, a large amount of training sample data with calibrated class labels must be collected in advance.
In practice, however, it has been found that for some special tasks it is difficult to collect a large number of labeled small-size samples (samples with calibrated class labels). For example, in a lesion detection task for fundus pictures, for the trained detection model to detect whether a lesion is present in a fundus picture and to localize the lesion when one is detected, large batches of lesion-labeled sample data at the small-size (patch) level or pixel level are needed. At present, such samples can only be collected by manually selecting and labeling regions in fundus pictures. This manual approach has the following problems: 1) because lesion morphology is complex, doctors' annotations are highly subjective and boundary delineation is rather arbitrary, so that even professional ophthalmologists find it hard to decide which class a lesion-boundary pixel belongs to; that is, labeling is difficult; 2) manual labeling by doctors is time-consuming and obtaining calibrated class labels is costly; that is, large numbers of samples are hard to obtain.
Summary of the invention
The present disclosure aims to solve at least one of the technical problems in the prior art, and proposes a sample data acquisition method, an acquisition system, a server, and a computer-readable medium.
In a first aspect, an embodiment of the present disclosure provides a sample data acquisition method, comprising:
constructing a mother sample picture database, the mother sample picture database comprising a plurality of mother sample pictures with calibrated class labels;
sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each mother sample picture set comprising a plurality of mother sample pictures;
for each mother sample picture set, extracting a plurality of sub-sample pictures from each mother sample picture in the set using a selection box of predetermined size, and assigning a preliminary class label to each sub-sample picture, so as to obtain the sub-sample picture set corresponding to each mother sample picture set, wherein the preliminary class label of a sub-sample picture is the calibrated class label of the mother sample picture it was extracted from, and the size of the selection box is smaller than the size of the mother sample picture;
for each sub-sample picture set, training the sample classification model corresponding to that set, using all sub-sample pictures in the set and their preliminary class labels as training sample data;
for each sub-sample picture in each sub-sample picture set, inputting the sub-sample picture into each sample classification model so that each model outputs a classification result, and selecting the most frequent classification result as the calibrated class label of the sub-sample picture.
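The final labeling step amounts to a majority vote over the trained classification models. The following is a minimal sketch of that vote, not the patented implementation: the names `vote_label` and `calibrate` are hypothetical, each classifier is assumed to be a plain callable that returns a label, and the tie-breaking rule (first label encountered wins) is an assumption because the source does not specify one.

```python
from collections import Counter


def vote_label(predictions):
    """Return the most frequent label among the classifiers' outputs.

    Ties go to the label encountered first (an assumption; the source
    does not say how ties are resolved).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


def calibrate(sub_pictures, classifiers):
    """Give each sub-sample picture the majority label over all classifiers."""
    return [
        vote_label([classify(picture) for classify in classifiers])
        for picture in sub_pictures
    ]
```

Any model type could stand behind the callables; only the vote itself is shown here.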
In some embodiments, the mother sample picture is square;
the selection box is square;
the ratio of the side length of the selection box to the side length of the mother sample picture equals a first predetermined coefficient q, where 0 < q < 1.
In some embodiments, after the step of, for each sub-sample picture in each sub-sample picture set, inputting the sub-sample picture into each sample classification model so that each model outputs a classification result, and selecting the most frequent classification result as the calibrated class label of the sub-sample picture, the method further comprises:
determining whether the side length of the sub-sample picture is less than or equal to a predetermined length threshold;
if the side length of the sub-sample picture is less than or equal to the predetermined length threshold, ending the process;
if the side length of the sub-sample picture is greater than the predetermined length threshold, using the sub-sample pictures with calibrated class labels as new mother sample pictures to construct a new mother sample picture database, and returning, on the basis of the new mother sample picture database, to the above step of sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets.
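Because each round crops squares whose side is q times the side of the current mother sample pictures, the side-length check above determines how many rounds run. The sketch below lists the sub-sample side length produced by each round; the function name `side_length_schedule`, the rounding down to whole pixels, and the threshold of 100 pixels used in the example are assumptions, not values from the source.

```python
def side_length_schedule(mother_side, q, length_threshold):
    """List the sub-sample side length (in pixels) produced by each round.

    Iteration stops after the first round whose sub-sample side length is
    less than or equal to length_threshold.
    """
    if not 0 < q < 1:
        raise ValueError("q must satisfy 0 < q < 1")
    if length_threshold < 1:
        raise ValueError("length_threshold must be at least 1 pixel")
    sides = []
    side = mother_side
    while True:
        side = int(side * q)  # rounding down to whole pixels is an assumption
        sides.append(side)
        if side <= length_threshold:
            return sides


# Example: 1600-pixel mother pictures, q = 0.5, hypothetical threshold of 100
print(side_length_schedule(1600, 0.5, 100))  # [800, 400, 200, 100]
```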
In some embodiments, after the step of, for each sub-sample picture in each sub-sample picture set, inputting the sub-sample picture into each sample classification model so that each model outputs a classification result, and selecting the most frequent classification result as the calibrated class label of the sub-sample picture, the method further comprises:
monitoring whether the cumulative number of times that step has been executed has reached a predetermined count threshold;
if the cumulative number of executions has not reached the predetermined count threshold, using the sub-sample pictures with calibrated class labels as new mother sample pictures to construct a new mother sample picture database, and returning, on the basis of the new mother sample picture database, to the above step of sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets;
if the cumulative number of executions has reached the predetermined count threshold, ending the process.
In some embodiments, the first predetermined coefficient q satisfies 0.5 ≤ q ≤ 0.7.
In some embodiments, in the step of sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, every mother sample picture set contains the same number of mother sample pictures;
the ratio of the number of mother sample pictures in one mother sample picture set to the number of mother sample pictures in the mother sample picture database equals a second predetermined coefficient p, where 0 < p < 1.
In some embodiments, the second predetermined coefficient p satisfies 0.4 ≤ p ≤ 0.6.
In some embodiments, in the step of extracting a plurality of sub-sample pictures from each mother sample picture in the mother sample picture set using a selection box of predetermined size, the number of sub-sample pictures extracted from one mother sample picture is a predetermined number N;
where N is a positive integer and 3 ≤ N ≤ 10.
In some embodiments, the step of constructing the mother sample picture database comprises:
collecting a plurality of original sample pictures with calibrated class labels;
resizing the original sample pictures so that they all have a uniform size;
using the resized original sample pictures as mother sample pictures to construct the mother sample picture database.
In a second aspect, an embodiment of the present disclosure further provides a sample data acquisition system, comprising:
a first construction module, configured to construct a mother sample picture database, the mother sample picture database comprising a plurality of mother sample pictures with calibrated class labels;
a sampling module, configured to sample the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each mother sample picture set comprising a plurality of mother sample pictures;
an extraction module, configured to, for each mother sample picture set, extract a plurality of sub-sample pictures from each mother sample picture in the set using a selection box of predetermined size, and assign a preliminary class label to each sub-sample picture, so as to obtain the sub-sample picture set corresponding to each mother sample picture set, wherein the preliminary class label of a sub-sample picture is the calibrated class label of the mother sample picture it was extracted from, and the size of the selection box is smaller than the size of the mother sample picture;
a training module, configured to, for each sub-sample picture set, train the sample classification model corresponding to that set, using all sub-sample pictures in the set and their preliminary class labels as training sample data;
a processing module, configured to, for each sub-sample picture in each sub-sample picture set, input the sub-sample picture into each sample classification model so that each model outputs a classification result, and select the most frequent classification result as the calibrated class label of the sub-sample picture.
In some embodiments, the mother sample picture is square;
the selection box is square;
the ratio of the side length of the selection box to the side length of the mother sample picture equals a first predetermined coefficient q, where 0 < q < 1.
In some embodiments, the system further comprises:
a judgment module, configured to determine, after the processing module has determined the calibrated class label of each sub-sample picture in each sub-sample picture set, whether the side length of the sub-sample picture is less than or equal to a predetermined length threshold;
a second construction module, configured to, when the judgment module determines that the side length of the sub-sample picture is greater than the predetermined length threshold, use the sub-sample pictures with calibrated class labels as new mother sample pictures to construct a new mother sample picture database, and control the sampling module to continue the corresponding processing on the basis of the new mother sample picture database;
a first control module, configured to, when the judgment module determines that the side length of the sub-sample picture is less than or equal to the predetermined length threshold, control the sample data acquisition system to stop working.
In some embodiments, the system further comprises:
a monitoring module, configured to monitor, after the processing module has determined the calibrated class label of each sub-sample picture in each sub-sample picture set, whether the cumulative number of executions by the processing module has reached a predetermined count threshold;
a third construction module, configured to, when the monitoring module finds that the cumulative number of executions has not reached the predetermined count threshold, use the sub-sample pictures with calibrated class labels as new mother sample pictures to construct a new mother sample picture database, and control the sampling module to continue the corresponding processing on the basis of the new mother sample picture database;
a second control module, configured to, when the monitoring module finds that the cumulative number of executions has reached the predetermined count threshold, control the sample data acquisition system to stop working.
In some embodiments, the first predetermined coefficient q satisfies 0.5 ≤ q ≤ 0.7.
In some embodiments, when the sampling module samples the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, every mother sample picture set contains the same number of mother sample pictures;
the ratio of the number of mother sample pictures in one mother sample picture set to the number of mother sample pictures in the mother sample picture database equals a second predetermined coefficient p, where 0 < p < 1.
In some embodiments, the second predetermined coefficient p satisfies 0.4 ≤ p ≤ 0.6.
In some embodiments, when the extraction module extracts a plurality of sub-sample pictures from each mother sample picture in the mother sample picture set using a selection box of predetermined size, the number of sub-sample pictures extracted from one mother sample picture is a predetermined number N;
where N is a positive integer and 3 ≤ N ≤ 10.
In some embodiments, the first construction module comprises:
an acquisition unit, configured to collect a plurality of original sample pictures with calibrated class labels;
a resizing unit, configured to resize the original sample pictures so that they all have a uniform size;
a construction unit, configured to use the resized original sample pictures as mother sample pictures to construct the mother sample picture database.
In a third aspect, an embodiment of the present disclosure further provides a server, comprising:
one or more processors;
a storage device on which one or more programs are stored;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method provided by any of the foregoing embodiments.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method provided by any of the foregoing embodiments.
The present disclosure has the following beneficial effects:
Embodiments of the present disclosure provide a sample data acquisition method that can extract a large number of small-size sample pictures from large-size sample pictures and label those small-size sample pictures automatically.
Brief description of the drawings
Fig. 1 is a flowchart of a sample data acquisition method provided by an embodiment of the present disclosure;
Fig. 2 is a flowchart of one specific implementation of step S1 in the present disclosure;
Fig. 3 is a flowchart of another sample data acquisition method provided by an embodiment of the present disclosure;
Fig. 4 is a flowchart of yet another sample data acquisition method provided by an embodiment of the present disclosure;
Fig. 5 is a structural block diagram of a sample data acquisition system provided by an embodiment of the present disclosure;
Fig. 6 is a structural block diagram of the first construction module in the present disclosure;
Fig. 7 is a structural block diagram of another sample data acquisition system provided by an embodiment of the present disclosure;
Fig. 8 is a structural block diagram of yet another sample data acquisition system provided by an embodiment of the present disclosure.
Detailed description
To help those skilled in the art better understand the technical solution of the present disclosure, the sample data acquisition method, acquisition system, server, and computer-readable medium provided by the present disclosure are described in detail below with reference to the accompanying drawings.
Example embodiments are described more fully below with reference to the accompanying drawings, but they may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terminology used herein is for describing particular embodiments only and is not intended to limit the present disclosure. As used herein, the singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the terms "comprising" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It will be understood that, although the terms first, second, and so on may be used herein to describe various elements, these elements should not be limited by these terms. The terms are only used to distinguish one element from another. Thus, a first element, first component, or first part discussed below could be termed a second element, second component, or second part without departing from the teachings of the present disclosure.
Embodiments described herein may be described with reference to plan views and/or cross-sectional views by way of idealized schematic diagrams of the present disclosure. Accordingly, the example illustrations may be modified depending on manufacturing techniques and/or tolerances. The embodiments are therefore not limited to those shown in the drawings, but include modifications of configurations formed on the basis of manufacturing processes. The regions illustrated in the figures are thus schematic in nature, and their shapes illustrate specific shapes of regions of elements but are not intended to be limiting.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
"Sample" in the present disclosure always refers to a picture sample. The sample data acquisition method of the present disclosure is used to obtain labeled sample data for a predetermined task; such labeled sample data may be positive sample data or negative sample data for the predetermined task. The predetermined task may be any task to which deep learning techniques are applicable, such as a segmentation task, a classification task, a localization task, or a recognition task; the technical solution of the present disclosure does not limit the specific type of the predetermined task.
In addition, "labeled sample data" in the present disclosure refers to picture samples with calibrated class labels; the types and number of calibrated class labels are set manually in advance according to the specific predetermined task. For example, if the predetermined task is lesion detection in fundus pictures, the calibrated class labels may be set to two classes, "lesion" sample and "non-lesion" sample. The labels may of course be refined as needed so that the subsequently trained detection model can identify the specific type of lesion; for example, the calibrated class labels may be set to multiple classes such as "hemorrhage-spot lesion" sample, "exudate lesion" sample, "cotton-wool-spot lesion" sample, ... and "non-lesion" sample. It should be noted that the technical solution of the present disclosure does not limit the types or number of calibrated class labels.
Fig. 1 is a flowchart of a sample data acquisition method provided by an embodiment of the present disclosure. As shown in Fig. 1, the method comprises:
Step S1: construct a mother sample picture database, the mother sample picture database comprising a plurality of mother sample pictures with calibrated class labels.
Fig. 2 is a flowchart of one specific implementation of step S1 in the present disclosure. As shown in Fig. 2, step S1 comprises:
Step S101: collect a plurality of original sample pictures with calibrated class labels.
In the present disclosure, an original sample picture is a large-size sample picture that has already been labeled (has a calibrated class label) for the predetermined task and has not undergone any processing. Furthermore, in practical applications, positive sample data for the predetermined task are far harder to obtain and far more important than negative sample data, so as much positive sample data as possible should be obtained. To this end, the original sample pictures should, as far as possible, be pictures whose calibrated class label corresponds to a positive sample.
To help those skilled in the art better understand the technical solution of the present disclosure, an exemplary description follows in which the predetermined task is lesion detection in fundus pictures and the preset calibrated class labels comprise the two classes "lesion" and "non-lesion". Pictures labeled "lesion" serve as positive samples, and pictures labeled "non-lesion" serve as negative samples. Those skilled in the art will understand that this setting is merely exemplary and does not limit the technical solution of the present disclosure.
In step S101, large-size fundus pictures that already have calibrated class labels (which may be labeled manually in advance) may be used as original sample pictures. To obtain as many positive samples as possible from the sample data acquisition method provided by the present disclosure, the original sample pictures are selected from large-size fundus pictures whose calibrated class label is "lesion".
Step S102: resize the original sample pictures so that they all have a uniform size.
In step S102, because different original sample pictures may differ in size, the original sample pictures are resized to a uniform size so that they can subsequently be processed in a uniform way.
Taking fundus images as an example: a fundus image is usually wider than it is tall, so the left and right sides of the image may first be cropped so that the image becomes square. The cropped fundus images are then resized to a uniform set size, which can be designed and adjusted according to the actual situation. In one optional embodiment, the resized fundus image is square with size H × H, where H = 1600 pixels.
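The crop-then-resize preprocessing can be sketched as a pair of coordinate calculations. This is an illustration only: the function names are hypothetical, and trimming equal margins from the left and right (a centered crop) is an assumption, since the source only says that both sides are cropped. The actual pixel operations would be delegated to an image library of the reader's choice.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of a centered square crop.

    For a picture wider than it is tall, equal margins are trimmed from the
    left and right; a taller-than-wide picture is trimmed top and bottom.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def resize_scale(width, height, target_side=1600):
    """Scale factor that maps the cropped square onto target_side pixels."""
    return target_side / min(width, height)


# Example: a 3000 x 2000 fundus picture
print(center_square_box(3000, 2000))  # (500, 0, 2500, 2000)
print(resize_scale(3000, 2000))       # 0.8
```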
Resizing pictures to a uniform size is a routine technique in the art and is not described in detail here.
Step S103: use the resized original sample pictures as mother sample pictures to construct the mother sample picture database.
In step S103, the resized original sample pictures are used as mother sample pictures to construct the mother sample picture database, which comprises a plurality of mother sample pictures with calibrated class labels.
It should be noted that resizing the original sample pictures to a uniform size is a preferred embodiment of the present disclosure. It allows the original sample pictures to be processed uniformly afterwards and improves processing efficiency, and it does not limit the technical solution of the present disclosure.
Step S2: sample the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each mother sample picture set comprising a plurality of mother sample pictures.
In step S2, the mother sample picture database may be sampled multiple times by random sampling or by rule-based sampling, and the sampling may be with or without replacement. Each sampling collects a plurality of mother sample pictures, which together constitute one mother sample picture set.
In one specific optional scheme, the mother sample picture database is sampled multiple times by random sampling with replacement, and each sampling collects the same number of mother sample pictures. It should be noted that mother sample picture sets obtained in this way may overlap with one another.
Further, let C denote the number of mother sample pictures in the mother sample picture database. Each mother sample picture set may then contain p*C mother sample pictures; that is, the ratio of the number of mother sample pictures in one set to the number in the database is p, where 0 < p < 1. The specific value of p can be designed and adjusted according to the actual situation.
It should be noted that the larger p is, the more mother sample pictures two different sets share and the smaller the difference between them, which is unfavorable for the training in subsequent step S4 and the labeling in step S5. Conversely, the smaller p is, the fewer mother sample pictures each set contains, so fewer samples are finally obtained by the sample data acquisition method. Taking these factors together, the present disclosure prefers 0.4 ≤ p ≤ 0.6, and further prefers p = 0.5.
In addition, let M denote the number of mother sample picture sets obtained in step S2. M is a preset positive integer greater than or equal to 2, and its specific value can be designed and adjusted according to the actual situation.
In the present disclosure, so that each mother sample picture in the database is, as far as possible, sampled into at least one mother sample picture set, the value of M*p should be greater than 1. The larger M*p is, the more likely each mother sample picture in the database is to be sampled into some set; however, a larger M*p also increases the subsequent processing load of the system. Taking these factors together, the present disclosure prefers 1 < M*p < 10.
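The sampling of step S2 can be illustrated with a minimal sketch. It assumes that the pictures inside one set are distinct while the M sets are drawn independently from the full database, which is only one possible reading of the sampling scheme; the function names and the use of list indices in place of pictures are hypothetical. Under that assumption, a given mother sample picture falls into at least one set with probability 1 - (1 - p)^M.

```python
import random

def sample_mother_sets(num_pictures, p, num_sets, seed=0):
    """Draw num_sets index sets, each holding round(p * num_pictures) distinct indices."""
    rng = random.Random(seed)
    set_size = round(p * num_pictures)
    # Every set is drawn from the full database again, so different sets may overlap.
    return [sorted(rng.sample(range(num_pictures), set_size))
            for _ in range(num_sets)]

def coverage_probability(p, num_sets):
    """Chance that one given picture lands in at least one set (assumed model)."""
    return 1.0 - (1.0 - p) ** num_sets

sets = sample_mother_sets(num_pictures=100, p=0.5, num_sets=4)
print(len(sets), len(sets[0]))
print(coverage_probability(0.5, 4))
```

With p = 0.5, raising M increases coverage quickly, which is consistent with the stated preference for M*p > 1, at the cost of training more models later.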
Step S3: for each mother sample picture set, use a selection box of predetermined size to extract multiple sub-sample pictures from each mother sample picture in the set, and assign a preliminary label to every sub-sample picture, so as to obtain the sub-sample picture set corresponding to each mother sample picture set. The preliminary label of a sub-sample picture is the calibrated label of the mother sample picture it belongs to, and the selection box is smaller than the mother sample picture.
In step S3, when multiple sub-sample pictures are extracted from a mother sample picture with the selection box, the extraction may be random or follow a certain rule; both fall within the protection scope of the present disclosure. In addition, some of the sub-sample pictures extracted from one mother sample picture may partially overlap, which does not affect the technical solution of the present disclosure.
Optionally, both the mother sample pictures and the selection box are square. Assuming the side length of a mother sample picture is H, the side length of the preset selection box may be q*H; that is, the ratio of the side length of the selection box to that of the mother sample picture equals a first predetermined coefficient q, where 0 < q < 1. The specific value of q can be designed and adjusted according to the actual situation.
It should be noted that, for a given side length of the mother sample pictures, a larger q means a larger selection box and larger sub-sample pictures, which makes it hard to meet the user's need for small-size samples; a smaller q means a smaller selection box, which is less likely to capture a positive sample. Taking these factors together, the present disclosure prefers 0.5 ≤ q ≤ 0.7, and further prefers q = 0.6.
For convenience of description, assume that N sub-sample pictures are extracted from each mother sample picture, where N is a preset positive integer greater than 1. The mother sample pictures in one mother sample picture set then yield N*p*C sub-sample pictures in total, which constitute one sub-sample picture set. Step S3 therefore produces M sub-sample picture sets, each containing N*p*C sub-sample pictures.
Optionally, the predetermined quantity N satisfies 3 ≤ N ≤ 10.
Each extracted sub-sample picture is assigned a corresponding preliminary label, namely the calibrated label of the mother sample picture it belongs to.
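The extraction of step S3 can be sketched as follows, assuming random placement of the selection box and a picture stored as a list of H rows of H pixel values. The function name, the rounding of q*H to a whole pixel, and the toy picture are illustrative assumptions, not part of the disclosure.

```python
import random

def extract_sub_samples(picture, label, q, n, rng):
    """Crop n square windows of side round(q * H) from an H x H picture.

    Every crop inherits the mother picture's label as its preliminary label.
    """
    side = len(picture)
    box = round(q * side)
    crops = []
    for _ in range(n):
        # Random top-left corner chosen so that the box stays inside the picture.
        top = rng.randint(0, side - box)
        left = rng.randint(0, side - box)
        crop = [row[left:left + box] for row in picture[top:top + box]]
        crops.append((crop, label))
    return crops

rng = random.Random(0)
mother = [[r * 10 + c for c in range(10)] for r in range(10)]
subs = extract_sub_samples(mother, label=1, q=0.6, n=3, rng=rng)
print(len(subs), len(subs[0][0]), len(subs[0][0][0]))
```

Because the corners are drawn independently, crops from the same mother sample picture may overlap, which the description above explicitly allows.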
Step S4: for each sub-sample picture set, use all the sub-sample pictures in the set, together with their preliminary labels, as training sample data to train the sample classification model corresponding to that sub-sample picture set.
In step S4, a sample classification model corresponding to a sub-sample picture set can be trained with deep learning techniques from all the sub-sample pictures in the set and their preliminary labels. The sample classification model can be used to classify an input sample. It should be noted that training a model from samples with deep learning techniques is routine in the art and is not described in detail here.
Through step S4, M sample classification models can be trained, in one-to-one correspondence with the M sub-sample picture sets.
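The one-model-per-set structure of step S4 can be shown with a deliberately small stand-in. The sketch below swaps the deep-learning classifier for a nearest-centroid classifier on flat feature vectors purely to keep the example self-contained; the function names and toy data are hypothetical, and a real system would train a neural network at this point.

```python
def train_centroid_model(samples):
    """Fit a nearest-centroid classifier on (feature_vector, label) pairs.

    Stand-in for the deep-learning model; not the method of the disclosure.
    """
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

    def predict(features):
        def sq_dist(center):
            return sum((a - b) ** 2 for a, b in zip(features, center))
        # Return the label whose centroid is closest to the input.
        return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

    return predict

def train_models(sub_sample_sets):
    """Train one model per sub-sample picture set, M models in total."""
    return [train_centroid_model(s) for s in sub_sample_sets]

toy_sets = [
    [([0.0, 0.1], 0), ([1.0, 0.9], 1)],
    [([0.2, 0.0], 0), ([0.8, 1.0], 1)],
]
models = train_models(toy_sets)
print(len(models), models[0]([0.9, 0.9]), models[1]([0.1, 0.1]))
```

Each model sees a different sub-sample picture set, so the models can disagree on a given input; step S5 exploits that disagreement.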
Step S5: for each sub-sample picture in each sub-sample picture set, input the sub-sample picture into every sample classification model so that each model outputs a classification result, and choose the most frequent classification result as the calibrated label of the sub-sample picture.
As follows from step S3, the M sub-sample picture sets contain M*N*p*C sub-sample pictures of size q*H × q*H in total. In step S5, each of these M*N*p*C sub-sample pictures is input into the M sample classification models to obtain M classification results; the M results are tallied by class, and the most frequent one is chosen as the calibrated label of the sub-sample picture. Step S5 thus assigns a calibrated label to each of the M*N*p*C sub-sample pictures, that is, it labels the sub-sample pictures automatically.
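The voting in step S5 amounts to a majority vote over the M model outputs. A minimal sketch follows, assuming each model is a callable that returns a label; the source does not say how ties are broken, so the tie behavior noted in the comment is an assumption of this sketch.

```python
from collections import Counter

def vote_label(sub_sample, models):
    """Return the label predicted most often across all models."""
    predictions = [model(sub_sample) for model in models]
    # Counter.most_common lists equal counts in first-encountered order, so a
    # tie falls to the earliest model's prediction (an assumption, not from
    # the source).
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical models: two vote for label 1, one for label 0.
toy_models = [lambda x: 1, lambda x: 0, lambda x: 1]
voted = vote_label("any sub-sample", toy_models)
print(voted)
```

Choosing an odd M for a binary task avoids ties altogether, although the source leaves M free as long as it is at least 2.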
As can be seen from the above, one execution of steps S1 to S5 obtains M*N*p*C sub-sample pictures of size q*H × q*H from C large-size sample pictures of size H × H, and labels these M*N*p*C sub-sample pictures automatically. It should be noted that some of the M*N*p*C sub-sample pictures can serve as positive samples and others as negative samples.
In the present disclosure, executing steps S2 to S5 in a loop yields more, smaller, automatically labeled sub-sample pictures. This is described below with reference to specific embodiments.
Fig. 3 is a flowchart of another sample data acquisition method provided by an embodiment of the present disclosure. As shown in Fig. 3, the sample data acquisition method includes:
Step S1: construct a mother sample picture database, which includes multiple mother sample pictures with calibrated labels.
In step S1, the mother sample picture database contains C mother sample pictures; each mother sample picture is square with side length H.
Step S2: sample the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each of which includes multiple mother sample pictures.
In step S2, the number of mother sample picture sets is M, and the ratio of the number of mother sample pictures in each set to the number of mother sample pictures in the database equals the second predetermined coefficient p.
Step S3: for each mother sample picture set, use a selection box of predetermined size to extract multiple sub-sample pictures from each mother sample picture in the set, and assign a preliminary label to every sub-sample picture, so as to obtain the sub-sample picture set corresponding to each mother sample picture set. The preliminary label of a sub-sample picture is the calibrated label of the mother sample picture it belongs to, and the selection box is smaller than the mother sample picture.
In step S3, the selection box is square, and the ratio of its side length to the side length of the mother sample pictures equals the first predetermined coefficient q; N sub-sample pictures are extracted from each mother sample picture.
Step S4: for each sub-sample picture set, use all the sub-sample pictures in the set, together with their preliminary labels, as training sample data to train the sample classification model corresponding to that sub-sample picture set.
Step S5: for each sub-sample picture in each sub-sample picture set, input the sub-sample picture into every sample classification model so that each model outputs a classification result, and choose the most frequent classification result as the calibrated label of the sub-sample picture.
Step S6a: judge whether the side length of the sub-sample pictures is less than or equal to a predetermined length threshold.
In step S6a, the specific value of the predetermined length threshold is preset manually according to the training sample picture size required by the intended task. For example, when the intended task is lesion detection in fundus pictures and the ideal training sample picture size is at most 16 × 16 pixels, the predetermined length threshold can be set to 16 pixels.
When step S6a judges that the side length of the sub-sample pictures is less than or equal to the predetermined length threshold, the sub-sample pictures obtained by the most recent execution of step S5 meet the size requirement and can serve as the required training sample pictures, and the process ends. When step S6a judges that the side length is greater than the predetermined length threshold, the sub-sample pictures obtained by the most recent execution of step S5 are still too large, smaller sub-sample pictures need to be extracted, and step S7a is executed.
Step S7a: take the sub-sample pictures with calibrated labels as new mother sample pictures and construct a new mother sample picture database.
After step S7a, step S2 is executed again on the new mother sample picture database, so that steps S2 to S7a are executed in a loop until step S6a in some iteration judges that the side length of the sub-sample pictures is less than or equal to the predetermined length threshold. For a detailed description of steps S1 to S5 in this embodiment, refer to the corresponding content in the foregoing embodiment; it is not repeated here.
During the loop over steps S2 to S7a, once step S5 has been executed i times, the number of labeled sub-sample pictures obtained is (M*N*p)^i*C and the side length of each sub-sample picture is q^i*H, where i is a positive integer.
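The growth in picture count and the shrinkage in side length can be checked numerically. The sketch below evaluates (M*N*p)^i*C and q^i*H for a few rounds; the parameter values are illustrative only, and pixel rounding is ignored.

```python
def after_rounds(c, h, m, n, p, q, i):
    """Picture count and side length after i executions of step S5."""
    count = (m * n * p) ** i * c
    side = q ** i * h
    return count, side

# Illustrative values: C = 100 pictures of side H = 1600, M = 4, N = 5,
# p = 0.5, q = 0.6, so the count grows tenfold per round.
for i in range(1, 4):
    print(i, after_rounds(c=100, h=1600, m=4, n=5, p=0.5, q=0.6, i=i))
```

The count grows geometrically with factor M*N*p, so the processing load of later rounds can quickly dominate; this is one practical reason to keep M*p and N within the preferred ranges.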
With the sample data acquisition method shown in Fig. 3, small-size sample pictures whose side length is less than or equal to the predetermined length threshold can be extracted from large-size sample pictures and labeled automatically. At the same time, the size of the finally obtained sub-sample pictures is controlled by the predetermined length threshold.
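The control flow of Fig. 3 can be sketched as a loop that stops on the side-length test of step S6a. In this sketch `run_round` is a hypothetical callback standing in for one pass of steps S2 to S5, and rounding the side length to a whole pixel each round is an assumption.

```python
def run_until_small(side, q, length_threshold, run_round):
    """Repeat steps S2-S5 until the sub-sample side is within the threshold."""
    rounds = 0
    while True:
        run_round()                   # steps S2-S5 on the current database
        side = round(q * side)        # sub-samples are q times the previous side
        rounds += 1
        if side <= length_threshold:  # step S6a
            return rounds, side
        # Step S7a: the labeled sub-samples become the new mother pictures.

rounds, side = run_until_small(1600, 0.6, 16, run_round=lambda: None)
print(rounds, side)
```

With H = 1600, q = 0.6 and a threshold of 16 pixels, this loop stops after 9 rounds at a side of 16 pixels under the per-round rounding assumed here.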
Fig. 4 is a flowchart of yet another sample data acquisition method provided by an embodiment of the present disclosure. As shown in Fig. 4, unlike the scheme of Fig. 3, which controls the size of the finally obtained sub-sample pictures with the predetermined length threshold, the embodiment shown in Fig. 4 controls it with the cumulative number of executions of step S5. The sample data acquisition method includes:
Step S1: construct a mother sample picture database, which includes multiple mother sample pictures with calibrated labels.
In step S1, the mother sample picture database contains C mother sample pictures; each mother sample picture is square with side length H.
To monitor the cumulative number of executions of step S5, a counter variable i can be configured to represent that number. Before step S2 is executed, the counter is initialized by setting i = 0. It should be noted that setting i = 0 may be performed before step S1 (no corresponding drawing is provided) or between step S1 and step S2 (as shown in Fig. 4); both fall within the protection scope of the present disclosure.
Step S2: sample the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each of which includes multiple mother sample pictures.
In step S2, the number of mother sample picture sets is M, and the ratio of the number of mother sample pictures in each set to the number of mother sample pictures in the database equals the second predetermined coefficient p.
Step S3: for each mother sample picture set, use a selection box of predetermined size to extract multiple sub-sample pictures from each mother sample picture in the set, and assign a preliminary label to every sub-sample picture, so as to obtain the sub-sample picture set corresponding to each mother sample picture set. The preliminary label of a sub-sample picture is the calibrated label of the mother sample picture it belongs to, and the selection box is smaller than the mother sample picture.
In step S3, the selection box is square, and the ratio of its side length to the side length of the mother sample pictures equals the first predetermined coefficient q; N sub-sample pictures are extracted from each mother sample picture.
Step S4: for each sub-sample picture set, use all the sub-sample pictures in the set, together with their preliminary labels, as training sample data to train the sample classification model corresponding to that sub-sample picture set.
Step S5: for each sub-sample picture in each sub-sample picture set, input the sub-sample picture into every sample classification model so that each model outputs a classification result, and choose the most frequent classification result as the calibrated label of the sub-sample picture.
It should be noted that i = i + 1 is performed each time step S5 has been executed, so that the cumulative number of executions of step S5 is counted.
Step S6b: check whether the cumulative number of executions i of step S5 has reached a predetermined count threshold.
In step S6b, when the cumulative number of executions i of step S5 has reached the predetermined count threshold I, the process ends; when it has not reached the predetermined count threshold I, step S7b is executed.
Step S7b: take the sub-sample pictures with calibrated labels as new mother sample pictures and construct a new mother sample picture database.
After step S7b, step S2 is executed again on the new mother sample picture database, so that steps S2 to S7b are executed in a loop until step S6b in some iteration judges that the cumulative number of executions i of step S5 has reached the predetermined count threshold I. For a detailed description of steps S1 to S5 in this embodiment, refer to the corresponding content in the foregoing embodiment; it is not repeated here.
During the loop over steps S2 to S7b, once step S5 has been executed i times, the number of labeled sub-sample pictures obtained is (M*N*p)^i*C and the side length of each sub-sample picture is q^i*H, where i is a positive integer.
It should be noted that the specific value of the predetermined count threshold is preset manually according to the training sample picture size required by the intended task. For example, when the intended task is lesion detection in fundus pictures, assume that the side length of the mother sample pictures in step S1 is H = 1600 pixels and the first predetermined coefficient in step S3 is q = 0.6. It can then be calculated in advance that the sub-sample pictures obtained after the 9th execution of step S5 are approximately 16 × 16 pixels (0.6^9 × 1600 ≈ 16.1). The predetermined count threshold can therefore be set to 9.
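The precomputation of the count threshold and the counter-driven loop of Fig. 4 can be sketched together. The function names and the `run_round` callback (standing in for one pass of steps S2 to S5) are hypothetical, and the threshold is chosen here as the first round whose side length, rounded to whole pixels, fits the target, which is an assumption about how the rounding is handled.

```python
def choose_round_threshold(h, q, target_side):
    """Smallest round count whose side, rounded to whole pixels, fits the target."""
    rounds, side = 0, float(h)
    while round(side) > target_side:
        side *= q
        rounds += 1
    return rounds

def run_fixed_rounds(round_threshold, run_round):
    """Fig. 4 control flow: stop once the counter i reaches the threshold I."""
    i = 0                          # initialised before step S2
    while True:
        run_round()                # steps S2-S5
        i += 1                     # incremented after every execution of S5
        if i >= round_threshold:   # step S6b
            return i
        # Step S7b: rebuild the mother database from the labeled sub-samples.

threshold = choose_round_threshold(h=1600, q=0.6, target_side=16)
calls = []
executed = run_fixed_rounds(threshold, run_round=lambda: calls.append(1))
print(threshold, executed)
```

Fixing the number of rounds in advance makes the total cost predictable, whereas the length-threshold variant of Fig. 3 expresses the stopping rule directly in terms of the picture size the task needs.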
With the sample data acquisition method shown in Fig. 4, small-size sample pictures whose side length is less than or equal to a predetermined length threshold can be extracted from large-size sample pictures and labeled automatically. At the same time, the size of the finally obtained sub-sample pictures is controlled by the predetermined count threshold.
Fig. 5 is a structural block diagram of a sample data acquisition system provided by an embodiment of the present disclosure. As shown in Fig. 5, the sample data acquisition system can be used to implement the sample data acquisition methods provided by the foregoing embodiments, and includes: a first construction module 1, a sampling module 2, an extraction module 3, a training module 4 and a processing module 5.
The first construction module 1 is configured to construct a mother sample picture database, which includes multiple mother sample pictures with calibrated labels.
The sampling module 2 is configured to sample the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each of which includes multiple mother sample pictures;
The extraction module 3 is configured to, for each mother sample picture set, use a selection box of predetermined size to extract multiple sub-sample pictures from each mother sample picture in the set, and assign a preliminary label to every sub-sample picture, so as to obtain the sub-sample picture set corresponding to each mother sample picture set, where the preliminary label of a sub-sample picture is the calibrated label of the mother sample picture it belongs to, and the selection box is smaller than the mother sample picture;
The training module 4 is configured to, for each sub-sample picture set, use all the sub-sample pictures in the set, together with their preliminary labels, as training sample data to train the sample classification model corresponding to that sub-sample picture set;
The processing module 5 is configured to, for each sub-sample picture in each sub-sample picture set, input the sub-sample picture into every sample classification model so that each model outputs a classification result, and choose the most frequent classification result as the calibrated label of the sub-sample picture.
Fig. 6 is a structural block diagram of the first construction module in the present disclosure. As shown in Fig. 6, optionally, the first construction module 1 includes: an acquisition unit 101, a size adjustment unit 102 and a construction unit 103.
The acquisition unit 101 is configured to acquire multiple original sample pictures with calibrated labels.
The size adjustment unit 102 is configured to resize the original sample pictures so as to unify their size.
The construction unit 103 is configured to take the resized original sample pictures as mother sample pictures so as to construct the mother sample picture database.
In some embodiments, the mother sample pictures are square and the selection box is square; the ratio of the side length of the selection box to that of the mother sample pictures equals the first predetermined coefficient q, where 0 < q < 1. Further preferably, the first predetermined coefficient q satisfies 0.5 ≤ q ≤ 0.7.
In some embodiments, when the sampling module 2 samples the mother sample picture database multiple times to obtain the corresponding plurality of mother sample picture sets, every mother sample picture set contains the same number of mother sample pictures; the ratio of the number of mother sample pictures in one set to the number in the database equals the second predetermined coefficient p, where 0 < p < 1. Further preferably, the second predetermined coefficient p satisfies 0.4 ≤ p ≤ 0.6.
In some embodiments, when the extraction module 3 uses the selection box of predetermined size to extract multiple sub-sample pictures from each mother sample picture in a mother sample picture set, the number of sub-sample pictures extracted from one mother sample picture is a predetermined quantity N, where N is a positive integer and 3 ≤ N ≤ 10.
For a detailed description of the modules and units in this embodiment, refer to the corresponding content in the foregoing method embodiments; it is not repeated here.
Fig. 7 is a structural block diagram of another sample data acquisition system provided by an embodiment of the present disclosure. As shown in Fig. 7, this sample data acquisition system can be used to implement the sample data acquisition method shown in Fig. 3. In addition to the first construction module 1, sampling module 2, extraction module 3, training module 4 and processing module 5 shown in Fig. 5, it further includes: a judgment module 6a, a second construction module 7a and a first control module 8a.
The judgment module 6a is configured to judge, after the processing module has determined the calibrated label of each sub-sample picture in each sub-sample picture set, whether the side length of the sub-sample pictures is less than or equal to the predetermined length threshold.
The second construction module 7a is configured to, when the judgment module 6a judges that the side length of the sub-sample pictures is greater than the predetermined length threshold, take the sub-sample pictures with calibrated labels as new mother sample pictures, construct a new mother sample picture database, and control the sampling module 2 to continue the corresponding processing on the new mother sample picture database;
The first control module 8a is configured to stop the sample data acquisition system when the judgment module 6a judges that the side length of the sub-sample pictures is less than or equal to the predetermined length threshold.
For a detailed description of the modules in this embodiment, refer to the corresponding content in the foregoing method embodiments; it is not repeated here.
Fig. 8 is a structural block diagram of yet another sample data acquisition system provided by an embodiment of the present disclosure. As shown in Fig. 8, this sample data acquisition system can be used to implement the sample data acquisition method shown in Fig. 4. In addition to the first construction module 1, sampling module 2, extraction module 3, training module 4 and processing module 5 shown in Fig. 5, it further includes: a monitoring module 6b, a third construction module 7b and a second control module 8b.
The monitoring module 6b is configured to check, after the processing module has determined the calibrated label of each sub-sample picture in each sub-sample picture set, whether the cumulative number of executions of the processing module has reached the predetermined count threshold;
The third construction module 7b is configured to, when the monitoring module 6b finds that the cumulative number of executions has not reached the predetermined count threshold, take the sub-sample pictures with calibrated labels as new mother sample pictures, construct a new mother sample picture database, and control the sampling module 2 to continue the corresponding processing on the new mother sample picture database;
The second control module 8b is configured to stop the sample data acquisition system when the monitoring module 6b finds that the cumulative number of executions has reached the predetermined count threshold.
For a detailed description of the modules in this embodiment, refer to the corresponding content in the foregoing method embodiments; it is not repeated here.
As a specific application scenario, an example is described below in which the intended task is lesion detection in fundus pictures.
First, labeled fundus pictures are taken as original samples and processed with the sample data acquisition method or the sample data acquisition system provided by any of the foregoing embodiments, so as to obtain a large number of small-size sub-sample pictures and label them. Assume that the finally obtained small-size sub-sample pictures have size w × d.
Then, using the large number of small-size sub-sample pictures obtained above as training sample data, a lesion detection model for the lesion detection task is generated with deep learning techniques. Assume that the trained lesion detection model is a binary classification model that detects whether a lesion is present in the input picture.
Next, an unlabeled fundus picture to be processed is divided into multiple detection regions of size w × d, and the image of each detection region is input into the previously trained lesion detection model to detect whether a lesion is present in that region.
When a lesion is detected in at least one detection region, it is determined that a lesion is present in the fundus picture to be processed, and the lesion area is located according to the detection regions in which a lesion was detected; when no detection region contains a lesion, it is determined that no lesion is present in the fundus picture to be processed.
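The region-by-region detection described above can be sketched as follows. The sketch assumes that w is the region width and d the region height, that regions do not overlap, and that partial regions at the right and bottom edges are skipped; `has_lesion` is a hypothetical binary classifier standing in for the trained lesion detection model.

```python
def detect_lesions(picture, w, d, has_lesion):
    """Split a picture into w x d regions and return the flagged regions.

    picture is a list of rows; each hit is the (top, left) corner of a region.
    """
    height, width = len(picture), len(picture[0])
    hits = []
    for top in range(0, height - d + 1, d):
        for left in range(0, width - w + 1, w):
            region = [row[left:left + w] for row in picture[top:top + d]]
            if has_lesion(region):
                hits.append((top, left))
    return hits

# Toy 4 x 4 picture with one bright pixel; the stand-in classifier flags any
# region that contains a non-zero value.
toy = [[0] * 4 for _ in range(4)]
toy[3][2] = 1
flag = lambda region: any(v for row in region for v in row)
hits = detect_lesions(toy, w=2, d=2, has_lesion=flag)
print(hits, bool(hits))
```

An empty result corresponds to "no lesion in the picture", while a non-empty result both signals a lesion and gives the coordinates used to locate it.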
An embodiment of the present disclosure further provides a server that includes the sample data acquisition system provided by the foregoing embodiments.
An embodiment of the present disclosure further provides a server including one or more processors and a storage device. The storage device stores one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the sample data acquisition method provided by the foregoing embodiments.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed, implements the sample data acquisition method provided by the foregoing embodiments.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units of the apparatus, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned above does not necessarily correspond to a division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor such as a central processing unit, digital signal processor or microprocessor, as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (20)

1. A sample data acquisition method, characterized by comprising:
constructing a mother sample picture database, the mother sample picture database including multiple mother sample pictures with calibrated labels;
sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each mother sample picture set including multiple mother sample pictures;
For mother's samples pictures set described in each, using the marquee with predetermined size out of this mother's samples pictures set Each mother's Zhang Suoshu samples pictures in extract multiple subsample pictures, and assign for every subsample picture preliminary Category, to obtain subsample picture set corresponding to each described female samples pictures set, the subsample picture just Step class is designated as the calibration category of its affiliated female samples pictures, wherein the size of the marquee is less than the ruler of female samples pictures It is very little;
For picture set in subsample described in each, with the whole subsample figure for being included in the subsample picture set Piece and the corresponding preliminary category of each subsample picture train each subsample pictures as training sample data Close corresponding sample classification model;
For each subsample picture in each subsample picture set, by the subsample, picture is separately input into each institute It states in sample classification model, so that each sample classification model exports corresponding classification results respectively, and chooses frequency maximum Calibration category of the classification results as the subsample picture.
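The last two steps of claim 1 (one classification model per subsample picture set, then a vote across all models) can be illustrated with a minimal sketch. The claim does not name a model type, so the nearest-centroid classifier below is purely an assumption chosen to keep the example self-contained; the function names `train_centroid_model`, `predict`, and `relabel_by_vote` are hypothetical, and each picture is assumed to be already reduced to a numeric feature vector. How ties between equally frequent results are broken is also not stated in the claim; here the result seen first wins.

```python
from collections import Counter
from statistics import mean

def train_centroid_model(samples):
    """Train a toy model (assumption): one mean feature vector per preliminary label.

    samples: list of (feature_vector, preliminary_label) pairs from ONE subsample set.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*vecs)] for label, vecs in by_label.items()}

def predict(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def relabel_by_vote(subsample_sets):
    """Train one model per subsample set, then give every subsample picture
    the most frequent classification result across all models."""
    models = [train_centroid_model(samples) for samples in subsample_sets]
    relabeled_sets = []
    for samples in subsample_sets:
        relabeled = []
        for features, _preliminary_label in samples:
            votes = Counter(predict(model, features) for model in models)
            relabeled.append((features, votes.most_common(1)[0][0]))
        relabeled_sets.append(relabeled)
    return relabeled_sets

# Three toy subsample sets with 1-D features; the last sample inherited label "b"
# from its mother picture although it looks more like class "a".
set1 = [([0.0], "a"), ([1.0], "a"), ([10.0], "b")]
set2 = [([0.5], "a"), ([9.0], "b"), ([11.0], "b")]
set3 = [([0.2], "a"), ([10.5], "b"), ([4.0], "b")]
result = relabel_by_vote([set1, set2, set3])
print(result[2][2])
```

In this toy data, two of the three models classify the last sample as "a", so the vote overrides the preliminary label it inherited from its mother picture.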
2. The method according to claim 1, characterized in that the mother sample pictures are square;
the selection frame is square;
the ratio of the side length of the selection frame to the side length of the mother sample picture is equal to a first predetermined coefficient q, where 0 < q < 1.
3. The method according to claim 2, characterized in that, after the step of inputting, for each subsample picture in each subsample picture set, the subsample picture into each of the sample classification models so that each sample classification model outputs a corresponding classification result, and selecting the most frequent classification result as the calibrated class label of that subsample picture, the method further comprises:
determining whether the side length of the subsample picture is less than or equal to a predetermined length threshold;
when the side length of the subsample picture is determined to be less than or equal to the predetermined length threshold, ending the process;
when the side length of the subsample picture is determined to be greater than the predetermined length threshold, constructing a new mother sample picture database using the subsample pictures having calibrated class labels as new mother sample pictures, and continuing, based on the new mother sample picture database, to perform the above step of sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets.
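Because each round crops with a frame whose side is q times the side of the current mother sample pictures, the side length after k rounds is q^k times the initial side, and the stop condition of claim 3 determines how many rounds run. The sketch below only works through that arithmetic; the function name `side_length_schedule` and the example values (512, 0.5, 64) are assumptions, and real pixel sizes would need rounding to integers, which is not handled here.

```python
def side_length_schedule(initial_side, q, length_threshold):
    """List the subsample side length produced in each round.

    A round shrinks the side by the factor q; iteration stops after the first
    round whose subsample side is less than or equal to the length threshold.
    """
    if not 0 < q < 1:
        raise ValueError("q must satisfy 0 < q < 1")
    sides = []
    side = initial_side
    while True:
        side = side * q          # side of the subsample pictures cut in this round
        sides.append(side)
        if side <= length_threshold:
            return sides         # stop condition met: the process ends

schedule = side_length_schedule(512, 0.5, 64)
print(schedule, len(schedule))
```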
4. The method according to claim 2, characterized in that, after the step of inputting, for each subsample picture in each subsample picture set, the subsample picture into each of the sample classification models so that each sample classification model outputs a corresponding classification result, and selecting the most frequent classification result as the calibrated class label of that subsample picture, the method further comprises:
monitoring whether the cumulative number of loop executions of the step of inputting, for each subsample picture in each subsample picture set, the subsample picture into each of the sample classification models so that each sample classification model outputs a corresponding classification result, and selecting the most frequent classification result as the calibrated class label of that subsample picture, has reached a predetermined count threshold;
when the cumulative number of loop executions is found not to have reached the predetermined count threshold, constructing a new mother sample picture database using the subsample pictures having calibrated class labels as new mother sample pictures, and continuing, based on the new mother sample picture database, to perform the above step of sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets;
when the cumulative number of loop executions is found to have reached the predetermined count threshold, ending the process.
5. The method according to claim 2, characterized in that the first predetermined coefficient q satisfies: 0.5 ≤ q ≤ 0.7.
6. The method according to claim 1, characterized in that, in the step of sampling the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, every mother sample picture set contains the same number of mother sample pictures;
the ratio of the number of mother sample pictures contained in one mother sample picture set to the number of mother sample pictures contained in the mother sample picture database is equal to a second predetermined coefficient p, where 0 < p < 1.
7. The method according to claim 6, characterized in that the second predetermined coefficient p satisfies: 0.4 ≤ p ≤ 0.6.
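The sampling step of claims 6 and 7 can be sketched as follows. The claims fix only the set size (a fraction p of the database) and that all sets are equal in size; they do not say whether pictures are drawn with or without replacement. The sketch assumes each set is drawn without replacement and independently of the other sets, using the standard-library `random.sample`. The name `draw_mother_sets` and the `seed` parameter are hypothetical.

```python
import random

def draw_mother_sets(database, num_sets, p, seed=None):
    """Draw num_sets equally sized mother sample picture sets from the database.

    Each set holds round(p * len(database)) pictures (at least one), drawn
    without replacement inside a set (assumption); sets may overlap each other.
    """
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    rng = random.Random(seed)
    set_size = max(1, round(p * len(database)))
    return [rng.sample(database, set_size) for _ in range(num_sets)]

database = list(range(10))            # stand-ins for 10 labelled mother pictures
mother_sets = draw_mother_sets(database, num_sets=4, p=0.5, seed=0)
print([len(s) for s in mother_sets])
```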
8. The method according to claim 1, characterized in that, in the step of extracting a plurality of subsample pictures from each mother sample picture in the mother sample picture set using the selection frame of the predetermined size, the number of subsample pictures extracted from one mother sample picture is a predetermined number N;
where the predetermined number N is a positive integer and 3 ≤ N ≤ 10.
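The extraction step, with a square selection frame of side q times the mother picture side and N crops per picture, can be sketched in plain Python. The claims do not state how the frame is positioned, so uniformly random placement is an assumption here, as are the name `extract_subsamples`, the `seed` parameter, and the representation of a picture as a square list of pixel rows. Each crop inherits the calibrated class label of its mother picture as its preliminary class label.

```python
import random

def extract_subsamples(picture, label, q, n, seed=None):
    """Cut n square crops of side int(side * q) from a square picture.

    picture: square 2-D list of pixel rows; label: its calibrated class label.
    Returns a list of (crop, preliminary_label) pairs.
    """
    side = len(picture)
    frame = int(side * q)                      # side length of the selection frame
    if not 0 < frame < side:
        raise ValueError("selection frame must be smaller than the picture")
    rng = random.Random(seed)
    crops = []
    for _ in range(n):
        top = rng.randint(0, side - frame)     # random frame position (assumption)
        left = rng.randint(0, side - frame)
        crop = [row[left:left + frame] for row in picture[top:top + frame]]
        crops.append((crop, label))            # preliminary label = mother's label
    return crops

picture = [[r * 8 + c for c in range(8)] for r in range(8)]   # toy 8x8 picture
crops = extract_subsamples(picture, "cat", q=0.5, n=3, seed=1)
print(len(crops), len(crops[0][0]), len(crops[0][0][0]))
```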
9. The method according to any one of claims 1-8, characterized in that the step of constructing the mother sample picture database comprises:
acquiring a plurality of original sample pictures having calibrated class labels;
performing size adjustment processing on the original sample pictures so as to unify the sizes of the original sample pictures;
taking the original sample pictures on which size adjustment processing has been completed as mother sample pictures, so as to construct the mother sample picture database.
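The database construction of claim 9 amounts to resizing every labelled original picture to one common size. The claim does not prescribe a resizing algorithm; the sketch below assumes nearest-neighbour resampling to a square target, with pictures represented as 2-D lists of pixel values. The names `resize_nearest` and `build_mother_database` are hypothetical.

```python
def resize_nearest(picture, target_side):
    """Resize a 2-D list of pixels to target_side x target_side
    using nearest-neighbour sampling (assumption)."""
    height, width = len(picture), len(picture[0])
    return [
        [picture[r * height // target_side][c * width // target_side]
         for c in range(target_side)]
        for r in range(target_side)
    ]

def build_mother_database(originals, target_side):
    """originals: list of (picture, calibrated_label) pairs of arbitrary sizes.
    Returns the same pairs with every picture unified to the target size."""
    return [(resize_nearest(pic, target_side), label) for pic, label in originals]

originals = [
    ([[1, 2], [3, 4]], "a"),                 # 2x2 picture
    ([[5] * 6 for _ in range(3)], "b"),      # 3x6 picture
]
mother_db = build_mother_database(originals, target_side=4)
print(mother_db[0][0][0])
```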
10. A sample data acquisition system, characterized by comprising:
a first construction module, configured to construct a mother sample picture database, the mother sample picture database comprising a plurality of mother sample pictures each having a calibrated class label;
a sampling module, configured to sample the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, each mother sample picture set comprising a plurality of mother sample pictures;
an extraction module, configured to, for each mother sample picture set, extract a plurality of subsample pictures from each mother sample picture in that set using a selection frame of a predetermined size, and assign a preliminary class label to each subsample picture, so as to obtain the subsample picture set corresponding to each mother sample picture set, wherein the preliminary class label of a subsample picture is the calibrated class label of the mother sample picture to which it belongs, and the size of the selection frame is smaller than the size of the mother sample picture;
a training module, configured to, for each subsample picture set, train the sample classification model corresponding to that subsample picture set, using all of the subsample pictures contained in the set and the preliminary class label corresponding to each subsample picture as training sample data;
a processing module, configured to, for each subsample picture in each subsample picture set, input the subsample picture into each of the sample classification models so that each sample classification model outputs a corresponding classification result, and select the most frequent classification result as the calibrated class label of that subsample picture.
11. The system according to claim 10, characterized in that the mother sample pictures are square;
the selection frame is square;
the ratio of the side length of the selection frame to the side length of the mother sample picture is equal to a first predetermined coefficient q, where 0 < q < 1.
12. The system according to claim 11, characterized by further comprising:
a judgment module, configured to determine, after the processing module has determined the calibrated class label of each subsample picture in each subsample picture set, whether the side length of the subsample picture is less than or equal to a predetermined length threshold;
a second construction module, configured to, when the judgment module determines that the side length of the subsample picture is greater than the predetermined length threshold, construct a new mother sample picture database using the subsample pictures having calibrated class labels as new mother sample pictures, and control the sampling module to continue the corresponding processing based on the new mother sample picture database;
a first control module, configured to control the sample data acquisition system to stop working when the judgment module determines that the side length of the subsample picture is less than or equal to the predetermined length threshold.
13. The system according to claim 11, characterized by further comprising:
a monitoring module, configured to monitor, after the processing module has determined the calibrated class label of each subsample picture in each subsample picture set, whether the cumulative number of loop executions of the processing module has reached a predetermined count threshold;
a third construction module, configured to, when the monitoring module finds that the cumulative number of loop executions has not reached the predetermined count threshold, construct a new mother sample picture database using the subsample pictures having calibrated class labels as new mother sample pictures, and control the sampling module to continue the corresponding processing based on the new mother sample picture database;
a second control module, configured to control the sample data acquisition system to stop working when the monitoring module finds that the cumulative number of loop executions has reached the predetermined count threshold.
14. The system according to claim 11, characterized in that the first predetermined coefficient q satisfies: 0.5 ≤ q ≤ 0.7.
15. The system according to claim 10, characterized in that, in the process in which the sampling module samples the mother sample picture database multiple times to obtain a corresponding plurality of mother sample picture sets, every mother sample picture set contains the same number of mother sample pictures;
the ratio of the number of mother sample pictures contained in one mother sample picture set to the number of mother sample pictures contained in the mother sample picture database is equal to a second predetermined coefficient p, where 0 < p < 1.
16. The system according to claim 15, characterized in that the second predetermined coefficient p satisfies: 0.4 ≤ p ≤ 0.6.
17. The system according to claim 10, characterized in that, in the process in which the extraction module extracts a plurality of subsample pictures from each mother sample picture in the mother sample picture set using the selection frame of the predetermined size, the number of subsample pictures extracted from one mother sample picture is a predetermined number N;
where the predetermined number N is a positive integer and 3 ≤ N ≤ 10.
18. The system according to any one of claims 10-17, characterized in that the first construction module comprises:
an acquisition unit, configured to acquire a plurality of original sample pictures having calibrated class labels;
a size adjustment unit, configured to perform size adjustment processing on the original sample pictures so as to unify the sizes of the original sample pictures;
a construction unit, configured to take the original sample pictures on which size adjustment processing has been completed as mother sample pictures, so as to construct the mother sample picture database.
19. A server, characterized by comprising:
one or more processors;
a storage device on which one or more programs are stored;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
20. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-9.
CN201910441621.6A 2019-05-24 2019-05-24 Sample data acquisition method, acquisition system, server and computer readable medium Active CN110162649B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910441621.6A CN110162649B (en) 2019-05-24 2019-05-24 Sample data acquisition method, acquisition system, server and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910441621.6A CN110162649B (en) 2019-05-24 2019-05-24 Sample data acquisition method, acquisition system, server and computer readable medium

Publications (2)

Publication Number Publication Date
CN110162649A (en) 2019-08-23
CN110162649B CN110162649B (en) 2021-06-18

Family

ID=67632868

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910441621.6A Active CN110162649B (en) 2019-05-24 2019-05-24 Sample data acquisition method, acquisition system, server and computer readable medium

Country Status (1)

Country Link
CN (1) CN110162649B (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942282A (en) * 2014-04-02 2014-07-23 新浪网技术(中国)有限公司 Sample data obtaining method, device and system
US20170248574A1 (en) * 2014-05-12 2017-08-31 Cellomics, Inc. Automated imaging of chromophore labeled samples
US20190080450A1 (en) * 2017-09-08 2019-03-14 International Business Machines Corporation Tissue Staining Quality Determination
CN109697397A (en) * 2017-10-24 2019-04-30 高德软件有限公司 A kind of object detection method and device
CN108446697A (en) * 2018-03-06 2018-08-24 平安科技(深圳)有限公司 Image processing method, electronic device and storage medium
CN108717547A (en) * 2018-03-30 2018-10-30 国信优易数据有限公司 The method and device of sample data generation method and device, training pattern
CN108537270A (en) * 2018-04-04 2018-09-14 厦门理工学院 Image labeling method, terminal device and storage medium based on multi-tag study
CN108711148A (en) * 2018-05-11 2018-10-26 沈阳理工大学 A kind of wheel tyre defect intelligent detecting method based on deep learning
CN109284779A (en) * 2018-09-04 2019-01-29 中国人民解放军陆军工程大学 Object detection method based on deep full convolution network
CN109242801A (en) * 2018-09-26 2019-01-18 北京字节跳动网络技术有限公司 Image processing method and device
CN109583369A (en) * 2018-11-29 2019-04-05 北京邮电大学 A kind of target identification method and device based on target area segmentation network
CN109685847A (en) * 2018-12-26 2019-04-26 北京因时机器人科技有限公司 A kind of training method and device of sensation target detection model
CN109657681A (en) * 2018-12-28 2019-04-19 北京旷视科技有限公司 Mask method, device, electronic equipment and the computer readable storage medium of picture

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
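Guo Qiaojin et al.: "A semi-automatic method for generating image annotation samples based on target tracking", 《信息化研究》 (Informatization Research) *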

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781973A (en) * 2019-10-30 2020-02-11 广东利元亨智能装备股份有限公司 Article identification model training method, article identification device and electronic equipment
CN110781973B (en) * 2019-10-30 2021-05-11 广东利元亨智能装备股份有限公司 Article identification model training method, article identification device and electronic equipment
CN111080614A (en) * 2019-12-12 2020-04-28 哈尔滨市科佳通用机电股份有限公司 Method for identifying damage to rim and tread of railway wagon wheel
CN111487189A (en) * 2020-04-03 2020-08-04 哈尔滨市科佳通用机电股份有限公司 Tread damage automatic detection system

Also Published As

Publication number Publication date
CN110162649B (en) 2021-06-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190823

Assignee: Beijing Confucius Health Technology Co.,Ltd.

Assignor: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

Contract record no.: X2021990000477

Denomination of invention: Sample data acquisition method, acquisition system, server and computer-readable medium

Granted publication date: 20210618

License type: Common License

Record date: 20210812
