WO2019233297A1 - Data set construction method, mobile terminal, and readable storage medium - Google Patents

Data set construction method, mobile terminal, and readable storage medium

Info

Publication number
WO2019233297A1
WO2019233297A1 (PCT/CN2019/088378, CN2019088378W)
Authority
WO
WIPO (PCT)
Prior art keywords
data
data set
classification model
target
preset
Prior art date
Application number
PCT/CN2019/088378
Other languages
English (en)
French (fr)
Inventor
刘耀勇
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Publication of WO2019233297A1 publication Critical patent/WO2019233297A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns

Definitions

  • the present application relates to the field of computer applications, and in particular, to a method for constructing a data set, a mobile terminal, and a computer-readable storage medium.
  • AI (artificial intelligence) models require large amounts of training data.
  • The current training-data acquisition methods mainly include open source data sets, web crawling, and offline collection.
  • To obtain a large amount of data related to a learning task, it is generally necessary to manually filter, classify, and label the open source data sets and the web-crawled data before the filtered labeled data can be applied to model training. This consumes a great deal of manpower and material resources, and the cost is very high.
  • a method for constructing a data set, a mobile terminal, and a computer-readable storage medium are provided.
  • a data set construction method includes: acquiring a first data set having a first preset number and carrying labeled information according to a learning task; training a classification model on the first data set and evaluating accuracy information of the classification model; when the accuracy information reaches a preset value, filtering unlabeled data based on the trained classification model and merging the filtered data into the first data set to form a second data set; and classifying and cleaning the data of the second data set based on the trained classification model to form a target data set having a target number.
  • a data set construction device includes:
  • a data set acquisition module configured to acquire a first data set having a first preset number and carrying labeled information according to a learning task
  • a model training module configured to train a classification model on the first data set, and evaluate accuracy information of the classification model
  • a data collection combining module configured to filter unlabeled data based on the trained classification model when the accuracy information reaches a preset value, and merge the filtered data into the first data set to form a second data set;
  • a data set processing module configured to classify and clean the data of the second data set based on the trained classification model to form a target data set having a target number, wherein the data amount of the second data set is greater than or equal to the data amount of the target data set.
  • a mobile terminal includes a memory and a processor.
  • the memory stores a computer program.
  • the computer program, when executed by the processor, causes the processor to perform the operations of the data set construction method.
  • a computer-readable storage medium has stored thereon a computer program that, when executed by a processor, implements operations of a method of constructing a data set.
  • the method for constructing a data set, the mobile terminal, and the computer-readable storage medium in the embodiments of the present application obtain a first data set having a first preset number and carrying labeled information according to a learning task; train a classification model on the first data set and evaluate accuracy information of the classification model; when the accuracy information reaches a preset value, classify and filter unlabeled data based on the trained classification model and merge the filtered data into the first data set to form a second data set; and classify and clean the data of the second data set based on the trained classification model to form a target data set with a target number.
  • This realizes semi-automatic data collection, screening, and labeling, so that a large amount of high-quality training data for the classification model can be obtained with little labor, which greatly saves manpower costs and improves the efficiency of composing the data set.
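The four-stage flow summarized above can be sketched in plain Python. Everything here is an editorial illustration, not part of the application: the helper names (`train_classifier`, `evaluate_accuracy`, `build_dataset`) are hypothetical, and a trivial lookup-table "classifier" stands in for a real neural network.

```python
# Sketch of the overall pipeline: acquire a labeled seed set, train until the
# accuracy threshold is met, use the model to label more data, then trim.

def train_classifier(dataset):
    """Stub 'training': memorize the majority label per feature value."""
    counts = {}
    for features, label in dataset:
        counts.setdefault(features, {}).setdefault(label, 0)
        counts[features][label] += 1
    return {f: max(ls, key=ls.get) for f, ls in counts.items()}

def evaluate_accuracy(model, test_set):
    correct = sum(1 for f, label in test_set if model.get(f) == label)
    return correct / len(test_set) if test_set else 0.0

def build_dataset(first_set, unlabeled, preset_accuracy, target_number):
    model = train_classifier(first_set)
    if evaluate_accuracy(model, first_set) < preset_accuracy:
        return None  # would acquire new labeled data and retrain (operations 306-310)
    # Filter unlabeled data with the trained model and merge (operation 106).
    second_set = first_set + [(f, model[f]) for f in unlabeled if f in model]
    # Classify and clean, keeping only the target number (operation 108).
    return second_set[:target_number]

seed = [("cat", "object"), ("beach", "image"), ("cat", "object")]
target = build_dataset(seed, ["cat", "beach", "unknown"], 0.9, 4)
```

In this toy run the model confidently labels the two recognizable unlabeled items, skips the unknown one, and the merged set is trimmed to the target number.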
  • FIG. 1 is a flowchart of a method for constructing a data set in an embodiment.
  • FIG. 2 is a schematic diagram of categories of shooting scenes in an embodiment.
  • FIG. 3 is a flowchart of a method for constructing a data set in another embodiment.
  • FIG. 4 is a flowchart of obtaining a first data set having a first preset number and carrying label information according to a learning task according to an embodiment.
  • FIG. 5 is a flowchart of training the classification model on the first data set and evaluating accuracy information of the classification model in one embodiment.
  • FIG. 6 is a schematic structural diagram of a neural network in an embodiment.
  • FIG. 7 is a schematic structural diagram of a neural network in another embodiment.
  • FIG. 8 is a flowchart of classifying and filtering unlabeled data based on a classification model in an embodiment, and merging the filtered data into the first data set to form a second data set.
  • FIG. 9 is a flowchart of classifying and cleaning the data of the second data set to form a target data set with a target number based on the trained classification model in an embodiment.
  • FIG. 10 is a structural block diagram of an image processing apparatus in an embodiment.
  • FIG. 11 is a schematic diagram of an internal structure of a mobile terminal according to an embodiment.
  • FIG. 12 is a schematic diagram of an image processing circuit in an embodiment.
  • FIG. 1 is a flowchart of a method for constructing a data set in an embodiment. As shown in FIG. 1, the method for constructing a data set includes operations 102 to 108, as follows:
  • Operation 102 Obtain a first data set having a first preset number and carrying labeled information according to the learning task.
  • the data in the first data set may be image data, video data, text data, voice data, and so on.
  • image data will be described as an example.
  • an image category and an object category of the image data to be collected and classified can be defined first.
  • the image category can be understood as the training target of the background area in the training data, for example, landscape, beach, snow, blue sky, green space, night scene, darkness, backlight, sunrise / sunset, indoor, fireworks, spotlight, etc.
  • the object category is the training target of the foreground area in the training data, such as portrait, baby, cat, dog, food, etc.
  • the background training target and foreground training target can also be text documents, macros, and so on.
  • the background region refers to the background portion of the image data
  • the foreground region refers to the foreground portion of the image data
  • the shooting scene of the image data may include the image type of the background area, the object type of the foreground area, and others.
  • the image category of the background area may include landscape, beach, snow, blue sky, green space, night scene, dark, backlight, sunrise / sunset, indoor, fireworks, spotlight, and so on.
  • Object categories in the foreground area can be portrait, baby, cat, dog, gourmet, etc. Others can be text documents, macros, etc.
  • a large amount of data can be obtained through open source datasets and web crawlers, and classified manually.
  • the amount of data of each type of image category and each type of object category is within a preset range, and may be equal or different.
  • the specific value of the quantity can be set according to actual needs, for example, it can be set to 2000 or other values.
  • the first preset number of image data can be filtered out.
  • the annotation information includes at least one of an image category and an object category. That is, the annotation information may be an image category, such as landscape, beach, snow, or blue sky; it may be an object category, such as portrait, portrait + baby, or portrait + cat; or it may include both an image category and an object category, for example, portrait + landscape, portrait + sunset, or portrait + spotlight.
  • the manually selected image data, amounting to the first preset number, is stored in a preset storage area of the mobile terminal or server to form a first data set, and each item of image data carries label information. The mobile terminal can then acquire and call the stored first data set according to the learning task.
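Keeping the amount of data per category within a preset range before forming the first data set can be sketched as a per-category cap. The cap value and helper name below are illustrative assumptions, not from the application:

```python
from collections import defaultdict

def select_first_data_set(labeled_items, per_category_cap=2000):
    """Keep at most `per_category_cap` items per annotation label so the
    category counts stay within the preset range (e.g. around 2000)."""
    buckets = defaultdict(list)
    for item, label in labeled_items:
        if len(buckets[label]) < per_category_cap:
            buckets[label].append((item, label))
    # The resulting length is the "first preset number".
    return [pair for bucket in buckets.values() for pair in bucket]

items = ([(f"img{i}", "portrait") for i in range(5)]
         + [(f"img{i}", "beach") for i in range(3)])
subset = select_first_data_set(items, per_category_cap=2)
```

With a cap of 2, five portrait images and three beach images reduce to two of each.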
  • Operation 104 Train a classification model on the first data set, and evaluate accuracy information of the classification model.
  • the labeling information is related to the training task of the classification model, and the accuracy of the labeling information affects the accuracy of the classification model.
  • the training of the classification model requires inputting the first data set carrying the labeled information at the same time, and training the classification model according to the learning task.
  • the classification model may be a neural network.
  • the neural network includes at least one input layer, n intermediate layers, and two output layers.
  • the i-th intermediate layer is configured as an image feature extraction layer.
  • the j-th intermediate layer is cascaded to the first branch of the neural network, and the k-th intermediate layer is cascaded to the second branch of the neural network, where i is less than j and j is less than k;
  • i, j, k, and n are all positive integers, and i, j, and k are all smaller than n; one output layer is located on the first branch and one output layer is located on the second branch.
  • the first output of the first branch of the neural network may output a first confidence level when performing image detection with the neural network, and the first confidence level indicates the confidence level of the specified image category to which the background image detected by the neural network belongs.
  • the second output of the second branch of the neural network may output, when performing image detection with the neural network, an offset parameter of each preselected default bounding box with respect to the real bounding box corresponding to a specified object, and a second confidence level of the specified object category to which the bounding box is assigned.
  • the confidence interval of a probability sample is an interval estimate of a population parameter of that sample.
  • the confidence interval shows the degree to which the true value of the parameter has a certain probability of falling around the measurement result.
  • the confidence level is the degree of certainty that the measured value of the parameter being measured lies within the confidence interval.
  • the mobile terminal can simultaneously input the first data set carrying the labeled information to the input layer of the neural network, and then train the neural network.
  • the image data of the first data set can be divided into a training set and a test set according to a preset ratio, and the image data and annotation information of the training set are input to the input layer of the neural network, and the neural network is trained and adjusted. Parameters of the neural network.
  • the image data and annotation information of the test set are then input to the neural network after its parameters are adjusted, and the neural network is evaluated to obtain the accuracy information of the trained neural network, that is, the trained neural network's test recognition rate on the test set of the first data set.
  • the accuracy information includes a first confidence level and a second confidence level.
  • Operation 106 When the accuracy information reaches a preset value, filter unlabeled data based on the trained classification model, and merge the filtered data into the first data set to form a second data set.
  • the amount of image data in the first data set is small; to optimize the performance of the classification model, tens of thousands to hundreds of thousands of image data are needed. Collecting and labeling all of this data manually would be time-consuming, inefficient, and costly.
  • when the classification model's test accuracy on the data of the test set reaches a preset value, it indicates that the performance of the trained classification model is good enough for it to be used to classify and filter image data.
  • a large amount of unlabeled image data obtained by the network can be identified, filtered, and labeled.
  • the image data identified by the trained classification model is labeled and merged into the first data set to form a second data set.
  • the number of image data of each image category and each object category is within a preset range, which may be the same or different.
  • the sum of the image data of each image category and each object category is greater than the target number of the target data set, that is, the number of image data of the second data set is greater than the target number of image data of the target data set.
  • the trained classification model can filter, classify, and label a large amount of unlabeled image data obtained from the network, avoiding the large amount of manpower otherwise spent filtering and classifying image data, which greatly improves the efficiency of acquiring a data set for the learning task.
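The filter-and-merge step can be sketched as a confidence threshold applied to model predictions. The 0.9 threshold, the prediction format, and the file names below are illustrative assumptions, not part of the application:

```python
def filter_unlabeled(predict, unlabeled, threshold=0.9):
    """Keep only predictions the trained model is confident about, and
    attach the predicted category as the new annotation (operation 106)."""
    accepted = []
    for item in unlabeled:
        label, confidence = predict(item)
        if confidence >= threshold:
            accepted.append((item, label))
    return accepted

# Illustrative predictor: confident on known scenes, unsure otherwise.
known = {"sunset.jpg": "sunrise/sunset", "cat.jpg": "cat"}
predict = lambda item: (known.get(item, "unknown"),
                        0.95 if item in known else 0.4)

first_set = [("beach.jpg", "beach")]
second_set = first_set + filter_unlabeled(
    predict, ["sunset.jpg", "noise.jpg", "cat.jpg"])
```

The low-confidence item is dropped, and the two confident predictions are merged into the first data set to form the second data set.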
  • Operation 108 classify and clean the data of the second data set based on the trained classification model to form a target data set with a target number.
  • the image data of the second data set is automatically filtered and classified to obtain classification information of each data.
  • image data can be randomly selected from the screening results for manual verification to determine whether the classification information produced by the classification model is correct; if it is incorrect, the label information of the image data is checked and, if wrong, corrected, so as to achieve data cleaning of the second data set.
  • data cleaning can also be understood as deleting irrelevant data, duplicate data in the second data set, smoothing out noise data, filtering out data not related to the learning task, and processing missing values and outliers.
  • the quality and quantity of image data of each image category and each object category can then meet preset requirements; for example, if the amount of image data of each image category and each object category ranges from 5000 to 10000, the target data set composed of the image data of all categories can reach tens of thousands of items or more.
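Data cleaning as described above (removing duplicate entries and data unrelated to the learning task) might look like the following sketch; the notion of a "relevant labels" set and the sample file names are assumptions for illustration:

```python
def clean_second_data_set(items, relevant_labels):
    """Remove exact duplicates and entries whose label is not related to
    the learning task, approximating the cleaning of operation 108."""
    seen = set()
    cleaned = []
    for item, label in items:
        if (item, label) in seen or label not in relevant_labels:
            continue  # duplicate or irrelevant to the learning task
        seen.add((item, label))
        cleaned.append((item, label))
    return cleaned

second = [("a.jpg", "beach"), ("a.jpg", "beach"),
          ("b.jpg", "spam"), ("c.jpg", "cat")]
target_set = clean_second_data_set(second, relevant_labels={"beach", "cat"})
```

Here the duplicate beach image and the irrelevant item are both dropped.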
  • the above data set construction method obtains a first data set having a first preset number and carrying label information according to a learning task; trains a classification model on the first data set and evaluates accuracy information of the classification model; and, when the accuracy information reaches a preset value, classifies and filters unlabeled data based on the trained classification model and merges the filtered data into the first data set to form a second data set.
  • the trained classification model then classifies and cleans the data of the second data set to form a target data set with a target number.
  • FIG. 3 is a flowchart of a method for constructing a data set in another embodiment.
  • the data set construction method includes operations 302 to 314, as follows:
  • Operation 302 Obtain a first data set having a first preset number and carrying labeled information according to the learning task;
  • Operation 304 Train a classification model on the first data set, and evaluate accuracy information of the classification model.
  • Operation 306 When the accuracy information does not reach a preset value, obtain new data having a second preset number and carrying label information.
  • new data needs to be injected to continue training the classification model so that the accuracy information of the trained classification model reaches the preset value.
  • new data carrying the label information may be acquired again, and the sum of the numbers of the new data acquired again is the second preset number.
  • the new data has the same attributes as the data in the first data set, that is, the same image category and the same object category.
  • the new data can be sorted based on manual classification, the data of each image category and each object category can be filtered out again (for example, the data of each category is increased by 1,000), and the filtered data can be labeled so that the newly filtered data also carries annotation information.
  • Operation 308 Merge the new data into the first data set to form a third data set.
  • the acquired new data is merged into the first data set to form a third data set, that is, the image data in the formed third data set are all manually classified and filtered data, and each type of data carries labeling information.
  • Operation 310 Train the classification model on the third data set again until the accuracy information of the classification model reaches a preset value.
  • Training the classification model again on the third data set means that, as in operation 104 (training the classification model on the first data set and evaluating its accuracy information), the classification model is trained on the third data set, including its newly added data, so as to optimize each parameter in the classification model. The accuracy information of the retrained classification model is then obtained based on the test set data in the third data set; this accuracy information can also be understood as the classification model's test recognition rate on that test set.
  • the acquired accuracy information is compared with a preset value; if the preset value is reached, operation 312 is performed; if not, operations 306 to 310 are repeated and new data is continuously added to the first data set until the accuracy information of the classification model trained on the new third data set reaches the preset value.
  • Operation 312 When the accuracy information reaches a preset value, filter unlabeled data based on the trained classification model, and merge the filtered data into the first data set to form a second data set;
  • Operation 314 classify and clean the data of the second data set based on the trained classification model to form a target data set with a target number.
  • the operations 312 to 314 correspond to the operations 106 to 108 in the foregoing embodiment, and are not repeated here.
  • the method of constructing a data set in this embodiment can continuously add new data to the first data set, so that the amount of data in the resulting third data set increases; training the classification model again on the third data set can then optimize each parameter in the classification model and improve its test recognition rate, that is, improve the performance of the classification model. At the same time, more unlabeled network data can be classified and filtered based on the trained classification model, improving the accuracy of classification and screening.
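Operations 306 to 310 form a loop: acquire a new labeled batch, merge it to form a third data set, retrain, and re-evaluate until the preset accuracy value is reached. A minimal sketch, with the training and evaluation steps replaced by stubs (the stub behavior, where accuracy grows with data volume, is an assumption for illustration only):

```python
def train_until_accurate(data_set, acquire_new_batch, train_and_evaluate,
                         preset_value, max_rounds=10):
    """Keep merging new labeled batches (forming successive 'third data
    sets') until the evaluated accuracy reaches the preset value."""
    for _ in range(max_rounds):
        accuracy = train_and_evaluate(data_set)
        if accuracy >= preset_value:
            return data_set, accuracy
        data_set = data_set + acquire_new_batch()  # operations 306-308
    raise RuntimeError("accuracy did not reach the preset value")

# Stub behavior: accuracy grows with the amount of training data.
train_and_evaluate = lambda ds: min(1.0, len(ds) / 100)
acquire_new_batch = lambda: [("new", "label")] * 20

final_set, acc = train_until_accurate([("seed", "label")] * 30,
                                      acquire_new_batch,
                                      train_and_evaluate, 0.9)
```

In this toy run, three batches of 20 items are added before the stubbed accuracy reaches the preset value of 0.9.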
  • FIG. 4 is a flowchart of obtaining a first data set having a first preset number and carrying label information according to a learning task according to an embodiment.
  • acquiring a first data set having a first preset number and carrying labeled information according to a learning task includes operations 402 to 406. among them:
  • Operation 402 Define an image category and an object category of the data to be acquired according to the learning task.
  • the learning task can be understood as the ultimate recognition target of the classification model, that is, the purpose of training the classification model.
  • an image category and an object category of data to be acquired may be defined according to a learning task.
  • the image category is the training target of the background area in the image data, for example, landscape, beach, snow, blue sky, green space, night scene, darkness, backlight, sunrise / sunset, indoor, fireworks, spotlight, etc.
  • the object category is the training target of the foreground area in the image data, for example, portrait, baby, cat, dog, food, etc.
  • the background training target and foreground training target can also be text documents, macros, and so on.
  • Operation 404 Acquire data according to the image category and the object category.
  • the web crawler technology can be used to search the image data of each image category and the object category on each search engine and complete the corresponding download.
  • available open source data sets can also be found and downloaded, such as: MNIST, a handwritten-digit-recognition data set commonly used as a deep learning entry point; MS-COCO, which can be used for image segmentation, edge detection, keypoint detection, and image captioning; ImageNet, one of the most famous image data sets, on which commonly used models such as VGG, Inception, and ResNet are trained; and Open Images Dataset, a data set containing nearly 9 million image URLs annotated with thousands of categories and bounding boxes. Image data associated with the learning task can be obtained from each of these open source data sets.
  • the open source data sets can also cover natural language processing, speech, and Analytics Vidhya practice problems.
  • the web crawler technology and the downloaded open source data set can also be used to obtain image data associated with the learning task, which can improve the efficiency of data acquisition.
  • the amount of image data of each image category and the amount of image data of each object category are relatively balanced.
  • the amount of image data of each category is within a preset range; the preset range can be set, for example, between 2000 and 2500, or within another range, and is not further limited herein. This ensures the overall quality of the image data of each category after the classification model is trained, and avoids a certain category in the first data set having relatively more or fewer image data, which may affect the training effect of that category or of other categories.
  • Operation 406 Annotate the acquired data based on a manual annotation method to obtain a first data set having a first preset number and carrying annotation information.
  • a large amount of image data obtained by using web crawler technology and/or open source data sets can be labeled, that is, a label is set for the obtained data so that each type of data carries labeling information.
  • the annotation information includes an image category and/or an object category. That is, if the image data includes only a portrait area, the label information of the image data is portrait; if the background area in the image data is a beach, the label information is beach; if the background area of the image data is a sunrise and the foreground area is a portrait, the label information of this image data is sunrise and portrait.
  • While labeling the image data, it is also necessary to set the number of each type of image category and each type of object category so that the number of each type of image data stays within a suitable range; for example, the number of image data of each category carrying labeling information can be kept in the range of 2000 to 2500. This ensures the overall quality of the image data of each category after the classification model is trained, and avoids a certain category in the first data set having relatively more or fewer image data, which would affect the training effect of that category or of other categories.
  • Each type of image data carrying the labeled information is stored to form a first data set having a first preset number, where the first preset number is a sum of the number of types of image data.
  • FIG. 5 is a flowchart of training the classification model on the first data set and evaluating accuracy information of the classification model in one embodiment.
  • the classification model is a neural network
  • the annotation information includes an image category and an object category.
  • the classification model is trained on the first data set and accuracy information of the classification model is evaluated, including operations 502 to 508, as follows:
  • Operation 502 A first data set carrying label information is input to the neural network; feature extraction is performed through a basic network layer of the neural network, and the extracted image features are input to a classification network layer and a target detection network layer.
  • the classification network layer obtains a first loss function reflecting the difference between the first predicted confidence level and the first true confidence level of the specified image category to which the background image in the image data belongs, and the target detection network layer obtains a second loss function reflecting the difference between the second predicted confidence level and the second true confidence level of the specified object category to which the foreground object in the image data belongs.
  • the image data of the first data set can be divided into a training set and a test set according to a preset ratio, and the image data carrying the labeled information in the training set is input to the neural network to obtain a first loss function reflecting, for each pixel of the background region in the image data, the difference between the first predicted confidence level and the first true confidence level, and a second loss function reflecting, for each pixel in the foreground region in the image data, the difference between the second predicted confidence level and the second true confidence level;
  • the first prediction confidence is the confidence, predicted by the neural network, that a pixel of the background region in the image data belongs to the background training target, and the first true confidence indicates the confidence that a pixel pre-labeled in the image data belongs to the background training target;
  • the second prediction confidence level is the confidence level that a pixel point in the foreground area in the image data predicted by the neural network belongs to the foreground training target.
  • the second true confidence degree indicates the confidence that a pixel pre-labeled in the image data belongs to the foreground training target.
  • the data in the first data set may be divided into a training set and a test set according to a preset ratio.
  • the preset ratio of the number of image data in the training set to the number of image data in the test set may be set to 9: 1, that is, the ratio of the number of data in the training set to the number of data in the test set is 9: 1.
  • the preset ratio can be set according to actual needs, and no further limitation is made here.
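The 9:1 split described above can be sketched as follows; the shuffling seed and function name are illustrative assumptions:

```python
import random

def split_train_test(data, train_ratio=0.9, seed=42):
    """Split the first data set into a training set and a test set at the
    preset ratio (here 9:1), as described in the embodiment."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)  # shuffle a copy, keep input intact
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))
train, test = split_train_test(data)
```

With 100 items and a 0.9 ratio, 90 items go to the training set and 10 to the test set, with no item lost or duplicated.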
  • the image data with labeled information in the training set can be input to the neural network.
  • the neural network extracts features based on the background training target and the foreground training target, using features such as SIFT (Scale-Invariant Feature Transform) and HOG (Histogram of Oriented Gradients), and then performs detection with object detection algorithms such as SSD (Single Shot MultiBox Detector), VGG (Visual Geometry Group) networks, and convolutional neural networks (CNN).
  • the background training target is detected to obtain a first prediction confidence level
  • the foreground training target is detected to obtain a second prediction confidence level.
  • the first prediction confidence level is the confidence level that a pixel in the background region in the image data predicted by the neural network belongs to the background training target.
  • the second prediction confidence level is the confidence level that a pixel point in the foreground region in the image data predicted by the neural network belongs to the foreground training target.
  • the background training target and the foreground training target can be pre-labeled in the image data to obtain the first true confidence level and the second true confidence level.
  • the first true confidence degree indicates a confidence degree that the pixel point previously marked in the image data belongs to the background training target.
  • the second true confidence degree indicates the confidence degree that the pixel point previously marked in the image data belongs to the foreground training target.
  • the true confidence can be expressed as 1 (or positive value) and 0 (or negative value), which are used to indicate that the pixel belongs to the training target and does not belong to the training target, respectively.
  • a first loss function is obtained by obtaining a difference between the first prediction confidence and the first true confidence
  • a second loss function is obtained by obtaining a difference between the second prediction confidence and the second true confidence.
  • Both the first loss function and the second loss function can be logarithmic, hyperbolic, or absolute value functions.
  • For each of one or more pixels in the image data, the neural network can be used to predict a confidence level with respect to the training target.
  • Operation 504 Weight the sum of the first loss function and the second loss function to obtain a target loss function.
  • the first loss function and the second loss function are respectively configured with corresponding weight values, and the weight values can be adjusted according to the recognition scenario. The first loss function is multiplied by a corresponding first weight value a, the second loss function is multiplied by a corresponding second weight value b, and the sum of the two products gives the target loss function.
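The weighted summation described above is a simple linear combination. A minimal sketch, with the weight values a and b chosen arbitrarily for illustration:

```python
def target_loss(first_loss, second_loss, a=0.5, b=0.5):
    """Target loss = a * (first loss, background/image-category branch)
                   + b * (second loss, foreground/object-category branch)."""
    return a * first_loss + b * second_loss

# Example: weight the background branch more heavily than the foreground one.
loss = target_loss(first_loss=0.8, second_loss=0.4, a=0.7, b=0.3)
```

Here 0.7 * 0.8 + 0.3 * 0.4 gives a target loss of 0.68.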
  • Operation 506 Adjust parameters of the neural network according to the target loss function.
  • the parameters of the neural network refer to the weight value of each layer of the network.
  • the target loss function is used to adjust the parameters of the neural network so that both the first loss function and the second loss function are minimized, that is, so that the difference between the predicted confidence level and the true confidence level of each pixel, or the sum of those differences, is minimized, to obtain the trained neural network.
  • the parameters of each layer of the network can be adjusted step by step through the back propagation algorithm.
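The step-by-step parameter adjustment can be illustrated with plain gradient descent; the learning rate and the toy quadratic loss below are assumptions standing in for backpropagation through a real network, not the patent's method:

```python
def gradient_step(weights, gradients, learning_rate=0.1):
    """One parameter update: move each layer weight against its gradient,
    which is what backpropagation does layer by layer."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

# Toy target loss L(w) = sum(w_i^2), whose gradient is 2 * w_i per weight.
weights = [1.0, -2.0]
for _ in range(50):
    gradients = [2 * w for w in weights]
    weights = gradient_step(weights, gradients)
```

After repeated steps the weights shrink toward the minimizer of the toy loss, mirroring how the target loss function drives the network parameters toward lower loss.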
  • Operation 508 Test the neural network based on a test set in the first data set to obtain accuracy information of the neural network.
  • the image data with labeled information in the test set is input to the neural network after its parameters are adjusted, and the neural network is evaluated to obtain the accuracy information of the trained neural network.
  • the accuracy information can also be understood as the test recognition rate of each data in the test set by the neural network. The higher the recognition rate, the higher the accuracy information, and the better the performance of the trained neural network.
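The test recognition rate mentioned above can be sketched as the fraction of test samples whose predicted category matches the annotation. `model` below is a hypothetical stand-in callable for the trained classifier, not an API from this description:

```python
# Hedged sketch: accuracy information as the share of correctly recognized
# test samples. `test_set` is assumed to be (sample, annotation) pairs.
def recognition_rate(model, test_set):
    correct = sum(1 for sample, label in test_set if model(sample) == label)
    return correct / len(test_set)
```

A preset accuracy threshold (e.g. operation 104's "preset value") can then be compared directly against this rate.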
  • The target loss function is obtained by weighting the first loss function corresponding to the background training target and the second loss function corresponding to the foreground training target, and the parameters of the neural network are adjusted according to the target loss function, so that the trained neural network can simultaneously identify image categories and object categories, obtain more information, and improve recognition efficiency.
  • FIG. 6 is a schematic structural diagram of a neural network in an embodiment.
  • The input layer of the neural network receives image data carrying annotation information and performs feature extraction through a basic network (such as a CNN), outputting the extracted image features to a feature layer. The first loss function is obtained by performing detection on the background training target, and the second loss function is obtained by performing detection on the foreground training target. The first loss function and the second loss function are then weighted and summed to obtain the target loss function.
  • FIG. 7 is a schematic structural diagram of a neural network in another embodiment.
  • The input layer of the neural network receives image data carrying annotation information and performs feature extraction through a basic network (such as a CNN), outputting the extracted image features to a feature layer. The first loss function is obtained by performing category detection on the background training target, the second loss function is obtained by performing category detection on the foreground training target according to the image features, and the position loss function is obtained by performing position detection on the foreground training target according to the foreground area. The first loss function, the second loss function, and the position loss function are then weighted and summed to obtain the target loss function.
  • the neural network may be a convolutional neural network.
  • Convolutional neural networks include a data input layer, a convolutional calculation layer, an activation layer, a pooling layer, and a fully connected layer.
  • The data input layer is used to pre-process the original image data.
  • The pre-processing may include de-averaging, normalization, dimensionality reduction, and whitening.
  • De-averaging means centering every dimension of the input data at 0, pulling the center of the sample back to the origin of the coordinate system.
  • Normalization scales the amplitudes to the same range.
  • Whitening normalizes the amplitude on each characteristic axis of the data.
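The de-averaging and normalization steps above can be sketched as follows; this is an illustrative assumption about the pipeline (a per-dimension center-and-scale), not the exact pre-processing of the described implementation:

```python
import numpy as np

def preprocess(x):
    """De-average and normalize a (samples, dimensions) array."""
    x = x - x.mean(axis=0)          # de-averaging: centre each dimension at 0
    std = x.std(axis=0)
    std[std == 0] = 1.0             # guard against constant dimensions
    return x / std                  # normalization: comparable amplitude range
```

Dimensionality reduction and whitening (decorrelating the feature axes) would be applied after these two steps when used.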
  • The convolution calculation layer performs local correlation with a sliding window; the weights of each filter connected to the data window in the convolution calculation layer are fixed. Each filter attends to one image feature, such as vertical edges, horizontal edges, color, or texture, and these filters are combined to describe the entire image. A filter is a weight matrix, and a weight matrix can be convolved with the data in different windows.
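The filter-as-weight-matrix idea above can be sketched as a toy single-channel convolution: one kernel slid over an image, computing a dot product in each window. Real CNN layers add channels, stride, padding, and bias, all omitted here for clarity:

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid (no-padding) 2D correlation of one weight matrix with an image."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # dot product of the weight matrix with the current data window
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

A kernel such as `[[1, 0, -1]] * 3` responds to vertical edges, matching the "each filter focuses on one image feature" description.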
  • The activation layer non-linearly maps the output of the convolution layer.
  • The activation function used by the activation layer may be ReLU (Rectified Linear Unit).
  • The pooling layer can be sandwiched between consecutive convolutional layers to compress the amount of data and parameters and reduce overfitting.
  • The pooling layer can use the maximum method or average method to reduce the dimensionality of the data.
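The maximum and average pooling methods above can be sketched for a single-channel feature map as follows; the 2x2 window is an illustrative assumption:

```python
import numpy as np

def pool_2x2(x, mode="max"):
    """2x2 max or mean pooling; trailing odd rows/columns are dropped."""
    h, w = x.shape
    blocks = x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))
```

Either mode quarters the spatial size, which is the data/parameter compression the pooling layer provides.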
  • The fully connected layer is located at the tail of the convolutional neural network, where all neurons between two adjacent layers are connected by weights. Part of the convolutional layers is cascaded to a first confidence output node, part is cascaded to a second confidence output node, and part is cascaded to a position output node. The image category of the background can be detected according to the first confidence output node, the type of the foreground object can be detected according to the second confidence output node, and the position corresponding to the foreground object can be detected according to the position output node.
  • FIG. 8 is a flowchart of classifying and filtering unlabeled data based on a classification model in an embodiment, and merging the filtered data into the first data set to form a second data set.
  • As shown in FIG. 8, classifying and filtering unlabeled data based on the trained classification model and merging the filtered data into the first data set to form a second data set includes operations 802 to 806.
  • Operation 802: Classify the unlabeled data based on the trained classification model to filter out data with a preset category.
  • The image data in the first data set are all manually annotated, so the quality of the data is high but the quantity is small. To optimize the performance of the classification model, more training data is needed; that is, the first data set must be populated with more data.
  • The trained classification model can essentially identify and classify a large amount of unlabeled data, so a large amount of data obtained through web crawler technology and open-source data sets can be classified and screened. Data with a preset category can be filtered out, where the preset categories include image categories (landscape, beach, snow, blue sky, green space, night scene, dark, backlight, sunrise/sunset, indoor, fireworks, spotlight, etc.), object categories (portrait, baby, cat, dog, food, etc.), and other categories (text document, macro, etc.).
  • Based on the trained classification model, a large amount of unlabeled data can be classified to identify the category information of each data item. The category information corresponds to the preset categories and can also be understood as the annotation information of the data, so the data can be annotated automatically without manual labeling, which greatly improves the efficiency of screening, classification, and annotation.
  • Operation 804: Obtain a third preset amount of data from the screening result, where the third preset amount is the sum of the data amounts of each of the preset categories.
  • Based on the trained classification model, the category information of the data can be automatically identified and automatically annotated, and the data of each category can be filtered out at the same time. A third preset amount of data is then acquired according to a preset demand, where the third preset amount is the sum of the data amounts of the selected preset categories. The amount of data in each preset category falls within a certain range, for example 3000-3500, which can be set according to the target number. It should be noted that the sum of the third preset number and the first preset number is greater than the target number.
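Operation 804 can be sketched as a per-category quota selection over the model-labelled samples; the `cap` value and the (sample, category) pair format are illustrative assumptions:

```python
from collections import defaultdict

def select_by_category(labelled_samples, cap=3500):
    """Keep at most `cap` samples per predicted category.

    Returns the per-category buckets and the total kept, i.e. the
    "third preset amount" as the sum over the selected categories.
    """
    buckets = defaultdict(list)
    for sample, category in labelled_samples:
        if len(buckets[category]) < cap:
            buckets[category].append(sample)
    third_preset_amount = sum(len(v) for v in buckets.values())
    return buckets, third_preset_amount
```

The cap keeps each category within the stated range so no single class dominates the merged second data set.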
  • Operation 806: Merge the third preset amount of data into the first data set to form a second data set.
  • The number of data items in the second data set is the sum of the first preset number and the third preset number, so that both the quantity and quality of the data in the second data set are significantly improved. This avoids spending a great deal of manpower on filtering and annotating data in the process of constructing the data set, which saves costs and improves the efficiency of acquiring the data set.
  • In the process of constructing the target data set, a classification model can be trained based on the first data set, and a large amount of unclassified, unlabeled data can then be filtered and automatically annotated by the trained classification model. This reduces the amount of manual classification and labeling, saves labeling costs, and improves the efficiency and quality of obtaining a data set that meets the learning task.
  • FIG. 9 is a flowchart of classifying and cleaning the data of the second data set to form a target data set with a target number based on the trained classification model in an embodiment.
  • Classifying and cleaning the data of the second data set based on the trained classification model to form a target data set with a target number includes operations 902 to 910:
  • Operation 902: Classify data of the second data set based on the trained classification model to filter out data that does not meet preset requirements.
  • The trained classification model may be understood as a model trained on the first data set or a model trained on the second data set. Since the amount of data in the second data set is greater than that in the first data set, the classification model may be trained again based on the second data set.
  • each type of data in the second data set can be identified, and then category information of each type of data is obtained, and the category information includes image categories and object categories.
  • The preset requirement may be that the classification model can correctly identify the category information of the data, where the criterion for correctness is that the identified category information is consistent with the manually labeled annotation information. If the classification model fails to identify the category information of a certain data item, that data does not meet the preset requirements and is filtered out.
  • Operation 904: Clean the data that does not meet the preset requirements, for example by deleting data unrelated to the learning task and duplicate data, and smoothing noisy data in the second data set. If the category information identified by the classification model is inconsistent with the manually labeled annotation information, check whether the annotation information of the data is correct and, if not, correct it, thereby cleaning the data that does not meet the preset requirements.
  • Operation 906: The amount of data may be reduced after cleaning, so the number of data items remaining after the cleaning processing needs to be counted to determine whether it has reached the target number.
  • If it has, operation 908 is performed to form the target data set from the cleaned data. Specifically, all the cleaned data may be retained to form the target data set, or a target number of data items may be randomly selected from the cleaned data to form the target data set.
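The consistency check and the random draw down to the target number can be sketched as follows; `model` is a hypothetical stand-in for the trained classifier, and the (sample, annotation) pair format is an assumption:

```python
import random

def clean_and_trim(dataset, model, target_number, seed=0):
    """Keep samples whose prediction matches the annotation, then
    randomly draw the target number if more than that survive."""
    kept = [(x, y) for x, y in dataset if model(x) == y]
    if len(kept) > target_number:
        kept = random.Random(seed).sample(kept, target_number)
    return kept
```

If fewer than `target_number` samples survive, the caller would loop back to classify and filter more unlabeled data, as operation 910 describes.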
  • If it has not, operation 910 is performed: unlabeled data is again classified and filtered based on the trained classification model to form a new second data set, and the new second data set is classified and cleaned to form a target data set having the target number.
  • Operations 306 to 308 may be repeated until the data amount of the target data set reaches the target number.
  • Alternatively, new data having a second preset amount and carrying annotation information may be acquired and merged into the second data set, and the new second data set may be classified and cleaned to form a target data set having the target number.
  • The method for constructing a data set further includes retraining the classification model on the target data set.
  • In retraining, the input data set is the target data set, and the amount of image data in the target data set is much greater than that in the first data set. The classification model can therefore be better trained on the target data set, and its parameters can be optimized so that the accuracy of the trained classification model reaches an ideal state and the performance of the classification model is improved.
  • Although the operations in FIGS. 1-5 and 8-9 are displayed sequentially according to the direction of the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, there is no strict order in which these operations must be performed, and they may be performed in other orders. Moreover, at least some of the operations in FIGS. 1-5 and 8-9 may include multiple sub-operations or stages. These sub-operations or stages are not necessarily performed at the same time and may be performed at different times; their execution order is not necessarily sequential, and they may be performed in turn or alternately with at least some of the sub-operations or stages of other operations.
  • FIG. 10 is a structural block diagram of an apparatus for constructing a data set in an embodiment.
  • a device for constructing a data set includes:
  • a data set acquisition module 1010 configured to acquire a first data set having a first preset number and carrying labeled information according to a learning task;
  • a model training module 1020 configured to train a classification model on the first data set, and evaluate accuracy information of the classification model
  • a data set merging module 1030 configured to, when the accuracy information reaches a preset value, filter unlabeled data based on the trained classification model, and merge the filtered data into the first data set to form a second data set;
  • a data set processing module 1040 is configured to classify and clean the data of the second data set based on the trained classification model to form a target data set having a target number, where the data amount of the second data set is greater than or equal to The amount of data in the target dataset.
  • The above data set construction apparatus can acquire, according to a learning task, a first data set having a first preset number and carrying annotation information; train a classification model on the first data set and evaluate the accuracy information of the classification model; when the accuracy information reaches a preset value, classify and filter unlabeled data based on the trained classification model and merge the filtered data into the first data set to form a second data set; and classify and clean the data of the second data set based on the trained classification model to form a target data set having a target number. By realizing semi-automatic data collection, screening, and annotation, a large amount of high-quality data for training the classification model can be obtained with little manual effort, greatly saving labor costs while improving the efficiency of constructing the data set.
  • the apparatus for constructing a data set further includes:
  • a new data acquisition module configured to acquire new data having a second preset number and carrying annotation information when the accuracy information does not reach the preset value, and to merge the new data into the first data set to form a third data set;
  • the model training module is further configured to train the classification model on the third data set again until the accuracy information of the classification model reaches a preset value.
  • The apparatus for constructing a data set in this embodiment may continuously add new data to the first data set, so that the amount of data in the formed third data set increases. The classification model may then be trained again on the third data set, which optimizes the parameters of the classification model and improves its test recognition rate, that is, its performance. At the same time, more unlabeled network data can be classified and filtered based on the trained classification model, improving the accuracy of classification and screening.
  • the data set acquisition module includes:
  • a definition unit configured to define an image category and an object category of data to be acquired according to the learning task
  • a first obtaining unit configured to obtain data according to the image category and the object category
  • the second obtaining unit is configured to mark the acquired data based on a manual labeling method to obtain a first data set having a first preset number and carrying labeling information.
  • the classification model is a neural network
  • the annotation information includes an image category and an object category
  • The model training module includes:
  • An input unit configured to input the first data set carrying annotation information to a neural network, perform feature extraction through a basic network layer of the neural network, and output the extracted image features to a classification network layer and a target detection network layer; to obtain, at the classification network layer, a first loss function reflecting the difference between a first predicted confidence level and a first true confidence level of the specified image category to which the background image in the image data belongs; and to obtain, at the target detection network layer, a second loss function reflecting the difference between a second predicted confidence level and a second true confidence level of the specified object category to which the foreground target in the image data belongs;
  • a processing unit configured to perform weighted summation of the first loss function and the second loss function to obtain a target loss function
  • An adjustment unit configured to adjust parameters of the neural network according to the target loss function
  • the evaluation unit is configured to test the neural network based on a test set in the first data set to obtain accuracy information of the neural network.
  • A target loss function is obtained by a weighted summation of the first loss function corresponding to the background training target and the second loss function corresponding to the foreground training target, and the parameters of the neural network are adjusted according to the target loss function, so that the trained neural network can simultaneously identify the background category and the foreground target, obtain more information, and improve recognition efficiency.
  • the data collection combining module includes:
  • a screening unit configured to classify the unlabeled data based on the trained classification model to filter out data having a preset category
  • a labeling unit configured to obtain a third preset amount of data in the screening result; wherein the third preset amount is a sum of the data amounts of each of the preset categories;
  • a third obtaining unit is configured to merge the third preset amount of data into the first data set to form a second data set.
  • the apparatus for constructing a data set in this embodiment can train a classification model based on the first data set during the target data construction process, and then use the trained classification model to filter a large amount of data that is not labeled and automatically classify it. Labeling can reduce the number of manual classification labels, save labeling costs, and improve the efficiency and quality of obtaining data sets that meet the learning task.
  • the data set processing module includes:
  • a screening unit configured to classify data of the second data set based on the trained classification model to filter out data that does not meet preset requirements
  • a cleaning unit configured to clean the data that does not meet the preset requirements
  • A judging unit configured to judge whether the amount of the cleaned data reaches the target number; if so, the target data set is formed from the cleaned data; if not, the unlabeled data is classified and filtered based on the trained classification model to form a new second data set, and the new second data set is classified and cleaned to form a target data set having the target number.
  • The division of modules in the above data set construction apparatus is only for illustration. In other embodiments, the apparatus may be divided into different modules as required to implement all or part of the functionality of the above data set construction apparatus.
  • An embodiment of the present application further provides a mobile terminal.
  • The mobile terminal includes a memory and a processor.
  • A computer program is stored in the memory.
  • When executed by the processor, the computer program causes the processor to perform the operations of the data set construction method.
  • An embodiment of the present application further provides a computer-readable storage medium.
  • a computer-readable storage medium has stored thereon a computer program that, when executed by a processor, implements the operations of the data set construction method.
  • FIG. 11 is a schematic diagram of an internal structure of a mobile terminal according to an embodiment.
  • the mobile terminal includes a processor, a memory, and a network interface connected through a system bus.
  • the processor is used to provide computing and control capabilities to support the operation of the entire mobile terminal.
  • the memory is used to store data, programs, and the like. At least one computer program is stored on the memory, and the computer program can be executed by a processor to implement the wireless network communication method applicable to the mobile terminal provided in the embodiments of the present application.
  • the memory may include a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and a computer program.
  • the computer program may be executed by a processor to implement a method for constructing a data set provided by each of the following embodiments.
  • The internal memory provides a cached runtime environment for the operating system and the computer programs in the non-volatile storage medium.
  • the network interface may be an Ethernet card or a wireless network card, and is used to communicate with an external mobile terminal.
  • The mobile terminal may be a mobile phone, a tablet computer, a personal digital assistant, or a wearable device.
  • each module in the apparatus for constructing a data set provided in the embodiments of the present application may be in the form of a computer program.
  • the computer program can be run on a mobile terminal or server.
  • the program module constituted by the computer program can be stored in a memory of a mobile terminal or a server.
  • the computer program is executed by a processor, the operations of the method described in the embodiments of the present application are implemented.
  • a computer program product containing instructions that, when run on a computer, causes the computer to perform a method of constructing a data set.
  • An embodiment of the present application further provides a mobile terminal.
  • the above mobile terminal includes an image processing circuit, and the image processing circuit may be implemented by using hardware and / or software components, and may include various processing units that define an ISP (Image Signal Processing) pipeline.
  • FIG. 12 is a schematic diagram of an image processing circuit in an embodiment. As shown in FIG. 12, for ease of description, only aspects of the image processing technology related to the embodiment of the present application are shown.
  • the image processing circuit includes an ISP processor 1240 and a control logic 1250.
  • The image data captured by the imaging device 1210 is first processed by the ISP processor 1240, which analyzes the image data to capture image statistics that can be used to determine one or more control parameters of the ISP processor 1240 and/or the imaging device 1210.
  • the imaging device 1210 may include a camera having one or more lenses 1212 and an image sensor 1214.
  • the image sensor 1214 may include a color filter array (such as a Bayer filter).
  • The image sensor 1214 may obtain the light intensity and wavelength information captured by each imaging pixel of the image sensor 1214 and provide a set of raw image data that may be processed by the ISP processor 1240.
  • The sensor 1220 (such as a gyroscope) may provide acquired image processing parameters (such as image stabilization parameters) to the ISP processor 1240 based on the interface type of the sensor 1220.
  • the sensor 1220 interface may use a SMIA (Standard Mobile Imaging Architecture) interface, other serial or parallel camera interfaces, or a combination of the foregoing interfaces.
  • the image sensor 1214 may also send the original image data to the sensor 1220, and the sensor 1220 may provide the original image data to the ISP processor 1240 based on the interface type of the sensor 1220, or the sensor 1220 stores the original image data into the image memory 1230.
  • the ISP processor 1240 processes the original image data pixel by pixel in a variety of formats.
  • each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 1240 may perform one or more image processing operations on the original image data and collect statistical information about the image data.
  • the image processing operations may be performed with the same or different bit depth accuracy.
  • the ISP processor 1240 may also receive image data from the image memory 1230.
  • the sensor 1220 interface sends the original image data to the image memory 1230, and the original image data in the image memory 1230 is then provided to the ISP processor 1240 for processing.
  • the image memory 1230 may be a part of a memory device, a storage device, or a separate dedicated memory in a mobile terminal, and may include a DMA (Direct Memory Access) feature.
  • the ISP processor 1240 may perform one or more image processing operations, such as time-domain filtering.
  • the processed image data may be sent to the image memory 1230 for further processing before being displayed.
  • the ISP processor 1240 receives processed data from the image memory 1230, and performs image data processing on the processed data in the original domain and in the RGB and YCbCr color spaces.
  • the image data processed by the ISP processor 1240 may be output to the display 1270 for viewing by a user and / or further processed by a graphics engine or a GPU (Graphics Processing Unit).
  • the output of the ISP processor 1240 can also be sent to the image memory 1230, and the display 1270 can read image data from the image memory 1230.
  • the image memory 1230 may be configured to implement one or more frame buffers.
  • the output of the ISP processor 1240 may be sent to an encoder / decoder 1260 to encode / decode image data.
  • The encoded image data can be saved, and decompressed before being displayed on the display 1270.
  • the encoder / decoder 1260 may be implemented by a CPU or a GPU or a coprocessor.
  • the statistical data determined by the ISP processor 1240 may be sent to the control logic 1250 unit.
  • the statistical data may include image information of the image sensor 1214 such as automatic exposure, automatic white balance, automatic focus, flicker detection, black level compensation, and lens 1212 shading correction.
  • The control logic 1250 may include a processor and/or a microcontroller that executes one or more routines (such as firmware). The one or more routines may determine the control parameters of the imaging device 1210 and the control parameters of the ISP processor 1240 based on the received statistical data.
  • Control parameters of the imaging device 1210 may include sensor 1220 control parameters (such as gain, integration time for exposure control, and image stabilization parameters), camera flash control parameters, lens 1212 control parameters (such as focal length for focusing or zooming), or a combination of these parameters.
  • ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (eg, during RGB processing), and lens 1212 shading correction parameters.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM), which is used as external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Abstract

A method for constructing a data set includes: acquiring, according to a learning task, a first data set having a first preset number and carrying annotation information; training a classification model on the first data set and evaluating accuracy information of the classification model; when the accuracy information reaches a preset value, classifying and filtering unlabeled data based on the trained classification model, and merging the filtered data into the first data set to form a second data set; and classifying and cleaning the data of the second data set based on the trained classification model to form a target data set having a target number.

Description

Method for constructing a data set, mobile terminal, and readable storage medium
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201810588652.X, filed with the China National Intellectual Property Administration on June 8, 2018 and entitled "Method and apparatus for constructing a data set, mobile terminal, and readable storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of computer applications, and in particular to a method for constructing a data set, a mobile terminal, and a computer-readable storage medium.
Background
The field of artificial intelligence (AI) is developing rapidly. In particular, with the wide application of deep learning technology, breakthrough progress has been made in fields such as object detection and recognition. In general, AI algorithms are mainly based on supervised deep learning, and training data is the driving force of an artificial intelligence model.
Current methods for acquiring training data mainly include open-source data sets, web crawling, and offline collection. However, in order to obtain a large amount of data related to a learning task, open-source data sets and web-crawled data generally need to be manually screened, classified, and annotated, and only after a large amount of filtered annotated data has been obtained can it be applied to model training. This often consumes considerable manpower and material resources, and the cost is very high.
Summary
According to various embodiments of the present application, a method for constructing a data set, a mobile terminal, and a computer-readable storage medium are provided.
A method for constructing a data set includes:
acquiring, according to a learning task, a first data set having a first preset number and carrying annotation information;
training a classification model on the first data set, and evaluating accuracy information of the classification model;
when the accuracy information reaches a preset value, filtering unlabeled data based on the trained classification model, and merging the filtered data into the first data set to form a second data set; and
classifying and cleaning the data of the second data set based on the trained classification model to form a target data set having a target number, where the data amount of the second data set is greater than or equal to the data amount of the target data set.
An apparatus for constructing a data set includes:
a data set acquisition module configured to acquire, according to a learning task, a first data set having a first preset number and carrying annotation information;
a model training module configured to train a classification model on the first data set and evaluate accuracy information of the classification model;
a data set merging module configured to, when the accuracy information reaches a preset value, filter unlabeled data based on the trained classification model and merge the filtered data into the first data set to form a second data set; and
a data set processing module configured to classify and clean the data of the second data set based on the trained classification model to form a target data set having a target number, where the data amount of the second data set is greater than or equal to the data amount of the target data set.
A mobile terminal includes a memory and a processor. A computer program is stored in the memory, and when executed by the processor, the computer program causes the processor to perform the operations of the data set construction method.
A computer-readable storage medium has a computer program stored thereon, and when executed by a processor, the computer program implements the operations of the data set construction method.
In the method for constructing a data set, the mobile terminal, and the computer-readable storage medium of the embodiments of the present application, a first data set having a first preset number and carrying annotation information is acquired according to a learning task; a classification model is trained on the first data set, and accuracy information of the classification model is evaluated; when the accuracy information reaches a preset value, unlabeled data is classified and filtered based on the trained classification model, and the filtered data is merged into the first data set to form a second data set; the data of the second data set is then classified and cleaned based on the trained classification model to form a target data set having a target number. This realizes semi-automatic data collection, screening, and annotation, so that a large amount of high-quality data for training the classification model can be obtained with little manual effort, greatly saving labor costs while improving the efficiency of constructing the data set.
Brief description of the drawings
In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of a method for constructing a data set in an embodiment.
FIG. 2 is a schematic diagram of the categories of shooting scenes in an embodiment.
FIG. 3 is a flowchart of a method for constructing a data set in another embodiment.
FIG. 4 is a flowchart of acquiring, according to a learning task, a first data set having a first preset number and carrying annotation information in an embodiment.
FIG. 5 is a flowchart of training the classification model on the first data set and evaluating accuracy information of the classification model in an embodiment.
FIG. 6 is a schematic diagram of the architecture of a neural network in an embodiment.
FIG. 7 is a schematic diagram of the architecture of a neural network in another embodiment.
FIG. 8 is a flowchart of classifying and filtering unlabeled data based on a classification model and merging the filtered data into the first data set to form a second data set in an embodiment.
FIG. 9 is a flowchart of classifying and cleaning the data of the second data set based on the trained classification model to form a target data set having a target number in an embodiment.
FIG. 10 is a structural block diagram of an image processing apparatus in an embodiment.
FIG. 11 is a schematic diagram of the internal structure of a mobile terminal in an embodiment.
FIG. 12 is a schematic diagram of an image processing circuit in an embodiment.
Detailed description
In order to make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
FIG. 1 is a flowchart of a method for constructing a data set in an embodiment. As shown in FIG. 1, a method for constructing a data set includes operations 102 to 108:
Operation 102: Acquire, according to a learning task, a first data set having a first preset number and carrying annotation information.
The data in the first data set may be image data, video data, text data, speech data, and the like. In the present application, image data is taken as an example. According to the learning task, the image categories and object categories of the image data to be collected, classified, and filtered can first be defined. An image category can be understood as a training target for the background region of the training data, for example, landscape, beach, snow, blue sky, green space, night scene, dark, backlight, sunrise/sunset, indoor, fireworks, spotlight, etc. An object category is a training target for the foreground region of the training data, for example, portrait, baby, cat, dog, food, etc. In addition, the background and foreground training targets may also be text documents, macro shots, and the like.
It should be noted that the background region refers to the background part of the image data, and the foreground region refers to the foreground part of the image data.
As shown in FIG. 2, the shooting scenes of image data may include image categories of the background region, object categories of the foreground region, and others. The image categories of the background region may include landscape, beach, snow, blue sky, green space, night scene, dark, backlight, sunrise/sunset, indoor, fireworks, spotlight, etc. The object categories of the foreground region may be portrait, baby, cat, dog, food, etc. The others may be text documents, macro shots, and the like.
According to the defined image categories and object categories, a large amount of data can be acquired from open-source data sets and web crawlers and then manually screened and classified. The amount of data for each image category and each object category falls within a preset range; the amounts may or may not be equal. The specific number can be set according to actual demand, for example, 2000 or another value. Through manual screening and classification, image data including the first preset number can be filtered out.
At the same time, the filtered image data needs to be manually annotated so that each image carries annotation information. The annotation information includes at least one of an image category and an object category; that is, the annotation information may be an image category, for example, landscape, beach, snow, blue sky, etc.; it may be an object category, for example, portrait, portrait + baby, portrait + cat, etc.; or it may include both an image category and an object category, for example, portrait + landscape, portrait + sunset, portrait + spotlight, etc.
The manually filtered image data including the first preset number is stored in a preset storage area of the mobile terminal or a server to form the first data set, with each image carrying annotation information. The mobile terminal can then acquire and call the stored first data set according to the learning task.
操作104,在所述第一数据集上训练分类模型,并评估所述分类模型的精度信息;
标注信息与分类模型的训练任务相关联,其标注信息的准确性影响着分类模型的精度。分类模型训练需要同时输入携带标注信息的第一数据集,根据学习任务来训练分类模型。
具体地,该分类模型可以为神经网络,该神经网络至少包含一个输入层、n个中间层和两个输出层,其中,将第i个中间层配置为图像特征提取层,该第j个中间层级联到该神经网络的第一支路,将该第k个中间层级联到该神经网络的第二支路,其中,i小于j,j小于k;i、j、k、n均为正整数,且i、j、k均小于n;一个输出层位于该第一支路,一个输出层位于该第二支路。该神经网络的第一支路的第一输出可以在用该神经网络进行图像检测时输出第一置信度,该第一置信度表示采用该神经网络检测出的背景图像所属指定图像类别的置信度。该神经网络的第二支路的第二输出可以在用该神经网络进行图像检测时输出每种预选的默认边界框相对于指定对象所对应的真实边界框的偏移量参数和所属指定对象类别的第二置信度。
在统计学中,一个概率样本的置信区间是对这个样本的某个总体参数的区间估计。置信区间展现的是这个参数的真实值有一定概率落在测量结果的周围的程度。置信度是被测量参数的测量值的可信程度。
移动终端可以同时将携带标注信息的第一数据集输入至神经网络的输入层,进而对该神经网络进行训练。
具体地,可以将第一数据集的图像数据按照预设比例分为训练集和测试集,将训练集的图像数据和标注信息输入至神经网络的输入层,对该神经网络进行训练,进而调整神经网络的参数。将测试集的图像数据和标注信息同时输入至调整参数后的神经网络,对该神经网络进行价值评估,以获取训练后的神经网络的精度信息,也即,获取训练后的神经网络对第一数据集中测试集的测试识别率。其中,精度信息包括第一置信度和第二置信度。
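按预设比例划分训练集、测试集并评估精度信息(测试识别率)的过程可以用如下草图示意(比例 9:1 与 model 的可调用接口均为假设,仅作说明):

```python
import random

def split_dataset(dataset, train_ratio=0.9, seed=0):
    """按预设比例(默认 9:1)将第一数据集划分为训练集和测试集。"""
    items = list(dataset)
    random.Random(seed).shuffle(items)  # 固定随机种子,便于复现划分结果
    cut = int(len(items) * train_ratio)
    return items[:cut], items[cut:]

def evaluate_accuracy(model, test_set):
    """精度信息示意:模型在测试集上的测试识别率。

    model 为可调用对象,输入数据、返回预测类别(假设的接口);
    test_set 为 (数据, 标注信息) 的列表。
    """
    if not test_set:
        return 0.0
    correct = sum(1 for data, label in test_set if model(data) == label)
    return correct / len(test_set)
```

识别率越高,精度信息越高,训练后的模型性能也就越好。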
操作106,当所述精度信息达到预设值时,则基于训练后的所述分类模型筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集。
由于第一数据集内的图像数据的数量较少,而为了使分类模型的性能达到最优,则需要上万到几十万个图片数据,若全部靠人力收集数据以及对该数据进行标注,耗时长、效率低且成本高。当分类模型对测试集的数据的测试精度达到预设值时,可以表示训练后的分类模型的性能较好,可以用于对图像数据进行分类筛选。基于训练后的分类模型可以对网络获取的大量未标注的图像数据进行识别、筛选、标注。同时,将训练后的分类模型识别出的图像数据进行标注,并合并至第一数据集中,以形成第二数据集。其中,通过分类模型识别出的图像数据中,每种图像类别和每种对象类别的图像数据的数量均在预设范围内,可以相同,也可以不同。同时,每类图像类别和每种对象类别的图像数据的总和大于目标数据集的目标数量,也即第二数据集的图像数据的数量大于目标数据集的图像数据的目标数量。
通过训练后的分类模型可以对网络获取的大量的未标注的图像数据进行筛选、分类、标注,可以避免耗费大量的人力去筛选图像数据,并对其进行分类处理,大大提高了获取符合学习任务的数据集的效率。
操作108,基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集。
基于训练后的分类模型自动对第二数据集的图像数据进行筛选、分类,获取每种数据的分类信息。可以从筛选结果中随机挑选若干图像数据进行人工验证,判断基于分类模型的分类信息是否正确;若不正确,则查验该图像数据的标注信息是否正确,若不正确,将其纠正以实现对第二数据集的数据清洗。可选的,数据清洗还可以理解为删除第二数据集中的无关数据、重复数据,平滑噪声数据,筛选掉与学习任务无关的数据,处理缺失值、异常值。
通过数据清洗,可以过滤掉第二数据集中与学习任务无关的数据,使第二数据集中保留的数据符合预设要求,即保留的数据均是与训练模型高度相关联的数据;同时使第二数据集中保留的数据数量达到目标数量,继而可以根据第二数据集中保留的数据形成目标数据集。其中,目标数据集中,每种图像类别和每种对象类别的图像数据的质量和数量都可以达到预设要求,例如,每种图像类别和每种对象类别的图像数据的数量范围在5000-10000张之间,这样,由每种图像类别和每种对象类别的图像数据构成的目标数据集的数量可达到几万、十几万张。
上述数据集的构建方法,根据学习任务获取具有第一预设数量且携带标注信息的第一数据集;在所述第一数据集上训练分类模型,并评估所述分类模型的精度信息;当所述精度信息达到预设值时,则基于训练后的所述分类模型分类筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集;基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集;可以实现半自动化的数据采集和筛选标注,可在花费较小人力的基础上获取大量高质量的训练分类模型的数据,大大节约了人力成本,同时提高了构建数据集的效率。
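上述操作102至操作108的整体流程可以概括为如下草图(其中 train_fn、screen_fn、clean_fn 均为假设的接口,预设值 0.9 与轮数上限仅作示例,并非本申请限定的实现):

```python
def build_target_dataset(first_dataset, train_fn, screen_fn, clean_fn,
                         target_count, preset_value=0.9, max_rounds=5):
    """半自动化构建目标数据集的流程草图。

    train_fn(dataset) -> (model, 精度信息)
    screen_fn(model)  -> 由模型筛选并自动标注的新数据列表
    clean_fn(model, dataset) -> 分类、清洗后的数据列表
    """
    dataset = list(first_dataset)
    model, accuracy = train_fn(dataset)
    assert accuracy >= preset_value, "精度未达预设值时应先补充人工标注数据(见图3)"
    for _ in range(max_rounds):
        dataset = dataset + screen_fn(model)  # 合并筛选结果,形成第二数据集
        dataset = clean_fn(model, dataset)    # 基于分类模型分类、清洗
        if len(dataset) >= target_count:
            return dataset[:target_count]     # 形成具有目标数量的目标数据集
        model, accuracy = train_fn(dataset)   # 数量不足时再次训练并继续筛选
    return dataset
```

该草图只表达控制流:筛选、清洗两步会反复进行,直至清洗后的数据数量达到目标数量。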
图3为另一个实施例中数据集的构建方法的流程图。如图3所示,一种数据集的构建方法,包括操作302至操作314。其中:
操作302,根据学习任务获取具有第一预设数量且携带标注信息的第一数据集;
操作304,在所述第一数据集上训练分类模型,并评估所述分类模型的精度信息;
上述操作302-操作304与前述实施例中操作102-操作104一一对应,在此,不再赘述。
操作306,当所述精度信息未达到预设值时,则获取具有第二预设数量且携带标注信息的新数据。
当在第一数据集上训练的分类模型的精度信息未达到预设值时,则需要注入新的数据继续对该分类模型进行训练,使其训练后的分类模型的精度信息达到预设值。具体地,可以再次获取携带标注信息的新数据,再次获取的新数据的数量之和为第二预设数量。该新数据与第一数据集中的数据的属性相同,也即,图像类别相同、对象类别相同。例如,可以基于人工继续分类筛选新数据,每种图像类别和每种对象类别的数据再次筛选出若干(如,各种类别的数据均增加1000张),并对筛选的数据进行标注,使筛选的新数据也携带标注信息。
操作308,将所述新数据合并至所述第一数据集中,形成第三数据集。
将获取的新数据合并至第一数据集中,以形成第三数据集,也即,形成的第三数据集中的图像数据均为人工分类筛选的数据,且每种数据均携带标注信息。
操作310,在所述第三数据集上再次训练所述分类模型,直到所述分类模型的精度信息达到预设值。
在第三数据集上再次训练该分类模型,也即,可以在操作304的训练结果的基础上,利用第三数据集中新增的新数据再次训练该分类模型,以优化该分类模型中的各个参数。进而基于第三数据集中的测试集数据获取训练后的分类模型的精度信息,精度信息也可以理解为该分类模型对数据集中数据的测试识别率。
将获取的精度信息与预设值进行比较,若达到预设值时,则执行操作312;若仍未达到预设值,则重复执行操作306-操作310,不断地向第一数据集中添加新数据,直到在新的第三数据集上训练后的分类模型的精度信息达到预设值。
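操作306至操作310的"补充新数据、再次训练、直到精度达标"的迭代过程可示意如下(fetch_new_labeled、train_fn 均为假设的接口,轮数上限仅为防止死循环的示例性保护):

```python
def train_until_preset(first_dataset, fetch_new_labeled, train_fn,
                       preset_value=0.9, max_rounds=20):
    """精度未达预设值时,不断合并新标注数据形成第三数据集并再次训练。

    fetch_new_labeled() 返回具有第二预设数量且携带标注信息的新数据;
    train_fn(dataset) 返回 (model, 精度信息)。
    """
    dataset = list(first_dataset)
    model, accuracy = train_fn(dataset)
    for _ in range(max_rounds):
        if accuracy >= preset_value:
            break                            # 精度达标,停止注入新数据
        dataset += fetch_new_labeled()       # 合并新数据,形成第三数据集
        model, accuracy = train_fn(dataset)  # 在第三数据集上再次训练
    return model, accuracy, dataset
```

该草图中以精度随数据量单调上升为前提;实际训练中精度并不保证单调,因此设置了轮数上限。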
操作312,当所述精度信息达到预设值时,则基于训练后的所述分类模型筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集;
操作314,基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集。
上述操作312-操作314与前述实施例中操作106-操作108一一对应,在此,不再赘述。
本实施例中的数据集的构建方法,可以不断地向第一数据集添加新数据,使形成的第三数据集的数据数量有所增加,进而,可以在第三数据集上再次训练该分类模型,可以优化分类模型中的各个参数,提高分类模型的测试识别率,也即提高分类模型的性能。同时,可以基于训练后的分类模型来分类筛选更多的未标注的网络信息,提高分类筛选的准确性。
图4为一个实施例中根据学习任务获取具有第一预设数量且携带标注信息的第一数据集的流程图。如图4所示,根据学习任务获取具有第一预设数量且携带标注信息的第一数据集,包括操作402至操作406。其中:
操作402,根据所述学习任务定义待获取数据的图像类别和对象类别。
学习任务可以理解为分类模型的终极识别目标,也即,训练分类模型的目的。在本实施例中,可以根据学习任务定义待获取数据的图像类别和对象类别。其中,图像类别为图像数据中背景区域的训练目标,例如,风景、海滩、雪景、蓝天、绿地、夜景、黑暗、背光、日出/日落、室内、烟火、聚光灯等。对象类别为图像数据中前景区域的训练目标,例如,人像、婴儿、猫、狗、美食等。另外,背景训练目标和前景训练目标还可为文本文档、微距等。
操作404,根据所述图像类别和对象类别获取数据。
根据定义的图像类别和对象类别获取大量的图像数据。具体地,可以根据图像类别和对象类别的关键词,利用网络爬虫技术,在各个搜索引擎上搜索各个图像类别和对象类别的图像数据,并完成相应的下载。
可选的,还可以查找并下载可使用的开源数据集,例如:MNIST,手写数字识别,深度学习入门级数据集;MS-COCO,可用于图像分割,边缘检测,关键点检测及图像捕获;ImageNet,最有名的图像数据集之一,比较常用的模型如VGG、Inception、Resnet都是基于它训练的;Open Image Dataset,一个包含近900万个图像URL的数据集,这些图像以数千个类别标签及边框进行了注释。可以基于各开源数据集获取与学习任务相关联的图像数据。
另外,可以根据学习任务下载不同的开源数据集,开源数据集还可以为自然语言处理类、语音类、Analytics Vidhya实践问题等。
可选的,还可以同时利用网络爬虫技术和下载的开源数据集来获取与学习任务相关联的图像数据,这样可以提高获取数据的效率。其中,获取的图像数据中,每种图像类别的图像数据的数量与每种对象类别的图像数据的数量都比较均衡,各个类别的图像数据的数量在预设范围内,该预设范围可以设为2000-2500之间,或其他范围内,在此不做进一步的限定。这样可以保证每种类别的图像数据经过分类模型训练后的综合质量,避免第一数据集中某个类别的图像数据相对较多或较少,影响自身类别或其他类别的训练效果。
可选的,还可以对获取的数据进行数据清洗,以删除原始数据中的无关数据、重复数据,平滑噪声数据,例如删掉与学习任务无关的数据,处理缺失值、异常值,以获取高质量的数据。
操作406,基于人工标注方式对获取的数据进行标注,以获取具有第一预设数量且携带标注信息的第一数据集。
可以对利用网络爬虫技术和/或开源数据集获取的大量图像数据进行标注,也可以理解为对获取的数据进行标注,设定标签,使每种数据携带标注信息。其中,标注信息包括图像类别和/或对象类别。也即,若图像数据中,仅包括人像区域,则该图像数据的标注信息为人像;若图像数据中背景区域为海滩,则该图像数据的标注信息为海滩;若图像数据中,背景区域为日出,前景区域为人像,则该图像数据的标注信息为日出和人像。
在对图像数据进行标注的同时,还需要设定每类图像类别和每类对象类别的数量,使每类图像数据的数量保持在一个合适的范围内,例如,携带标注信息的每种类别的图像数据的数量可以保持在2000-2500张的范围内,这样可以保证每种类别的图像数据经过分类模型训练后的综合质量,避免第一数据集中某个类别的图像数据相对较多或较少,影响自身类别或其他类别的训练效果。
将携带标注信息的每类图像数据进行存储,以形成具有第一预设数量的第一数据集,其中第一预设数量为每类图像数据数量之和。
图5为一个实施例中在所述第一数据集上训练所述分类模型,并评估所述分类模型的精度信息的流程图。在一个实施例中,所述分类模型为神经网络,所述标注信息包括图像类别和对象类别。如图5所示,在所述第一数据集上训练所述分类模型,并评估所述分类模型的精度信息,包括操作502至操作508。其中:
操作502,将携带标注信息的第一数据集输入到神经网络,通过所述神经网络的基础网络层进行特征提取,将提取的图像特征输入到分类网络层和目标检测网络层,在所述分类网络层得到反映所述图像数据中背景图像所属指定图像类别的第一预测置信度与第一真实置信度之间的差异的第一损失函数,在所述目标检测网络层得到反映所述图像数据中前景目标所属指定对象类别的第二预测置信度与第二真实置信度之间的差异的第二损失函数。
具体地,可以将第一数据集的图像数据按照预设比例分为训练集和测试集,将训练集中的携带标注信息的图像数据输入到神经网络,得到反映该图像数据中背景区域各像素点的第一预测置信度与第一真实置信度之间的差异的第一损失函数,以及反映该图像数据中前景区域各像素点的第二预测置信度与第二真实置信度之间的差异的第二损失函数;该第一预测置信度为采用该神经网络预测出的该图像数据中背景区域某一像素点属于该背景训练目标的置信度,该第一真实置信度表示在该图像数据中预先标注的该像素点属于该背景训练目标的置信度;该第二预测置信度为采用该神经网络预测出的该图像数据中前景区域某一像素点属于该前景训练目标的置信度,该第二真实置信度表示在该图像数据中预先标注的该像素点属于该前景训练目标的置信度。
具体地,可以按照预设比例将第一数据集中的数据划分为训练集和测试集。例如,训练集中的图像数据的数量与测试集中的图像数据的数量的预设比例可以设为9:1,也即训练集的数据数量与测试集的数据数量比值为9:1。当然,可以根据实际需求来设置预设比例,在此,不做进一步的限定。
在神经网络训练过程中,可将训练集中的携带标注信息的图像数据输入到神经网络中,神经网络根据背景训练目标和前景训练目标进行特征提取,通过SIFT(Scale-invariant feature transform)特征、方向梯度直方图(Histogram of Oriented Gradient,HOG)特征等提取特征,再通过SSD(Single Shot MultiBox Detector)、VGG(Visual Geometry Group)、卷积神经网络(Convolutional Neural Network,CNN)等目标检测算法,对背景训练目标进行检测得到第一预测置信度,对前景训练目标进行检测得到第二预测置信度。第一预测置信度为采用该神经网络预测出的该图像数据中背景区域某一像素点属于该背景训练目标的置信度。第二预测置信度为采用该神经网络预测出的该图像数据中前景区域某一像素点属于该前景训练目标的置信度。
图像数据中可以预先标注背景训练目标和前景训练目标,得到第一真实置信度和第二真实置信度。该第一真实置信度表示在该图像数据中预先标注的该像素点属于该背景训练目标的置信度。第二真实置信度表示在该图像数据中预先标注的该像素点属于该前景训练目标的置信度。针对图像中的每种像素点,真实置信度可以表示为1(或正值)和0(或负值),分别用以表示该像素点属于训练目标和不属于训练目标。
求取第一预测置信度与第一真实置信度之间的差异得到第一损失函数,求取第二预测置信度与第二真实置信度之间的差异得到第二损失函数。第一损失函数和第二损失函数均可采用对数函数、双曲线函数、绝对值函数等。
针对图像数据中的每一个或者多个像素点,可以利用神经网络预测出一个针对训练目标的置信度。
操作504,将所述第一损失函数和第二损失函数进行加权求和得到目标损失函数。
首先给第一损失函数和第二损失函数分别配置对应的权重值,该权重值可根据识别场景进行调整。将第一损失函数乘以对应的第一权重值a,第二损失函数乘以对应的第二权重值b,再求取两个乘积之和得到目标损失函数。
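加权求和得到目标损失函数这一步可以简单示意为(权重 a、b 的取值仅为示例,实际可根据识别场景调整):

```python
def target_loss(loss1, loss2, a=0.6, b=0.4):
    """目标损失函数 = a * 第一损失函数 + b * 第二损失函数。

    loss1、loss2 分别为背景训练目标与前景训练目标对应的损失值;
    权重 a、b 为假设的示例值。
    """
    return a * loss1 + b * loss2
```

在图7所示的结构中,还会在此基础上再加权叠加一项位置损失函数。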
操作506,根据所述目标损失函数调整所述神经网络的参数。
具体地,神经网络的参数是指每层网络的权重值。利用目标损失函数调整神经网络的参数,使得第一损失函数和第二损失函数均最小化,也就是使得像素点的预测置信度与真实置信度之间的差异都最小,或者使得各个像素点的预测置信度与真实置信度之间的差异之和最小化,从而得到训练好的神经网络。目标损失函数调整神经网络的参数可通过反向传播算法逐级调整每层网络的参数。
操作508,基于第一数据集中的测试集对所述神经网络进行测试,获取所述神经网络的精度信息。
将测试集携带标注信息的图像数据输入至调整参数后的神经网络,对该神经网络进行价值评估,以获取训练后的神经网络的精度信息。该精度信息也可以理解为神经网络对测试集中各数据的测试识别率,其识别率越高,精度信息也就越高,其训练后的神经网络的性能也就越好。
本申请实施例中,通过对背景训练目标所对应的第一损失函数和前景训练目标所对应的第二损失函数的加权求和得到目标损失函数,根据目标损失函数调整神经网络的参数,使得训练的神经网络后续可以同时识别出图像类别和对象类别,获取更多的信息,且提高了识别效率。
图6为一个实施例中神经网络的架构示意图。如图6所示,神经网络的输入层接收携带标注信息的图像数据,通过基础网络(如CNN网络)进行特征提取,并将提取的图像特征输出给特征层,由该特征层进行背景训练目标的检测得到第一损失函数,以及进行前景训练目标的检测得到第二损失函数,将第一损失函数和第二损失函数进行加权求和得到目标损失函数。
图7为另一个实施例中神经网络的架构示意图。如图7所示,神经网络的输入层接收携带标注信息的图像数据,通过基础网络(如CNN网络)进行特征提取,并将提取的图像特征输出给特征层,由该特征层对背景训练目标进行类别检测得到第一损失函数,对前景训练目标根据图像特征进行类别检测得到第二损失函数,对前景训练目标根据前景区域进行位置检测得到位置损失函数,将第一损失函数、第二损失函数和位置损失函数进行加权求和得到目标损失函数。该神经网络可为卷积神经网络。卷积神经网络包括数据输入层、卷积计算层、激活层、池化层和全连接层。数据输入层用于对原始图像数据进行预处理。该预处理可包括去均值、归一化、降维和白化处理。去均值是指将输入数据各个维度都中心化为0,目的是将样本的中心拉回到坐标系原点上。归一化是将幅度归一化到同样的范围。白化是指对数据各个特征轴上的幅度归一化。卷积计算层用于局部关联和窗口滑动。卷积计算层中每种滤波器连接数据窗的权重是固定的,每种滤波器关注一个图像特征,如垂直边缘、水平边缘、颜色、纹理等,将这些滤波器合在一起得到整张图像的特征提取器集合。一个滤波器是一个权重矩阵。通过一个权重矩阵可与不同窗口内数据做卷积。激活层用于将卷积层输出结果做非线性映射。激活层采用的激活函数可为ReLU(The Rectified Linear Unit,修正线性单元)。池化层可夹在连续的卷积层中间,用于压缩数据和参数的量,减小过拟合。池化层可采用最大值法或平均值法对数据降维。全连接层位于卷积神经网络的尾部,两层之间所有神经元都有权重连接。卷积神经网络的一部分卷积层级联到第一置信度输出节点,一部分卷积层级联到第二置信度输出节点,一部分卷积层级联到位置输出节点,根据第一置信度输出节点可以检测到图像的背景分类,根据第二置信度输出节点可以检测到图像的前景目标的类别,根据位置输出节点可以检测到前景目标所对应的位置。
图8为一个实施例中基于分类模型分类筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集的流程图。在一个实施例中,基于训练后的所述分类模型分类筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集,包括操作802-操作806。
操作802,基于训练后的所述分类模型对未标注的数据进行分类以筛选出具有预设类别的数据。
第一数据集中的图像数据均为人工标注的数据,其数据的质量高,但是其数量较少,为了使分类模型的精度达到最优,则需要更多的训练数据,也即还需要向第一数据集中填充更多的数据。
当训练后的分类模型的精度信息达到预设值时,该分类模型基本上能够实现对大量未标注的数据的识别分类。基于在第一数据集上训练后的分类模型,可以对基于网络爬虫技术和开源数据集获取的大量数据进行分类筛选。通过筛选分类,可以筛选出具有预设类别的数据,该预设类别包括图像类别(风景、海滩、雪景、蓝天、绿地、夜景、黑暗、背光、日出/日落、室内、烟火、聚光灯等)、对象类别(人像、婴儿、猫、狗、美食等)和其他类别(文本文档、微距等)。根据训练后的分类模型,可以对大量未标注的数据进行分类,以识别出每种数据的类别信息,该类别信息也就是预设类别,而且该类别信息也可以理解为该数据的标注信息,基于该分类模型可以对数据进行自动标注,不需要人工一一标注,大大提高了筛选、分类及标注的效率。
进一步的,为了验证其训练后的分类模型对数据的自动标注的准确性,可以随机挑选若干个数据进行人工验证,并将自动标注错误的信息进行纠正,以提高携带标注信息的数据的质量。
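基于训练后的分类模型对未标注数据进行分类、自动标注并筛选,可以用如下草图示意(其中 model 的返回形式与按置信度筛选的阈值均为示例性假设,正文并未限定具体筛选条件):

```python
def screen_unlabeled(model, unlabeled, confidence_threshold=0.8):
    """对未标注数据逐条预测,保留置信度达到阈值的数据并自动打上标注。

    model(x) 返回 (预测的类别信息, 置信度),为假设的接口。
    """
    labeled = []
    for data in unlabeled:
        category, confidence = model(data)
        if confidence >= confidence_threshold:
            # 类别信息即该数据的标注信息,无需人工一一标注
            labeled.append({"data": data, "label": category})
    return labeled
```

置信度较低的数据被丢弃,可在随后的人工抽检中进一步验证自动标注的准确性。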
操作804,在筛选结果中获取包括第三预设数量的数据;其中,所述第三预设数量为每种所述预设类别的数据数量之和。
通过训练后的分类模型,可以自动识别数据的类别信息,并对其自动标注,同时筛选出各个类别的数据。在筛选结果中,可以根据预设需求量获取包括第三预设数量的数据。其中,第三预设数量为筛选出的各预设类别的数据数量之和。其中,各预设类别的数据数量均在一定的范围内,该范围可以为3000-3500,其范围可以根据目标数量来设定。其中,需要说明的是,第三预设数量与第一预设数量之和大于目标数量。
操作806,将所述第三预设数量的数据合并至所述第一数据集以形成第二数据集。
将由训练后的分类模型筛选出的数据合并至第一数据集以形成第二数据集,也即,第二数据集的数量为第一预设数量与第三预设数量之和,这样第二数据集中的数据数量和质量都显著提高,可以避免在构建数据集的过程中耗费大量的人力去筛选数据以及标注数据,节约了成本,提高了获取数据集的效率。
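将筛选出的数据按每类数量上限截取后合并至第一数据集、形成第二数据集的过程可示意如下(每类上限取值仅为示例,对应正文所述"各预设类别的数据数量均在一定的范围内"):

```python
def merge_screened(first_dataset, screened, per_class_limit=3500):
    """按类别截取筛选结果并与第一数据集合并,形成第二数据集的示意。

    第三预设数量即各预设类别入选数据数量之和(此处由截取结果决定)。
    """
    by_class = {}
    for item in screened:
        by_class.setdefault(item["label"], []).append(item)
    selected = []
    for items in by_class.values():
        selected.extend(items[:per_class_limit])  # 每类数量控制在预设范围内
    return list(first_dataset) + selected
```

合并后第二数据集的数据数量为第一预设数量与第三预设数量之和。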
本实施例中的数据集的构建方法,在构建目标数据集的过程中,可以基于第一数据集训练分类模型,继而通过训练后的分类模型来筛选分类未标注的大量数据,并对其自动标注,可以减少人工分类标注的数量,节约了标注成本,同时,提高了获取符合学习任务的数据集的效率和质量。
图9为一个实施例中基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集的流程图。在一个实施例中,所述基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集,包括操作902-操作910:
操作902,基于训练后的所述分类模型对所述第二数据集的数据进行分类以筛选出不符合预设要求的数据。
基于训练后的所述分类模型,可以理解为基于第一数据集训练后的分类模型,也可以理解为基于第二数据集训练后的分类模型。其中,第二数据集的数据数量大于第一数据集的数据数量,在本实施例中,可以基于第二数据集来再次训练分类模型。
基于第二数据集再次训练的分类模型,可以对第二数据集中的每种数据进行识别,继而获取每种数据的类别信息,该类别信息包括图像类别和对象类别。
预设要求可以为分类模型能够正确识别出该数据的类别信息,其中,正确的判断标准为,识别的类别信息与人工标注的标注信息一致。
随机挑选若干个数据,进而判断对于同一数据,由分类模型识别出的类别信息与人工标注的标注信息是否一致;若不一致,则判定该数据不符合预设要求,将其筛选出来。
可选的,若分类模型未能识别出某一数据的类别信息,则判定该数据不符合预设要求,将其筛选出来。
操作904,对所述不符合预设要求的数据进行清洗。
对不符合预设要求的数据进行清洗,例如删除第二数据集中与学习任务无关的数据、重复数据,平滑噪声数据等。同时,当由分类模型识别出的类别信息与人工标注的标注信息不一致时,则查验该数据的标注信息是否正确,若不正确,将其纠正,以实现对不符合预设要求的数据的清洗。
操作906,判断清洗后的数据数量是否达到目标数量。
第二数据集中,经过数据清洗,其数据数量可能会减少,为了确保清洗后的数据数量达到目标数量,需要对清洗处理后的数据数量进行统计,以判断清洗后的数据数量是否达到目标数量。
当清洗后的数据数量达到目标数量时,则执行操作908,根据清洗后的数据形成所述目标数据集。具体的,可以保留清洗后的所有的数据,以形成目标数据集,也可以从清洗后的数据集中随机选取具有目标数量的数据,以形成目标数据集。
当清洗后的数据数量未达到目标数量时,则执行操作910,再次基于训练后的所述分类模型分类筛选未标注的数据并形成新的第二数据集,并对所述新的第二数据集进行分类、清洗以形成具有目标数量的目标数据集。
当清洗后的数据数量未达到目标数量时,则可以再次基于训练后的分类模型筛选未标注的数据并将其合并至第二数据集,再对新的第二数据集进行分类、清洗,直到目标数据集的数据数量达到目标数量。当清洗后的数据数量未达到目标数量时,还可以获取具有第二预设数量且携带标注信息的新数据,并将该新数据合并至第二数据集中,并对所述新的第二数据集进行分类、清洗以形成具有目标数量的目标数据集。
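操作902至操作906中"剔除无法识别的数据、查验并纠正标注不一致的数据"的清洗判定可示意如下(model 返回类别信息或 None,human_labels 表示人工查验得到的正确标注,均为假设的接口):

```python
def clean_second_dataset(dataset, model, human_labels):
    """对第二数据集逐条清洗的示意。

    规则(与正文一致):模型无法识别的数据剔除;识别结果与标注不一致时,
    经人工查验若原标注有误则纠正,否则视为与学习任务无关的数据剔除。
    """
    cleaned = []
    for item in dataset:
        predicted = model(item["data"])
        if predicted is None:
            continue  # 无法识别类别信息:不符合预设要求,清洗掉
        if predicted != item["label"]:
            truth = human_labels.get(item["data"], item["label"])
            if truth == predicted:
                item = dict(item, label=predicted)  # 原标注有误:纠正
            else:
                continue  # 与学习任务无关或噪声数据:清洗掉
        cleaned.append(item)
    return cleaned
```

清洗后统计 cleaned 的数量,即可判断是否已达到目标数量。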
通过数据清洗,可以删掉第二数据集中与学习任务无关的数据,也可以对标注错误的数据进行纠正,使第二数据集中的数据都是高质量数据,也即与分类模型的训练是高度相关联的数据。同时第二数据集中的数据数量也能达到目标数量,使第二数据集的数据可以满足训练分类模型的数量要求和质量要求,为进一步训练分类模型奠定了基础,基于目标数据集可以训练分类模型以提升分类模型性能和精度。
在一个实施例中,数据集的构建方法还包括:在所述目标数据集上再次训练所述分类模型。
在目标数据集上再次训练该分类模型的方法可以参考上述实施例中操作502-操作508。根据操作502-操作508再次训练该分类模型时,仅输入至该分类模型的数据集不同,其他操作不变。
其输入的数据集为目标数据集,目标数据集中的图像数据的数量远多于第一数据集的图像数据数量。因此,基于目标数据集可以更好地训练该分类模型,可以优化该分类模型中的各个参数,使训练后的分类模型的精度达到理想状态,提高了分类模型的性能。
应该理解的是,虽然图1-5、图8-9的流程图中的各个操作按照箭头的指示依次显示,但是这些操作并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些操作的执行并没有严格的顺序限制,这些操作可以以其它的顺序执行。而且,图1-5、图8-9的至少一部分操作可以包括多个子操作或者多个阶段,这些子操作或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些子操作或者阶段的执行顺序也不必然是依次进行,而是可以与其它操作或者其它操作的子操作或者阶段的至少一部分轮流或者交替地执行。
图10为一个实施例中数据集的构建装置的结构框图。在一个实施例中,数据集的构建装置,包括:
数据集获取模块1010,用于根据学习任务获取具有第一预设数量且携带标注信息的第一数据集;
模型训练模块1020,用于在所述第一数据集上训练分类模型,并评估所述分类模型的精度信息;
数据集合并模块1030,用于当所述精度信息达到预设值时,则基于训练后的所述分类模型筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集;
数据集处理模块1040,用于基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集,其中,第二数据集的数据数量大于等于目标数据集的数据数量。
上述数据集的构建装置,能够根据学习任务获取具有第一预设数量且携带标注信息的第一数据集;在第一数据集上训练分类模型,并评估分类模型的精度信息;当精度信息达到预设值时,则基于训练后的分类模型分类筛选未标注的数据,将筛选出的数据合并至第一数据集以形成第二数据集;基于训练后的分类模型对第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集;可以实现半自动化的数据采集和筛选标注,可在花费较小人力的基础上获取大量高质量的训练分类模型的数据,大大节约了人力成本,同时提高了构建数据集的效率。
在一个实施例中,数据集的构建装置,还包括:
新数据获取模块,用于当所述精度信息未达到预设值时,则获取具有第二预设数量且携带标注信息的新数据;用于将所述新数据合并至所述第一数据集中,形成第三数据集;
模型训练模块,还用于在所述第三数据集上再次训练所述分类模型,直到所述分类模型的精度信息达到预设值。
本实施例中的数据集的构建装置,可以不断地向第一数据集添加新数据,使形成的第三数据集的数据数量有所增加,进而,可以在第三数据集上再次训练该分类模型,可以优化分类模型中的各个参数,提高分类模型的测试识别率,也即提高分类模型的性能。同时,可以基于训练后的分类模型来分类筛选更多的未标注的网络信息,提高分类筛选的准确性。
在一个实施例中,数据集获取模块,包括:
定义单元,用于根据所述学习任务定义待获取数据的图像类别和对象类别;
第一获取单元,用于根据所述图像类别和对象类别获取数据;
第二获取单元,用于基于人工标注方式对获取的数据进行标注,以获取具有第一预设数量且携带标注信息的第一数据集。
在一个实施例中,所述分类模型为神经网络,所述标注信息包括图像类别和对象类别;模型训练模块,包括:
输入单元,用于将携带标注信息的第一数据集输入到神经网络,通过所述神经网络的基础网络层进行特征提取,将提取的图像特征输入到分类网络层和目标检测网络层,在所述分类网络层得到反映所述图像数据中背景图像所属指定图像类别的第一预测置信度与第一真实置信度之间的差异的第一损失函数,在所述目标检测网络层得到反映所述图像数据中前景目标所属指定对象类别的第二预测置信度与第二真实置信度之间的差异的第二损失函数;
处理单元,用于将所述第一损失函数和第二损失函数进行加权求和得到目标损失函数;
调整单元,用于根据所述目标损失函数调整所述神经网络的参数;
评估单元,用于基于第一数据集中的测试集对所述神经网络进行测试,获取所述神经网络的精度信息。
本申请实施例中的数据集的构建方法,通过对背景训练目标所对应的第一损失函数和前景训练目标所对应的第二损失函数的加权求和得到目标损失函数,根据目标损失函数调整神经网络的参数,使得训练的神经网络后续可以同时识别出背景类别和前景目标,获取更多的信息,且提高了识别效率。
在一个实施例中,数据集合并模块,包括:
筛选单元,用于基于训练后的所述分类模型对未标注的数据进行分类以筛选出具有预设类别的数据;
标注单元,用于在筛选结果中获取包括第三预设数量的数据;其中,所述第三预设数量为每种所述预设类别的数据数量之和;
第三获取单元,用于将所述第三预设数量的数据合并至所述第一数据集以形成第二数据集。
本实施例中的数据集的构建装置,在构建目标数据集的过程中,可以基于第一数据集训练分类模型,继而通过训练后的分类模型来筛选分类未标注的大量数据,并对其自动标注,可以减少人工分类标注的数量,节约了标注成本,同时,提高了获取符合学习任务的数据集的效率和质量。
在一个实施例中,数据集处理模块,包括:
筛选单元,用于基于训练后的所述分类模型对所述第二数据集的数据进行分类以筛选出不符合预设要求的数据;
清洗单元,用于对所述不符合预设要求的数据进行清洗;
判断单元,用于判断清洗后的数据数量是否达到目标数量;若是,则根据清洗后的数据形成所述目标数据集;若否,则再次基于训练后的所述分类模型分类筛选未标注的数据并形成新的第二数据集,并对所述新的第二数据集进行分类、清洗以形成具有目标数量的目标数据集。
通过数据清洗,可以删掉第二数据集中与学习任务无关的数据,也可以对标注错误的数据进行纠正,使第二数据集中的数据都是高质量数据,也即与分类模型的训练是高度相关联的数据。同时第二数据集中的数据数量也能达到目标数量,使第二数据集的数据可以满足训练分类模型的数量要求和质量要求,为进一步训练分类模型奠定了基础,基于目标数据集可以训练分类模型以提升分类模型性能和精度。
上述数据集的构建装置中各个模块的划分仅用于举例说明,在其他实施例中,可将神经网络处理装置或图像处理装置按照需要划分为不同的模块,以完成上述数据集的构建装置的全部或部分功能。
本申请实施例还提供一种移动终端。该移动终端包括存储器及处理器,该存储器中储存有计算机程序,该计算机程序被该处理器执行时,使得该处理器执行上述数据集的构建方法的操作。
本申请实施例还提供一种计算机可读存储介质。一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现上述数据集的构建方法的操作。
图11为一个实施例中移动终端的内部结构示意图。如图11所示,该移动终端包括通过系统总线连接的处理器、存储器和网络接口。其中,该处理器用于提供计算和控制能力,支撑整个移动终端的运行。存储器用于存储数据、程序等,存储器上存储至少一个计算机程序,该计算机程序可被处理器执行,以实现本申请实施例中提供的数据集的构建方法。存储器可包括非易失性存储介质及内存储器。非易失性存储介质存储有操作系统和计算机程序。该计算机程序可被处理器所执行,以用于实现上述各个实施例所提供的数据集的构建方法。内存储器为非易失性存储介质中的操作系统和计算机程序提供高速缓存的运行环境。网络接口可以是以太网卡或无线网卡等,用于与外部的移动终端进行通信。该移动终端可以是手机、平板电脑或者个人数字助理或穿戴式设备等。
本申请实施例中提供的数据集的构建装置中的各个模块的实现可为计算机程序的形式。该计算机程序可在移动终端或服务器上运行。该计算机程序构成的程序模块可存储在移动终端或服务器的存储器上。该计算机程序被处理器执行时,实现本申请实施例中所描述方法的操作。
一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行数据集的构建方法。
本申请实施例还提供一种移动终端。上述移动终端中包括图像处理电路,图像处理电路可以利用硬件和/或软件组件实现,可包括定义ISP(Image Signal Processing,图像信号处理)管线的各种处理单元。图12为一个实施例中图像处理电路的示意图。如图12所示,为便于说明,仅示出与本申请实施例相关的图像处理技术的各个方面。
如图12所示,图像处理电路包括ISP处理器1240和控制逻辑器1250。成像设备1210捕捉的图像数据首先由ISP处理器1240处理,ISP处理器1240对图像数据进行分析以捕捉可用于确定成像设备1210的一个或多个控制参数的图像统计信息。成像设备1210可包括具有一个或多个透镜1212和图像传感器1214的照相机。图像传感器1214可包括色彩滤镜阵列(如Bayer滤镜),图像传感器1214可获取用图像传感器1214的每个成像像素捕捉的光强度和波长信息,并提供可由ISP处理器1240处理的一组原始图像数据。传感器1220(如陀螺仪)可基于传感器1220接口类型把采集的图像处理的参数(如防抖参数)提供给ISP处理器1240。传感器1220接口可以利用SMIA(Standard Mobile Imaging Architecture,标准移动成像架构)接口、其它串行或并行照相机接口或上述接口的组合。
此外,图像传感器1214也可将原始图像数据发送给传感器1220,传感器1220可基于传感器1220接口类型把原始图像数据提供给ISP处理器1240,或者传感器1220将原始图像数据存储到图像存储器1230中。
ISP处理器1240按多种格式逐个像素地处理原始图像数据。例如,每个图像像素可具有8、10、12或14比特的位深度,ISP处理器1240可对原始图像数据进行一个或多个图像处理操作、收集关于图像数据的统计信息。其中,图像处理操作可按相同或不同的位深度精度进行。
ISP处理器1240还可从图像存储器1230接收图像数据。例如,传感器1220接口将原始图像数据发送给图像存储器1230,图像存储器1230中的原始图像数据再提供给ISP处理器1240以供处理。图像存储器1230可为存储器装置的一部分、存储设备、或移动终端内的独立的专用存储器,并可包括DMA(Direct Memory Access,直接存储器存取)特征。
当接收到来自图像传感器1214接口或来自传感器1220接口或来自图像存储器1230的原始图像数据时,ISP处理器1240可进行一个或多个图像处理操作,如时域滤波。处理后的图像数据可发送给图像存储器1230,以便在被显示之前进行另外的处理。ISP处理器1240从图像存储器1230接收处理数据,并对所述处理数据进行原始域中以及RGB和YCbCr颜色空间中的图像数据处理。ISP处理器1240处理后的图像数据可输出给显示器1270,以供用户观看和/或由图形引擎或GPU(Graphics Processing Unit,图形处理器)进一步处理。此外,ISP处理器1240的输出还可发送给图像存储器1230,且显示器1270可从图像存储器1230读取图像数据。在一个实施例中,图像存储器1230可被配置为实现一个或多个帧缓冲器。此外,ISP处理器1240的输出可发送给编码器/解码器1260,以便编码/解码图像数据。编码的图像数据可被保存,并在显示于显示器1270设备上之前解压缩。编码器/解码器1260可由CPU或GPU或协处理器实现。
ISP处理器1240确定的统计数据可发送给控制逻辑器1250单元。例如,统计数据可包括自动曝光、自动白平衡、自动聚焦、闪烁检测、黑电平补偿、透镜1212阴影校正等图像传感器1214统计信息。控制逻辑器1250可包括执行一个或多个例程(如固件)的处理器和/或微控制器,一个或多个例程可根据接收的统计数据,确定成像设备1210的控制参数及ISP处理器1240的控制参数。例如,成像设备1210的控制参数可包括传感器1220控制参数(例如增益、曝光控制的积分时间、防抖参数等)、照相机闪光控制参数、透镜1212控制参数(例如聚焦或变焦用焦距)、或这些参数的组合。ISP控制参数可包括用于自动白平衡和颜色调整(例如,在RGB处理期间)的增益水平和色彩校正矩阵,以及透镜1212阴影校正参数。
以下为运用图12中的图像处理技术实现上述数据集的构建方法的操作。
本申请所使用的对存储器、存储、数据库或其它介质的任何引用可包括非易失性和/或易失性存储器。合适的非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM),它用作外部高速缓冲存储器。作为说明而非局限,RAM以多种形式可得,诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDR SDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对本申请专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (20)

  1. 一种数据集的构建方法,包括:
    根据学习任务获取具有第一预设数量且携带标注信息的第一数据集;
    在所述第一数据集上训练分类模型,并评估所述分类模型的精度信息;
    当所述精度信息达到预设值时,则基于训练后的所述分类模型筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集;
    基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集,其中,第二数据集的数据数量大于等于目标数据集的数据数量。
  2. 根据权利要求1所述的方法,其特征在于,还包括:
    当所述精度信息未达到预设值时,则获取具有第二预设数量且携带标注信息的新数据;
    将所述新数据合并至所述第一数据集中,形成第三数据集;
    在所述第三数据集上再次训练所述分类模型,直到所述分类模型的精度信息达到预设值。
  3. 根据权利要求1所述的方法,其特征在于,所述根据学习任务获取具有第一预设数量且携带标注信息的第一数据集,包括:
    根据所述学习任务定义待获取数据的图像类别和对象类别;
    根据所述图像类别和对象类别获取数据;
    基于人工标注方式对获取的数据进行标注,以获取具有第一预设数量且携带标注信息的第一数据集。
  4. 根据权利要求1所述的方法,其特征在于,所述分类模型为神经网络,所述标注信息包括图像类别和对象类别;
    所述在所述第一数据集上训练所述分类模型,并评估所述分类模型的精度信息,包括:
    将携带标注信息的第一数据集输入到神经网络,通过所述神经网络的基础网络层进行特征提取,将提取的图像特征输入到分类网络层和目标检测网络层,在所述分类网络层得到反映所述数据中背景图像所属指定图像类别的第一预测置信度与第一真实置信度之间的差异的第一损失函数,在所述目标检测网络层得到反映所述数据中前景目标所属指定对象类别的第二预测置信度与第二真实置信度之间的差异的第二损失函数;
    将所述第一损失函数和第二损失函数进行加权求和得到目标损失函数;
    根据所述目标损失函数调整所述神经网络的参数;
    基于第一数据集中的测试集对所述神经网络进行测试,获取所述神经网络的精度信息。
  5. 根据权利要求1所述的方法,其特征在于,基于训练后的所述分类模型分类筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集,包括:
    基于训练后的所述分类模型对未标注的数据进行分类以筛选出具有预设类别的数据;
    在筛选结果中获取包括第三预设数量的数据;其中,所述第三预设数量为每种所述预设类别的数据数量之和;
    将所述第三预设数量的数据合并至所述第一数据集以形成第二数据集。
  6. 根据权利要求1所述的方法,其特征在于,所述基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集,包括:
    基于训练后的所述分类模型对所述第二数据集的数据进行分类以筛选出不符合预设要求的数据;
    对所述不符合预设要求的数据进行清洗;
    判断清洗后的数据数量是否达到目标数量;
    若是,则根据清洗后的数据形成所述目标数据集;
    若否,则再次基于训练后的所述分类模型分类筛选未标注的数据并形成新的第二数据集,并对所述新的第二数据集进行分类、清洗以形成具有目标数量的目标数据集。
  7. 根据权利要求1至6任一项所述的方法,其特征在于,还包括:
    在所述目标数据集上再次训练所述分类模型。
  8. 一种移动终端,包括存储器及处理器,所述存储器中储存有计算机程序,所述计算机程序被所述处理器执行时,使得所述处理器执行如下操作:
    根据学习任务获取具有第一预设数量且携带标注信息的第一数据集;
    在所述第一数据集上训练分类模型,并评估所述分类模型的精度信息;
    当所述精度信息达到预设值时,则基于训练后的所述分类模型筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集;
    基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集,其中,第二数据集的数据数量大于等于目标数据集的数据数量。
  9. 根据权利要求8所述的移动终端,其特征在于,所述处理器还执行如下操作:
    当所述精度信息未达到预设值时,则获取具有第二预设数量且携带标注信息的新数据;
    将所述新数据合并至所述第一数据集中,形成第三数据集;
    在所述第三数据集上再次训练所述分类模型,直到所述分类模型的精度信息达到预设值。
  10. 根据权利要求8所述的移动终端,其特征在于,所述处理器执行所述根据学习任务获取具有第一预设数量且携带标注信息的第一数据集操作时,还执行如下操作:
    根据所述学习任务定义待获取数据的图像类别和对象类别;
    根据所述图像类别和对象类别获取数据;
    基于人工标注方式对获取的数据进行标注,以获取具有第一预设数量且携带标注信息的第一数据集。
  11. 根据权利要求8所述的移动终端,其特征在于,所述分类模型为神经网络,所述标注信息包括图像类别和对象类别;
    所述处理器执行所述在所述第一数据集上训练所述分类模型,并评估所述分类模型的精度信息时,还执行如下操作:
    将携带标注信息的第一数据集输入到神经网络,通过所述神经网络的基础网络层进行特征提取,将提取的图像特征输入到分类网络层和目标检测网络层,在所述分类网络层得到反映所述数据中背景图像所属指定图像类别的第一预测置信度与第一真实置信度之间的差异的第一损失函数,在所述目标检测网络层得到反映所述数据中前景目标所属指定对象类别的第二预测置信度与第二真实置信度之间的差异的第二损失函数;
    将所述第一损失函数和第二损失函数进行加权求和得到目标损失函数;
    根据所述目标损失函数调整所述神经网络的参数;
    基于第一数据集中的测试集对所述神经网络进行测试,获取所述神经网络的精度信息。
  12. 根据权利要求8所述的移动终端,其特征在于,所述处理器执行所述基于训练后的所述分类模型分类筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集时,还执行如下操作:
    基于训练后的所述分类模型对未标注的数据进行分类以筛选出具有预设类别的数据;
    在筛选结果中获取包括第三预设数量的数据;其中,所述第三预设数量为每种所述预设类别的数据数量之和;
    将所述第三预设数量的数据合并至所述第一数据集以形成第二数据集。
  13. 根据权利要求8所述的移动终端,其特征在于,所述处理器执行所述基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集时,还执行如下操作:
    基于训练后的所述分类模型对所述第二数据集的数据进行分类以筛选出不符合预设要求的数据;
    对所述不符合预设要求的数据进行清洗;
    判断清洗后的数据数量是否达到目标数量;
    若是,则根据清洗后的数据形成所述目标数据集;
    若否,则再次基于训练后的所述分类模型分类筛选未标注的数据并形成新的第二数据集,并对所述新的第二数据集进行分类、清洗以形成具有目标数量的目标数据集。
  14. 根据权利要求8至13任一项所述的移动终端,其特征在于,所述处理器还执行如下操作:
    在所述目标数据集上再次训练所述分类模型。
  15. 一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现如下操作:
    根据学习任务获取具有第一预设数量且携带标注信息的第一数据集;
    在所述第一数据集上训练分类模型,并评估所述分类模型的精度信息;
    当所述精度信息达到预设值时,则基于训练后的所述分类模型筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集;
    基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集,其中,第二数据集的数据数量大于等于目标数据集的数据数量。
  16. 根据权利要求15所述的计算机可读存储介质,其特征在于,所述计算机程序被处理器执行时还执行如下操作:
    当所述精度信息未达到预设值时,则获取具有第二预设数量且携带标注信息的新数据;
    将所述新数据合并至所述第一数据集中,形成第三数据集;
    在所述第三数据集上再次训练所述分类模型,直到所述分类模型的精度信息达到预设值。
  17. 根据权利要求15所述的计算机可读存储介质,其特征在于,所述计算机程序被处理器执行所述根据学习任务获取具有第一预设数量且携带标注信息的第一数据集操作时,还执行如下操作:
    根据所述学习任务定义待获取数据的图像类别和对象类别;
    根据所述图像类别和对象类别获取数据;
    基于人工标注方式对获取的数据进行标注,以获取具有第一预设数量且携带标注信息的第一数据集。
  18. 根据权利要求15所述的计算机可读存储介质,其特征在于,所述分类模型为神经网络,所述标注信息包括图像类别和对象类别;
    所述计算机程序被处理器执行所述在所述第一数据集上训练所述分类模型,并评估所述分类模型的精度信息时,还执行如下操作:
    将携带标注信息的第一数据集输入到神经网络,通过所述神经网络的基础网络层进行特征提取,将提取的图像特征输入到分类网络层和目标检测网络层,在所述分类网络层得到反映所述数据中背景图像所属指定图像类别的第一预测置信度与第一真实置信度之间的差异的第一损失函数,在所述目标检测网络层得到反映所述数据中前景目标所属指定对象类别的第二预测置信度与第二真实置信度之间的差异的第二损失函数;
    将所述第一损失函数和第二损失函数进行加权求和得到目标损失函数;
    根据所述目标损失函数调整所述神经网络的参数;
    基于第一数据集中的测试集对所述神经网络进行测试,获取所述神经网络的精度信息。
  19. 根据权利要求15所述的计算机可读存储介质,其特征在于,所述计算机程序被处理器执行所述基于训练后的所述分类模型分类筛选未标注的数据,将筛选出的数据合并至所述第一数据集以形成第二数据集时,还执行如下操作:
    基于训练后的所述分类模型对未标注的数据进行分类以筛选出具有预设类别的数据;
    在筛选结果中获取包括第三预设数量的数据;其中,所述第三预设数量为每种所述预设类别的数据数量之和;
    将所述第三预设数量的数据合并至所述第一数据集以形成第二数据集。
  20. 根据权利要求15所述的计算机可读存储介质,其特征在于,所述计算机程序被处理器执行所述基于训练后的所述分类模型对所述第二数据集的数据进行分类、清洗以形成具有目标数量的目标数据集时,还执行如下操作:
    基于训练后的所述分类模型对所述第二数据集的数据进行分类以筛选出不符合预设要求的数据;
    对所述不符合预设要求的数据进行清洗;
    判断清洗后的数据数量是否达到目标数量;
    若是,则根据清洗后的数据形成所述目标数据集;
    若否,则再次基于训练后的所述分类模型分类筛选未标注的数据并形成新的第二数据集,并对所述新的第二数据集进行分类、清洗以形成具有目标数量的目标数据集。
PCT/CN2019/088378 2018-06-08 2019-05-24 数据集的构建方法、移动终端、可读存储介质 WO2019233297A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810588652.XA CN108764372B (zh) 2018-06-08 2018-06-08 数据集的构建方法和装置、移动终端、可读存储介质
CN201810588652.X 2018-06-08

Publications (1)

Publication Number Publication Date
WO2019233297A1 true WO2019233297A1 (zh) 2019-12-12

Family

ID=63999571

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088378 WO2019233297A1 (zh) 2018-06-08 2019-05-24 数据集的构建方法、移动终端、可读存储介质

Country Status (2)

Country Link
CN (1) CN108764372B (zh)
WO (1) WO2019233297A1 (zh)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178447A (zh) * 2019-12-31 2020-05-19 北京市商汤科技开发有限公司 模型压缩方法、图像处理方法及相关装置
CN111260608A (zh) * 2020-01-08 2020-06-09 来康科技有限责任公司 一种基于深度学习的舌部区域检测方法及系统
CN111488989A (zh) * 2020-04-16 2020-08-04 济南浪潮高新科技投资发展有限公司 一种在手机端实现轻量级目标检测的方法及模型
CN111709966A (zh) * 2020-06-23 2020-09-25 上海鹰瞳医疗科技有限公司 眼底图像分割模型训练方法及设备
CN111753843A (zh) * 2020-06-28 2020-10-09 平安科技(深圳)有限公司 基于深度学习的分割效果评估方法、装置、设备及介质
CN111783891A (zh) * 2020-07-06 2020-10-16 中国人民武装警察部队工程大学 一种定制化物体检测方法
CN111833372A (zh) * 2020-07-23 2020-10-27 浙江大华技术股份有限公司 一种前景目标提取方法及装置
CN112000808A (zh) * 2020-09-29 2020-11-27 迪爱斯信息技术股份有限公司 一种数据处理方法及装置、可读存储介质
CN112102331A (zh) * 2020-08-26 2020-12-18 广州金域医学检验中心有限公司 病理切片的训练图像集获取方法、系统、设备和介质
CN112182371A (zh) * 2020-09-22 2021-01-05 珠海中科先进技术研究院有限公司 健康管理产品组合及定价方法及介质
CN112200218A (zh) * 2020-09-10 2021-01-08 浙江大华技术股份有限公司 一种模型训练方法、装置及电子设备
CN112419270A (zh) * 2020-11-23 2021-02-26 深圳大学 元学习下的无参考图像质量评价方法、装置及计算机设备
CN112800037A (zh) * 2021-01-06 2021-05-14 银源工程咨询有限公司 工程造价数据处理的优化方法及装置
CN112819099A (zh) * 2021-02-26 2021-05-18 网易(杭州)网络有限公司 网络模型的训练方法、数据处理方法、装置、介质及设备
CN112906704A (zh) * 2021-03-09 2021-06-04 深圳海翼智新科技有限公司 用于跨域目标检测的方法和装置
CN112926621A (zh) * 2021-01-21 2021-06-08 百度在线网络技术(北京)有限公司 数据标注方法、装置、电子设备及存储介质
CN113010705A (zh) * 2021-02-03 2021-06-22 腾讯科技(深圳)有限公司 标签预测方法、装置、设备及存储介质
CN113128335A (zh) * 2021-03-09 2021-07-16 西北大学 微体古生物化石图像检测、分类及发现方法、系统及应用
CN113505800A (zh) * 2021-06-30 2021-10-15 深圳市慧鲤科技有限公司 图像处理方法及其模型的训练方法和装置、设备、介质
CN115333902A (zh) * 2021-05-10 2022-11-11 陕西尚品信息科技有限公司 通信信号调制识别方法及装置
CN115879248A (zh) * 2023-03-03 2023-03-31 山东亿宁环保科技有限公司 一种适用于真空泵的全生命周期管理方法和系统
CN116204769A (zh) * 2023-03-06 2023-06-02 深圳市乐易网络股份有限公司 一种基于数据分类识别的数据清洗方法、系统及存储介质
CN112926621B (zh) * 2021-01-21 2024-05-10 百度在线网络技术(北京)有限公司 数据标注方法、装置、电子设备及存储介质

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764372B (zh) * 2018-06-08 2019-07-16 Oppo广东移动通信有限公司 数据集的构建方法和装置、移动终端、可读存储介质
CN111259918B (zh) * 2018-11-30 2023-06-20 重庆小雨点小额贷款有限公司 一种意图标签的标注方法、装置、服务器及存储介质
CN109543772B (zh) * 2018-12-03 2020-08-25 北京锐安科技有限公司 数据集自动匹配方法、装置、设备和计算机可读存储介质
CN109740752B (zh) * 2018-12-29 2022-01-04 北京市商汤科技开发有限公司 深度模型训练方法及装置、电子设备及存储介质
CN111414922B (zh) * 2019-01-07 2022-11-15 阿里巴巴集团控股有限公司 特征提取方法、图像处理方法、模型训练方法及装置
CN109829483B (zh) * 2019-01-07 2021-05-18 鲁班嫡系机器人(深圳)有限公司 缺陷识别模型训练方法、装置、计算机设备和存储介质
CN109767448B (zh) * 2019-01-17 2021-06-01 上海长征医院 分割模型训练方法及装置
CN109977255A (zh) * 2019-02-22 2019-07-05 北京奇艺世纪科技有限公司 模型生成方法、音频处理方法、装置、终端及存储介质
CN110008372A (zh) * 2019-02-22 2019-07-12 北京奇艺世纪科技有限公司 模型生成方法、音频处理方法、装置、终端及存储介质
CN109978029B (zh) * 2019-03-13 2021-02-09 北京邮电大学 一种基于卷积神经网络的无效图像样本筛选方法
CN111797078A (zh) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 数据清洗方法、模型训练方法、装置、存储介质及设备
CN111797175B (zh) * 2019-04-09 2023-12-19 Oppo广东移动通信有限公司 数据存储方法、装置、存储介质及电子设备
CN111797288A (zh) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 数据筛选方法、装置、存储介质及电子设备
CN110751012B (zh) * 2019-05-23 2021-01-12 北京嘀嘀无限科技发展有限公司 目标检测评估方法、装置、电子设备及存储介质
CN110443141A (zh) * 2019-07-08 2019-11-12 深圳中兴网信科技有限公司 数据集处理方法、数据集处理装置及存储介质
CN110334772A (zh) * 2019-07-11 2019-10-15 山东领能电子科技有限公司 一种扩充类别式数据快速标注方法
CN110490237B (zh) * 2019-08-02 2022-05-17 Oppo广东移动通信有限公司 数据处理方法、装置、存储介质及电子设备
CN110569379A (zh) * 2019-08-05 2019-12-13 广州市巴图鲁信息科技有限公司 一种汽车配件图片数据集制作方法
CN110610169B (zh) * 2019-09-20 2023-12-15 腾讯科技(深圳)有限公司 图片标注方法和装置、存储介质及电子装置
CN112699908B (zh) * 2019-10-23 2022-08-05 武汉斗鱼鱼乐网络科技有限公司 标注图片的方法、电子终端、计算机可读存储介质及设备
CN112702751A (zh) * 2019-10-23 2021-04-23 中国移动通信有限公司研究院 无线通信模型的训练和升级方法、网络设备及存储介质
CN110865421B (zh) * 2019-11-18 2022-04-15 北京百度网讯科技有限公司 自动驾驶业务模型训练方法、检测方法、装置和电子设备
CN112825144A (zh) * 2019-11-20 2021-05-21 深圳云天励飞技术有限公司 一种图片的标注方法、装置、电子设备及存储介质
CN112884158A (zh) * 2019-11-29 2021-06-01 杭州海康威视数字技术股份有限公司 一种机器学习程序的训练方法、装置及设备
CN110889457B (zh) * 2019-12-03 2022-08-19 深圳奇迹智慧网络有限公司 样本图像分类训练方法、装置、计算机设备和存储介质
CN111143912B (zh) * 2019-12-11 2023-04-07 万翼科技有限公司 展示标注方法及相关产品
CN111177136B (zh) * 2019-12-27 2023-04-18 上海依图网络科技有限公司 标注数据清洗装置和方法
CN113191173A (zh) * 2020-01-14 2021-07-30 北京地平线机器人技术研发有限公司 一种训练数据获取方法及装置
CN113269215B (zh) * 2020-02-17 2023-08-01 百度在线网络技术(北京)有限公司 一种训练集的构建方法、装置、设备和存储介质
CN111339964A (zh) * 2020-02-28 2020-06-26 北京市商汤科技开发有限公司 图像处理方法及装置、电子设备和存储介质
CN111462069B (zh) * 2020-03-30 2023-09-01 北京金山云网络技术有限公司 目标对象检测模型训练方法、装置、电子设备及存储介质
CN111814833A (zh) * 2020-06-11 2020-10-23 浙江大华技术股份有限公司 票据处理模型的训练方法及图像处理方法、图像处理设备
CN111859953B (zh) * 2020-06-22 2023-08-22 北京百度网讯科技有限公司 训练数据的挖掘方法、装置、电子设备及存储介质
CN112182257A (zh) * 2020-08-26 2021-01-05 合肥三恩信息科技有限公司 一种基于神经网络的人工智能数据清洗方法
CN112183321A (zh) * 2020-09-27 2021-01-05 深圳奇迹智慧网络有限公司 机器学习模型优化的方法、装置、计算机设备和存储介质
CN112328822B (zh) * 2020-10-15 2024-04-02 深圳市优必选科技股份有限公司 图片预标注方法、装置及终端设备
CN112528109B (zh) * 2020-12-01 2023-10-27 科大讯飞(北京)有限公司 一种数据分类方法、装置、设备及存储介质
CN113221627B (zh) * 2021-03-08 2022-05-10 广州大学 一种人脸遗传特征分类数据集构建方法、系统、装置及介质
CN113344216A (zh) * 2021-06-17 2021-09-03 上海商汤科技开发有限公司 数据标注方法和平台
CN113269139B (zh) * 2021-06-18 2023-09-26 中电科大数据研究院有限公司 一种针对复杂场景的自学习大规模警员图像分类模型
CN113421176B (zh) * 2021-07-16 2022-11-01 昆明学院 一种学生成绩分数中异常数据智能筛选方法
CN114359676B (zh) * 2022-03-08 2022-07-19 人民中科(济南)智能技术有限公司 训练目标检测模型和构建样本集的方法、装置及存储介质
CN114689122B (zh) * 2022-03-31 2023-11-10 国网北京市电力公司 一种设备故障监测方法、装置、设备及介质
CN115440238B (zh) * 2022-08-16 2023-04-07 广西壮族自治区通信产业服务有限公司技术服务分公司 一种语音自动标注数据中的噪音筛选方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017055878A1 (en) * 2015-10-02 2017-04-06 Tractable Ltd. Semi-automatic labelling of datasets
CN107392125A (zh) * 2017-07-11 2017-11-24 中国科学院上海高等研究院 智能模型的训练方法/系统、计算机可读存储介质及终端
CN107704878A (zh) * 2017-10-09 2018-02-16 南京大学 一种基于深度学习的高光谱数据库半自动化建立方法
CN108764372A (zh) * 2018-06-08 2018-11-06 Oppo广东移动通信有限公司 数据集的构建方法和装置、移动终端、可读存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10074014B2 (en) * 2015-04-22 2018-09-11 Battelle Memorial Institute Feature identification or classification using task-specific metadata
CN106649610A (zh) * 2016-11-29 2017-05-10 北京智能管家科技有限公司 图片标注方法及装置
CN107247972A (zh) * 2017-06-29 2017-10-13 哈尔滨工程大学 一种基于众包技术的分类模型训练方法
CN107480696A (zh) * 2017-07-12 2017-12-15 深圳信息职业技术学院 一种分类模型构建方法、装置及终端设备

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178447A (zh) * 2019-12-31 2020-05-19 北京市商汤科技开发有限公司 模型压缩方法、图像处理方法及相关装置
CN111178447B (zh) * 2019-12-31 2024-03-08 北京市商汤科技开发有限公司 模型压缩方法、图像处理方法及相关装置
CN111260608A (zh) * 2020-01-08 2020-06-09 来康科技有限责任公司 一种基于深度学习的舌部区域检测方法及系统
CN111488989A (zh) * 2020-04-16 2020-08-04 济南浪潮高新科技投资发展有限公司 一种在手机端实现轻量级目标检测的方法及模型
CN111488989B (zh) * 2020-04-16 2024-03-29 山东浪潮科学研究院有限公司 一种在手机端实现轻量级目标检测的方法及模型
CN111709966B (zh) * 2020-06-23 2023-06-06 上海鹰瞳医疗科技有限公司 眼底图像分割模型训练方法及设备
CN111709966A (zh) * 2020-06-23 2020-09-25 上海鹰瞳医疗科技有限公司 眼底图像分割模型训练方法及设备
CN111753843A (zh) * 2020-06-28 2020-10-09 平安科技(深圳)有限公司 基于深度学习的分割效果评估方法、装置、设备及介质
CN111783891A (zh) * 2020-07-06 2020-10-16 中国人民武装警察部队工程大学 一种定制化物体检测方法
CN111783891B (zh) * 2020-07-06 2023-10-31 中国人民武装警察部队工程大学 一种定制化物体检测方法
CN111833372A (zh) * 2020-07-23 2020-10-27 浙江大华技术股份有限公司 一种前景目标提取方法及装置
CN112102331A (zh) * 2020-08-26 2020-12-18 广州金域医学检验中心有限公司 病理切片的训练图像集获取方法、系统、设备和介质
CN112102331B (zh) * 2020-08-26 2024-03-29 广州金域医学检验中心有限公司 病理切片的训练图像集获取方法、系统、设备和介质
CN112200218A (zh) * 2020-09-10 2021-01-08 浙江大华技术股份有限公司 一种模型训练方法、装置及电子设备
CN112182371A (zh) * 2020-09-22 2021-01-05 珠海中科先进技术研究院有限公司 健康管理产品组合及定价方法及介质
CN112182371B (zh) * 2020-09-22 2024-05-14 珠海中科先进技术研究院有限公司 健康管理产品组合及定价方法及介质
CN112000808B (zh) * 2020-09-29 2024-04-16 迪爱斯信息技术股份有限公司 一种数据处理方法及装置、可读存储介质
CN112000808A (zh) * 2020-09-29 2020-11-27 迪爱斯信息技术股份有限公司 一种数据处理方法及装置、可读存储介质
CN112419270B (zh) * 2020-11-23 2023-09-26 深圳大学 元学习下的无参考图像质量评价方法、装置及计算机设备
CN112419270A (zh) * 2020-11-23 2021-02-26 深圳大学 元学习下的无参考图像质量评价方法、装置及计算机设备
CN112800037B (zh) * 2021-01-06 2024-02-02 银源工程咨询有限公司 工程造价数据处理的优化方法及装置
CN112800037A (zh) * 2021-01-06 2021-05-14 银源工程咨询有限公司 工程造价数据处理的优化方法及装置
CN112926621A (zh) * 2021-01-21 2021-06-08 百度在线网络技术(北京)有限公司 数据标注方法、装置、电子设备及存储介质
CN112926621B (zh) * 2021-01-21 2024-05-10 百度在线网络技术(北京)有限公司 数据标注方法、装置、电子设备及存储介质
CN113010705B (zh) * 2021-02-03 2023-12-12 腾讯科技(深圳)有限公司 标签预测方法、装置、设备及存储介质
CN113010705A (zh) * 2021-02-03 2021-06-22 腾讯科技(深圳)有限公司 标签预测方法、装置、设备及存储介质
CN112819099B (zh) * 2021-02-26 2023-12-22 杭州网易智企科技有限公司 网络模型的训练方法、数据处理方法、装置、介质及设备
CN112819099A (zh) * 2021-02-26 2021-05-18 网易(杭州)网络有限公司 网络模型的训练方法、数据处理方法、装置、介质及设备
CN113128335A (zh) * 2021-03-09 2021-07-16 西北大学 微体古生物化石图像检测、分类及发现方法、系统及应用
CN112906704A (zh) * 2021-03-09 2021-06-04 深圳海翼智新科技有限公司 用于跨域目标检测的方法和装置
CN115333902A (zh) * 2021-05-10 2022-11-11 陕西尚品信息科技有限公司 通信信号调制识别方法及装置
CN113505800A (zh) * 2021-06-30 2021-10-15 深圳市慧鲤科技有限公司 图像处理方法及其模型的训练方法和装置、设备、介质
CN115879248A (zh) * 2023-03-03 2023-03-31 山东亿宁环保科技有限公司 一种适用于真空泵的全生命周期管理方法和系统
CN116204769B (zh) * 2023-03-06 2023-12-05 深圳市乐易网络股份有限公司 一种基于数据分类识别的数据清洗方法、系统及存储介质
CN116204769A (zh) * 2023-03-06 2023-06-02 深圳市乐易网络股份有限公司 一种基于数据分类识别的数据清洗方法、系统及存储介质

Also Published As

Publication number Publication date
CN108764372B (zh) 2019-07-16
CN108764372A (zh) 2018-11-06

Similar Documents

Publication Publication Date Title
WO2019233297A1 (zh) 数据集的构建方法、移动终端、可读存储介质
US11138478B2 (en) Method and apparatus for training, classification model, mobile terminal, and readable storage medium
CN108764370B (zh) 图像处理方法、装置、计算机可读存储介质和计算机设备
CN108764208B (zh) 图像处理方法和装置、存储介质、电子设备
WO2019233393A1 (zh) 图像处理方法和装置、存储介质、电子设备
US10896323B2 (en) Method and device for image processing, computer readable storage medium, and electronic device
WO2019233266A1 (zh) 图像处理方法、计算机可读存储介质和电子设备
WO2019233263A1 (zh) 视频处理方法、电子设备、计算机可读存储介质
EP3937481A1 (en) Image display method and device
CN108830855B (zh) 一种基于多尺度低层特征融合的全卷积网络语义分割方法
CN113065558A (zh) 一种结合注意力机制的轻量级小目标检测方法
CN108960245B (zh) 轮胎模具字符的检测与识别方法、装置、设备及存储介质
US20200412940A1 (en) Method and device for image processing, method for training object detection model
CN108921161B (zh) 模型训练方法、装置、电子设备和计算机可读存储介质
WO2019233262A1 (zh) 视频处理方法、电子设备、计算机可读存储介质
WO2019237887A1 (zh) 图像处理方法、电子设备、计算机可读存储介质
WO2020001196A1 (zh) 图像处理方法、电子设备、计算机可读存储介质
CN110580487A (zh) 神经网络的训练方法、构建方法、图像处理方法和装置
US10592764B2 (en) Reconstructing document from series of document images
WO2019233392A1 (zh) 图像处理方法、装置、电子设备和计算机可读存储介质
CN109063737A (zh) 图像处理方法、装置、存储介质及移动终端
CN109635634B (zh) 一种基于随机线性插值的行人再识别数据增强方法
CN108765033B (zh) 广告信息推送方法和装置、存储介质、电子设备
CN108647625A (zh) 一种表情识别方法及装置
CN107911625A (zh) 测光方法、装置、可读存储介质和计算机设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19815927

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19815927

Country of ref document: EP

Kind code of ref document: A1