CN110457476A - Method and apparatus for generating a classification model - Google Patents

Method and apparatus for generating a classification model Download PDF

Info

Publication number
CN110457476A
CN110457476A (application CN201910721353.3A)
Authority
CN
China
Prior art keywords
model
text
hyperparameter
training
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910721353.3A
Other languages
Chinese (zh)
Inventor
曲福
陈兴波
谢国斌
杜泓江
刘彦江
卢俊豪
冯博豪
秦文静
薛礼强
罗小兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910721353.3A
Publication of CN110457476A
Legal status: Pending

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure disclose a method and apparatus for generating a classification model. One specific embodiment of the method includes: obtaining a training sample set, where a training sample includes a sample text and a sample category corresponding to the sample text; determining statistical features of the training sample set, where the statistical features include a form feature characterizing text length; and generating a text classification model based on the statistical features and the training of an initial model, where the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models. This embodiment realizes the automatic generation of a text classification model based on cloud computing technology, without requiring the user to tune parameters manually.

Description

Method and apparatus for generating a classification model
Technical field
Embodiments of the present disclosure relate to the field of computer technology, and in particular to a method and apparatus for generating a classification model.
Background
With the rapid development of artificial intelligence (AI) and Internet technology, effectively classifying the rapidly growing mass of text information is an important prerequisite for subsequently finding content and extracting the value of that information.
There are usually two related approaches: one uses feature engineering techniques to extract text features and then classifies texts according to the degree of similarity between the extracted features; the other classifies texts using a trained automated text classification model.
Summary of the invention
Embodiments of the present disclosure propose a method and apparatus for generating a classification model.
In a first aspect, an embodiment of the present disclosure provides a method for generating a classification model, the method including: obtaining a training sample set, where a training sample includes a sample text and a sample category corresponding to the sample text; determining statistical features of the training sample set, where the statistical features include a form feature characterizing text length; and generating a text classification model based on the statistical features and the training of an initial model, where the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models.
In some embodiments, generating the text classification model based on the statistical features and the initial model includes: inputting the form feature into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature, where the hyperparameter group includes model hyperparameters and training hyperparameters; selecting, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and training the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
In some embodiments, generating the text classification model based on the statistical features and the initial model includes: obtaining a set of initial hyperparameter groups, where an initial hyperparameter group includes initial model hyperparameters and initial training hyperparameters; selecting an initial hyperparameter group from the set of initial hyperparameter groups and performing the following determination step: selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model; training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group; evaluating the generated candidate text classification model based on a validation text set to generate an evaluation result; in response to determining that the generated evaluation result satisfies a hyperparameter group determination condition, determining the text classification model from the candidate text classification models whose evaluation results satisfy the hyperparameter group determination condition; and, in response to determining that the generated evaluation result does not satisfy the hyperparameter group determination condition, updating the initial hyperparameter groups in the set of initial hyperparameter groups, selecting an initial hyperparameter group from the updated set of initial hyperparameter groups, and continuing to perform the determination step.
In some embodiments, the statistical features further include a content feature characterizing text content; and selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model includes: selecting, from the set of pre-trained models, a pre-trained model matching the content feature and the initial model hyperparameters in the selected initial hyperparameter group as the initial model, where a pre-trained model corresponds to a semantic label.
In some embodiments, training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate the candidate text classification model corresponding to the initial hyperparameter group includes: selecting a training sample from the training sample set and performing the following training step: inputting the sample text of the selected training sample into the selected initial model to generate a text category; determining a difference value according to the generated text category and the sample category corresponding to the input sample text; determining whether the difference value satisfies a training completion condition, where the difference value and the training completion condition are determined based on the initial training hyperparameters in the selected hyperparameter group; in response to determining that the training completion condition is satisfied, determining the selected initial model as the candidate text classification model corresponding to the selected hyperparameter group; and, in response to determining that the training completion condition is not satisfied, adjusting relevant parameters of the selected initial model, selecting a training sample from the training sample set, using the adjusted initial model as the selected initial model, and continuing to perform the training step.
In some embodiments, obtaining the training sample set includes: receiving an annotated text set sent by a user terminal, where an annotated text includes a text and text category annotation information corresponding to the text; and dividing the annotated text set to generate the training sample set and a validation text set, where a training sample includes the text as a sample text and the text category annotation information as a sample category corresponding to the sample text.
In some embodiments, the method further includes: receiving a set of texts to be classified sent by a user terminal; inputting the set of texts to be classified into the text classification model to generate category information corresponding to each text to be classified in the set, where the category information characterizes the category to which a text to be classified belongs and matches a sample category; and sending the generated category information together with the corresponding information of the texts to be classified to the user terminal, where the information of a text to be classified is used to identify that text in the set of texts to be classified.
In a second aspect, an embodiment of the present disclosure provides an apparatus for generating a classification model, the apparatus including: an obtaining unit configured to obtain a training sample set, where a training sample includes a sample text and a sample category corresponding to the sample text; a determining unit configured to determine statistical features of the training sample set, where the statistical features include a form feature characterizing text length; and a first generation unit configured to generate a text classification model based on the statistical features and the training of an initial model, where the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models.
In some embodiments, the first generation unit includes: a first generation module configured to input the form feature into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature, where the hyperparameter group includes model hyperparameters and training hyperparameters; a selection module configured to select, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and a second generation module configured to train the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
In some embodiments, the first generation unit includes: an obtaining module configured to obtain a set of initial hyperparameter groups, where an initial hyperparameter group includes initial model hyperparameters and initial training hyperparameters; a determining module configured to select an initial hyperparameter group from the set of initial hyperparameter groups and perform the following determination step: selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model; training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group; evaluating the generated candidate text classification model based on a validation text set to generate an evaluation result; and, in response to determining that the generated evaluation result satisfies a hyperparameter group determination condition, determining the text classification model from the candidate text classification models whose evaluation results satisfy the condition; and an update module configured to, in response to determining that the generated evaluation result does not satisfy the hyperparameter group determination condition, update the initial hyperparameter groups in the set of initial hyperparameter groups, select an initial hyperparameter group from the updated set of initial hyperparameter groups, and continue to perform the determination step.
In some embodiments, the statistical features further include a content feature characterizing text content; and the determining module is further configured to select, from the set of pre-trained models, a pre-trained model matching the content feature and the initial model hyperparameters in the selected initial hyperparameter group as the initial model, where a pre-trained model corresponds to a semantic label.
In some embodiments, the determining module further includes: a selection submodule configured to select a training sample from the training sample set and perform the following training step: inputting the sample text of the selected training sample into the selected initial model to generate a text category; determining a difference value according to the generated text category and the sample category corresponding to the input sample text; determining whether the difference value satisfies a training completion condition, where the difference value and the training completion condition are determined based on the initial training hyperparameters in the selected hyperparameter group; and, in response to determining that the training completion condition is satisfied, determining the selected initial model as the candidate text classification model corresponding to the selected hyperparameter group; and an adjustment submodule configured to, in response to determining that the training completion condition is not satisfied, adjust relevant parameters of the selected initial model, select a training sample from the training sample set, use the adjusted initial model as the selected initial model, and continue to perform the training step.
In some embodiments, the obtaining unit includes: a receiving module configured to receive an annotated text set sent by a user terminal, where an annotated text includes a text and text category annotation information corresponding to the text; and a third generation module configured to divide the annotated text set to generate the training sample set and a validation text set, where a training sample includes the text as a sample text and the text category annotation information as a sample category corresponding to the sample text.
In some embodiments, the apparatus further includes: a receiving unit configured to receive a set of texts to be classified sent by a user terminal; a second generation unit configured to input the set of texts to be classified into the text classification model to generate category information corresponding to each text to be classified in the set, where the category information characterizes the category to which a text to be classified belongs and matches a sample category; and a sending unit configured to send the generated category information together with the corresponding information of the texts to be classified to the user terminal, where the information of a text to be classified is used to identify that text in the set of texts to be classified.
In a third aspect, an embodiment of the present disclosure provides a server, including: one or more processors; and a storage device on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored. When the program is executed by a processor, the method described in any implementation of the first aspect is implemented.
According to the method and apparatus for generating a classification model provided by embodiments of the present disclosure, a training sample set is first obtained, where a training sample includes a sample text and a sample category corresponding to the sample text. Then, statistical features of the training sample set are determined, where the statistical features include a form feature characterizing text length. Afterwards, a text classification model is generated based on the statistical features and the training of an initial model, where the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models. A text classification model can thus be generated automatically without manual parameter tuning.
Brief description of the drawings
Other features, objects, and advantages of the present disclosure will become more apparent by reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure may be applied;
Fig. 2 is a flowchart of an embodiment of the method for generating a classification model according to the present disclosure;
Fig. 3 is a schematic diagram of an application scenario of the method for generating a classification model according to an embodiment of the present disclosure;
Fig. 4 is a flowchart of another embodiment of the method for generating a classification model according to the present disclosure;
Fig. 5 is a structural schematic diagram of an embodiment of the apparatus for generating a classification model according to the present disclosure;
Fig. 6 is a structural schematic diagram of an electronic device suitable for implementing an embodiment of the present disclosure.
Detailed description of the embodiments
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the relevant invention, not to limit it. It should also be noted that, for ease of description, only the parts relevant to the invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments in the present disclosure and the features in the embodiments may be combined with each other. The present disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary architecture 100 to which the method for generating a classification model or the apparatus for generating a classification model of the present disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The terminal devices 101, 102, 103 interact with the server 105 through the network 104 to receive or send messages. Various telecommunication client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, search applications, instant messaging tools, email clients, social platform software, text editing applications, and reading applications.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting text display, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server providing various services, for example, a background server providing support for text classification applications on the terminal devices 101, 102, 103. Optionally, the server 105 may also be a cloud server. The background server may train a text classification model from an obtained training sample set, may perform processing such as analysis on texts sent by the terminal devices, and may feed the processing results (for example, text category information) back to the terminal devices.
It should be noted that the training sample set may also be stored directly locally on the server 105, and the server 105 may directly extract the locally stored training sample set for model training; in this case, the terminal devices 101, 102, 103 and the network 104 may be absent.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the method for generating a classification model provided by embodiments of the present disclosure is generally performed by the server 105, and accordingly, the apparatus for generating a classification model is generally disposed in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of an embodiment of the method for generating a classification model according to the present disclosure is shown. The method for generating a classification model includes the following steps:
Step 201: obtain a training sample set.
In this embodiment, the executing body of the method for generating a classification model (for example, the server 105 shown in Fig. 1) may obtain the training sample set through a wired or wireless connection. A training sample may include a sample text and a sample category corresponding to the sample text. Specifically, the executing body may obtain a training sample set pre-stored locally, or may obtain a training sample set sent by a communicatively connected electronic device (for example, a terminal device shown in Fig. 1).
In practice, the training samples may be obtained in various ways. As an example, a technician may annotate each text in a text set with a category; the text is then stored in association with the annotated category, yielding a training sample. As another example, information resources on a portal website may be processed: an article in a web page may serve as a sample text, and the column to which the article belongs may serve as the sample category, together forming a training sample. A large number of training samples formed from a large amount of data in this way constitute the training sample set.
In some optional implementations of this embodiment, the executing body may also obtain the training sample set according to the following steps:
First, receive an annotated text set sent by a user terminal.
In these implementations, the executing body may receive an annotated text set uploaded by a user terminal (for example, a terminal device shown in Fig. 1). An annotated text may include a text and text category annotation information corresponding to the text.
Second, divide the annotated text set to generate the training sample set and a validation text set.
In these implementations, the executing body may divide the annotated text set received in the first step according to a certain ratio, thereby obtaining the training sample set and the validation text set. A training sample may include the text as a sample text and the text category annotation information as a sample category corresponding to the sample text. In general, the ratio may be preset, for example 8:2 or 7:3. Optionally, the ratio may also be set according to the user's choice.
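The division described above can be sketched as follows. This is a minimal illustration, assuming annotated texts are (text, category) pairs and an 8:2 ratio; the function name and shuffling step are not part of the patent.

```python
import random

def split_annotated_texts(annotated_texts, train_ratio=0.8, seed=42):
    """Divide an annotated text set into a training sample set and a
    validation text set according to a preset ratio (e.g. 8:2)."""
    items = list(annotated_texts)
    random.Random(seed).shuffle(items)  # shuffle so the split is unbiased
    cut = int(len(items) * train_ratio)
    return items[:cut], items[cut:]

annotated = [("text %d" % i, "category %d" % (i % 3)) for i in range(10)]
train_set, valid_set = split_annotated_texts(annotated, train_ratio=0.8)
print(len(train_set), len(valid_set))  # 8 2
```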
Step 202: determine statistical features of the training sample set.
In this embodiment, the executing body may determine the statistical features of the training sample set in various ways. The statistical features may include a form feature characterizing text length, and may include, but are not limited to, at least one of the following: the number of sample categories, the maximum text length, the minimum text length, and the average text length.
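The form features listed above can be computed directly from the sample set; a minimal sketch, assuming (sample_text, sample_category) pairs and character-level lengths (the patent does not fix the length unit):

```python
def form_features(training_samples):
    """Compute form features characterizing text length over a training
    sample set of (sample_text, sample_category) pairs."""
    lengths = [len(text) for text, _ in training_samples]
    categories = {category for _, category in training_samples}
    return {
        "num_categories": len(categories),
        "max_length": max(lengths),
        "min_length": min(lengths),
        "avg_length": sum(lengths) / len(lengths),
    }

samples = [("short text", "news"), ("a somewhat longer sample text", "sports")]
print(form_features(samples))
```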
In some optional implementations of this embodiment, the statistical features may also include a content feature characterizing text content, which may include, but is not limited to, at least one of the following: a word frequency vector, and a text feature vector determined based on the vector space model (VSM).
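A word frequency vector of the kind mentioned here can be sketched as follows. This assumes whitespace tokenization and a fixed vocabulary, which the patent does not specify; it is one simple realization of a VSM-style content feature.

```python
from collections import Counter

def word_frequency_vector(texts, vocabulary):
    """Build a word-frequency vector over a fixed vocabulary: one count
    per vocabulary word, summed over all the given texts."""
    counts = Counter(word for text in texts for word in text.split())
    return [counts[word] for word in vocabulary]

vocab = ["model", "text", "training"]
vec = word_frequency_vector(["text classification model", "training a model"], vocab)
print(vec)  # [2, 1, 1]
```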
Step 203: generate a text classification model based on the statistical features and the training of an initial model.
In this embodiment, based on the statistical features and the training of an initial model, the executing body may generate a text classification model in various ways. The text classification model may be used to characterize the correspondence between text categories and texts. The initial model may be selected from a preset set of pre-trained models. A pre-trained model in this set may be a model trained in advance on massive datasets from different fields (for example, finance, law, science and technology, and sports). A pre-trained model may be used to characterize the correspondence between text categories and texts, and may contain sufficient knowledge of the relevant field, such as underlying semantics and reasoning. It can be understood that the pre-trained models generated from massive datasets of different fields may each be regarded as an initial representation for model-agnostic meta-learning (MAML) in its field. Thus, after subsequent training adjusts the model, a pre-trained model can be used for text classification in a subfield of the field to which its dataset belongs.
In some optional implementations of this embodiment, the executing body may generate the text classification model according to the following steps:
First, input the form feature into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature.
In these optional implementations, the executing body may input the form feature determined in step 202 into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature. The hyperparameter group may include model hyperparameters and training hyperparameters. The model hyperparameters may be used to characterize attributes of the model itself; for example, they may include, but are not limited to, at least one of the following: the number of layers of the neural network, the number of nodes in each hidden layer, and the dimension of the word vectors (embeddings). The training hyperparameters may be used to direct the training process of the model; for example, they may include, but are not limited to, at least one of the following: the learning rate, the batch size, the gradient clipping limit (clip c), the dropout value (for example, 0.5), and the L2 regularization value (for example, 1.0).
In these optional implementations, the hyperparameter generation model may be used to characterize the correspondence between hyperparameter groups and form features. As an example, the hyperparameter generation model may be a mapping table, pre-specified by a technician based on statistics over a large number of form features and the hyperparameter groups that trained well for them, which stores the correspondences between multiple form features and hyperparameter groups. As another example, the hyperparameter generation model may be a model trained with a machine learning algorithm on a large number of samples, where a sample may consist of a form feature and a corresponding hyperparameter group with good training effect.
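The mapping-table form of the hyperparameter generation model can be sketched as below. The buckets, hyperparameter names, and values are illustrative assumptions, not taken from the patent; only the shape (form feature in, hyperparameter group with model and training hyperparameters out) follows the text.

```python
def generate_hyperparameters(avg_length):
    """Table-driven hyperparameter generation: map a form feature (here,
    average text length) to a hyperparameter group containing model
    hyperparameters and training hyperparameters."""
    table = [
        (128, {"hidden_layers": 2, "embedding_dim": 128,     # model hyperparameters
               "learning_rate": 1e-3, "batch_size": 64}),    # training hyperparameters
        (512, {"hidden_layers": 4, "embedding_dim": 256,
               "learning_rate": 5e-4, "batch_size": 32}),
        (float("inf"), {"hidden_layers": 6, "embedding_dim": 512,
                        "learning_rate": 1e-4, "batch_size": 16}),
    ]
    for upper_bound, group in table:
        if avg_length <= upper_bound:
            return group

print(generate_hyperparameters(100)["hidden_layers"])  # 2
```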
Second, select, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model.
In these implementations, the executing body may select, from the set of pre-trained models, a pre-trained model whose structure matches the model hyperparameters in the hyperparameter group obtained in the first step as the initial model. Here, matching may mean being the same or similar. For example, if the model hyperparameter specifies 2 hidden layers, the initial model may be a neural network including 2 hidden layers.
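A minimal sketch of this matching step, under the assumption that "similar" means the closest hidden-layer count; the model descriptions are hypothetical placeholders, not models named in the patent.

```python
def select_initial_model(pretrained_models, model_hyperparameters):
    """Select from a pre-trained model set the model whose structure
    best matches the given model hyperparameters (here: the closest
    hidden-layer count)."""
    target = model_hyperparameters["hidden_layers"]
    return min(pretrained_models, key=lambda m: abs(m["hidden_layers"] - target))

models = [
    {"name": "finance_pretrained", "hidden_layers": 2},
    {"name": "law_pretrained", "hidden_layers": 4},
]
chosen = select_initial_model(models, {"hidden_layers": 2})
print(chosen["name"])  # finance_pretrained
```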
Third, train the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
In these implementations, the executing body may, as directed by the training hyperparameters in the hyperparameter group obtained in the first step, train the initial model selected in the second step using various machine learning algorithms to generate the text classification model.
In some optional implementations of this embodiment, the executing body may also generate the text classification model according to the following steps:
First, obtain a set of initial hyperparameter groups.
In these implementations, the executing body may first obtain a set of initial hyperparameter groups through a wired or wireless connection. An initial hyperparameter group may include initial model hyperparameters and initial training hyperparameters. The descriptions of the initial model hyperparameters and initial training hyperparameters are consistent with those of the model hyperparameters and training hyperparameters in the aforementioned hyperparameter group, and are not repeated here.
In these implementations, the executing subject may obtain a set of initial hyperparameter groups stored locally in advance, or a set of initial hyperparameter groups sent by a communicatively connected electronic device (e.g., a data server). Optionally, the executing subject may also obtain, locally or from the above electronic device, a set of initial hyperparameter groups matching the statistical features. For example, a matching relationship may exist between the average text length in the statistical features and a training-iteration threshold in the initial hyperparameter group. Optionally, the executing subject may also randomly generate an initial value for each hyperparameter in each initial hyperparameter group, thereby generating the set of initial hyperparameter groups.
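The random-generation option can be sketched as follows. The hyperparameter names and value ranges are illustrative assumptions; the disclosure does not fix a particular search space.

```python
import random

def generate_initial_groups(n, seed=0):
    """Randomly draw n initial hyperparameter groups, each holding
    initial model hyperparameters and initial training hyperparameters."""
    rng = random.Random(seed)  # seeded for reproducibility
    groups = []
    for _ in range(n):
        groups.append({
            "model": {"hidden_layers": rng.randint(1, 4)},
            "training": {
                "learning_rate": rng.choice([0.1, 0.01, 0.001]),
                "max_iters": rng.randint(5, 20),
            },
        })
    return groups

groups = generate_initial_groups(4)
```

Each generated group then serves as one candidate in the determination step described below.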
In the second step, an initial hyperparameter group is chosen from the set of initial hyperparameter groups, and the following determination step is performed. The determination step may include:
S1: selecting, from the pre-training model set, a pre-training model matching the initial model hyperparameters in the chosen initial hyperparameter group as the initial model.
In these implementations, the executing subject may choose, from the pre-training model set, one or more pre-training models matching the initial model hyperparameters in the chosen initial hyperparameter group as initial models. Optionally, the executing subject may also choose one or more pre-training models matching the statistical features as initial models. For example, a matching relationship may exist between the maximum text length in the statistical features and the number of hidden nodes in the initial hyperparameter group.
Optionally, based on the content features among the above statistical features, the executing subject may also choose, from the pre-training model set, one or more pre-training models matching both the content features and the initial model hyperparameters in the chosen initial hyperparameter group as initial models. Here, a pre-training model may be associated with a semantic label. In general, the semantic label may be consistent with the domain of the dataset on which the corresponding pre-training model was pre-trained. The executing subject may determine the degree of matching between the content features and a semantic label in various ways. As an example, the executing subject may compute the similarity between the word-frequency vector of the training sample set and the semantic label.
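As one concrete (assumed) realization of the word-frequency similarity mentioned above, a semantic label can be represented by a small bag of domain words and compared against the training set's word-frequency vector with cosine similarity. The vocabularies and labels below are invented for illustration.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two sparse word-frequency vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Word-frequency vector of an (assumed) training sample set.
samples = ["stock fund bond", "interest rate bond"]
sample_freq = Counter(w for text in samples for w in text.split())

# Assumed semantic labels, each keyed to a domain vocabulary.
label_freq = {
    "finance": Counter(["stock", "bond", "rate", "fund"]),
    "sports": Counter(["goal", "match", "team"]),
}

# The best-matching label indicates which pre-training model's domain fits.
best = max(label_freq, key=lambda tag: cosine(sample_freq, label_freq[tag]))
```

A pre-training model whose semantic label scores highest would then be preferred as the initial model.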
S2: training the chosen initial model according to the initial training hyperparameters in the chosen initial hyperparameter group, to generate a quasi text classification model corresponding to that initial hyperparameter group.
Based on the above optional implementations, the executing subject may train the initial model chosen in step S1 using various machine learning algorithms, as directed by the initial training hyperparameters in the initial hyperparameter group chosen in the second step, thereby generating the quasi text classification model corresponding to the chosen initial hyperparameter group.
Optionally, the executing subject may also generate the quasi text classification model corresponding to the chosen initial hyperparameter group according to the following steps:
Step 1: choosing a training sample from the training sample set and performing the following training step. The training step may include:
S21: inputting the sample text of the chosen training sample into the chosen initial model to generate a text category.
S22: determining a difference value according to the generated text category and the sample category corresponding to the input sample text.
S23: determining whether the difference value satisfies a training completion condition.
Based on the above optional implementations, the difference value and the training completion condition may be determined based on the initial training hyperparameters in the chosen hyperparameter group. As an example, the initial training hyperparameters may include a loss function and a training completion condition threshold. The difference value may be determined according to the loss function. The training completion condition threshold may include at least one of: a training duration threshold, a training-iteration threshold, a difference value threshold, and an accuracy threshold on the verification text set.
S24: in response to determining that the training completion condition is satisfied, determining the chosen initial model as the quasi text classification model corresponding to the chosen hyperparameter group.
Step 2: in response to determining that the training completion condition is not satisfied, adjusting the relevant parameters of the chosen initial model, choosing a training sample from the training sample set, using the adjusted initial model as the chosen initial model, and continuing to perform the above training step.
Based on the above optional implementations, in response to determining that the training completion condition is not satisfied, the executing subject may adjust the relevant parameters of the chosen initial model by various methods, choose a training sample from the training sample set, use the adjusted initial model as the chosen initial model, and continue to perform the training step. It should be noted that, depending on whether one or more samples are chosen at a time, the methods for adjusting the relevant parameters may include, but are not limited to, at least one of: batch gradient descent (BGD), stochastic gradient descent (SGD), and mini-batch gradient descent (MBGD).
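The loop formed by S21 to S24 can be illustrated with a deliberately tiny stand-in: a single scalar parameter trained by stochastic gradient descent under a squared loss, with a difference-value threshold and an iteration cap as completion conditions. The real model, loss function, and thresholds would come from the chosen hyperparameter group; everything here is an illustrative assumption.

```python
import random

def train_until_done(samples, lr=0.1, max_iters=200, loss_threshold=1e-4, seed=0):
    """Toy training step: choose a sample (S21), compute the difference
    value from a squared loss (S22), check the completion condition (S23),
    and either accept the model (S24) or adjust the parameter by SGD."""
    rng = random.Random(seed)
    w = 0.0                                 # the model's one relevant parameter
    for i in range(max_iters):
        x, y = rng.choice(samples)          # S21: feed a chosen sample
        pred = w * x
        diff = (pred - y) ** 2              # S22: difference value via loss
        if diff < loss_threshold:           # S23: training completion condition
            return w, i                     # S24: accept as quasi model
        w -= lr * 2 * (pred - y) * x        # adjust parameter, repeat
    return w, max_iters

samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data following y = 2x
w, iters = train_until_done(samples)
```

Choosing one sample per update corresponds to SGD in the list above; feeding the whole set or a small batch per update would correspond to BGD or MBGD respectively.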
S3: evaluating the generated quasi text classification model based on a verification text set to generate an evaluation result.
Based on the above optional implementations, the executing subject may evaluate the generated quasi text classification model using the verification text set, thereby generating an evaluation result. Here, the verification text set may be preset. A verification text may include a text and verification text annotation information corresponding to the text. Optionally, the verification text set may also be generated by splitting the annotated text collection received from the user terminal.
S4: in response to determining that the generated evaluation result satisfies a hyperparameter group determination condition, determining the text classification model from the quasi text classification models whose evaluation results satisfy that condition.
Based on the above optional implementations, in response to determining that the generated evaluation result satisfies the hyperparameter group determination condition, the executing subject may determine the text classification model from the quasi text classification models whose evaluation results satisfy the condition, in a manner consistent with how initial hyperparameter groups are chosen from the set of initial hyperparameter groups. The hyperparameter group determination condition may include, but is not limited to, at least one of: the accuracy on the verification text set exceeds a preset accuracy threshold; the number of iterations over the initial hyperparameter groups exceeds a preset count threshold; and the difference between the verification-set accuracies corresponding to the initial hyperparameter groups of two adjacent iterations is less than a preset difference threshold.
As an example, the executing subject may choose one initial hyperparameter group at a time from the set of initial hyperparameter groups; the executing subject may then determine, as the text classification model, the quasi text classification model whose evaluation result satisfies the hyperparameter group determination condition. As another example, the executing subject may choose multiple initial hyperparameter groups at a time from the set, in which case there may be multiple quasi text classification models whose evaluation results satisfy the condition. The executing subject may typically choose the quasi text classification model with the best evaluation result as the text classification model; for example, "best" may mean the highest accuracy on the verification text set.
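The multi-candidate case described above can be sketched as follows: filter the quasi models by the determination condition (here, an assumed accuracy-threshold condition) and keep the one with the best verification accuracy. Names and accuracy values are invented for illustration.

```python
def pick_text_classifier(candidates, accuracy_threshold=0.9):
    """From quasi text classification models whose evaluation results
    satisfy the determination condition (verification accuracy above a
    preset threshold), pick the one with the best evaluation result."""
    qualified = [c for c in candidates if c["val_accuracy"] > accuracy_threshold]
    if not qualified:
        return None  # no group satisfies the condition; keep iterating
    return max(qualified, key=lambda c: c["val_accuracy"])

candidates = [
    {"name": "quasi_a", "val_accuracy": 0.88},
    {"name": "quasi_b", "val_accuracy": 0.93},
    {"name": "quasi_c", "val_accuracy": 0.91},
]
chosen = pick_text_classifier(candidates)
```

When `pick_text_classifier` returns `None`, the flow falls through to the third step below: update the hyperparameter groups and repeat the determination step.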
In the third step, in response to determining that the generated evaluation result does not satisfy the hyperparameter group determination condition, the hyperparameter groups in the set of initial hyperparameter groups are updated; an initial hyperparameter group is then chosen from the updated set, and the above determination step is performed again.
In these implementations, in response to determining that the generated evaluation result does not satisfy the hyperparameter group determination condition, the executing subject may update the hyperparameter groups in the set of initial hyperparameter groups in various ways; the executing subject may then choose an initial hyperparameter group from the updated set and continue to perform the determination step.
It should be noted that, depending on whether one or more initial hyperparameter groups are chosen at a time, the update methods may include, but are not limited to, at least one of: genetic algorithms (GA), simulated annealing (SA), ant colony algorithms (ACA), and Bayesian optimization.
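As a loose stand-in for the listed update methods (a real implementation would use GA, SA, ACA, or Bayesian optimization), the sketch below keeps the best-scoring group and repopulates the rest with perturbed copies of it, in the spirit of an evolutionary update. All field names and perturbation factors are assumptions.

```python
import random

def update_groups(groups, evaluations, rng):
    """Keep the group with the best evaluation result and fill the
    remaining slots with randomly perturbed copies of it."""
    best = max(zip(groups, evaluations), key=lambda pair: pair[1])[0]
    new_groups = [best]
    while len(new_groups) < len(groups):
        mutated = dict(best)
        mutated["learning_rate"] = best["learning_rate"] * rng.choice([0.5, 1.0, 2.0])
        new_groups.append(mutated)
    return new_groups

rng = random.Random(0)
groups = [{"learning_rate": 0.1}, {"learning_rate": 0.01}]
evaluations = [0.80, 0.92]          # assumed verification accuracies
updated = update_groups(groups, evaluations, rng)
```

The updated set then feeds another round of the determination step, and the loop stops once the determination condition holds.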
In some optional implementations of the present embodiment, the executing subject may also send, to a target terminal, information indicating that training of the text classification model has been completed.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating a classification model according to an embodiment of the present disclosure. In the application scenario of Fig. 3, a user 301 uses a terminal device 302 to upload an annotated collection 303 of investment-related texts to a background server 304. The background server 304 may determine that the average text length of the annotated investment text collection 303 is 3000 words. The background server 304 may then generate a corresponding hyperparameter group 3031 according to the determined average text length. Next, the background server 304 may choose, from a preset pre-training model set 3032, a pre-training model trained on a financial-domain dataset as the initial model 3033, train the initial model 3033 according to the generated hyperparameter group 3031, and generate a text classification model 305. Optionally, the background server 304 may also send, to the terminal device 302, prompt information 306 indicating that model training has been completed.
At present, one approach in the prior art typically obtains feature word vectors through complex feature engineering, or trains an initial model according to preset hyperparameters. Because feature engineering design and hyperparameter selection generally require users to have rich modeling experience, the technical threshold for training a text classification model is high. In contrast, the method provided by the above embodiments of the present disclosure determines the statistical features of the training sample set, selects a pre-training model according to those statistical features, and trains in a manner matched to them. It thereby generates a text classification model by fine-tuning a pre-training model with a small-scale sample set. Moreover, no manual hyperparameter tuning is required during training, which significantly lowers the threshold for users. In turn, users can quickly deploy text classification applications suited to their own needs at low cost.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for generating a classification model is illustrated. This flow 400 of the method for generating a classification model includes the following steps:
Step 401: obtaining a training sample set.
Step 402: determining the statistical features of the training sample set.
Step 403: generating a text classification model based on the statistical features and the training of an initial model.
Steps 401, 402, and 403 above are consistent with steps 201, 202, and 203 of the previous embodiment, respectively; the descriptions of steps 201, 202, and 203 also apply to steps 401, 402, and 403, and are not repeated here.
Step 404: receiving a to-be-classified text collection sent by a user terminal.
In the present embodiment, the executing subject of the method for generating a classification model (e.g., the server 105 shown in Fig. 1) may receive, through a wired or wireless connection, the to-be-classified text collection sent by the user terminal (e.g., a terminal device shown in Fig. 1).
It should be noted that step 404 and step 401 may be performed substantially in parallel, or step 404 may be performed first and step 401 afterwards.
Step 405: inputting the to-be-classified text collection into the text classification model to generate category information corresponding to each to-be-classified text in the collection.
In the present embodiment, the executing subject may input the to-be-classified text collection received in step 404 into the text classification model generated in step 403, and generate category information corresponding to each to-be-classified text in the collection. Here, the category information may be used to characterize the category to which a to-be-classified text belongs. It can be understood that, since the text classification model is trained on the training sample set obtained in step 401, the generated category information corresponding to the to-be-classified texts may match the sample categories of the training samples.
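Step 405 amounts to mapping each to-be-classified text to its category information. The sketch below illustrates this with a trivial keyword rule standing in for the trained text classification model; the rule, texts, and category names are all invented for illustration.

```python
def classify_texts(model, texts):
    """Feed the to-be-classified text collection to the text classification
    model and pair each text with its generated category information."""
    return [(text, model(text)) for text in texts]

# A trivial keyword-rule "model" standing in for the trained classifier.
def toy_model(text):
    return "investment" if "fund" in text else "other"

results = classify_texts(toy_model, ["buy the fund", "weather today"])
```

In step 406, each `(text, category)` pair would be sent back to the user terminal, with the text side serving as the to-be-classified text information that identifies which text the category belongs to.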
Step 406: sending the generated category information together with corresponding to-be-classified text information to the user terminal.
In the present embodiment, the executing subject may send the generated category information together with the corresponding to-be-classified text information to the user terminal through a wired or wireless connection. Here, the to-be-classified text information may be used to identify the to-be-classified texts in the to-be-classified text collection.
As can be seen from Fig. 4, the flow 400 of the method for generating a classification model in the present embodiment embodies the step of classifying the to-be-classified texts uploaded by the user using the generated text classification model. The scheme described in this embodiment can thus train a text classification model from annotated texts uploaded by the user, and then use the trained model to classify uploaded unannotated texts, so that the required text classification model can be trained and applied in a targeted manner using existing samples, without manual hyperparameter tuning. This lowers the threshold for training and using text classification models.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for generating a classification model. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating a classification model provided in this embodiment includes an acquiring unit 501, a determination unit 502, and a first generation unit 503. The acquiring unit 501 is configured to obtain a training sample set, where a training sample includes a sample text and a sample category corresponding to the sample text. The determination unit 502 is configured to determine the statistical features of the training sample set, where the statistical features include form features characterizing text length. The first generation unit 503 is configured to generate a text classification model based on the statistical features and the training of an initial model, where the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is chosen from a preset pre-training model set.
In the present embodiment, in the apparatus 500 for generating a classification model, the specific processing of the acquiring unit 501, the determination unit 502, and the first generation unit 503, and the technical effects thereof, may refer to the descriptions of steps 201, 202, and 203 in the embodiment corresponding to Fig. 2, respectively, and are not repeated here.
In some optional implementations of the present embodiment, the first generation unit 503 may include a first generation module (not shown), a choosing module (not shown), and a second generation module (not shown). The first generation module may be configured to input the form features into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form features, where the hyperparameter group may include model hyperparameters and training hyperparameters. The choosing module may be configured to choose, from the pre-training model set, a pre-training model matching the model hyperparameters in the hyperparameter group as the initial model. The second generation module may be configured to train the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
In some optional implementations of the present embodiment, the first generation unit 503 may include an obtaining module (not shown), a determining module (not shown), and an updating module (not shown). The obtaining module may be configured to obtain a set of initial hyperparameter groups, where each initial hyperparameter group may include initial model hyperparameters and initial training hyperparameters. The determining module may be configured to choose an initial hyperparameter group from the set and perform the following determination step: choosing, from the pre-training model set, a pre-training model matching the initial model hyperparameters in the chosen initial hyperparameter group as the initial model; training the chosen initial model according to the initial training hyperparameters in the chosen initial hyperparameter group to generate a quasi text classification model corresponding to that group; evaluating the generated quasi text classification model based on a verification text set to generate an evaluation result; and, in response to determining that the generated evaluation result satisfies a hyperparameter group determination condition, determining the text classification model from the quasi text classification models whose evaluation results satisfy the condition. The updating module may be configured to, in response to determining that the generated evaluation result does not satisfy the hyperparameter group determination condition, update the initial hyperparameter groups in the set, choose an initial hyperparameter group from the updated set, and continue to perform the determination step.
In some optional implementations of the present embodiment, the statistical features may further include content features characterizing text content. The determining module may be further configured to choose, from the pre-training model set, a pre-training model matching both the content features and the initial model hyperparameters in the chosen initial hyperparameter group as the initial model, where a pre-training model may be associated with a semantic label.
In some optional implementations of the present embodiment, the determining module may further include a choosing submodule (not shown) and an adjusting submodule (not shown). The choosing submodule may be configured to choose a training sample from the training sample set and perform the following training step: inputting the sample text of the chosen training sample into the chosen initial model to generate a text category; determining a difference value according to the generated text category and the sample category corresponding to the input sample text; determining whether the difference value satisfies a training completion condition, where the difference value and the training completion condition are determined based on the initial training hyperparameters in the chosen hyperparameter group; and, in response to determining that the training completion condition is satisfied, determining the chosen initial model as the quasi text classification model corresponding to the chosen hyperparameter group. The adjusting submodule may be configured to, in response to determining that the training completion condition is not satisfied, adjust the relevant parameters of the chosen initial model, choose a training sample from the training sample set, use the adjusted initial model as the chosen initial model, and continue to perform the training step.
In some optional implementations of the present embodiment, the acquiring unit 501 may include a receiving module (not shown) and a third generation module (not shown). The receiving module may be configured to receive an annotated text collection sent by a user terminal, where an annotated text may include a text and text category annotation information corresponding to the text. The third generation module may be configured to split the annotated text collection to generate the training sample set and the verification text set, where a training sample may include the text as a sample text and the text category annotation information as the sample category corresponding to the sample text.
In some optional implementations of the present embodiment, the apparatus 500 for generating a classification model may further include a receiving unit (not shown), a second generation unit (not shown), and a sending unit (not shown). The receiving unit may be configured to receive a to-be-classified text collection sent by a user terminal. The second generation unit may be configured to input the to-be-classified text collection into the text classification model and generate category information corresponding to each to-be-classified text in the collection, where the category information may be used to characterize the category to which a to-be-classified text belongs and may match the sample categories. The sending unit may be configured to send the generated category information together with the corresponding to-be-classified text information to the user terminal, where the to-be-classified text information may be used to identify the to-be-classified texts in the to-be-classified text collection.
In the apparatus provided by the above embodiment of the present disclosure, the acquiring unit 501 obtains a training sample set, where a training sample includes a sample text and a sample category corresponding to the sample text. The determination unit 502 then determines the statistical features of the training sample set, where the statistical features include form features characterizing text length. Afterwards, the first generation unit 503 generates a text classification model based on the statistical features and the training of an initial model, where the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is chosen from a preset pre-training model set. The text classification model is thus generated automatically without manual hyperparameter tuning.
Referring now to Fig. 6, a structural schematic diagram of an electronic device (e.g., the server in Fig. 1) 600 suitable for implementing embodiments of the present disclosure is illustrated. The server shown in Fig. 6 is only an example and should not impose any limitation on the functionality or scope of use of embodiments of the present disclosure.
As shown in Fig. 6, the electronic device 600 may include a processing unit (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. Various programs and data required for the operation of the electronic device 600 are also stored in the RAM 603. The processing unit 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, etc.; an output device 607 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; a storage device 608 including, for example, a magnetic tape, hard disk, etc.; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. Although Fig. 6 shows an electronic device 600 with various devices, it should be understood that it is not required to implement or provide all of the devices shown; more or fewer devices may alternatively be implemented or provided. Each box shown in Fig. 6 may represent one device, or may represent multiple devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing unit 601, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium described in embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the present disclosure, the computer-readable storage medium may be any tangible medium containing or storing a program, which may be used by or in combination with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted over any suitable medium, including but not limited to: a wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.
The above computer-readable medium may be included in the above server, or may exist separately without being assembled into the server. The above computer-readable medium carries one or more programs which, when executed by the server, cause the server to: obtain a training sample set, where a training sample includes a sample text and a sample category corresponding to the sample text; determine the statistical features of the training sample set, where the statistical features include form features characterizing text length; and generate a text classification model based on the statistical features and the training of an initial model, where the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is chosen from a preset pre-training model set.
The behaviour for executing embodiment of the disclosure can be write with one or more programming languages or combinations thereof The computer program code of work, described program design language include object oriented program language-such as Java, Smalltalk, C++ further include conventional procedural programming language-such as " C " language or similar program design language Speech.Program code can be executed fully on the user computer, partly be executed on the user computer, as an independence Software package execute, part on the user computer part execute on the remote computer or completely in remote computer or It is executed on server.In situations involving remote computers, remote computer can pass through the network of any kind --- packet It includes local area network (LAN) or wide area network (WAN)-is connected to subscriber computer, or, it may be connected to outer computer (such as benefit It is connected with ISP by internet).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and possible operation of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code, which includes one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including an acquiring unit, a determining unit, and a first generating unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the acquiring unit may also be described as "a unit for obtaining a training sample set, wherein a training sample includes a sample text and a sample category corresponding to the sample text".
The above description is merely a description of preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features of similar function disclosed in (but not limited to) the embodiments of the present disclosure.

Claims (16)

1. A method for generating a classification model, comprising:
obtaining a training sample set, wherein a training sample includes a sample text and a sample category corresponding to the sample text;
determining a statistical feature of the training sample set, wherein the statistical feature includes a form feature characterizing text length; and
generating a text classification model based on the statistical feature and training of an initial model, wherein the text classification model is used to characterize a correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models.
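For illustration only, the steps of claim 1 can be sketched in Python. All names here are hypothetical, and the "form feature" is a toy choice (mean and maximum character length) rather than anything prescribed by the claim:

```python
import statistics

def form_feature(sample_texts):
    """Toy form feature characterizing text length: mean and maximum
    character length over the sample texts (an illustrative choice)."""
    lengths = [len(t) for t in sample_texts]
    return {"mean_len": statistics.mean(lengths), "max_len": max(lengths)}

# Hypothetical training sample set: (sample text, sample category) pairs.
train_set = [("great product", "pos"),
             ("terrible service", "neg"),
             ("works as advertised", "pos")]
feature = form_feature([text for text, _ in train_set])
```

The feature would then drive the choice and training of the initial model, as elaborated in claims 2 and 3.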
2. The method according to claim 1, wherein the generating a text classification model based on the statistical feature and training of an initial model comprises:
inputting the form feature into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature, wherein the hyperparameter group includes model hyperparameters and training hyperparameters;
selecting, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and
training the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
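A minimal sketch of the claim-2 flow, with a simple heuristic standing in for the pre-trained hyperparameter generation model (the thresholds, model names, and hyperparameter keys are all hypothetical):

```python
def generate_hyperparams(form_feature):
    """Stand-in for the hyperparameter generation model: map a text-length
    feature to a hyperparameter group (heuristic, for illustration only)."""
    long_text = form_feature["mean_len"] > 128
    return {
        # model hyperparameters: used to match a pre-trained model
        "model": {"max_seq_len": 512 if long_text else 128},
        # training hyperparameters: used when training the initial model
        "training": {"lr": 1e-5 if long_text else 5e-5, "epochs": 3},
    }

# Hypothetical pool of pre-trained models, keyed by a model hyperparameter.
PRETRAINED = {128: "short-text-encoder", 512: "long-text-encoder"}

group = generate_hyperparams({"mean_len": 40})
initial_model = PRETRAINED[group["model"]["max_seq_len"]]  # matching step
```

Training `initial_model` with `group["training"]` would then yield the text classification model.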
3. The method according to claim 1, wherein the generating a text classification model based on the statistical feature and training of an initial model comprises:
obtaining a set of initial hyperparameter groups, wherein an initial hyperparameter group includes initial model hyperparameters and initial training hyperparameters;
selecting an initial hyperparameter group from the set of initial hyperparameter groups, and performing the following determining steps: selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model; training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group; evaluating the generated candidate text classification model based on a validation text set to generate an evaluation result; and in response to determining that the generated evaluation result satisfies a hyperparameter-group determination condition, determining the text classification model from the candidate text classification model corresponding to the evaluation result satisfying the determination condition; and
in response to determining that the generated evaluation result does not satisfy the hyperparameter-group determination condition, updating the initial hyperparameter groups in the set of initial hyperparameter groups, selecting an initial hyperparameter group from the updated set of initial hyperparameter groups, and continuing to perform the determining steps.
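The claim-3 search loop can be sketched as follows. The `train` and `evaluate` callables, the threshold-based determination condition, and the toy stand-ins in the usage example are all assumptions for illustration:

```python
def search_hyperparams(groups, train, evaluate, threshold, update=None):
    """Sketch of the claim-3 loop: for each initial hyperparameter group,
    train a candidate model and evaluate it on a validation set; return the
    first candidate whose evaluation satisfies the determination condition,
    otherwise update the groups and repeat (or give up with no update rule)."""
    while True:
        for group in groups:
            candidate = train(group)      # candidate text classification model
            score = evaluate(candidate)   # evaluation on the validation set
            if score >= threshold:        # hyperparameter-group determination condition
                return candidate, group
        if update is None:
            return None, None
        groups = update(groups)           # refine the search space and retry

# Toy stand-ins: a "model" is just its learning rate; a lower rate scores higher.
best, best_group = search_hyperparams(
    groups=[{"lr": 0.1}, {"lr": 0.01}],
    train=lambda g: g["lr"],
    evaluate=lambda m: 1.0 - m,
    threshold=0.95,
)
```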
4. The method according to claim 3, wherein the statistical feature further includes a content feature characterizing text content; and
the selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model comprises:
selecting, from the set of pre-trained models, a pre-trained model matching the content feature and the initial model hyperparameters in the selected initial hyperparameter group as the initial model, wherein a pre-trained model corresponds to a semantic label.
5. The method according to claim 3, wherein the training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group comprises:
selecting a training sample from the training sample set, and performing the following training steps: inputting the sample text of the selected training sample into the selected initial model to generate a text category; determining a difference value according to the generated text category and the sample category corresponding to the input sample text; determining whether the difference value satisfies a training completion condition, wherein the difference value and the training completion condition are determined based on the initial training hyperparameters in the selected hyperparameter group; and in response to determining that the training completion condition is satisfied, determining the selected initial model as the candidate text classification model corresponding to the selected hyperparameter group; and
in response to determining that the training completion condition is not satisfied, adjusting relevant parameters of the selected initial model, selecting a training sample from the training sample set, using the adjusted initial model as the selected initial model, and continuing to perform the training steps.
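A toy instance of the claim-5 training loop, fitting a single scalar weight rather than a real text model (the regression setup, learning rate, and tolerance are assumptions; the claim itself prescribes only the loop structure):

```python
def train_model(samples, lr, tol, max_steps):
    """Toy claim-5 loop: fit weight w so that w * x approximates y.
    The difference value is w*x - y; the training completion condition is
    |difference| <= tol. lr, tol and max_steps play the role of the
    initial training hyperparameters."""
    w = 0.0
    for step in range(max_steps):
        x, y = samples[step % len(samples)]
        diff = w * x - y          # difference value vs. the annotated target
        if abs(diff) <= tol:      # training completion condition satisfied
            return w
        w -= lr * diff * x        # adjust the relevant parameter and retry
    return w

w = train_model([(1.0, 2.0)], lr=0.5, tol=1e-3, max_steps=100)
```

Here the error halves each step, so the completion condition is reached well within `max_steps`.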
6. The method according to claim 3, wherein the obtaining a training sample set comprises:
receiving an annotated text set sent by a client, wherein an annotated text includes a text and a text-category annotation corresponding to the text; and
dividing the annotated text set to generate the training sample set and the validation text set, wherein a training sample includes the text as a sample text and the text-category annotation as the sample category corresponding to the sample text.
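The claim-6 division step can be sketched as a shuffled split. The validation ratio and seed are illustrative choices, not part of the claim:

```python
import random

def split_annotated(annotated, val_ratio=0.2, seed=0):
    """Divide an annotated text set of (text, category annotation) pairs into
    a training sample set and a validation text set (claim-6 sketch)."""
    items = list(annotated)
    random.Random(seed).shuffle(items)   # deterministic shuffle for the sketch
    n_val = max(1, int(len(items) * val_ratio))
    return items[n_val:], items[:n_val]  # (training samples, validation texts)

annotated = [(f"text {i}", "pos" if i % 2 else "neg") for i in range(10)]
train_set, val_set = split_annotated(annotated)
```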
7. The method according to one of claims 1-6, wherein the method further comprises:
receiving a set of texts to be classified sent by a client;
inputting the set of texts to be classified into the text classification model to generate category information corresponding to the texts to be classified in the set, wherein the category information is used to characterize the category to which a text to be classified belongs, and the category information matches the sample categories; and
sending the generated category information, together with the corresponding to-be-classified text information, to the client, wherein the to-be-classified text information is used to identify a text to be classified in the set of texts to be classified.
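The claim-7 inference flow amounts to pairing each text's category with an identifier for sending back to the client. A minimal sketch, with a hypothetical length-based stand-in for the trained model:

```python
def classify_texts(texts, model):
    """Claim-7 sketch: run each text to be classified through the text
    classification model and pair the category information with an
    identifier (here, the list index) of the text."""
    return [(i, model(t)) for i, t in enumerate(texts)]

# Hypothetical stand-in for a trained model: classify by character length.
toy_model = lambda t: "long" if len(t) > 10 else "short"
results = classify_texts(["hi", "a rather longer text"], toy_model)
```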
8. An apparatus for generating a classification model, comprising:
an acquiring unit, configured to obtain a training sample set, wherein a training sample includes a sample text and a sample category corresponding to the sample text;
a determining unit, configured to determine a statistical feature of the training sample set, wherein the statistical feature includes a form feature characterizing text length; and
a first generating unit, configured to generate a text classification model based on the statistical feature and training of an initial model, wherein the text classification model is used to characterize a correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models.
9. The apparatus according to claim 8, wherein the first generating unit includes:
a first generating module, configured to input the form feature into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature, wherein the hyperparameter group includes model hyperparameters and training hyperparameters;
a selecting module, configured to select, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and
a second generating module, configured to train the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
10. The apparatus according to claim 8, wherein the first generating unit includes:
an obtaining module, configured to obtain a set of initial hyperparameter groups, wherein an initial hyperparameter group includes initial model hyperparameters and initial training hyperparameters;
a determining module, configured to select an initial hyperparameter group from the set of initial hyperparameter groups and perform the following determining steps: selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model; training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group; evaluating the generated candidate text classification model based on a validation text set to generate an evaluation result; and in response to determining that the generated evaluation result satisfies a hyperparameter-group determination condition, determining the text classification model from the candidate text classification model corresponding to the evaluation result satisfying the determination condition; and
an updating module, configured to, in response to determining that the generated evaluation result does not satisfy the hyperparameter-group determination condition, update the initial hyperparameter groups in the set of initial hyperparameter groups, select an initial hyperparameter group from the updated set of initial hyperparameter groups, and continue to perform the determining steps.
11. The apparatus according to claim 10, wherein the statistical feature further includes a content feature characterizing text content; and the determining module is further configured to:
select, from the set of pre-trained models, a pre-trained model matching the content feature and the initial model hyperparameters in the selected initial hyperparameter group as the initial model, wherein a pre-trained model corresponds to a semantic label.
12. The apparatus according to claim 10, wherein the determining module further comprises:
a selecting submodule, configured to select a training sample from the training sample set and perform the following training steps: inputting the sample text of the selected training sample into the selected initial model to generate a text category; determining a difference value according to the generated text category and the sample category corresponding to the input sample text; determining whether the difference value satisfies a training completion condition, wherein the difference value and the training completion condition are determined based on the initial training hyperparameters in the selected hyperparameter group; and in response to determining that the training completion condition is satisfied, determining the selected initial model as the candidate text classification model corresponding to the selected hyperparameter group; and
an adjusting submodule, configured to, in response to determining that the training completion condition is not satisfied, adjust relevant parameters of the selected initial model, select a training sample from the training sample set, use the adjusted initial model as the selected initial model, and continue to perform the training steps.
13. The apparatus according to claim 10, wherein the acquiring unit includes:
a receiving module, configured to receive an annotated text set sent by a client, wherein an annotated text includes a text and a text-category annotation corresponding to the text; and
a third generating module, configured to divide the annotated text set to generate the training sample set and the validation text set, wherein a training sample includes the text as a sample text and the text-category annotation as the sample category corresponding to the sample text.
14. The apparatus according to one of claims 8-13, wherein the apparatus further comprises:
a receiving unit, configured to receive a set of texts to be classified sent by a client;
a second generating unit, configured to input the set of texts to be classified into the text classification model to generate category information corresponding to the texts to be classified in the set, wherein the category information is used to characterize the category to which a text to be classified belongs, and the category information matches the sample categories; and
a sending unit, configured to send the generated category information, together with the corresponding to-be-classified text information, to the client, wherein the to-be-classified text information is used to identify a text to be classified in the set of texts to be classified.
15. A server, comprising:
one or more processors; and
a storage device on which one or more programs are stored;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.
16. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.
CN201910721353.3A 2019-08-06 2019-08-06 Method and apparatus for generating disaggregated model Pending CN110457476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910721353.3A CN110457476A (en) 2019-08-06 2019-08-06 Method and apparatus for generating disaggregated model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910721353.3A CN110457476A (en) 2019-08-06 2019-08-06 Method and apparatus for generating disaggregated model

Publications (1)

Publication Number Publication Date
CN110457476A true CN110457476A (en) 2019-11-15

Family

ID=68485051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910721353.3A Pending CN110457476A (en) 2019-08-06 2019-08-06 Method and apparatus for generating disaggregated model

Country Status (1)

Country Link
CN (1) CN110457476A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111241280A (en) * 2020-01-07 2020-06-05 支付宝(杭州)信息技术有限公司 Training method of text classification model and text classification method
CN111241280B (en) * 2020-01-07 2023-09-05 支付宝(杭州)信息技术有限公司 Training method of text classification model and text classification method
CN111563163A (en) * 2020-04-29 2020-08-21 厦门市美亚柏科信息股份有限公司 Text classification model generation method and device and data standardization method and device
CN111696517A (en) * 2020-05-28 2020-09-22 平安科技(深圳)有限公司 Speech synthesis method, speech synthesis device, computer equipment and computer readable storage medium
CN113761181A (en) * 2020-06-15 2021-12-07 北京京东振世信息技术有限公司 Text classification method and device
WO2023109828A1 (en) * 2021-12-15 2023-06-22 维沃移动通信有限公司 Data collection method and apparatus, and first device and second device
US11748597B1 (en) * 2022-03-17 2023-09-05 Sas Institute, Inc. Computerized engines and graphical user interfaces for customizing and validating forecasting models

Similar Documents

Publication Publication Date Title
CN110457476A (en) Method and apparatus for generating disaggregated model
CN108197652B (en) Method and apparatus for generating information
CN109325541A (en) Method and apparatus for training pattern
CN110288049A (en) Method and apparatus for generating image recognition model
CN107491534A (en) Information processing method and device
CN110458107A (en) Method and apparatus for image recognition
CN108121800A (en) Information generating method and device based on artificial intelligence
CN110555714A (en) method and apparatus for outputting information
CN109976997A (en) Test method and device
CN106484766B (en) Searching method and device based on artificial intelligence
CN108734293A (en) Task management system, method and apparatus
CN108960316A (en) Method and apparatus for generating model
CN108256476A (en) For identifying the method and apparatus of fruits and vegetables
CN109961032A (en) Method and apparatus for generating disaggregated model
CN109933217A (en) Method and apparatus for pushing sentence
CN109299477A (en) Method and apparatus for generating text header
CN109766418A (en) Method and apparatus for output information
CN109902446A (en) Method and apparatus for generating information prediction model
CN108121699A (en) For the method and apparatus of output information
CN110084317A (en) The method and apparatus of image for identification
CN109190123A (en) Method and apparatus for output information
CN112418059A (en) Emotion recognition method and device, computer equipment and storage medium
CN109214501A (en) The method and apparatus of information for identification
CN109117758A (en) Method and apparatus for generating information
CN110727871A (en) Multi-mode data acquisition and comprehensive analysis platform based on convolution decomposition depth model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination