CN116595978A - Object category identification method, device, storage medium and computer equipment
- Publication number: CN116595978A
- Application number: CN202310866653.7A
- Authority: CN (China)
- Prior art keywords: text, neural network, network model, features, sample
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/289: Handling natural language data; natural language analysis; recognition of textual entities; phrasal analysis, e.g. finite state techniques or chunking
- G06F18/213: Pattern recognition; analysing; design or setup of recognition systems or techniques; feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
- G06F18/241: Pattern recognition; analysing; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/253: Pattern recognition; analysing; fusion techniques of extracted features
- G06F40/211: Handling natural language data; natural language analysis; parsing; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/30: Handling natural language data; semantic analysis
- G06N3/04: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology
- G06N3/08: Computing arrangements based on biological models; neural networks; learning methods
Abstract
The present disclosure provides an object category identification method, an object category identification device, a storage medium and a computer device. The method includes: acquiring a name text of a target object, behavior statistical data of the target object, and category label texts corresponding to a plurality of classification categories; performing feature extraction on the name text based on a first neural network model to obtain name text features; performing feature extraction on the behavior statistical data based on a second neural network model to obtain behavior statistical features; performing feature extraction on the category label texts based on a third neural network model to obtain category label text features; calculating, according to the name text features, the behavior statistical features and the category label text features, the classification probability of the target object corresponding to each classification category; and determining a target object category of the target object based on the classification probabilities. The method can improve the accuracy of object category identification.
Description
Technical Field
The disclosure relates to the technical field of artificial intelligence, and in particular relates to an object category identification method, an object category identification device, a storage medium and computer equipment.
Background
With the continuous development of internet technology, people's work and daily life have changed greatly. For example, business operation modes have shifted from purely offline operation to online operation or to a combination of online and offline operation. Traditional merchants attract online traffic by registering on internet platforms, thereby increasing their transaction volume.
As the number and variety of merchants on internet platforms keep growing, it becomes increasingly difficult for platform users to find a given merchant. Internet platforms therefore perform category identification on the merchants registered on them and classify the merchants according to the identification results, reducing the difficulty for users of finding merchants. However, the accuracy of category identification for merchants is currently poor.
Disclosure of Invention
The embodiment of the disclosure provides an object category identification method, an object category identification device, a storage medium and computer equipment.
According to an aspect of the present disclosure, there is provided an object class identification method, including:
Acquiring a name text of a target object, behavior statistical data of the target object, and category label texts corresponding to a plurality of classification categories;
extracting features of the name text based on a first neural network model to obtain name text features;
performing feature extraction on the behavior statistical data based on a second neural network model to obtain behavior statistical features;
extracting features of the category label text based on a third neural network model to obtain category label text features, wherein the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data;
calculating the classification probability of the target object corresponding to each classification category according to the name text features, the behavior statistical features and the category label text features;
and determining a target object category of the target object based on the classification probability.
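For illustration, the flow recited above can be sketched end to end as follows. This is a minimal sketch assuming PyTorch; the three encoder modules, the additive fusion and all dimensions are illustrative assumptions rather than the disclosed implementation.

```python
# Minimal sketch of the claimed inference flow (all names/shapes are assumptions).
import torch
import torch.nn as nn

def identify_object_category(name_encoder: nn.Module,
                             behavior_encoder: nn.Module,
                             label_encoder: nn.Module,
                             name_tokens: torch.Tensor,
                             behavior_stats: torch.Tensor,
                             label_tokens: torch.Tensor) -> int:
    name_feat = name_encoder(name_tokens)              # (d,)  name text feature
    behavior_feat = behavior_encoder(behavior_stats)   # (d,)  behavior statistical feature
    label_feats = label_encoder(label_tokens)          # (C, d) one feature per classification category
    fused = name_feat + behavior_feat                  # simplest fusion; the claims also allow weighted fusion
    probs = torch.softmax(label_feats @ fused, dim=0)  # dot product per category -> classification probability
    return int(probs.argmax())                         # index of the target object category
```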
According to an aspect of the present disclosure, there is provided an object class identification apparatus including:
an acquisition unit, which is used for acquiring a name text of a target object, behavior statistical data of the target object and category label texts corresponding to a plurality of classification categories;
The first extraction unit is used for extracting the characteristics of the name text based on the first neural network model to obtain the characteristics of the name text;
the second extraction unit is used for extracting the characteristics of the behavior statistical data based on a second neural network model to obtain behavior statistical characteristics;
the third extraction unit is used for extracting the characteristics of the category label text based on a third neural network model to obtain the characteristics of the category label text, and the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data;
the computing unit is used for computing the classification probability of the target object corresponding to each classification category according to the name text features, the behavior statistical features and the category label text features;
and the determining unit is used for determining the target object category of the target object based on the classification probability.
Optionally, the computing unit includes:
the first fusion subunit is used for carrying out fusion processing on the name text features and the behavior statistical features to obtain fusion features;
a calculating subunit, configured to calculate a dot product result of the fusion feature and the category label text feature;
And the determining subunit is used for determining the classification probability of the target object corresponding to each classification category according to the dot product result.
Optionally, the first fusion subunit includes:
the acquisition module is used for respectively acquiring weight coefficients corresponding to the name text features and the behavior statistical features;
and the fusion module is used for carrying out fusion processing on the name text features and the behavior statistical features based on the weight coefficients to obtain fusion features.
Optionally, the acquiring module includes:
the recognition sub-module is used for carrying out semantic recognition on the name text to obtain a semantic recognition result;
and the determining submodule is used for determining a weight coefficient corresponding to the name text feature and the behavior statistical feature based on the semantic recognition result.
Optionally, the computing subunit includes:
the conversion module is used for carrying out dimension conversion on the fusion features based on the feature dimension of the category label text features to obtain target fusion features;
and the first calculation module is used for carrying out dot product calculation on the target fusion feature and the category label text feature to obtain a dot product result.
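A minimal sketch of the computing unit described above, assuming PyTorch: weighted fusion of the two features, dimension conversion into the label feature space, and a dot product per category. The weight coefficients, module names and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusionAndScore(nn.Module):
    def __init__(self, feat_dim: int, label_dim: int):
        super().__init__()
        # Dimension conversion: map the fusion feature to the label feature dimension.
        self.proj = nn.Linear(feat_dim, label_dim)

    def forward(self, name_feat, behavior_feat, label_feats,
                w_name: float = 0.5, w_behavior: float = 0.5):
        fused = w_name * name_feat + w_behavior * behavior_feat  # weighted fusion feature
        target_fused = self.proj(fused)                          # target fusion feature
        scores = label_feats @ target_fused                      # dot product result per category
        return torch.softmax(scores, dim=0)                      # classification probabilities
```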
Optionally, the second neural network model includes a first sub-model and a second sub-model, and the second extraction unit includes:
The first processing subunit is used for carrying out dimension-lifting processing on the input features corresponding to the behavior statistical data based on the first sub-model to obtain high-dimensional features;
and the transformation subunit is used for carrying out feature transformation on the high-dimensional features based on the second sub-model to obtain the behavior statistical features.
Optionally, the first processing subunit includes:
the first processing module is used for acquiring numerical data in the behavior statistical data, and carrying out dimension-lifting processing on the numerical data based on a first weight vector and a first bias vector to obtain a numerical vector;
the second processing module is used for acquiring attribute data in the behavior statistical data, and carrying out dimension lifting processing on input features corresponding to the attribute data based on a second weight vector and a second bias vector to obtain an attribute vector;
and the stacking module is used for stacking the numerical vector and the attribute vector to obtain a high-dimensional characteristic.
Optionally, the second processing module includes:
the conversion sub-module is used for acquiring attribute data in the behavior statistical data and carrying out numerical conversion on the attribute data to obtain converted numerical data;
and the dimension-lifting sub-module is used for carrying out dimension-lifting processing on the converted numerical data based on the second weight vector and the second bias vector to obtain an attribute vector.
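The dimension-lifting described above can be illustrated as follows, assuming PyTorch. Each numerical feature is lifted with its own weight vector and bias vector; each attribute feature is mapped through an embedding table (a compact equivalent of one-hot conversion followed by a linear map); the resulting vectors are then stacked into the high-dimensional feature. Names and shapes are assumptions.

```python
import torch
import torch.nn as nn

class FeatureTokenizer(nn.Module):
    """Lifts each scalar behavior statistic into a d-dimensional vector."""
    def __init__(self, n_numeric: int, cat_cardinalities: list[int], d: int):
        super().__init__()
        # One weight vector and one bias vector per numerical feature.
        self.num_weight = nn.Parameter(torch.randn(n_numeric, d))
        self.num_bias = nn.Parameter(torch.zeros(n_numeric, d))
        # Attribute (categorical) features: integer code -> embedding, plus a bias.
        self.cat_embeddings = nn.ModuleList(nn.Embedding(c, d) for c in cat_cardinalities)
        self.cat_bias = nn.Parameter(torch.zeros(len(cat_cardinalities), d))

    def forward(self, numeric: torch.Tensor, categorical: torch.Tensor) -> torch.Tensor:
        # numeric: (n_numeric,) floats; categorical: (n_categorical,) integer codes
        num_tokens = numeric.unsqueeze(-1) * self.num_weight + self.num_bias
        cat_tokens = torch.stack([emb(categorical[i])
                                  for i, emb in enumerate(self.cat_embeddings)]) + self.cat_bias
        # Stack the numerical vectors and attribute vectors into the high-dimensional feature.
        return torch.cat([num_tokens, cat_tokens], dim=0)  # (n_features, d)
```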
Optionally, the first extraction unit includes:
the second processing subunit is used for carrying out text processing on the name text according to the preset text length to obtain a target text;
and the first extraction subunit is used for extracting the characteristics of the target text based on the first neural network model to obtain the characteristics of the name text.
Optionally, the second processing subunit comprises:
the expansion module is used for expanding the text of the name text based on the preset text length to obtain a target text when the text length of the name text is smaller than the preset text length;
and the cutting module is used for cutting the text of the name text according to the preset text length when the text length of the name text is greater than or equal to the preset text length, so as to obtain a target text.
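A minimal sketch of the text expansion and cutting described above; the padding symbol and the token-level granularity are assumptions.

```python
PAD_TOKEN = "[PAD]"  # illustrative padding symbol (an assumption)

def to_target_text(name_tokens: list[str], preset_len: int) -> list[str]:
    if len(name_tokens) < preset_len:
        # Expansion: pad the name text up to the preset text length.
        return name_tokens + [PAD_TOKEN] * (preset_len - len(name_tokens))
    # Cutting: keep only the first preset_len tokens.
    return name_tokens[:preset_len]
```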
Optionally, the object class identification device provided by the present disclosure further includes:
the first acquisition subunit is used for acquiring training sample data, wherein the training sample data comprises sample name texts of sample objects, sample behavior statistical data of the sample objects, sample category labels of the sample objects and category label texts corresponding to a plurality of classification categories;
The second extraction subunit is used for extracting the characteristics of the sample name text based on the first neural network model to obtain the characteristics of the sample name text;
the third extraction subunit is used for extracting the characteristics of the sample behavior statistical data based on the second neural network model to obtain sample behavior statistical characteristics;
the second fusion subunit is used for carrying out feature fusion on the sample name text features and the sample behavior statistical features to obtain sample fusion features;
a fourth extraction subunit, configured to perform feature extraction on the category label text based on the third neural network model, to obtain sample label text features;
an adjustment subunit, configured to calculate a loss value based on the sample fusion feature, the sample tag text feature, and the sample class tag, and adjust parameters of the first neural network model, the second neural network model, and the third neural network model based on the loss value;
and the execution subunit is used for circularly executing the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model and the third neural network model based on the acquired training sample data until the first neural network model, the second neural network model and the third neural network model converge.
Optionally, the adjusting subunit includes:
the first adjusting module is used for carrying out dimension adjustment on the sample fusion characteristics based on a fourth neural network model to obtain target sample fusion characteristics;
the second calculation module is used for carrying out dot product calculation on the target sample fusion characteristics and the sample label text characteristics to obtain sample classification results;
the second adjustment module is used for calculating a loss value based on the sample classification result and the sample class label and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the loss value;
the execution subunit is further configured to:
and circularly executing the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the acquired training sample data until the first neural network model, the second neural network model, the third neural network model and the fourth neural network model converge.
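A minimal PyTorch sketch of one step of the training loop described above, assuming a cross-entropy loss between the sample classification result and the sample category label; the loss choice, module names and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

def train_step(models, optimizer, batch) -> float:
    # first, second, third neural network models plus the fourth (dimension adjustment) model
    name_enc, behavior_enc, label_enc, proj = models
    name_feat = name_enc(batch["sample_name_tokens"])              # (B, d)
    behavior_feat = behavior_enc(batch["sample_behavior_stats"])   # (B, d)
    fused = name_feat + behavior_feat                              # sample fusion features
    label_feats = label_enc(batch["category_label_tokens"])        # (C, d_label)
    logits = proj(fused) @ label_feats.T                           # (B, C) sample classification result
    loss = nn.functional.cross_entropy(logits, batch["sample_category_label"])
    optimizer.zero_grad()
    loss.backward()   # parameters of all four models are adjusted jointly
    optimizer.step()
    return loss.item()
```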
According to an aspect of the present disclosure, there is provided a computer device comprising a memory storing a computer program and a processor implementing the object class identification method as described above when executing the computer program.
According to an aspect of the present disclosure, there is provided a storage medium storing a computer program which, when executed by a processor, implements the object class identification method as described above.
According to an aspect of the present disclosure, there is provided a computer program product comprising a computer program, which is read and executed by a processor of a computer device, causing the computer device to perform the object class identification method as described above.
In the embodiment of the disclosure, a name text of a target object, behavior statistical data of the target object and category label texts corresponding to a plurality of classification categories are obtained; feature extraction is performed on the name text based on the first neural network model to obtain name text features; feature extraction is performed on the behavior statistical data based on the second neural network model to obtain behavior statistical features; feature extraction is performed on the category label texts based on the third neural network model to obtain category label text features, wherein the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data; the classification probability of the target object corresponding to each classification category is calculated according to the name text features, the behavior statistical features and the category label text features; and a target object category of the target object is determined based on the classification probabilities. In this way, the present disclosure performs object category recognition by employing an end-to-end model that fuses the name text and the behavior statistical data of an object in the same category recognition task. Compared with identifying the object category using object data of only a single dimension, identifying with object data of more dimensions yields more accurate category identification results. Compared with using different models to recognize object data of different dimensions separately and then integrating the recognition results of the different models, the present disclosure avoids the accumulation of model errors, so the accuracy of object category identification can be improved.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the disclosure. The objectives and other advantages of the disclosure will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosed embodiments and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain, without limitation, the disclosed embodiments.
FIG. 1 is a system architecture diagram to which an object class identification method according to an embodiment of the present disclosure is applied;
FIG. 2 is a flow diagram of an object class identification method of an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the structure of an object class identification model of an embodiment of the present disclosure;
FIG. 4 is a training flow diagram of an object class identification model of an embodiment of the present disclosure;
FIG. 5 is another schematic structural diagram of an object class identification model in an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a process for feature extraction of the name text of a target object using a BERT model in an embodiment of the disclosure;
FIG. 7 is a schematic diagram of a structure of a second neural network model in an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a process for upscaling features of behavioral statistics based on a first sub-model in an embodiment of the disclosure;
FIG. 9 is a schematic diagram of a model structure of a second sub-model in an embodiment of the present disclosure;
FIG. 10 is another flow diagram of an object class identification method of an embodiment of the present disclosure;
FIG. 11 is a schematic diagram of the structure of an object class identification device in an embodiment of the disclosure;
FIG. 12 is a block diagram of a terminal implementing methods according to an embodiment of the present disclosure;
FIG. 13 is a block diagram of a server implementing methods according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, the present disclosure will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present disclosure.
Before proceeding to a further detailed description of the disclosed embodiments, the terms and terminology involved in the disclosed embodiments are explained as follows:
Pre-training model (PTM): also called a foundation model or large model, refers to a deep neural network (DNN) with a large number of parameters, trained on massive unlabeled data. A PTM exploits the function-approximation capability of a large-parameter DNN to extract common features from the data, and is adapted to downstream tasks through techniques such as fine-tuning, parameter-efficient fine-tuning (PEFT) and prompt-tuning. Therefore, a pre-training model can achieve good results in small-sample (few-shot) or zero-shot scenarios. PTMs can be classified according to the data modality they process into language models (ELMo, BERT, GPT), visual models (Swin Transformer, ViT, V-MoE), speech models (VALL-E), multi-modal models (ViLBERT, CLIP, Flamingo, Gato), etc., where a multi-modal model refers to a model that builds a feature representation of two or more data modalities. Pre-training models are an important tool for producing artificial intelligence generated content (AIGC), and can also serve as a general interface connecting multiple specific task models.
BERT: all Bidirectional Encoder Representation from Transformers, i.e., transform-based bi-directional coded representation, is a pre-trained language characterization model. It emphasizes that instead of pre-training as in the past with a conventional one-way language model or with a shallow concatenation of two one-way language models, a new mask language model (masked language model, MLM) is used to enable deep bi-directional language characterization.
One-hot conversion: also known as one-hot coding, one of the simplest and most commonly used methods of text feature representation. In essence, the index of a word in the word set is taken directly as the representation of the word.
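A toy example of one-hot conversion over a five-word word set:

```python
vocab = ["大", "药", "房", "文", "具"]  # toy word set

def one_hot(word: str) -> list[int]:
    # The index of the word in the word set determines its representation.
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

print(one_hot("药"))  # [0, 1, 0, 0, 0]
```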
In the related art, when the class of an object needs to be identified, the name text of the object may be acquired, and class identification of the object may then be performed with a large-scale pre-trained language model (e.g., a BERT model). Specifically, the large-scale pre-trained language model extracts features from the name text of the object to obtain name text features; meanwhile, the same model can extract features from the category texts of a plurality of object categories to obtain a plurality of category features, and the category of the object is then identified by calculating the similarity between the name text features and the plurality of category features. Alternatively, the behavior statistical data of the object may be obtained and then classified using a multi-classification model such as a tree model, e.g., extreme gradient boosting (XGBoost).
The large-scale pre-training language model and the tree model adopted in the related art can only process object data with a single dimension, and cannot process data with two dimensions, namely name text and behavior statistical data of an object, which are important for object category recognition tasks. The accuracy of the recognition result obtained by recognizing the object category by using only object data with a single dimension is poor. In order to solve the problem of poor accuracy of object category identification, the disclosure provides an object category identification method, so as to improve accuracy of object category identification.
The following describes the system architecture and application scenarios of the embodiments of the present disclosure. Fig. 1 is a system architecture diagram to which an object class identification method according to an embodiment of the present disclosure is applied. The architecture includes a terminal 140, the internet 130, a gateway 120, a server 110, etc.
The terminal 140 includes various forms of a desktop computer, a laptop computer, a PDA (personal digital assistant), a mobile phone, an in-vehicle terminal, a home theater terminal, a dedicated terminal, and the like. In addition, the device can be a single device or a set of a plurality of devices. The terminal 140 may communicate with the internet 130 in a wired or wireless manner, exchanging data.
Server 110 refers to a computer system that can provide certain services to terminal 140. The server 110 is required to have higher stability, security, performance, etc. than the general terminal 140. The server 110 may be one high-performance computer in a network platform, a cluster of multiple high-performance computers, a portion of one high-performance computer (e.g., a virtual machine), a combination of portions of multiple high-performance computers (e.g., virtual machines), etc.
Gateway 120 is also known as an inter-network connector or protocol converter. A gateway implements network interconnection at the transport layer and is a computer system or device that acts as a translator. It translates between two systems that use different communication protocols, data formats or languages, or even completely different architectures. At the same time, the gateway may also provide filtering and security functions. A message sent by the terminal 140 to the server 110 is routed to the corresponding server 110 through the gateway 120. A message sent by the server 110 to the terminal 140 is likewise routed to the corresponding terminal 140 through the gateway 120.
The object class identification method of the embodiment of the present disclosure may be implemented entirely in the terminal 140; may be implemented entirely on the server 110; or may be implemented in part at the terminal 140 and in part at the server 110.
In the case where the object class identification method is implemented entirely in the terminal 140, the terminal 140 obtains a name text of the target object, behavior statistical data of the target object, and category label texts corresponding to multiple classification categories; then, the terminal 140 performs feature extraction on the name text based on the first neural network model deployed in the terminal 140 to obtain name text features; performs feature extraction on the behavior statistical data based on a second neural network model deployed in the terminal 140 to obtain behavior statistical features; and performs feature extraction on the category label texts based on a third neural network model deployed in the terminal 140 to obtain category label text features, wherein the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data; the terminal 140 then calculates the classification probability of the target object corresponding to each classification category according to the name text features, the behavior statistical features and the category label text features, and determines a target object class of the target object based on the classification probabilities. The process of training the first neural network model, the second neural network model, and the third neural network model may also be implemented in the terminal 140. The three neural network models may be understood as three modules of the object class recognition model, and they are trained simultaneously during the training of the object class recognition model.
In the case that the object class identification method is completely implemented by the server 110, the name text of the target object, the behavior statistical data of the target object and class label texts corresponding to a plurality of classification classes may be acquired by the server 110; then, the server 110 performs feature extraction on the name text based on the first neural network model deployed in the server 110 to obtain name text features; feature extraction is performed on the behavior statistical data based on a second neural network model deployed in the server 110, so as to obtain behavior statistical features; extracting characteristics of the category label text based on a third neural network model deployed in the server 110 to obtain characteristics of the category label text, wherein the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data; calculating the classification probability corresponding to each classification category of the target object according to the name text features, the behavior statistical features and the category label text features; a target object class of the target object is determined based on the classification probability. The process of training the first neural network model, the second neural network model, and the third neural network model may also be implemented in the server 110.
In the case where a part of the object class identification method is implemented in the terminal 140 and another part is implemented in the server 110, the terminal 140 may acquire the name text of the target object, the behavior statistics of the target object, and class label texts corresponding to the multiple classification classes; then, the terminal 140 transmits the acquired name text of the target object, the behavior statistical data of the target object, and the category label text corresponding to the plurality of classification categories to the server 110 for object category recognition. After receiving the name text of the target object, the behavior statistical data of the target object, and the category label text corresponding to the plurality of classification categories sent by the terminal 140, the server 110 may perform feature extraction on the name text based on the first neural network model deployed in the server 110, to obtain a name text feature; feature extraction is performed on the behavior statistical data based on a second neural network model deployed in the server 110, so as to obtain behavior statistical features; extracting characteristics of the category label text based on a third neural network model deployed in the server 110 to obtain characteristics of the category label text, wherein the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data; then the server 110 calculates the classification probability corresponding to each classification category of the target object according to the name text features, the behavior statistical features and the category label text features; a target object class of the target object is determined based on the classification probability. Further, the server 110 may transmit the target object category of the identified target object to the terminal 140. Wherein the process of training the first, second, and third neural network models may be implemented in the server 110. Or, the training process of the first neural network model, the second neural network model and the third neural network model is implemented in the server 110, and then the trained models are deployed in the terminal 140 to predict, and the terminal 140 can directly determine the target object category of the target object in the terminal 140 after acquiring the data of the name text of the target object, the behavior statistical data of the target object and the category label texts corresponding to the classification categories.
The embodiment of the disclosure can be applied to various scenes, such as a scene of identifying business operations of merchants according to mobile payment data, a scene of classifying online merchants by an internet platform, and the like.
(I) Scenario of identifying the business industries of offline merchants based on mobile payment data
In recent years, with the continuous development of internet technology and mobile terminal technology, the payment scenarios of daily consumption have also changed: offline payment has gradually shifted from cash payment to mobile payment. Mobile payment refers to paying by using a mobile terminal to scan a two-dimensional code provided by a merchant; specifically, a user can use the application program of a bank installed on the mobile terminal, or an application program of an enterprise holding a payment license issued by the relevant authorities, to scan the two-dimensional code provided by the merchant, so that money is transferred from the user's account to the account of the corresponding merchant, completing the payment. Mobile payment greatly improves the security of payment scenarios, avoiding the losses to both transaction parties caused by the defects of physical currency, such as being easy to counterfeit and easy to lose. The online banking applications of the banks and the payment applications provided by licensed enterprises incur corresponding costs for backend operation and maintenance, while merchants using mobile payment gain higher payment efficiency and lower transaction risk, so a platform providing the mobile payment function may charge merchants a certain fee to maintain backend operation. In some scenarios, a platform providing a mobile payment function may reduce the fee rates of merchants in certain industries it supports, and in such scenarios the business industries of different merchants need to be identified in advance. For example, when a merchant belongs to an industry of an educational, medical or other public-welfare nature, such as a school or a hospital, the fee rate may be reduced or waived for the merchant. Therefore, merchants belonging to these industries can be identified through the object category identification method provided by the disclosure, and targeted fee-rate reduction or exemption can then be applied to them.
When identifying the business industries of different merchants, the name text of a merchant, the behavior statistical data of the merchant and the industry label texts of a plurality of industries can be acquired. Then, a first neural network model is adopted to extract features from the name text of the merchant to obtain name text features; a second neural network model is adopted to extract features from the behavior statistical data of the merchant to obtain behavior statistical features; and a third neural network model is adopted to extract features from the industry label texts to obtain industry label text features. The first neural network model, the second neural network model and the third neural network model are obtained by simultaneous training based on the same batch of training sample data. The classification probability of the merchant corresponding to each industry is then calculated according to the name text features, the behavior statistical features and the industry label text features, and the industry corresponding to the merchant is determined according to these classification probabilities. In this way, the business industry of a merchant can be identified from both its name text and its behavior statistical data, yielding a more accurate identification result.
(II) Scenario of industry classification of online merchants by an internet platform
When a merchant registers on an internet platform, it generally selects its business industry during onboarding, and the internet platform classifies merchants by business industry so that platform users can find the corresponding merchants more efficiently. However, the industries operated by some merchants overlap, and neither merchants nor platform users can know the platform's industry division rules in full detail, so platform users may fail to find the corresponding merchants in the per-industry merchant lists, which degrades the user experience and reduces the transaction volume of the merchants on the platform. Therefore, the internet platform can acquire the name text and behavior statistical data of each merchant together with the industry label texts provided by the platform, and then perform feature extraction on the name text of each merchant based on the first neural network model to obtain name text features; perform feature extraction on the behavior statistical data based on the second neural network model to obtain behavior statistical features; and perform feature extraction on the industry label texts based on the third neural network model to obtain industry label text features. The first neural network model, the second neural network model and the third neural network model are obtained by simultaneous training based on the same batch of training sample data. Further, the classification probability of each merchant corresponding to each industry can be calculated according to the name text features, the behavior statistical features and the industry label text features. One or more industries corresponding to each merchant may then be determined based on these classification probabilities, and each merchant may be presented under its one or more industry classifications. This improves the accuracy of identifying the business industries of merchants, and can also improve the platform users' experience and the merchants' transaction volume.
An overview of the embodiments of the present disclosure follows. According to one embodiment of the present disclosure, an object class identification method is provided. The method can be used in the scenario of identifying merchants' business industries based on mobile payment data or in the scenario of industry classification of online merchants by an internet platform. Of course, the method can also be used in other object category recognition scenarios.
Fig. 2 is a schematic flow chart of an object class identification method provided in the present disclosure. The method may be applied to an object class identification device, which may be integrated in a computer device, which may be a terminal or a server. The object class identification method may include:
step 210, acquiring name text of the target object, behavior statistical data of the target object and category label text corresponding to a plurality of classification categories.
In the embodiments of the present disclosure, the object category recognition method provided by the present disclosure is described by taking as an example its application in the scenario of identifying merchants' business industries based on mobile payment data.
In the scenario of identifying a business industry according to mobile payment data, the target object may be a business that needs business industry identification. In other object class identification scenarios, the target object may be other corresponding thing. For example, in a scenario in which a user population is identified by a category, the target object may be a user to be identified by a category, or the like. The acquiring of the name text of the target object may specifically be acquiring the name text of the merchant to be identified by the business industry, for example, XX big pharmacy chain company, XXX stationery wholesale supermarket, XXXX grocery shop, etc. In general, the name of the merchant may include a text indicating the business industry of the merchant, for example, "big pharmacy" in XX big pharmacy chain limited may indicate that the business industry of the merchant is a pharmaceutical industry, and "stationery" in XXX stationery wholesale supermarkets may indicate that the business industry of the merchant is a stationery industry. Therefore, in some schemes in the related art, the business industry of the merchant can be identified directly through the name text of the merchant, and particularly, the business industry of the merchant can be identified by adopting a large-scale pre-training language model based on the name text of the merchant.
When a large-scale pre-trained language model is adopted to identify the industry operated by a merchant based on the merchant's name text, the model commonly adopted may be a BERT model, a RoBERTa model or an ALBERT model. The BERT model is described in detail above and is not repeated here. The full name of the RoBERTa model is A Robustly Optimized BERT; that is, RoBERTa is an improved version of the BERT model with the following improvements over BERT: a larger number of model parameters, a larger training batch size, and more training data. Thus, the RoBERTa model can be understood as a more finely tuned version of the BERT model. The ALBERT model is also a modified version of the BERT model; it is a pre-trained small model with fewer model parameters and better model results. When the large-scale pre-trained language model is adopted to identify the industry operated by a merchant, it can be used to process the name text of the merchant to obtain the output name text features; meanwhile, it can be used to process the industry label texts of multiple industries to obtain the output industry label text features. Then, the similarity between the name text features and the industry label text features is calculated, and the industry operated by the merchant is identified according to the similarity. Specifically, a similarity threshold may be set, and the industries whose industry label text features have a similarity to the name text features greater than the threshold may be identified as the industries corresponding to the merchant.
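The similarity-threshold matching described above can be sketched as follows, assuming PyTorch and cosine similarity; the threshold value is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def match_industries(name_feat: torch.Tensor,
                     industry_feats: torch.Tensor,
                     threshold: float = 0.8) -> list[int]:
    # Cosine similarity between the name text feature (d,) and each industry
    # label text feature (C, d); indices above the threshold are the merchant's industries.
    sims = F.cosine_similarity(name_feat.unsqueeze(0), industry_feats, dim=-1)
    return torch.nonzero(sims > threshold).flatten().tolist()
```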
In the embodiment of the disclosure, in addition to acquiring the name text of the target object, the behavior statistical data of the target object may be further acquired. The behavior statistical data of a merchant may specifically include the transaction data of the merchant, such as the transaction amount, the average amount per transaction, or the average spend per customer; in addition, the behavior statistical data may further include information such as the category, nature and form of the merchant as determined from the merchant's transaction behaviors. The category information is different from the business industry of the merchant; it can be identified directly from the merchant's transaction data. For example, when the merchant serves a specific group, say only female users, as with a women's beauty salon, the merchant's category can be determined to be a merchant serving women. The merchant nature may refer to the form of the merchant's service clients; for example, whether the merchant serves business users (wholesale), individual users (retail) or both may be determined from the merchant's transaction behaviors as the merchant nature. The merchant form may include online merchant, offline merchant, etc.
The behavior statistical data of an object, i.e., the transaction data of a merchant, can also reveal the industry operated by the merchant to a certain extent. Accordingly, in the related art, the industry operated by a merchant may be identified using a tree model based on the merchant's transaction data. A tree model is a decision tree, mathematically described as a piecewise function. The learning process of a tree model is simply to learn a good tree structure from supervised data: features and feature thresholds are selected in turn to determine the internal split nodes, and the leaf nodes finally output numerical values or class results. Specifically, XGBoost may be employed to process the transaction data of a merchant to identify the industry the merchant operates.
Since a large-scale pre-trained language model is generally used to process text features, while a tree model is generally used to process numerical features, the related art cannot fuse the name text and the behavior statistical data of an object to perform category identification. The accuracy of category recognition based on object data of a single dimension is not high. The present disclosure therefore provides a method that fuses the name text and the behavior statistical data of an object within a single end-to-end model for object category recognition; this end-to-end model is referred to as the object category recognition model and is described in detail below.
While acquiring the name text and behavior statistical data of the target object to be identified, the category label texts corresponding to a plurality of classification categories can also be acquired. The plurality of classification categories may be preset, and identifying the target object is the process of determining one or more categories corresponding to the target object from these preset categories. In the scenario of identifying merchants' business industries, the plurality of classification categories are a plurality of industries, and the corresponding category label texts are the industry label texts of those industries. The category label texts corresponding to the plurality of classification categories may be written manually or generated automatically by the computer device, or generated automatically by the computer device and then manually fine-tuned. Automatic generation followed by manual adjustment ensures that the plurality of classification categories cover the objects to be classified finely and accurately. After the object category recognition model provided by the disclosure is trained, the acquired name text, behavior statistical data and category label texts of the target object can be input into the trained model for recognition, yielding the category recognition result of the target object.
The object class recognition model provided by the present disclosure needs to be trained before the object class recognition model is used to recognize the class of the target object. First, a basic structure of the object class identification model provided in the present disclosure may be described. As shown in fig. 3, a schematic structure diagram of an object class identification model provided in the present disclosure is shown. As shown, the object class recognition model 300 provided by the present disclosure is mainly composed of three neural network models, specifically including a first neural network model 310, a second neural network model 320, and a third neural network model 330. The first neural network model 310 is used for processing the name text to obtain the name text characteristics; the second neural network model 320 is used for processing the behavior statistical data to obtain behavior statistical features; the third neural network model 330 is used for processing the category label text to obtain the characteristics of the category label text. In addition, the object class recognition model 300 further includes a feature fusion network 340, configured to perform feature fusion on the name text feature output by the first neural network model 310 and the behavior statistical feature output by the second neural network model 320, and output the fused feature. The object class recognition model 300 further includes a feature calculation network 350 for performing dot product calculation on the fusion feature and the class label text feature to obtain a calculation result, and outputting an object recognition result according to the calculation result.
As can be seen from fig. 3, the main structures of the object class identification model are the first neural network model 310, the second neural network model 320, and the third neural network model 330. The first neural network model 310 and the third neural network model 330 may specifically be the aforementioned large-scale pre-trained language models; that is, they may be neural network models with the same structure but without shared parameters, and during the training stage of the object class identification model the two models adjust their respective parameters independently. Alternatively, the first neural network model and the third neural network model may adopt large-scale pre-trained language models of different structures. The second neural network model 320 may be a numerical processing model, in particular a feature tokenizer transformer (FT-Transformer) model. A feature embedding transformation model generally comprises a feature embedding module and a multi-layer transform module: the feature embedding module lifts the dimension of the original input, and the multi-layer transform module transforms the dimension-lifted features to obtain the output features.
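A minimal sketch of such a feature embedding transformation model, assuming PyTorch: the feature embedding module (the tokenizer sketched earlier) lifts the input, and a multi-layer transform module with a [CLS]-style readout token produces the behavior statistical feature. Layer and head counts are assumptions, and the token dimension d is assumed divisible by the head count.

```python
import torch
import torch.nn as nn

class BehaviorEncoder(nn.Module):
    """Feature embedding module followed by a multi-layer transform module."""
    def __init__(self, tokenizer: nn.Module, d: int, n_layers: int = 3):
        super().__init__()
        self.tokenizer = tokenizer                 # e.g. the FeatureTokenizer sketched above
        self.cls = nn.Parameter(torch.randn(1, d)) # readout token
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.transform = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, numeric: torch.Tensor, categorical: torch.Tensor) -> torch.Tensor:
        tokens = self.tokenizer(numeric, categorical)   # (n_features, d)
        tokens = torch.cat([self.cls, tokens], dim=0)   # prepend the readout token
        out = self.transform(tokens.unsqueeze(0))       # (1, n_features + 1, d)
        return out[0, 0]                                # behavior statistical feature (d,)
```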
Based on the above-described model structure of the object class identification model, a process of training the object class identification model provided by the present disclosure is further described below. As shown in fig. 4, a training flow diagram of the object class identification model provided in the present disclosure is shown. The training process of the object class identification model specifically may include the following steps:
In step 410, training sample data is obtained.
First, training sample data for training the object class recognition model provided by the present disclosure needs to be acquired. The training sample data specifically may include the sample name text of the sample object, the sample behavior statistical data of the sample object, the sample category label of the sample object, and the category label texts corresponding to the plurality of classification categories. It is understood that the training sample data obtained here is one of a plurality of training sample data used for training the object class recognition model. The sample object may be a merchant, and the sample name text of the sample object may be a text formed by the merchant name and the name of the company to which the merchant belongs. The sample behavior statistical data of the sample object may be statistics of the transaction data of the merchant over a period of time, including the transaction amount, the average amount per transaction and the average amount per customer, as well as merchant information determined from the transaction information, such as the merchant category, the merchant nature and the merchant form. The sample category label of the sample object may specifically be the manually labeled industry operated by the merchant, which may be one industry or several industries. The category label texts corresponding to the plurality of classification categories may be the same label texts as those used at inference time with the trained object class recognition model. The plurality of classification categories may be created manually, created automatically by a computer device, or created automatically by a computer device and then manually corrected.
Step 420, extracting features of the sample name text based on the first neural network model to obtain sample name text features.
After the training sample data is obtained, the sample name text of the sample object in the training sample data, the sample behavior statistical data of the sample object and the class label text corresponding to the multiple classification classes can be input into the object class identification model to be trained.
The object type recognition model 300 mainly comprises a first neural network model 310, a second neural network model 320, and a third neural network model 330, that is, the training process of the object type recognition model 300 is a process of adjusting parameters of the first neural network model 310, the second neural network model 320, and the third neural network model 330. Thus, the sample name text of the sample object, the sample behavior statistical data of the sample object and the class label text corresponding to the plurality of classification classes in the obtained training sample data are input into the object class identification model to be trained, specifically, the sample name text of the sample object is input into the first neural network model, the sample behavior statistical data of the sample object is input into the second neural network model, and the class label text corresponding to the plurality of classification classes is input into the third neural network model.
The sample name text of the sample object is input into the first neural network model; that is, the first neural network model is adopted to extract features from the sample name text to obtain the sample name text features. Specifically, before the feature extraction is performed by the first neural network model, text processing may be performed on the sample name text so that it becomes a text of a specific length according to a preset text length, and then the first neural network model performs feature extraction on the processed text to obtain the sample name text features.
Step 430, extracting features of the sample behavior statistical data based on the second neural network model to obtain sample behavior statistical features.
The sample behavior statistical data of the sample object is input into the second neural network model; that is, features are extracted from the sample behavior statistical data based on the second neural network model to obtain the sample behavior statistical features. In an embodiment of the present disclosure, the second neural network model may specifically be the aforementioned FT-Transformer model. Before the feature extraction is performed by the second neural network model, the sample behavior statistical data may first be processed: numerical data is extracted from the sample behavior statistical data as a first input, and non-numerical data in the sample behavior statistical data is converted by a numerical conversion method into corresponding numerical data as a second input.
Then, the first input and the second input may be respectively fed into the feature embedding module of the second neural network model for dimension-raising processing, and the features obtained by the dimension-raising processing may be stacked to obtain stacked features. Further, the stacked features may be input into the multi-layer Transformer module of the second neural network model for transformation processing, yielding the output sample behavior statistical features.
Step 440, carrying out feature fusion on the sample name text features and the sample behavior statistical features to obtain sample fusion features.
In the embodiment of the present disclosure, feature extraction is performed on the sample name text and the sample behavior statistical data of the sample object based on the first neural network model and the second neural network model respectively. After the sample name text features and the sample behavior statistical features are obtained, the two extracted features can be further fused to obtain the sample fusion features.
Specifically, the sample name text feature and the sample behavior statistical feature may be input into the feature fusion network 340 in fig. 3 to perform feature fusion, so as to obtain an output sample fusion feature.
Step 450, extracting the features of the category label texts based on the third neural network model to obtain the sample label text features.
Similarly, the category label texts corresponding to the plurality of classification categories are input into the third neural network model; that is, features are extracted from the category label texts based on the third neural network model to obtain the sample label text features. The third neural network model may be a large-scale pre-trained language model. The process of extracting the features of the category label texts based on the third neural network model is consistent with the process of extracting the features of the sample name text based on the first neural network model, and is not repeated here.
Step 460, calculating a loss value based on the sample fusion feature, the sample tag text feature and the sample class tag, and adjusting parameters of the first neural network model, the second neural network model and the third neural network model based on the loss value.
After the sample fusion feature and the sample label text features are obtained, dot product calculation can be carried out on them to obtain a dot product result. The dot product result indicates the classification probability of the sample object for each classification category, so the loss value can be calculated from the dot product result and the sample category label corresponding to the sample object. The sample object may have one or several sample category labels; the output labels of the categories corresponding to the sample category labels among the plurality of classification categories are 1, and the output labels of the remaining categories are 0. The constraint objective of the loss function is to drive the classification probability of the sample object for each classification category as close as possible to the output label of that category.
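A hedged sketch of this loss computation follows; binary cross-entropy is an assumption here, since the disclosure only states the constraint objective without naming a specific loss function.

import torch
import torch.nn.functional as F

def classification_loss(sample_fused, label_feats, target_multi_hot):
    # sample_fused: (B, d); label_feats: (C, d); target_multi_hot: (B, C)
    # with 1 at each category matching a sample category label, 0 elsewhere
    logits = sample_fused @ label_feats.T   # dot product result, (B, C)
    return F.binary_cross_entropy_with_logits(logits, target_multi_hot)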
After the loss value is calculated based on the designed loss function, the parameters of the first neural network model, the second neural network model and the third neural network model can be updated based on the loss value. Specifically, gradients may be computed from the loss value by backpropagation, and the parameters of the three models updated according to these gradients.
Step 470, the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model, and the third neural network model based on the acquired training sample data are circularly performed until the first neural network model, the second neural network model, and the third neural network model converge.
After one round of updating the parameters of the first neural network model, the second neural network model and the third neural network model, training sample data can be obtained again, that is, another training sample is taken from the plurality of training sample data, and the parameters of the three models are adjusted again according to the sample name text, sample behavior statistical data and sample category label in the newly obtained training sample data. This is repeated until the object class recognition model converges. Specifically, the criterion for convergence of the object class recognition model may be that the number of training iterations reaches a preset number, or that the parameter change of the three neural network models is smaller than a preset range.
In some embodiments, calculating the loss value based on the sample fusion feature, the sample tag text feature, and the sample class tag, and adjusting parameters of the first, second, and third neural network models based on the loss value, includes:
performing dimension adjustment on the sample fusion characteristics based on the fourth neural network model to obtain target sample fusion characteristics;
dot product calculation is carried out on the fusion characteristics of the target sample and the text characteristics of the sample label, so that a sample classification result is obtained;
calculating a loss value based on the sample classification result and the sample class label, and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the loss value;
the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model and the third neural network model based on the acquired training sample data are circularly executed until the first neural network model, the second neural network model and the third neural network model converge, including:
and circularly executing the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the acquired training sample data until the first neural network model, the second neural network model, the third neural network model and the fourth neural network model converge.
In the embodiment of the present disclosure, the loss value is calculated by first computing the dot product result between the fusion feature and the sample label text features, and then computing the loss value based on the dot product result and the sample category label. Before the dot product result is computed, dimension adjustment may be performed on the fusion feature through a fourth neural network model to obtain a target sample fusion feature whose dimension is consistent with the sample label text features. The fourth neural network model may specifically be a multi-layer perceptron (Multilayer Perceptron, MLP). Dot product calculation is then performed on the target sample fusion feature and the sample label text features to obtain the sample classification result. Further, a loss value may be calculated based on the sample classification result and the sample category label, and the parameters of the first three neural network models and the fourth neural network model may be updated based on the loss value. In this embodiment, the model structure of the object class recognition model also changes accordingly, as shown in fig. 5, which is another schematic structural diagram of the object class recognition model provided in the present disclosure. As shown, the object class recognition model 300 may further include a fourth neural network model 360, which may specifically be a multi-layer perceptron and is configured to perform feature processing on the fusion feature so that its dimension matches the category label text features, enabling the dot product calculation between them.
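The fourth neural network model can be sketched as below; the layer sizes are illustrative assumptions, and only the output dimension, which must equal the dimension of the category label text features, is dictated by the description above.

import torch.nn as nn

fused_dim, hidden_dim, label_feat_dim = 1536, 512, 768  # illustrative sizes

fourth_model = nn.Sequential(                # a multi-layer perceptron (MLP)
    nn.Linear(fused_dim, hidden_dim),
    nn.ReLU(),
    nn.Linear(hidden_dim, label_feat_dim),   # match the label feature dimension
)
# target_fused = fourth_model(fused)  ->  logits = target_fused @ label_feats.T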
The training of the object class recognition model may be performed before step 210 or before step 220. After the training of the object type recognition model is completed, the object type recognition model can be deployed on line to perform object type recognition. When the category identification needs to be performed on the target object, the name text and the behavior statistical data of the target object can be obtained as in the previous steps, and then the category of the target object is identified based on the name text of the target object, the behavior statistical data of the target object and the category label texts corresponding to the plurality of classification categories.
Step 220, extracting features of the name text based on the first neural network model to obtain the name text features.
After training of the object class recognition model is completed and the object class recognition model is deployed on line, object class recognition can be performed based on the trained object class recognition model in the disclosure. Specifically, in the scenario of identifying the business of a merchant, the object class identification model may also be employed to identify the business of a merchant.
Specifically, when the trained object class recognition model is adopted to perform class recognition on the target object, feature extraction can be performed on the name text of the target object based on the first neural network model in the trained object class recognition model, so that the name text feature of the target object is obtained. That is, when the trained object type recognition model is adopted to recognize the business operated by the target business, the first neural network model in the object type recognition model can be adopted to extract the characteristics of the business name text of the target business, and the corresponding name text characteristics can be obtained.
In some embodiments, feature extraction is performed on the name text based on the first neural network model to obtain name text features, including:
performing text processing on the name text according to the preset text length to obtain a target text;
and extracting the characteristics of the target text based on the first neural network model to obtain the characteristics of the name text.
In an embodiment of the present disclosure, the first neural network model may be a large-scale pre-trained language model, and in particular may be a BERT model. Before feature extraction is performed on the name text of the target object by the BERT model, text processing can be performed on the name text according to a preset text length to obtain the target text. The preset text length can be set manually or determined according to the input requirement of the BERT model. When the target object is a merchant, the name text of the target object may be, for example, "XX Big Pharmacy XXXX Pharmacy Chain Co., Ltd."; with a preset text length of, say, 15 characters, the name text is processed (truncated or padded as needed) into a target text of the preset text length. Then, feature extraction is performed on the processed target text based on the BERT model to obtain the name text features.
BERT is a pre-trained language representation model. Its key point is that pre-training is no longer performed with a traditional unidirectional language model, or by shallowly concatenating two unidirectional language models, but with a masked language model (MLM), so that deep bidirectional language representations can be generated. Earlier pre-trained model structures were limited by unidirectional language models (left-to-right or right-to-left), which restricted the representation capability of the model to unidirectional context information. BERT is pre-trained with MLM and built entirely from deep bidirectional Transformers (a unidirectional Transformer is commonly referred to as a Transformer decoder, where each token attends only to the tokens to its left; a bidirectional Transformer is referred to as a Transformer encoder, where each token attends to all tokens), so it ultimately yields deep bidirectional language representations that fuse left and right context information. The main structure of BERT is in fact a multi-layer Transformer stack. As shown in fig. 6, a schematic diagram of feature extraction on the name text of the target object using the BERT model is provided. As shown, the target name text obtained after data processing may be segmented into N tokens, and a classification token (cls token) is added before the sequence of N tokens. Word embedding is then performed on the N+1 tokens to obtain an embedding for each token, and the N+1 embeddings (i.e., the classification token and tokens 1 through N in the figure) are input into the BERT model 600. After inference, the BERT model 600 outputs N+1 output features (i.e., the classification feature and text features 1 through N in fig. 6). Since the output of the last Transformer layer at the position of the classification token aggregates the representation information of the whole sequence, that is, the classification feature output by the BERT model 600 aggregates the overall information of the target name text, in the embodiment of the present disclosure the classification feature output by the BERT model 600 may be used as the extracted name text feature.
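As one possible realization of this cls-feature extraction (an assumption; the disclosure does not name a specific library), the open-source Hugging Face transformers package can be used as follows, with the merchant name purely illustrative:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("XX Pharmacy Chain Co., Ltd.", return_tensors="pt",
                   padding="max_length", truncation=True, max_length=15)
with torch.no_grad():
    outputs = model(**inputs)
# position 0 corresponds to the [CLS] token, whose last-layer output
# aggregates the whole-sequence representation (the name text feature)
name_text_feature = outputs.last_hidden_state[:, 0]   # shape (1, 768)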
In some embodiments, text processing is performed on the name text according to a preset text length to obtain a target text, including:
when the text length of the name text is smaller than the preset text length, performing text expansion on the name text based on the preset text length to obtain a target text;
when the text length of the name text is greater than or equal to the preset text length, text cutting is carried out on the name text according to the preset text length, and a target text is obtained.
In the embodiment of the present disclosure, when the name text is processed based on the preset text length, the relation between the text length of the name text and the preset text length may first be determined. When the two are equal, no text processing is needed, and the first neural network model can directly perform feature extraction on the name text, which improves the efficiency of object class identification. When the text length of the name text is smaller than the preset text length, the name text needs to be expanded so that the expanded target text has the preset text length; specifically, the character "0" can be appended at the end of the name text until its length reaches the preset text length. When the text length of the name text is detected to be larger than the preset text length, the name text of the target object can be truncated according to the preset text length to obtain the target text. Specifically, the first preset-text-length characters of the name text may be kept as the target text, or the last preset-text-length characters may be kept as the target text; alternatively, modal particles, auxiliary words and common words with no discriminative value in the name text can be identified and deleted until the length of the remaining text reaches the preset text length.
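A minimal sketch of this padding/truncation rule (keeping the first preset-text-length characters, one of the truncation options mentioned above):

def to_target_text(name_text: str, preset_len: int) -> str:
    if len(name_text) < preset_len:
        # text expansion: append the character "0" up to the preset length
        return name_text + "0" * (preset_len - len(name_text))
    # text truncation: keep the first preset_len characters
    return name_text[:preset_len]

assert to_target_text("XX Pharmacy", 15) == "XX Pharmacy0000"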
In some embodiments, when the text length of the name text of the target object is greater than the preset text length, the name text needs to be truncated, that is, some words in the name text are deleted, which may discard some information in the name text and make the extracted name text features less accurate. Therefore, when the preset text length is determined manually, a longer preset text length can be set, so as to avoid, as far as possible, the information loss caused by truncating name texts longer than the preset text length. Of course, the preset text length cannot be too large either; otherwise too much invalid padding is introduced, the focus of feature extraction drifts, and the accuracy of the name text feature extraction suffers. In this regard, the name texts of a large number of objects (e.g., multiple merchants) within a certain range may be acquired, text length statistics may be computed over them, and a suitable preset text length determined from the statistics.
Step 230, extracting features of the behavior statistical data based on the second neural network model to obtain the behavior statistical features.
When the class identification is performed on the target object based on the trained object class recognition model, feature extraction is further performed on the acquired behavior statistical data of the target object. The behavior statistical data of the target object comprises numerical data recording the behavior of the target object and attribute data reflecting the target object according to its behavior. For example, when the target object is a merchant, the behavior statistical data may include numerical data recording the transaction behavior of the merchant, such as the transaction amount, the average amount per transaction and the average amount per customer, and may also include attribute data of the merchant reflected by its transaction behavior, such as the merchant category, the merchant nature and the merchant form. Attribute data may also be understood as category data that classifies merchants, where these categories differ from the business industry; for example, whether the merchant is an online or offline merchant, or a consumer-oriented or business-oriented merchant. For the numerical data and the attribute data of the target object, in the embodiment of the present disclosure a second neural network model may be used to extract features from these data to obtain the behavior statistical features. The second neural network model may be the feature embedding transformation model, which comprises a feature embedding layer and a plurality of Transformer layers: the feature embedding layer performs dimension-raising processing on the numerical data and the attribute data, and the Transformer layers transform the dimension-raised features to obtain the output behavior statistical features.
That is, in some embodiments, the second neural network model includes a first sub-model and a second sub-model, and feature extraction is performed on the behavior statistical data based on the second neural network model to obtain the behavior statistical feature, including:
performing dimension-lifting processing on the input features corresponding to the behavior statistical data based on the first sub-model to obtain high-dimension features;
and carrying out feature transformation on the high-dimensional features based on the second sub-model to obtain behavior statistical features.
The first sub-model included in the second neural network model may be understood as the aforementioned feature embedding module, and the second sub-model may be understood as the aforementioned multi-layer Transformer module. As shown in fig. 7, a schematic structural diagram of the second neural network model provided in the present disclosure is given. As shown, the second neural network model 320 includes a first sub-model 710 and a second sub-model 720. When the second neural network model 320 is used to extract features from the behavior statistical data of the target object, the behavior statistical data 701 is input into the first sub-model 710 for dimension-raising processing to obtain a high-dimensional feature 702. Then, a feature corresponding to the cls token is spliced onto the obtained high-dimensional feature 702 to obtain a spliced input feature 703; the spliced input feature 703 is input into the second sub-model 720 for transformation processing to obtain an output feature 704, and the feature of the output feature 704 in the dimension corresponding to the cls token can be determined as the behavior statistical feature output by the second neural network model.
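Under the structure of fig. 7, the second neural network model can be sketched as follows; the hyperparameters are illustrative assumptions, and the dimension-raising step here covers the numerical case (attribute data would be numerically converted first, as described below).

import torch
import torch.nn as nn

class FTTransformerSketch(nn.Module):
    def __init__(self, num_features, d_model=64, n_layers=3, n_heads=4):
        super().__init__()
        # first sub-model: one weight vector and one bias vector per feature
        self.weights = nn.Parameter(torch.randn(num_features, d_model))
        self.biases = nn.Parameter(torch.zeros(num_features, d_model))
        self.cls_token = nn.Parameter(torch.randn(1, 1, d_model))
        # second sub-model: a multi-layer Transformer encoder
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           norm_first=True, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, stats):                 # stats: (B, num_features)
        tokens = stats.unsqueeze(-1) * self.weights + self.biases  # (B, k, d)
        cls = self.cls_token.expand(stats.size(0), -1, -1)
        out = self.encoder(torch.cat([cls, tokens], dim=1))
        return out[:, 0]   # output at the cls position = behavior statistical feature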
In some embodiments, performing an up-scaling process on the input feature corresponding to the behavior statistical data based on the first sub-model to obtain a high-dimensional feature, including:
acquiring numerical data in the behavior statistical data, and carrying out dimension lifting processing on the numerical data based on a first weight vector and a first bias vector to obtain a numerical vector;
acquiring attribute data in the behavior statistical data, and carrying out dimension lifting processing on input features corresponding to the attribute data based on a second weight vector and a second bias vector to obtain an attribute vector;
stacking the numerical vector and the attribute vector to obtain the high-dimensional feature.
In the embodiment of the present disclosure, a specific process for the first sub-model to raise the dimension of the numerical data and the attribute data in the behavior statistical data may be to first determine the weight vectors and bias vectors corresponding to the numerical data and the attribute data respectively. The numerical data is then dimension-raised based on its corresponding first weight vector and first bias vector to obtain the numerical vector, and the input features corresponding to the attribute data are dimension-raised based on the corresponding second weight vector and second bias vector to obtain the attribute vector. Further, the numerical vector and the attribute vector may be stacked to obtain the high-dimensional feature.

Specifically, the dimension-raising of the numerical data based on the first weight vector and the first bias vector may be to compute the product of the numerical data and its first weight vector and then add the corresponding first bias vector to the product, thereby obtaining the corresponding numerical vector. Similarly, the dimension-raising of the attribute data based on the second weight vector and the second bias vector may be to compute the product of the attribute data and its second weight vector and then add the corresponding second bias vector, thereby obtaining the corresponding attribute vector. There may be one or more numerical vectors and attribute vectors.
FIG. 8 is a schematic diagram of the process of raising the dimension of the behavior statistical data features based on the first sub-model. As shown, for each numerical data 810 in the behavior statistical data, the product of the numerical data 810 and the first weight vector 820 may be computed and then added to the first bias vector 830. The first weight vector 820 and the first bias vector 830 may be vectors of dimension d, so that a numerical vector of dimension d is obtained for each numerical data 810. For any attribute data 840, its product with the corresponding second weight vector 850 may be computed and then added to the second bias vector 860 to obtain an attribute vector. The dimensions of the second weight vector 850 and the second bias vector 860 may likewise be d, so the attribute vector computed for each attribute data is a d-dimensional vector. Each numerical data corresponds to one weight vector and one bias vector. For attribute data, the weight vector is determined by the specific category value, and attribute data with different classification scales may correspond to different bias vectors. For example, attribute data of a first classification mode may be divided into two categories, male and female, according to the gender of the customer; this attribute then corresponds to two weight vectors, and the applicable weight vector is selected according to the specific gender. The same bias vector can be used regardless of which weight vector the attribute value selects. For attribute data of another classification mode, the customers may be divided into individual users, enterprise users and mixed users according to the customer form, and this attribute corresponds to three weight vectors and one bias vector. After the numerical vectors and attribute vectors are computed, they can be stacked to obtain a high-dimensional feature of dimension k×d, where k is the total number of numerical vectors and attribute vectors.
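For the attribute (categorical) case in fig. 8, the per-value weight vectors with one shared bias vector can be sketched as follows; the gender field and the dimension d are illustrative assumptions.

import torch
import torch.nn as nn

d = 64
gender_weights = nn.Embedding(2, d)         # two weight vectors: male, female
gender_bias = nn.Parameter(torch.zeros(d))  # one bias vector shared by both

values = torch.tensor([0, 1, 0])            # encoded gender of three samples
attribute_vectors = gender_weights(values) + gender_bias  # (3, d)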
In some embodiments, obtaining attribute data in the behavior statistical data, and performing dimension-lifting processing on input features corresponding to the attribute data based on the second weight vector and the second bias vector to obtain an attribute vector, including:
acquiring attribute data in the behavior statistical data, and performing numerical conversion on the attribute data to obtain converted numerical data;
and carrying out dimension lifting processing on the converted numerical data based on the second weight vector and the second bias vector to obtain an attribute vector.
In the embodiment of the present disclosure, before the dimension-raising processing is performed on the attribute data based on the second weight vector and the second bias vector, the attribute data may be numerically converted to obtain converted numerical data, and the converted numerical data is then dimension-raised based on the second weight vector and the second bias vector. Specifically, the converted numerical data may be a vector of numerical values; converting attribute data into such a vector may be done with the one-hot encoding algorithm, which has been described above and is not repeated here. Converting the attribute data into numerical vectors first and then raising their dimension with the corresponding weight and bias vectors improves the efficiency of the dimension-raising processing on the attribute features, and in turn the efficiency of object class identification.
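A small sketch of this numerical conversion: one-hot encoding the attribute value and multiplying by a weight matrix, which is mathematically equivalent to the embedding lookup shown earlier.

import torch
import torch.nn.functional as F

num_values, d = 3, 64   # e.g., individual / enterprise / mixed users
one_hot = F.one_hot(torch.tensor([2]), num_values).float()  # (1, 3)
W = torch.randn(num_values, d)   # the three second weight vectors, stacked
b = torch.zeros(d)               # the second bias vector
attribute_vector = one_hot @ W + b   # (1, d), selects the third weight row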
As shown in fig. 9, a schematic diagram of the model structure of the second sub-model is given. As disclosed above, the second sub-model may specifically be a multi-layer Transformer structure, and fig. 9 shows a schematic diagram of one Transformer layer. As shown, one Transformer layer 900 includes a first normalization layer 910, a multi-head self-attention layer 920, a second normalization layer 930, and a feed-forward layer 940. The input feature is normalized by the first normalization layer 910, then processed by the multi-head self-attention layer 920, and the output of the multi-head self-attention layer 920 is added to the input feature. The summed feature is input into the second normalization layer 930 for normalization, passed through the feed-forward layer 940, and the output of the feed-forward layer is added to the summed feature to obtain the output feature of the Transformer layer 900. The output feature then serves as the input feature of the next Transformer layer, until the last Transformer layer outputs the final output feature. The feature of the final output feature in the dimension corresponding to the cls token is the behavior statistical feature.
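The single layer of fig. 9 corresponds to a pre-norm Transformer block, sketched below under illustrative hyperparameters:

import torch.nn as nn

class TransformerLayerSketch(nn.Module):
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)   # first normalization layer
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)   # second normalization layer
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                nn.Linear(d_ff, d_model))  # feed-forward layer

    def forward(self, x):                    # x: (B, seq_len, d_model)
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # first residual add
        return x + self.ff(self.norm2(x))                  # second residual add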
In the training stage of the object class recognition model, when the model parameters of the second neural network model are adjusted, the weight vectors and bias vectors in the first sub-model are adjusted, along with the network parameters of the normalization layers, the multi-head self-attention layers and the feed-forward layers in the second sub-model.
And 240, extracting the characteristics of the category label text based on the third neural network model to obtain the characteristics of the category label text.
In the embodiment of the disclosure, when the target object is identified based on the trained object class identification model, the feature extraction may be further performed on the acquired class label text based on a third neural network model in the trained object class identification model. In the business management industry identification scene, the category label text can be specifically an industry label text corresponding to a plurality of industries.
When the feature extraction is performed on the category label text based on the third neural network model, the feature extraction can be performed on each category label text by adopting the third neural network model, so as to obtain the category label text feature corresponding to each category label text. And then stacking a plurality of category label text features corresponding to the category label texts to obtain final category label text features. The dimension of the finally obtained category label text features is consistent with the number of categories; in a scene of identifying industries operated by merchants, the dimension of the finally obtained category label text features is consistent with the number of the industries.
Wherein, as previously disclosed, the third neural network model may specifically be a large-scale pre-trained language model, such as a BERT model. When the third neural network model is the BERT model, a process of extracting features of the class label text of each class based on the third neural network model is consistent with a process of extracting features of the name text of the target object based on the first neural network model, and will not be described herein.
Step 250, calculating the classification probability of the target object corresponding to each classification category according to the name text feature, the behavior statistical feature and the category label text feature.
Because the name text of the target object and its behavior statistical data are two weakly correlated types of data, each relates to the classification category of the target object in a different way. Accordingly, in the related art, the classification category of the target object is identified either based on the name text alone or based on the behavior statistical data alone. When the classification category is identified based on the name text alone, a large-scale pre-trained language model can be adopted to extract features from the name text and from the category label texts of the classification categories, the similarity between the extracted name text feature and each category label text feature is computed, and the classification category of the target object is determined from these similarities. When the classification category is identified based on the behavior statistical data alone, a classification tree may be employed to multi-classify the multi-dimensional behavior statistical data, thereby learning the relationship between the behavior statistical data and the plurality of classification categories.
However, due to the specificity of the data, the two types of data cannot simply be combined in one type of model for object class recognition. Forcibly feeding both types of data into a single model to learn the association between the input data and the classification categories not only fails to improve the class recognition effect but degrades it, because invalid learning data is introduced. Specifically, if the numerical data and attribute data contained in the behavior statistical data, which do not lie in the same dimension as the classification categories, are taken as input of a large-scale pre-trained language model, the features extracted by the model cannot correlate well with the classification category texts, because objects of completely different classification categories may have identical or very similar numerical and attribute data. Similarly, if the name text of the target object is taken as input of a classification tree, the multi-classification processing cannot be performed. As a result, the related art can perform class recognition on the target object using only the name text or only the behavior statistical data.
In the embodiment of the present disclosure, an FT-Transformer model is designed that raises the dimension of the numerical data in the behavior statistical data directly and raises the dimension of the attribute data after numerical conversion, and this model extracts features from the behavior statistical data. The extracted features, together with the name text features extracted by one large-scale pre-trained language model, are then matched against the category label text features extracted by another large-scale pre-trained language model to predict the category of the target object. In this way the name text information and the behavior statistical information of the target object are both fully used, improving the accuracy of object class recognition. Moreover, because the two large-scale pre-trained language models and the FT-Transformer serve as sub-modules of one end-to-end object class recognition model, their parameters are adjusted simultaneously during training, which avoids the error amplification that occurs between separately trained models; that is, the precision of the object class recognition model is improved, and the accuracy of object class recognition is further improved.
Based on the above, in the inference stage of the object class recognition model, after feature extraction is performed on the name text of the target object based on the first neural network model to obtain the name text features, on the behavior statistical data based on the second neural network model to obtain the behavior statistical features, and on the category label texts based on the third neural network model to obtain the category label text features, the classification probability of the target object for each classification category can be calculated from these three kinds of features, and the category of the target object is then identified based on these classification probabilities. In the business industry identification scenario, this means calculating the classification probability of the merchant to be identified for each industry and then determining the industry operated by the merchant according to the classification probabilities.
In some embodiments, calculating the classification probability of the target object corresponding to each classification category according to the name text feature, the behavior statistics feature and the category label text feature comprises:
carrying out fusion processing on the name text features and the behavior statistical features to obtain fusion features;
Calculating dot product results of the fusion features and the category label text features;
and determining the classification probability of the target object corresponding to each classification category according to the dot product result.
In the embodiment of the disclosure, when the classification probability of the target object corresponding to each classification category is calculated according to the name text, the behavior statistical feature and the category label text feature, fusion processing can be performed on the name text feature and the behavior statistical feature to obtain a fusion feature. The fusion feature can also be understood as the characterization information extracted from the target object to be identified by the object identification model, and the category label text feature is the characterization information extracted from the category label by the object identification model. Further, the category of the target object may be identified based on a similarity relationship between the characterization information of the target object and the characterization information of the classification category label. In the embodiment of the disclosure, the classification probability of the target object corresponding to each classification category can be determined by performing dot product calculation on the fusion feature and the category label text feature and then according to the result of the dot product calculation.
The fusion processing of the name text features and the behavior statistical features may specifically be feature concatenation, splicing the two features to obtain the fusion feature; alternatively, feature addition may be adopted, adding the corresponding dimensions of the name text features and the behavior statistical features to obtain a fusion feature with the same dimension as either feature. Computing the dot product of the fusion feature and a category label text feature means multiplying the values at corresponding positions of the two vectors and summing all products to obtain a scalar. Since the category label text features consist of the text features of a plurality of category labels, they can be understood as a category label text matrix; the dot product result of the fusion feature with the category label text features is therefore a vector of scalars. This vector has the same dimension as the number of classification categories, each scalar corresponds to one classification category, and each scalar indicates the probability that the target object belongs to the corresponding category. That is, the dot product result indicates the probability that the target object belongs to each classification category, from which the target object category can be further determined.
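A minimal inference-time sketch of this scoring; the sigmoid at the end is an assumption about how the dot product result is read as per-category probabilities, and the fusion here uses concatenation followed by the dimension conversion described further below.

import torch
import torch.nn as nn

def classify(name_feat, behavior_feat, label_feats, dim_converter: nn.Module):
    fused = torch.cat([name_feat, behavior_feat], dim=-1)  # fusion feature
    target_fused = dim_converter(fused)    # match the label feature dimension
    scores = target_fused @ label_feats.T  # one scalar per classification category
    return torch.sigmoid(scores)           # per-category probabilities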
In some embodiments, the fusion processing is performed on the name text feature and the behavior statistical feature to obtain a fusion feature, including:
respectively acquiring weight coefficients corresponding to the name text features and the behavior statistical features;
and carrying out fusion processing on the name text features and the behavior statistical features based on the weight coefficients to obtain fusion features.
In the training stage of the object class recognition model, a large amount of training sample data is used. For a single sample object the name text and the behavior statistical data may contribute differently to the classification result, but when the sample size is large enough these contributions tend to average out. Therefore, in the trained object class recognition model, the model parameters of the first and second neural network models are adjusted on the premise that the contributions of the name text and the behavior statistical data to the classification result are balanced. In the actual inference process, however, the target object is an individual, and its name text and behavior statistical data do not necessarily contribute equally to the classification result. Hence, to further improve the accuracy of object recognition, in the embodiment of the present disclosure different weight coefficients may be assigned to the name text features and the behavior statistical features to adjust the contribution of the name text and the behavior statistical data of the target object to the classification result.
Specifically, when the name text feature and the behavior statistical feature of the target object are fused, weight coefficients corresponding to the name text feature and the behavior statistical feature of the target object can be acquired respectively, and then fusion processing is performed on the name text feature and the behavior statistical feature based on the weight coefficients acquired respectively, so that the fusion feature is obtained. When the name text features and the behavior statistical features of the target object are fused, the contribution of the name text features and the behavior statistical features to the classification result is regulated by adopting different weight coefficients, so that the object classification result obtained by recognition can be more accurate.
The weight coefficients of the name text features and the behavior statistical features can be determined manually according to the name text and the behavior statistical data, or determined automatically by the computer device. For example, when the name text is detailed and the amount of behavior statistical data is small, a larger weight coefficient may be set for the name text features and a smaller one for the behavior statistical features; conversely, when the name text is simple and the amount of behavior statistical data is large, a smaller weight coefficient may be set for the name text features and a larger one for the behavior statistical features.
In some embodiments, respectively obtaining the weight coefficients corresponding to the name text features and the behavior statistical features includes:
carrying out semantic recognition on the name text to obtain a semantic recognition result;
and determining a weight coefficient corresponding to the name text feature and the behavior statistical feature based on the semantic recognition result.
In an embodiment of the disclosure, a scheme for automatically determining a weight coefficient of a name text and a behavior statistical feature is provided. Specifically, semantic recognition can be performed on the name text to obtain a semantic recognition result. Then, a weight coefficient corresponding to the name text feature and the behavior statistical feature is determined based on the semantic recognition result. For example, after semantic recognition is performed on the name text of the target object to obtain a semantic recognition result, the probability that the name text reveals the classification category of the target object may be determined according to the semantic recognition result. For example, in an industry identification scenario of a merchant, a probability of indicating industry information in a name text of a merchant may be determined according to a semantic identification result of the name text of the merchant. The weighting coefficients for the named text features and the behavioral statistics can then be further calculated based on the probabilities.
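The weighted fusion itself reduces to a few lines; the weights w1 and w2 are assumed here to come from the semantic-recognition step just described, and feature addition requires the two features to share one dimension.

def weighted_fusion(name_feat, behavior_feat, w1: float, w2: float):
    # scale each feature by its weight coefficient, then add element-wise
    return w1 * name_feat + w2 * behavior_feat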
In some embodiments, computing the dot product result of the fusion feature and the category label text feature includes:
Performing dimension conversion on the fusion features based on feature dimensions of the category label text features to obtain target fusion features;
and performing dot product calculation on the target fusion features and the category label text features to obtain dot product results.
As described above, the object class recognition model may further include a fourth neural network model, which may specifically be a multi-layer perceptron, used to adjust the dimension of the sample fusion feature so that it is consistent with the sample label text features, allowing the sample classification result for each object category to be computed by dot product. Therefore, in the embodiment of the present disclosure, when the class recognition is performed on the target object based on the trained object class recognition model, after the fusion feature is obtained by fusing the name text features and the behavior statistical features, the trained fourth neural network model may be adopted to perform dimension conversion on the fusion feature. Specifically, based on the feature dimension of the category label text features, the dimension of the fusion feature is converted to be consistent with that of the category label text features, so as to obtain the target fusion feature.
And then, dot product calculation can be further carried out on the target fusion feature and the category label text feature, and a dot product result is obtained.
Step 260, determining a target object class of the target object based on the classification probability.
After the classification probability of the target object corresponding to each classification category is calculated, the target object category of the target object can be determined according to the classification probability of the target object corresponding to each classification category. Specifically, a preset probability threshold value can be obtained, and when the classification probability of the target object corresponding to a certain classification category is greater than the probability threshold value, the target object can be determined to belong to the classification category; otherwise, when the classification probability of the target object corresponding to the classification category is not greater than the probability threshold, determining that the target object does not belong to the classification category.
The probability threshold may be determined manually or automatically by a computer device according to a statistical result. It is to be understood that, among the plurality of classification categories, there may be only one classification category having a classification probability corresponding to the target object greater than the probability threshold, or there may be a plurality of classification categories having a classification probability corresponding to the target object greater than the probability threshold. When the classification probability of the plurality of classification categories corresponding to the target object is larger than the probability threshold, determining that the target object category corresponding to the target object is a plurality of. For example, in a scenario of business identification of a business of business administration, when there are a plurality of businesses whose classification probabilities corresponding to the business to be identified are greater than a probability threshold, it is determined that the business to be identified belongs to the plurality of businesses.
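A minimal sketch of this threshold rule, under which an object can receive one or several categories; the threshold value and category names are illustrative.

def select_categories(probs, labels, threshold=0.5):
    # keep every category whose classification probability exceeds the threshold
    return [lab for p, lab in zip(probs, labels) if p > threshold]

print(select_categories([0.9, 0.2, 0.7], ["retail", "catering", "pharmacy"]))
# -> ['retail', 'pharmacy']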
In summary, the object category identification method provided by the disclosure is adopted, namely, the name text of the target object, the behavior statistical data of the target object and category label texts corresponding to a plurality of category categories are obtained; extracting features of the name text based on the first neural network model to obtain name text features; feature extraction is carried out on the behavior statistical data based on the second neural network model, so that behavior statistical features are obtained; extracting characteristics of the category label text based on the third neural network model to obtain category label text characteristics, wherein the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data; calculating the classification probability corresponding to each classification category of the target object according to the name text features, the behavior statistical features and the category label text features; a target object class of the target object is determined based on the classification probability. According to the object type identification method provided by the embodiment of the invention, the type identification is carried out on the target object by fully utilizing the data of the two dimensions of the name text of the target object and the behavior statistical data of the target object, and the problem that the type identification cannot be carried out by adopting the name text and the behavior statistical data at the same time due to the limitation of the model function in the related technology is solved, so that the accuracy of the object type identification is improved.
The embodiments of the present disclosure are described below in connection with a specific application scenario. Fig. 10 is another flow chart of the object category identification method provided in the present disclosure. In this embodiment, the object category identification method is described in detail by taking its application to merchant business industry identification as an example. The method specifically includes the following steps:
In step 1001, the computer device constructs a merchant business industry identification model.
The object category identification method provided in the embodiments of the present disclosure may specifically be a method for identifying the business industry of a merchant. The method is based on a merchant business industry identification model that can use the name text and the transaction information of a merchant simultaneously, which improves the accuracy of merchant business industry identification.
First, the computer device needs to construct a merchant business industry identification model that can use the name text and the transaction information of a merchant simultaneously to identify the merchant's business industry. The computer device may be a terminal or a server. The merchant business industry identification model constructed by the computer device may have the structure shown in fig. 5, and may include a first large-scale pre-trained language model, a feature embedding transformation model, a second large-scale pre-trained language model, and a multi-layer perceptron. The first large-scale pre-trained language model is used to perform feature extraction on the merchant's name text to obtain text features. The feature embedding transformation model is used to numericize the merchant's transaction information and then characterize the resulting numerical values to obtain numerical features. The second large-scale pre-trained language model is used to perform text feature extraction on the full set of industry label texts to obtain industry text features corresponding to each industry label text. The multi-layer perceptron is used to map the fusion feature obtained by fusing the text features and the numerical features into a target fusion feature of a target dimension, the target dimension matching the feature dimension of the industry label text features so that the subsequent calculation yields one predicted value per industry label. The merchant business industry identification model also includes a calculation layer for calculating a predicted value from the target fusion feature and the industry text features.
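As a structural illustration, the following PyTorch sketch mirrors the four components described above. The hidden size, the stand-in embedding encoders (in place of the two large-scale pre-trained language models), and the Transformer settings are all assumptions for readability, not the disclosure's exact implementation.

```python
import torch
import torch.nn as nn

class MerchantIndustryModel(nn.Module):
    """Sketch: two text encoders, a feature embedding transformation
    model, an MLP, and a dot-product calculation layer."""

    def __init__(self, d_model: int = 128, vocab_size: int = 30000):
        super().__init__()
        # Stand-ins for the two large-scale pre-trained language models;
        # real pre-trained checkpoints would be loaded here instead.
        self.name_encoder = nn.EmbeddingBag(vocab_size, d_model)
        self.label_encoder = nn.EmbeddingBag(vocab_size, d_model)
        # Feature embedding transformation model: lift each transaction
        # scalar to d_model dimensions, then apply Transformer layers.
        self.lift = nn.Linear(1, d_model)
        self.transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=2)
        # Multi-layer perceptron mapping the fused feature so it can be
        # compared with the industry label text features.
        self.mlp = nn.Sequential(nn.Linear(2 * d_model, d_model), nn.ReLU(),
                                 nn.Linear(d_model, d_model))

    def forward(self, name_ids, txn_values, label_ids):
        # name_ids: (B, L) token ids of merchant name texts
        # txn_values: (B, F) numericized transaction information
        # label_ids: (k, L) token ids of the full set of industry label texts
        name_feat = self.name_encoder(name_ids)                     # (B, D)
        txn_feat = self.transformer(
            self.lift(txn_values.unsqueeze(-1))).mean(dim=1)        # (B, D)
        fused = self.mlp(torch.cat([name_feat, txn_feat], dim=-1))  # (B, D)
        label_feat = self.label_encoder(label_ids)                  # (k, D)
        return fused @ label_feat.T                                 # (B, k) predicted values
```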
In step 1002, the computer device obtains training sample data comprising, for a plurality of sample merchants, the sample merchant name text, the sample merchant transaction information, the industries to which the sample merchants belong, and the full set of industry label texts.
After the merchant business industry identification model is constructed, training sample data can be obtained and the model can be trained on it. The training sample data comprises relevant data of a plurality of sample merchants and the full set of industry label texts. The relevant data of a sample merchant comprises the sample merchant's name text, its transaction information, and the industries to which it belongs. The relevant data of the sample merchants may come from a public training data set, or may be collected from known merchants, where the collection complies with the relevant laws and regulations and is performed with the sample merchants' knowledge and authorization. The industries to which a sample merchant belongs may specifically be manually labeled, and may be one industry or several. The full set of industry label texts can be understood as the texts corresponding to the industry names of all industries obtained under a given classification rule.
In step 1003, the computer device trains the merchant business industry identification model using the training sample data.
After the training sample data is obtained, the computer device can train the constructed merchant business industry identification model on it. Specifically, the sample name text and sample transaction information of a sample merchant, together with the full set of industry label texts, are first obtained and then input into the merchant business industry identification model to be trained. Upon receiving the input data, the model performs feature extraction on the sample name text based on the first large-scale pre-trained language model to obtain sample text features; processes the sample transaction information based on the feature embedding transformation model to obtain sample numerical features; and performs text feature extraction on the full set of industry label texts based on the second large-scale pre-trained language model to obtain industry label text features. The sample text features and the sample numerical features are then fused to obtain sample fusion features, and the sample fusion features are mapped by the multi-layer perceptron to obtain target sample fusion features. A dot product of the target sample fusion features and the industry label text features yields the predicted probability of the sample merchant for each industry. An asymmetric loss can then be calculated from these predicted probabilities and the industries to which the sample merchant actually belongs, and the parameters of the first large-scale pre-trained language model, the feature embedding transformation model, the second large-scale pre-trained language model, and the multi-layer perceptron are adjusted based on the asymmetric loss.
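The disclosure names an asymmetric loss but does not spell out its exact form; the sketch below uses the widely used asymmetric multi-label formulation, where the focusing parameters `gamma_pos`, `gamma_neg` and the probability shift `clip` are assumed defaults, together with a hypothetical training step against the model sketch above.

```python
import torch

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Asymmetric multi-label loss: easy negatives are down-weighted and
    probability-shifted, while positives keep a near-BCE term."""
    p = torch.sigmoid(logits)
    p_neg = (p - clip).clamp(min=0)  # probability shifting for negatives
    loss_pos = targets * (1 - p).pow(gamma_pos) * torch.log(p.clamp(min=1e-8))
    loss_neg = (1 - targets) * p_neg.pow(gamma_neg) * torch.log((1 - p_neg).clamp(min=1e-8))
    return -(loss_pos + loss_neg).mean()

# One hypothetical training step (names follow the earlier model sketch):
# optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# logits = model(name_ids, txn_values, label_ids)    # (B, k) predicted values
# loss = asymmetric_loss(logits, industry_targets)   # targets: (B, k) multi-hot
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```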
Then, relevant data of a new sample merchant is obtained, and a new round of training of the merchant business industry identification model is performed on it. The step of obtaining new sample merchant data and training the model is executed in a loop until the number of training rounds reaches a preset number, or the change in the model parameters is detected to be smaller than a preset range, yielding the trained merchant business industry identification model.
In step 1004, the computer device deploys the trained merchant business industry identification model online.
After training, the computer device can deploy the merchant business industry identification model online. Specifically, the model may be deployed locally or to other computer devices. A model with this structure, trained by the above method, can combine a merchant's name text and transaction information to identify the merchant's business industry more accurately.
In step 1005, in response to receiving a merchant business industry identification task, the computer device obtains the merchant name text and the merchant transaction information of the target merchant to be identified.
After the merchant business industry identification model is deployed online, it can be used to execute merchant business industry identification tasks. When receiving such a task, the computer device can extract from it the merchant name text and the merchant transaction information of the target merchant to be identified. The merchant name text comprises the merchant's name and the texts corresponding to the names of the company and group to which the merchant belongs. The merchant transaction information includes the merchant's transaction data, such as numerical information like transaction amount, average amount per transaction, and average amount per customer, as well as non-numerical information such as merchant category, merchant nature, and merchant form determined from the merchant's transaction behavior; the detailed meanings of this information are described above and are not repeated here.
In step 1006, the computer device performs text processing on the merchant name text based on the preset text length, and uses the first large-scale pre-trained language model to perform feature extraction on the processed merchant name text to obtain name text features.
In the process of executing the merchant business industry identification task, after the merchant name text of the target merchant is acquired, text processing can be performed on it based on the preset text length. When the merchant name text is longer than the preset text length, it can be truncated to obtain a processed merchant name text whose length matches the preset text length. Conversely, when the merchant name text is shorter than the preset text length, it can be expanded, for example by appending 0s at the end of the text, to obtain a processed merchant name text of the preset length. The first large-scale pre-trained language model in the trained merchant business industry identification model is then used to perform text feature extraction on the processed merchant name text to obtain the name text features.
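A minimal sketch of this fixed-length processing; the preset length of 32 tokens and padding with the id 0 are illustrative assumptions.

```python
def pad_or_truncate(token_ids, target_len=32):
    """Fix a tokenized merchant name to the preset text length."""
    if len(token_ids) >= target_len:
        return token_ids[:target_len]                       # truncate over-long names
    return token_ids + [0] * (target_len - len(token_ids))  # pad short names with 0

print(pad_or_truncate([17, 5, 243], target_len=6))  # [17, 5, 243, 0, 0, 0]
```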
In step 1007, the computer device converts the merchant transaction information into numerical data, and inputs the numerical data into the feature embedding model to obtain the output numerical feature.
Because the merchant transaction information contains some non-numeric data in addition to numeric data, in the embodiment of the present disclosure a one-hot algorithm may first be used to numericize the non-numeric data, converting it into numerical vectors. The converted numerical data, together with the original numeric data, can then be input into the feature embedding model for processing to obtain the output numerical features.
In the feature embedding model, the numerical data can be subjected to dimension-lifting processing by the feature embedding module, and the vectors obtained by the dimension lifting are stacked to obtain a stacking feature. The stacking feature can then be input to a multi-layer Transformer module for feature conversion, yielding the numerical features output by the multi-layer Transformer module. The dimension-lifting processing performed by the feature embedding module may specifically use the weight vectors and bias vectors obtained by training in the feature embedding module. This process converts the merchant transaction information, which could not otherwise be inferred together with the merchant name text, into features of the same type that can be fused, so that the merchant business industry identification model can use the merchant name text and the transaction information simultaneously for merchant business industry identification.
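The sketch below illustrates these two steps on toy data: a one-hot conversion of one non-numeric field, followed by dimension lifting of each scalar with per-field weight and bias vectors, then stacking. The field choices, cardinalities, and dimension D are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

D = 64
# One-hot conversion of a non-numeric field (a merchant category with
# 4 possible values is assumed here).
category = torch.tensor(2)
category_vec = F.one_hot(category, num_classes=4).float()  # (4,)

# Numeric fields plus the converted field form the numericized data.
amounts = torch.tensor([120.0, 3.5])
values = torch.cat([amounts, category_vec])                # (6,)

# Dimension lifting: each scalar x_i becomes x_i * w_i + b_i using trained
# per-position weight and bias vectors, then the lifted vectors are stacked.
w = torch.randn(values.numel(), D)  # stands in for trained weight vectors
b = torch.randn(values.numel(), D)  # stands in for trained bias vectors
stacked = values.unsqueeze(-1) * w + b                     # (6, D) stacking feature
# `stacked` is then fed to the multi-layer Transformer module for
# feature conversion, as in the model sketch above.
```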
In step 1008, the computer device fuses the name text features and the numerical features, and uses a multi-layer perceptron to perform dimension conversion on the resulting fused features to obtain the target fusion features.
Further, the name text features and the numerical features of the target merchant may be fused, for example by concatenation or element-wise addition, to obtain fusion features. The multi-layer perceptron then maps the fusion features so that the resulting target fusion features can be calculated against the features corresponding to the full set of industry label texts.
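A short sketch of this fusion and mapping; concatenation is chosen here over addition, and the dimension D is an assumption.

```python
import torch
import torch.nn as nn

D = 64
mlp = nn.Sequential(nn.Linear(2 * D, D), nn.ReLU(), nn.Linear(D, D))

name_feat = torch.randn(1, D)                     # name text feature
num_feat = torch.randn(1, D)                      # numerical feature
fused = torch.cat([name_feat, num_feat], dim=-1)  # concatenation fusion, (1, 2D)
target_fused = mlp(fused)                         # (1, D) target fusion feature
```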
In step 1009, the computer device performs feature extraction on the full-scale industry label text based on the second large-scale pre-trained language model to obtain label text features.
Further, the computer device may perform text feature extraction on the full set of industry label texts based on the second large-scale pre-trained language model in the trained merchant business industry identification model to obtain label text features. This process is consistent with the extraction of text features from the merchant name text based on the first large-scale pre-trained language model and is not described again here. The label text features obtained here correspond to all industries: for example, if there are k industries in total, the label text features may be the combination of the text features of the k industry names, that is, a feature of dimension D×k.
In step 1010, the computer device performs dot product calculation on the target fusion features and the label text features to obtain the prediction probability of the target merchant for each industry label text.
After the target fusion features of the target merchant and the label text features corresponding to the full set of industry label texts are determined, a dot product of the two can be calculated to obtain the predicted probability of the target merchant for each industry label text. For example, if the target fusion feature has dimension D and the label text features have dimension D×k, their dot product is a k-dimensional vector of k scalars. Each scalar corresponds to the predicted class probability for one industry label text, that is, the probability that the target merchant belongs to that industry.
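A minimal sketch of this scoring step; D, k, and the preset probability value of 0.5 are illustrative assumptions, and, following the text, each dot-product scalar is treated directly as the per-industry prediction probability.

```python
import torch

D, k = 64, 10
target_fused = torch.randn(D)        # target fusion feature of the merchant
label_feats = torch.randn(D, k)      # one column per industry label text
scores = target_fused @ label_feats  # (k,) one scalar per industry
selected = [i for i, p in enumerate(scores.tolist()) if p > 0.5]
print(selected)  # indices of industries exceeding the preset value (step 1011)
```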
In step 1011, the computer device determines that the label corresponding to the label text with the prediction probability greater than the preset probability value is the target industry operated by the target merchant.
After the probability that the target merchant belongs to each industry is determined, the target industries operated by the target merchant can be determined from these probabilities. Specifically, a preset probability value may be obtained, and every industry whose predicted probability is greater than the preset probability value is determined as a target industry operated by the target merchant.
The object category identification method provided by the embodiments of the present disclosure is thus suitable for merchant business industry identification scenarios. The method constructs a merchant business industry identification model that can combine a merchant's name text and transaction information to identify the merchant's business industry. Specifically, the model extracts text features from the merchant name text with a large-scale pre-trained language model, uses a feature embedding transformation model to convert the transaction information into numerical data, and lifts the dimension of that data to obtain numerical features that can be fused with the text features, so that information which could not be fused in the related art because of differing processing models can now be fused. The merchant's business industry is then predicted from the fused information, making full use of both the name text and the transaction information and improving the accuracy of merchant business industry identification.
The following describes apparatus and device embodiments of the present disclosure. It will be appreciated that, although the steps in the flowcharts above are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the order of the steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times; their order of execution is not necessarily sequential, and they may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps.
In the various embodiments of the present disclosure, when processing is performed on data related to characteristics of a target object, such as the target object's attribute information or attribute information sets, the target object's permission or consent is obtained first, and the collection, use, and processing of the data comply with the relevant laws, regulations, and standards of the relevant countries and regions. In addition, when the embodiments of the present disclosure need to acquire attribute information of a target object, the target object's separate permission or consent is obtained, for example through a pop-up window or a jump to a confirmation page, and only after that separate permission or consent is explicitly obtained is the target-object-related data necessary for normal operation of the embodiments acquired.
Fig. 11 is a schematic structural diagram of an object class identification device 1100 according to an embodiment of the disclosure. The object class identification device 1100 includes:
an obtaining unit 1110, configured to obtain a name text of a target object, behavior statistical data of the target object, and category label texts corresponding to a plurality of classification categories;
the first extraction unit 1120 is configured to perform feature extraction on the name text based on the first neural network model, so as to obtain a name text feature;
The second extraction unit 1130 is configured to perform feature extraction on the behavior statistical data based on the second neural network model to obtain a behavior statistical feature;
the third extraction unit 1140 is configured to perform feature extraction on the category label text based on a third neural network model to obtain a category label text feature, where the first neural network model, the second neural network model, and the third neural network model are obtained by training simultaneously based on the same batch of training sample data;
a calculating unit 1150, configured to calculate a classification probability corresponding to each classification category of the target object according to the name text feature, the behavior statistics feature and the category label text feature;
a determining unit 1160 for determining a target object class of the target object based on the classification probability.
Optionally, the computing unit 1150 includes:
the first fusion subunit is used for carrying out fusion processing on the name text features and the behavior statistical features to obtain fusion features;
the computing subunit is used for computing dot product results of the fusion features and the category label text features;
and the determining subunit is used for determining the classification probability of the target object corresponding to each classification category according to the dot product result.
Optionally, the first fusion subunit comprises:
The acquisition module is used for respectively acquiring weight coefficients corresponding to the name text features and the behavior statistical features;
and the fusion module is used for carrying out fusion processing on the name text features and the behavior statistical features based on the weight coefficients to obtain fusion features.
Optionally, the acquiring module includes:
the recognition sub-module is used for carrying out semantic recognition on the name text to obtain a semantic recognition result;
and the determining submodule is used for determining weight coefficients corresponding to the name text features and the behavior statistical features based on the semantic recognition results.
Optionally, the computing subunit comprises:
the conversion module is used for carrying out dimension conversion on the fusion features based on the feature dimension of the category label text features to obtain target fusion features;
and the first calculation module is used for carrying out dot product calculation on the target fusion feature and the category label text feature to obtain a dot product result.
Optionally, the second neural network model includes a first sub-model and a second sub-model, and the second extraction unit 1130 includes:
the first processing subunit is used for carrying out dimension-lifting processing on the input features corresponding to the behavior statistical data based on the first sub-model to obtain high-dimensional features;
and the transformation subunit is used for performing feature transformation on the high-dimensional features based on the second sub-model to obtain the behavior statistical features.
Optionally, the first processing subunit comprises:
the first processing module is used for acquiring numerical data in the behavior statistical data, and performing dimension-lifting processing on the numerical data based on the first weight vector and the first bias vector to obtain a numerical vector;
the second processing module is used for acquiring attribute data in the behavior statistical data, and carrying out dimension lifting processing on input features corresponding to the attribute data based on a second weight vector and a second bias vector to obtain an attribute vector;
and the stacking module is used for stacking the numerical vector and the attribute vector to obtain high-dimensional features.
Optionally, the second processing module includes:
the conversion sub-module is used for acquiring attribute data in the behavior statistical data and carrying out numerical conversion on the attribute data to obtain converted numerical data;
and the dimension-lifting sub-module is used for performing dimension-lifting processing on the converted numerical data based on the second weight vector and the second bias vector to obtain an attribute vector.
Optionally, the first extraction unit 1120 includes:
the second processing subunit is used for carrying out text processing on the name text according to the preset text length to obtain a target text;
and the first extraction subunit is used for extracting the characteristics of the target text to obtain the characteristics of the name text.
Optionally, the second processing subunit comprises:
the expansion module is used for carrying out text expansion on the name text based on the preset text length when the text length of the name text is smaller than the preset text length, so as to obtain a target text;
and the cutting module is used for cutting the text of the name text according to the preset text length to obtain the target text when the text length of the name text is greater than or equal to the preset text length.
Optionally, the object class identification device 1100 provided by the present disclosure further includes:
the first acquisition subunit is used for acquiring training sample data, wherein the training sample data comprises sample name texts of sample objects, sample behavior statistical data of the sample objects, sample category labels of the sample objects and category label texts corresponding to a plurality of classification categories;
the second extraction subunit is used for extracting the characteristics of the sample name text based on the first neural network model to obtain the characteristics of the sample name text;
the third extraction subunit is used for extracting characteristics of the sample behavior statistical data based on the second neural network model to obtain sample behavior statistical characteristics;
the second fusion subunit is used for carrying out feature fusion on the sample name text features and the sample behavior statistical features to obtain sample fusion features;
The fourth extraction subunit is used for extracting the characteristics of the category label text based on the third neural network model to obtain sample label text characteristics;
the adjustment subunit is used for calculating a loss value based on the sample fusion characteristic, the sample label text characteristic and the sample class label and adjusting parameters of the first neural network model, the second neural network model and the third neural network model based on the loss value;
and the execution subunit is used for circularly executing the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model and the third neural network model based on the acquired training sample data until the first neural network model, the second neural network model and the third neural network model converge.
Optionally, the adjusting subunit includes:
the first adjusting module is used for carrying out dimension adjustment on the sample fusion characteristics based on the fourth neural network model to obtain target sample fusion characteristics;
the second calculation module is used for carrying out dot product calculation on the fusion characteristics of the target sample and the text characteristics of the sample label to obtain a sample classification result;
the second adjustment module is used for calculating a loss value based on the sample classification result and the sample class label and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the loss value;
An execution subunit further configured to:
and circularly executing the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the acquired training sample data until the first neural network model, the second neural network model, the third neural network model and the fourth neural network model converge.
Referring to fig. 12, fig. 12 is a block diagram of a portion of a terminal 140 implementing an object category identification method according to an embodiment of the present disclosure. The terminal 140 includes: radio frequency (RF) circuitry 1210, memory 1215, input unit 1230, display unit 1240, sensor 1250, audio circuitry 1260, wireless fidelity (WiFi) module 1270, processor 1280, and power supply 1290. It will be appreciated by those skilled in the art that the terminal structure shown in fig. 12 does not constitute a limitation on the terminal; the terminal may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
The RF circuitry 1210 may be used for receiving and transmitting signals during messaging or a call; in particular, downlink information from a base station is received and handed to the processor 1280 for processing, and uplink data is sent to the base station.
The memory 1215 may be used to store software programs and modules, and the processor 1280 performs various function applications and object class identification of the terminal by executing the software programs and modules stored in the memory 1215.
The input unit 1230 may be used to receive input numerical or character information and generate key signal inputs related to the setting and function control of the terminal. Specifically, the input unit 1230 may include a touch panel 1231 and other input devices 1232.
The display unit 1240 may be used to display input information or provided information and various menus of the terminal. The display unit 1240 may include a display panel 1241.
Audio circuitry 1260, speaker 1261, microphone 1262 may provide an audio interface.
In this embodiment, the processor 1280 included in the terminal 140 may perform the object class identification method of the previous embodiment.
The terminal 140 of the embodiments of the present disclosure includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, a smart home appliance, a vehicle-mounted terminal, an aircraft, and the like. The embodiments of the present disclosure can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, intelligent transportation, and assisted driving.
Fig. 13 is a block diagram of a portion of a server 110 implementing an object class identification method of an embodiment of the present disclosure. The server 110 may vary considerably in configuration or performance and may include one or more central processing units (Central Processing Units, simply CPU) 1322 (e.g., one or more processors) and storage devices 1332, one or more storage media 1330 (e.g., one or more mass storage devices) storing applications 1342 or data 1344. Wherein the storage 1332 and the storage medium 1330 may be transitory or persistent. The program stored on the storage medium 1330 may include one or more modules (not shown), each of which may include a series of instruction operations on the server 110. Further, the central processor 1322 may be configured to communicate with the storage medium 1330, and execute a series of instruction operations on the storage medium 1330 on the server 110.
The server 110 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input/output interfaces 1358, and/or one or more operating systems 1341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The central processor 1322 in the server 110 may be used to perform the object class identification method of embodiments of the present disclosure.
The embodiments of the present disclosure also provide a storage medium storing program codes for executing the object class identification method of the foregoing embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program. The processor of the computer device reads the computer program and executes it, causing the computer device to execute the object class identification method as described above.
The terms "first," "second," "third," "fourth," and the like in the description of the present disclosure and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein, for example. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this disclosure, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it. "At least one of" and similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or plural.
It should be understood that in the description of the embodiments of the present disclosure, "a plurality of" (or "multiple") means two or more; "greater than," "less than," "exceeding," and the like are understood as excluding the stated number, while "above," "below," "within," and the like are understood as including it.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If implemented in the form of software functional units and sold or used as a stand-alone product, the integrated units may be stored in a storage medium. Based on such understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should also be appreciated that the various implementations provided by the embodiments of the present disclosure may be arbitrarily combined to achieve different technical effects.
The above is a specific description of the embodiments of the present disclosure, but the present disclosure is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present disclosure, and are included in the scope of the present disclosure as defined in the claims.
Claims (16)
1. An object class identification method, the method comprising:
acquiring name text of a target object, behavior statistical data of the target object and category label text corresponding to a plurality of category categories;
extracting features of the name text based on a first neural network model to obtain name text features;
performing feature extraction on the behavior statistical data based on a second neural network model to obtain behavior statistical features;
extracting features of the category label text based on a third neural network model to obtain category label text features, wherein the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data;
Calculating the classification probability of the target object corresponding to each classification category according to the name text features, the behavior statistical features and the category label text features;
and determining a target object category of the target object based on the classification probability.
2. The method of claim 1, wherein the second neural network model comprises a first sub-model and a second sub-model, and wherein the performing feature extraction on the behavior statistical data based on the second neural network model to obtain behavior statistical features comprises:
performing dimension-lifting processing on the input features corresponding to the behavior statistical data based on the first sub-model to obtain high-dimensional features;
and carrying out feature transformation on the high-dimensional features based on the second sub-model to obtain behavior statistical features.
3. The method according to claim 2, wherein the performing, based on the first sub-model, dimension-lifting processing on the input features corresponding to the behavior statistical data to obtain high-dimensional features comprises:
acquiring numerical data in the behavior statistical data, and carrying out dimension lifting processing on the numerical data based on a first weight vector and a first bias vector to obtain a numerical vector;
Acquiring attribute data in the behavior statistical data, and carrying out dimension lifting processing on input features corresponding to the attribute data based on a second weight vector and a second bias vector to obtain an attribute vector;
and stacking the numerical vector and the attribute vector to obtain a high-dimensional feature.
4. The method of claim 3, wherein the obtaining the attribute data in the behavior statistics and performing the dimension-up processing on the input feature corresponding to the attribute data based on the second weight vector and the second bias vector to obtain the attribute vector comprises:
acquiring attribute data in the behavior statistical data, and performing numerical conversion on the attribute data to obtain converted numerical data;
and carrying out dimension lifting processing on the converted numerical data based on the second weight vector and the second bias vector to obtain an attribute vector.
5. The method of claim 1, wherein the calculating the classification probability of the target object corresponding to each classification category based on the name text feature, the behavior statistics feature, and the category label text feature comprises:
performing fusion processing on the name text features and the behavior statistical features to obtain fusion features;
Calculating dot product results of the fusion features and the category label text features;
and determining the classification probability of the target object corresponding to each classification category according to the dot product result.
6. The method of claim 5, wherein the fusing the name text feature and the behavior statistical feature to obtain a fused feature comprises:
respectively acquiring weight coefficients corresponding to the name text features and the behavior statistical features;
and carrying out fusion processing on the name text features and the behavior statistical features based on the weight coefficients to obtain fusion features.
7. The method of claim 6, wherein the respectively obtaining the weight coefficients of the name text feature and the behavior statistical feature comprises:
carrying out semantic recognition on the name text to obtain a semantic recognition result;
and determining a weight coefficient corresponding to the name text feature and the behavior statistical feature based on the semantic recognition result.
8. The method of claim 5, wherein said calculating a dot product of said fusion feature and said category label text feature comprises:
Performing dimension conversion on the fusion features based on feature dimensions of the category label text features to obtain target fusion features;
and carrying out dot product calculation on the target fusion feature and the category label text feature to obtain a dot product result.
9. The method according to claim 1, wherein the feature extraction of the name text based on the first neural network model to obtain name text features includes:
performing text processing on the name text according to a preset text length to obtain a target text;
and extracting the characteristics of the target text based on the first neural network model to obtain the name text characteristics.
10. The method according to claim 9, wherein the text processing the name text according to the preset text length to obtain the target text includes:
when the text length of the name text is smaller than the preset text length, performing text expansion on the name text based on the preset text length to obtain a target text;
when the text length of the name text is greater than or equal to the preset text length, text cutting is carried out on the name text according to the preset text length, and a target text is obtained.
11. The method according to claim 1, further comprising, before the obtaining the name text of the target object, the behavior statistics of the target object, and the category label text corresponding to the plurality of classification categories:
obtaining training sample data, wherein the training sample data comprises sample name text of a sample object, sample behavior statistical data of the sample object, sample category labels of the sample object and category label texts corresponding to a plurality of classification categories;
extracting features of the sample name text based on the first neural network model to obtain sample name text features;
performing feature extraction on the sample behavior statistical data based on the second neural network model to obtain sample behavior statistical features;
performing feature fusion on the sample name text features and the sample behavior statistical features to obtain sample fusion features;
extracting features of the category label text based on the third neural network model to obtain sample label text features;
calculating a loss value based on the sample fusion feature, the sample tag text feature and the sample class tag, and adjusting parameters of the first neural network model, the second neural network model and the third neural network model based on the loss value;
And circularly executing the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model and the third neural network model based on the acquired training sample data until the first neural network model, the second neural network model and the third neural network model converge.
12. The method of claim 11, wherein the calculating a loss value based on the sample fusion feature, the sample tag text feature, and the sample class tag and adjusting parameters of the first, second, and third neural network models based on the loss value comprises:
performing dimension adjustment on the sample fusion characteristics based on a fourth neural network model to obtain target sample fusion characteristics;
performing dot product calculation on the target sample fusion features and the sample label text features to obtain sample classification results;
calculating a loss value based on the sample classification result and the sample class label, and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the loss value;
The cyclically executing the steps of acquiring training sample data and adjusting parameters of the first, second, and third neural network models based on the acquired training sample data until the first, second, and third neural network models converge, comprises:
and circularly executing the steps of acquiring training sample data and adjusting parameters of the first neural network model, the second neural network model, the third neural network model and the fourth neural network model based on the acquired training sample data until the first neural network model, the second neural network model, the third neural network model and the fourth neural network model converge.
13. An object class identification device, the device comprising:
the system comprises an acquisition unit, a classification unit and a classification unit, wherein the acquisition unit is used for acquiring a name text of a target object, behavior statistical data of the target object and class label texts corresponding to a plurality of classification classes;
the first extraction unit is used for extracting the characteristics of the name text based on the first neural network model to obtain the characteristics of the name text;
The second extraction unit is used for extracting the characteristics of the behavior statistical data based on a second neural network model to obtain behavior statistical characteristics;
the third extraction unit is used for extracting the characteristics of the category label text based on a third neural network model to obtain the characteristics of the category label text, and the first neural network model, the second neural network model and the third neural network model are obtained by training based on the same batch of training sample data;
the computing unit is used for computing the classification probability of the target object corresponding to each classification category according to the name text features, the behavior statistical features and the category label text features;
and the determining unit is used for determining the target object category of the target object based on the classification probability.
14. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the object class identification method according to any one of claims 1 to 12.
15. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the object class identification method according to any one of claims 1 to 12 when executing the computer program.
16. A computer program product comprising a computer program, which computer program is read and executed by a processor of a computer device, to cause the computer device to perform the object class identification method according to any one of claims 1 to 12.