WO2024021321A1 - Model generation method and apparatus, electronic device, and storage medium - Google Patents

Model generation method and apparatus, electronic device, and storage medium

Info

Publication number
WO2024021321A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
model
node
data set
function control
Prior art date
Application number
PCT/CN2022/126426
Other languages
French (fr)
Chinese (zh)
Inventor
李睿宇
石康
原卉
Original Assignee
深圳思谋信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳思谋信息科技有限公司
Publication of WO2024021321A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/14 Image acquisition
    • G06V 30/148 Segmentation of character regions
    • G06V 30/15 Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/20 Drawing from basic elements, e.g. lines or circles
    • G06T 11/206 Drawing of charts or graphs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V 30/19173 Classification techniques

Definitions

  • The embodiments of the present application relate to the field of artificial intelligence technology, and in particular to a model generation method and apparatus, an electronic device, and a storage medium.
  • Artificial intelligence (AI) is widely applied to tasks such as face recognition, image classification, object detection, and speech recognition.
  • AI development platforms are commonly used to provide users with services such as the selection, construction, verification, and optimization of AI models for particular task goals.
  • Existing AI development platforms, however, are less flexible. How to obtain a model with diverse functions that meets the data-processing requirements of complex scenarios is a problem in urgent need of a solution.
  • In a first aspect, a model generation method is provided: a sample image data set is obtained, and a tree-structured model is determined based on the sample image data set.
  • The nodes (function controls) in the tree-structured model are of at most the following four types: image segmentation function controls, image classification function controls, image target detection function controls, and optical character recognition function controls.
  • The tree-structured model can contain one or more of the above four function controls, and the same function control may also appear one or more times, which provides richer node types. Users can select all or part of the functions in the model for output according to the actual situation. Any node in a given layer of the tree-structured model is connected to exactly one node in the layer above it, rather than to all nodes in that layer; nodes within the same layer are not connected to one another; and the input of each node is the output of the node in the previous layer to which it is connected.
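The connection rules above (exactly one parent per node, no links inside a layer, each node consuming its parent's output) can be sketched as a small data structure. This is an illustrative sketch only, not code from the application; the node names and `process` callables are hypothetical placeholders for the four function controls.

```python
class Node:
    """One function control (e.g. segmentation, detection) in the tree-structured model."""

    def __init__(self, name, process):
        self.name = name
        self.process = process   # callable: input data -> output data
        self.parent = None       # each node connects to exactly one upper-layer node
        self.children = []

    def add_child(self, child):
        child.parent = self      # never connected to all nodes of the upper layer
        self.children.append(child)
        return child

    def run(self, data):
        # The input of each node is the output of the parent node it is connected to.
        if self.parent is not None:
            data = self.parent.run(data)
        return self.process(data)


# Toy example: a segmentation node feeding a detection node.
root = Node("segmentation", lambda d: d + ["segmented"])
det = root.add_child(Node("detection", lambda d: d + ["detected"]))
print(det.run(["image"]))   # ['image', 'segmented', 'detected']
```

The user can take the output of any node (here, `root.run(...)` or `det.run(...)`) as the model's output, matching the "all or part of the functions" selection described above.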
  • a method for generating a tree-structured model is provided.
  • the input of each node in the model is the output of the node in the previous layer connected to the node.
  • Users can select the functions corresponding to different nodes (function controls) according to their needs to process the sample image data set.
  • The four node types that the model may include are well suited to industrial scenarios, which to a certain extent simplifies the model generation process, improves the flexibility of model generation, and can cope with more complex data processing scenarios while reducing the difficulty for users to learn and use the platform.
  • In response to a user's first operation, the control options of a certain layer of the tree structure are displayed, and the control options include the above four function controls (node types).
  • In response to the user's second operation, one or more of the four function controls are selected.
  • a tool for performing sample processing on the sample image data set is displayed, and the user can use the sample processing tool to perform some auxiliary operations on the sample image data set.
  • In some embodiments, the tree-structured model includes a first sub-model, and the first sub-model includes an m-th layer and an (m+1)-th layer, where the node of the m-th layer is an image segmentation function control and the node of the (m+1)-th layer is an image target detection function control, m being a positive integer less than N.
  • a tree-structured model may include multiple sub-models.
  • the first submodel is one of multiple submodels. Any functional control in the tree-structured model can be regarded as a sub-model.
  • the first sub-model includes an image segmentation function control and an image target detection function control. That is, the first submodel includes two functional controls in series.
  • this application does not limit the specific functional controls included in the first sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • the tree-structured model generation method can be intuitively displayed on the user interface, which can facilitate the user to perform the model generation process and reduce the difficulty of user learning.
  • In some embodiments, the tree-structured model includes a second sub-model, and the second sub-model includes the m-th layer, the (m+1)-th layer, and the (m+2)-th layer, where the node of the m-th layer is an image segmentation function control, the node of the (m+1)-th layer is an image target detection function control, and the node of the (m+2)-th layer is an optical character recognition function control, m being a positive integer less than N.
  • In some embodiments, an optical character recognition function control is subsequently added after the image target detection function control of the first sub-model, so that the second sub-model includes the first sub-model and the newly added optical character recognition function control.
  • the second sub-model includes image segmentation function controls, image target detection function controls, and optical character recognition function controls.
  • this application does not limit the specific functional controls included in the second sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • The second sub-model is connected in series with the first sub-model. According to the actual situation, the output data of the first sub-model or the output data of the second sub-model can be selected as the final output data, or the output data of any function control can be selected as the final output data.
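The series arrangement described here (sub-models chained one after another, with the freedom to take any control's output as the final output) can be sketched as follows. This is a hypothetical illustration; the control names and toy operations stand in for the trained segmentation, detection, and OCR stages.

```python
def make_submodel(controls):
    """Build a sub-model from an ordered list of (name, fn) function controls in series."""
    def run(data, output_at=None):
        outputs = {}
        for name, fn in controls:
            data = fn(data)          # each control's input is the previous control's output
            outputs[name] = data
        # By default return the last control's output; otherwise any chosen control's output.
        return outputs[output_at] if output_at else data
    return run


# First sub-model (segmentation -> detection) extended into the second (+ OCR).
second_submodel = make_submodel([
    ("segmentation", lambda d: d + ["seg"]),
    ("detection",    lambda d: d + ["det"]),
    ("ocr",          lambda d: d + ["ocr"]),
])
print(second_submodel(["img"]))                          # ['img', 'seg', 'det', 'ocr']
print(second_submodel(["img"], output_at="detection"))   # ['img', 'seg', 'det']
```

Passing `output_at` mirrors selecting the output of any function control as the final output.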
  • In some embodiments, the tree-structured model includes a third sub-model, and the third sub-model includes a j-th layer and a (j+1)-th layer, where the node of the j-th layer is an optical character recognition control and the node of the (j+1)-th layer is an image classification control, j being a positive integer less than N.
  • the third sub-model is connected in parallel with the first sub-model (or second sub-model).
  • the third sub-model includes an optical character recognition function control and an image classification function control.
  • this application does not limit the specific functional controls included in the third sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • Users can modify the tree-structured model in real time according to actual needs. For example, when the user performs a deletion operation on the image target detection function control in the second sub-model, that function control and all subsequent function controls are deleted. In other words, with a single deletion operation the user can delete the selected function control and every function control after it, which improves the flexibility of model generation.
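The single-operation cascade deletion (removing a function control removes every control after it) amounts to deleting a subtree. A minimal sketch, with the tree held as a child list per node and hypothetical control names:

```python
def delete_with_descendants(children, parent_of, name):
    """Remove control `name` and every control after it (its whole subtree)."""
    stack = [name]
    while stack:
        node = stack.pop()
        stack.extend(children.pop(node, []))   # queue this control's children too
        parent = parent_of.pop(node, None)
        if parent is not None and node in children.get(parent, []):
            children[parent].remove(node)      # detach from the remaining tree


# seg -> det -> ocr, with an unrelated cls branch.
children = {"seg": ["det"], "det": ["ocr"], "ocr": [], "cls": []}
parent_of = {"det": "seg", "ocr": "det"}
delete_with_descendants(children, parent_of, "det")
print(sorted(children))   # ['cls', 'seg']: 'det' and the later 'ocr' are both gone
```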
  • In some embodiments, the sample processing is one of the following: labeling processing, sample selection processing, scaling processing, and preprocessing.
  • the function corresponding to the image segmentation function control is a segmentation algorithm based on deep learning.
  • In a second aspect, an electronic device is provided, including one or more processors and one or more memories; the one or more memories store one or more computer programs, and the one or more computer programs comprise instructions that, when executed by the one or more processors, cause the electronic device to perform the method of the first aspect or any possible implementation thereof.
  • A third aspect provides an apparatus for model generation, including a processor coupled to a memory; the memory is used to store a computer program, and the processor is used to run the computer program, so that the apparatus for model generation performs the method of the first aspect or any possible implementation thereof.
  • the device for model generation further includes one or more of the memory and a transceiver, the transceiver being used to receive signals and/or transmit signals.
  • In a fourth aspect, a computer-readable storage medium is provided, including a computer program or instructions that, when run on a computer, cause the method of the first aspect or any possible implementation thereof to be performed.
  • In a fifth aspect, a computer program product is provided, including a computer program or instructions that, when run on a computer, cause the method of the first aspect or any possible implementation thereof to be performed.
  • a sixth aspect provides a computer program that, when run on a computer, causes the method in the first aspect and any possible implementation thereof to be executed.
  • Figure 1 is a schematic diagram of the system architecture provided by an embodiment of the present application.
  • Figure 2 is a framework diagram of the model generation method provided by the embodiment of the present application.
  • Figure 3 is a schematic diagram of a model generation method provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of a functional module provided by an embodiment of the present application.
  • Figure 5 is a processing flow chart of a data set provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of image segmentation provided by an embodiment of the present application.
  • Figure 7 is a processing flow chart of another data set provided by the embodiment of the present application.
  • Figure 8 is a processing flow chart of another data set provided by the embodiment of the present application.
  • Figure 9 is a schematic diagram of the data set processing flow provided by the embodiment of this application.
  • Figure 10 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 12 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 13 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 14 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 15 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 16 is a schematic flow chart of a model generation method provided by an embodiment of the present application.
  • the model generation method provided by this application can be applied to the system architecture diagram shown in Figure 1.
  • the terminal device 11 communicates with the server 12 through the network.
  • The terminal device 11 sends the sample image data set and the user's intention to the server 12.
  • The user intention is used to represent the processing the user requires on the sample image data set.
  • The server 12 uses the sample image data set to train the corresponding model according to the user's intention, and finally generates the model the user actually needs. That is to say, the user first uploads the data set to be processed to the AI development platform provided by the embodiments of this application, and then selects the corresponding processing operations according to actual needs to generate the model.
  • The terminal device 11 in the above system architecture diagram includes but is not limited to mobile phones, tablet computers, wearable electronic devices with wireless communication functions (such as smart watches), and the like.
  • Exemplary embodiments of portable electronic devices include, but are not limited to, portable electronic devices carrying any of a variety of operating systems.
  • the above-mentioned electronic device may not be a portable electronic device, but a desktop computer.
  • the server 12 can be implemented as an independent server or a server cluster composed of multiple servers.
  • The model generation framework diagram 200 includes an image segmentation module 210, an image classification module 220, an image target detection module 230, and an optical character recognition (OCR) module 240.
  • Image segmentation module 210 may be used to perform image segmentation.
  • Image segmentation is a technique and process that divides an image into several specific regions with unique properties and extracts objects of interest. It is a key step from image processing to image analysis.
  • Existing image segmentation methods are mainly divided into the following categories: threshold-based segmentation methods, region-based segmentation methods, edge-based segmentation methods, and segmentation methods based on specific theories.
  • In the embodiments of this application, a deep-learning-based segmentation algorithm is mainly used.
  • image segmentation is the process of dividing a digital image into disjoint regions.
  • the process of image segmentation is also a labeling process, which can assign the same number to pixels belonging to the same area.
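The "same number for pixels of the same area" idea can be illustrated with a toy flood-fill labeller over a binary mask. This is only a stand-in for the deep-learning segmentation the application actually uses; the mask and labels are hypothetical.

```python
def label_regions(mask):
    """Assign the same integer label to every pixel of a connected foreground region."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                current += 1                       # a new region gets a new number
                stack = [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    if 0 <= cy < h and 0 <= cx < w and mask[cy][cx] and not labels[cy][cx]:
                        labels[cy][cx] = current   # same number, same region
                        stack += [(cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)]
    return labels


mask = [[1, 1, 0],
        [0, 0, 0],
        [0, 1, 1]]
print(label_regions(mask))   # [[1, 1, 0], [0, 0, 0], [0, 2, 2]]
```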
  • Image segmentation can be applied to the detection and edge recognition of detected objects down to the pixel level. For example, it can identify defects in fine parts such as cracked areas on silicon wafers and damaged areas in bearings.
  • Image classification module 220 may be used to perform image classification.
  • Image classification is an image processing method that distinguishes different categories of targets based on the different characteristics reflected in the image information.
  • Image classification can be applied to classify and judge inspected materials; for example, binary classification based on whether a material is qualified, classification by the color of the test object or the type of food under inspection, defect subdivision, or classifying test objects by material.
  • the image object detection module 230 may be used to perform image object detection.
  • Image target detection draws on image processing and pattern recognition to locate objects of interest in an image, determine the specific category of each object, and give the bounding box of each object. Image target detection can be used to locate and classify targets in inspection materials, and is suitable for multi-target detection, small-target detection, counting, and the like; for example, determining the number of pharmaceutical pills or the location of defects in components.
  • the optical character recognition module 240 may be used to perform optical character recognition.
  • Optical character recognition refers to the process of analyzing and recognizing image files of text materials to obtain text and layout information.
  • Optical character recognition can be applied to single-character labeling and recognition, and multi-character labeling and recognition. It can break the limitations of traditional methods and solve complex character recognition problems such as curve character recognition, low contrast character recognition, and large character recognition. For example, text on fine parts can be recognized.
  • FIG 3 shows a schematic diagram of a model generation method provided by an embodiment of the present application.
  • the data set 300 includes but is not limited to a picture data set, a video data set, a text data set, etc.
  • the structure generated by this model is a tree structure.
  • the user can select modules with different functions for processing the data set 300 according to actual needs, such as image segmentation module, image classification module, image target detection module or optical character recognition module, etc. In Figure 3, only the above four modules are taken as examples for explanation. The user can select the required modules to build tree branches to implement the next step of processing the data set 300.
  • Three serial or parallel schemes generated by the model are shown in Figure 3.
  • the data set 300 is subjected to image segmentation, image target detection, optical character recognition and image classification processing in sequence.
  • the data set 300 is subjected to image segmentation, image target detection and optical character recognition processing in sequence.
  • image classification processing is performed on the data set 300.
  • the obtained segmentation data will be transmitted to the image object detection module.
  • the input data of each module is the output data of the previous module.
  • users can freely combine and connect the above four modules according to actual needs.
  • other functional modules will not be connected after image classification.
  • the user can selectively generate a model corresponding to the tree structure. For example, output the software development kit (SDK) corresponding to the model, etc.
  • each sub-node in the data structure diagram corresponds to a complete solution. However, users can choose to output all or part of the solution according to actual needs.
  • the user can select or delete functional modules through an input device (such as a mouse).
  • four functional modules including image segmentation, image classification, image target detection, and optical character recognition, will appear for the user to choose.
  • image segmentation process can be performed on the data set 300.
  • image classification process can be performed on the data set 300.
  • image object detection can be performed on the data set 300.
  • optical character recognition can be performed on the data set 300.
  • the above four functional modules will also appear for the user to choose the specific task to be performed next. That is to say, after each processing, the user can choose a specific processing method for the next step.
  • the specific processing method can be one or more, and this application does not limit this.
  • When the user right-clicks the image segmentation module, the image segmentation module is deleted.
  • When the user right-clicks the image target detection module 320, the image target detection module 320, the optical character recognition modules 330 and 360, and the image classification module 340 are all deleted.
  • the image segmentation module 310 and image classification module 350 in Figure 3 will be retained.
  • FIG. 6 is an example of image data provided by this application.
  • the segmented image 600 may include four regions 610, 620, 630, and 640.
  • the four areas in Figure 6 are only examples, and other areas may also be included.
  • Image segmentation can assign the same number to pixels belonging to the same area.
  • the target of interest is located from the segmented areas of the 100 images in step S501, the specific category of each target is determined, and the bounding box of each target is given.
  • the target may be a trademark logo, or a description label, etc.
  • the 100 images in step S503 are classified.
  • the area numbered A in Figure 6 is divided into one category
  • the area numbered B is divided into another category
  • the area numbered C is divided into another category
  • the area numbered D is divided into another category.
  • a model can be generated, which can be used to process certain images according to the process of image segmentation, image target detection, optical character recognition and image classification.
  • Image segmentation can assign the same number to pixels belonging to the same area.
  • the target of interest is located from the segmented areas of the 100 images in step S701, the specific category of each target is determined, and the bounding box of each target is given.
  • the target may be a trademark logo, or a description label, etc.
  • a model can be generated, which can be used to process images according to the process of image segmentation, image target detection and optical character recognition.
  • the 100 images in the data set 300 are separated into different categories of areas in the image, and marked with different numbers.
  • a model is generated, which can directly perform image classification processing on images.
  • Figures 9, 10, 11 and 12 show schematic diagrams of data sets being processed in a tree structure model.
  • the processing flow of the tree structure model includes: on the one hand, the data set undergoes image target detection (for example, detection 1 in Figure 9), and then undergoes optical character recognition (for example, OCR1 in Figure 9) processing.
  • the data set is processed by image object detection (eg, detection 1 in Figure 9) and then image classification (eg, classification 1 in Figure 9).
  • detection 1, OCR1 and classification 1 in Figure 9 can be considered as different functional controls.
  • FIG. 10 shows a schematic diagram of detection 1 in the above processing flow.
  • The image target detection corresponding to detection 1 is used to locate a specific area in the image; for example, the area currently set is the area above the characters "9BC3".
  • OCR1 performs character recognition on the area located by detection 1.
  • OCR1 can also perform post-processing on the results output by detection 1.
  • the post-processing can be to offset the determined area by a fixed amount so that the character "9BC3" area in the figure is located.
  • The above post-processing can further adjust the results of the four types of processing applied to the sample image data set.
  • the area corresponding to the processed output result is offset or scaled by a certain amount to make the area corresponding to the output result more accurate.
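The offset/scale post-processing on a located region can be sketched as a simple box transform. The (x, y, width, height) box and the concrete offsets are hypothetical, chosen only to illustrate shifting a region (e.g. down onto the "9BC3" characters):

```python
def adjust_region(box, dx=0, dy=0, scale=1.0):
    """Shift an (x, y, w, h) box by (dx, dy) and scale its size by `scale`."""
    x, y, w, h = box
    return (x + dx, y + dy, w * scale, h * scale)


located = (40, 10, 80, 20)                # region found just above the characters
print(adjust_region(located, dy=15))      # shifted down: (40, 25, 80.0, 20.0)
print(adjust_region(located, scale=2.0))  # enlarged: (40, 10, 160.0, 40.0)
```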
  • the image is segmented into specific areas, and the user interface may display the segmented specific areas. Users can visually check whether the sample image data set is segmented accurately. When the specific segmented area deviates from the area that the user needs to be segmented, the specific area corresponding to the image segmentation process can be further adjusted to make the process more accurate.
  • the images are classified into different areas.
  • the classification of the sample image data set may be inaccurate, and areas that should not be in the same category are classified into the same category. Users can adjust the image classification by further moving the area corresponding to the output result, making the classification of the sample image data set more accurate.
  • objects of interest to the user are marked. For example, count the number of objects of interest to the user in the sample image data.
  • the user can further adjust and manually mark the targets that are not counted to improve the accuracy of quantity statistics.
  • Further processing of the sample image data set includes but is not limited to labeling processing, sample selection processing, scaling processing, and preprocessing. Further processing can be understood as the process of optimizing the processed data, which allows the final trained model to process data more accurately.
  • FIG. 12 shows a schematic diagram of classification 1 in the above processing flow.
  • Figures 13, 14 and 15 show schematic diagrams of data sets being processed in another tree structure model.
  • the processing flow of the tree structure model includes: the data set undergoes image target detection (for example, detection 1 in Figure 13), and then image segmentation (for example, segmentation 1 in Figure 13). Among them, detection 1 and segmentation 1 in Figure 13 can be considered as different functional controls.
  • FIG. 14 shows a schematic diagram of detection 1 in the above processing flow.
  • the image target detection corresponding to detection 1 is used to locate a specific area in the image. For example, what is currently set is the position of the diode in the image.
  • As shown in FIG. 15, a schematic diagram of segmentation 1 in the above processing flow is provided. Segmentation 1 further identifies the defective area of the positioned diode and segments that area.
  • Figure 16 shows a model generation method 1600 provided by the embodiment of the present application. This method can be applied in the framework shown in Figure 2. The method 1600 is described in detail below.
  • S1601: Obtain a sample image data set.
  • S1602: Determine a tree-structured model based on the sample image data set.
  • The tree structure includes N layers, and each of the N layers includes at least one node. The input of the node of the first layer is the sample image data set, and the input of a node of the i-th layer is the output of one of the nodes of the (i-1)-th layer, where N is a positive integer greater than 1 and i = 2, ..., N. Each node is one of the following controls: an image segmentation function control, an image classification function control, an image target detection function control, or an optical character recognition function control.
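The layer constraint in S1602 (N layers, layer 1 fed by the data set, every later node fed by exactly one node of the previous layer) can be checked mechanically. A sketch under the assumption that each layer is a list of (parent index in previous layer, control name) pairs, with hypothetical names:

```python
def validate_layers(layers):
    """Return True if every node points at exactly one node of the previous layer."""
    for i, layer in enumerate(layers):
        if not layer:                       # each of the N layers has at least one node
            return False
        for parent_idx, _name in layer:
            if i == 0:
                if parent_idx is not None:  # layer 1's input is the sample data set
                    return False
            elif not (0 <= parent_idx < len(layers[i - 1])):
                return False                # must reference one previous-layer node
    return True


layers = [
    [(None, "segmentation")],              # layer 1
    [(0, "detection"), (0, "ocr")],        # layer 2: both fed by layer-1 node 0
    [(1, "classification")],               # layer 3: fed by layer-2 node 1 (ocr)
]
print(validate_layers(layers))   # True
```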
  • the embodiments of this application do not limit the specific number of sample image data sets.
  • The nodes (function controls) in the tree-structured model include at most four types of function controls: image segmentation function controls, image classification function controls, image target detection function controls, and optical character recognition function controls.
  • The tree-structured model can contain one or more of the above four function controls, and the same function control may also appear one or more times, which provides richer node types. Users can select all or part of the functions in the model for output according to the actual situation. Any node in a given layer of the tree-structured model is connected to exactly one node in the layer above it, rather than to all nodes in that layer; nodes within the same layer are not connected to one another; and the input of each node is the output of the node in the previous layer to which it is connected.
  • In response to a user's first operation, the control options of a certain layer of the tree structure are displayed, and the control options include the above four function controls (node types).
  • In response to the user's second operation, one or more of the four function controls are selected.
  • a tool for performing sample processing on the sample image data set is displayed, and the user can use the sample processing tool to perform some auxiliary operations on the sample image data set.
  • In some embodiments, the tree-structured model includes a first sub-model, and the first sub-model includes an m-th layer and an (m+1)-th layer, where the node of the m-th layer is an image segmentation function control and the node of the (m+1)-th layer is an image target detection function control, m being a positive integer less than N.
  • a tree-structured model may include multiple sub-models.
  • the first submodel is one of multiple submodels. Any functional control in the tree-structured model can be regarded as a sub-model.
  • the first sub-model includes an image segmentation function control and an image target detection function control. That is, the first submodel includes two functional controls in series.
  • this application does not limit the specific functional controls included in the first sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • In some embodiments, the tree-structured model includes a second sub-model, and the second sub-model includes the m-th layer, the (m+1)-th layer, and the (m+2)-th layer, where the node of the m-th layer is an image segmentation function control, the node of the (m+1)-th layer is an image target detection function control, and the node of the (m+2)-th layer is an optical character recognition function control, m being a positive integer less than N.
  • an optical character recognition function control is added after the image target detection function control of the first sub-model, so the second sub-model includes the first sub-model and the newly added optical character recognition function control.
  • the second sub-model includes image segmentation function controls, image target detection function controls, and optical character recognition function controls.
  • this application does not limit the specific functional controls included in the second sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • the second sub-model is connected in series with the first sub-model. According to the actual situation, the output data of the first sub-model or of the second sub-model can be selected as the final output data, or the output data of any function control can be selected as the final output data.
  • the tree-structured model includes a third sub-model, and the third sub-model includes a j-th layer and a (j+1)-th layer, where the node of the j-th layer is the optical character recognition control and the node of the (j+1)-th layer is the image classification control, j being a positive integer less than N.
  • the third sub-model is connected in parallel with the first sub-model (or second sub-model).
  • the third sub-model includes an optical character recognition function control and an image classification function control.
  • this application does not limit the specific functional controls included in the third sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • users can modify the tree-structured model in real time according to actual needs. For example, when the user performs a deletion operation on the image target detection function control in the second sub-model, the image target detection function control and all subsequent function controls can be deleted. In other words, through a single deletion operation the user can delete the corresponding function control together with every function control after it, which improves the flexibility of model generation.
  • the sample processing is one of: labeling processing, sample selection processing, scaling processing, preprocessing.
  • the function corresponding to the image segmentation function control is a segmentation algorithm based on deep learning.
  • a model generation method based on a tree structure is provided. Users can choose different functional modules to process data according to specific needs, and can also choose multiple branch functions for further processing of the processed data. This method improves the flexibility of model generation and can cope with more complex data processing scenarios. At the same time, users can intuitively and conveniently select different functional modules on the user interface according to specific needs, which reduces the difficulty of learning and using the model.
  • Embodiments of the present application provide a computer program product.
  • when the computer program product is run on an electronic device, it causes the electronic device to execute the technical solutions in the above embodiments.
  • the implementation principles and technical effects are similar to the above-mentioned method-related embodiments, and will not be described again here.
  • Embodiments of the present application provide a readable storage medium.
  • the readable storage medium contains instructions.
  • when the instructions are run on an electronic device, the electronic device executes the technical solutions of the above embodiments.
  • the implementation principles and technical effects are similar and will not be described again here.
  • Embodiments of the present application provide a chip, which is used to execute instructions. When the chip is running, it executes the technical solutions in the above embodiments. The implementation principles and technical effects are similar and will not be described again here.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
  • units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in the embodiments of the present application may be integrated into one processing unit, may exist physically alone, or two or more units may be integrated into one unit.
  • if this function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

Embodiments of the present application provide a model generation method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a sample image data set; and determining a tree-structured model according to the sample image data set, the tree structure comprising N layers, each of the N layers comprising at least one node, the input of a node of the first layer being the sample image data set, and the input of a node of the i-th layer being the output of one of the nodes in the (i-1)-th layer, where N is a positive integer greater than 1 and i = 2, ..., N. Each node is one of the following controls: an image segmentation function control, an image classification function control, an image target detection function control, or an optical character recognition function control. With this method, a user can visually and conveniently select different functional modules on a user interface to process data; the flexibility of model generation is improved, and relatively complex data processing scenarios can be handled conveniently.

Description

Model generation method, apparatus, electronic device, and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on July 26, 2022, with application number 202210881785.2 and titled "Model generation method, apparatus, electronic device and storage medium", the entire content of which is incorporated into this application by reference.
Technical field
The embodiments of the present application relate to the field of artificial intelligence technology, and in particular to a model generation method, apparatus, electronic device, and storage medium.
Background
At present, artificial intelligence (AI) has received widespread attention from both academia and industry, and AI technology is widely applied owing to its fast processing speed and high processing accuracy, for example in face recognition, image classification, object detection, and speech recognition.

Before applying an AI model, a suitable AI model first needs to be built or selected, and then trained and optimized. AI development platforms are now commonly used to provide users with services such as the selection, construction, verification, and tuning of AI models for given task goals.

However, AI development platforms offer limited flexibility. How to obtain models with diverse functions, and how to meet the data processing requirements of complex scenarios, are problems that urgently need to be solved.
Summary
In a first aspect, a model generation method is provided, comprising: acquiring a sample image data set; and determining a tree-structured model according to the sample image data set, the tree structure comprising N layers, each of the N layers including at least one node, where the input of a node in the first layer is the sample image data set and the input of a node in the i-th layer is the output of one of the nodes in the (i-1)-th layer, N being a positive integer greater than 1 and i = 2, ..., N; each node is one of the following controls: an image segmentation function control, an image classification function control, an image target detection function control, or an optical character recognition function control.
It should be understood that the embodiments of the present application do not limit the specific number of sample image data sets. The nodes (function controls) in the tree-structured model can be of at most the following four types: image segmentation function control, image classification function control, image target detection function control, and optical character recognition function control. The tree-structured model can contain one or more of these four function controls, and the same function control can also appear one or more times, which provides richer node types. Users can select all or some of the functions in the model for output according to the actual situation. Any node in a given layer of the tree-structured model is connected to only one node in the layer above it, not to all nodes in that layer. Moreover, nodes within the same layer are not connected to each other, and the input of each node is the output of the node in the previous layer to which it is connected.
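The connectivity rule above (each node has exactly one parent, no links within a layer, and a node's input is its parent's output) can be sketched in a few lines of Python. This is an illustrative toy, not the patent's implementation; the `Node` class and the stand-in "controls" are invented for the example.

```python
class Node:
    """One functional control in the tree-structured model."""
    def __init__(self, name, func, parent=None):
        self.name = name
        self.func = func          # the control's processing function
        self.parent = parent      # exactly one parent (None for layer 1)
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def run(self, root_input):
        """This node's output: its function applied to the parent's output."""
        if self.parent is None:   # a layer-1 node takes the data set itself
            return self.func(root_input)
        return self.func(self.parent.run(root_input))

# Toy stand-ins for segmentation / detection / OCR on a list of strings:
segment = Node("segmentation", lambda xs: [x.strip() for x in xs])
detect  = Node("detection", lambda xs: [x for x in xs if x], parent=segment)
ocr     = Node("ocr", lambda xs: [x.upper() for x in xs], parent=detect)

print(ocr.run([" cat ", "", " dog "]))   # ['CAT', 'DOG']
```

Because any node can be queried with `run`, the output of any control in the tree can be taken as a final output, matching the selection behavior described above.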
In the embodiments of the present application, a method for generating a tree-structured model is provided. The input of each node in the model is the output of the node in the previous layer connected to it. Users can select the functions corresponding to different nodes (function controls) as needed to process the sample image data set. Moreover, the four node types that the model may include are well suited to industrial scenarios; to a certain extent this simplifies the model generation process, improves the flexibility of model generation, makes it possible to cope with more complex data processing scenarios, and reduces the difficulty for users to learn and use the model.
With reference to the first aspect, in some implementations of the first aspect, determining the tree-structured model according to the sample image data set includes: in response to a first operation by the user, displaying the control options of the k-th layer of the tree structure; in response to a second operation by the user, determining the node of the k-th layer from the control options of the k-th layer; according to the determined node of the k-th layer, displaying a tool for performing sample processing on the sample image data set; and training the tree-structured model using the sample image data set after the sample processing, where k = 1, 2, ..., N.
It should be understood that, in response to the user's first operation, the control options of a given layer of the tree structure are displayed, and the control options include the above four function controls (node types). Through the user's second operation, one or more of the four function controls are selected. In addition, a tool for performing sample processing on the sample image data set is displayed, and the user can use this tool to carry out auxiliary operations on the sample image data set.
With reference to the first aspect, in some implementations of the first aspect, the tree-structured model includes a first sub-model, and the first sub-model includes an m-th layer and an (m+1)-th layer, where the node of the m-th layer is the image segmentation function control and the node of the (m+1)-th layer is the image target detection function control, m being a positive integer less than N.

It should be understood that a tree-structured model may include multiple sub-models, the first sub-model being one of them; any function control in the tree-structured model can be regarded as a sub-model. In an example, the first sub-model includes an image segmentation function control and an image target detection function control; that is, the first sub-model includes two function controls in series. Of course, this application does not limit the specific function controls included in the first sub-model, which can be determined according to the actual needs of the user.

In the embodiments of the present application, the tree-structured model generation method can be displayed intuitively on the user interface, which makes it easier for the user to carry out the model generation process and reduces the learning difficulty.
With reference to the first aspect, in some implementations of the first aspect, the tree-structured model includes a second sub-model, and the second sub-model includes the m-th layer, the (m+1)-th layer, and the (m+2)-th layer, where the node of the m-th layer is the image segmentation function control, the node of the (m+1)-th layer is the image target detection function control, and the node of the (m+2)-th layer is the optical character recognition function control, m being a positive integer less than N.
It should be understood that an optical character recognition function control is added after the image target detection function control of the first sub-model, so the second sub-model includes the first sub-model and the newly added optical character recognition function control. In an example, the second sub-model includes an image segmentation function control, an image target detection function control, and an optical character recognition function control. Of course, this application does not limit the specific function controls included in the second sub-model, which can be determined according to the actual needs of the user. This can be understood as the second sub-model being connected in series with the first sub-model: according to the actual situation, the output data of the first sub-model or of the second sub-model can be selected as the final output data, or the output data of any function control can be selected as the final output data.
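The idea that the output of any function control in a serial chain can serve as the final output can be sketched as follows. The helper and the toy stage functions are invented for illustration; they stand in for the real segmentation, detection, and OCR controls.

```python
def run_chain(stages, data):
    """stages: list of (name, func) applied in series.
    Returns a dict mapping each stage name to that stage's output,
    so the caller can pick any control's output as the final result."""
    outputs = {}
    for name, func in stages:
        data = func(data)
        outputs[name] = data
    return outputs

# Toy stand-ins for the serial chain segmentation -> detection -> OCR:
chain = [("segmentation", lambda xs: [x.lower() for x in xs]),
         ("detection",    lambda xs: sorted(xs)),
         ("ocr",          lambda xs: "".join(xs))]

results = run_chain(chain, ["B", "A"])
print(results["detection"])   # ['a', 'b'] — an intermediate output chosen as final
```

Running the chain once and keeping every stage's result is one simple way to let the user select the first sub-model's output, the second sub-model's output, or any single control's output after the fact.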
With reference to the first aspect, in some implementations of the first aspect, the tree-structured model includes a third sub-model, and the third sub-model includes a j-th layer and a (j+1)-th layer, where the node of the j-th layer is the optical character recognition control and the node of the (j+1)-th layer is the image classification control, j being a positive integer less than N.

It should be understood that the third sub-model is connected in parallel with the first sub-model (or the second sub-model). In an example, the third sub-model includes an optical character recognition function control and an image classification function control. Of course, this application does not limit the specific function controls included in the third sub-model, which can be determined according to the actual needs of the user.
Optionally, the user can modify the tree-structured model in real time according to actual needs. For example, when the user performs a deletion operation on the image target detection function control in the second sub-model, the image target detection function control and all function controls after it can be deleted. In other words, through a single deletion operation the user can delete the corresponding function control together with every function control after it, which improves the flexibility of model generation.
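A minimal sketch of the single-operation deletion described above, representing the model as nested dicts (an assumption made for illustration, not the patent's data structure): removing a node's key drops that node and, with it, every control below it.

```python
def build():
    # A toy model: segmentation -> detection -> ocr, with a parallel
    # classification branch at the top level.
    return {"segmentation": {"detection": {"ocr": {}}},
            "classification": {}}

def delete(tree, name):
    """Remove the control `name` wherever it occurs; the nested dict
    (i.e. all of its descendant controls) is removed along with it."""
    for key in list(tree):
        if key == name:
            del tree[key]
        else:
            delete(tree[key], name)

model = build()
delete(model, "detection")
print(model)   # {'segmentation': {}, 'classification': {}}
```

Deleting "detection" also removes "ocr" in one step, mirroring the cascade behavior in the paragraph above.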
With reference to the first aspect, in some implementations of the first aspect, the sample processing is one of the following: labeling processing, sample selection processing, scaling processing, and preprocessing.

With reference to the first aspect, in some implementations of the first aspect, the function corresponding to the image segmentation function control is a deep-learning-based segmentation algorithm.
In a second aspect, an electronic device is provided, including: one or more processors; and one or more memories storing one or more computer programs, the one or more computer programs including instructions that, when executed by the one or more processors, cause the electronic device to perform the method in the first aspect or any possible implementation thereof.

In a third aspect, a model generation apparatus is provided, including a processor coupled to a memory, the memory being used to store a computer program and the processor being used to run the computer program, so that the model generation apparatus performs the method in the first aspect or any possible implementation thereof.

With reference to the third aspect, in some implementations of the third aspect, the model generation apparatus further includes one or both of the memory and a transceiver, the transceiver being used to receive and/or send signals.

In a fourth aspect, a computer-readable storage medium is provided, the computer-readable storage medium including a computer program or instructions that, when run on a computer, cause the method in the first aspect or any possible implementation thereof to be performed.

In a fifth aspect, a computer program product is provided, the computer program product including a computer program or instructions that, when run on a computer, cause the method in the first aspect or any possible implementation thereof to be performed.

In a sixth aspect, a computer program is provided that, when run on a computer, causes the method in the first aspect or any possible implementation thereof to be performed.
Description of the drawings
Figure 1 is a schematic diagram of the system architecture provided by an embodiment of the present application.

Figure 2 is a framework diagram of the model generation method provided by an embodiment of the present application.

Figure 3 is a schematic diagram of a model generation method provided by an embodiment of the present application.

Figure 4 is a schematic diagram of functional modules provided by an embodiment of the present application.

Figure 5 is a processing flow chart of a data set provided by an embodiment of the present application.

Figure 6 is a schematic diagram of image segmentation provided by an embodiment of the present application.

Figure 7 is a processing flow chart of another data set provided by an embodiment of the present application.

Figure 8 is a processing flow chart of another data set provided by an embodiment of the present application.

Figure 9 is a schematic diagram of a data set processing flow provided by an embodiment of the present application.

Figure 10 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.

Figure 11 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.

Figure 12 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.

Figure 13 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.

Figure 14 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.

Figure 15 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.

Figure 16 is a schematic flow chart of a model generation method provided by an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings.
The terminology used in the following embodiments is for the purpose of describing specific embodiments only and is not intended to limit the application. As used in the specification and the appended claims of this application, the singular forms "a", "an", "the", "the above", and "this" are intended to also cover expressions such as "one or more", unless the context clearly indicates otherwise. It should also be understood that the term "and/or" describes the association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B can mean: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship.
Reference in this specification to "one embodiment", "some embodiments", and the like means that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of the application. Therefore, the phrases "in one embodiment", "in some embodiments", "in other embodiments", and so on appearing in different places in this specification do not necessarily refer to the same embodiment, but rather mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "including", "includes", "having", and their variants all mean "including but not limited to", unless otherwise specifically emphasized.
The model generation method provided by this application can be applied in the system architecture shown in Figure 1, in which the terminal device 11 communicates with the server 12 over a network. The terminal device 11 sends a sample image data set and a user intention to the server 12, where the user intention represents the processing that the user requires for the sample image data set. The server 12 trains a corresponding model on the sample image data set according to the user intention, and finally generates the model that the user actually needs. That is to say, the user first uploads the data set to be processed to the AI development platform provided by the embodiments of this application, and then selects the corresponding processing operations according to actual needs to generate a model.
It should be understood that the terminal device 11 in the above system architecture diagram includes, but is not limited to, mobile phones, tablet computers, wearable electronic devices with wireless communication functions (such as smart watches), and the like. Exemplary embodiments of portable electronic devices include, but are not limited to, portable electronic devices running the operating systems indicated in image PCTCN2022126426-appb-000001 or other operating systems. It should also be understood that in some other embodiments the above-mentioned electronic device may not be a portable electronic device but a desktop computer. The server 12 can be implemented as an independent server or as a server cluster composed of multiple servers.
Figure 2 shows a framework diagram of a model generation method provided by an embodiment of the present application. As shown in Figure 2, the model generation framework 200 includes an image segmentation module 210, an image classification module 220, an image target detection module 230, and an optical character recognition (OCR) module 240. Each of these modules is introduced in detail below.
The image segmentation module 210 can be used to perform image segmentation. Image segmentation is the technique and process of dividing an image into a number of specific regions with unique properties and extracting objects of interest; it is a key step from image processing to image analysis. Existing image segmentation methods mainly fall into the following categories: threshold-based methods, region-based methods, edge-based methods, and methods based on specific theories. The embodiments of the present application mainly use deep-learning-based segmentation algorithms. From a mathematical perspective, image segmentation is the process of dividing a digital image into mutually disjoint regions. The segmentation process is also a labeling process, in which pixels belonging to the same region are assigned the same number. Image segmentation can be applied to pixel-level detection and edge recognition of inspected objects, for example identifying crack regions on silicon wafers or damaged areas on bearings and other fine parts.
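The "labeling" view of segmentation mentioned above — pixels of the same region receive the same number — can be illustrated with a simple connected-component labeling pass. This is a classical toy baseline for intuition only, not the deep-learning segmentation the application actually uses.

```python
def label_regions(grid):
    """Assign the same label to 4-connected pixels of equal value."""
    h, w = len(grid), len(grid[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx]:
                continue                      # already labeled
            next_label += 1
            labels[sy][sx] = next_label
            stack = [(sy, sx)]                # flood fill from this seed
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not labels[ny][nx]
                            and grid[ny][nx] == grid[sy][sx]):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
    return labels

image = [[0, 0, 1],
         [0, 1, 1],
         [0, 0, 0]]
print(label_regions(image))   # [[1, 1, 2], [1, 2, 2], [1, 1, 1]]
```

Every pixel of the connected background region gets label 1 and every pixel of the foreground blob gets label 2, which is exactly the disjoint-region partition described in the paragraph above.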
The image classification module 220 can be used to perform image classification. Image classification is an image processing method that distinguishes targets of different categories based on the different features they exhibit in image information. Image classification can be applied to classifying and judging inspected materials, for example making a binary pass/fail judgment on a material, detecting the color of an inspected object, detecting food types, subdividing defects, or classifying inspected objects by material.
图像目标检测模块230,可以用于执行图像目标检测。图像目标检测是一种利用图像处理与模式识别等领域的理论和方法。可以从图像中定位感兴趣的目标,判断每个目标的具体类别,并给出每个目标的边界框。图像目标检测可以应用于对检测物料中的目标进行定位及分类,适合多目标检测、小目标检测或计数等。例如,确定药品药丸的数量、确定部件的缺陷定位等。The image object detection module 230 may be used to perform image object detection. Image target detection is a theory and method that utilizes the fields of image processing and pattern recognition. Objects of interest can be located from the image, the specific category of each object is determined, and the bounding box of each object is given. Image target detection can be used to locate and classify targets in detection materials, and is suitable for multi-target detection, small target detection or counting, etc. For example, determine the number of pharmaceutical pills, determine the location of defects in components, etc.
光学字符识别模块240,可以用于执行光学字符识别。光学字符识别是指对文本资料的图像文件进行分析识别处理,获取文字及版面信息的过程。光学字符识别可以应用于单字符标注与识别、多字符标注与识别。能够打破传统方技术上的局限性,解曲线字符识别、低对比度字符识别、较大字符识别等复杂的字符识别问题。例如,可以识别精细零件上的文字。The optical character recognition module 240 may be used to perform optical character recognition. Optical character recognition refers to the process of analyzing and recognizing image files of text materials to obtain text and layout information. Optical character recognition can be applied to single-character labeling and recognition, and multi-character labeling and recognition. It can break the limitations of traditional methods and solve complex character recognition problems such as curve character recognition, low contrast character recognition, and large character recognition. For example, text on fine parts can be recognized.
Figure 3 is a schematic diagram of a model generation method provided by an embodiment of this application. As shown in the figure, data set 300 includes, but is not limited to, an image data set, a video data set, or a text data set. The generated model has a tree structure. According to actual needs, the user can select modules with different functions to process data set 300, such as the image segmentation module, image classification module, image object detection module, or optical character recognition module. Figure 3 uses only these four modules as an example. The user can select the required modules to build tree branches and thereby carry out the next processing step on data set 300.
By way of example, Figure 3 shows a model generating three serial or parallel schemes. In the first, data set 300 undergoes image segmentation, image object detection, optical character recognition, and image classification in sequence. In the second, data set 300 undergoes image segmentation, image object detection, and optical character recognition in sequence. In the third, only image classification is performed on data set 300.
Taking the first scheme as an example: after data set 300 undergoes image segmentation, the resulting segmentation data is passed to the image object detection module. In other words, the input of each module is the output of the previous module.
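The rule that each module's input is the previous module's output can be sketched as a simple chain. The stage functions below are hypothetical stand-ins for the four function modules, not the actual deep-learning models:

```python
def run_pipeline(data, stages):
    """Feed each stage's output into the next stage's input."""
    for stage in stages:
        data = stage(data)
    return data

# Hypothetical stand-ins for the four function modules; each one just
# records that it ran, in place of a real model.
def segment(images):
    return [f"seg({img})" for img in images]

def detect(images):
    return [f"det({img})" for img in images]

def ocr(images):
    return [f"ocr({img})" for img in images]

def classify(images):
    return [f"cls({img})" for img in images]

# The first scheme of Figure 3: segmentation, detection, OCR, classification.
result = run_pipeline(["img"], [segment, detect, ocr, classify])
print(result)  # ['cls(ocr(det(seg(img))))']
```

The nesting of the result makes the data flow explicit: classification sees what OCR produced, which saw what detection produced, and so on back to the raw data set.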
Optionally, the user can freely combine and connect the above four modules according to actual needs, except that no further function module can be connected after image classification.
Optionally, once the tree structure shown in Figure 3 has been built, the user can selectively generate the model corresponding to it, for example by outputting a software development kit (SDK) corresponding to the model.
It should be understood that if the tree structure contains N child nodes, there will ultimately be N outputs. Each child node in the structure corresponds to one complete scheme, and the user can choose to output all or only some of these schemes according to actual needs.
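One way to read this is that each complete scheme is the path from the data set down to a leaf of the tree, so enumerating leaves enumerates the outputs (the user may also stop at an intermediate node for a partial scheme). The tree below mirrors Figure 3, but its exact shape and node names are assumptions for illustration:

```python
def leaf_paths(tree, prefix=()):
    """Yield one root-to-leaf path per leaf; each path is a complete scheme."""
    name, children = tree
    path = prefix + (name,)
    if not children:
        yield path
    for child in children:
        yield from leaf_paths(child, path)

# Hypothetical layout of the three schemes of Figure 3.
tree = ("dataset 300", [
    ("segmentation 310", [("detection 320", [
        ("OCR 330", [("classification 340", [])]),
        ("OCR 360", []),
    ])]),
    ("classification 350", []),
])

paths = list(leaf_paths(tree))
print(len(paths))  # 3
```

With three leaves there are three outputs, matching the three schemes described for Figure 3.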
By way of example, the user can select or delete function modules through an input device such as a mouse. As shown in Figure 4, when the user left-clicks data set 300, four function modules appear for selection: image segmentation, image classification, image object detection, and optical character recognition. Left-clicking image segmentation performs image segmentation on data set 300; clicking image classification performs image classification on data set 300; clicking image object detection performs image object detection on data set 300; and clicking optical character recognition performs optical character recognition on data set 300.
Likewise, after the user clicks image segmentation, the same four function modules appear again so that the user can choose the specific task to perform next. That is, after each processing step the user can choose the next specific processing method, and there may be one or more such methods; this application does not limit this.
By way of example, when the user right-clicks the image segmentation module, this deletes the module. If other function modules follow the image segmentation module, right-clicking it also deletes those downstream modules. For example, as shown in Figure 3, when the user right-clicks the image object detection module 320, the image object detection module 320, the optical character recognition modules 330 and 360, and the image classification module 340 are all deleted, while the image segmentation module 310 and the image classification module 350 in Figure 3 are retained.
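The right-click deletion described above removes a node together with its entire subtree. A minimal sketch, with the tree shape and module names assumed from Figure 3:

```python
def delete_subtree(tree, target):
    """Return a copy of the tree with `target` and everything below it removed."""
    name, children = tree
    if name == target:
        return None  # this node and its whole subtree are dropped
    kept = [sub for sub in (delete_subtree(child, target) for child in children)
            if sub is not None]
    return (name, kept)

# Hypothetical layout of Figure 3.
tree = ("dataset 300", [
    ("segmentation 310", [("detection 320", [
        ("OCR 330", [("classification 340", [])]),
        ("OCR 360", []),
    ])]),
    ("classification 350", []),
])

# Deleting detection 320 also removes OCR 330, OCR 360, and classification 340;
# segmentation 310 and classification 350 survive.
print(delete_subtree(tree, "detection 320"))
```

Because deletion is recursive, one operation on any node prunes every scheme that passed through it, which is the behavior the example above describes.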
It should be understood that selecting or deleting function modules through an input device such as a mouse is only one example. In the embodiments of this application, selecting, deleting, or changing modules can also be implemented in other ways, such as dragging function modules or setting controls.
For ease of understanding, the processing flow of data set 300 in Figure 3 is described in detail below, taking as an example a data set 300 uploaded by the user that contains 100 images.
Figure 5 shows one processing flow for the data set in Figure 3. The specific steps are as follows:
S501: image segmentation.
It should be understood that the 100 images in data set 300 are segmented. To aid understanding, Figure 6 is an example of image data provided by this application. As shown in Figure 6, the segmented image 600 may include four regions 610, 620, 630, and 640. These four regions are only an example; other regions may also be included. Image segmentation can assign the same number to pixels belonging to the same region.
S502: image object detection.
It should be understood that objects of interest are located in the regions segmented from the 100 images in step S501, the specific category of each object is determined, and a bounding box is produced for each object. For example, an object may be a trademark logo or a description label.
S503: optical character recognition.
It should be understood that character recognition analysis is performed on the objects located in step S502.
S504: image classification.
It should be understood that the 100 images from step S503 are classified. For example, the region numbered A in Figure 6 is assigned to one category, the region numbered B to another, the region numbered C to another, and the region numbered D to another.
Through steps S501 to S504, a model can be generated for processing images through the pipeline of image segmentation, image object detection, optical character recognition, and image classification.
Figure 7 shows another processing flow for the data set in Figure 3. The specific steps are as follows:
S701: image segmentation.
It should be understood that the 100 images in data set 300 are segmented. Image segmentation can assign the same number to pixels belonging to the same region.
S702: image object detection.
It should be understood that objects of interest are located in the regions segmented from the 100 images in step S701, the specific category of each object is determined, and a bounding box is produced for each object. For example, an object may be a trademark logo or a description label.
S703: optical character recognition.
It should be understood that character recognition analysis is performed on the object data located in step S702.
Through steps S701 to S703, a model can be generated for processing images through the pipeline of image segmentation, image object detection, and optical character recognition.
Figure 8 shows another processing flow for the data set in Figure 3. The specific steps are as follows:
S801: image classification.
According to the different features reflected in the image information, regions of different categories in the 100 images of data set 300 are separated and marked with different numbers.
Through step S801, a model is generated that can directly perform image classification on images.
For ease of understanding, taking the task of finding defects on a component as an example, Figures 9, 10, 11, and 12 show schematic diagrams of a data set being processed in one tree-structure model.
Figure 9 shows the processing flow of this tree-structure model. On one branch, the data set undergoes image object detection (detection 1 in Figure 9) followed by optical character recognition (OCR 1 in Figure 9). On the other branch, the data set undergoes image object detection (detection 1 in Figure 9) followed by image classification (classification 1 in Figure 9). Detection 1, OCR 1, and classification 1 in Figure 9 can be regarded as different function controls.
Figure 10 is a schematic diagram of detection 1 in the first branch of the processing flow. The image object detection corresponding to detection 1 locates a specific region in the image; in the current setting, it is the region above the characters "9BC3".
Figure 11 is a schematic diagram of OCR 1 in the first branch of the processing flow. Character recognition is performed on the region located by detection 1. As the figure shows, OCR 1 can also post-process the output of detection 1; this post-processing may shift the determined region by a fixed offset so that the character region "9BC3" in the figure is located.
It should be understood that the post-processing above can further adjust the results of the four kinds of processing applied to the sample image data set. For example, the region corresponding to the processed output can be offset or scaled by a certain amount so that it becomes more accurate.
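The offset-and-scale post-processing of a located region can be sketched as follows. The (x, y, w, h) box format and the centre-based scaling are assumptions made for illustration, not a format mandated by the application:

```python
def adjust_region(box, dx=0, dy=0, scale=1.0):
    """Offset a located region by (dx, dy) and scale it about its centre.

    `box` is assumed to be (x, y, w, h) with (x, y) the top-left corner.
    """
    x, y, w, h = box
    cx, cy = x + w / 2 + dx, y + h / 2 + dy  # shifted centre
    nw, nh = w * scale, h * scale            # scaled extent
    return (cx - nw / 2, cy - nh / 2, nw, nh)

# Shift a detected region down by a fixed 10 px so that it covers the
# character area below it, as in the OCR 1 example.
print(adjust_region((100, 40, 80, 20), dy=10))  # (100.0, 50.0, 80.0, 20.0)
```

A scale below 1.0 shrinks the region (for instance, to keep only the numeric characters), while a nonzero offset implements the fixed shift described for OCR 1.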
By way of example, after the sample image data set undergoes image segmentation, the image is divided into specific regions, and the user interface may display the segmented regions. The user can visually check whether the segmentation of the sample image data set is accurate; when a segmented region deviates from the region the user wants segmented, the specific region corresponding to the segmentation processing can be further adjusted to make the processing more accurate.
By way of example, after the sample image data set undergoes image classification, the images are classified into different regions. Under the preset classification criteria, the classification of the sample image data set may be inaccurate, with regions that should not belong to the same category nevertheless grouped together. The user can adjust the classification by further moving the region corresponding to the output, making the classification of the sample image data set more accurate.
By way of example, after the sample image data set undergoes image object detection, the objects of interest to the user are marked, for example by counting the number of such objects in the sample image data. When some objects were missed by the detection processing, the user can make further adjustments and manually annotate the uncounted objects to improve the counting accuracy.
By way of example, after the sample image data set undergoes optical character recognition, all characters on the current image are marked. In practice, however, the user may be interested only in the numeric characters on the image, and can further scale the region after recognition so that only the numeric characters are recognized.
It should be understood that the user can further process the data obtained from the above four kinds of processing according to actual needs. Further processing of the sample image data set includes, but is not limited to, labeling, sample selection, scaling, and preprocessing. Further processing can be understood as optimizing the processed data, which allows the finally trained model to process data more accurately.
Figure 12 is a schematic diagram of classification 1 in the other branch of the processing flow. The regions located by detection 1 are classified. As the figure shows, there are two types, single code and multiple code: the "9BC3" in the character region of Figure 11 is a single code, while the two "9BD4" strings above and below the character region in Figure 14 are multiple codes.
For ease of understanding, again taking the task of finding defects on a component as an example, Figures 13, 14, and 15 show schematic diagrams of a data set being processed in another tree-structure model.
Figure 13 shows the processing flow of this tree-structure model: the data set undergoes image object detection (detection 1 in Figure 13) followed by image segmentation (segmentation 1 in Figure 13). Detection 1 and segmentation 1 in Figure 13 can be regarded as different function controls.
Figure 14 is a schematic diagram of detection 1 in this processing flow. The image object detection corresponding to detection 1 locates a specific region in the image; in the current setting, it is the position of the diode in the image.
Figure 15 is a schematic diagram of segmentation 1 in this processing flow. For the located diode, segmentation 1 further identifies the defect region and segments it out.
Figure 16 shows a model generation method 1600 provided by an embodiment of this application. The method can be applied in the framework shown in Figure 2 and is described in detail below.
S1601: obtain a sample image data set.
It should be understood that the embodiments of this application do not limit the specific size of the sample image data set.
S1602: determine a tree-structure model based on the sample image data set.
It should be understood that the tree structure includes N layers, each of which includes at least one node. The input of a node in layer 1 is the sample image data set, and the input of a node in layer i is the output of one of the nodes in layer i-1, where N is a positive integer greater than 1 and i = 2, ..., N. Each node is one of the following controls: an image segmentation function control, an image classification function control, an image object detection function control, or an optical character recognition function control.
The embodiments of this application do not limit the specific size of the sample image data set. The nodes (function controls) in the tree-structure model can include at most the following four kinds of function control: an image segmentation function control, an image classification function control, an image object detection function control, and an optical character recognition function control. The model can contain one or more of these four function controls, and the same function control can appear more than once, which provides richer node types. The user can select all or some of the functions in the model for output according to the actual situation. Any node in a given layer of the tree-structure model is connected to exactly one node in the layer above it, not to all nodes of that layer; nodes within the same layer are not connected to each other, and the input of each node is the output of the node in the layer above to which it is connected.
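The layer and parent constraints just described (each node has exactly one parent; a layer-1 node takes the data set as input, and a layer-i node takes one layer-(i-1) output) can be modeled with a small node structure. The class below is an illustrative sketch under those assumptions, not the application's actual implementation:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The four kinds of function control a node may be.
CONTROLS = {"segmentation", "classification", "detection", "ocr"}

@dataclass
class Node:
    control: str                        # one of the four function controls
    parent: Optional["Node"] = None     # exactly one parent; None means layer 1
    children: List["Node"] = field(default_factory=list)

    def __post_init__(self):
        if self.control not in CONTROLS:
            raise ValueError(f"unknown control: {self.control}")
        if self.parent is not None:
            self.parent.children.append(self)

    @property
    def layer(self):
        """Layer index: 1 for a root node (fed by the data set), else parent + 1."""
        return 1 if self.parent is None else self.parent.layer + 1

root = Node("segmentation")              # layer 1: input is the sample data set
det = Node("detection", parent=root)     # layer 2: input is root's output
print(det.layer)  # 2
```

Because the parent is a single reference rather than a list, a node cannot be wired to more than one upstream node, which matches the constraint that each node connects to exactly one node in the layer above.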
In some embodiments: in response to a first operation by the user, the control options of layer k of the tree structure are displayed; in response to a second operation by the user, the node of layer k is determined from the control options of layer k; according to the determined node of layer k, a tool for performing sample processing on the sample image data set is displayed; and the sample-processed sample image data set is used to train the tree-structure model, where k = 1, 2, ..., N.
It should be understood that, in response to the user's first operation, the control options of a given layer of the tree structure are displayed, and these options include the four kinds of function control (node type) above. Through the user's second operation, one or more of the four function controls are selected. In addition, a tool for performing sample processing on the sample image data set is displayed, with which the user can carry out auxiliary operations on the sample image data set.
In some embodiments, the tree-structure model includes a first sub-model comprising layer m and layer m+1, where the node of layer m is an image segmentation function control and the node of layer m+1 is an image object detection function control, m being a positive integer less than N.
It should be understood that the tree-structure model can include multiple sub-models, of which the first sub-model is one; any function control in the model can be regarded as a sub-model. By way of example, the first sub-model includes an image segmentation function control and an image object detection function control, that is, two function controls in series. Of course, this application does not limit the specific function controls included in the first sub-model; they can be determined according to the user's actual needs.
In some embodiments, the tree-structure model includes a second sub-model comprising layer m, layer m+1, and layer m+2, where the node of layer m is an image segmentation function control, the node of layer m+1 is an image object detection function control, and the node of layer m+2 is an optical character recognition function control, m being a positive integer less than N.
It should be understood that an optical character recognition function control is added after the image object detection function control of the first sub-model, so the second sub-model includes the first sub-model plus the newly added optical character recognition function control. By way of example, the second sub-model includes an image segmentation function control, an image object detection function control, and an optical character recognition function control. Of course, this application does not limit the specific function controls included in the second sub-model; they can be determined according to the user's actual needs. The second sub-model can be understood as being connected in series with the first sub-model: depending on the actual situation, the output data of the first sub-model or of the second sub-model can be selected as the final output data, as can the output data of any individual function control.
In some embodiments, the tree-structure model includes a third sub-model comprising layer j and layer j+1, where the node of layer j is an optical character recognition control and the node of layer j+1 is an image classification control, j being a positive integer less than N.
It should be understood that the third sub-model is connected in parallel with the first sub-model (or the second sub-model). By way of example, the third sub-model includes an optical character recognition function control and an image classification function control. Of course, this application does not limit the specific function controls included in the third sub-model; they can be determined according to the user's actual needs.
Optionally, the user can modify the tree-structure model in real time according to actual needs. By way of example, when the user performs a deletion operation on the image object detection function control in the second sub-model, that control and everything after it can be deleted. In other words, with a single deletion operation the user can delete the corresponding function control together with every function control that follows it, which improves the flexibility of model generation.
In some embodiments, sample processing is one of the following: labeling, sample selection, scaling, or preprocessing.
In some embodiments, the function corresponding to the image segmentation function control is a deep-learning-based segmentation algorithm.
The embodiments of this application provide a tree-structure-based model generation method. The user can select different function modules to process data according to specific needs, and for the processed data can select multiple branch functions for the next processing step. This method improves the flexibility of model generation and can therefore handle more complex data-processing scenarios. At the same time, the user can intuitively and conveniently select different function modules on the user interface according to specific needs, which lowers the difficulty of learning and using the model.
Embodiments of this application provide a computer program product which, when run on an electronic device, causes the electronic device to execute the technical solutions of the above embodiments. The implementation principles and technical effects are similar to those of the related method embodiments above and are not repeated here.
Embodiments of this application provide a readable storage medium containing instructions which, when run on an electronic device, cause the electronic device to execute the technical solutions of the above embodiments. The implementation principles and technical effects are similar and are not repeated here.
Embodiments of this application provide a chip for executing instructions which, when running, executes the technical solutions of the above embodiments. The implementation principles and technical effects are similar and are not repeated here.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the embodiments of this application.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are only illustrative: the division into units is only a logical functional division, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, may exist physically on their own, or two or more units may be integrated into one unit.
If implemented in the form of a software functional unit and sold or used as an independent product, the function can be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the method in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only specific implementations of the embodiments of this application, but the protection scope of the embodiments of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the embodiments of this application shall be covered by the protection scope of the embodiments of this application. Therefore, the protection scope of the embodiments of this application shall be subject to the protection scope of the claims.

Claims (15)

  1. A model generation method, comprising:
    obtaining a sample image data set; and
    determining a model with a tree structure according to the sample image data set, the tree structure comprising N layers, each of the N layers comprising at least one node, wherein an input of a node in the 1st layer is the sample image data set, and an input of a node in an i-th layer is an output of one of the nodes in the (i-1)-th layer, where N is a positive integer greater than 1 and i = 2, ..., N;
    wherein each of the nodes is one of the following controls:
    an image segmentation function control, an image classification function control, an image object detection function control, and an optical character recognition function control.
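The tree structure recited in claim 1 — layer-1 nodes consuming the sample image data set and each deeper node consuming the output of one node in the layer above — can be sketched as a small data structure. The following Python is a minimal illustration only, not the claimed implementation; the node kinds mirror the four function controls, and the `process` callables are placeholders standing in for the controls' actual algorithms.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# The four node kinds named in claim 1 (illustrative labels).
NODE_KINDS = {"segmentation", "classification", "object_detection", "ocr"}

@dataclass
class Node:
    """One node of the tree-structured model.

    `process` is a placeholder for the control's algorithm; in the claim,
    a layer-(i+1) child consumes this node's output as its input.
    """
    kind: str
    process: Callable
    children: List["Node"] = field(default_factory=list)

    def run(self, data):
        out = self.process(data)
        # Feed this node's output to each child node one layer down;
        # a leaf simply returns its own output.
        return [child.run(out) for child in self.children] or out

# Layer 1 receives the sample image data set; layer 2 chains on its output.
root = Node("segmentation", lambda imgs: [f"region({i})" for i in imgs])
root.children.append(Node("ocr", lambda regions: [r.upper() for r in regions]))

result = root.run(["img0", "img1"])
print(result)  # [['REGION(IMG0)', 'REGION(IMG1)']]
```

The chaining rule of the claim appears in `run`: the parent's output `out` becomes the child's input, so an N-layer path is a composition of N processing steps rooted at the sample image data set.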
  2. The method according to claim 1, wherein determining the model with the tree structure according to the sample image data set comprises:
    in response to a first operation of a user, displaying control options for a k-th layer of the tree structure;
    in response to a second operation of the user, determining a node of the k-th layer from the control options for the k-th layer;
    displaying, according to the determined node of the k-th layer, a tool for performing sample processing on the sample image data set; and
    training the model with the tree structure by using the sample image data set after the sample processing,
    where k = 1, 2, ..., N.
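The layer-by-layer interactive flow of claim 2 (display options, let the user pick a node, then surface a sample-processing tool for that choice) can be outlined as a simple loop. This is a hedged sketch only: `build_tree_interactively`, `choose`, and `LAYER_OPTIONS` are hypothetical names introduced for illustration and do not appear in the application.

```python
# Illustrative option list mirroring the four function controls.
LAYER_OPTIONS = ["segmentation", "classification", "object_detection", "ocr"]

def build_tree_interactively(n_layers, choose):
    """Sketch of claim 2's flow; `choose(k, options)` stands in for the
    user's second operation (selecting a node from the displayed options)."""
    chosen = []
    for k in range(1, n_layers + 1):
        options = LAYER_OPTIONS        # first operation: display the k-th layer's options
        node = choose(k, options)      # second operation: pick the k-th layer's node
        assert node in options
        # Here the UI would show the sample-processing tool for `node`,
        # and training would use the processed sample image data set.
        chosen.append(node)
    return chosen

# Example: a scripted "user" picking segmentation for layer 1 and OCR for layer 2.
picks = iter(["segmentation", "ocr"])
layers = build_tree_interactively(2, lambda k, opts: next(picks))
print(layers)  # ['segmentation', 'ocr']
```

The point of the sketch is the ordering the claim fixes: option display, node selection, and sample processing happen per layer k before the assembled tree is trained on the processed data set.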
  3. The method according to claim 2, wherein the model with the tree structure comprises a first sub-model, the first sub-model comprising an m-th layer and an (m+1)-th layer, wherein a node of the m-th layer is the image segmentation function control and a node of the (m+1)-th layer is the image object detection function control, where m is a positive integer less than N.
  4. The method according to claim 2, wherein the model with the tree structure comprises a second sub-model, the second sub-model comprising an m-th layer, an (m+1)-th layer and an (m+2)-th layer, wherein a node of the m-th layer is the image segmentation function control, a node of the (m+1)-th layer is the image object detection function control, and a node of the (m+2)-th layer is the optical character recognition function control, where m is a positive integer less than N.
  5. The method according to any one of claims 2 to 4, wherein the sample processing is one of the following:
    labeling processing, sample selection processing, scaling processing, and preprocessing.
  6. The method according to claim 5, wherein a function corresponding to the image segmentation function control is a deep-learning-based segmentation algorithm.
  7. An electronic device, comprising:
    one or more processors; and
    one or more memories,
    wherein the one or more memories store one or more computer programs, the one or more computer programs comprising instructions that, when executed by the one or more processors, cause the electronic device to perform the following steps:
    obtaining a sample image data set; and
    determining a model with a tree structure according to the sample image data set, the tree structure comprising N layers, each of the N layers comprising at least one node, wherein an input of a node in the 1st layer is the sample image data set, and an input of a node in an i-th layer is an output of one of the nodes in the (i-1)-th layer, where N is a positive integer greater than 1 and i = 2, ..., N;
    wherein each of the nodes is one of the following controls:
    an image segmentation function control, an image classification function control, an image object detection function control, and an optical character recognition function control.
  8. The electronic device according to claim 7, wherein, for determining the model with the tree structure according to the sample image data set, the instructions, when executed by the one or more processors, cause the electronic device to perform the following steps:
    in response to a first operation of a user, displaying control options for a k-th layer of the tree structure;
    in response to a second operation of the user, determining a node of the k-th layer from the control options for the k-th layer;
    displaying, according to the determined node of the k-th layer, a tool for performing sample processing on the sample image data set; and
    training the model with the tree structure by using the sample image data set after the sample processing,
    where k = 1, 2, ..., N.
  9. The electronic device according to claim 8, wherein the model with the tree structure comprises a first sub-model, the first sub-model comprising an m-th layer and an (m+1)-th layer, wherein a node of the m-th layer is the image segmentation function control and a node of the (m+1)-th layer is the image object detection function control, where m is a positive integer less than N.
  10. The electronic device according to claim 8, wherein the model with the tree structure comprises a second sub-model, the second sub-model comprising an m-th layer, an (m+1)-th layer and an (m+2)-th layer, wherein a node of the m-th layer is the image segmentation function control, a node of the (m+1)-th layer is the image object detection function control, and a node of the (m+2)-th layer is the optical character recognition function control, where m is a positive integer less than N.
  11. The electronic device according to any one of claims 8 to 10, wherein the sample processing is one of the following:
    labeling processing, sample selection processing, scaling processing, and preprocessing.
  12. The electronic device according to claim 11, wherein a function corresponding to the image segmentation function control is a deep-learning-based segmentation algorithm.
  13. A model generation apparatus, comprising a processor coupled to a memory, wherein the memory is configured to store a computer program, and the processor is configured to run the computer program so that the model generation apparatus performs the method according to any one of claims 1 to 6.
  14. The model generation apparatus according to claim 13, further comprising one or more of the memory and a transceiver, the transceiver being configured to receive signals and/or send signals.
  15. A computer-readable storage medium, comprising a computer program or instructions which, when run on a computer, cause the method according to any one of claims 1 to 6 to be performed.
PCT/CN2022/126426 2022-07-26 2022-10-20 Model generation method and apparatus, electronic device, and storage medium WO2024021321A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210881785.2 2022-07-26
CN202210881785.2A CN114943976B (en) 2022-07-26 2022-07-26 Model generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2024021321A1 true WO2024021321A1 (en) 2024-02-01

Family

ID=82911496

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/126426 WO2024021321A1 (en) 2022-07-26 2022-10-20 Model generation method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN114943976B (en)
WO (1) WO2024021321A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114943976B (en) * 2022-07-26 2022-10-11 深圳思谋信息科技有限公司 Model generation method and device, electronic equipment and storage medium

Citations (5)

Publication number Priority date Publication date Assignee Title
CN108416363A (en) * 2018-01-30 2018-08-17 平安科技(深圳)有限公司 Generation method, device, computer equipment and the storage medium of machine learning model
CN109949031A (en) * 2019-04-02 2019-06-28 山东浪潮云信息技术有限公司 A kind of machine learning model training method and device
CN112990423A (en) * 2019-12-16 2021-06-18 华为技术有限公司 Artificial intelligence AI model generation method, system and equipment
US20220058531A1 (en) * 2020-08-19 2022-02-24 Royal Bank Of Canada System and method for cascading decision trees for explainable reinforcement learning
CN114943976A (en) * 2022-07-26 2022-08-26 深圳思谋信息科技有限公司 Model generation method and device, electronic equipment and storage medium

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
CN108229286A (en) * 2017-05-27 2018-06-29 北京市商汤科技开发有限公司 Language model generates and application process, device, electronic equipment and storage medium
CN107899244A (en) * 2017-11-29 2018-04-13 武汉秀宝软件有限公司 A kind of construction method and system of AI models
EP3570164B1 (en) * 2018-05-14 2023-04-26 Schneider Electric Industries SAS Method and system for generating a mobile application from a desktop application
CN109948668A (en) * 2019-03-01 2019-06-28 成都新希望金融信息有限公司 A kind of multi-model fusion method
CN110309888A (en) * 2019-07-11 2019-10-08 南京邮电大学 A kind of image classification method and system based on layering multi-task learning
CN111046886B (en) * 2019-12-12 2023-05-12 吉林大学 Automatic identification method, device and equipment for number plate and computer readable storage medium
CN111881315A (en) * 2020-06-24 2020-11-03 华为技术有限公司 Image information input method, electronic device, and computer-readable storage medium
AU2021301463A1 (en) * 2020-06-30 2022-12-22 Australia And New Zealand Banking Group Limited Method and system for generating an ai model using constrained decision tree ensembles
CN111782879B (en) * 2020-07-06 2023-04-18 Oppo(重庆)智能科技有限公司 Model training method and device
CN111931841A (en) * 2020-08-05 2020-11-13 Oppo广东移动通信有限公司 Deep learning-based tree processing method, terminal, chip and storage medium
CN113836128A (en) * 2021-09-24 2021-12-24 北京拾味岛信息科技有限公司 Abnormal data identification method, system, equipment and storage medium
CN114418035A (en) * 2022-03-25 2022-04-29 腾讯科技(深圳)有限公司 Decision tree model generation method and data recommendation method based on decision tree model

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN108416363A (en) * 2018-01-30 2018-08-17 平安科技(深圳)有限公司 Generation method, device, computer equipment and the storage medium of machine learning model
CN109949031A (en) * 2019-04-02 2019-06-28 山东浪潮云信息技术有限公司 A kind of machine learning model training method and device
CN112990423A (en) * 2019-12-16 2021-06-18 华为技术有限公司 Artificial intelligence AI model generation method, system and equipment
US20220058531A1 (en) * 2020-08-19 2022-02-24 Royal Bank Of Canada System and method for cascading decision trees for explainable reinforcement learning
CN114943976A (en) * 2022-07-26 2022-08-26 深圳思谋信息科技有限公司 Model generation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114943976A (en) 2022-08-26
CN114943976B (en) 2022-10-11

Similar Documents

Publication Publication Date Title
CN107808143B (en) Dynamic gesture recognition method based on computer vision
US11657602B2 (en) Font identification from imagery
CN107239731B (en) Gesture detection and recognition method based on Faster R-CNN
CN107833213B (en) Weak supervision object detection method based on false-true value self-adaptive method
WO2020238054A1 (en) Method and apparatus for positioning chart in pdf document, and computer device
US11704357B2 (en) Shape-based graphics search
WO2020244075A1 (en) Sign language recognition method and apparatus, and computer device and storage medium
CN112784810B (en) Gesture recognition method, gesture recognition device, computer equipment and storage medium
CN109284729A (en) Method, apparatus and medium based on video acquisition human face recognition model training data
CN110136198B (en) Image processing method, apparatus, device and storage medium thereof
CN111368636B (en) Object classification method, device, computer equipment and storage medium
WO2021238548A1 (en) Region recognition method, apparatus and device, and readable storage medium
Jalab et al. Human computer interface using hand gesture recognition based on neural network
CN111860362A (en) Method and device for generating human face image correction model and correcting human face image
CN109284779A (en) Object detection method based on deep full convolution network
US11681409B2 (en) Systems and methods for augmented or mixed reality writing
WO2022193753A1 (en) Continuous learning method and apparatus, and terminal and storage medium
JP6787831B2 (en) Target detection device, detection model generation device, program and method that can be learned by search results
US11481577B2 (en) Machine learning (ML) quality assurance for data curation
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
Patel American sign language detection
WO2024021321A1 (en) Model generation method and apparatus, electronic device, and storage medium
CN115335872A (en) Training method of target detection network, target detection method and device
CN115147380A (en) Small transparent plastic product defect detection method based on YOLOv5
WO2023273572A1 (en) Feature extraction model construction method and target detection method, and device therefor

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22952760

Country of ref document: EP

Kind code of ref document: A1