WO2024021321A1 - Method and apparatus for model generation, electronic device, and storage medium - Google Patents

Method and apparatus for model generation, electronic device, and storage medium Download PDF

Info

Publication number: WO2024021321A1
Authority: WIPO (PCT)
Prior art keywords: layer, model, node, data set, function control
Application number: PCT/CN2022/126426
Priority date: 2022-07-26
Filing date: 2022-10-20
Other languages: English (en), Chinese (zh)
Inventors: 李睿宇, 石康, 原卉
Original Assignee: 深圳思谋信息科技有限公司
Application filed by 深圳思谋信息科技有限公司
Publication of WO2024021321A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/15 Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • G06T11/206 Drawing of charts or graphs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173 Classification techniques

Definitions

  • the embodiments of the present application relate to the field of artificial intelligence technology, and in particular, to a method, device, electronic device, and storage medium for model generation.
  • Artificial intelligence (AI) technology is widely applied in scenarios such as face recognition, image classification, object detection, and speech recognition.
  • AI development platforms are commonly used to provide users with services such as selection, construction, verification, and optimization of AI models for certain task goals.
  • However, existing AI development platforms are not very flexible. How to obtain a model with diverse functions, and how to meet the data processing requirements of complex scenarios, are problems that urgently need to be solved.
  • In a first aspect, a model generation method is provided: a sample image data set is obtained, and a tree-structured model is determined based on the sample image data set.
  • The nodes (function controls) in the tree-structured model can be of at most the following four types: image segmentation function controls, image classification function controls, image target detection function controls, and optical character recognition (OCR) function controls.
  • The tree-structured model can contain one or more of the above four function controls, and the same function control can appear one or more times, which provides richer node types. Users can select all or part of the functions in the model for output according to the actual situation. Any node in a given layer of the tree-structured model is connected to exactly one node in the layer above it, not to all nodes in that layer; nodes within the same layer are not connected to each other, and the input of each node is the output of the node in the previous layer to which it is connected. A short sketch of this structure follows.
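  • The following Python representation is illustrative only (the `FunctionControl`, `Node`, and `TreeModel` names are assumptions, not the platform's API); it captures the rule that every node has exactly one upper-layer parent and takes that parent's output as its input.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class FunctionControl(Enum):
    # The four node types named in this application.
    SEGMENTATION = "image segmentation"
    CLASSIFICATION = "image classification"
    DETECTION = "image target detection"
    OCR = "optical character recognition"

@dataclass
class Node:
    control: FunctionControl
    parent: Optional["Node"] = None                  # exactly one upper-layer node
    children: List["Node"] = field(default_factory=list)

    def add_child(self, control: FunctionControl) -> "Node":
        # The child's input is this node's output.
        child = Node(control, parent=self)
        self.children.append(child)
        return child

@dataclass
class TreeModel:
    # First-layer nodes take the sample image data set as their input.
    first_layer: List[Node] = field(default_factory=list)

    def add_root(self, control: FunctionControl) -> Node:
        node = Node(control)
        self.first_layer.append(node)
        return node

# One series branch (segmentation -> detection -> OCR -> classification)
# and one parallel branch (classification applied directly to the data set).
model = TreeModel()
seg = model.add_root(FunctionControl.SEGMENTATION)
det = seg.add_child(FunctionControl.DETECTION)
ocr = det.add_child(FunctionControl.OCR)
cls_a = ocr.add_child(FunctionControl.CLASSIFICATION)
cls_b = model.add_root(FunctionControl.CLASSIFICATION)
```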
  • a method for generating a tree-structured model is provided.
  • the input of each node in the model is the output of the node in the previous layer connected to the node.
  • Users can select the functions corresponding to different nodes (function controls) according to their needs to process the sample image data set.
  • The four node types that may be included in the model are well suited to industrial scenarios, which to a certain extent simplifies the process of model generation, improves the flexibility of model generation, and can cope with more complex data processing scenarios while reducing the difficulty for users to learn and use the platform.
  • In response to the user's first operation, the control options of a certain layer of the tree structure are displayed, and the control options include the above four function controls (node types).
  • In response to the user's second operation, one or more of the four function controls are selected.
  • a tool for performing sample processing on the sample image data set is displayed, and the user can use the sample processing tool to perform some auxiliary operations on the sample image data set.
  • The tree-structured model includes a first sub-model, and the first sub-model includes an m-th layer and an (m+1)-th layer, where the node of the m-th layer is the image segmentation function control and the node of the (m+1)-th layer is the image target detection function control, m being a positive integer less than N.
  • a tree-structured model may include multiple sub-models.
  • the first submodel is one of multiple submodels. Any functional control in the tree-structured model can be regarded as a sub-model.
  • the first sub-model includes an image segmentation function control and an image target detection function control. That is, the first submodel includes two functional controls in series.
  • this application does not limit the specific functional controls included in the first sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • the tree-structured model generation method can be intuitively displayed on the user interface, which can facilitate the user to perform the model generation process and reduce the difficulty of user learning.
  • The tree-structured model includes a second sub-model, and the second sub-model includes the m-th layer, the (m+1)-th layer, and the (m+2)-th layer, where the node of the m-th layer is the image segmentation function control, the node of the (m+1)-th layer is the image target detection function control, and the node of the (m+2)-th layer is the optical character recognition function control, m being a positive integer less than N.
  • That is, an optical character recognition function control is added after the image target detection function control of the first sub-model, so the second sub-model includes the first sub-model and the newly added optical character recognition function control.
  • the second sub-model includes image segmentation function controls, image target detection function controls, and optical character recognition function controls.
  • this application does not limit the specific functional controls included in the second sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • The second sub-model is connected in series with the first sub-model. According to the actual situation, the output data of the first sub-model, the output data of the second sub-model, or the output data of any function control can be selected as the final output data.
  • The tree-structured model includes a third sub-model, and the third sub-model includes the j-th layer and the (j+1)-th layer, where the node of the j-th layer is the optical character recognition control and the node of the (j+1)-th layer is the image classification control, j being a positive integer less than N.
  • the third sub-model is connected in parallel with the first sub-model (or second sub-model).
  • the third sub-model includes an optical character recognition function control and an image classification function control.
  • this application does not limit the specific functional controls included in the third sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • Users can modify the tree-structured model in real time according to actual needs. For example, when the user performs a deletion operation on the image target detection function control in the second sub-model, that function control and all subsequent function controls are deleted. In other words, with one deletion operation the user can delete a function control and every function control after it, which improves the flexibility of model generation. A sketch of this behavior follows.
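  • Assuming the illustrative `Node`/`TreeModel` classes from the earlier sketch, the one-operation deletion of a control and everything after it reduces to detaching a single edge:

```python
def delete_subtree(model: TreeModel, node: Node) -> None:
    """Delete `node` and every function control after it in one operation."""
    if node.parent is None:
        model.first_layer.remove(node)     # the node sat in the first layer
    else:
        node.parent.children.remove(node)  # detach from its single parent
        node.parent = None
    # The node and its whole subtree are now unreachable from the model.
```

Because every node has exactly one upper-layer parent, removing that single link is sufficient to drop the control together with its entire subtree.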
  • The sample processing is one of the following: labeling processing, sample selection processing, scaling processing, and preprocessing.
  • the function corresponding to the image segmentation function control is a segmentation algorithm based on deep learning.
  • In a second aspect, an electronic device is provided, including: one or more processors; and one or more memories. The one or more memories store one or more computer programs, and the one or more computer programs include instructions which, when executed by the one or more processors, cause the electronic device to perform the method in the first aspect and any possible implementation manner of the first aspect.
  • A third aspect provides a device for model generation, including a processor coupled to a memory; the memory is used to store a computer program, and the processor is used to run the computer program, so that the device for model generation performs the method in the first aspect and any of its possible implementations.
  • the device for model generation further includes one or more of the memory and a transceiver, the transceiver being used to receive signals and/or transmit signals.
  • A fourth aspect provides a computer-readable storage medium that includes a computer program or instructions; when the computer program or instructions are run on a computer, the method in the first aspect and any possible implementation manner thereof is executed.
  • A fifth aspect provides a computer program product that includes a computer program or instructions; when the computer program or instructions are run on a computer, the method in the first aspect and any possible implementation manner thereof is executed.
  • a sixth aspect provides a computer program that, when run on a computer, causes the method in the first aspect and any possible implementation thereof to be executed.
  • Figure 1 is a schematic diagram of the system architecture provided by an embodiment of the present application.
  • Figure 2 is a framework diagram of the model generation method provided by the embodiment of the present application.
  • Figure 3 is a schematic diagram of a model generation method provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of a functional module provided by an embodiment of the present application.
  • Figure 5 is a processing flow chart of a data set provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of image segmentation provided by an embodiment of the present application.
  • Figure 7 is a processing flow chart of another data set provided by the embodiment of the present application.
  • Figure 8 is a processing flow chart of another data set provided by the embodiment of the present application.
  • Figure 9 is a schematic diagram of the data set processing flow provided by the embodiment of this application.
  • Figure 10 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 12 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 13 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 14 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 15 is a schematic diagram of another data set processing flow provided by an embodiment of the present application.
  • Figure 16 is a schematic flow chart of a model generation method provided by an embodiment of the present application.
  • the model generation method provided by this application can be applied to the system architecture diagram shown in Figure 1.
  • the terminal device 11 communicates with the server 12 through the network.
  • The terminal device 11 sends the sample image data set and the user's intention to the server 12.
  • The user's intention represents the processing that the user requires on the sample image data set.
  • The server 12 uses the sample image data set to train the corresponding model according to the user's intention, and finally generates the model the user actually needs. That is to say, the user first uploads the data set to be processed to the AI development platform provided by the embodiments of this application, and then selects the corresponding processing operations according to actual needs to generate the model; a sketch of this upload follows.
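  • The application does not specify a transport or API for this exchange, so the following is purely an illustration of the upload from the terminal device to the server; the endpoint, field names, and use of the `requests` library are all assumptions.

```python
import requests  # third-party HTTP client, assumed to be available

# Hypothetical endpoint and payload shape; nothing here is defined by the patent.
with open("sample_images.zip", "rb") as dataset:
    resp = requests.post(
        "https://ai-platform.example.com/api/v1/models",
        files={"dataset": dataset},
        data={
            # The user's intention: the processing to apply, e.g. the ordered
            # function controls chosen on the user interface.
            "intention": "segmentation,detection,ocr,classification",
        },
        timeout=60,
    )
resp.raise_for_status()
print(resp.json())  # e.g. an identifier for the generated model
```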
  • The terminal device 11 in the above system architecture diagram includes but is not limited to mobile phones, tablet computers, wearable electronic devices with wireless communication functions (such as smart watches), etc.
  • Exemplary embodiments of portable electronic devices include, but are not limited to, portable electronic devices running various operating systems.
  • the above-mentioned electronic device may not be a portable electronic device, but a desktop computer.
  • the server 12 can be implemented as an independent server or a server cluster composed of multiple servers.
  • The model generation framework diagram 200 includes an image segmentation module 210, an image classification module 220, an image target detection module 230, and an optical character recognition (OCR) module 240.
  • Image segmentation module 210 may be used to perform image segmentation.
  • Image segmentation is a technique and process that divides an image into several specific regions with unique properties and proposes objects of interest. It is a key step from image processing to image analysis.
  • Existing image segmentation methods are mainly divided into the following categories: threshold-based segmentation methods, region-based segmentation methods, edge-based segmentation methods, and segmentation methods based on specific theories.
  • In this application, a segmentation algorithm based on deep learning is mainly used.
  • image segmentation is the process of dividing a digital image into disjoint regions.
  • the process of image segmentation is also a labeling process, which can assign the same number to pixels belonging to the same area.
  • Image segmentation can be applied to the detection and edge recognition of detected objects down to the pixel level. For example, it can identify defects in fine parts such as cracked areas on silicon wafers and damaged areas in bearings.
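  • The "same number for pixels of the same area" idea above corresponds to connected-component labeling. The following minimal illustration uses `scipy.ndimage.label` purely to show the numbering of regions; it is not the platform's segmentation algorithm, which is deep-learning based.

```python
import numpy as np
from scipy import ndimage

# A toy binary mask with two separate foreground regions.
mask = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
])

# label() assigns the same integer to all pixels of a connected region:
# background stays 0, the first region becomes 1, the second becomes 2.
labels, num_regions = ndimage.label(mask)
print(num_regions)  # 2
print(labels)
```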
  • Image classification module 220 may be used to perform image classification.
  • Image classification is an image processing method that distinguishes different categories of targets based on the different characteristics reflected in the image information.
  • Image classification can be applied to classify and judge inspected materials. For example, making a binary qualified/unqualified judgment on materials, judging the color of a test object or the type of food being tested, subdividing defects, or classifying test objects according to different materials.
  • the image object detection module 230 may be used to perform image object detection.
  • Image target detection draws on the fields of image processing and pattern recognition: objects of interest are located in the image, the specific category of each object is determined, and a bounding box is given for each object. Image target detection can be used to locate and classify targets in inspection material, and is suitable for multi-target detection, small-target detection, counting, etc., for example determining the number of pharmaceutical pills or determining the location of defects in components. A sketch of this kind of output follows.
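  • As an illustration of the per-object output just described (the record layout and the pill-counting helper are assumptions, not the platform's API):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    label: str                      # the object's specific category
    box: Tuple[int, int, int, int]  # bounding box (x_min, y_min, x_max, y_max)
    score: float                    # detector confidence

def count_objects(dets: List[Detection], label: str, min_score: float = 0.5) -> int:
    """Count detected objects of one category, e.g. pharmaceutical pills."""
    return sum(1 for d in dets if d.label == label and d.score >= min_score)

dets = [
    Detection("pill", (10, 12, 30, 32), 0.93),
    Detection("pill", (40, 15, 60, 35), 0.88),
    Detection("defect", (70, 20, 80, 30), 0.75),
]
print(count_objects(dets, "pill"))  # 2
```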
  • the optical character recognition module 240 may be used to perform optical character recognition.
  • Optical character recognition refers to the process of analyzing and recognizing image files of text materials to obtain text and layout information.
  • Optical character recognition can be applied to single-character labeling and recognition, and multi-character labeling and recognition. It can break the limitations of traditional methods and solve complex character recognition problems such as curve character recognition, low contrast character recognition, and large character recognition. For example, text on fine parts can be recognized.
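  • As a generic illustration of optical character recognition (not the platform's implementation), an open-source OCR engine can turn an image region into text; the `pytesseract` library and the file name below are assumptions.

```python
from PIL import Image
import pytesseract  # wrapper around the Tesseract OCR engine, assumed installed

# Recognize the characters in a cropped image region, e.g. text on a fine part.
region = Image.open("part_text_crop.png")
text = pytesseract.image_to_string(region)
print(text)
```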
  • FIG 3 shows a schematic diagram of a model generation method provided by an embodiment of the present application.
  • the data set 300 includes but is not limited to a picture data set, a video data set, a text data set, etc.
  • The generated model has a tree structure.
  • the user can select modules with different functions for processing the data set 300 according to actual needs, such as image segmentation module, image classification module, image target detection module or optical character recognition module, etc. In Figure 3, only the above four modules are taken as examples for explanation. The user can select the required modules to build tree branches to implement the next step of processing the data set 300.
  • A model generating three series or parallel schemes is shown in Figure 3.
  • the data set 300 is subjected to image segmentation, image target detection, optical character recognition and image classification processing in sequence.
  • the data set 300 is subjected to image segmentation, image target detection and optical character recognition processing in sequence.
  • image classification processing is performed on the data set 300.
  • the obtained segmentation data will be transmitted to the image object detection module.
  • the input data of each module is the output data of the previous module.
  • users can freely combine and connect the above four modules according to actual needs.
  • other functional modules will not be connected after image classification.
  • the user can selectively generate a model corresponding to the tree structure. For example, output the software development kit (SDK) corresponding to the model, etc.
  • Each sub-node in the data structure diagram corresponds to a complete solution, namely the chain of modules from the data set down to that node. Users can choose to output all or part of the solutions according to actual needs, as in the sketch below.
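  • Reusing the illustrative `Node` class and `FunctionControl` enum from the earlier sketch, a node's complete solution is simply the root-to-node path:

```python
from typing import List, Optional

def solution_path(node: Node) -> List[FunctionControl]:
    """The chain of function controls from the first layer down to `node`."""
    path: List[FunctionControl] = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current.control)
        current = current.parent
    return list(reversed(path))

# For the tree built earlier:
# solution_path(cls_a) -> [SEGMENTATION, DETECTION, OCR, CLASSIFICATION]
# solution_path(det)   -> [SEGMENTATION, DETECTION]
```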
  • the user can select or delete functional modules through an input device (such as a mouse).
  • Four functional modules (image segmentation, image classification, image target detection, and optical character recognition) will appear for the user to choose from.
  • image segmentation process can be performed on the data set 300.
  • image classification process can be performed on the data set 300.
  • image object detection can be performed on the data set 300.
  • optical character recognition can be performed on the data set 300.
  • the above four functional modules will also appear for the user to choose the specific task to be performed next. That is to say, after each processing, the user can choose a specific processing method for the next step.
  • the specific processing method can be one or more, and this application does not limit this.
  • When the user right-clicks the image segmentation module, the image segmentation module is deleted.
  • When the user right-clicks the image target detection module 320, the image target detection module 320, the optical character recognition modules 330 and 360, and the image classification module 340 are all deleted.
  • The image segmentation module 310 and the image classification module 350 in Figure 3 are retained.
  • FIG. 6 is an example of image data provided by this application.
  • the segmented image 600 may include four regions 610, 620, 630, and 640.
  • the four areas in Figure 6 are only examples, and other areas may also be included.
  • Image segmentation can assign the same number to pixels belonging to the same area.
  • the target of interest is located from the segmented areas of the 100 images in step S501, the specific category of each target is determined, and the bounding box of each target is given.
  • the target may be a trademark logo, or a description label, etc.
  • the 100 images in step S503 are classified.
  • The area numbered A in Figure 6 is assigned to one category, the area numbered B to another, the area numbered C to another, and the area numbered D to another.
  • a model can be generated, which can be used to process certain images according to the process of image segmentation, image target detection, optical character recognition and image classification.
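  • As a sketch of such a generated model, the four stages can be chained so that each stage consumes the previous stage's output; the stage functions below are placeholders, not the platform's trained components.

```python
from typing import Any, Callable, List

Stage = Callable[[Any], Any]  # one component per function control

def run_pipeline(data: Any, stages: List[Stage]) -> Any:
    """Feed each stage's output into the next stage, as in a series branch."""
    for stage in stages:
        data = stage(data)
    return data

# Identity placeholders standing in for the trained components.
def segment(image):     return image    # placeholder: split into numbered regions
def detect(regions):    return regions  # placeholder: locate targets in regions
def recognize(targets): return targets  # placeholder: OCR the characters found
def classify(texts):    return texts    # placeholder: assign categories

result = run_pipeline("some_image", [segment, detect, recognize, classify])
```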
  • Image segmentation can assign the same number to pixels belonging to the same area.
  • the target of interest is located from the segmented areas of the 100 images in step S701, the specific category of each target is determined, and the bounding box of each target is given.
  • the target may be a trademark logo, or a description label, etc.
  • a model can be generated, which can be used to process images according to the process of image segmentation, image target detection and optical character recognition.
  • Each of the 100 images in the data set 300 is divided into areas of different categories, and the areas are marked with different numbers.
  • a model is generated, which can directly perform image classification processing on images.
  • Figures 9, 10, 11 and 12 show schematic diagrams of data sets being processed in a tree structure model.
  • the processing flow of the tree structure model includes: on the one hand, the data set undergoes image target detection (for example, detection 1 in Figure 9), and then undergoes optical character recognition (for example, OCR1 in Figure 9) processing.
  • the data set is processed by image object detection (eg, detection 1 in Figure 9) and then image classification (eg, classification 1 in Figure 9).
  • detection 1, OCR1 and classification 1 in Figure 9 can be considered as different functional controls.
  • FIG. 10 shows a schematic diagram of detection 1 in the processing flow of the above-mentioned aspect.
  • The image target detection corresponding to detection 1 is used to locate a specific area in the image; for example, the area currently set is the one above the characters "9BC3".
  • OCR1 performs character recognition on the area located by detection 1.
  • OCR1 can also perform post-processing on the results output by detection 1.
  • the post-processing can be to offset the determined area by a fixed amount so that the character "9BC3" area in the figure is located.
  • More generally, such post-processing can make further adjustments to the results of the four types of processing applied to the sample image data set.
  • the area corresponding to the processed output result is offset or scaled by a certain amount to make the area corresponding to the output result more accurate.
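  • The offset/scale post-processing just described can be sketched as simple box transforms; the box format and the helper names are illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

def offset_box(box: Box, dx: float, dy: float) -> Box:
    """Shift a located area by a fixed amount, e.g. move a box found above
    the characters "9BC3" down onto the characters themselves."""
    x0, y0, x1, y1 = box
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy)

def scale_box(box: Box, factor: float) -> Box:
    """Grow or shrink a located area about its center."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    hw, hh = (x1 - x0) * factor / 2, (y1 - y0) * factor / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)

# e.g. shift the area located by detection 1 downward before running OCR1:
ocr_input = offset_box((20.0, 5.0, 80.0, 15.0), dx=0.0, dy=12.0)
```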
  • the image is segmented into specific areas, and the user interface may display the segmented specific areas. Users can visually check whether the sample image data set is segmented accurately. When the specific segmented area deviates from the area that the user needs to be segmented, the specific area corresponding to the image segmentation process can be further adjusted to make the process more accurate.
  • the images are classified into different areas.
  • the classification of the sample image data set may be inaccurate, and areas that should not be in the same category are classified into the same category. Users can adjust the image classification by further moving the area corresponding to the output result, making the classification of the sample image data set more accurate.
  • objects of interest to the user are marked. For example, count the number of objects of interest to the user in the sample image data.
  • the user can further adjust and manually mark the targets that are not counted to improve the accuracy of quantity statistics.
  • Further processing of the sample image data set includes but is not limited to labeling processing, sample selection processing, scaling processing, and preprocessing. Further processing can be understood as the process of optimizing the processed data, which allows the final trained model to process data more accurately.
  • FIG. 12 shows a schematic diagram of classification 1 in the processing flow of the above-mentioned aspect.
  • Figures 13, 14 and 15 show schematic diagrams of data sets being processed in another tree structure model.
  • the processing flow of the tree structure model includes: the data set undergoes image target detection (for example, detection 1 in Figure 13), and then image segmentation (for example, segmentation 1 in Figure 13). Among them, detection 1 and segmentation 1 in Figure 13 can be considered as different functional controls.
  • FIG. 14 shows a schematic diagram of detection 1 in the above processing flow.
  • the image target detection corresponding to detection 1 is used to locate a specific area in the image. For example, what is currently set is the position of the diode in the image.
  • FIG. 15 shows a schematic diagram of segmentation 1 in the above processing flow: for the positioned diode, the defective area is further identified and segmented.
  • Figure 16 shows a model generation method 1600 provided by the embodiment of the present application. This method can be applied in the framework shown in Figure 2. The method 1600 is described in detail below.
  • S1601: Obtain a sample image data set.
  • S1602: Determine a tree-structured model based on the sample image data set.
  • The tree structure includes N layers, and each of the N layers includes at least one node. The input of a node of the first layer is the sample image data set, and the input of a node of the i-th layer is the output of one of the nodes in the (i-1)-th layer, where N is a positive integer greater than 1 and i = 2, ..., N. Each node is one of the following controls: an image segmentation function control, an image classification function control, an image target detection function control, or an optical character recognition function control. A sketch of checking these constraints follows.
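  • The following sketch checks the structural constraints of S1602, reusing the illustrative `FunctionControl` enum from earlier; the layers-of-dicts encoding is an assumption made for illustration.

```python
def validate_tree(layers):
    """Check the constraints of S1602 on a layers[i] = list-of-nodes encoding,
    where each node is a dict {"control": FunctionControl, "parent": index}
    and "parent" indexes a node of the previous layer (absent in layer 1)."""
    assert len(layers) > 1, "N must be a positive integer greater than 1"
    for i, layer in enumerate(layers):
        assert layer, "each of the N layers includes at least one node"
        for node in layer:
            assert isinstance(node["control"], FunctionControl)
            if i == 0:
                # First-layer nodes take the sample image data set as input.
                assert "parent" not in node
            else:
                # An i-th-layer node takes the output of exactly one node
                # of the (i-1)-th layer as its input.
                assert 0 <= node["parent"] < len(layers[i - 1])

validate_tree([
    [{"control": FunctionControl.SEGMENTATION}],
    [{"control": FunctionControl.DETECTION, "parent": 0}],
    [{"control": FunctionControl.OCR, "parent": 0}],
])
```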
  • the embodiments of this application do not limit the specific number of sample image data sets.
  • The nodes (function controls) in the tree-structured model can include at most four types of function controls: the image segmentation function control, the image classification function control, the image target detection function control, and the optical character recognition function control.
  • The tree-structured model can contain one or more of the above four function controls, and the same function control can appear one or more times, which provides richer node types. Users can select all or part of the functions in the model for output according to the actual situation. Any node in a given layer of the tree-structured model is connected to exactly one node in the layer above it, not to all nodes in that layer; nodes within the same layer are not connected to each other, and the input of each node is the output of the node in the previous layer to which it is connected.
  • In response to the user's first operation, the control options of a certain layer of the tree structure are displayed, and the control options include the above four function controls (node types).
  • In response to the user's second operation, one or more of the four function controls are selected.
  • a tool for performing sample processing on the sample image data set is displayed, and the user can use the sample processing tool to perform some auxiliary operations on the sample image data set.
  • The tree-structured model includes a first sub-model, and the first sub-model includes an m-th layer and an (m+1)-th layer, where the node of the m-th layer is the image segmentation function control and the node of the (m+1)-th layer is the image target detection function control, m being a positive integer less than N.
  • a tree-structured model may include multiple sub-models.
  • the first submodel is one of multiple submodels. Any functional control in the tree-structured model can be regarded as a sub-model.
  • the first sub-model includes an image segmentation function control and an image target detection function control. That is, the first submodel includes two functional controls in series.
  • this application does not limit the specific functional controls included in the first sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • The tree-structured model includes a second sub-model, and the second sub-model includes the m-th layer, the (m+1)-th layer, and the (m+2)-th layer, where the node of the m-th layer is the image segmentation function control, the node of the (m+1)-th layer is the image target detection function control, and the node of the (m+2)-th layer is the optical character recognition function control, m being a positive integer less than N.
  • That is, an optical character recognition function control is added after the image target detection function control of the first sub-model, so the second sub-model includes the first sub-model and the newly added optical character recognition function control.
  • the second sub-model includes image segmentation function controls, image target detection function controls, and optical character recognition function controls.
  • this application does not limit the specific functional controls included in the second sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • The second sub-model is connected in series with the first sub-model. According to the actual situation, the output data of the first sub-model, the output data of the second sub-model, or the output data of any function control can be selected as the final output data.
  • The tree-structured model includes a third sub-model, and the third sub-model includes the j-th layer and the (j+1)-th layer, where the node of the j-th layer is the optical character recognition control and the node of the (j+1)-th layer is the image classification control, j being a positive integer less than N.
  • the third sub-model is connected in parallel with the first sub-model (or second sub-model).
  • the third sub-model includes an optical character recognition function control and an image classification function control.
  • this application does not limit the specific functional controls included in the third sub-model, and the specific functional controls can be determined according to the actual needs of the user.
  • Users can modify the tree-structured model in real time according to actual needs. For example, when the user performs a deletion operation on the image target detection function control in the second sub-model, that function control and all subsequent function controls are deleted. In other words, with one deletion operation the user can delete a function control and every function control after it, which improves the flexibility of model generation.
  • the sample processing is one of: labeling processing, sample selection processing, scaling processing, preprocessing.
  • the function corresponding to the image segmentation function control is a segmentation algorithm based on deep learning.
  • a model generation method based on a tree structure is provided. Users can choose different functional modules to process data according to specific needs, and can also choose multiple branch functions for the next step of processing the processed data. This method improves the flexibility of model generation and can cope with more complex data processing scenarios. At the same time, users can intuitively and conveniently select different functional modules on the user interface according to specific needs, which reduces the difficulty for users to learn and use the model.
  • Embodiments of the present application provide a computer program product.
  • When the computer program product is run on an electronic device, it causes the electronic device to execute the technical solutions in the above embodiments.
  • the implementation principles and technical effects are similar to the above-mentioned method-related embodiments, and will not be described again here.
  • Embodiments of the present application provide a readable storage medium.
  • the readable storage medium contains instructions.
  • When the instructions are run on an electronic device, the electronic device executes the technical solutions of the above embodiments.
  • the implementation principles and technical effects are similar and will not be described again here.
  • Embodiments of the present application provide a chip, which is used to execute instructions. When the chip is running, it executes the technical solutions in the above embodiments. The implementation principles and technical effects are similar and will not be described again here.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • Multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.
  • the unit described as a separate component may or may not be physically separated, and the component shown as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • If this function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solutions of the embodiments of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solutions, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods in the various embodiments of the present application.
  • The aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

Embodiments of the present application provide a method and apparatus for model generation, an electronic device, and a storage medium. The method comprises: acquiring a sample image data set; and determining a tree-structured model according to the sample image data set, the tree structure comprising N layers, each of the N layers comprising at least one node, the input of the node of the first layer being the sample image data set, and the input of the node of the i-th layer being the output of one of the nodes in the (i-1)-th layer, N being a positive integer greater than 1, i = 2, ..., N. Each node is one of the following controls: an image segmentation function control, an image classification function control, an image target detection function control, and an optical character recognition function control. With this method, a user can visually and conveniently select different functional modules on a user interface to process data, the flexibility of model generation is improved, and relatively complex data processing scenarios can be handled conveniently.
PCT/CN2022/126426 2022-07-26 2022-10-20 Method and apparatus for model generation, electronic device, and storage medium WO2024021321A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210881785.2A CN114943976B (zh) 2022-07-26 2022-07-26 模型生成的方法、装置、电子设备和存储介质
CN202210881785.2 2022-07-26

Publications (1)

Publication Number Publication Date
WO2024021321A1 (fr)

Family ID: 82911496

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/126426 WO2024021321A1 (fr) 2022-07-26 2022-10-20 Method and apparatus for model generation, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN114943976B (fr)
WO (1) WO2024021321A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114943976B (zh) * 2022-07-26 2022-10-11 深圳思谋信息科技有限公司 模型生成的方法、装置、电子设备和存储介质


Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229286A (zh) * 2017-05-27 2018-06-29 北京市商汤科技开发有限公司 语言模型生成及应用方法、装置、电子设备和存储介质
CN107899244A (zh) * 2017-11-29 2018-04-13 武汉秀宝软件有限公司 一种ai模型的构建方法及系统
EP3570164B1 (fr) * 2018-05-14 2023-04-26 Schneider Electric Industries SAS Procédé et système de génération d'une application mobile à partir d'une application bureau
CN109948668A (zh) * 2019-03-01 2019-06-28 成都新希望金融信息有限公司 一种多模型融合方法
CN110309888A (zh) * 2019-07-11 2019-10-08 南京邮电大学 一种基于分层多任务学习的图像分类方法与系统
CN111046886B (zh) * 2019-12-12 2023-05-12 吉林大学 号码牌自动识别方法、装置、设备及计算机可读存储介质
CN111881315A (zh) * 2020-06-24 2020-11-03 华为技术有限公司 图像信息输入方法、电子设备及计算机可读存储介质
US20230267379A1 (en) * 2020-06-30 2023-08-24 Australia And New Zealand Banking Group Limited Method and system for generating an ai model using constrained decision tree ensembles
CN111782879B (zh) * 2020-07-06 2023-04-18 Oppo(重庆)智能科技有限公司 模型训练方法及装置
CN111931841A (zh) * 2020-08-05 2020-11-13 Oppo广东移动通信有限公司 基于深度学习的树状处理方法、终端、芯片及存储介质
CN113836128A (zh) * 2021-09-24 2021-12-24 北京拾味岛信息科技有限公司 一种异常数据识别方法、系统、设备及存储介质
CN114418035A (zh) * 2022-03-25 2022-04-29 腾讯科技(深圳)有限公司 决策树模型生成方法、基于决策树模型的数据推荐方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416363A (zh) * 2018-01-30 2018-08-17 平安科技(深圳)有限公司 机器学习模型的生成方法、装置、计算机设备及存储介质
CN109949031A (zh) * 2019-04-02 2019-06-28 山东浪潮云信息技术有限公司 一种机器学习模型训练方法及装置
CN112990423A (zh) * 2019-12-16 2021-06-18 华为技术有限公司 人工智能ai模型生成方法、系统及设备
US20220058531A1 (en) * 2020-08-19 2022-02-24 Royal Bank Of Canada System and method for cascading decision trees for explainable reinforcement learning
CN114943976A (zh) * 2022-07-26 2022-08-26 深圳思谋信息科技有限公司 模型生成的方法、装置、电子设备和存储介质

Also Published As

Publication number Publication date
CN114943976A (zh) 2022-08-26
CN114943976B (zh) 2022-10-11


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22952760

Country of ref document: EP

Kind code of ref document: A1