TW202123167A - Method for processing pictures - Google Patents

Method for processing pictures

Info

Publication number
TW202123167A
Authority
TW
Taiwan
Prior art keywords
picture
training
neural model
data set
image
Prior art date
Application number
TW108145601A
Other languages
Chinese (zh)
Other versions
TWI753332B (en)
Inventor
王彥翔
朱俊翰
吳思為
劉家瑀
陳聖文
Original Assignee
萬里雲互聯網路有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 萬里雲互聯網路有限公司 filed Critical 萬里雲互聯網路有限公司
Priority to TW108145601A priority Critical patent/TWI753332B/en
Publication of TW202123167A publication Critical patent/TW202123167A/en
Application granted granted Critical
Publication of TWI753332B publication Critical patent/TWI753332B/en

Landscapes

  • Image Analysis (AREA)

Abstract

A method for processing pictures includes: recognizing the category of a picture based on a picture classification neural model; if the picture belongs to a type with a simple background, detecting an object and a non-object in the picture based on an object detection neural model; removing the non-object from the picture, so that the area where the object and the non-object overlap forms a missing part; and filling the missing part based on a picture inpainting neural model to repair the object in the picture.

Description

Image processing method

The invention relates to picture-editing technology, and in particular to a method for automatically editing pictures using artificial intelligence.

Online sales channels, such as auction and shopping websites, present pictures of the goods on offer so that users can see photos of the actual products. However, channel operators often require suppliers to provide clean pictures that avoid displaying unnecessary information (such as text or graphics). Suppliers therefore need to spend extra time and manpower editing pictures to meet these requirements, which is very inconvenient.

In view of this, an embodiment of the present invention provides a picture processing method, including: identifying the type of a picture based on a picture classification neural model; if the picture belongs to a type with a simple background, detecting a subject and a non-subject in the picture based on an object detection neural model; removing the non-subject from the picture so that the area where the subject and the non-subject overlap forms a missing part; and repairing the missing part based on a picture inpainting neural model to restore the subject.
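The flow of steps S110 to S150 described above can be sketched as plain control flow. This is an illustrative sketch only: the function names, argument shapes, and stub callables below are assumptions for demonstration, not part of the disclosure.

```python
def process_picture(picture, classify, detect, remove, inpaint):
    """Steps S110-S150 as a plain pipeline. The callables stand in for
    the three neural models described in the embodiment."""
    category = classify(picture)                 # S110: identify picture type
    if category != "simple_background":          # S120: only simple backgrounds proceed
        return None                              # repeated objects / complex background: stop
    subject, non_subjects = detect(picture)      # S130: find subject and non-subjects
    holed = remove(picture, non_subjects)        # S140: removal leaves a missing part
    return inpaint(holed)                        # S150: repair to restore the subject


# Toy usage with string stand-ins for pictures and models.
result = process_picture(
    "raw",
    classify=lambda p: "simple_background",
    detect=lambda p: ("subject", ["watermark"]),
    remove=lambda p, ns: p + " minus " + ",".join(ns),
    inpaint=lambda p: p + " repaired",
)
```

A picture classified as a type that cannot be processed simply falls out of the pipeline at step S120.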

In summary, according to embodiments of the present invention, non-subjects in a picture can be automatically detected and removed, the subject can be repaired, and pictures that meet the requirements can be obtained in batches.

Referring to FIG. 1, which is a flowchart of a picture processing method according to an embodiment of the present invention. The picture processing method can be executed by an electronic device. FIG. 2 is a schematic diagram of the architecture of an electronic device according to an embodiment of the present invention. The electronic device includes a processor 221, a memory 222, a non-transitory computer-readable recording medium 223, a peripheral interface 224, and a bus 225 over which these components communicate with one another. The processor 221 includes, but is not limited to, a central processing unit (CPU) 2213 and a neural network processor (NPU) 2215. The memory 222 includes, but is not limited to, volatile memory (such as random access memory (RAM)) 2224 and non-volatile memory (such as read-only memory (ROM)) 2226. The non-transitory computer-readable recording medium 223 may be, for example, a hard disk or a solid-state drive. The peripheral interface 224 may include, for example, an input/output interface, a graphics interface, and a communication interface (such as a network interface). The bus 225 includes, but is not limited to, one or a combination of a system bus, a memory bus, and a peripheral bus.

The electronic device may be composed of one or more computing devices. In some embodiments, the electronic device can provide cloud computing services for other networked devices to connect to and access. Cloud computing services include, but are not limited to, infrastructure as a service, platform as a service, software as a service, storage as a service, desktop as a service, data as a service, security as a service, and API (application programming interface) as a service.

Refer to FIG. 1 and FIG. 3 together. FIG. 3 is a schematic diagram of a picture processing architecture according to an embodiment of the present invention. In step S110, the picture 400 to be processed is input into the picture classification neural model 310 to identify the type of the picture 400. In step S120, it is determined whether the picture belongs to a type that can be processed. If the identified type can be processed, the picture 400 is input into the object detection neural model 320 to detect the subject and the non-subject in the picture 400 (step S130). Here, the type that can be processed is a picture 400 with a simple background. Conversely, if the picture is of a type that cannot be processed (here, a type with repeated objects or a type with a complex background), the process ends. A simple background means the background is a color gradient, a uniform color, black and white, transparent, and so on. Repeated objects means the picture 400 contains repeated subjects. A complex background means an irregular photo or drawing, such as a landscape, a scene, or people. The subject is a product, but the present invention is not limited thereto; for example, the subject may be a living thing such as a person or an animal, or a non-living thing such as a product or a building. The non-subject may be text and/or a graphic: the text may be, for example, descriptive text (such as advertising copy, a product description, a trademark, or a watermark), and the graphic may be, for example, a promotional graphic, a frame, a watermark, or a trademark.

Please refer to FIG. 4 and FIG. 5. FIG. 4 is a schematic diagram of an original picture 400 according to an embodiment of the present invention, and FIG. 5 is a schematic diagram of the picture 400 with the non-subjects 420 removed. As shown in FIG. 4, the picture 400 includes a subject 410 and non-subjects 420. Here, the subject 410 is exemplified by an LCD-monitor product, and the non-subjects 420 are the advertising text (including its blue circular backdrop) at the upper left of the subject 410 and the trademark at the lower right of the subject 410. In step S140, the non-subjects 420 in the picture 400 are removed, so that the areas where the subject 410 and the non-subjects 420 overlap form a missing part 430 (as shown in FIG. 5). In some embodiments, the area outside the subject 410 may also be removed (background removal).
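The removal of step S140 can be illustrated with a toy mask operation. The 8x8 array size, the label encoding (0 = background, 1 = subject, 2 = non-subject), and the use of numpy are assumptions for demonstration only.

```python
import numpy as np

# Toy 8x8 "picture" encoded by label: 0 = background, 1 = subject, 2 = non-subject.
h = w = 8
subject_mask = np.zeros((h, w), dtype=bool)
subject_mask[2:6, 2:6] = True            # subject occupies rows/cols 2-5
non_subject_mask = np.zeros((h, w), dtype=bool)
non_subject_mask[4:8, 4:8] = True        # overlay occupies rows/cols 4-7

picture = np.where(non_subject_mask, 2, np.where(subject_mask, 1, 0))

# Step S140: blank every non-subject pixel. Where the overlay covered the
# subject, this leaves a "missing part" for the inpainting model to fill.
cleaned = picture.copy()
cleaned[non_subject_mask] = 0
missing_part = subject_mask & non_subject_mask   # the hole inside the subject
```

The 2x2 region where the two masks intersect is exactly the missing part 430 of FIG. 5.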

Referring again to FIG. 1 and FIG. 3, in step S150, the picture 400 with the missing part 430 is input into the picture inpainting neural model 330, which repairs the missing part 430 to restore the subject 410. FIG. 6 is a schematic diagram of the repaired picture 400.

Referring to FIG. 7, which is a schematic diagram of the picture classification neural model 310 according to an embodiment of the present invention. The picture classification neural model 310 uses multi-task learning and includes a feature extraction neural model 312 and a plurality of sub-neural-network models 314. After the picture 400 is input into the feature extraction neural model 312, the feature extraction neural model 312 obtains a plurality of feature vectors of the picture 400. The feature extraction neural model 312 may, for example, use Google's open-source EfficientNet model, but the present invention is not limited thereto. Three sub-neural-network models 314 are taken as an example here; each identifies a different picture type, so the parameters of the sub-neural-network models 314 are not shared with one another. For example, the first sub-neural-network model 314 identifies whether the picture 400 contains a graphic, the second identifies whether the picture 400 contains repeated objects, and the third identifies whether the picture 400 has a complex background. The sub-neural-network models 314 may be implemented using the inverted residual block from MobileNetV2.
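The multi-task arrangement, a shared feature extractor feeding several heads whose parameters are not shared, can be sketched as follows. The flattened-pixel "features" and the tiny linear-sigmoid heads are placeholders, not the EfficientNet backbone or the MobileNetV2 inverted residual blocks mentioned in the embodiment.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(picture):
    # Stand-in for the shared backbone: any fixed-length feature vector.
    return picture.reshape(-1).astype(float)

class Head:
    """One binary task head; each head keeps its own unshared weights."""
    def __init__(self, dim):
        self.w = rng.normal(size=dim)
        self.b = 0.0
    def __call__(self, feats):
        return 1.0 / (1.0 + np.exp(-(feats @ self.w + self.b)))  # sigmoid score

picture = rng.random((4, 4))
feats = extract_features(picture)
heads = {name: Head(feats.size)
         for name in ("has_pattern", "has_repeated_objects", "has_complex_background")}
scores = {name: head(feats) for name, head in heads.items()}
```

Each head answers one of the three questions independently, which is what makes the unshared parameters matter.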

Referring to FIG. 8, which is a schematic diagram of the object detection neural model 320 according to an embodiment of the present invention, using the RetinaNet architecture. The object detection neural model 320 first uses a residual network (ResNet) 321 to extract feature maps from the picture 400; the extracted feature maps are fed into a feature pyramid network (FPN) 323, which produces predictions from the features at each level. Each prediction is input into a sub-neural-network model 325. Each sub-neural-network model 325 includes a class subnet 3251 and a box subnet 3252. The class subnet 3251 obtains the object category, and the box subnet 3252 obtains the object location.
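The per-level prediction structure of a RetinaNet-style detector can be sketched with stub subnets: every pyramid level gets both a class subnet output and a box subnet output. The pyramid sizes, anchor count, class count, and random values below are placeholders, not the trained networks.

```python
import numpy as np

rng = np.random.default_rng(1)
num_classes, num_anchors = 2, 9   # e.g. subject vs non-subject (an assumption)

def class_subnet(feat):
    # Per-location, per-anchor class scores: shape (H, W, A * K).
    h, w, _ = feat.shape
    return rng.random((h, w, num_anchors * num_classes))

def box_subnet(feat):
    # Per-location, per-anchor box offsets: shape (H, W, A * 4).
    h, w, _ = feat.shape
    return rng.random((h, w, num_anchors * 4))

# Three pyramid levels of decreasing resolution, standing in for FPN outputs.
pyramid = [rng.random((s, s, 16)) for s in (8, 4, 2)]
predictions = [(class_subnet(f), box_subnet(f)) for f in pyramid]
```

The point of the structure is that the same pair of subnets is applied to every level, so objects of different scales are predicted from differently sized feature maps.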

In one embodiment, the training of the picture classification neural model 310 and the object detection neural model 320 is described as follows. Referring to FIG. 9, which is a schematic diagram of training-picture generation according to an embodiment of the present invention, illustrating how training pictures are produced. First, multiple data sets are provided. Here the data sets include a first data set 510 containing complex-background pictures, a second data set 520 containing simple-background pictures, a third data set 530 containing subject pictures, and a fourth data set 540 containing non-subject pictures. In some embodiments, a subject picture includes a subject and a plain background (such as a white background). Second, a first picture 601 is randomly selected from the first data set 510 or the second data set 520, a second picture 602 from the third data set 530, and a third picture 603 from the fourth data set 540. Third, the first picture 601, the second picture 602, and the third picture 603 are composited into a training picture 700. By repeating these steps, multiple training pictures 700 can be randomly generated. The picture classification neural model 310 and the object detection neural model 320 can then be trained on these training pictures 700.
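The synthesis step can be sketched as pasting a subject patch and a non-subject patch onto a background at random positions. The array sizes, pixel values, and numpy representation are illustrative assumptions; a real implementation would composite actual images.

```python
import numpy as np

rng = np.random.default_rng(2)

def compose_training_picture(background, subject, non_subject):
    """Paste the subject, then the non-subject, at random positions onto a
    copy of the background, recording each paste box for later labeling."""
    canvas = background.copy()
    boxes = []
    for patch in (subject, non_subject):
        ph, pw = patch.shape
        y = rng.integers(0, canvas.shape[0] - ph + 1)
        x = rng.integers(0, canvas.shape[1] - pw + 1)
        canvas[y:y+ph, x:x+pw] = patch
        boxes.append((x, y, pw, ph))   # known position/size of each paste
    return canvas, boxes

background = np.zeros((32, 32))        # drawn from data set 510 or 520
subject = np.full((8, 8), 1.0)         # drawn from data set 530
non_subject = np.full((4, 4), 2.0)     # drawn from data set 540
training_picture, boxes = compose_training_picture(background, subject, non_subject)
```

Repeating the call with fresh random selections yields an arbitrarily large synthetic training set.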

In some embodiments, because each training picture 700 is generated in-house, it is known at generation time whether the first picture 601 was selected from the first data set 510 or the second data set 520, and a first mark (labeling the background as complex or simple) can be produced accordingly. Likewise, the positions and sizes of the second picture 602 and the third picture 603 within the training picture 700 are known, so a second mark annotating the subject 410 and a third mark annotating the non-subject 420 can be produced. The annotation may, for example, use geometric shapes such as boxes, or outline the subject 410 and the non-subject 420 along the object contour. During training, the outputs of the picture classification neural model 310 and the object detection neural model 320 are verified against the first, second, and third marks in order to update the parameters of the two models. That is, the picture classification neural model 310 and the object detection neural model 320 are trained on the training pictures 700 together with the first, second, and third marks.
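Because the composition parameters are known at generation time, the three marks fall out for free, with no manual annotation. The data-set identifiers and the (x, y, w, h) tuple layout below are assumptions for illustration.

```python
def make_labels(background_source, subject_box, non_subject_box):
    """Derive the three marks from known synthesis parameters:
    first mark  = background complexity (which data set the backdrop came from),
    second mark = subject bounding box,
    third mark  = non-subject bounding box."""
    first_mark = "complex" if background_source == "dataset_510" else "simple"
    return {"background": first_mark,
            "subject": subject_box,          # (x, y, w, h)
            "non_subject": non_subject_box}  # (x, y, w, h)

labels = make_labels("dataset_520", (2, 2, 8, 8), (10, 10, 4, 4))
```

These labels are exactly what the classification model (first mark) and the detection model (second and third marks) are verified against during training.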

Referring to FIG. 10, which is a schematic diagram of the picture inpainting neural model 330 according to an embodiment of the present invention, using the EdgeConnect architecture. The picture inpainting neural model 330 includes an edge generator 332 and an inpainting generator 334. First, a grayscale image, an edge map, and a mask are produced from the picture 400 with the missing part 430 and input into the edge generator 332 to produce a predicted edge map. The predicted edge map and the picture 400 with the missing part 430 are then input into the inpainting generator 334, which performs the inpainting and outputs the repaired picture 400. Here, the edge generator 332 is composed of a generator 3321 and a discriminator 3322. The inpainting generator 334 is likewise composed of a generator 3341 and a discriminator 3342. Each generator 3321, 3341 includes an encoder, dilated convolutions, residual blocks, and a decoder. Each discriminator 3322, 3342 includes multiple convolutional layers.
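Preparing the three EdgeConnect-style inputs (grayscale, edge map, mask) from a picture with a hole can be sketched as follows. The gradient-threshold edge detector is a crude stand-in for a real detector such as Canny, and the image sizes and threshold are assumptions.

```python
import numpy as np

def prepare_inpaint_inputs(picture_rgb, missing_mask):
    """Build grayscale, edge map, and mask from a picture with a missing part."""
    gray = picture_rgb.mean(axis=2)                   # grayscale
    gy, gx = np.gradient(gray)
    edges = (np.hypot(gx, gy) > 0.1).astype(float)    # rough edge map
    edges[missing_mask] = 0.0                         # no known edges inside the hole
    return gray, edges, missing_mask.astype(float)

rgb = np.zeros((16, 16, 3)); rgb[4:12, 4:12] = 1.0    # bright square "subject"
hole = np.zeros((16, 16), dtype=bool); hole[6:10, 10:14] = True  # missing part
gray, edges, mask = prepare_inpaint_inputs(rgb, hole)
```

The edge generator's job is then to hallucinate the edges inside the masked region, which the inpainting generator uses as structural guidance.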

In one embodiment, the training of the picture inpainting neural model 330 is described as follows. First, the edge generator 332 is trained. A plurality of training data are formed by randomly masking a region of one or more of the aforementioned training pictures 700. These training data are input into the edge generator 332 to obtain its detection results. Meanwhile, an edge detection algorithm (such as Canny edge detection) computes the edges of the training data, against which the detection results of the edge generator 332 are verified. In some embodiments, the training pictures 700 are grayscale images, or are converted to grayscale in advance. Second, the inpainting generator 334 is trained: the edges computed by the edge detection algorithm for the training pictures 700, together with the training data, are input into the inpainting generator 334 to obtain its results and verify the accuracy of the inpainting. Third, parameter updates of the edge generator 332 are frozen, and the training data are used to train the edge generator 332 and the inpainting generator 334 together.
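The three training stages can be sketched as a skeleton. The stub generator classes below merely count training steps and record the freeze; they stand in for the real adversarial generator/discriminator training, and all names are illustrative.

```python
from collections import namedtuple

Sample = namedtuple("Sample", "full masked")   # original picture and its masked copy

def edge_detector(picture):
    return f"edges({picture})"                 # stand-in for e.g. Canny

class StubGenerator:
    """Counts training steps; a real implementation would update weights."""
    def __init__(self):
        self.steps, self.frozen = 0, False
    def train_step(self, *inputs, target=None):
        if not self.frozen:
            self.steps += 1
    def freeze(self):
        self.frozen = True
    def __call__(self, x):
        return f"pred_edges({x})"

def train_inpainting_model(edge_gen, repair_gen, data):
    for s in data:   # stage 1: edge generator vs. edge-detector ground truth
        edge_gen.train_step(s.masked, target=edge_detector(s.full))
    for s in data:   # stage 2: inpainting generator on detector edges + masked data
        repair_gen.train_step(edge_detector(s.full), s.masked, target=s.full)
    edge_gen.freeze()
    for s in data:   # stage 3: joint pass with the edge generator frozen
        repair_gen.train_step(edge_gen(s.masked), s.masked, target=s.full)

data = [Sample("pic%d" % i, "masked%d" % i) for i in range(3)]
eg, rg = StubGenerator(), StubGenerator()
train_inpainting_model(eg, rg, data)
```

In stage 3 the edge generator still runs forward (its predictions feed the inpainting generator) but no longer accumulates updates.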

In summary, according to embodiments of the present invention, the non-subject 420 in the picture 400 can be automatically detected and removed, the subject 410 can be repaired, and pictures 400 that meet the requirements can be obtained in batches.

221: processor; 222: memory; 223: non-transitory computer-readable recording medium; 224: peripheral interface; 225: bus; 2213: central processing unit; 2215: neural network processor; 2224: volatile memory; 2226: non-volatile memory; 310: picture classification neural model; 312: feature extraction neural model; 314: sub-neural-network model; 320: object detection neural model; 321: residual network; 323: feature pyramid network; 325: sub-neural-network model; 3251: class subnet; 3252: box subnet; 330: picture inpainting neural model; 332: edge generator; 3321: generator; 3322: discriminator; 334: inpainting generator; 3341: generator; 3342: discriminator; 400: picture; 410: subject; 420: non-subject; 430: missing part; 510: first data set; 520: second data set; 530: third data set; 540: fourth data set; 601: first picture; 602: second picture; 603: third picture; 700: training picture; S110, S120, S130, S140, S150: steps

[FIG. 1] is a flowchart of a picture processing method according to an embodiment of the present invention. [FIG. 2] is a schematic diagram of the architecture of an electronic device according to an embodiment of the present invention. [FIG. 3] is a schematic diagram of a picture processing architecture according to an embodiment of the present invention. [FIG. 4] is a schematic diagram of an original picture according to an embodiment of the present invention. [FIG. 5] is a schematic diagram of a picture with the non-subject removed according to an embodiment of the present invention. [FIG. 6] is a schematic diagram of a repaired picture according to an embodiment of the present invention. [FIG. 7] is a schematic diagram of a picture classification neural model according to an embodiment of the present invention. [FIG. 8] is a schematic diagram of an object detection neural model according to an embodiment of the present invention. [FIG. 9] is a schematic diagram of training-picture generation according to an embodiment of the present invention. [FIG. 10] is a schematic diagram of a picture inpainting neural model according to an embodiment of the present invention.

S110, S120, S130, S140, S150: steps

Claims (10)

1. A picture processing method, comprising: identifying the type of a picture based on a picture classification neural model; if the picture belongs to a type with a simple background, detecting a subject and a non-subject in the picture based on an object detection neural model; removing the non-subject from the picture so that the area where the subject and the non-subject overlap forms a missing part; and repairing the missing part based on a picture inpainting neural model to restore the subject.

2. The picture processing method of claim 1, wherein the step of identifying the type of the picture based on the picture classification neural model comprises: obtaining a plurality of feature vectors of the picture based on a feature extraction neural model; and inputting the feature vectors into a plurality of sub-neural-network models respectively, each sub-neural-network model having parameters not shared with the others, so as to identify different picture types respectively.

3. The picture processing method of claim 2, wherein the types of the picture further include a type with repeated objects and a type with a complex background, neither of which belongs to the type with a simple background.

4. The picture processing method of claim 3, wherein if the result of the step of identifying the type of a picture is the type with repeated objects or the type with a complex background, the picture is not processed.

5. The picture processing method of claim 1, wherein the picture inpainting neural model includes an edge generator and an inpainting generator.

6. The picture processing method of claim 5, further comprising the step of training the picture inpainting neural model, including: inputting a plurality of training data, formed by randomly masking a region of a training picture, into the edge generator, and verifying the detection results of the edge generator against an edge detection algorithm; training the inpainting generator on the edges computed by the edge detection algorithm for the training picture together with the training data; and freezing parameter updates of the edge generator while using the training data to train the edge generator and the inpainting generator.

7. The picture processing method of claim 1, wherein the non-subject is text or a graphic.

8. The picture processing method of claim 1, wherein the subject is a product.

9. The picture processing method of claim 1, further comprising: providing multiple data sets, including a first data set containing complex-background pictures, a second data set containing simple-background pictures, a third data set containing subject pictures, and a fourth data set containing non-subject pictures; randomly selecting a first picture from the first data set or the second data set, a second picture from the third data set, and a third picture from the fourth data set; compositing the first picture, the second picture, and the third picture into a training picture; and training the picture classification neural model and the object detection neural model on the training picture.

10. The picture processing method of claim 9, further comprising: generating a first mark according to whether the first picture was selected from the first data set or the second data set; and generating a second mark and a third mark according to the positions and sizes of the second picture and the third picture in the training picture, respectively; wherein the step of training the picture classification neural model and the object detection neural model on the training picture is also performed based on the first mark, the second mark, and the third mark.
TW108145601A 2019-12-12 2019-12-12 Method for processing pictures TWI753332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW108145601A TWI753332B (en) 2019-12-12 2019-12-12 Method for processing pictures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW108145601A TWI753332B (en) 2019-12-12 2019-12-12 Method for processing pictures

Publications (2)

Publication Number Publication Date
TW202123167A true TW202123167A (en) 2021-06-16
TWI753332B TWI753332B (en) 2022-01-21

Family

ID=77516861

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108145601A TWI753332B (en) 2019-12-12 2019-12-12 Method for processing pictures

Country Status (1)

Country Link
TW (1) TWI753332B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8823732B2 (en) * 2010-12-17 2014-09-02 Pictometry International Corp. Systems and methods for processing images with edge detection and snap-to feature
TWI462027B (en) * 2011-12-02 2014-11-21 Hon Hai Prec Ind Co Ltd Image processing device and image processing method thereof
WO2015171815A1 (en) * 2014-05-06 2015-11-12 Nant Holdings Ip, Llc Image-based feature detection using edge vectors
TWI618031B (en) * 2017-04-12 2018-03-11 和碩聯合科技股份有限公司 Edge detection method of image

Also Published As

Publication number Publication date
TWI753332B (en) 2022-01-21

Similar Documents

Publication Publication Date Title
US10861232B2 (en) Generating a customized three-dimensional mesh from a scanned object
JP6879431B2 (en) Image processing equipment, image processing method and image processing program
TWI774659B (en) Image text recognition method and device
US20170293894A1 (en) Automatic assessment of damage and repair costs in vehicles
CN109753971B (en) Correction method and device for distorted text lines, character recognition method and device
KR102559021B1 (en) Apparatus and method for generating a defect image
Türkyılmaz et al. License plate recognition system using artificial neural networks
TWI743837B (en) Training data increment method, electronic apparatus and computer-readable medium
JP2011109637A (en) Method for detecting alteration in printed document using image comparison analysis
CN112990205B (en) Method and device for generating handwritten character sample, electronic equipment and storage medium
US10726535B2 (en) Automatically generating image datasets for use in image recognition and detection
CN111860027A (en) Two-dimensional code identification method and device
CN113436222A (en) Image processing method, image processing apparatus, electronic device, and storage medium
Sari et al. Interactive image inpainting of large-scale missing region
US20210383527A1 (en) Automated artifact detection
TWI753332B (en) Method for processing pictures
CN110245733B (en) Article authentication method
Voronin et al. Automatic image cracks detection and removal on mobile devices
CN113744199B (en) Image breakage detection method, electronic device, and storage medium
TWI771932B (en) Image conversion method for developing tactile learning material
CN115063405A (en) Method, system, electronic device and storage medium for detecting defects on surface of steel
Wu et al. FlagDetSeg: Multi-nation flag detection and segmentation in the wild
CN108573253B (en) Method for generating binary image of characters of license plate
US11238595B2 (en) System to prepare images for presentation
JP2016151978A (en) Image processing apparatus and image processing program