CN115471509A - Image processing method, device, equipment and computer readable storage medium

Info

Publication number
CN115471509A
Authority
CN
China
Prior art keywords
image
target
original
blocks
position information
Prior art date
Legal status
Pending
Application number
CN202110646726.2A
Other languages
Chinese (zh)
Inventor
裴森
李海涵
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110646726.2A
Publication of CN115471509A
Legal status: Pending

Classifications

    • G - PHYSICS › G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T 7/11 - Region-based segmentation (G06T 7/00 Image analysis › G06T 7/10 Segmentation; Edge detection)
    • G06N 20/00 - Machine learning (G06N Computing arrangements based on specific computational models)
    • G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction (G06T 5/00 Image enhancement or restoration)
    • G06T 7/62 - Analysis of geometric attributes of area, perimeter, diameter or volume (G06T 7/60 Analysis of geometric attributes)
    • G06T 2207/20081 - Training; Learning (G06T 2207/00 Indexing scheme for image analysis or image enhancement › G06T 2207/20 Special algorithmic details)
    • G06T 2207/20221 - Image fusion; Image merging (G06T 2207/20212 Image combination)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image processing method, an image processing apparatus, an image processing device, and a computer-readable storage medium, wherein the method comprises the following steps: acquiring an original image and the area where a target in the original image is located; performing image segmentation on the original image to obtain at least two image blocks; rearranging the at least two image blocks to obtain a background image; and performing image fusion on the area where the target is located and the background image to obtain a target image, where the target image is used for training a preset detection network. By means of this technical scheme, directional data enhancement of the original image can at least be achieved, thereby improving the training efficiency and training accuracy of the preset detection network.

Description

Image processing method, device, equipment and computer readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image processing method, an image processing apparatus, an image processing device, and a computer-readable storage medium.
Background
To enhance image data, currently adopted technical solutions apply flip transformation, random cropping, color jittering, translation transformation, scale transformation, contrast transformation, noise perturbation, rotation transformation, reflection transformation, and the like to an image. None of these data enhancement schemes can perform directional data enhancement on images; consequently, when the preset detection network is trained with images enhanced in these ways, neither its training efficiency nor its training accuracy is effectively improved, and the target detection capability of the target detection network obtained by training cannot be effectively improved either.
Disclosure of Invention
The application provides an image processing method, an image processing device, an image processing apparatus and a computer-readable storage medium, which can at least perform directional data enhancement on an image, and further improve training efficiency and training accuracy of a preset detection network.
The application provides an image processing method, which comprises the following steps:
acquiring an original image and an area where a target in the original image is located;
carrying out image segmentation on the original image to obtain at least two image blocks;
rearranging the at least two image blocks to obtain a background image;
carrying out image fusion on the region where the target is located and the background image to obtain a target image; the target image is used for training a preset detection network.
In some optional embodiments, the rearranging the at least two image blocks to obtain the background image includes:
determining target arrangement information of the at least two image blocks; the target arrangement information represents target relative position information between the at least two image blocks, and the target relative position information is different from original relative position information of the at least two image blocks in the original image;
and rearranging the at least two image blocks according to the target arrangement information to obtain the background image.
In some optional embodiments, the determining the target arrangement information of the at least two image blocks includes:
determining original position information;
based on the original position information, randomly adjusting the corresponding positions of the at least two image blocks in the original image to obtain the random position information;
determining the target arrangement information based on the random position information.
In some optional embodiments, the determining the target arrangement information of the at least two image blocks includes:
acquiring preset position information;
and determining the target arrangement information based on the preset position information.
In some optional embodiments, the image fusing the region where the target is located and the background image to obtain the target image includes:
determining target position information of the area where the target is located in the original image;
and remapping the area of the target to the background image according to the target position information to obtain the target image.
In some optional embodiments, the performing image segmentation on the original image to obtain at least two image blocks includes:
acquiring the aspect ratio of the original image;
determining an image segmentation size according to the aspect ratio;
and according to the image segmentation size, performing image segmentation on the original image to obtain the at least two image blocks.
In some optional embodiments, the method further comprises:
inputting the target image into the preset detection network to obtain a prediction category label and a confidence coefficient of the target image;
acquiring a target category label of the target image;
determining a target loss according to the target class label, the prediction class label and the confidence coefficient;
under the condition that the target loss does not meet a preset condition, adjusting network parameters of the preset detection network according to the target loss, and updating the target loss based on the preset detection network after the network parameters are adjusted;
and under the condition that the target loss meets the preset condition, taking a preset detection network corresponding to the condition that the target loss meets the preset condition as the target detection network.
In some optional embodiments, the method further comprises:
acquiring an image to be detected, wherein the image to be detected comprises an object to be detected;
and inputting the image to be detected into a target detection network for target detection to obtain the target object category information of the object to be detected.
The present application also provides an image processing apparatus, the apparatus including:
the first acquisition module is used for acquiring an original image and an area where a target in the original image is located;
the image segmentation module is used for carrying out image segmentation on the original image to obtain at least two image blocks;
the rearrangement module is used for rearranging the at least two image blocks to obtain a background image;
the image fusion module is used for carrying out image fusion on the region where the target is located and the background image to obtain a target image; the target image is used for training a preset detection network.
In some optional embodiments, the rearrangement module comprises:
a determining unit, configured to determine target arrangement information of the at least two image blocks; the target arrangement information represents target relative position information between the at least two image blocks, and the target relative position information is different from original relative position information of the at least two image blocks in the original image;
and the rearranging unit is used for rearranging the at least two image blocks according to the target arrangement information to obtain the background image.
In some optional embodiments, the determining unit comprises:
a first determining subunit, configured to determine original location information;
a random adjusting subunit, configured to randomly adjust, based on the original position information, corresponding positions of the at least two image blocks in the original image to obtain the random position information;
a second determining subunit, configured to determine the target arrangement information based on the random position information.
In some optional embodiments, the determining unit comprises:
the acquisition subunit is used for acquiring preset position information;
a determining subunit, configured to determine the target arrangement information based on the preset position information.
In some optional embodiments, the image fusion module comprises:
the determining unit is used for determining the target position information of the area where the target is located in the original image;
and the remapping unit is used for remapping the area where the target is located on the background image according to the target position information to obtain the target image.
In some optional embodiments, the image segmentation module comprises:
an acquisition unit configured to acquire an aspect ratio of the original image;
a determining unit configured to determine an image segmentation size according to the aspect ratio;
and the image segmentation unit is used for carrying out image segmentation on the original image according to the image segmentation size to obtain the at least two image blocks.
In some optional embodiments, the apparatus further comprises:
the input module is used for inputting the target image into the preset detection network to obtain a prediction class label and a confidence coefficient of the target image;
the second acquisition module is used for acquiring a target category label of the target image;
a first determining module, configured to determine a target loss according to the target class label, the prediction class label, and the confidence;
the adjusting module is used for adjusting the network parameters of the preset detection network according to the target loss under the condition that the target loss does not meet the preset condition, and updating the target loss based on the preset detection network after the network parameters are adjusted;
and the second determining module is used for taking a preset detection network corresponding to the condition that the target loss meets the preset condition as the target detection network under the condition that the target loss meets the preset condition.
In some optional embodiments, the apparatus further comprises:
the third acquisition module is used for acquiring an image to be detected, wherein the image to be detected comprises an object to be detected;
and the target detection module is used for inputting the image to be detected into a target detection network for target detection to obtain the target object category information of the object to be detected.
The present application further provides an image processing apparatus comprising a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the image processing method as described above.
The present application further provides a computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement the image processing method as described above.
The image processing method, the image processing device, the image processing equipment and the computer readable storage medium have the following technical effects:
according to the method and the device, the original image and the area where the target is located in the original image are obtained, and the area where the directivity needs to be enhanced in the original image can be determined to be the area where the target is located; the image characteristics of the original image can be damaged by carrying out image segmentation on the original image to obtain at least two image blocks; by rearranging at least two image blocks, a background image without complete image characteristics can be obtained; by carrying out image fusion on the area where the target is located and the background image, the target image with enhanced directivity in the area where the target is located can be obtained, so that the directivity (data) of the original image is enhanced; the target image is used for training the preset detection network, so that the training accuracy of the preset detection network can be effectively improved, the training iteration times can be effectively reduced, and the training efficiency of the preset detection network is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the technical solutions and advantages of the embodiments or the prior art of the present application, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the description below are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is an alternative structural diagram of a distributed system applied to a blockchain system according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of an image processing method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating an image segmentation process provided by an embodiment of the present application;
fig. 4 is a schematic flowchart of a training process of a predetermined detection network according to an embodiment of the present disclosure;
fig. 5 is a schematic diagram of an application scenario of image processing provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 7 is a block diagram of a hardware structure of a server of an image processing method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the accompanying drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the application relates to a target detection system, which comprises a server and a terminal, wherein directional data enhancement of an original image can be realized through the server, a target image is obtained, a target detection network is trained by utilizing the target image, and the application of the target detection network is realized through the terminal.
The target detection system related to the embodiment of the application can be a distributed system formed by connecting a client, a plurality of nodes (any form of computing equipment in an access network, such as a server and a user terminal) through a network communication mode. The client can be deployed in the terminal.
Taking a blockchain system as an example of the distributed system, referring to fig. 1, fig. 1 is an optional structural schematic diagram of the distributed system 100 applied to a blockchain system provided in this embodiment of the present application. The blockchain system is formed by a plurality of nodes (computing devices in any form in the access network, such as servers and user terminals) and clients; a Peer-to-Peer (P2P) network is formed between the nodes, and the P2P protocol is an application-layer protocol running on top of the Transmission Control Protocol (TCP). In the distributed system, any machine, such as a server or a terminal, can join to become a node, and a node comprises a hardware layer, an intermediate layer, an operating system layer, and an application layer.
Referring to the functions of each node in the blockchain system shown in fig. 1, the functions involved include:
1) Routing: a basic function of a node, used to support communication between nodes.
Besides the routing function, the node may also have the following functions:
2) Application: deployed in the blockchain to implement specific services according to actual business requirements. It records data related to those functions to form record data, carries a digital signature in the record data to indicate the source of the task data, and sends the record data to other nodes in the blockchain system, so that the other nodes add the record data to a temporary block when the source and integrity of the record data are successfully verified.
For example, the services implemented by the application include:
2.1) Wallet: provides functions for conducting electronic money transactions, including initiating a transaction (i.e., sending the transaction record of the current transaction to other nodes in the blockchain system; after the other nodes verify it successfully, the record data of the transaction is stored in a temporary block of the blockchain as an acknowledgment that the transaction is valid). Of course, the wallet also supports querying the electronic money remaining at an electronic money address.
2.2) Shared ledger: provides functions for storing, querying, and modifying account data. Record data of the operations on the account data is sent to other nodes in the blockchain system; after the other nodes verify that the record data is valid, the record data is stored in a temporary block as an acknowledgment that the account data is valid, and a confirmation may be sent to the node that initiated the operation.
2.3) Smart contract: a computerized agreement that can enforce the terms of a contract, implemented as code deployed on the shared ledger and executed when certain conditions are met, used to complete automated transactions according to actual business requirements, e.g., querying the logistics status of goods purchased by a buyer and transferring the buyer's electronic money to the merchant's address after the buyer signs for the goods. Of course, smart contracts are not limited to contracts for executing transactions and may also execute contracts that process received information.
3) Blockchain: comprises a series of blocks that succeed one another in the chronological order in which they were generated. Once added to the blockchain, a new block is never removed; the blocks record the record data submitted by nodes in the blockchain system.
In the embodiment of the application, the training and application of the target detection network relate to an artificial intelligence technology, in particular to a machine learning direction of the artificial intelligence technology.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
Specifically, in order to improve the data enhancement effect of an image, the application provides an image processing method to perform directional data enhancement on an original image to obtain a target image, and train a preset detection network by using the target image, so as to improve the training efficiency and the training accuracy of the preset detection network.
The image processing method of the present application is described below. This specification provides the method operation steps as described in the embodiments or flowcharts, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. In practice, the system or server product may execute the steps sequentially or in parallel (e.g., in a parallel-processor or multi-threaded environment) according to the embodiments or the methods shown in the figures.
In a specific embodiment, as shown in fig. 2, the present application provides an image processing method, including:
s201: and acquiring an original image and a region where a target in the original image is located.
In the embodiment of the present application, the original image may be an image that needs data enhancement.
In the embodiment of the application, the original image may include a target object, and the target object may be a person, an object, or a scene. Accordingly, the region where the target is located may be the image region in which the target object lies in the original image. It can be understood that the region where the target is located may be the region of the original image that requires directivity enhancement.
In a specific embodiment, the region where the target is located may be a minimum bounding rectangle region of the target object in the original image.
In an optional embodiment, the step of obtaining the region where the target is located includes: the minimum bounding rectangle of the target object in the original image is manually annotated, the computer recognizes the annotated original image, and the region where the target is located is extracted from the original image.
In an optional embodiment, before the region where the target is located is acquired, the method further includes: taking one vertex of the original image as the origin, taking the horizontal edge through the origin as the x-axis and the vertical edge as the y-axis, and establishing a rectangular coordinate system. Correspondingly, the step of acquiring the region where the target is located includes: inputting the original image into a deep-learning-based neural network to obtain the vertex coordinates of the minimum bounding rectangle of the target object, and obtaining the region where the target is located from these vertex coordinates.
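For illustration, extracting the region where the target is located from the obtained vertex coordinates may be sketched as follows (a minimal sketch in Python; the function name, the numpy dependency, and the (x1, y1, x2, y2) box convention are assumptions of this illustration, not limitations of the embodiment):

```python
import numpy as np

def extract_target_region(image: np.ndarray, bbox: tuple) -> np.ndarray:
    """Crop the minimum bounding rectangle of the target object.

    `image` is an H x W x C array; `bbox` is (x1, y1, x2, y2) in the
    rectangular coordinate system whose origin is a vertex of the image.
    """
    x1, y1, x2, y2 = bbox
    return image[y1:y2, x1:x2].copy()
```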
In practical applications, when a target detection network is used to detect the target object in an original image, an overfitting phenomenon may occur; specifically, the target object together with the background area around it is identified as the target object. To improve the recognition accuracy of the target detection network, directivity enhancement needs to be performed on the area where the target is located in the original image, and the preset detection network is trained with the target image obtained after directivity enhancement, yielding a target detection network whose recognition accuracy is effectively improved.
Considering that the overfitting phenomenon consists in identifying the target object together with the background area around it as the target object, the purpose of directivity enhancement can be achieved by destroying the background area outside the target object in the original image. One scheme destroys only the background area of the original image outside the region where the target is located. Another scheme extracts the region where the target is located, destroys the original image, determines a background image based on the destroyed original image, and performs image fusion on the region where the target is located and the background image to obtain a target image in which the region where the target is located is directionally enhanced.
Of these two schemes, the second destroys the image features of the original image more thoroughly and yields a background image that is less prone to overfitting. The embodiment of the application therefore performs data enhancement on the original image based on the second scheme.
In the embodiment of the application, the area where the target is located in the original image can be obtained, so that the area in the original image, which needs directivity enhancement, can be determined.
S203: and carrying out image segmentation on the original image to obtain at least two image blocks.
In the embodiment of the present application, the original image may be divided into at least two image blocks of the same size. In particular, the image blocks may be rectangular. This facilitates rearranging the at least two image blocks into a background image of the same size as the original image.
In the embodiment of the application, the original image is subjected to image segmentation so as to destroy the image characteristics of the original image. Wherein, from the overall and local view, the image features may include overall features and local features; from the type point of view, the image features include shape features, color features, texture features, and spatial relationship features.
In the embodiment of the application, when the original image is segmented, it must be ensured that the region where the target is located is destroyed. If that region is not destroyed, the segmentation is not sufficient to destroy the local features of the original image; the background image obtained by rearranging the at least two image blocks would then still contain an intact region where the target is located, and after fusing the extracted target region with this background image, the resulting target image would contain both the extracted target region and the undamaged one, which hinders recognition by the preset detection network.
In a specific embodiment, the image segmentation size for the original image can be determined according to actual application requirements, so as to ensure that segmenting the original image destroys the region where the target is located.
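As one way to make this check concrete (a hedged sketch; the sufficiency condition used here, that the target's bounding rectangle is larger than a single block in at least one dimension, is an assumption of this illustration):

```python
def segmentation_damages_target(block_w: int, block_h: int, bbox: tuple) -> bool:
    """Return True if an equal-block grid with blocks of size
    block_w x block_h is guaranteed to cut through the target's bounding
    rectangle: a target wider or taller than one block cannot lie
    entirely inside a single block."""
    x1, y1, x2, y2 = bbox
    return (x2 - x1) > block_w or (y2 - y1) > block_h
```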
S205: and rearranging the at least two image blocks to obtain a background image.
In the embodiment of the present application, the image size of the background image is the same as the image size of the original image.
It is understood that the arrangement of the at least two image blocks in the background image is different from the arrangement of the at least two image blocks in the original image, and that no complete image features exist in the background image.
In the embodiment of the application, at least two image blocks are rearranged, so that a background image without complete image features can be obtained, and the background image is used as the background of the area where the target is located, so that the directivity enhancement can be performed on the area where the target is located.
S207: carrying out image fusion on the region where the target is located and the background image to obtain a target image; the target image is used for training a preset detection network.
In the embodiment of the application, the target image can be an image with enhanced directivity for the area where the target is located.
In the embodiment of the application, the target image may include a region where the target is located, and a background region except the region where the target is located in the target image does not have complete image features, which is beneficial to a preset detection network to identify the region where the target is located.
In the embodiment of the application, the preset detection network is trained by using the target image, so that the training efficiency and the training accuracy of the preset detection network can be improved, and the target detection network with effectively improved recognition accuracy is obtained.
In S203, in order to divide the original image into at least two image blocks of the same size, fig. 3 shows a flowchart of the image segmentation process. Referring to fig. 3, performing image segmentation on the original image to obtain at least two image blocks includes:
s301: and acquiring the aspect ratio of the original image.
It will be appreciated that the original image includes horizontally opposite edges and vertically opposite edges. Wherein, the length of the horizontal opposite sides is the same, and the length of the vertical opposite sides is the same.
In the embodiment of the present application, the aspect ratio of the original image may be a ratio of the length of the horizontal opposite side to the length of the vertical opposite side.
S303: and determining the image segmentation size according to the aspect ratio.
In the embodiment of the present application, the image segmentation size may include a preset length and a preset width. Specifically, the ratio of the preset length to the preset width is equal to the aspect ratio.
S305: and according to the image segmentation size, performing image segmentation on the original image to obtain the at least two image blocks.
In the embodiment of the application, the original image is subjected to image segmentation according to the image segmentation size, the length of each image block in at least two obtained image blocks is a preset length, and the width of each image block is a preset width.
In a specific embodiment, specific values of the preset length and the preset width are further determined according to practical application requirements, so as to ensure that the original image is segmented according to the image segmentation size, and the image features of the original image can be damaged.
In the embodiment of the application, by acquiring the aspect ratio of the original image, determining the image segmentation size according to the aspect ratio, and performing image segmentation on the original image according to the image segmentation size, at least two image blocks obtained by segmentation can be ensured to be image blocks with the same size and the same shape, so that rearrangement of the at least two image blocks is facilitated.
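A minimal sketch of S301-S305 in Python follows (the n x n grid and the divisibility assumption are illustrative choices of this sketch; splitting an image into an n x n grid of equal blocks automatically gives each block the same aspect ratio as the image):

```python
import numpy as np

def split_into_blocks(image: np.ndarray, n: int) -> list:
    """Split an H x W x C image into an n x n grid of equal rectangular
    blocks; each block is (H // n) x (W // n), so the ratio of block
    length to block width equals the image aspect ratio.
    Assumes H and W are divisible by n (crop or pad beforehand otherwise).
    """
    h, w = image.shape[:2]
    bh, bw = h // n, w // n
    return [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].copy()
            for r in range(n) for c in range(n)]
```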
In S205, in order to obtain a background image without complete image features, the rearranging the at least two image blocks to obtain the background image includes:
determining target arrangement information of the at least two image blocks; the target arrangement information represents target relative position information between the at least two image blocks, and the target relative position information is different from original relative position information of the at least two image blocks in the original image;
and rearranging the at least two image blocks according to the target arrangement information to obtain the background image.
In this embodiment of the present application, the target relative position information includes target relative positions of any two image blocks of the at least two image blocks, and the original relative position includes original relative positions of any two image blocks of the at least two image blocks.
In the embodiment of the application, the corresponding positions of the at least two image blocks in the original image are arranged according to the target arrangement information, so that the rearrangement of the at least two image blocks is realized, and the background image is obtained.
In an alternative embodiment, the target arrangement information of at least two image blocks may be preset.
In an alternative embodiment, the target arrangement information of at least two image blocks may be randomly determined.
Wherein, in a case that target arrangement information of at least two image blocks is preset, the determining the target arrangement information of the at least two image blocks includes:
acquiring preset position information;
and determining the target arrangement information based on the preset position information.
In this embodiment of the present application, the preset position information may be the preset positions corresponding to the at least two image blocks.
In the embodiment of the application, the target relative positions of any two image blocks in the at least two image blocks are determined based on the preset positions corresponding to the at least two image blocks, so that the determination efficiency of the target arrangement information can be improved.
Wherein, in case of randomly determining the target arrangement information of at least two image blocks, the determining the target arrangement information of at least two image blocks includes:
determining original position information;
based on the original position information, randomly adjusting the corresponding positions of the at least two image blocks in the original image to obtain the random position information;
determining the target arrangement information based on the random position information.
In this embodiment of the present application, the original position information includes original positions of at least two image blocks in an original image.
In this embodiment of the present application, the random position information includes positions corresponding to the at least two image blocks after the at least two image blocks are randomly adjusted.
In a specific embodiment, the random adjustment of the at least two image blocks can be implemented by randomly exchanging the positions of any two image blocks, based on the original positions of the at least two image blocks in the original image.
In the embodiment of the application, randomly determining the target arrangement information of the at least two image blocks increases the disorder of the target arrangement information, so that a background image whose image features are destroyed to a high degree can be obtained.
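One possible sketch of this random determination (assuming the n x n grid of the earlier sketch; rejecting the identity permutation implements the requirement that the target relative position information differ from the original relative position information):

```python
import random
import numpy as np

def rearrange_blocks(blocks: list, n: int) -> np.ndarray:
    """Randomly permute the grid blocks (at least two blocks assumed) and
    reassemble a background image of the original size. The permutation is
    re-drawn if it equals the identity, so the rearranged relative positions
    always differ from the original ones."""
    identity = list(range(len(blocks)))
    order = identity[:]
    while order == identity:
        random.shuffle(order)
    rows = [np.concatenate([blocks[order[r * n + c]] for c in range(n)], axis=1)
            for r in range(n)]
    return np.concatenate(rows, axis=0)
```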
It can be understood that, since the region where the target is located in the original image is destroyed in the process of obtaining the background image, the region where the target is located needs to be fused with the background image to obtain the target image, in such a way that the image characteristics of that region in the target image remain close to its image characteristics in the original image.
In S207, the image fusion of the region where the target is located and the background image to obtain the target image includes:
determining target position information of the area where the target is located in the original image;
and remapping the area of the target to the background image according to the target position information to obtain the target image.
In this embodiment, the target position information may include coordinate position information of an area where the target is located in the original image. The target position information may further include pixel position information of a region where the target is located in the original image.
Specifically, in the case that the region where the target is located is the minimum circumscribed rectangle of the target object in the original image, the coordinate position information may include vertex coordinates of the minimum circumscribed rectangle, and may include pixel coordinates of the minimum circumscribed rectangle.
And remapping the minimum circumscribed rectangle to the background image according to the vertex coordinates or the pixel coordinates of the minimum circumscribed rectangle to obtain the target image.
In the embodiment of the application, the area where the target is located is remapped to the background image, so that the image characteristics of the area where the target is located in the target image are closer to the image characteristics of the area where the target is located in the original image, and the integrity of the image characteristics of the area where the target is located in the target image is realized.
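Combining the pieces, the remapping step may be sketched as follows (the function names reuse the earlier illustrative sketches and remain assumptions of this illustration):

```python
import numpy as np

def fuse_target_and_background(background: np.ndarray,
                               target_region: np.ndarray,
                               bbox: tuple) -> np.ndarray:
    """Remap the extracted target region onto the shuffled background at its
    original coordinates, yielding the target image used for training."""
    x1, y1, x2, y2 = bbox
    fused = background.copy()
    fused[y1:y2, x1:x2] = target_region
    return fused

# Illustrative end-to-end usage under the same assumptions:
# region = extract_target_region(original, bbox)
# blocks = split_into_blocks(original, n=4)
# background = rearrange_blocks(blocks, n=4)
# target_image = fuse_target_and_background(background, region, bbox)
```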
In a specific embodiment, fig. 4 shows a flowchart of the training process of the preset detection network. Specifically, the method further comprises:
s401: and inputting the target image into the preset detection network to obtain a prediction class label and a confidence coefficient of the target image.
In the embodiment of the present application, the preset detection network may be a deep learning-based neural network that needs to be trained.
In the embodiment of the application, the target image includes a target area, and the target area includes a target object. Accordingly, the prediction class label may represent the class of the target object predicted by the preset detection network. Correspondingly, the confidence may represent the estimated accuracy of the prediction class label output by the preset detection network.
S403: and acquiring a target category label of the target image.
In the embodiment of the application, the target class label can represent the real class of the target object.
S405: and determining target loss according to the target class label, the prediction class label and the confidence coefficient.
In the embodiment of the application, the target loss can be calculated through a cross entropy loss function. Specifically, the cross entropy loss function can be written as:

$L = \frac{1}{N}\sum_{i=1}^{N} L_i, \quad L_i = -\left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$

wherein N is the number of pixels in the target image, i denotes the currently traversed i-th pixel, $L_i$ is the target sub-loss corresponding to the currently traversed pixel, $p_i$ is the confidence corresponding to the i-th pixel, and $y_i$ is the prediction-label indicator for the i-th pixel: a value of 1 indicates that the prediction class label is the same as the target class label, and a value of 0 indicates that they differ.
S407: and under the condition that the target loss does not meet a preset condition, adjusting the network parameters of the preset detection network according to the target loss, and updating the target loss based on the preset detection network after the network parameters are adjusted.
In the embodiment of the application, the preset detection network comprises an input layer, an intermediate layer, and an output layer. Accordingly, the network parameters of the preset detection network include input-layer parameters, intermediate-layer parameters, and output-layer parameters.
In an alternative embodiment, satisfying the preset condition includes: the target loss is lower than a preset first threshold value, and the first threshold value can be determined according to the actual application requirement.
In an alternative embodiment, satisfying the preset condition includes: the iteration number of the training phase reaches a preset second threshold, and the second threshold can be determined according to the actual application requirement.
S409: and under the condition that the target loss meets the preset condition, taking a preset detection network corresponding to the condition that the target loss meets the preset condition as the target detection network.
In the embodiment of the application, the target image is used for training the preset detection network; because the target image contains a directionally enhanced target area that is easy for the preset detection network to recognize, the number of training iterations can be reduced and the training efficiency and training accuracy improved. The trained target detection network thereby also obtains better recognition capability.
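A hedged sketch of the training loop of fig. 4 (PyTorch is an illustrative choice here, as are the Adam optimizer, the threshold value, and the iteration budget; F.cross_entropy stands in for the cross-entropy target loss of S405, and the embodiment itself prescribes no framework):

```python
import itertools
import torch
import torch.nn.functional as F

def train_detector(model, loader, lr=1e-4, loss_threshold=0.05, max_iters=10_000):
    """Train the preset detection network on directionally enhanced target
    images until the target loss falls below a preset first threshold or the
    iteration count reaches a preset second threshold (the two preset
    conditions of S407/S409)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for it, (images, labels) in enumerate(itertools.cycle(loader)):
        logits = model(images)                  # per-pixel class scores
        loss = F.cross_entropy(logits, labels)  # averaged over all pixels
        if loss.item() < loss_threshold or it >= max_iters:
            break                               # a preset condition is met
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model                                # the target detection network
```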
In an optional embodiment, to implement the application of the target detection network, the method further includes:
acquiring an image to be detected, wherein the image to be detected comprises an object to be detected;
and inputting the image to be detected into a target detection network for target detection to obtain the target object category information of the object to be detected.
In this embodiment, the target object category information may include a target object label and a target confidence of the object to be detected. The target object label can represent the object class of the object to be detected predicted by the target detection network, and the type of the object class can include people, objects and scenes. The target confidence coefficient can represent the accuracy of the object category of the object to be detected.
In a specific embodiment, the target detection network may perform target detection for objects of different object classes, where each object class has its own confidence threshold. When the target confidence is greater than or equal to the confidence threshold corresponding to the object class of the object to be detected, the target object label is determined to be a correct label.
In the embodiment of the application, confidence threshold values corresponding to different object types can be set according to actual application requirements. Compared with the conventional target detection network, the target detection network obtained in the embodiment of the application has the advantages that the recognition capability is effectively improved, and higher confidence threshold values can be set for different object types. Whether the target confidence coefficient output by the target detection network reaches the standard or not is measured by using a higher confidence coefficient threshold value, and the recall rate and the identification accuracy rate of the target detection network can be greatly improved. Experimental data show that the recall rate and the recognition accuracy rate of the target detection network obtained by training the preset detection network by using the target image obtained by the image processing method provided by the embodiment of the application can be improved by 150%.
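For illustration, applying class-specific confidence thresholds at inference may look as follows (the threshold values, the classification-style model output, and the function name are assumptions of this sketch):

```python
import torch

CLASS_THRESHOLDS = {0: 0.80, 1: 0.90, 2: 0.85}  # hypothetical per-class values

def detect(model, image: torch.Tensor):
    """Run the trained target detection network on an image to be detected
    and accept the predicted label only if its target confidence meets the
    confidence threshold configured for that object class."""
    with torch.no_grad():
        probs = torch.softmax(model(image.unsqueeze(0)), dim=1).squeeze(0)
    conf, cls = probs.max(dim=0)
    cls, conf = int(cls), float(conf)
    if conf >= CLASS_THRESHOLDS.get(cls, 0.5):
        return cls, conf  # target object label determined to be correct
    return None           # below the class-specific threshold: rejected
```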
Fig. 5 is a schematic diagram of an application scenario of image processing according to the present application. Referring to fig. 5, fig. 5 sequentially shows an original image, at least two image blocks obtained by image-dividing the original image, a background image obtained by rearranging the at least two image blocks, and a target image obtained by remapping an area where a target is located on the background image.
According to the method and the device, by obtaining the original image and the area where the target is located in the original image, the area of the original image that needs directivity enhancement can be determined to be the area where the target is located; by performing image segmentation on the original image to obtain at least two image blocks, the image features of the original image can be destroyed; by rearranging the at least two image blocks, a background image without complete image features can be obtained; and by performing image fusion on the area where the target is located and the background image, a target image in which the area where the target is located is directionally enhanced can be obtained, so that directivity enhancement of the original image is realized. Because the target image is used for training the preset detection network, the training accuracy of the preset detection network can be effectively improved, the number of training iterations effectively reduced, and the training efficiency of the preset detection network improved. Training the preset detection network with the target image can also effectively improve the recognition capability of the resulting target detection network. In addition, the scheme is simple and feasible, does not occupy many CPU resources, and is also suitable for scenarios in which images are data-enhanced online in real time: the original image can be enhanced in real time and the resulting target image input into the preset detection network for training.
Furthermore, the image processing method provided by the embodiment of the application is applicable to processing of images in various formats without considering the format of the image.
As shown in fig. 6, an embodiment of the present application further provides a schematic structural diagram of an image processing apparatus 600. Referring to fig. 6, the apparatus includes:
a first obtaining module 601, configured to obtain an original image and an area where a target in the original image is located;
an image segmentation module 603, configured to perform image segmentation on the original image to obtain at least two image blocks;
a rearranging module 605, configured to rearrange the at least two image blocks to obtain a background image;
an image fusion module 607, configured to perform image fusion on the region where the target is located and the background image to obtain a target image; the target image is used for training a preset detection network.
In some embodiments, the rearrangement module 605 includes:
a determining unit, configured to determine target arrangement information of the at least two image blocks; the target arrangement information represents target relative position information between the at least two image blocks, and the target relative position information is different from original relative position information of the at least two image blocks in the original image;
and the rearranging unit is used for rearranging the at least two image blocks according to the target arrangement information to obtain the background image.
In some embodiments, the determining unit includes:
a first determining subunit, configured to determine original location information;
a random adjusting subunit, configured to randomly adjust, based on the original position information, corresponding positions of the at least two image blocks in the original image to obtain the random position information;
a second determining subunit, configured to determine the target arrangement information based on the random position information.
In some embodiments, the determining unit includes:
an acquisition subunit, configured to acquire preset position information;
a determining subunit, configured to determine the target arrangement information based on the preset position information.
In some embodiments, the image fusion module 607 includes:
the determining unit is used for determining the target position information of the area where the target is located in the original image;
and the remapping unit is used for remapping the area where the target is located on the background image according to the target position information to obtain the target image.
In some embodiments, the image segmentation module 603 comprises:
an acquisition unit configured to acquire an aspect ratio of the original image;
a determining unit configured to determine an image segmentation size according to the aspect ratio;
and the image segmentation unit is used for carrying out image segmentation on the original image according to the image segmentation size to obtain the at least two image blocks.
In some embodiments, the above apparatus further comprises:
the input module is used for inputting the target image into the preset detection network to obtain a prediction class label and a confidence coefficient of the target image;
the second acquisition module is used for acquiring a target category label of the target image;
a first determining module, configured to determine a target loss according to the target class label, the prediction class label, and the confidence level;
the adjusting module is used for adjusting the network parameters of the preset detection network according to the target loss under the condition that the target loss does not meet the preset condition, and updating the target loss based on the preset detection network after the network parameters are adjusted;
and the second determining module is used for taking a preset detection network corresponding to the target loss meeting the preset condition as the target detection network under the condition that the target loss meets the preset condition.
In some embodiments, the above apparatus further comprises:
the third acquisition module is used for acquiring an image to be detected, wherein the image to be detected comprises an object to be detected;
and the target detection module is used for inputting the image to be detected into a target detection network for target detection to obtain the target object category information of the object to be detected.
The apparatus embodiments and the method embodiments described above are based on the same inventive concept.
The embodiment of the present application further provides an image processing apparatus, which includes a processor and a memory, where the memory stores at least one instruction or at least one program, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the image processing method provided by the above method embodiment.
The device embodiment and the method embodiments described above are based on the same inventive concept.
The present embodiments also provide a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or a set of instructions is stored, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the image processing method provided by the above method embodiments.
The storage medium embodiments and the method embodiments described above are based on the same inventive concept.
The present application also provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations described above.
An embodiment of the present application provides an image processing server, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the image processing method provided by the foregoing method embodiment.
The memory may be used to store software programs and modules, and the processor executes various functional applications and image processing by running the software programs and modules stored in the memory. The memory may mainly comprise a program storage area and a data storage area, where the program storage area may store an operating system, application programs needed by functions, and the like, and the data storage area may store data created according to the use of the apparatus, and the like. Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory may also include a memory controller to provide the processor with access to the memory.
The method provided by the embodiments of the present application may be executed on a mobile terminal, a computer terminal, a server, or a similar computing device. Taking a server as an example, fig. 7 is a block diagram of the hardware structure of a server for the image processing method according to an embodiment of the present application. As shown in fig. 7, the server 700 may vary considerably in configuration and performance, and may include one or more central processing units (CPUs) 710 (the processor 710 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 730 for storing data, and one or more storage media 720 (e.g., one or more mass storage devices) for storing applications 723 or data 722. The memory 730 and the storage medium 720 may be transient storage or persistent storage. The program stored in the storage medium 720 may include one or more modules, each of which may include a series of instruction operations for the server. Still further, the central processor 710 may be configured to communicate with the storage medium 720 and execute on the server 700 the series of instruction operations in the storage medium 720. The server 700 may also include one or more power supplies 760, one or more wired or wireless network interfaces 750, one or more input/output interfaces 740, and one or more operating systems 721, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
The input/output interface 740 may be used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication carrier of the server 700. In one example, the input/output interface 740 includes a network interface controller (NIC) that can connect to other network devices through a base station so as to communicate with the Internet. In another example, the input/output interface 740 may be a radio frequency (RF) module used for communicating with the Internet wirelessly.
It will be understood by those skilled in the art that the structure shown in fig. 7 is only an illustration and is not intended to limit the structure of the electronic device. For example, server 700 may also include more or fewer components than shown in FIG. 7, or have a different configuration than shown in FIG. 7.
Embodiments of the present application further provide a storage medium, which may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing the image processing method of the method embodiments, where the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the image processing method provided by the above method embodiments.
Alternatively, in this embodiment, the storage medium may be located in at least one of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or various other media capable of storing program code.
As can be seen from the above embodiments of the image processing method, apparatus, device, and computer-readable storage medium provided by the present application: acquiring the original image and the region where the target in the original image is located determines the region of the original image whose directivity needs to be enhanced; segmenting the original image into at least two image blocks disrupts the image features of the original image; rearranging the at least two image blocks produces a background image that no longer carries the complete image features; and fusing the region where the target is located with the background image produces a target image in which the directivity of the target region is enhanced, thereby enhancing the directivity of the original image. Training the preset detection network with such target images can effectively improve training accuracy, reduce the number of training iterations, and thus improve the training efficiency of the preset detection network.
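To make this pipeline concrete, an end-to-end sketch under the same assumptions as the fragments above (NumPy images, a target box (x, y, w, h), image dimensions divisible by the grid, and hypothetical names throughout):

import random
import numpy as np

def build_target_image(original, box, rows=4, cols=4):
    # 1. Segment the original image into a rows x cols grid of blocks.
    h, w = original.shape[:2]
    bh, bw = h // rows, w // cols
    blocks = [original[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
              for r in range(rows) for c in range(cols)]
    # 2. Rearrange the blocks so the background no longer carries the
    #    complete image features of the original image.
    random.shuffle(blocks)
    background = np.vstack([np.hstack(blocks[r * cols:(r + 1) * cols])
                            for r in range(rows)])
    # 3. Fuse: remap the region where the target is located back onto
    #    the background at its original position.
    x, y, tw, th = box
    background[y:y + th, x:x + tw] = original[y:y + th, x:x + tw]
    return background

For example, build_target_image(img, (64, 64, 96, 96)) on a 256x256 image yields a shuffled background with the 96x96 target patch restored at (64, 64).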
It should be noted that the above order of the embodiments of the present application is for description only and does not indicate the relative merits of the embodiments. Specific embodiments have been described above; other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or advantageous.
The embodiments in this specification are described in a progressive manner; for identical or similar parts among the embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiment is described relatively simply because it is substantially similar to the method embodiment, and reference may be made to the corresponding description of the method embodiment for relevant details.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. An image processing method, characterized in that the method comprises:
acquiring an original image and an area where a target in the original image is located;
carrying out image segmentation on the original image to obtain at least two image blocks;
rearranging the at least two image blocks to obtain a background image;
and carrying out image fusion on the region where the target is located and the background image to obtain a target image, wherein the target image is used for training a preset detection network.
2. The method of claim 1, wherein rearranging the at least two image blocks to obtain a background image comprises:
determining target arrangement information of the at least two image blocks; the target arrangement information represents target relative position information between the at least two image blocks, and the target relative position information is different from original relative position information of the at least two image blocks in the original image;
and rearranging the at least two image blocks according to the target arrangement information to obtain the background image.
3. The method according to claim 2, wherein the determining the target arrangement information of the at least two image blocks comprises:
determining original position information;
based on the original position information, randomly adjusting the corresponding positions of the at least two image blocks in the original image to obtain random position information;
and determining the target arrangement information based on the random position information.
4. The method according to claim 2, wherein the determining the target arrangement information of the at least two image blocks comprises:
acquiring preset position information;
and determining the target arrangement information based on the preset position information.
5. The method according to any one of claims 1 to 4, wherein the image fusion of the region where the target is located and the background image to obtain the target image comprises:
determining target position information of the area where the target is located in the original image;
and remapping the area of the target to the background image according to the target position information to obtain the target image.
6. The method according to any one of claims 1 to 4, wherein the image segmenting the original image to obtain at least two image blocks comprises:
acquiring the aspect ratio of the original image;
determining an image segmentation size according to the aspect ratio;
and according to the image segmentation size, performing image segmentation on the original image to obtain the at least two image blocks.
7. The method of any of claims 1 to 4, further comprising:
inputting the target image into the preset detection network to obtain a prediction category label and a confidence coefficient of the target image;
acquiring a target category label of the target image;
determining a target loss according to the target class label, the prediction class label and the confidence coefficient;
under the condition that the target loss does not meet a preset condition, adjusting network parameters of the preset detection network according to the target loss, and updating the target loss based on the preset detection network after the network parameters are adjusted;
and under the condition that the target loss meets the preset condition, taking the preset detection network corresponding to the target loss that meets the preset condition as a target detection network.
8. The method of claim 7, further comprising:
acquiring an image to be detected, wherein the image to be detected comprises an object to be detected;
and inputting the image to be detected into a target detection network for target detection to obtain the target object category information of the object to be detected.
9. An image processing apparatus, characterized in that the apparatus comprises:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring an original image and an area where a target in the original image is located;
the image segmentation module is used for carrying out image segmentation on the original image to obtain at least two image blocks;
the rearrangement module is used for rearranging the at least two image blocks to obtain a background image;
the image fusion module is used for carrying out image fusion on the region where the target is located and the background image to obtain a target image; the target image is used for training a preset detection network.
10. An image processing apparatus, characterized in that the apparatus comprises a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the image processing method according to any one of claims 1 to 8.
CN202110646726.2A 2021-06-10 2021-06-10 Image processing method, device, equipment and computer readable storage medium Pending CN115471509A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110646726.2A CN115471509A (en) 2021-06-10 2021-06-10 Image processing method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110646726.2A CN115471509A (en) 2021-06-10 2021-06-10 Image processing method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115471509A true CN115471509A (en) 2022-12-13

Family

ID=84364623

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110646726.2A Pending CN115471509A (en) 2021-06-10 2021-06-10 Image processing method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN115471509A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188916A (en) * 2023-04-17 2023-05-30 杰创智能科技股份有限公司 Fine granularity image recognition method, device, equipment and storage medium
CN116188916B (en) * 2023-04-17 2023-07-28 杰创智能科技股份有限公司 Fine granularity image recognition method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112232293B (en) Image processing model training method, image processing method and related equipment
JP6774137B2 (en) Systems and methods for verifying the authenticity of ID photos
CN109446889B (en) Object tracking method and device based on twin matching network
US10217017B2 (en) Systems and methods for containerizing multilayer image segmentation
CN114331829A (en) Countermeasure sample generation method, device, equipment and readable storage medium
CN113139628A (en) Sample image identification method, device and equipment and readable storage medium
CN110490959B (en) Three-dimensional image processing method and device, virtual image generating method and electronic equipment
CN112070071B (en) Method and device for labeling objects in video, computer equipment and storage medium
CN111402156B (en) Restoration method and device for smear image, storage medium and terminal equipment
CN111104941B (en) Image direction correction method and device and electronic equipment
CN111652181A (en) Target tracking method and device and electronic equipment
CN110569380A (en) Image tag obtaining method and device, storage medium and server
CN115471509A (en) Image processing method, device, equipment and computer readable storage medium
US9311523B1 (en) Method and apparatus for supporting object recognition
CN113688839A (en) Video processing method and device, electronic equipment and computer readable storage medium
CN111402415B (en) Object body elevation map generation method and device, storage medium and terminal equipment
CN109583584B (en) Method and system for enabling CNN with full connection layer to accept indefinite shape input
CN112667864B (en) Graph alignment method and device, electronic equipment and storage medium
CN112347976B (en) Region extraction method and device for remote sensing satellite image, electronic equipment and medium
CN114359352A (en) Image processing method, apparatus, device, storage medium, and computer program product
CN111382628B (en) Method and device for judging peer
CN112215123A (en) Target detection method, device and storage medium
CN114639076A (en) Target object detection method, target object detection device, storage medium, and electronic device
CN111414922A (en) Feature extraction method, image processing method, model training method and device
CN114359645B (en) Image expansion method, device, equipment and storage medium based on characteristic area

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination