WO2023231159A1

WO2023231159A1 - Image retrieval method and apparatus for achieving IA by combining RPA and AI, and electronic device

Info

Publication number: WO2023231159A1
Application number: PCT/CN2022/106339
Authority: WO
Inventors: 谭繁华
Original assignee: 来也科技(北京)有限公司
Priority date: 2022-05-30
Filing date: 2022-07-18
Publication date: 2023-12-07
Also published as: CN114840700B; CN114840700A

Abstract

The present disclosure provides an image retrieval method and apparatus for achieving intelligent automation (IA) by combining robotic process automation (RPA) and artificial intelligence (AI), and an electronic device. The method comprises: obtaining an initial image on the basis of RPA technology, wherein the initial image has image description information; intercepting an initial area image from the initial image on the basis of AI technology; processing the initial area image according to the image description information to obtain a target area image; and retrieving target content according to the target area image. According to the present disclosure, IA of image retrieval can be realized by combining RPA and AI, the image can be preprocessed in time before image retrieval, interference information in the image is removed, the pertinence of the obtained target area image in the retrieval process is effectively improved, and the influence of interference information on the retrieval process is effectively reduced, thereby effectively improving the image retrieval efficiency and the accuracy of the image retrieval result.

Description

Image retrieval methods, devices and electronic equipment that combine RPA and AI to realize IA

Cross-references to related applications

This application is filed based on a Chinese patent application with application number 202210600925.4 and a filing date of May 30, 2022, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is hereby incorporated by reference into this application.

Technical field

The present disclosure relates to the field of computer technology, and in particular to an image retrieval method, apparatus and electronic device that combine RPA (Robotic Process Automation) and AI (Artificial Intelligence) to achieve IA (Intelligent Automation).

Background

Robotic Process Automation, referred to as RPA, uses specific "robot software" to simulate human operations on a computer and automatically execute process tasks according to rules.

Artificial Intelligence (AI) is a technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence.

Intelligent Automation (IA) is a general term for a series of technologies ranging from robotic process automation to artificial intelligence. It combines RPA with AI technologies such as Optical Character Recognition (OCR), Intelligent Character Recognition (ICR), Process Mining, Deep Learning (DL), Machine Learning (ML), Natural Language Processing (NLP), Automatic Speech Recognition (ASR), Text To Speech (TTS) and Computer Vision (CV) to create end-to-end business processes that can think, learn and adapt. It covers the entire process from process discovery and process automation, through automatic and continuous data collection and understanding the meaning of the data, to using the data to manage and optimize business processes.

In the related art, when the main subject of the original image is relatively small, image retrieval is performed either by training the image vector algorithm with manual annotation or by using a hybrid retrieval mode based on image meta-information. The resulting image retrieval processing is relatively complex, incurs high labor costs, and cannot effectively guarantee the image retrieval effect.

Summary of the invention

The present disclosure aims to solve one of the technical problems in the related art, at least to a certain extent.

To this end, the purpose of the present disclosure is to propose an image retrieval method, apparatus, electronic device and storage medium that combine RPA and AI to achieve IA. By combining RPA with AI, intelligent automation (IA) of image retrieval can be achieved, and the image can be preprocessed in time before image retrieval to remove interference information from it. This effectively improves the pertinence of the obtained target area image in the retrieval process and effectively reduces the influence of interference information on the retrieval process, thereby effectively improving the image retrieval efficiency and the accuracy of the image retrieval result.

The image retrieval method that combines RPA and AI to implement IA, proposed by the embodiment of the first aspect of the present disclosure, includes: obtaining an initial image based on robotic process automation (RPA) technology, where the initial image has image description information; intercepting an initial area image from the initial image based on artificial intelligence (AI) technology; processing the initial area image according to the image description information to obtain a target area image; and retrieving target content according to the target area image.

In one implementation, intercepting the initial area image from the initial image based on AI technology includes: calling a natural language processing (NLP) service to identify subject information in the initial image; determining, based on the subject information, position description information of the subject in the initial image; and intercepting the area image corresponding to the position description information from the initial image as the initial area image.
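As an illustration of the interception step, the sketch below assumes that the position description information takes the form of a bounding box. The `locate_subject` function is a hypothetical stand-in for the recognition service (it simply bounds all pixels above a threshold); the disclosure does not specify how the service locates the subject.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def locate_subject(image: Image, threshold: int = 0) -> Box:
    """Stand-in for the recognition service: bounding box of pixels above threshold."""
    rows = [r for r, row in enumerate(image) if any(p > threshold for p in row)]
    cols = [c for c in range(len(image[0])) if any(row[c] > threshold for row in image)]
    if not rows or not cols:
        raise ValueError("no subject found")
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image: Image, box: Box) -> Image:
    """Intercept the area described by the box from the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


initial_image = [
    [0, 0, 0, 0],
    [0, 5, 7, 0],
    [0, 6, 8, 0],
    [0, 0, 0, 0],
]
box = locate_subject(initial_image)            # position description information
initial_area_image = crop(initial_image, box)  # initial area image
```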

In one embodiment, after acquiring the initial image based on RPA technology, the method further includes: determining image scale information of the initial image; and/or determining pixel feature information of the initial image; and/or determining processing parameter information specified for the initial image; and using the image scale information, and/or the pixel feature information, and/or the processing parameter information as the image description information.
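One way to hold the three optional kinds of image description information is a small record with optional fields. This is a sketch under assumptions: the `ImageDescription` type, the `describe` function and the particular statistics chosen as pixel features (count, mean, min, max) are illustrative and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]


@dataclass
class ImageDescription:
    scale: Optional[Tuple[int, int]] = None              # image scale information: (height, width)
    pixel_features: Optional[Dict[str, float]] = None    # pixel feature information
    processing_params: Optional[Dict[str, float]] = None  # processing parameter information


def describe(image: Image, params: Optional[Dict[str, float]] = None) -> ImageDescription:
    """Collect scale, simple pixel statistics, and any caller-specified parameters."""
    pixels = [p for row in image for p in row]
    return ImageDescription(
        scale=(len(image), len(image[0])),
        pixel_features={
            "count": len(pixels),
            "mean": sum(pixels) / len(pixels),
            "min": min(pixels),
            "max": max(pixels),
        },
        processing_params=params,
    )


description = describe([[0, 10], [20, 30]], params={"expansion_factor": 2.0})
```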

In one embodiment, the image description information includes: image scale information; wherein, processing the initial area image according to the image description information to obtain the target area image includes: enlarging the initial area image according to the image scale information; and using the enlarged image as the target area image.
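A minimal sketch of the enlargement, assuming that the image scale information is the (height, width) of the initial image and that the initial area image is enlarged back to that size. Nearest-neighbour sampling is used here purely for simplicity; the disclosure does not name an interpolation method.

```python
from typing import List, Tuple

Image = List[List[int]]


def enlarge(area: Image, target_size: Tuple[int, int]) -> Image:
    """Nearest-neighbour enlargement of area to target_size = (height, width)."""
    target_h, target_w = target_size
    src_h, src_w = len(area), len(area[0])
    return [
        [area[r * src_h // target_h][c * src_w // target_w] for c in range(target_w)]
        for r in range(target_h)
    ]


initial_area_image = [[1, 2], [3, 4]]
image_scale_information = (4, 4)  # assumed: size of the initial image
target_area_image = enlarge(initial_area_image, image_scale_information)
```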

In one embodiment, the image description information includes: pixel feature information; wherein, processing the initial area image according to the image description information to obtain the target area image includes: obtaining the area pixels in the initial image area; parsing the regional pixel features of the area pixels from the pixel feature information; and enhancing the regional pixel features of each area pixel in the initial image area to obtain the target area image.
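The disclosure does not say which enhancement is applied. As one possible reading, the sketch below treats the minimum and maximum intensity of the area pixels as the regional pixel features and enhances the area by linear contrast stretching to the range 0 to 255. Both the choice of features and the stretching technique are assumptions made for illustration.

```python
from typing import Dict, List

Image = List[List[int]]


def regional_features(area: Image) -> Dict[str, int]:
    """Assumed regional pixel features: intensity range of the area pixels."""
    pixels = [p for row in area for p in row]
    return {"min": min(pixels), "max": max(pixels)}


def enhance(area: Image, features: Dict[str, int]) -> Image:
    """Linear contrast stretch of every area pixel to the range 0..255."""
    lo, hi = features["min"], features["max"]
    if hi == lo:  # flat region: nothing to stretch
        return [row[:] for row in area]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in area]


initial_area_image = [[50, 100], [150, 200]]
target_area_image = enhance(initial_area_image, regional_features(initial_area_image))
```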

In one embodiment, the image description information includes: image scale information and pixel feature information; wherein, processing the initial area image according to the image description information to obtain the target area image includes: enlarging the initial area image according to the image scale information to obtain an image of the area to be filled, where the image of the area to be filled includes pixels to be filled; parsing first pixel features of the pixels to be filled from the pixel feature information; parsing second pixel features of the regional pixels in other area images from the pixel feature information, where the initial area image and the other area images together constitute the initial image; generating a filling pixel feature according to the first pixel feature and the second pixel feature; and filling the image of the area to be filled according to the filling pixel feature to obtain the target area image.
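The enlarge-then-fill variant leaves several details open, so the following sketch makes explicit assumptions: enlargement is modelled as centring the initial area image on a canvas of the initial image's size, with `None` marking the pixels to be filled; the first pixel feature is taken as the mean of the area's border pixels (the pixels the filled region adjoins); the second pixel feature is the mean of the pixels outside the area in the initial image; and the filling pixel feature is the average of the two. None of these choices is stated in the disclosure.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), exclusive ends


def mean(values: List[int]) -> float:
    return sum(values) / len(values)


def pad_to(area: Image, size: Tuple[int, int]) -> List[List[Optional[int]]]:
    """Centre area on a canvas of the given size; None marks pixels to be filled."""
    target_h, target_w = size
    area_h, area_w = len(area), len(area[0])
    top, left = (target_h - area_h) // 2, (target_w - area_w) // 2
    canvas: List[List[Optional[int]]] = [[None] * target_w for _ in range(target_h)]
    for r in range(area_h):
        canvas[top + r][left:left + area_w] = area[r]
    return canvas


def enlarge_and_fill(initial: Image, box: Box) -> Image:
    top, left, bottom, right = box
    area = [row[left:right] for row in initial[top:bottom]]
    to_fill = pad_to(area, (len(initial), len(initial[0])))
    # First pixel feature (assumed): mean of the border pixels of the area.
    border = [
        p
        for r, row in enumerate(area)
        for c, p in enumerate(row)
        if r in (0, len(area) - 1) or c in (0, len(row) - 1)
    ]
    first = mean(border)
    # Second pixel feature (assumed): mean of the pixels in the other areas.
    others = [
        p
        for r, row in enumerate(initial)
        for c, p in enumerate(row)
        if not (top <= r < bottom and left <= c < right)
    ]
    second = mean(others) if others else first
    fill_value = round((first + second) / 2)  # filling pixel feature
    return [[fill_value if p is None else p for p in row] for row in to_fill]


initial_image = [
    [2, 2, 2, 2],
    [2, 8, 8, 2],
    [2, 8, 8, 2],
    [2, 2, 2, 2],
]
target_area_image = enlarge_and_fill(initial_image, (1, 1, 3, 3))
```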

In one embodiment, the image description information includes: processing parameter information; wherein processing the initial area image according to the image description information to obtain the target area image includes: processing the initial area image according to the processing parameter information to obtain the target area image.

In one implementation, retrieving target content based on the target area image includes: determining semantic representation information of the target area image; retrieving the target content based on the semantic representation information.

In one implementation, determining the semantic representation information of the target area image includes: identifying the target object outline from the target area image; determining the object outline information according to the target object outline; processing the object outline information to obtain the outline vector representation; and using the outline vector representation as the semantic representation information.
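The outline-to-vector step can be illustrated with a deliberately simple descriptor. In the sketch below, the target area image is assumed to be already segmented into a binary mask, the outline is the set of object pixels that touch the background, and the vector holds two scale-invariant ratios of centroid distances. This descriptor is an illustrative choice and is not the representation used by the disclosure.

```python
import math
from typing import List, Tuple

Mask = List[List[int]]  # 1 = object pixel, 0 = background (assumed pre-segmented)


def contour_points(mask: Mask) -> List[Tuple[int, int]]:
    """Object pixels that touch the background or the image edge (4-neighbourhood)."""
    h, w = len(mask), len(mask[0])

    def is_background(r: int, c: int) -> bool:
        return not (0 <= r < h and 0 <= c < w) or mask[r][c] == 0

    return [
        (r, c)
        for r in range(h)
        for c in range(w)
        if mask[r][c]
        and any(is_background(r + dr, c + dc) for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)))
    ]


def contour_vector(points: List[Tuple[int, int]]) -> List[float]:
    """Descriptor [min/max, mean/max] of the distances from the contour centroid."""
    cy = sum(r for r, _ in points) / len(points)
    cx = sum(c for _, c in points) / len(points)
    dists = [math.hypot(r - cy, c - cx) for r, c in points]
    longest = max(dists)
    if longest == 0:  # single-point contour
        return [1.0, 1.0]
    return [min(dists) / longest, sum(dists) / len(dists) / longest]


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
outline = contour_points(mask)    # object outline information
vector = contour_vector(outline)  # outline vector representation
```

Because both entries are ratios of distances, the vector is unchanged when the same shape appears at a different size, which is convenient once the area image has been enlarged.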

In one implementation, retrieving the target content according to the semantic representation information includes: determining a candidate similarity level corresponding to the semantic representation information, where the candidate similarity level belongs to a pre-constructed graph data structure and is the level to which the similarity between the content it represents and the initial image belongs; and using the content represented by the candidate similarity level in the graph data structure as the target content.

The image retrieval device that combines RPA and AI to implement IA, proposed by the embodiment of the second aspect of the present disclosure, includes: an acquisition module, configured to acquire an initial image based on robotic process automation (RPA) technology, where the initial image has image description information; a first processing module, configured to intercept an initial area image from the initial image based on artificial intelligence (AI) technology; a second processing module, configured to process the initial area image according to the image description information to obtain a target area image; and a retrieval module, configured to retrieve target content according to the target area image.

In one implementation, the first processing module is specifically configured to: call a natural language processing (NLP) service to identify subject information in the initial image; determine, based on the subject information, position description information of the subject in the initial image; and intercept the area image corresponding to the position description information from the initial image as the initial area image.

In one embodiment, the device further includes: a determining module, configured to determine image scale information of the initial image; and/or determine pixel feature information of the initial image; and/or determine processing parameter information specified for the initial image; and use the image scale information, and/or the pixel feature information, and/or the processing parameter information as the image description information.

In one embodiment, the image description information includes: image scale information; wherein, the second processing module is specifically configured to: expand the initial region image according to the image scale information; and use the expanded image as the target region image.

In one implementation, the image description information includes: pixel feature information; wherein, the second processing module is further configured to: obtain the regional pixels in the initial image region; parse the regional pixel features of the regional pixels from the pixel feature information; and enhance the regional pixel features of each regional pixel in the initial image region to obtain the target area image.

In one embodiment, the image description information includes: image scale information and pixel feature information; wherein, the second processing module is further configured to: expand the initial region image according to the image scale information to obtain the region image to be filled, where the region image to be filled includes pixels to be filled; parse first pixel features of the pixels to be filled from the pixel feature information; parse second pixel features of the region pixels in other region images from the pixel feature information, where the initial region image and the other region images together constitute the initial image; generate a filling pixel feature based on the first pixel feature and the second pixel feature; and fill the region image to be filled according to the filling pixel feature to obtain the target area image.

In one implementation, the image description information includes: processing parameter information; wherein the second processing module is further configured to: process the initial area image according to the processing parameter information to obtain the target area image.

In one implementation, the retrieval module includes: a determination sub-module, used to determine the semantic representation information of the target area image; a retrieval sub-module, used to retrieve the target content according to the semantic representation information.

In one implementation, the determination sub-module is specifically configured to: identify the target object contour from the target area image; determine the object contour information according to the target object contour; process the object contour information to obtain the contour vector representation; and use the contour vector representation as the semantic representation information.

In one implementation, the retrieval sub-module is specifically configured to: determine a candidate similarity level corresponding to the semantic representation information, where the candidate similarity level belongs to a pre-constructed graph data structure and is the level to which the similarity between the content it represents and the initial image belongs; and use the content represented by the candidate similarity level in the graph data structure as the target content.

The electronic device provided by the embodiment of the third aspect of the present disclosure includes: at least one processor and a memory; the memory stores computer-executable instructions; and the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the image retrieval method that combines RPA and AI to implement IA proposed by the embodiment of the first aspect of the present disclosure.

The computer-readable storage medium proposed in the embodiment of the fourth aspect of the present disclosure stores computer-executable instructions. When a processor executes the computer-executable instructions, the image retrieval method that combines RPA and AI to implement IA proposed in the embodiment of the first aspect of the present disclosure is implemented.

The advantages or beneficial effects of the above technical solutions at least include the following. The initial image is obtained based on RPA technology, the initial area image is intercepted from the initial image based on AI technology, the initial area image is processed according to the image description information to obtain the target area image, and the target content is retrieved according to the target area image. In this way, RPA combined with AI achieves intelligent automation (IA) of image retrieval: the image can be preprocessed in time before image retrieval to remove interference information from it, which effectively improves the pertinence of the obtained target area image in the retrieval process and effectively reduces the influence of interference information on the retrieval process, thereby effectively improving the image retrieval efficiency and the accuracy of the image retrieval result.

The above summary is for illustration purposes only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments and features described above, further aspects, embodiments and features of the present disclosure will be readily apparent by reference to the drawings and the following detailed description.

Description of the drawings

In the drawings, unless otherwise specified, the same reference numbers refer to the same or similar parts or elements throughout the several figures. The drawings are not necessarily to scale. It should be understood that these drawings depict only some embodiments in accordance with the disclosure and are not to be considered limiting of the scope of the disclosure.

Figure 1 is a schematic flowchart of an image retrieval method that combines RPA and AI to implement IA proposed by an embodiment of the present disclosure;

Figure 2 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure;

Figure 3 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure;

Figure 4 is a schematic flow chart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure;

Figure 5 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure;

Figure 6 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure;

Figure 7 is a schematic structural diagram of an image retrieval device that combines RPA and AI to implement IA proposed by an embodiment of the present disclosure;

Figure 8 is a schematic structural diagram of an image retrieval device that combines RPA and AI to implement IA proposed by another embodiment of the present disclosure;

FIG. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure.

Detailed description

Embodiments of the present disclosure are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals throughout represent the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are only used to explain the present disclosure and are not to be construed as limitations of the present disclosure.

In the description of the present disclosure, the term "plurality" means two or more.

In the description of this disclosure, the term "Robotic Process Automation (RPA)" refers to the automatic execution of process tasks according to rules on a computer through robot application software.

In the description of this disclosure, the term "Artificial Intelligence (AI)" refers to the discipline that studies how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking and planning). It includes both hardware-level and software-level technologies. AI hardware technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage and big data processing; AI software technologies mainly include computer vision, speech recognition, natural language processing, machine learning, deep learning, big data processing and knowledge graph technologies.

In the description of this disclosure, the term "Intelligent Automation (IA)" refers to a series of technologies ranging from robotic process automation to artificial intelligence. It combines RPA with AI technologies such as Optical Character Recognition (OCR), Intelligent Character Recognition (ICR), Process Mining, Deep Learning (DL), Machine Learning (ML), Natural Language Processing (NLP), Automatic Speech Recognition (ASR), Text To Speech (TTS) and Computer Vision (CV) to create end-to-end business processes that can think, learn and adapt. It covers the entire process from process discovery and process automation, through automatic and continuous data collection and understanding the meaning of the data, to using the data to manage and optimize business processes.

In the description of this disclosure, the term "initial image" refers to an image to be retrieved. The initial image can be, for example, a car image captured by a traffic monitoring device, or it can be any kind of image containing a retrieval object. There are no restrictions on this.

In the description of this disclosure, the term "image description information" refers to data that describes one or more relevant features of the initial image. The image description information can be used to describe multi-dimensional features of the image, such as scale features and pixel features. It may specifically include, for example, image scale information of the initial image, and/or pixel feature information, and/or processing parameter information, without limitation.

In the description of this disclosure, the term "initial area image" refers to a partial image intercepted from the initial image using AI technology. The initial area image may include a target area image.

In the description of this disclosure, the term "target area image" refers to an image obtained by processing the initial area image using the image description information as a reference basis.

In the description of the present disclosure, the term "target content" refers to the content retrieved during the image retrieval process using the target area image as the retrieval reference. The target content can be, for example, a retrieved picture, or text, audio or video that describes the retrieved picture, without limitation.

In the description of the present disclosure, the term "subject" refers to the main description object contained in the initial image, such as a person in a portrait picture, a car in a car display picture, etc., without limitation.

In the description of this disclosure, the term "subject information" refers to information related to the retrieval object contained in the initial image, such as the position information and area information of the retrieval object in the initial image, and is not limited to this.

In the description of this disclosure, the term "position description information" refers to information that describes the location of the subject in the initial image, such as the distribution and proportion of the subject in the initial image, without limitation.

In the description of the present disclosure, the term "image scale information" may refer to relevant information used to describe the scale of the initial image, such as the size, area, etc. of the initial image, without limitation.

In the description of the present disclosure, the term "pixel feature information" can be used to describe the feature information of pixels contained in the initial image, such as the number of pixels, color, etc., without limitation.

In the description of the present disclosure, the term "processing parameter information" may refer to the expansion factor, fill color, brightness, hue, saturation, sharpening degree, etc. specified in advance for the initial image, without limitation.

In the description of the present disclosure, the term "area pixel" refers to a pixel corresponding to one or more image areas in the initial image area.

In the description of this disclosure, the term "regional pixel features" refers to the relevant features of regional pixels obtained based on pixel feature information, such as the number, color, etc. of regional pixels.

In the description of the present disclosure, the term "region image to be filled" refers to an image obtained by enlarging the initial region image based on image scale information.

In the description of the present disclosure, the term "pixels to be filled" refers to pixels in the image of the area to be filled that need to be filled.

In the description of the present disclosure, the term "first pixel feature" refers to the relevant feature of the pixel to be filled that is obtained based on the pixel feature information.

In the description of the present disclosure, the term "second pixel feature" refers to the relevant features of regional pixels in other regional images obtained based on pixel feature information.

In the description of this disclosure, the term "filling pixel feature" refers to the pixel feature obtained based on the first pixel feature and the second pixel feature. The filling pixel feature can be used as a reference when filling the image of the area to be filled.

In the description of the present disclosure, the term "semantic representation information" refers to information that characterizes image-related features of the target area image. For example, the target area image can have image features such as color features, contour features, linear features, center features, diagonal features, texture features, local features, and shape features, without limitation. The semantic representation information may refer to information that characterizes one or more of the above image features.

In the description of this disclosure, the term "target object outline" refers to the outline of the retrieval target object contained in the target area image, such as the outline of the human body in a person image or the outline of the car light in a car image, without limitation.

In the description of the present disclosure, the term "object contour information" refers to relevant information obtained based on the contour of the target object.

In the description of the present disclosure, the term "contour vector representation" refers to a vector representation, mapped into a vector space, that can represent the contour information of an object. The vector representation can be obtained by mapping features, such as contour features, to the vector space.

In the description of the present disclosure, the term "candidate similarity level" refers to the level in the graph data structure to which the degree of similarity between the content represented by the semantic representation information and the initial image belongs.

In the description of the present disclosure, the term "graph data structure" refers to a data structure established in advance in the vector retrieval library, with vector distance as the basis for division. During image retrieval, the graph data structure can be used to find candidate similarity levels with vector distance as the reference, thereby narrowing the search scope.
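To make the idea of narrowing the search by vector distance concrete, the sketch below uses a greedy nearest-neighbour walk over a small hand-built graph. It is a stand-in and not the disclosed structure: the `Level` type, its `centroid` and `neighbours` fields, and the `greedy_search` function are hypothetical, and each node is assumed to represent one similarity level holding the contents that fall into it.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Level:
    name: str
    centroid: Sequence[float]  # representative vector of this similarity level
    contents: List[str]        # contents represented by this level
    neighbours: List[str] = field(default_factory=list)


def greedy_search(graph: Dict[str, Level], entry: str, query: Sequence[float]) -> Level:
    """Walk to whichever neighbour is closer to the query until none is closer."""
    current = graph[entry]
    while True:
        best = min(
            (graph[name] for name in current.neighbours),
            key=lambda level: math.dist(level.centroid, query),
            default=current,
        )
        if math.dist(best.centroid, query) >= math.dist(current.centroid, query):
            return current  # local minimum: candidate similarity level
        current = best


graph = {
    "A": Level("A", (0.0, 0.0), ["sedan.jpg"], ["B"]),
    "B": Level("B", (1.0, 0.0), ["van.jpg"], ["A", "C"]),
    "C": Level("C", (2.0, 0.0), ["truck.jpg"], ["B"]),
}
candidate = greedy_search(graph, entry="A", query=(1.9, 0.1))
target_content = candidate.contents
```

Only the levels visited along the walk are compared with the query, which is how a graph of this kind can reduce the number of distance computations compared with scanning every stored vector.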

The intelligent automation platform can seamlessly integrate capabilities such as RPA, Intelligent Document Processing (IDP), Conversational AI (CoAI) and Process Mining, and provides five major categories of functions: "business understanding", "process creation", "run anywhere", "centralized management and control", and "human-machine collaboration". These enable enterprises to achieve end-to-end intelligent automation of business processes, replace manual operations, further improve business efficiency, and accelerate digital transformation.

Intelligent Document Processing (IDP) is one of the core capabilities of the intelligent automation platform. Based on AI technologies such as Optical Character Recognition (OCR), Computer Vision (CV), Natural Language Processing (NLP) and Knowledge Graph (KG), IDP can recognize and classify various types of documents, extract elements from them, and verify, compare and correct them. It is a new generation of automation technology that helps enterprises make document processing intelligent and automated.

These and other aspects of embodiments of the present disclosure will become apparent with reference to the following description and accompanying drawings. In these descriptions and drawings, some specific implementations of the embodiments of the disclosure are specifically disclosed to represent some of the ways of implementing the principles of the embodiments of the disclosure, but it should be understood that the scope of the embodiments of the disclosure is not limited by this restriction. On the contrary, the disclosed embodiments include all changes, modifications and equivalents falling within the spirit and scope of the appended claims.

An image retrieval method that combines RPA and AI to implement IA according to an embodiment of the present disclosure is described below with reference to the accompanying drawings.

Figure 1 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by an embodiment of the present disclosure.

In this embodiment, the image retrieval method that combines RPA and AI to implement IA is described, by way of example, as being configured in an image retrieval device that combines RPA and AI to implement IA. The image retrieval device can be installed in a server or in an electronic device, which is not limited in the embodiments of the present disclosure.

In this embodiment, the image retrieval method that combines RPA and AI to implement IA is described, by way of example, as being configured in an electronic device. The electronic device may be a hardware device with any of various operating systems, such as a smartphone, a tablet, a personal digital assistant, or an e-book reader.

It should be noted that the execution subject of the embodiments of the present disclosure may be, in terms of hardware, for example, a central processing unit (CPU) in a server or an electronic device, and, in terms of software, for example, a related background service in a server or an electronic device, which is not limited here.

In addition, "retrieval" in the embodiments of the present disclosure refers to an image retrieval process that achieves intelligent automation (IA) by combining robotic process automation (RPA) and artificial intelligence (AI). In other words, the image retrieval process is fully automated, and it is also combined with AI to achieve automated image retrieval in the field of Natural Language Processing (NLP).

The present disclosure can be specifically applied to the field of natural language processing (NLP) within AI. NLP is a field of computer science, artificial intelligence and linguistics concerned with the interaction between computers and human (natural) language.

For example, in the embodiments of the present disclosure, the image retrieval process is automated end to end: the initial image is obtained based on RPA technology, the initial area image is intercepted from the initial image based on AI technology, the initial area image is processed according to the image description information to obtain the target area image, and the target content is retrieved according to the target area image.

As shown in Figure 1, the image retrieval method that combines RPA and AI to implement IA includes:

S101: Obtain an initial image based on Robotic Process Automation (RPA) technology, where the initial image has image description information.

Here, robotic process automation (RPA) refers to the use of robot application software to automatically execute process tasks on a computer according to rules.

The initial image refers to an image to be retrieved. The initial image may be, for example, a car image captured by a traffic monitoring device, or may be any type of image containing a retrieval object, and is not limited to this.

That is to say, one application scenario of the embodiments of the present disclosure may be, for example, using RPA to obtain a car image captured by a traffic monitoring device and using that car image as the initial image. Image retrieval that combines RPA and AI to implement IA can then be performed on the acquired initial image to determine the car information in it. Alternatively, the image retrieval method described in the embodiments of the present disclosure can be applied to any other possible image retrieval scenario, which is not limited here.

Here, the image description information refers to data that describes one or more relevant features of the initial image. It can describe features of the image in multiple dimensions, such as scale features and pixel features. Specifically, the image description information may be, for example, image scale information of the initial image, and/or pixel feature information, and/or processing parameter information, without limitation.

In the embodiments of the present disclosure, when the initial image is acquired based on RPA technology, an application data interface can be configured in advance. The RPA robot then follows a preset software operation process to receive, via the application data interface, the image information exchanged between the user and the robot, and uses that image information as the initial image.

In other embodiments, a third-party image collection device can be used. A communication link between the execution subject of the embodiments of the present disclosure and the third-party image collection device is established in advance, and the image collected by the device is obtained based on RPA and used as the initial image. Any other possible RPA-based method can also be used to obtain the initial image, which is not limited here.

S102: Intercept the initial area image from the initial image based on artificial intelligence AI technology.

Here, artificial intelligence (AI) refers to the study of using computers to simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning). It involves both hardware-level and software-level technologies. AI hardware technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, and big data processing. AI software technologies mainly include computer vision, speech recognition, natural language processing, machine learning, deep learning, big data processing, and knowledge graphs.

The initial area image refers to a partial image intercepted from the initial image based on artificial intelligence AI technology, and the initial area image may include a target area image.

In the embodiment of the present disclosure, after acquiring the initial image based on robotic process automation (RPA) technology, the initial area image can be intercepted from the initial image based on artificial intelligence (AI) technology.

In the embodiments of the present disclosure, when the initial area image is intercepted from the initial image based on AI technology, AI may be used to identify the boundary information of the subject in the initial image, and the initial area image is then intercepted from the initial image based on that boundary information.
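As a minimal sketch of boundary-based interception, the following assumes that the AI step has already produced a binary mask marking the subject's pixels; the mask, the list-of-rows image layout, and the function names `bounding_box` and `crop` are illustrative assumptions rather than details given in the disclosure.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]          # grayscale image stored as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def bounding_box(mask: Image) -> Optional[Box]:
    """Return the tightest box around the non-zero mask pixels, or None if empty."""
    if not mask or not mask[0]:
        return None
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows or not cols:
        return None
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image: Image, box: Box) -> Image:
    """Cut the boxed region out of the image (hypothetical helper)."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    image = [[c + 5 * r for c in range(5)] for r in range(4)]
    # Assumed output of a subject-boundary detector: 1 marks subject pixels.
    mask = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    box = bounding_box(mask)
    if box is not None:
        print(crop(image, box))
```

Returning `None` for an empty mask lets the caller decide whether to fall back to the full initial image when no subject is found.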

In other embodiments, a pre-trained matting model can be applied to the initial image to obtain the initial area image, or any other possible AI-based method can be used to intercept the initial area image from the initial image. This is not limited here.

S103: Process the initial area image according to the image description information to obtain the target area image.

Among them, the target area image refers to the image obtained by processing the initial area image using the image description information as a reference basis.

In the embodiments of the present disclosure, when the initial area image is processed according to the image description information to obtain the target area image, the image description information and the initial area image can be input into a pre-trained image processing model, and the resulting target area image is transmitted to the execution subject of the embodiments of the present disclosure. Any other possible method, such as a mathematical or engineering method, can also be used to process the initial area image according to the image description information to obtain the target area image, which is not limited here.

S104: Retrieve target content based on the target area image.

Therefore, embodiments of the present disclosure can effectively combine RPA and AI to realize intelligent automation (IA) of the image retrieval process, thereby effectively improving the automation of image retrieval and reducing labor costs.

Here, the target content refers to the content retrieved during image retrieval using the target area image as the retrieval reference. The target content can be, for example, a retrieved picture, or text, audio, or video describing the retrieved picture, which is not limited here.

In the embodiment of the present disclosure, an image retrieval database can be obtained in advance, and the image retrieval database can contain the above target content, so as to achieve retrieval of similar images in the image retrieval database based on the target area image.

In the embodiment of the present disclosure, after the initial area image is processed according to the image description information and the target area image is obtained, the target content can be retrieved based on the target area image.

In the embodiments of the present disclosure, when retrieving target content based on the target area image, the classification feature information of the target area image can be determined, and then the target content can be retrieved based on the classification feature information.

In other embodiments, a retrieval learning model can be pre-trained. The retrieval learning model performs feature analysis on the target area image and runs the retrieval operation against the image retrieval database based on the feature analysis results. Any other possible method can also be used to retrieve the target content based on the target area image, which is not limited here.
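The disclosure does not fix a similarity measure for this retrieval step. As one hedged illustration, the sketch below ranks the entries of a small in-memory library by cosine similarity to a query feature vector; the feature vectors, the library contents, and the names `cosine_similarity` and `retrieve` are assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query: Sequence[float],
    library: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k library entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical feature vectors standing in for an image retrieval database.
    library = {"car_a": [1.0, 0.0], "car_b": [0.0, 1.0], "car_c": [1.0, 1.0]}
    for name, score in retrieve([1.0, 0.1], library, top_k=2):
        print(name, round(score, 3))
```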

In this embodiment, the initial image is obtained based on RPA technology, the initial area image is intercepted from the initial image based on AI technology, the initial area image is processed according to the image description information to obtain the target area image, and the target content is retrieved based on the target area image. RPA can thus be combined with AI to realize intelligent automation (IA) of image retrieval. The image is preprocessed before retrieval to remove interference information, which makes the resulting target area image more targeted in the retrieval process and reduces the influence of interference information on retrieval, thereby improving both image retrieval efficiency and the accuracy of the retrieval results.

Figure 2 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure.

As shown in Figure 2, the image retrieval method that combines RPA and AI to implement IA includes:

S201: Obtain initial images based on Robotic Process Automation (RPA) technology.

For the description of S201, reference may be made to the above-mentioned embodiments and will not be described again here.

S202: Determine the image scale information of the initial image.

The image scale information may refer to relevant information used to describe the scale of the initial image, such as the size, area, etc. of the initial image, which is not limited.

In the embodiments of the present disclosure, after the initial image is acquired based on RPA technology, the image scale information of the initial image can be determined. For example, a pre-trained image scale algorithm model can be used to analyze the initial image, and the scale information output by the model is used as the image scale information of the initial image. The image scale information obtained in this way can effectively characterize the initial image in the scale dimension.

S203: Determine the pixel feature information of the initial image.

Among them, the pixel feature information can be used to describe the feature information of the pixels contained in the initial image, such as the number of pixels, hue, etc., and there is no limit to this.

In the embodiments of the present disclosure, after the initial image is acquired based on RPA technology, the pixel feature information of the initial image can also be determined. For example, a convolutional neural network can be used to analyze the features of the initial image, and the pixel features output by the network are used as the pixel feature information of the initial image. The pixel feature information obtained in this way can effectively characterize the initial image in the pixel feature dimension.

S204: Determine the processing parameter information specified for the initial image.

The processing parameter information may refer to information such as expansion factor, fill color, brightness, hue, saturation, and sharpening degree specified in advance for the initial image, and there is no limit to this.

In the embodiments of the present disclosure, after the initial image is obtained based on RPA technology, the processing parameter information specified for the initial image can also be determined. The processing parameter information can be configured according to user configuration instructions. Alternatively, template information for the target area image can be determined in advance, the initial image can be compared against the template information, and the processing parameter information applicable to the initial image can be determined from the comparison result. This is not limited here.

S205: Use image scale information, and/or pixel feature information, and/or processing parameter information as image description information.

In the embodiments of the present disclosure, after the initial image is acquired based on RPA technology, the image scale information, and/or pixel feature information, and/or processing parameter information of the initial image can be acquired, and one or more of them can be used as the image description information. The image description information then serves as a reference for the subsequent processing of the initial area image.
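One way to model the "and/or" combination is a record whose three parts are each optional. The sketch below is an assumption-laden illustration: the class names `ProcessingParams` and `ImageDescription`, the field names, and the default values are hypothetical, and only the kinds of information listed (scale, pixel features, and processing parameters such as expansion factor, fill color, brightness, hue, saturation, and sharpening degree) come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ProcessingParams:
    """Processing parameters specified in advance (defaults are illustrative only)."""
    expansion_factor: float = 1.0
    fill_color: Tuple[int, int, int] = (0, 0, 0)
    brightness: float = 0.0
    hue: float = 0.0
    saturation: float = 1.0
    sharpening: float = 0.0


@dataclass
class ImageDescription:
    """Image description information; any subset of the three parts may be present."""
    scale: Optional[Tuple[int, int]] = None            # assumed (height, width)
    pixel_features: Optional[Dict[str, float]] = None  # e.g. pixel count, mean hue
    processing_params: Optional[ProcessingParams] = None

    def available(self) -> List[str]:
        """Names of the parts that were actually supplied."""
        names = ("scale", "pixel_features", "processing_params")
        return [name for name in names if getattr(self, name) is not None]


if __name__ == "__main__":
    description = ImageDescription(
        scale=(480, 640),
        processing_params=ProcessingParams(expansion_factor=2.0),
    )
    print(description.available())
```

Keeping each part optional lets later steps check which information is present before choosing how to process the initial area image.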

In this embodiment, the image scale information of the initial image, and/or the pixel feature information of the initial image, and/or the processing parameter information specified for the initial image is determined and used as the image description information. The image description information can therefore characterize the initial image across multiple feature dimensions and is applicable to different image preprocessing scenarios. When the initial area image is processed based on the image description information, multi-dimensional processing can be achieved, which improves the reference value of the resulting target area image in the image retrieval process and thereby improves the flexibility of that process.

S206: Call the natural language processing NLP service to identify the subject information in the initial image.

Here, a natural language processing (NLP) service takes language as its object and uses computer technology to analyze, understand, and process natural language. That is, the computer is used as a powerful tool for language research: language information is studied quantitatively with the support of the computer, and language descriptions that can be shared between humans and computers are provided.

Among them, the subject refers to the main description object contained in the initial image, such as the person in the portrait picture, the car in the car display picture, etc., and there is no limit to this.

Here, the subject information refers to relevant information about the retrieval object contained in the initial image, such as the position and area of the retrieval object in the initial image, which is not limited here.

In the embodiments of the present disclosure, after the image scale information, and/or pixel feature information, and/or processing parameter information is used as the image description information, the NLP service can be called to identify the subject information in the initial image. For example, an image subject detection algorithm can be used to detect the subject information in the initial image, or the subject of the image can be annotated through human-computer collaboration to obtain the subject information, thereby reducing annotation cost. This is not limited here.

S207: According to the subject information, determine the position description information corresponding to the subject in the initial image.

The position description information refers to relevant information that can be used to describe the location of the subject in the initial image, such as the distribution and proportion of the subject in the initial image, etc., and there is no limit to this.

In the embodiments of the present disclosure, after the NLP service is called to identify the subject information in the initial image, the position description information corresponding to the subject in the initial image can be determined according to the subject information. The position description information obtained provides a reliable reference for subsequently intercepting the corresponding area image from the initial image.

S208: Intercept the area image corresponding to the position description information from the initial image as the initial area image.

The initial area image refers to a partial image intercepted from the initial image based on AI technology. The initial area image may include a target area image.

In the embodiments of the present disclosure, after the position description information corresponding to the subject in the initial image is determined according to the subject information, the area image corresponding to the position description information can be intercepted from the initial image as the initial area image. For example, the initial image can be annotated according to the position description information and then cropped according to the annotation to obtain the initial area image.
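The format of the position description information is left open in the text. The sketch below assumes, purely for illustration, that it is a tuple `(x, y, w, h)` giving the subject's offset and proportion as fractions of the image size; `to_pixel_box` and `crop_region` are hypothetical helper names.

```python
from typing import List, Tuple

Image = List[List[int]]
Position = Tuple[float, float, float, float]  # assumed (x, y, w, h) as fractions
Box = Tuple[int, int, int, int]               # (top, left, bottom, right), exclusive end


def to_pixel_box(position: Position, height: int, width: int) -> Box:
    """Convert a fractional position description into a pixel box clamped to the image."""
    x, y, w, h = position
    left = max(0, min(width, round(x * width)))
    top = max(0, min(height, round(y * height)))
    right = max(left, min(width, round((x + w) * width)))
    bottom = max(top, min(height, round((y + h) * height)))
    return top, left, bottom, right


def crop_region(image: Image, position: Position) -> Image:
    """Intercept the area described by the position from the image."""
    height = len(image)
    width = len(image[0]) if image else 0
    top, left, bottom, right = to_pixel_box(position, height, width)
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    image = [[c + 4 * r for c in range(4)] for r in range(4)]
    # Subject assumed to occupy the central half of the image in each direction.
    print(crop_region(image, (0.25, 0.25, 0.5, 0.5)))
```

Clamping keeps the crop inside the image even when the described region extends past an edge.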

In this embodiment, the NLP service is called to identify the subject information in the initial image, the position description information corresponding to the subject in the initial image is determined based on the subject information, and the area image corresponding to the position description information is intercepted from the initial image as the initial area image. The initial image may contain interference information other than the subject information, and that interference information may affect the efficiency and accuracy of the retrieval process. Because the position description information determined from the subject information effectively represents where the subject lies in the initial image, intercepting the corresponding area image as the initial area image reduces the interference information it contains and improves how accurately the initial area image represents the subject information.

S209: Process the initial area image according to the image description information to obtain the target area image.

For the description of S209, reference may be made to the above-mentioned embodiment, and details will not be described again here.

S210: Determine the semantic representation information of the target area image.

Here, semantic representation information refers to information that can be used to characterize the features of the target area image. For example, the target area image can have image features such as color features, contour features, linear features, center features, diagonal features, texture features, local features, and shape features, which are not limited here. The semantic representation information may characterize one or more of these image features.

In the embodiments of the present disclosure, the semantic representation information of the target area image can be determined by pre-training a feature extractor for target area images and then inputting the target area image into the feature extractor to obtain feature vector representation information of one or more dimensions, which is used as the semantic representation information of the target area image. Any other possible method can also be used to determine the semantic representation information of the target area image, without limitation.

Optionally, in some embodiments, determining the semantic representation information of the target area image may include identifying the target object outline from the target area image, determining the object contour information according to the target object outline, processing the object contour information to obtain a contour vector representation, and using the contour vector representation as the semantic representation information. Because the object contour effectively represents the characteristic information of the object, a contour vector representation obtained in this way represents the object well, and using it as the semantic representation information improves the applicability of the semantic representation information in the image retrieval process.

Here, the target object outline refers to the outline of the retrieval target object contained in the target area image, such as the outline of the human body in a portrait image or the outline of the car in a car image, which is not limited here.

Among them, the object contour information refers to the relevant information obtained based on the contour of the target object.

Among them, the contour vector representation refers to a vector representation that can represent the contour information of an object and is mapped in a vector space. The vector representation can be a feature obtained by mapping features to a vector space, such as a contour feature.

In the embodiments of the present disclosure, obtaining the contour vector representation may include performing operations such as dimensionality reduction, whitening, and pooling on the target area image, extracting the contour features of the subject in the target area image, and mapping them into the vector space to obtain the contour vector representation.
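A rough sketch of how such operations could turn a contour map into a vector is given below. It assumes the contour has already been extracted as a 2D edge map, uses non-overlapping average pooling (which also reduces the dimensionality), and substitutes per-vector standardization (zero mean, unit variance) for whitening. True whitening decorrelates features using statistics of a whole dataset, so this is a deliberate simplification, and all function names are hypothetical.

```python
import math
from typing import List


def average_pool(feature_map: List[List[float]], size: int) -> List[List[float]]:
    """Non-overlapping size x size average pooling; incomplete edge windows are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size if feature_map else 0
    pooled = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ]
            out_row.append(sum(window) / len(window))
        pooled.append(out_row)
    return pooled


def standardize(vector: List[float]) -> List[float]:
    """Shift to zero mean and scale to unit variance (simplified stand-in for whitening)."""
    if not vector:
        return []
    mean = sum(vector) / len(vector)
    std = math.sqrt(sum((v - mean) ** 2 for v in vector) / len(vector))
    if std == 0.0:
        return [0.0] * len(vector)
    return [(v - mean) / std for v in vector]


def contour_vector(edge_map: List[List[float]], pool_size: int = 2) -> List[float]:
    """Pool the edge map, flatten it, and standardize it into a vector."""
    pooled = average_pool(edge_map, pool_size)
    flat = [value for row in pooled for value in row]
    return standardize(flat)


if __name__ == "__main__":
    # Assumed edge map: 1.0 marks contour pixels.
    edge_map = [
        [1.0, 1.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
    ]
    print(contour_vector(edge_map))
```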

In the embodiments of the present disclosure, when the semantic representation information of the target area image is determined, a visual neural network can be used to identify the target object outline from the target area image. The object contour information is then determined according to the target object outline and processed to obtain the contour vector representation, which is used as the semantic representation information.

S211: Retrieve target content based on semantic representation information.

In the embodiments of the present disclosure, after the semantic representation information of the target area image is determined, the target content can be retrieved according to it. The semantic representation information is used as the reference for retrieving, from the image retrieval database, pictures that match it, so as to obtain the target content.

Optionally, in some embodiments, retrieving the target content according to the semantic representation information may be done by determining the candidate similarity level corresponding to the semantic representation information and using the content represented by that candidate similarity level in the graph data structure as the target content. The candidate similarity level belongs to a pre-built graph data structure and is the level to which the similarity between the represented content and the initial image belongs. Because the retrieval database may contain a large amount of data, narrowing retrieval to the candidate similarity level can greatly reduce the computational cost of the retrieval process and improve retrieval efficiency.

Among them, the candidate similarity level refers to the level of similarity between the content represented by the semantic representation information and the initial image in the graph data structure.

Here, the graph data structure refers to a data structure established in advance in the vector retrieval database, with vector distance as the basis for division. During image retrieval, the graph data structure can be used to find the candidate similarity level by vector distance, narrowing the search scope.

In the embodiments of the present disclosure, when the target content is retrieved according to the semantic representation information, the candidate similarity level corresponding to the semantic representation information can be determined. The candidate similarity level belongs to a pre-built graph data structure and is the level to which the similarity between the represented content and the initial image belongs. The content represented by the candidate similarity level in the graph data structure is then used as the target content.
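The disclosure does not describe how the graph data structure is built, so the following is only a simplified stand-in and not a graph index: stored vectors are grouped ahead of time by their nearest anchor vector, and at query time only the group nearest to the query is searched. The anchors, group names, and functions `build_levels` and `retrieve_from_level` are hypothetical. The sketch is meant to show how a precomputed, distance-based division narrows the search scope.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def build_levels(
    library: Dict[str, Vector], anchors: Dict[str, Vector]
) -> Dict[str, List[str]]:
    """Assign each stored item to its nearest anchor ahead of query time."""
    levels: Dict[str, List[str]] = {name: [] for name in anchors}
    for item, vec in library.items():
        nearest = min(anchors, key=lambda a: math.dist(vec, anchors[a]))
        levels[nearest].append(item)
    return levels


def retrieve_from_level(
    query: Vector,
    library: Dict[str, Vector],
    anchors: Dict[str, Vector],
    levels: Dict[str, List[str]],
) -> Tuple[str, List[str]]:
    """Pick the group nearest to the query, then rank only that group's items."""
    level = min(anchors, key=lambda a: math.dist(query, anchors[a]))
    ranked = sorted(levels[level], key=lambda item: math.dist(query, library[item]))
    return level, ranked


if __name__ == "__main__":
    # Hypothetical stored vectors and anchors.
    library = {"a": [0.0, 0.0], "b": [0.1, 0.0], "c": [10.0, 10.0]}
    anchors = {"near": [0.0, 0.0], "far": [10.0, 10.0]}
    levels = build_levels(library, anchors)
    print(retrieve_from_level([0.2, 0.0], library, anchors, levels))
```

Only the items in the selected group are compared with the query, which is where the saving in computation comes from when the database is large.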

In this embodiment, the semantic representation information of the target area image is determined, and the target content is retrieved based on it. Because the semantic representation information effectively characterizes the relevant features of the target area image, retrieving the target content on this basis makes the retrieval process more targeted and purposeful and improves the reliability of the retrieval results.

In this embodiment, the image scale information of the initial image, and/or the pixel feature information of the initial image, and/or the processing parameter information specified for the initial image is determined and used as the image description information. The image description information can therefore characterize the initial image across multiple feature dimensions and is applicable to different image preprocessing scenarios. When the initial area image is processed based on it, multi-dimensional processing can be achieved, which improves the reference value of the resulting target area image in the retrieval process and the flexibility of that process. The NLP service is called to identify the subject information in the initial image, the position description information corresponding to the subject in the initial image is determined based on the subject information, and the area image corresponding to the position description information is intercepted from the initial image as the initial area image. Because the initial image may contain interference information other than the subject information, and that interference information may affect the efficiency and accuracy of retrieval, position description information determined in this way effectively represents where the subject lies in the initial image; intercepting the corresponding area image reduces the interference information in the initial area image and improves how accurately it represents the subject information. The semantic representation information of the target area image is determined, and the target content is retrieved based on it. Because the semantic representation information effectively characterizes the relevant features of the target area image, retrieving on this basis makes the retrieval process more targeted and purposeful and improves the reliability of the retrieval results. Because the object contour effectively represents the characteristic information of the object, determining the object contour information from the target object outline and processing it into a contour vector representation improves the representational quality of that vector, and using it as the semantic representation information improves the applicability of the semantic representation information in the retrieval process. Because the retrieval database may contain a large amount of data, determining the candidate similarity level corresponding to the semantic representation information and using the content represented by that level in the graph data structure as the target content can greatly reduce the computational cost of retrieval and improve retrieval efficiency.

FIG. 3 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure.

As shown in Figure 3, the image retrieval method that combines RPA and AI to implement IA includes:

S301: Obtain the initial image based on Robotic Process Automation (RPA) technology.

S302: Determine the image scale information of the initial image.

S303: Use image scale information as image description information.

S304: Intercept the initial area image from the initial image based on artificial intelligence AI technology.

For descriptions of S301-S304, reference may be made to the above-mentioned embodiments and will not be described again here.

S305: Expand the initial area image according to the image scale information.

In the embodiments of the present disclosure, after the image scale information of the initial image is determined and the initial area image is intercepted from the initial image based on AI technology, the initial area image can be enlarged according to the image scale information, so that the scale of the processed image equals the scale of the initial image or any other scale suitable for the image retrieval process. This is not limited here.

S306: Use the enlarged image as the target area image.

In the embodiment of the present disclosure, after the initial area image is enlarged according to the image scale information, the enlarged image can be used as the target area image.
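The enlargement method is not specified. As a minimal sketch, the following uses nearest-neighbor resampling to scale an initial area image up to a target size taken from the image scale information; treating that information as a `(height, width)` pair and the function name `enlarge` are assumptions made for the example.

```python
from typing import List

Image = List[List[int]]


def enlarge(image: Image, target_height: int, target_width: int) -> Image:
    """Resize with nearest-neighbor sampling: each output pixel copies its source pixel."""
    src_height = len(image)
    src_width = len(image[0])
    return [
        [
            image[r * src_height // target_height][c * src_width // target_width]
            for c in range(target_width)
        ]
        for r in range(target_height)
    ]


if __name__ == "__main__":
    initial_area = [[1, 2], [3, 4]]
    scale = (4, 4)  # assumed image scale information of the initial image
    for row in enlarge(initial_area, *scale):
        print(row)
```

Nearest-neighbor sampling is chosen here only because it is short to write; an interpolating method would give smoother results.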

In this embodiment, the image scale information is used as the image description information, the initial area image is enlarged according to the image scale information, and the enlarged image is used as the target area image. Because the scale of the initial area image intercepted from the initial image based on AI technology may be small, enlarging it according to the image scale information prevents an undersized initial area image from degrading the retrieval effect and improves the reliability of the resulting target area image as a retrieval basis.

S307: Retrieve target content based on the target area image.

For the description of S307, reference may be made to the above-mentioned embodiment, and details will not be described again here.

In this embodiment, the initial area image is enlarged according to the image scale information, and the enlarged image is used as the target area image. Because the scale of the initial area image intercepted from the initial image based on AI technology may be small, this enlargement prevents an undersized initial area image from degrading the retrieval effect and improves the reliability of the resulting target area image as a retrieval basis.

FIG. 4 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure.

As shown in Figure 4, the image retrieval method that combines RPA and AI to implement IA includes:

S401: Obtain the initial image based on Robotic Process Automation (RPA) technology.

S402: Determine the pixel feature information of the initial image.

S403: Use pixel feature information as image description information.

S404: Intercept the initial area image from the initial image based on artificial intelligence AI technology.

For descriptions of S401-S404, reference may be made to the above-mentioned embodiments and will not be described again here.

S405: Obtain the area pixels in the initial image area.

Among them, regional pixels refer to pixels corresponding to one or more image regions in the initial image.

In the embodiment of the present disclosure, after determining the pixel feature information of the initial image and intercepting the initial area image from the initial image based on artificial intelligence AI technology, the area pixels in the initial image area can be obtained.

S406: Parse the regional pixel features of the regional pixels from the pixel feature information.

Among them, regional pixel features refer to the relevant features of regional pixels obtained based on pixel feature information, such as the number and color of regional pixels.

In the embodiment of the present disclosure, after obtaining the regional pixels in the initial image area, matching processing can be performed based on the above-mentioned pixel feature information and the regional pixels to analyze and obtain the regional pixel features of the regional pixels.

S407: Enhance the regional pixel features of each area pixel in the initial image area to obtain the target area image.

In the embodiment of the present disclosure, after the regional pixel features of the regional pixels are parsed from the pixel feature information, the regional pixel features of each regional pixel in the initial image area can be enhanced to improve the recognizability of those features, thereby obtaining the target area image.

In this embodiment, the pixel feature information is used as the image description information. The regional pixels in the initial image area are obtained, the regional pixel features of the regional pixels are parsed from the pixel feature information, and the regional pixel features of each regional pixel in the initial image area are enhanced to obtain the target area image. Because the strength of the regional pixel features may affect the image retrieval effect, enhancing the regional pixel features of each regional pixel in the initial image area can effectively improve the ability of the obtained target area image to represent the subject image, thereby improving the pertinence and accuracy of the image retrieval process.

S408: Retrieve the target content according to the target area image.

For the description of S408, reference may be made to the above-mentioned embodiments and will not be described again here.

In this embodiment, the regional pixels in the initial image area are obtained, the regional pixel features of the regional pixels are parsed from the pixel feature information, and the regional pixel features of each regional pixel in the initial image area are enhanced to obtain the target area image. Because the strength of the regional pixel features may affect the image retrieval effect, enhancing the regional pixel features of each regional pixel in the initial image area can effectively improve the ability of the obtained target area image to represent the subject image, thereby improving the pertinence and accuracy of the image retrieval process.
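As a rough illustration of the enhancement in S405 to S407, the sketch below assumes that the regional pixel feature of interest is pixel intensity and enhances it by linear contrast stretching. The disclosure does not name a specific enhancement technique, so this choice, the function name `enhance_contrast`, and the sample values are all assumptions.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (assumed representation)


def enhance_contrast(region: Image) -> Image:
    """Linearly stretch the pixel values of `region` to the full 0-255 range."""
    flat = [p for row in region for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A uniform region has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in region]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in region]


# Hypothetical low-contrast initial area image.
region = [[100, 110],
          [120, 150]]
enhanced = enhance_contrast(region)
```

Here the narrow intensity range 100 to 150 is spread over 0 to 255, which is one simple way to make regional differences more recognizable before retrieval.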

FIG. 5 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure.

As shown in Figure 5, the image retrieval method that combines RPA and AI to implement IA includes:

S501: Obtain the initial image based on Robotic Process Automation (RPA) technology.

S502: Determine the image scale information of the initial image.

S503: Determine the pixel feature information of the initial image.

S504: Use image scale information and pixel feature information as image description information.

S505: Intercept the initial area image from the initial image based on artificial intelligence AI technology.

For descriptions of S501-S505, reference may be made to the above-mentioned embodiments and will not be described again here.

S506: Expand the initial area image according to the image scale information to obtain an image of the area to be filled, where the image of the area to be filled includes: pixels to be filled.

Among them, the image of the area to be filled refers to the image obtained by enlarging the initial area image based on the image scale information.

Among them, the pixels to be filled refer to the pixels in the image of the area to be filled that need to be filled.

In the embodiment of the present disclosure, after the image scale information and the pixel feature information are used as the image description information and the initial area image is intercepted from the initial image based on artificial intelligence (AI) technology, the initial area image can be enlarged according to the image scale information to obtain the area image to be filled. Specifically, the scale of the initial area image can be adjusted to the scale of the initial image or to any other scale value suitable for the image retrieval process, and the enlarged initial area image is used as the area image to be filled.

S507: Parse the first pixel feature of the pixel to be filled from the pixel feature information.

The first pixel feature refers to the relevant feature of the pixel to be filled that is obtained based on the pixel feature information.

In the embodiment of the present disclosure, after the initial area image is expanded according to the image scale information to obtain the area image to be filled, the first pixel feature of the pixel to be filled can be parsed from the pixel feature information. Specifically, the pixel to be filled can be matched against the pixel feature information, and the feature information that matches the pixel to be filled is used as the first pixel feature.

S508: Parse the second pixel features of the regional pixels in other regional images from the pixel feature information, where the initial regional image and other regional images together constitute the initial image.

The second pixel feature refers to the relevant features of regional pixels in other regional images obtained based on pixel feature information.

In the embodiment of the present disclosure, after the first pixel feature of the pixel to be filled is parsed from the pixel feature information, the second pixel feature of the regional pixels in the other regional images can be parsed from the pixel feature information. Specifically, the regional pixels in the other regional images can be matched against the pixel feature information, and the feature information that matches those regional pixels is used as the second pixel feature.

S509: Generate filling pixel features based on the first pixel feature and the second pixel feature.

The filling pixel feature refers to the pixel feature obtained based on the first pixel feature and the second pixel feature. The filling pixel feature can be used as a reference for filling the image of the area to be filled.

In the embodiment of the present disclosure, after the first pixel feature of the pixel to be filled and the second pixel feature of the regional pixels in the other regional images are parsed from the pixel feature information, the filling pixel features can be generated based on the first pixel feature and the second pixel feature. For example, a pre-trained machine learning model can be used to analyze the first pixel feature and the second pixel feature to generate the filling pixel features.

S510: Perform filling processing on the area image to be filled according to the filling pixel features to obtain the target area image.

In the embodiment of the present disclosure, after the filling pixel features are generated according to the first pixel feature and the second pixel feature, the area image to be filled can be filled according to the filling pixel features to obtain the target area image. Specifically, the pixels to be filled can be determined based on the filling pixel features, and the area image to be filled is then filled based on those pixels to obtain the target area image.

In this embodiment, the image scale information and the pixel feature information are used as the image description information. The initial area image is expanded according to the image scale information to obtain the area image to be filled, the first pixel feature of the pixel to be filled is parsed from the pixel feature information, the second pixel feature of the regional pixels in the other regional images is parsed from the pixel feature information, the filling pixel features are generated based on the first pixel feature and the second pixel feature, and the area image to be filled is filled according to the filling pixel features to obtain the target area image. Therefore, while the size of the obtained area image to be filled is kept consistent with a normal image size, the image is filled by combining the first pixel feature and the second pixel feature, which prevents the enlargement from impairing the representation of the image and can greatly improve the representation effect of the obtained target area image.

S511: Retrieve target content based on the target area image.

For the description of S511, reference may be made to the above-mentioned embodiment, and details will not be described again here.

In this embodiment, the initial area image is expanded according to the image scale information to obtain the area image to be filled, the first pixel feature of the pixel to be filled is parsed from the pixel feature information, the second pixel feature of the regional pixels in the other regional images is parsed from the pixel feature information, the filling pixel features are generated based on the first pixel feature and the second pixel feature, and the area image to be filled is filled according to the filling pixel features to obtain the target area image. Therefore, while the size of the obtained area image to be filled is kept consistent with a normal image size, the image is filled by combining the first pixel feature and the second pixel feature, which prevents the enlargement from impairing the representation of the image and greatly improves the representation effect of the obtained target area image.
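The expand-and-fill flow of S506 to S510 might be sketched as follows. Several simplifications are assumptions rather than the disclosed method: the expansion is modeled as placing the crop on a canvas of the initial image's size, the pixels to be filled are marked with `None`, and the filling pixel feature is a plain average of the pixels in the other regions instead of the output of a pre-trained machine learning model. All function names are hypothetical.

```python
from typing import List, Optional

Image = List[List[int]]
Canvas = List[List[Optional[int]]]  # None marks a pixel to be filled


def expand_canvas(region: Image, height: int, width: int, top: int, left: int) -> Canvas:
    """Place `region` on a height x width canvas at (top, left)."""
    canvas: Canvas = [[None] * width for _ in range(height)]
    for dy, row in enumerate(region):
        for dx, value in enumerate(row):
            canvas[top + dy][left + dx] = value
    return canvas


def fill_value_from_other_regions(initial: Image, top: int, left: int,
                                  reg_h: int, reg_w: int) -> int:
    """Average intensity of the initial-image pixels outside the cropped region."""
    others = [
        initial[y][x]
        for y in range(len(initial))
        for x in range(len(initial[0]))
        if not (top <= y < top + reg_h and left <= x < left + reg_w)
    ]
    if not others:
        raise ValueError("the crop covers the whole initial image; nothing to fill from")
    return sum(others) // len(others)


def fill_canvas(canvas: Canvas, fill_value: int) -> Image:
    """Replace every pixel to be filled with `fill_value`."""
    return [[fill_value if p is None else p for p in row] for row in canvas]


# Hypothetical 3x3 initial image whose center pixel is the initial area image.
initial = [[0, 20, 40],
           [60, 200, 80],
           [100, 120, 140]]
crop = [[200]]
to_fill = expand_canvas(crop, height=3, width=3, top=1, left=1)
fill_value = fill_value_from_other_regions(initial, top=1, left=1, reg_h=1, reg_w=1)
target = fill_canvas(to_fill, fill_value)
```

The averaging step is only a stand-in: it shows where information from the other regional images enters the fill, which is the role the second pixel feature plays in the description above.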

FIG. 6 is a schematic flowchart of an image retrieval method for implementing IA by combining RPA and AI proposed by another embodiment of the present disclosure.

As shown in Figure 6, the image retrieval method that combines RPA and AI to implement IA includes:

S601: Obtain the initial image based on Robotic Process Automation (RPA) technology.

S602: Determine the processing parameter information specified for the initial image.

S603: Use the processing parameter information as image description information.

S604: Intercept the initial area image from the initial image based on artificial intelligence AI technology.

For descriptions of S601-S604, reference may be made to the above-mentioned embodiments and will not be described again here.

S605: Process the initial area image according to the processing parameter information to obtain the target area image.

Among them, the processing parameter information refers to the expansion factor, fill color, brightness, hue, saturation, sharpening degree, etc. specified in advance for the initial image, and there is no limit to this.

In the embodiment of the present disclosure, after the processing parameter information is used as the image description information and the initial area image is intercepted from the initial image based on artificial intelligence (AI) technology, the initial area image can be processed according to the processing parameter information to obtain the target area image. Specifically, the processing parameter information for the initial image (such as the specified expansion factor, fill color, brightness, hue, saturation, and sharpening degree) can be determined in advance, and the corresponding parameters of the initial area image are then adjusted based on the processing parameter information to obtain the target area image.

In this embodiment, the processing parameter information is used as the image description information, and the target area image is obtained by processing the initial area image according to the processing parameter information. Because the processing parameter information can be configured according to user configuration instructions, processing the initial area image based on the processing parameter information allows the initial area image to be processed flexibly according to the application scenario to obtain a target area image suitable for the image retrieval process, thereby effectively improving the flexibility of the initial area image processing.

S606: Retrieve target content based on the target area image.

For the description of S606, reference may be made to the above-mentioned embodiments, and details will not be described again here.

In this embodiment, the target area image is obtained by processing the initial area image according to the processing parameter information. Because the processing parameter information can be configured according to user configuration instructions, processing the initial area image based on the processing parameter information allows the initial area image to be processed flexibly according to the application scenario to obtain a target area image suitable for the image retrieval process, thereby effectively improving the flexibility of the initial area image processing.
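A minimal sketch of S605 is given below, assuming that only two of the listed parameters (expansion factor and brightness) are configured. The `ProcessingParams` container, its field names, and the interpretation of brightness as an additive offset are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (assumed representation)


@dataclass
class ProcessingParams:
    """Hypothetical container for user-configured processing parameter information."""
    expansion_factor: int = 1  # integer enlargement factor
    brightness: int = 0        # offset added to every pixel value


def apply_params(region: Image, params: ProcessingParams) -> Image:
    """Adjust brightness (clamped to 0-255), then enlarge by pixel replication."""
    k = params.expansion_factor
    out: Image = []
    for row in region:
        new_row: List[int] = []
        for p in row:
            value = max(0, min(255, p + params.brightness))
            new_row.extend([value] * k)       # replicate horizontally
        out.extend([new_row[:] for _ in range(k)])  # replicate vertically
    return out


params = ProcessingParams(expansion_factor=2, brightness=10)
processed = apply_params([[100, 250]], params)
```

Keeping the parameters in one configurable object reflects the point made above: the same processing routine can be reused across application scenarios by changing only the configuration.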

Figure 7 is a schematic structural diagram of an image retrieval device that combines RPA and AI to implement IA proposed by an embodiment of the present disclosure.

As shown in Figure 7, the image retrieval device 70 that combines RPA and AI to implement IA is applied in the field of natural language processing NLP, including:

The acquisition module 701 is used to acquire an initial image based on robotic process automation RPA technology, where the initial image has image description information;

The first processing module 702 is used to intercept the initial area image from the initial image based on artificial intelligence AI technology;

The second processing module 703 is used to process the initial area image according to the image description information to obtain the target area image;

The retrieval module 704 is used to retrieve target content according to the target area image.
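The cooperation of modules 701 to 704 can be sketched as a simple pipeline of pluggable stages. This is only a structural illustration: the class name, the callable signatures, and the stub stages in the usage example are assumptions, and real stages would call the RPA, AI, and retrieval services described in the method embodiments.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class ImageRetrievalDevice:
    acquire: Callable[[], Tuple[Any, Dict[str, Any]]]  # acquisition module 701
    crop: Callable[[Any], Any]                          # first processing module 702
    process: Callable[[Any, Dict[str, Any]], Any]       # second processing module 703
    retrieve: Callable[[Any], Any]                      # retrieval module 704

    def run(self) -> Any:
        """Run the four stages in order and return the retrieved target content."""
        initial_image, description = self.acquire()
        initial_region = self.crop(initial_image)
        target_region = self.process(initial_region, description)
        return self.retrieve(target_region)


# Stub stages operating on a flat list of pixel values (illustrative only).
device = ImageRetrievalDevice(
    acquire=lambda: ([1, 2, 3, 4], {"gain": 2}),
    crop=lambda image: image[1:3],
    process=lambda region, desc: [p * desc["gain"] for p in region],
    retrieve=lambda region: f"content-for-{sum(region)}",
)
result = device.run()
```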

In some embodiments of the present disclosure, the first processing module 702 is specifically used to:

Call the natural language processing NLP service to identify the subject information in the initial image;

According to the subject information, determine the position description information of the subject in the initial image;

A region image corresponding to the position description information is intercepted from the initial image as the initial region image.
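The final interception step can be sketched as a bounding-box crop. The sketch assumes that the position description information takes the form of a rectangular box; the `Box` type and the sample coordinates are hypothetical, and the subject identification itself is outside the scope of the example.

```python
from typing import List, NamedTuple

Image = List[List[int]]  # grayscale image as rows of pixel values (assumed representation)


class Box(NamedTuple):
    """Hypothetical position description information: a rectangular bounding box."""
    top: int
    left: int
    height: int
    width: int


def crop_region(image: Image, box: Box) -> Image:
    """Return the sub-image of `image` covered by `box`."""
    return [
        row[box.left:box.left + box.width]
        for row in image[box.top:box.top + box.height]
    ]


initial_image = [[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]]
# Assume the subject was located in the lower middle of the image.
initial_region = crop_region(initial_image, Box(top=1, left=1, height=2, width=2))
```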

In some embodiments of the present disclosure, as shown in Figure 8, Figure 8 is a schematic structural diagram of an image retrieval device that combines RPA and AI to implement IA proposed by another embodiment of the present disclosure. The image retrieval device also includes:

The determining module 705 is used to determine the image scale information of the initial image; and/or determine the pixel feature information of the initial image; and/or determine the processing parameter information specified for the initial image; and to use the image scale information, and/or the pixel feature information, and/or the processing parameter information as the image description information.

In some embodiments of the present disclosure, the image description information includes: image scale information;

Among them, the second processing module 703 is specifically used for:

Expand the initial area image according to the image scale information;

The enlarged image is used as the target area image.

In some embodiments of the present disclosure, the image description information includes: pixel feature information;

Among them, the second processing module 703 is also used for:

Get the area pixels in the initial image area;

Parse regional pixel features of regional pixels from pixel feature information;

The regional pixel features of each area pixel in the initial image area are enhanced to obtain the target area image.

In some embodiments of the present disclosure, the image description information includes: image scale information and pixel feature information;

Among them, the second processing module 703 is also used for:

The initial area image is expanded according to the image scale information to obtain the area image to be filled, where the area image to be filled includes: pixels to be filled;

Parse the first pixel feature of the pixel to be filled from the pixel feature information;

Parse second pixel features of regional pixels in other regional images from the pixel feature information, where the initial regional image and other regional images together constitute the initial image;

Generate filling pixel features according to the first pixel feature and the second pixel feature;

The image of the area to be filled is filled according to the characteristics of the filled pixels to obtain the target area image.

In some embodiments of the present disclosure, the image description information includes: processing parameter information;

Among them, the second processing module 703 is also used for:

The initial area image is processed according to the processing parameter information to obtain the target area image.

In some embodiments of the present disclosure, the retrieval module 704 includes:

Determination sub-module 7041, used to determine the semantic representation information of the target area image;

The retrieval sub-module 7042 is used to retrieve target content based on semantic representation information.

In some embodiments of the present disclosure, the determination sub-module 7041 is specifically used for:

Identify the target object outline from the target area image;

According to the contour of the target object, determine the object contour information;

Process the object contour information and obtain the contour vector representation;

Use the contour vector representation as the semantic representation information.
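One possible way to turn object contour information into a contour vector representation is a centroid-distance signature, sketched below. The disclosure does not state how the contour vector is computed, so this descriptor, the representation of the contour as a list of points, and the function name `contour_vector` are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def contour_vector(contour: List[Point]) -> List[float]:
    """Distances from the mean of the contour points, normalized by the largest one."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    distances = [math.hypot(x - cx, y - cy) for x, y in contour]
    longest = max(distances)
    if longest == 0:
        # Degenerate contour: all points coincide.
        return [0.0] * len(contour)
    return [d / longest for d in distances]


# Corner points of a square, standing in for an identified target object outline.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
vector = contour_vector(square)
```

Normalizing by the largest distance makes the vector independent of the object's size, which is convenient when the target area image has been enlarged before retrieval.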

In some embodiments of the present disclosure, the search sub-module 7042 is specifically used for:

Determine the candidate similarity level corresponding to the semantic representation information, where the candidate similarity level belongs to a pre-constructed graph data structure, and the candidate similarity level is the level to which the similarity between the corresponding represented content and the initial image belongs;

The content represented by the candidate similarity level in the graph data structure is used as the target content.
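The level-based lookup can be sketched as follows. The sketch makes several assumptions: cosine similarity is used as the similarity measure, the level thresholds are arbitrary, and a plain dictionary of content vectors stands in for the pre-constructed graph data structure, whose actual layout is not given in this section. All names and sample vectors are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_level(score: float) -> str:
    """Map a similarity score to a level; the thresholds are illustrative."""
    if score >= 0.9:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"


def retrieve(query: List[float], index: Dict[str, List[float]],
             wanted_level: str = "high") -> List[str]:
    """Return the contents whose similarity to `query` falls in `wanted_level`."""
    return [
        name for name, vec in index.items()
        if similarity_level(cosine(query, vec)) == wanted_level
    ]


# Hypothetical index of content names and their semantic representation vectors.
index = {
    "cat.png": [1.0, 0.0],
    "dog.png": [0.0, 1.0],
    "kitten.png": [0.9, 0.1],
}
matches = retrieve([1.0, 0.0], index)
```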

Corresponding to the image retrieval method that combines RPA and AI to implement IA provided by the above embodiments of FIG. 1 to FIG. 6, the present disclosure also provides an image retrieval device that combines RPA and AI to implement IA. Because the image retrieval device provided by the embodiments of the present disclosure corresponds to the image retrieval method provided in the above embodiments of FIG. 1 to FIG. 6, the implementations of the image retrieval method are also applicable to the image retrieval device and will not be described in detail here.

For the functions of each module in each device of the embodiment of the present disclosure, please refer to the corresponding description in the above method, and will not be described again here.

In order to implement the above embodiments, the present disclosure also proposes an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, the image retrieval method that combines RPA and AI to implement IA proposed in the aforementioned embodiments of the present disclosure is implemented.

FIG. 9 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure. As shown in FIG. 9, the electronic device 90 includes: a memory 910 and a processor 920. The memory 910 stores a computer program that can run on the processor 920. When the processor 920 executes the computer program, it implements the image retrieval method for implementing IA by combining RPA and AI in the above embodiments. There may be one or more memories 910 and one or more processors 920.

The electronic device 90 also includes:

The communication interface 930 is used to communicate with external devices and perform data interactive transmission.

If the memory 910, the processor 920 and the communication interface 930 are implemented independently, the memory 910, the processor 920 and the communication interface 930 can be connected to each other through a bus and complete communication with each other. The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in Figure 9, but it does not mean that there is only one bus or one type of bus.

Optionally, in specific implementation, if the memory 910, the processor 920 and the communication interface 930 are integrated on one chip, the memory 910, the processor 920 and the communication interface 930 can communicate with each other through the internal interface.

Embodiments of the present disclosure provide a computer-readable storage medium, which stores a computer program. When the program is executed by a processor, the method provided in the embodiment of the present disclosure is implemented.

An embodiment of the present disclosure also provides a chip, which includes a processor for calling and running instructions stored in the memory, so that the communication device installed with the chip executes the method provided by the embodiment of the present disclosure.

Embodiments of the present disclosure also provide a chip, including: an input interface, an output interface, a processor, and a memory. The input interface, the output interface, the processor, and the memory are connected through an internal connection path. The processor is used to execute the code in the memory, and when the code is executed, the processor executes the method provided by the embodiments of the present disclosure.

It should be understood that the above-mentioned processor can be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor can be a microprocessor or any conventional processor, etc. It is worth noting that the processor may be a processor that supports the Advanced RISC Machines (ARM) architecture.

Further, optionally, the above-mentioned memory may include read-only memory and random access memory, and may also include non-volatile random access memory. The memory may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. Among them, non-volatile memory can include read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory. Volatile memory may include random access memory (Random Access Memory, RAM), which acts as an external cache. By way of illustration, but not limitation, many forms of RAM are available, for example, static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic Random Access Memory, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM) and direct Rambus random access memory (Direct Rambus RAM, DR RAM).

In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using software, it may be implemented in whole or in part in the form of a computer program product. A computer program product includes one or more computer instructions. When computer program instructions are loaded and executed on a computer, processes or functions in accordance with the present disclosure are produced, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device. Computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium.

In the description of this specification, reference to the terms "one embodiment," "some embodiments," "an example," "specific examples," or "some examples" or the like means that specific features, structures, materials, or characteristics described in connection with the embodiment or example are included in at least one embodiment or example of the present disclosure. Furthermore, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine different embodiments or examples, and features of different embodiments or examples, described in this specification unless they are inconsistent with each other.

In addition, the terms “first” and “second” are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the quantity of indicated technical features. Thus, features defined as "first" and "second" may explicitly or implicitly include at least one of these features. In the description of the present disclosure, "plurality" means two or more than two, unless otherwise expressly and specifically limited.

Any process or method descriptions in flowcharts or otherwise described herein may be understood to represent modules, segments, or portions of code that include one or more executable instructions for implementing the specified logical functions or steps of the process. The scope of the preferred embodiments of the present disclosure also includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially concurrent manner or in the reverse order, depending on the functionality involved.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered a sequenced list of executable instructions for implementing the logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, device or equipment (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, device or equipment and execute them).

It should be understood that various parts of the present disclosure may be implemented in hardware, software, firmware, or combinations thereof. In the above embodiments, various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. All or part of the steps of the method in the above embodiments can be completed by instructing relevant hardware through a program. The program can be stored in a computer-readable storage medium, and when executed, the program performs one of the steps of the method embodiment or a combination thereof.

In addition, each functional unit in various embodiments of the present disclosure may be integrated into one processing module, each unit may exist physically alone, or two or more units may be integrated into one module. The above integrated modules can be implemented in the form of hardware or software function modules. If the above integrated modules are implemented in the form of software function modules and sold or used as independent products, they can also be stored in a computer-readable storage medium. The storage medium can be a read-only memory, a magnetic disk or an optical disk, etc.

The above are only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any person familiar with the technical field can easily think of various changes or alternatives within the technical scope of the present disclosure, and these should all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be subject to the protection scope of the claims.

Claims

An image retrieval method that combines RPA and AI to implement IA, including:

Obtain an initial image based on Robotic Process Automation (RPA) technology, where the initial image has image description information;

Intercept the initial area image from the initial image based on artificial intelligence AI technology;

Process the initial area image according to the image description information to obtain a target area image;

Target content is retrieved based on the target area image.
The method of claim 1, wherein the intercepting an initial area image from the initial image based on artificial intelligence AI technology includes:

Call the natural language processing NLP service to identify the subject information in the initial image;

According to the subject information, determine the position description information of the subject in the initial image;

A region image corresponding to the position description information is intercepted from the initial image as the initial region image.
The method according to claim 1 or 2, wherein after the initial image is obtained based on the robotic process automation RPA technology, the method further includes:

Determine the image scale information of the initial image; and/or

Determine the pixel feature information of the initial image; and/or

Determine processing parameter information specified for the initial image;

The image scale information, and/or the pixel feature information, and/or the processing parameter information are used as the image description information.
The method of claim 3, wherein the image description information includes: the image scale information;

Wherein, processing the initial area image according to the image description information to obtain a target area image includes:

Expand the initial area image according to the image scale information;

The expanded image is used as the target area image.
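
One possible reading of this expansion is sketched here. It assumes that the image scale information is a width-to-height ratio and that expanding means growing the bounding box of the initial area image until it matches that ratio; the claim does not fix either point, and `expand_box_to_ratio` is a hypothetical helper.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # assumption: (top, left, bottom, right), end-exclusive


def expand_box_to_ratio(box: Box, ratio_w: int, ratio_h: int,
                        img_w: int, img_h: int) -> Box:
    """Grow box around its centre until width:height is ratio_w:ratio_h.

    The result is capped at the image size, so near the borders the ratio
    may only be approximated.
    """
    top, left, bottom, right = box
    h, w = bottom - top, right - left
    if w * ratio_h < h * ratio_w:
        # Box is too narrow for the ratio: widen it (ceiling division).
        new_w, new_h = -(-h * ratio_w // ratio_h), h
    else:
        # Box is too flat (or already matches): heighten it.
        new_w, new_h = w, -(-w * ratio_h // ratio_w)
    new_w, new_h = min(new_w, img_w), min(new_h, img_h)
    # Keep the original centre where possible, then shift back inside the image.
    new_left = (left + right - new_w) // 2
    new_top = (top + bottom - new_h) // 2
    new_left = max(0, min(new_left, img_w - new_w))
    new_top = max(0, min(new_top, img_h - new_h))
    return (new_top, new_left, new_top + new_h, new_left + new_w)
```

The expanded box can then be used to cut the target area image out of the initial image.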
The method of claim 3, wherein the image description information includes: the pixel feature information;

Wherein, processing the initial area image according to the image description information to obtain a target area image includes:

Obtain regional pixels in the initial area image;

Parse the regional pixel features of the regional pixels from the pixel feature information;

Enhance the regional pixel features of each regional pixel in the initial area image to obtain the target area image.
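
As an illustration of the enhancement step, the sketch below treats grayscale intensity as the pixel feature and a linear contrast stretch as the enhancement. Both choices are assumptions made for the example; the claim does not name a specific feature or enhancement, and `stretch_contrast` is a hypothetical helper.

```python
from typing import List

Image = List[List[int]]  # assumption: grayscale image, values in 0..255


def stretch_contrast(region: Image) -> Image:
    """Linearly rescale intensities so the region spans the full 0..255 range."""
    pixels = [p for row in region for p in row]
    if not pixels:
        return []
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat region has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in region]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in region]
```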
The method of claim 3, wherein the image description information includes: the image scale information and the pixel feature information;

Wherein, processing the initial area image according to the image description information to obtain a target area image includes:

The initial area image is expanded according to the image scale information to obtain an area image to be filled, wherein the area image to be filled includes: pixels to be filled;

Parse the first pixel feature of the pixel to be filled from the pixel feature information;

Parse second pixel features of regional pixels in other area images from the pixel feature information, wherein the initial area image and the other area images together constitute the initial image;

Generate filling pixel features according to the first pixel feature and the second pixel feature;

Perform filling processing on the area image to be filled according to the filling pixel features to obtain the target area image.
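
The expand-and-fill steps can be sketched under several explicit assumptions: the expansion is a fixed pixel margin, the first pixel feature is the original intensity at the position to be filled, the second pixel feature is the mean intensity of all pixels outside the initial area image, and the filling pixel feature is a weighted blend of the two. None of these specifics comes from the claim; `expand_and_fill`, `margin`, and `blend` are hypothetical names.

```python
from typing import List, Tuple

Image = List[List[int]]          # assumption: grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # assumption: (top, left, bottom, right), end-exclusive


def expand_and_fill(image: Image, box: Box, margin: int, blend: float = 0.5) -> Image:
    """Expand box by margin and fill the added pixels with blended intensities."""
    top, left, bottom, right = box
    h, w = len(image), len(image[0])

    def inside(r: int, c: int) -> bool:
        return top <= r < bottom and left <= c < right

    # Second pixel feature: mean intensity of the other area images.
    outside = [image[r][c] for r in range(h) for c in range(w) if not inside(r, c)]
    second = sum(outside) / len(outside) if outside else 0.0

    new_top, new_left = max(0, top - margin), max(0, left - margin)
    new_bottom, new_right = min(h, bottom + margin), min(w, right + margin)

    result = []
    for r in range(new_top, new_bottom):
        row = []
        for c in range(new_left, new_right):
            if inside(r, c):
                row.append(image[r][c])   # keep the initial area image as-is
            else:
                first = image[r][c]       # first pixel feature at this position
                row.append(round(blend * first + (1 - blend) * second))
        result.append(row)
    return result
```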
The method of claim 3, wherein the image description information includes: the processing parameter information;

Wherein, processing the initial area image according to the image description information to obtain a target area image includes:

The initial area image is processed according to the processing parameter information to obtain the target area image.
The method according to any one of claims 1 to 7, wherein retrieving target content according to the target area image includes:

Determine the semantic representation information of the target area image;

The target content is retrieved based on the semantic representation information.
The method of claim 8, wherein determining the semantic representation information of the target area image includes:

Identify the target object outline from the target area image;

Determine object contour information according to the target object contour;

Process the object contour information to obtain a contour vector representation;

The contour vector representation is used as the semantic representation information.
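
A minimal sketch of turning a contour into a vector follows. It assumes the target object has already been segmented into a binary mask, and it uses a radial signature (mean distance from the contour centroid per angular bin) as the contour vector. This descriptor is a stand-in chosen for illustration, not the representation specified by the disclosure; `contour_points` and `contour_vector` are hypothetical helpers.

```python
import math
from typing import List, Tuple

Mask = List[List[int]]  # assumption: 1 marks object pixels, 0 marks background


def contour_points(mask: Mask) -> List[Tuple[int, int]]:
    """Object pixels with at least one 4-neighbour that is background or off-image."""
    h = len(mask)
    w = len(mask[0]) if mask else 0
    points = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if any(not (0 <= nr < h and 0 <= nc < w) or not mask[nr][nc]
                   for nr, nc in neighbours):
                points.append((r, c))
    return points


def contour_vector(mask: Mask, n_bins: int = 8) -> List[float]:
    """Radial signature: mean centroid distance per angular bin, scaled to at most 1."""
    points = contour_points(mask)
    if not points:
        return [0.0] * n_bins
    cr = sum(r for r, _ in points) / len(points)
    cc = sum(c for _, c in points) / len(points)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for r, c in points:
        angle = math.atan2(r - cr, c - cc) % (2 * math.pi)
        b = min(n_bins - 1, int(angle / (2 * math.pi) * n_bins))
        sums[b] += math.hypot(r - cr, c - cc)
        counts[b] += 1
    means = [s / n if n else 0.0 for s, n in zip(sums, counts)]
    peak = max(means)
    return [m / peak for m in means] if peak > 0 else means
```

Dividing by the largest bin makes the vector insensitive to object size, which suits comparison across images of different resolutions.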
The method of claim 8 or 9, wherein retrieving target content according to the semantic representation information includes:

Determine a candidate similarity level corresponding to the semantic representation information, wherein the candidate similarity level belongs to a pre-constructed graph data structure, and the candidate similarity level is the level of similarity between its corresponding represented content and the initial image;

The content represented by the candidate similarity level in the graph data structure is used as the target content.
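
One way to picture retrieval over a pre-constructed graph is a greedy walk toward the most similar node. The sketch below assumes that each graph node stores a content vector, that similarity is cosine similarity, and that the graph is an adjacency mapping; the traversal strategy and the names `cosine` and `greedy_search` are assumptions for illustration, not the structure defined by the disclosure.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_search(graph: Dict[str, List[str]],
                  vectors: Dict[str, Sequence[float]],
                  query: Sequence[float],
                  start: str) -> Tuple[str, float]:
    """Move to the most similar neighbour until no neighbour improves the score."""
    current = start
    best = cosine(vectors[current], query)
    while True:
        neighbours = graph.get(current, [])
        if not neighbours:
            return current, best
        candidate = max(neighbours, key=lambda n: cosine(vectors[n], query))
        score = cosine(vectors[candidate], query)
        if score <= best:
            return current, best
        current, best = candidate, score
```

A greedy walk can stop at a local optimum, so the quality of the result depends on how the graph was built and on the chosen start node.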
An image retrieval device that combines RPA and AI to implement IA, including:

An acquisition module, configured to acquire an initial image based on Robotic Process Automation (RPA) technology, where the initial image has image description information;

A first processing module, configured to intercept an initial area image from the initial image based on artificial intelligence (AI) technology;

A second processing module, configured to process the initial area image according to the image description information to obtain a target area image;

A retrieval module, configured to retrieve target content based on the target area image.
The device according to claim 11, wherein the first processing module is specifically used for:

Call a natural language processing (NLP) service to identify subject information in the initial image;

Determine, according to the subject information, position description information of the subject in the initial image;

An area image corresponding to the position description information is intercepted from the initial image as the initial area image.
The device of claim 11 or 12, further comprising:

A determining module, configured to determine the image scale information of the initial image; and/or determine the pixel feature information of the initial image; and/or determine the processing parameter information specified for the initial image; and use the image scale information, and/or the pixel feature information, and/or the processing parameter information as the image description information.
The device of claim 13, wherein the image description information includes: the image scale information;

Wherein, the second processing module is specifically used for:

Expand the initial area image according to the image scale information;

The expanded image is used as the target area image.
The device of claim 13, wherein the image description information includes: the pixel feature information;

Wherein, the second processing module is also used for:

Obtain regional pixels in the initial area image;

Parse the regional pixel features of the regional pixels from the pixel feature information;

Enhance the regional pixel features of each regional pixel in the initial area image to obtain the target area image.
The device of claim 13, wherein the image description information includes: the image scale information and the pixel feature information;

Wherein, the second processing module is also used for:

The initial area image is expanded according to the image scale information to obtain an area image to be filled, wherein the area image to be filled includes: pixels to be filled;

Parse the first pixel feature of the pixel to be filled from the pixel feature information;

Parse second pixel features of regional pixels in other area images from the pixel feature information, wherein the initial area image and the other area images together constitute the initial image;

Generate filling pixel features according to the first pixel feature and the second pixel feature;

Perform filling processing on the area image to be filled according to the filling pixel features to obtain the target area image.
The device of claim 13, wherein the image description information includes: the processing parameter information;

Wherein, the second processing module is also used for:

The initial area image is processed according to the processing parameter information to obtain the target area image.
The device according to any one of claims 11 to 17, wherein the retrieval module includes:

A determination sub-module, configured to determine the semantic representation information of the target area image;

A retrieval sub-module, configured to retrieve the target content according to the semantic representation information.
An electronic device including:

at least one processor and memory;

The memory stores computer-executable instructions;

The at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor executes the image retrieval method that combines RPA and AI to implement IA according to any one of claims 1-10.
A computer-readable storage medium, wherein computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the image retrieval method that combines RPA and AI to implement IA according to any one of claims 1-10 is implemented.