CN109992172B - Information processing method and equipment - Google Patents


Info

Publication number
CN109992172B
Authority
CN
China
Prior art keywords
image
text field
information
identification
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910252347.8A
Other languages
Chinese (zh)
Other versions
CN109992172A (en)
Inventor
李凡智
刘旭国
周大凯
杨良印
邵昕
Current Assignee
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201910252347.8A
Publication of CN109992172A
Application granted
Publication of CN109992172B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481: Interaction techniques based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F 3/04817: Interaction techniques using icons
    • G06F 3/0484: Interaction techniques for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F 3/04842: Selection of displayed objects or displayed text elements
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; locating or processing of specific regions to guide the detection or recognition
    • G06V 10/24: Aligning, centring, orientation detection or correction of the image
    • G06V 10/245: Aligning, centring, orientation detection or correction of the image by locating a pattern; special marks for positioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An embodiment of the present application discloses an information processing method and device. The method includes: recognizing an acquired first image, where the first image is a display image of a first device; if a first recognition result indicates that the first image contains first recognition information, determining a first operation for the first image based at least on the first recognition information, where the first recognition information includes at least one recognized text field and position information corresponding to the at least one recognized text field; and controlling an operating mechanism to perform the first operation on the first device.

Description

Information processing method and equipment
Technical Field
The present application relates to the field of information processing, and relates to, but is not limited to, an information processing method and apparatus.
Background
With the rapid development of science and technology, automation has been widely applied in industry, agriculture, the military, scientific research, medical treatment, services, the home, and other fields. Automation allows detection, information processing, analysis and judgment, and operation control to be carried out as required with little or no direct human participation, so that an expected goal can be achieved. This not only frees people from heavy physical labor but also greatly improves labor productivity.
For automatic control or automatic operation, the current state of the device must first be recognized, and the next control operation is then determined based on that state. At present, the current state of the device is recognized mainly through image comparison, but this technique places high demands on the captured photographs and cannot adapt to multiple resolutions, which results in low efficiency.
Disclosure of Invention
In view of this, the technical solution of the embodiments of the present application is implemented as follows:
in one aspect, an embodiment of the present application provides an information processing method, including:
recognizing an acquired first image, where the first image is a display image of a first device;
determining, if a first recognition result indicates that the first image contains first recognition information, a first operation for the first image based at least on the first recognition information, where the first recognition information includes at least one recognized text field and position information corresponding to the at least one recognized text field; and
controlling an operating mechanism to perform the first operation on the first device.
In another aspect, an embodiment of the present application provides an information processing apparatus, including at least:
a first interface, configured to obtain a first image, where the first image is captured by an image acquisition device from a display image of the first device; and
a processing means, configured to recognize the first image; determine, if a first recognition result indicates that the first image contains first recognition information, a first operation for the first image based at least on the first recognition information; and control an operating mechanism to perform the first operation on the first device, where the first recognition information includes at least one recognized text field and position information corresponding to the at least one recognized text field.
In the information processing method and device provided by the embodiments of the present application, an acquired first image is first recognized, where the first image is a display image of a first device. If a first recognition result indicates that the first image contains first recognition information, a first operation for the first image is determined based at least on the first recognition information, where the first recognition information includes at least one recognized text field and position information corresponding to the at least one recognized text field. An operating mechanism is then controlled to perform the first operation on the first device. In this way, the current state of the device can be recognized through the first recognition information obtained by performing text recognition on the first image, and the first operation to be performed next can be determined accordingly, which improves both recognition accuracy and the efficiency of automated operation.
Drawings
Fig. 1 is a schematic implementation flow chart of an information processing method according to an embodiment of the present application;
Fig. 2 is a schematic implementation flow chart of another information processing method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the recognition effect of text recognition combined with icon recognition according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a collaborative test according to an embodiment of the present application;
Fig. 5 is a schematic diagram of the composition structure of an information processing device according to an embodiment of the present application;
Fig. 6 is a schematic diagram of the composition structure of an information processing apparatus according to an embodiment of the present application.
Detailed Description
The technical scheme of the present application is further elaborated below with reference to the drawings and specific embodiments.
This embodiment provides an information processing method applied to an information processing device. The functions achieved by the method can be realized by a processing means in the information processing device calling program code; the program code can, of course, be stored in a computer storage medium. The information processing device therefore includes at least the processing means and the storage medium.
Fig. 1 is a schematic implementation flow chart of an information processing method according to an embodiment of the present application, as shown in fig. 1, the method includes the following steps:
In step S101, the information processing device recognizes the acquired first image.
Here, the first image is a display image of the first device. The information processing apparatus may include an image pickup device, which may be a camera. The first device may be an intelligent terminal, for example, a mobile terminal with wireless communication capability, such as a smart phone, a tablet computer, a notebook computer, and the like. It should be noted that, in the embodiment of the present application, the first device includes at least a display device having a touch function.
When step S101 is implemented, the information processing device may capture, through a camera, a first image on a display interface of the first device, and then perform text recognition on the first image to obtain a first recognition result.
In step S102, if the first recognition result indicates that the first image contains first recognition information, the information processing device determines a first operation for the first image based at least on the first recognition information.
Here, the first recognition information includes at least one recognized text field and position information corresponding to the at least one recognized text field.
In other embodiments, before step S102, the method further includes: the information processing device determines whether the first recognition result indicates that the first image contains the first recognition information; this can be determined through a pre-designed operation script. If the first recognition result indicates that the first image contains the first recognition information, the first image displayed by the first device conforms to the operation rules of the operation script, and step S102 is performed.
When step S102 is implemented, if the first recognition result indicates that the first image contains the first recognition information, the information processing device may determine, according to a preset operation script and the first recognition information, a target text field or target position information corresponding to the first image, and then determine the first operation for the first image based on the target text field or the target position information.
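The script-driven lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the script format, the field names, the coordinates, and the `determine_first_operation` helper are all hypothetical.

```python
# Hypothetical operation script: each step records the text field expected on
# the current screen and the action to perform on it (see step S102).
OPERATION_SCRIPT = [
    {"expect": "mobile phone manager", "action": "tap"},   # desktop screen
    {"expect": "cleaning acceleration", "action": "tap"},  # inside the app
]

def determine_first_operation(recognition_info, step):
    """Given OCR output as [(text_field, (x, y)), ...] and the current script
    step, return (action, target position), or None if the screen does not
    match the expected result."""
    expected = OPERATION_SCRIPT[step]["expect"]
    for text_field, position in recognition_info:
        if text_field == expected:
            return (OPERATION_SCRIPT[step]["action"], position)
    return None  # verification failed: screen differs from the script

op = determine_first_operation(
    [("settings", (40, 200)), ("mobile phone manager", (220, 200))], step=0)
# op == ("tap", (220, 200))
```

Returning `None` corresponds to the branch where the recognition result does not match the expected result and the process ends.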
In step S103, the information processing device controls an operating mechanism to perform the first operation on the first device.
Here, the operating mechanism may be an integral part of the information processing device; the processor of the information processing device and the operating mechanism may exchange information and data over a communication bus, or alternatively over a wireless communication connection.
In other embodiments, the operating mechanism may also be an entity independent of the information processing device, in which case the information processing device establishes a wireless or wired communication connection with the operating mechanism to transmit information and data.
Step S103 may be implemented in two ways: the information processing device may control the operating mechanism to contact the display area of the first device's display device corresponding to the target position information, so that a sensing device overlapping the display device of the first device receives a contact operation; or the information processing device may control the operating mechanism to output audio information corresponding to the target text field, so that an audio acquisition device of the first device receives the audio information.
In the information processing method provided by this embodiment, the acquired first image is recognized, where the first image is a display image of the first device; if the first recognition result indicates that the first image contains first recognition information, a first operation for the first image is determined based at least on the first recognition information; and an operating mechanism is controlled to perform the first operation on the first device. In this way, during automated operation, the first operation to be performed next can be determined through the first recognition information obtained by performing text recognition on the first image, which improves recognition accuracy and, in turn, the efficiency of automated operation.
Based on the foregoing embodiments, the embodiments of the present application further provide an information processing method, which is applied to an information processing apparatus, and fig. 2 is a schematic flow chart of implementation of another information processing method according to the embodiments of the present application, as shown in fig. 2, where the method includes the following steps:
In step S201, the information processing device acquires a first image displayed by the first device based on an operation instruction to start automated operation.
Here, the information processing device may be an automation device, that is, a device that operates automatically, processes data, and displays or outputs results in an appropriate manner with little or no human involvement. The information processing device includes at least a host and an operation box, which establish a wired or wireless communication connection. The operation box includes at least an image acquisition device, a fixture for the operated device, and an operating mechanism. The image acquisition device may be a camera that captures images shown on the display screen of the operated device. The operating mechanism may be a manipulator that simulates a human hand operating the device under test; in other embodiments, it may instead be a voice output device, such as a speaker, that outputs voice data.
In this embodiment, after the information processing device receives an operation instruction to start automated operation, a first image displayed by a first device is captured by the image acquisition device in the operation box, where the first device is the operated device.
In step S202, the information processing apparatus performs text recognition on the first image, and obtains a first recognition result.
Here, after the first image is acquired, text recognition is performed on it to obtain the text information it contains, so as to determine the display content of the first image.
When step S202 is implemented, the characters in the first image may first be recognized using optical character recognition (OCR) technology, and the recognized characters are then matched against a preset lexicon to determine the first recognition result. It should be noted that the lexicon used here may be a limited lexicon composed of the names of the applications currently on the market and the vocabulary appearing in each level of each application's interfaces; restricting the lexicon in this way improves recognition efficiency.
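A minimal sketch of this limited-lexicon matching, with the lexicon contents invented for illustration and Python's standard-library `difflib` standing in for whatever string matcher the patent leaves unspecified:

```python
import difflib

# Hypothetical limited lexicon: application names plus vocabulary from their
# interface menus, as the text above suggests.
LEXICON = ["settings", "application market", "memo", "calculator",
           "mobile phone manager", "gallery", "one-key optimization"]

def match_ocr_field(raw_field):
    """Snap a raw OCR string to the closest lexicon entry, or None if no
    entry is similar enough (cutoff chosen arbitrarily for the sketch)."""
    hits = difflib.get_close_matches(raw_field, LEXICON, n=1, cutoff=0.6)
    return hits[0] if hits else None

print(match_ocr_field("calcu1ator"))  # OCR confused 'l' with '1' -> "calculator"
```

Because the lexicon is small and closed, a noisy OCR string can be corrected instead of being rejected, which is the efficiency gain the passage describes.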
In step S203, the information processing device determines whether the first recognition result contains first recognition information.
Here, the first recognition information includes at least one recognized text field and position information corresponding to the at least one recognized text field.
When automated operation is performed, an operation script is generated or designed in advance. The script records the expected result of each operation and specifies which step to execute next once the result of the current step is verified to match the expected result. Determining whether the first recognition result contains the first recognition information can therefore be regarded as verifying the first recognition result, or the first image, against the expected result. In this embodiment, the first recognition information can be regarded as the expected result: if the first recognition result contains it, the result matches the expected result and the process proceeds to step S204; if not, the result does not match the expected result and the process ends.
In step S204, the information processing device determines, based on the first recognition information, target position information corresponding to the first image.
Here, step S204 may be implemented by determining the target position information corresponding to the first image from the operation script and the first recognition information. The target position information is the position at which the next operation is to be performed.
For example, suppose the first recognition information includes eight text fields, "11:30", "March 25", "settings", "application market", "memo", "calculator", "mobile phone manager", and "gallery", together with the position information of these eight text fields; the first image can then be taken to be a desktop image of the first device. If, based on the operation script, the "mobile phone manager" application is to be opened next, the target position information is the position information corresponding to the "mobile phone manager" text field.
In step S205, the information processing device controls the operating mechanism to contact the display area of the first device's display device corresponding to the target position information, so that a sensing device overlapping the display device of the first device receives the contact operation.
After determining the target position information, the information processing device may send a control instruction to the operating mechanism in the operation box, where the control instruction carries at least the target position information. After receiving the control instruction, the operating mechanism moves to the display area corresponding to the target position information and contacts it, so that the sensing device overlapping the display device of the first device receives a contact operation.
In this embodiment, after receiving the contact operation, the first device executes the corresponding instruction. For example, when the contact operation touches "mobile phone manager", the first device opens the "mobile phone manager" application and displays a second image.
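The control-instruction handoff above can be sketched as follows. Everything here is hypothetical scaffolding: the patent only requires that the instruction carry the target position, so the message format and the mock mechanism are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class ControlInstruction:
    """Hypothetical control message from the host to the operating mechanism;
    per the text, it must carry at least the target position information."""
    x: int
    y: int

class MockOperatingMechanism:
    """Stand-in for the manipulator in the operation box (step S205)."""
    def __init__(self):
        self.log = []

    def execute(self, instr: ControlInstruction):
        # Move over the display area for the target position, then contact it
        # so the sensing device overlapping the first device's display fires.
        self.log.append(("move", instr.x, instr.y))
        self.log.append(("contact",))

arm = MockOperatingMechanism()
arm.execute(ControlInstruction(x=220, y=200))
```

The real mechanism would translate the same two logged steps into physical motion; the mock only records them, which is also how such a dispatcher would typically be unit-tested.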
In step S206, the information processing device acquires a second image displayed by the first device.
In step S207, the information processing device performs text recognition on the acquired second image to obtain a second recognition result.
Here, the second recognition result includes each text field in the second image and the position information of each text field.
In step S208, if the second recognition result indicates that the second image contains second recognition information, a second operation for the second image is determined based at least on the second recognition information.
Here, the second recognition information includes at least one recognized text field and position information corresponding to the at least one recognized text field. If the second recognition result indicates that the second image contains the second recognition information, the second recognition result is considered to match the expected result, and the next step to perform can then be determined.
For example, suppose the second recognition result includes four text fields, "one-key optimization", "cleaning acceleration", "harassment interception", and "virus killing", together with the position information corresponding to these four text fields. If, based on the operation script, the second operation is "cleaning acceleration", the position for the second operation can be determined from the position information of the "cleaning acceleration" text field in the second recognition result.
In step S209, the information processing device controls the operating mechanism to perform the second operation on the first device.
Here, when step S209 is implemented, the information processing device may control the operating mechanism to contact the display area of the first device's display device corresponding to the target position information of the second operation, so that the sensing device overlapping the display device of the first device receives the contact operation and the first device executes the corresponding operation instruction.
In other embodiments, after the first device performs the second operation, the image currently displayed by the first device may again be acquired and text-recognized, the image may be verified based on the recognition result, and the next operation to perform may be determined.
In the information processing method provided by this embodiment, after the information processing device receives an operation instruction to start automated operation, it acquires the first image displayed by the first device to be operated and performs text recognition on it to obtain a first recognition result. If the first recognition result indicates that the first image contains first recognition information, the first image is considered to match the expected result; the first operation to perform next is then determined based on the first recognition information, and the operating mechanism is controlled to perform the first operation on the first device. The next round can then proceed in the same way: the second image displayed by the first device is verified, and the second operation to perform next is determined. In this method, text recognition is used to verify the images displayed by the first device, and the recognition uses a lexicon composed of the names of the applications on the market and the vocabulary in their display interfaces. This greatly improves recognition accuracy, ensures that automated operation can proceed continuously, and thus improves working efficiency.
It should be noted that, in other embodiments, steps S204 and S205 may be replaced by the following steps:
In step S204', the information processing device determines, based on the first recognition information, a target text field corresponding to the first image.
In step S205', the information processing device controls the operating mechanism to output audio information corresponding to the target text field, so that the audio acquisition device of the first device receives the audio information.
Here, the operating mechanism may be an audio output device, for example a speaker. In practice, before a voice instruction is sent to the first device, the voice acquisition device of the first device needs to be started by a preset operation instruction; once it is started, the audio information corresponding to the target text field is output, so that the first device acquires the audio information and then executes the corresponding operation instruction.
In an actual implementation, step S202 may be carried out through the following steps:
In step S2021, text recognition is performed on the first image to obtain N recognized original text fields and the position information of the N original text fields.
Here, step S2021 may be implemented by performing text recognition on the first image using OCR technology, or using any other text recognition method, to obtain the N recognized original text fields and their position information; this is not limited here.
In step S2022, the i-th original text field is matched against a preset lexicon to obtain a first matching value and a second matching value for the i-th original text field.
Here, the first matching value is larger than the second matching value, and i = 1, 2, …, N.
The preset lexicon may be a limited lexicon composed of the names of the applications on the market and the vocabulary in each level of each application's interfaces, which improves matching efficiency when the recognized original text fields are matched.
When step S2022 is implemented, the i-th original text field may first be matched against each lexicon text field to obtain a matching value between the i-th original text field and each lexicon text field. In this embodiment, the largest of these matching values is taken as the first matching value, and the second-largest is taken as the second matching value.
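The top-two scoring in step S2022 can be sketched as follows; `difflib`'s similarity ratio is an assumption standing in for whatever matcher the patent presumes, and the lexicon is invented.

```python
import difflib

def top_two_matches(original_field, lexicon):
    """Score the i-th original text field against every lexicon field and
    return the best and second-best (entry, score) pairs, i.e. the first
    and second matching values of step S2022."""
    scored = sorted(
        ((difflib.SequenceMatcher(None, original_field, entry).ratio(), entry)
         for entry in lexicon),
        reverse=True)
    (first_val, first_entry), (second_val, second_entry) = scored[:2]
    return (first_entry, first_val), (second_entry, second_val)

first, second = top_two_matches("calculator",
                                ["calculator", "calendar", "gallery"])
# first is ("calculator", 1.0); second scores strictly lower
```

Keeping the runner-up score, rather than only the best match, is what later allows the method to detect ambiguous cases instead of silently picking a wrong but high-scoring entry.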
In step S2023, it is determined whether the first matching value and the second matching value of the i-th original text field satisfy a preset condition.
Here, if the first and second matching values satisfy the preset condition, the lexicon field corresponding to the first matching value can be regarded as the recognized text field, and the process proceeds to step S2024. If they do not satisfy the preset condition, the lexicon field corresponding to the first matching value cannot be regarded as the recognized text field, and the process proceeds to step S2025 to recognize the image corresponding to the i-th original text field again.
If the lexicon contains several words similar to the original text field, the first and second matching values will be close to each other; in that case, to recognize the i-th field correctly, the first and second matching values must be required to satisfy a preset condition. In this embodiment, the preset condition may be that the first matching value is at least 1.5 times the second matching value. For example, if the i-th original text field is "calculator", the first matching value is 67%, and the second matching value, corresponding to the lexicon entry "Alipay", is 10%, then the first and second matching values satisfy the preset condition and the process proceeds to step S2024.
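The preset condition in step S2023 amounts to a margin test between the two matching values; a minimal sketch (the 1.5 factor comes from the text, the zero-score guard is an added implementation detail):

```python
def matches_reliably(first_val, second_val, ratio=1.5):
    """Preset condition from step S2023: accept the best lexicon match only
    if it beats the runner-up by a clear margin (at least `ratio` times).
    A zero runner-up score trivially passes."""
    return second_val == 0 or first_val >= ratio * second_val

# The "calculator" example from the text: 67% vs 10% -> condition satisfied.
assert matches_reliably(0.67, 0.10)
# An ambiguous 80% vs 75% case -> fall back to re-recognition (step S2025).
assert not matches_reliably(0.80, 0.75)
```

The point of the ratio form, rather than a fixed absolute threshold, is that even a high first matching value is rejected when a near-identical competitor exists in the lexicon.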
In step S2024, if the first and second matching values of the i-th original text field satisfy the preset condition, the lexicon field corresponding to the first matching value is determined to be the i-th recognized text field.
Here, continuing the example in step S2023, "calculator" is determined to be the i-th recognized text field.
In step S2025, if the first and second matching values of the i-th original text field do not satisfy the preset condition, a fifth image corresponding to the position information of the i-th original text field is acquired from the first device.
Here, suppose the i-th original text field is "ticker", the first matching value is 80%, and the second matching value, corresponding to a different but similarly spelled lexicon field, is 75%. Even though the first matching value is already quite high, the second matching value is very close to it, which means that at least two words in the lexicon are similar to the original text field. The first and second matching values are then considered not to satisfy the preset condition, and recognition must be performed again. At this point, the fifth image corresponding to the position information of the i-th original text field can be acquired.
It should be noted that, when the fifth image is acquired, the position and acquisition parameters of the image acquisition device are adjusted according to the position information, so that a complete and clear fifth image can be acquired.
Step S2026, identifying the fifth image, and obtaining a text field contained in the fifth image.
Here, when step S2026 is implemented, image matching may be performed on the acquired fifth image to obtain a matching result, and the text field contained in the fifth image is then determined based on that result. The matching gallery used for the fifth image may consist of icons of applications currently on the market and images of each interface level within those applications.
Continuing the example in step S2025: since the correct recognition result for "ticker" could not be determined, the icon corresponding to "ticker" can be acquired and matched against the gallery to obtain the matching image corresponding to the icon, and the correct recognition result for "ticker" is then determined from the application name associated with that matching image.
In this step, text recognition may also be performed on the fifth image to obtain its original text field, which is then matched against the word stock to determine the i-th recognition text field.
Step S2027 determines the text field contained in the fifth image as the i-th recognition text field.
Step S2028 determines a first recognition result based on the N recognition text fields and the position information corresponding to the N recognition text fields.
In the embodiment of steps S2021 to S2028, when text recognition is performed on the first image, the obtained original text fields may be matched against a preset, limited word stock to determine the text information contained in the first image; when text recognition alone cannot determine the result, it may be combined with icon recognition. This not only improves recognition accuracy but also maintains high recognition efficiency.
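The combined flow of steps S2021 to S2027 — accept the OCR match when the preset condition holds, otherwise fall back to icon matching on a re-captured region — can be outlined as below. The callables `ocr_match` and `icon_match` are hypothetical stand-ins for the word-stock matcher and the icon-gallery matcher; only the branching logic mirrors the text.

```python
def recognize_field(original_field, position, ocr_match, icon_match,
                    ratio_threshold=1.5):
    """Fallback pipeline sketch: trust OCR when the preset condition holds,
    otherwise re-capture the region and fall back to icon matching.

    `ocr_match(field)` -> (first, second, best_word) against the word stock;
    `icon_match(position)` -> application name from an icon gallery.
    Both are assumed callables supplied by the caller.
    """
    first, second, best_word = ocr_match(original_field)
    if first >= ratio_threshold * second:
        return best_word           # step S2024: accept the word-stock field
    return icon_match(position)    # steps S2025-S2027: icon-based fallback
```

With the figures from the examples above, 67% vs. 10% takes the OCR branch, while 80% vs. 75% falls through to icon matching.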
Fig. 3 is a schematic diagram showing the recognition effect of combining text recognition with icon recognition according to an embodiment of the present application. As shown in Fig. 3, recognition by Icon Recognition (IR) alone achieves an accuracy of 81%, recognition by OCR alone achieves 95%, and recognition by a customized artificial intelligence (AI) search engine that combines OCR with IR can reach an accuracy of over 96%.
Based on the foregoing embodiments, an embodiment of the present application further provides an information processing method, which is applied to an information processing system that is composed of an information processing apparatus, a first apparatus, and a second apparatus, the method including:
in step S301, the information processing apparatus acquires a first image displayed by a first apparatus.
Here, in the embodiment of the present application, the information processing apparatus includes a host and at least two operation boxes; the devices to be operated, including the first device and the second device, are placed in the operation boxes, and both operation boxes establish a wireless or wired connection with the host.
In step S302, the information processing apparatus performs text recognition on the first image, and obtains a first recognition result.
In step S303, the information processing apparatus determines whether the first identification result includes first identification information.
Here, if the first identification result includes the first identification information, the first image may be considered consistent with the expected result, and the flow proceeds to step S304; if the first identification information is not included in the first identification result, the flow ends.
In step S304, the information processing apparatus determines, based on the first identification information, target position information corresponding to the first image.
In step S305, the information processing apparatus controls the operating mechanism to contact the display area of the display device corresponding to the target position information in the first device, so that the sensing device overlapping the display device of the first device can detect the contact operation.
Here, in the present embodiment, the first operation may be an operation of instructing the first device to transmit a message to the second device. For example, if the first operation is to input "hello" in the text input box and click on the display area corresponding to the button control "send" in the first device, the operation mechanism simulates the operation of a human hand, inputs "hello" and sends the message.
In step S306, the information processing device obtains a third image currently displayed by the first device, and identifies the third image to obtain a third identification result.
Here, the third image is an image displayed after the first device transmits the message to the second device. In this embodiment, the recognition of the third image may be text recognition of the third image to obtain text information contained in the third image.
In step S307, the information processing device acquires a fourth image currently displayed by the second device, and identifies the fourth image to acquire a fourth identification result.
Here, the fourth image is an image displayed after the second device receives the message sent by the first device.
In step S308, the information processing apparatus verifies the third identification result and the fourth identification result, and obtains a verification result to verify whether the execution result of the first operation is correct.
Here, verifying the third recognition result and the fourth recognition result to check whether the execution result of the first operation is correct may involve verifying whether the transmitted message content is included in both recognition results and whether the position at which the message content is displayed is correct.
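A hedged sketch of this verification step: it checks that the sent message text appears in both recognition results and that each occurrence lies in the plausible display region. The left/right bubble convention and the `(text, (x, y))` result shape are assumptions about the chat layout, not details from the patent.

```python
def verify_send(message, third_result, fourth_result):
    """Sketch of step S308, assuming each recognition result is a list of
    (text, (x, y)) tuples with x normalized to [0, 1]. Outgoing messages on
    the first device are assumed to sit on the right half of the screen,
    incoming messages on the second device on the left half.
    """
    def find(result, want_right):
        for text, (x, _y) in result:
            if message in text:
                return x > 0.5 if want_right else x <= 0.5
        return False

    sent_ok = find(third_result, want_right=True)        # first device: outgoing
    received_ok = find(fourth_result, want_right=False)  # second device: incoming
    return sent_ok and received_ok
```

The verification fails if the message is missing from either result or is displayed in the wrong region.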
It should be noted that the same steps or concepts as those in the other embodiments in this embodiment may be explained with reference to the descriptions in the other embodiments.
In the information processing method provided in the embodiment of the present application, cooperative testing is completed by the information processing device, the first device, and the second device. Fig. 4 is a schematic diagram of cooperative testing in an embodiment of the present application. As shown in Fig. 4, a communication connection is established between the host 401 in the information processing device and box a 402 and box b 403; the first device under test is placed in box a, and the second device under test in box b. TSi denotes the i-th test case in the test script. When the host runs TSi and must cooperate with TSj to complete the test, an available box b is found and allocated to TSj, and the host monitors and coordinates box a and box b to complete TSi. The interaction test between the two devices is thus completed through the cooperation of the information processing device, the first device, and the second device.
In other embodiments, before the information processing apparatus acquires the first image, the method further includes:
in step 41, the information processing apparatus acquires size information and resolution information of the display device of the first apparatus.
Here, when step 41 is implemented, detailed parameters of the first device may be acquired through information such as a brand, a model, etc. of the first device, so that size information and resolution information of the display apparatus of the first device are further acquired.
In step 42, the information processing apparatus determines a target acquisition parameter for image acquisition and/or a target position of the image acquisition device according to the size information and the resolution information.
When step 42 is implemented, the image acquisition device of the information processing apparatus may first capture a preset standard resolution chart and analyze the captured image of the chart to determine the actual performance parameters of the image acquisition device. The target acquisition parameters for image acquisition and/or the target position of the image acquisition device are then determined from the size information, the resolution information, the position information of the first device, and those performance parameters.
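As a rough illustration of how a target camera position might be derived from display geometry, the sketch below estimates the camera distance needed for the display to span a given fraction of the frame. The field-of-view model and all parameter values are assumptions; the embodiment's actual calibration against the standard resolution chart is not reproduced here.

```python
import math

def camera_distance_mm(display_width_mm, horizontal_fov_deg, fill_ratio=0.9):
    """Estimate how far the camera should sit so the display spans
    `fill_ratio` of the frame width, from the display width and the
    camera's horizontal field of view (a simple pinhole-camera model).
    """
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    visible_width = display_width_mm / fill_ratio  # frame width needed at the screen plane
    return (visible_width / 2.0) / math.tan(half_fov)
```

For example, a 150 mm-wide display with a 60-degree field of view and the display filling the whole frame requires a camera distance of roughly 130 mm.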
Step 43, the information processing device adjusts the image acquisition device based on the target acquisition parameter and/or the target position, so as to obtain a first image displayed by the first device.
Here, adjusting the image acquisition device includes adjusting its acquisition parameters and/or position to reach the target acquisition parameters and/or target position before capturing the first image displayed by the first device. This ensures accurate matching of the acquired image and effectively raises the recognition rate from 80% to more than 95%.
An embodiment of the present application provides an information processing apparatus, fig. 5 is a schematic diagram of a composition structure of the information processing apparatus according to the embodiment of the present application, and as shown in fig. 5, the information processing apparatus 500 at least includes: a first identification module 501, a first determination module 502, and a first control module 503, wherein:
the first identifying module 501 is configured to identify an acquired first image, where the first image is a display image of a first device;
the first determining module 502 is configured to determine, if a first recognition result indicates that first recognition information is included in the first image, a first operation for the first image based at least on the first recognition information, where the first recognition information includes at least one recognition text field and location information corresponding to the at least one recognition text field;
the first control module 503 is configured to control an operating mechanism to perform the first operation on the first device.
In other embodiments, the first determining module includes:
the first determining unit is used for determining target position information or a target text field corresponding to the first image based on the first identification information;
And a second determining unit configured to determine a first operation for the first image based on the target position information or the target text field.
In other embodiments, the first control module includes:
a first control unit, configured to control the operating mechanism to contact the display area of the display device corresponding to the target position information in the first device, so that the sensing device overlapping the display device of the first device can detect the contact operation; or,
the second control unit is used for controlling the operating mechanism to output the audio information corresponding to the target text field, so that the audio acquisition device of the first equipment can acquire the audio information.
In other embodiments, the apparatus further comprises:
the second identification module is used for identifying an acquired second image, wherein the second image is a display image of the first device;
a second determining module, configured to determine, if a second recognition result indicates that the second image includes second recognition information, a second operation for the second image based at least on the second recognition information, where the second recognition information includes at least one recognition text field and position information corresponding to the at least one recognition text field;
And the second control module is used for controlling an operating mechanism to execute the second operation on the first equipment.
In other embodiments, when the first operation is an operation that instructs the first device to send a message to a second device, the apparatus further comprises:
the third identification module is used for acquiring a third image currently displayed by the first equipment, identifying the third image and acquiring a third identification result;
the fourth identification module is used for acquiring a fourth image currently displayed by the second equipment, identifying the fourth image and acquiring a fourth identification result;
and the verification module is used for verifying the third identification result and the fourth identification result to obtain a verification result so as to verify whether the execution result of the first operation is correct.
In other embodiments, the apparatus further comprises:
a first obtaining module, configured to obtain size information and resolution information of a display device of the first apparatus;
the third determining module is used for determining target acquisition parameters for image acquisition and/or target positions of the image acquisition device according to the size information and the resolution information;
and the adjusting module is used for adjusting the image acquisition device based on the target acquisition parameters and/or the target position so as to obtain a first image displayed by the first equipment.
In other embodiments, the first identification module includes:
the first acquisition unit is used for carrying out text recognition on the first image and acquiring N recognized original text fields and position information of the N original text fields;
the first matching unit is used for matching the ith original text field with a preset word stock to obtain a first matching value and a second matching value of the ith original text field, wherein the first matching value is larger than the second matching value, and i=1, 2, … and N;
a third determining unit, configured to determine a word stock field corresponding to the first matching value as an i-th identified text field if the first matching value and the second matching value of the i-th original text field meet a preset condition;
and the fourth determining unit is used for determining the first recognition result based on the N recognition text fields and the position information corresponding to the N recognition text fields.
In other embodiments, the first matching unit includes:
the matching subunit is used for matching the ith original text field with each word stock text field in the word stock to obtain each matching value between the ith original text field and each word stock text field;
A first determining subunit configured to determine a largest matching value among the respective matching values as a first matching value;
and a second determining subunit, configured to determine the largest of the matching values other than the first matching value as the second matching value.
In other embodiments, the apparatus further comprises:
the second obtaining module is used for obtaining a fifth image corresponding to the position information on the first device based on the position information of the ith original text field if the first matching value and the second matching value of the ith original text field do not meet a preset condition;
a fifth identifying module, configured to identify the fifth image, and obtain a text field included in the fifth image;
and a fourth determining module, configured to determine a text field included in the fifth image as an i-th identification text field.
In other embodiments, the fifth identification module includes:
the second matching unit is used for carrying out image matching on the fifth image to obtain a matching result;
and a fifth determining unit configured to determine a text field included in the fifth image based on the matching result.
It should be noted here that the above description of the information processing apparatus embodiments is similar to the description of the method embodiments and provides the same beneficial effects. For technical details not disclosed in the apparatus embodiments of the present application, reference may be made to the description of the method embodiments of the present application.
Based on the foregoing embodiments, the present embodiment provides an information processing apparatus, fig. 6 is a schematic diagram of a composition structure of the information processing apparatus according to the embodiment of the present application, and as shown in fig. 6, the information processing apparatus 600 includes at least:
a first interface 601, configured to obtain a first image, where the first image is acquired by an image acquisition device according to a display image of a first device;
processing means 602, configured to identify the first image, determine a first operation for the first image based on at least the first identification information if the first identification result indicates that the first image includes the first identification information, and control an operation mechanism 603 to perform the first operation for the first device; the first identification information comprises at least one identification text field and position information corresponding to the at least one identification text field.
In other embodiments, the operating mechanism 603 includes a pointing device 6031, and the pointing device 6031 is configured to contact, under the control of the processing device 602, the display area of the display device corresponding to the target position information in the first device, so that the sensing device overlapping the display device of the first device can detect the contact operation.
As shown in Fig. 6, the pointing device 6031 is connected to a stepping motor 6032, an actuator that converts electric pulses into angular displacement. When the step driver in the stepping motor receives a pulse signal, it drives the motor to rotate by a fixed angle in the set direction. Because the motor advances step by step at this fixed angle, the angular displacement can be controlled by controlling the number of pulses, achieving accurate positioning. The stepping motor can therefore precisely adjust the position of the pointing device under the control of the processing device, ensuring that the pointing device contacts the correct display area.
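The pulse-to-angle relationship described above reduces to a one-line conversion. The 1.8-degree step angle and the microstepping factor below are typical stepper-motor values, not figures from the patent:

```python
def pulses_for_angle(target_deg, step_angle_deg=1.8, microsteps=1):
    """Each pulse advances the shaft by a fixed step angle, so a target
    angular displacement maps directly to a pulse count. `microsteps`
    models a driver that subdivides each full step.
    """
    return round(target_deg / (step_angle_deg / microsteps))
```

A 90-degree rotation thus needs 50 full steps, or 800 pulses with 16x microstepping.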
In other embodiments, the pointing device further comprises a robotic arm that is a three-axis flexible robotic arm.
In other embodiments, the operating mechanism 603 may further include an audio output device 6032, where the audio output device 6032 is configured to output, under the control of the processing device, the audio information corresponding to the target text field, so that the audio acquisition device of the first device can capture the audio information.
In other embodiments, the information processing apparatus 600 further includes: the image acquisition device 604 and the acquisition adjustment device 605, wherein the image acquisition device 604 is mounted on the acquisition adjustment device 605, and the acquisition adjustment device 605 can, under the control of the processing device 602, adjust the image acquisition device 604 to the target position to obtain the first image displayed by the first device. In other embodiments, the acquisition adjustment device may include another stepping motor that drives the acquisition adjustment device to adjust the position of the image acquisition device under the control of the processing device.
In other embodiments, the processing device performs the following steps when performing the step of determining a first operation for the first image based at least on the first identification information if the first identification result indicates that the first image contains the first identification information:
determining target position information or a target text field corresponding to the first image based on the first identification information;
a first operation for the first image is determined based on the target location information or target text field.
In other embodiments, the processing means, after controlling the operating mechanism to perform said first operation for the first device, further performs the steps of:
identifying an acquired second image, wherein the second image is a display image of the first device;
if a second recognition result indicates that the second image contains second recognition information, determining a second operation for the second image based at least on the second recognition information, wherein the second recognition information includes at least one recognition text field and position information corresponding to the at least one recognition text field;
and controlling an operation mechanism to execute the second operation on the first device.
In other embodiments, the processing means, after controlling the operating mechanism to perform the first operation for the first device, further performs the steps of:
acquiring a third image currently displayed by the first equipment, and identifying the third image to acquire a third identification result;
acquiring a fourth image currently displayed by the second equipment, and identifying the fourth image to acquire a fourth identification result;
and verifying the third identification result and the fourth identification result to obtain a verification result so as to verify whether the execution result of the first operation is correct.
In other embodiments, the processing device further performs the steps of:
acquiring size information and resolution information of a display device of the first equipment;
determining target acquisition parameters for image acquisition and/or target positions of an image acquisition device according to the size information and the resolution information;
and adjusting the image acquisition device based on the target acquisition parameters and/or the target position, so as to obtain a first image displayed by the first equipment.
In other embodiments, the processing device performs the step of identifying the acquired first image, and when acquiring the first identification result, performs the following steps:
Performing text recognition on the first image to obtain N recognized original text fields and position information of the N original text fields;
matching an ith original text field with a preset word stock to obtain a first matching value and a second matching value of the ith original text field, wherein the first matching value is larger than the second matching value, i=1, 2, … and N;
if the first matching value and the second matching value of the ith original text field meet a preset condition, determining a word stock field corresponding to the first matching value as an ith identification text field;
and determining a first recognition result based on the N recognition text fields and the position information corresponding to the N recognition text fields.
In other embodiments, when the processing device performs the step of matching the ith original text field with a preset word stock to obtain a first matching value and a second matching value of the ith original text field, the processing device performs the following steps:
matching an ith original text field with each word stock text field in the word stock to obtain each matching value between the ith original text field and each word stock text field;
Determining the largest matching value in the matching values as a first matching value;
and determining the largest of the matching values other than the first matching value as a second matching value.
In other embodiments, the processing device further performs the steps of:
if the first matching value and the second matching value of the ith original text field do not meet a preset condition, acquiring a fifth image corresponding to the position information on the first device based on the position information of the ith original text field;
identifying the fifth image and acquiring a text field contained in the fifth image;
and determining the text field contained in the fifth image as an ith identification text field.
In other embodiments, when performing the step of identifying the fifth image and acquiring the text field included in the fifth image, the processing device performs the following steps:
performing image matching on the fifth image to obtain a matching result;
and determining a text field contained in the fifth image based on the matching result.
It should be noted here that the above description of the information processing device embodiments is similar to the description of the method embodiments and provides the same beneficial effects. For technical details not disclosed in the device embodiments of the present application, reference may be made to the description of the method embodiments of the present application.
Accordingly, the present embodiment further provides a computer storage medium, in which computer executable instructions are stored, which when executed by a processing device, implement the steps of the information processing method provided in the above embodiment.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application. The foregoing embodiment numbers of the present application are merely for describing, and do not represent advantages or disadvantages of the embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above device embodiments are only illustrative; for example, the division of the units is only one logical function division, and there may be other divisions in practice, such as: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by program instructions running on related hardware. The foregoing program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments; the aforementioned storage medium includes media that can store program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated units described above may be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially or partly contributing to the prior art, and the computer software product may be stored in a storage medium, and include several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a removable storage device, a ROM, a magnetic disk, or an optical disk.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. An information processing method, comprising:
recognizing an acquired first image, wherein the first image is a display image of a first device;
if a first recognition result indicates that the first image contains first identification information, determining a first operation for the first image based at least on the first identification information, wherein the first identification information comprises at least one identification text field and position information corresponding to the at least one identification text field; and
controlling an operating mechanism to execute the first operation on the first device;
wherein controlling the operating mechanism to execute the first operation on the first device comprises at least one of:
controlling the operating mechanism to contact a display area of a display device of the first device corresponding to at least one piece of the position information, so that a sensing device overlapping the display device of the first device registers the contact operation; or
controlling the operating mechanism to output audio information corresponding to at least one identification text field, so that an audio acquisition device of the first device can acquire the audio information.
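As an illustration only (hypothetical Python, not part of the claims; the target-text input and the touch-before-audio fallback policy are assumptions), the operation selection of claim 1 might be sketched as:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RecognizedField:
    text: str   # an identification text field recognized in the first image
    x: int      # position information: where the field appears on the display
    y: int

def determine_first_operation(fields: List[RecognizedField],
                              target_text: str) -> dict:
    """Determine the first operation from the first identification
    information: touch the display at the matching field's position if the
    target text is on screen; otherwise fall back to outputting the text as
    audio for the device's microphone to pick up."""
    for f in fields:
        if f.text == target_text:
            # The operating mechanism contacts the display area at (x, y),
            # so the sensing device overlapping the display registers a tap.
            return {"type": "touch", "x": f.x, "y": f.y}
    # No matching field on screen: speak the text instead.
    return {"type": "audio", "utterance": target_text}

fields = [RecognizedField("Settings", 120, 480), RecognizedField("OK", 300, 900)]
op = determine_first_operation(fields, "OK")
```

A real system would drive a robotic actuator or speaker with the returned operation; here it is just a dictionary.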
2. The method of claim 1, wherein determining a first operation for the first image based at least on the first identification information, if the first recognition result indicates that the first image contains the first identification information, comprises:
determining target position information or a target text field corresponding to the first image based on the first identification information; and
determining the first operation for the first image based on the target position information or the target text field.
3. The method of claim 2, further comprising, after controlling the operating mechanism to execute the first operation on the first device:
acquiring a third image currently displayed by the first device, and recognizing the third image to obtain a third recognition result;
acquiring a fourth image currently displayed by a second device, and recognizing the fourth image to obtain a fourth recognition result; and
checking the third recognition result against the fourth recognition result to obtain a verification result, so as to verify whether the execution result of the first operation is correct.
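The cross-check of claim 3 could be sketched as follows (hypothetical Python; representing each recognition result as a set of recognized text fields is an assumption, and a production system would likely need fuzzy rather than exact comparison to tolerate OCR noise):

```python
def verify_execution(third_result: set, fourth_result: set) -> bool:
    """Verify the first operation: compare the text fields recognized on the
    first device (third recognition result) with those recognized on the
    second, reference device (fourth recognition result)."""
    return third_result == fourth_result

# The screens agree, so the first operation is judged to have executed correctly.
ok = verify_execution({"Wi-Fi", "Bluetooth"}, {"Bluetooth", "Wi-Fi"})
```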
4. The method of claim 1, further comprising:
acquiring size information and resolution information of the display device of the first device;
determining target acquisition parameters for image acquisition and/or a target position of an image acquisition device according to the size information and the resolution information; and
adjusting the image acquisition device based on the target acquisition parameters and/or the target position, so as to obtain the first image displayed by the first device.
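One conceivable way to derive a target position from the display size (purely illustrative; the patent does not specify the optics, and the thin-lens approximation, parameter names, and margin factor below are all assumptions) is:

```python
def target_camera_distance_mm(display_width_mm: float,
                              sensor_width_mm: float,
                              focal_length_mm: float,
                              margin: float = 1.1) -> float:
    """Approximate working distance at which the display (plus a small
    margin) just fills the camera frame, using the thin-lens magnification
    m = image_size / object_size ≈ f / d (valid when d >> f)."""
    return focal_length_mm * (display_width_mm * margin) / sensor_width_mm

# A 300 mm wide display, 6 mm wide sensor, 4 mm lens, no margin:
d = target_camera_distance_mm(display_width_mm=300, sensor_width_mm=6,
                              focal_length_mm=4, margin=1.0)
```

The resolution information would similarly bound the acquisition parameters (e.g. capture resolution must exceed the display resolution for reliable text recognition), but that mapping is hardware-specific.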
5. The method of claim 1, wherein recognizing the acquired first image to obtain a recognition result comprises:
performing text recognition on the first image to obtain N recognized original text fields and position information of the N original text fields;
matching an i-th original text field against a preset word stock to obtain a first matching value and a second matching value of the i-th original text field, wherein the first matching value is larger than the second matching value, and i = 1, 2, …, N;
if the first matching value and the second matching value of the i-th original text field satisfy a preset condition, determining the word-stock field corresponding to the first matching value as an i-th identification text field; and
determining the recognition result based on the N identification text fields and the position information corresponding to the N identification text fields.
6. The method of claim 5, wherein matching the i-th original text field against the preset word stock to obtain the first matching value and the second matching value of the i-th original text field comprises:
matching the i-th original text field against each word-stock text field in the word stock to obtain a matching value between the i-th original text field and each word-stock text field;
determining the largest of the matching values as the first matching value; and
determining the largest of the remaining matching values, excluding the first matching value, as the second matching value.
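The first/second matching values of claims 5-6 can be sketched as follows (hypothetical Python; the use of `difflib.SequenceMatcher` as the similarity measure and the specific thresholds in the "preset condition" are assumptions, as the patent leaves both unspecified):

```python
from difflib import SequenceMatcher

def match_against_word_stock(original: str, word_stock):
    """Score an original OCR text field against every word-stock field and
    return the best and second-best (field, score) pairs, i.e. the fields
    behind the first and second matching values."""
    scored = sorted(((w, SequenceMatcher(None, original, w).ratio())
                     for w in word_stock),
                    key=lambda pair: pair[1], reverse=True)
    return scored[0], scored[1]

def meets_preset_condition(first, second,
                           min_score: float = 0.8,
                           min_gap: float = 0.15) -> bool:
    """One plausible form of the 'preset condition': the best match is
    strong in absolute terms and clearly separated from the runner-up."""
    return first[1] >= min_score and (first[1] - second[1]) >= min_gap

# A misrecognized "Setings" is snapped to the word-stock field "Settings".
first, second = match_against_word_stock("Setings",
                                         ["Settings", "Network", "Display"])
```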
7. The method of claim 6, further comprising:
if the first matching value and the second matching value of the i-th original text field do not satisfy the preset condition, acquiring, based on the position information of the i-th original text field, a fifth image corresponding to that position information on the first device;
recognizing the fifth image to obtain a text field contained in the fifth image; and
determining the text field contained in the fifth image as the i-th identification text field;
wherein recognizing the fifth image to obtain the text field contained in the fifth image at least comprises: performing image matching on the fifth image to obtain a matching result, and determining the text field contained in the fifth image based on the matching result.
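The image-matching fallback of claim 7 might look like the following (hypothetical Python; the sum-of-absolute-differences metric and the per-text template dictionary are assumptions, since the patent does not name a matching algorithm — a real system would more likely use normalized cross-correlation on aligned, rescaled crops):

```python
def recognize_by_image_matching(crop, templates):
    """Fallback recognition for the fifth image: compare the cropped region
    pixel-by-pixel against per-text template images and return the text of
    the closest template (smallest sum of absolute differences). Images are
    equal-size grayscale arrays given as nested lists."""
    def sad(a, b):
        return sum(abs(p - q) for row_a, row_b in zip(a, b)
                   for p, q in zip(row_a, row_b))
    return min(templates, key=lambda text: sad(crop, templates[text]))

crop = [[0, 255], [255, 0]]            # tiny grayscale crop of the fifth image
templates = {"OK": [[0, 255], [255, 0]],
             "Cancel": [[255, 0], [0, 255]]}
label = recognize_by_image_matching(crop, templates)
```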
8. An information processing apparatus, comprising at least:
a first interface configured to obtain a first image, wherein the first image is acquired by an image acquisition device from a display image of a first device; and
a processing device configured to recognize the first image, determine a first operation for the first image based at least on first identification information if a first recognition result indicates that the first image contains the first identification information, and control an operating mechanism to execute the first operation on the first device, wherein the first identification information comprises at least one identification text field and position information corresponding to the at least one identification text field;
wherein controlling the operating mechanism to execute the first operation on the first device comprises at least one of:
controlling the operating mechanism to contact a display area of a display device of the first device corresponding to at least one piece of the position information, so that a sensing device overlapping the display device of the first device registers the contact operation; or
controlling the operating mechanism to output audio information corresponding to at least one identification text field, so that an audio acquisition device of the first device can acquire the audio information.
9. The information processing apparatus of claim 8, further comprising an acquisition adjustment device, wherein the image acquisition device is arranged on the acquisition adjustment device, and the acquisition adjustment device is capable of adjusting the image acquisition device to a target position under the control of the processing device, so as to obtain the first image displayed by the first device.
CN201910252347.8A 2019-03-29 2019-03-29 Information processing method and equipment Active CN109992172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910252347.8A CN109992172B (en) 2019-03-29 2019-03-29 Information processing method and equipment


Publications (2)

Publication Number Publication Date
CN109992172A CN109992172A (en) 2019-07-09
CN109992172B true CN109992172B (en) 2024-01-26

Family

ID=67132059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910252347.8A Active CN109992172B (en) 2019-03-29 2019-03-29 Information processing method and equipment

Country Status (1)

Country Link
CN (1) CN109992172B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1889288A1 (en) * 2005-06-07 2008-02-20 Intekplus Co., Ltd Method and apparatus for inspecting marking of semiconductor package
CN103441088A (en) * 2013-08-05 2013-12-11 中航(重庆)微电子有限公司 Automatic system
CN105137860A (en) * 2015-07-31 2015-12-09 重庆世纪精信实业(集团)有限公司 Remote monitoring and control system for industrial manipulator based on Internet
CN105511123A (en) * 2015-11-27 2016-04-20 武汉精测电子技术股份有限公司 High-precision automatic optical inspection system and method based on mechanical arm
CN108214511A (en) * 2018-01-18 2018-06-29 浙江索奥环境技术有限公司 Water quality monitoring quality control O&M robot
CN109471678A (en) * 2018-11-07 2019-03-15 苏州思必驰信息科技有限公司 Voice midpoint controlling method and device based on image recognition



Similar Documents

Publication Publication Date Title
CN108399409B (en) Image classification method, device and terminal
CN107336243B (en) Robot control system and control method based on intelligent mobile terminal
CN106254848A (en) A kind of learning method based on augmented reality and terminal
US10699704B2 (en) Electronic device for processing user utterance and controlling method thereof
CN110210219B (en) Virus file identification method, device, equipment and storage medium
CN105205479A (en) Human face value evaluation method, device and terminal device
CN101233559A (en) Context-sensitive communication and translation methods for enhanced interactions and understanding among speakers of different languages
CN112331196A (en) Electronic device for managing a plurality of intelligent agents and method of operating the same
CN113168227A (en) Method of performing function of electronic device and electronic device using the same
CN105335712A (en) Image recognition method, device and terminal
CN111078829B (en) Click-to-read control method and system
CN109543011A (en) Question and answer data processing method, device, computer equipment and storage medium
CN111901473A (en) Incoming call processing method, device, equipment and storage medium
CN109129484A (en) robot control method, device and storage medium
CN107302744A (en) Mobile terminal, guiding terminal, signal search are indicated and bootstrap technique
CN111757007B (en) Image shooting method, device, terminal and storage medium
CN111079499B (en) Writing content identification method and system in learning environment
CN111783674A (en) Face recognition method and system based on AR glasses
CN109992172B (en) Information processing method and equipment
US20210158031A1 (en) Gesture Recognition Method, and Electronic Device and Storage Medium
CN111079503B (en) Character recognition method and electronic equipment
CN111077993B (en) Learning scene switching method, electronic equipment and storage medium
CN109903054B (en) Operation confirmation method and device, electronic equipment and storage medium
CN111639209A (en) Book content searching method, terminal device and storage medium
CN112309389A (en) Information interaction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant