CN112908324B - Data processing method, device, equipment and system - Google Patents

Data processing method, device, equipment and system

Info

Publication number
CN112908324B
CN112908324B (application CN202110069463.3A)
Authority
CN
China
Prior art keywords
information
operation step
detection result
detection
voice
Prior art date
Legal status
Active
Application number
CN202110069463.3A
Other languages
Chinese (zh)
Other versions
CN112908324A (en)
Inventor
Yong Jun (雍军)
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202110069463.3A priority Critical patent/CN112908324B/en
Publication of CN112908324A publication Critical patent/CN112908324A/en
Application granted granted Critical
Publication of CN112908324B publication Critical patent/CN112908324B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/26 - Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiment of the application provides a data processing method, apparatus, device and system, wherein the method comprises: acquiring a preset flow, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, and the operation information comprises at least one of the following: an operation object, an operation mode and an operation voice; acquiring detection information corresponding to the operation step, wherein the detection information is collected by a sensor on a first object and comprises voice information or at least one image; and determining a detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step, wherein the detection result indicates whether the operation step of the first object is correct or incorrect. Because no large amount of human resources needs to be consumed, the efficiency of judging the operation steps is improved.

Description

Data processing method, device, equipment and system
Technical Field
The embodiment of the application relates to the technical field of engineering, in particular to a data processing method, a device, equipment and a system.
Background
A flow specification mainly specifies the scope, content, procedure and processing method of each management service. In some technical fields, strict requirements are imposed on flow specifications, so the operations of workers require corresponding process supervision.
The current scheme for supervising flow specifications mainly relies on manual on-site supervision. When a worker goes out to a site, several supervisors of different types need to be dispatched to monitor the worker's working process and judge whether the worker complies with the relevant specifications when executing each workflow, so as to avoid omissions, errors and similar situations.
Manual on-site supervision consumes a great deal of human resources and is inefficient.
Disclosure of Invention
The embodiments of the application provide a data processing method, apparatus, device and system, which are used to solve the problem that manually supervising workflow specifications on site consumes a great deal of human resources.
In a first aspect, an embodiment of the present application provides a data processing method, including:
acquiring a preset flow, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, and the operation information comprises at least one of the following: an operation object, an operation mode and an operation voice;
acquiring detection information corresponding to the operation step, wherein the detection information is collected by a sensor on a first object and comprises voice information or at least one image;
and determining a detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step, wherein the detection result is used for indicating whether the operation step of the first object is correct or incorrect.
In one possible implementation manner, determining the detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step includes:
determining contents included in operation information corresponding to any one operation step of the at least one operation step;
And carrying out matching processing on the operation information and detection information corresponding to the operation step according to the content included in the operation information to obtain the detection result.
In a possible implementation manner, the operation information includes the operation voice; according to the content included in the operation information, performing matching processing on the operation information and detection information corresponding to the operation step to obtain the detection result, including:
identifying the voice information in the detection information to obtain first information;
acquiring second information corresponding to the operation voice;
If the first information comprises the second information, determining that the detection result is a first detection result, wherein the first detection result is used for indicating that the operation step of the first object is correct;
and if the first information does not include the second information, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is wrong.
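This containment check can be sketched as follows (an illustrative sketch only, not the claimed implementation; in practice the first information would be produced by a speech-recognition engine applied to the detected voice information):

```python
def match_operation_voice(first_info: str, second_info: str) -> str:
    """Compare recognized speech against the prescribed operation voice.

    first_info:  text recognized from the first object's detected speech
    second_info: text corresponding to the prescribed operation voice
    Returns "correct" (first detection result) when the recognized text
    contains the expected text, otherwise "error" (second detection result).
    """
    return "correct" if second_info in first_info else "error"

# Example: the operation step requires the worker to call out "switch off"
assert match_operation_voice("I will now switch off the breaker", "switch off") == "correct"
assert match_operation_voice("proceeding to the next device", "switch off") == "error"
```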
In a possible implementation manner, the operation information includes the operation object and an operation mode; according to the content included in the operation information, performing matching processing on the operation information and detection information corresponding to the operation step to obtain the detection result, including:
performing image recognition on an image set in the detection information to obtain at least one object, wherein the image set comprises at least one image;
if the at least one object comprises the operation object, determining a first mode of operating the operation object by the first object according to the image set, and determining the detection result according to the first mode and the operation mode;
and if the at least one object does not comprise the operation object, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is incorrect.
In one possible implementation, determining a first manner in which the first object operates the operation object according to the image set includes:
respectively performing image recognition on each image in the image set to obtain object information corresponding to each image, wherein the object information comprises at least one of the following: the position relation between the operation object and a preset object, or the position of the operation object in the image;
And determining the first mode according to the object information corresponding to each image, wherein the first mode comprises a first position relation between the operation object and the preset object and/or a first motion track of the operation object.
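A non-authoritative sketch of assembling the first mode from per-image recognition results (the helper names and the x-coordinate heuristic are assumptions, not taken from the patent): the ordered object positions form the first motion track, and a coarse position relation to the preset object can be judged per image.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) centre of a detected object

def first_motion_track(per_image_positions: List[Point]) -> List[Point]:
    # The first motion track is the ordered sequence of positions that
    # image recognition reports for the operation object, one per image.
    return list(per_image_positions)

def first_position_relation(obj: Point, preset: Point) -> str:
    # Coarse first position relation between the operation object and the
    # preset object, judged here from x-coordinates only (an assumption).
    return "left-of" if obj[0] < preset[0] else "right-of"

assert first_motion_track([(0, 0), (1, 2)]) == [(0, 0), (1, 2)]
assert first_position_relation((10.0, 5.0), (20.0, 5.0)) == "left-of"
```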
In a possible implementation manner, the operation mode includes a second position relation between the operation object and the preset object and/or a second motion track of the operation object; according to the first mode and the operation mode, determining the detection result includes:
matching the first position relation with the second position relation to obtain a first matching result, and obtaining the detection result according to the first matching result; or
matching the first motion track with the second motion track to obtain a second matching result, and obtaining the detection result according to the second matching result; or
Matching the first position relation with the second position relation to obtain a first matching result; matching the first motion trail with the second motion trail to obtain a second matching result; and obtaining the detection result according to the first matching result and the second matching result.
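The three matching branches above can be sketched together in one function (an illustrative sketch with an assumed point-wise tolerance for track matching; the patent does not prescribe a concrete matching metric):

```python
def detection_result(first_rel=None, second_rel=None,
                     first_track=None, second_track=None,
                     tol: float = 1.0) -> str:
    """Combine the available matching results, mirroring the three branches:
    relation only, track only, or both must match for a 'correct' result."""
    results = []
    if first_rel is not None and second_rel is not None:
        results.append(first_rel == second_rel)          # first matching result
    if first_track is not None and second_track is not None:
        # Second matching result: point-wise distance within a tolerance.
        ok = len(first_track) == len(second_track) and all(
            abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol
            for a, b in zip(first_track, second_track))
        results.append(ok)
    return "correct" if results and all(results) else "error"

assert detection_result(first_rel="left-of", second_rel="left-of") == "correct"
assert detection_result(first_track=[(0, 0), (1, 1)],
                        second_track=[(0, 0), (1.5, 1)]) == "correct"
assert detection_result(first_rel="left-of", second_rel="right-of") == "error"
```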
In one possible implementation, the obtaining the preset flow includes:
Acquiring a control instruction, wherein the control instruction is a key control instruction or a voice control instruction;
and acquiring the preset flow according to the control instruction.
In one possible embodiment, the sensor is a head-mounted device disposed on the head of the first object, the head-mounted device including a sound pickup and a camera; acquiring the detection information corresponding to the operation step includes:
acquiring an audio stream from the sound pickup, and acquiring the voice information corresponding to the operation step according to the audio stream and the period corresponding to the operation step; or
acquiring a video stream from the camera, and acquiring the at least one image corresponding to the operation step according to the video stream and the period corresponding to the operation step.
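Selecting the portion of a stream that belongs to one operation step's period can be sketched as follows (an assumption-laden sketch: frames are assumed to arrive at a fixed rate and the period is given in seconds; real audio/video APIs expose timestamps differently):

```python
def segment_for_step(frames, step_period, fps):
    """Select the frames (or audio samples) falling inside the time period
    [start, end) that the preset flow assigns to one operation step."""
    start, end = step_period  # seconds
    return [f for i, f in enumerate(frames)
            if start <= i / fps < end]

# 10 frames captured at 2 fps; the step's period covers seconds [1, 3)
frames = list(range(10))
assert segment_for_step(frames, (1, 3), fps=2) == [2, 3, 4, 5]
```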
In a possible implementation manner, for any one operation step, before obtaining the detection information corresponding to the operation step, the method further includes:
and sending a first voice instruction to the loudspeaker, wherein the first voice instruction is used for controlling the loudspeaker to broadcast the operation information corresponding to the operation step.
In one possible implementation manner, for any one operation step, after determining a detection result corresponding to the operation step, the method further includes:
and sending a second voice instruction to the loudspeaker, wherein the second voice instruction is used for controlling the loudspeaker to broadcast the detection result corresponding to the operation step.
In one possible embodiment, the method further comprises:
acquiring a first video, wherein the first video is obtained by shooting an operation executed by the first object in each operation step;
and marking the first video according to the detection result corresponding to each operation step to generate a second video.
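Marking the first video per operation step can be sketched as follows (a hypothetical representation: the second video is modeled here as frame/label pairs rather than an actual encoded video file):

```python
def mark_video(frame_steps, results):
    """Annotate each frame of the first video with the detection result of
    the operation step it belongs to, producing the 'second video'.

    frame_steps: list with one operation-step index per frame
    results:     detection result per step ("correct" / "error")
    """
    return [(i, results[s]) for i, s in enumerate(frame_steps)]

# Frames 0-1 belong to step 0 (correct); frame 2 belongs to step 1 (error).
assert mark_video([0, 0, 1], ["correct", "error"]) == [
    (0, "correct"), (1, "correct"), (2, "error")]
```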
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
The first acquisition module is used for acquiring a preset flow, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, and the operation information comprises at least one of the following: an operation object, an operation mode and an operation voice;
The second acquisition module is used for acquiring detection information corresponding to the operation steps, wherein the detection information is acquired by a sensor on the first object and comprises voice information or at least one image;
The processing module is used for determining a detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step, wherein the detection result is used for indicating that the operation step of the first object is correct or incorrect.
In a possible implementation manner, the processing module is specifically configured to:
determining contents included in operation information corresponding to any one operation step of the at least one operation step;
And carrying out matching processing on the operation information and detection information corresponding to the operation step according to the content included in the operation information to obtain the detection result.
In a possible implementation manner, the operation information includes the operation voice; the processing module is specifically configured to:
identifying the voice information in the detection information to obtain first information;
acquiring second information corresponding to the operation voice;
If the first information comprises the second information, determining that the detection result is a first detection result, wherein the first detection result is used for indicating that the operation step of the first object is correct;
and if the first information does not include the second information, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is wrong.
In a possible implementation manner, the operation information includes the operation object and an operation mode; the processing module is specifically configured to:
Performing image recognition on the image set in the detection information to obtain at least one object, wherein the image set comprises at least one image;
if the at least one object comprises the operation object, determining a first mode of operating the operation object by the first object according to the image set, and determining the detection result according to the first mode and the operation mode;
and if the at least one object does not comprise the operation object, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is incorrect.
In a possible implementation manner, the processing module is specifically configured to:
respectively performing image recognition on each image in the image set to obtain object information corresponding to each image, wherein the object information comprises at least one of the following: the position relation between the operation object and a preset object, or the position of the operation object in the image;
And determining the first mode according to the object information corresponding to each image, wherein the first mode comprises a first position relation between the operation object and the preset object and/or a first motion track of the operation object.
In a possible implementation manner, the operation mode includes a second position relation between the operation object and the preset object and/or a second motion track of the operation object; the processing module is specifically configured to:
matching the first position relation with the second position relation to obtain a first matching result, and obtaining the detection result according to the first matching result; or
matching the first motion track with the second motion track to obtain a second matching result, and obtaining the detection result according to the second matching result; or
Matching the first position relation with the second position relation to obtain a first matching result; matching the first motion trail with the second motion trail to obtain a second matching result; and obtaining the detection result according to the first matching result and the second matching result.
In one possible implementation manner, the first obtaining module is specifically configured to:
Acquiring a control instruction, wherein the control instruction is a key control instruction or a voice control instruction;
and acquiring the preset flow according to the control instruction.
In one possible embodiment, the sensor is a head-mounted device disposed on the head of the first object, the head-mounted device including a sound pickup and a camera; the second acquisition module is specifically configured to:
acquire an audio stream from the sound pickup, and acquire the voice information corresponding to the operation step according to the audio stream and the period corresponding to the operation step; or
acquire a video stream from the camera, and acquire the at least one image corresponding to the operation step according to the video stream and the period corresponding to the operation step.
In a possible implementation manner, for any operation step, the processing module is further configured to, before acquiring detection information corresponding to the operation step:
and sending a first voice instruction to the loudspeaker, wherein the first voice instruction is used for controlling the loudspeaker to broadcast the operation information corresponding to the operation step.
In a possible implementation manner, for any one operation step, the processing module is further configured to, after determining a detection result corresponding to the operation step:
and sending a second voice instruction to the loudspeaker, wherein the second voice instruction is used for controlling the loudspeaker to broadcast the detection result corresponding to the operation step.
In one possible embodiment, the processing module is further configured to:
acquiring a first video, wherein the first video is obtained by shooting an operation executed by the first object in each operation step;
and marking the first video according to the detection result corresponding to each operation step to generate a second video.
In a third aspect, an embodiment of the present application provides a host device, including:
a memory for storing a program;
a processor for executing the program stored in the memory, the processor being for executing the data processing method according to any one of the first aspects when the program is executed.
In a fourth aspect, an embodiment of the present application provides a data processing system, including a head-mounted device and a host device, wherein:
The head-mounted device is disposed on the head of the first object and is used for collecting data on the first object to obtain detection information corresponding to each operation step in a preset flow, and for sending the detection information to the host device;
the host device is configured to acquire the detection information and perform the data processing method according to any one of the first aspects.
According to the data processing method, apparatus, device and system provided by the embodiments of the application, a preset flow is first acquired, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, the operation information comprising at least one of an operation object, an operation mode and an operation voice; from the operation information it can be known how the first object needs to complete each operation step. Then, detection information corresponding to each operation step is acquired, the detection information being collected by a sensor while the first object executes the operation step. According to the operation information corresponding to each operation step and the corresponding detection information, a detection result corresponding to each operation step can be determined, indicating whether each operation step of the first object is correct or incorrect. With the scheme provided by the embodiments of the application, the working process of the first object does not need to be supervised manually: the operation steps of the first object are captured by the sensor and compared with the corresponding operation information to obtain the detection results, so a large amount of human resources is not consumed and the efficiency is high.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below illustrate some embodiments of the present application; other drawings can be derived from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of a workflow test;
FIG. 2 is a schematic flow chart of a data processing method according to an embodiment of the present application;
fig. 3 is a schematic view of an application scenario provided in an embodiment of the present application;
FIG. 4 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of image recognition according to an embodiment of the present application;
FIG. 6 is a schematic diagram of determining a detection result according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 8 is a schematic hardware structure of a host device according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a data processing system according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
First, concepts related to the embodiments of the present application will be explained.
Flow specification: the specification of the flow mainly refers to specifying the range, content, program and processing method of each management service, namely, setting service standards. For example, a railway department has a corresponding operation instruction for each job as a flow specification file.
Object detection: the task of object detection is to find all objects (objects) of interest in an image, and to determine their position and size, which is one of the core problems in the field of machine vision.
Neural network model: a mathematical model that simulates the behavioural characteristics of biological neural networks and performs distributed parallel information processing. Such a network relies on the complexity of the system and processes information by adjusting the interconnections among a large number of nodes. Hardware units for accelerating neural network models include, but are not limited to, a neural network inference engine (NNIE), a graphics processing unit (GPU), a central processing unit (CPU), and the like.
Fig. 1 is a schematic diagram of workflow detection. As shown in Fig. 1, a worker 11 needs to go out to a site to complete a corresponding workflow. In some industries there are stringent requirements on workers' workflows, requiring each operation step to be performed strictly as specified, for example the maintenance of railways in the railroad industry or the overhaul of aircraft in the aviation industry.
In the example of Fig. 1, the workflow that the worker 11 needs to execute has strict requirements, so when the worker 11 executes the workflow, the worker 11 also needs to be supervised to determine whether each operation step executed by the worker 11 is correct and meets the specification.
There are two current approaches to supervising the worker 11. The first is manual on-site supervision. Specifically, as shown in Fig. 1, when the worker 11 goes out, a plurality of supervisors of different types, such as supervisors 12 and 13 in Fig. 1, may be dispatched simultaneously to monitor the work process. Supervisors 12 and 13 judge whether the operations of the worker 11 meet the flow specification according to the operation steps they observe, thereby avoiding omissions, errors and similar situations.
In this scheme, every time a worker goes out, dispatching supervisors to monitor the worker's operation steps consumes a large amount of human resources. Moreover, the judgment standards of different supervisors are inconsistent, so different supervisors may reach different conclusions about the same operation, and the efficiency is low.
In the second scheme, as shown in Fig. 1, the working process of the worker 11 is recorded by a recorder 14. After the worker 11 finishes work, the recorded video is uploaded, and the organizer reviews the video to find the places where the operations of the worker 11 did not comply with the specification and gives feedback.
This scheme judges the normalization of the worker's operations only by reviewing the video after the fact, so for operation steps that require on-site reminders there are certain potential safety hazards. Meanwhile, a worker can shield the recorder during work, so the video cannot effectively cover the whole process, and some non-compliant operation steps cannot be detected.
Based on the above, the embodiment of the application provides a data processing method to realize effective supervision of workflow normalization of workers.
Fig. 2 is a flow chart of a data processing method according to an embodiment of the present application, as shown in fig. 2, the method may include:
s21, acquiring a preset flow, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, and the operation information comprises at least one of the following steps: an operation object, an operation mode and an operation voice.
The execution subject of the data processing method provided by the embodiment of the application is a host device, which is a device with certain data processing capability; for example, the host device may be a processor, a server, or a device comprising a processor or a server.
The preset flow may be a specification file indicating a workflow, and the host device may acquire the preset flow, where the preset flow includes at least one operation step and operation information corresponding to each operation step, where the operation information indicates what operation needs to be performed. The operation information includes at least one of an operation object, an operation mode, and an operation voice. The operation object is an object that a worker needs to operate, and the operation object may be a specific device. The operation mode is an operation that a worker needs to perform on an operation object. The operation voice is a voice instruction which is required to be sent by a worker.
S22, detection information corresponding to the operation steps is acquired, wherein the detection information is acquired by a sensor on the first object and comprises voice information or at least one image.
The preset flow comprises at least one operation step, and corresponding detection information can be obtained for each operation step to detect. For any one of the operation steps, the sensor may collect detection information when the first object performs the operation step, and then transmit the detection information to the host device.
The detection information may include voice information, and the sensor may be a device having a recording function. The detection information may include at least one image, and at this time, the sensor may be a device having an image capturing function, and the sensor may directly capture an image, or may capture a video, and obtain at least one image according to the video.
S23, determining a detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step, wherein the detection result is used for indicating that the operation step of the first object is correct or incorrect.
For any one of at least one operation step in a preset flow, firstly, operation information and corresponding detection information corresponding to the operation step are obtained. The operation information is defined in a preset flow and indicates what operation needs to be performed by the first object. The detection information is acquired by collecting the operation steps actually executed by the first object.
After the detection information is acquired, the detection result of the first object in the operation step can be judged according to the detection information and the operation information, and whether the operation step of the first object is correct or incorrect can be obtained.
According to the data processing method provided by the embodiment of the application, a preset flow is first acquired, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, the operation information comprising at least one of an operation object, an operation mode and an operation voice; from the operation information it can be known how each operation step needs to be completed. Then, detection information corresponding to the operation steps is acquired, the detection information being collected by the sensor while the first object executes the operation steps. According to the operation information corresponding to the operation steps and the corresponding detection information, a detection result corresponding to each operation step can be determined, indicating whether the operation step of the first object is correct or incorrect. With the scheme provided by the embodiment of the application, the working process of the first object does not need to be supervised manually: the operation steps of the first object are captured by the sensor and compared with the corresponding operation information to obtain the detection results, so a large amount of human resources is not consumed and the efficiency is high.
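The flow of S21 to S23 can be sketched end to end as follows (hypothetical `collect` and `judge` hooks standing in for the sensor acquisition and the matching logic; the names are illustrative, not from the patent):

```python
def run_flow(preset_flow, collect, judge):
    """End-to-end sketch of steps S21-S23: for every operation step in the
    preset flow, collect the sensor's detection information and judge it
    against the step's operation information."""
    results = {}
    for step, op_info in preset_flow:                    # S21: preset flow
        detection_info = collect(step)                   # S22: sensor data
        results[step] = judge(op_info, detection_info)   # S23: correct/error
    return results

# Toy example: operation voices checked by containment.
flow = [("step1", "switch off"), ("step2", "lock handle")]
collect = lambda s: {"step1": "now switch off", "step2": "done"}[s]
judge = lambda op, det: "correct" if op in det else "error"
assert run_flow(flow, collect, judge) == {"step1": "correct", "step2": "error"}
```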
The following will describe the embodiments of the present application in detail with reference to the accompanying drawings.
Fig. 3 is a schematic view of an application scenario provided in an embodiment of the present application. As shown in fig. 3, the scenario includes a first object 31, a head-mounted device 32 disposed on the head of the first object 31, and a host device 33.
While the first object 31 is working, the head-mounted device 32 may acquire the detection information corresponding to each operation step performed by the first object and transmit the detection information to the host device 33. The host device 33 determines, based on the detection information, whether each operation step performed by the first object 31 is correct.
The head-mounted device 32 may include a sound pickup 321 and a camera 322, where the sound pickup 321 is used to acquire voice data and the camera 322 is used to acquire video or image data. Because the head-mounted device 32 is disposed on the head of the first object 31, the viewing angle of the camera 322 moves along with the head of the first object 31 and covers the visual range of the first object 31. This ensures that the images acquired by the camera 322 contain the operation target of the first object 31, and avoids the problem that a recorder placed at a fixed position or worn elsewhere on the body leaves part of the operation target outside the picture of the camera 322, so that the whole working process of the first object 31 is guaranteed to be recorded on video. Meanwhile, since the head-mounted device 32 is located on the head, close to the mouth of the first object 31, the voice uttered by the first object 31 can be captured more clearly, avoiding the noise interference that would result from placing the pickup on the body or at other positions.
The host device 33 may be provided separately or may be integrated with the head-mounted device 32 in one device. When the host device 33 is provided separately, it may be disposed on the body of the first object 31 or at a fixed position. When the host device 33 and the head-mounted device 32 are integrated in one device, both are disposed on the head of the first object 31.
Optionally, a loudspeaker 34 and a display screen 35 may further be included. The loudspeaker 34 may be controlled by the host device 33 to perform voice broadcasting, which may be broadcasting the content of an operation step, or a voice alert when the first object 31 performs an operation step incorrectly. The display screen 35 may display images or video sent by the host device so that a worker can review the workflow.
The solution of the application is described below on the basis of the device illustrated in fig. 3.
Fig. 4 is a flow chart of a data processing method according to an embodiment of the present application, as shown in fig. 4, including:
S41, loading the flow specification and obtaining a preset flow.
After the host device is started, the flow specification may be loaded first in order to obtain the preset flow. Loading the flow specification may consist of specifying the scope, content, procedure, and processing method of the business to be managed, generating a corresponding file, and loading the file into the host device.
After the process specification is loaded, a preset process can be acquired. Specifically, the host device may obtain a control instruction, and obtain a preset flow according to the control instruction, where the control instruction may be a key control instruction or a voice control instruction.
After the host device is started, a preset flow can be selected by operating a key on the host device, by voice control, or the like.
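As a minimal sketch of resolving a control instruction to a preset flow: the dictionary-based lookup and the flow representation below are assumptions for illustration, since the embodiment only states that the preset flow is obtained according to a key or voice control instruction.

```python
def select_preset_flow(control_instruction, flows):
    """Resolve a key or voice control instruction to a preset flow.
    `flows` maps an instruction token to a flow definition (a list of
    operation steps); the token format is an assumption."""
    flow = flows.get(control_instruction)
    if flow is None:
        raise KeyError(f"no preset flow bound to {control_instruction!r}")
    return flow

# A preset flow: operation steps with their operation information.
flows = {"tire change": [
    {"step": "unload tire", "operation_object": "tire"},
    {"step": "announce", "operation_voice": "tire removed"},
]}
flow = select_preset_flow("tire change", flows)
```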
S42, starting a judging process.
The preset flow obtained by the host device includes at least one operation step and operation information corresponding to each operation step. The operation information may include an operation object, an operation mode, an operation voice, and the like, where the operation object is the device that the worker needs to operate, the operation mode is the operation that the worker performs on that device, and the operation voice is the password that the worker needs to utter.
When the preset flow includes a plurality of operation steps, the operation steps are judged one at a time in their order, and after the judgment of one operation step is finished, the process switches to the judgment flow of the next operation step.
Optionally, before the judgment flow of each operation step starts, the name of the current operation step and the corresponding working content may be voice-broadcast, where the working content is determined by the operation information corresponding to the operation step. Through the voice broadcast, the worker can clearly know which operation is to be executed in this operation step. For example, in fig. 3, for any operation step, before obtaining the detection information corresponding to the operation step, the host device 33 may send a first voice instruction to the loudspeaker 34, controlling the loudspeaker 34 through the first voice instruction to broadcast the operation information corresponding to the operation step, for example one or more of the operation object, the operation mode, and the operation voice; important content that requires attention in the operation step may also be broadcast.
S43, collecting the process of the operation steps of the first object to obtain detection information.
In S43, the determination of any one operation step in the preset flow is described as an example.
After the judgment flow of the operation step starts, the process of the first object performing the operation step is collected to obtain the detection information.
Alternatively, the device that collects the process of the first object performing the operation step is the head-mounted device 32 illustrated in fig. 3, which is disposed on the head of the first object. Since the detection information may include voice information, the head-mounted device 32 includes a sound pickup for acquiring the voice information of the first object. Since the detection information may also include images, the head-mounted device 32 further includes a camera for acquiring image information of the first object.
Throughout the judgment of the preset flow, the sound pickup continuously acquires the audio stream of the first object and the camera continuously acquires the video stream. For any operation step, its judgment only requires acquiring, according to the period corresponding to the operation step, the voice information of that period from the audio stream or the at least one image of that period from the video stream.
In the case where the operation information corresponding to the operation step includes an operation voice, the detection information corresponding to the operation step includes the voice information acquired by the sound pickup. In the case where the operation information corresponding to the operation step includes an operation object and/or an operation mode, the detection information corresponding to the operation step includes at least one image acquired by the camera.
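The per-step extraction from the continuously recorded streams can be sketched as follows. This is a minimal illustration; the `Frame` structure, its field names, and the example timestamps are assumptions rather than part of the embodiment.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float  # seconds since the judgment flow started (assumed)
    data: bytes       # one encoded audio chunk or image

def slice_stream(stream, start, end):
    """Select the frames of a continuously recorded stream that fall
    inside the period [start, end] of one operation step."""
    return [f for f in stream if start <= f.timestamp <= end]

# The pickup records continuously; each operation step only consumes
# the slice of the stream for its own period.
audio_stream = [Frame(0.5, b"a0"), Frame(3.2, b"a1"), Frame(7.9, b"a2")]
step_audio = slice_stream(audio_stream, 3.0, 8.0)
```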
S44, determining a detection result corresponding to the operation step.
After the detection information is determined, the detection result corresponding to the operation step can be further determined. Specifically, for any operation step, the contents included in the operation information corresponding to the operation step may be determined first; then, according to those contents, the operation information and the detection information corresponding to the operation step are matched to obtain the corresponding detection result.
For example, when the operation information includes an operation object and/or an operation mode, it is necessary to detect which device the first object is operating and in which manner, so the detection information should be at least one image; when the operation information includes an operation voice, it is necessary to detect the voice password that the first object should utter, so the detection information should be voice information. The acquisition of the detection result in these two cases is described below.
In the case where the operation information includes an operation voice, the corresponding detection information should be the voice information acquired by the sound pickup. Since it must be determined whether the first object spoke the required voice password, the voice information of the operation step is first obtained and recognized to obtain first information. Meanwhile, second information corresponding to the operation voice is obtained, and it is judged whether the first information includes the second information. If the first information includes the second information, the first object spoke the required voice password; the detection result can then be determined to be a first detection result, which indicates that the operation step of the first object is correct. If the first information does not include the second information, the first object did not speak the required voice password; the detection result can then be determined to be a second detection result, which indicates that the operation step of the first object is wrong.
The recognition of the voice information may be direct voice recognition, or text recognition based on the voice information, that is, voice-to-text processing is performed on the voice information to obtain first text information corresponding to it. When the recognition is voice recognition, the first information is a piece of voice information, and the second information corresponding to the operation voice is also a piece of voice information. When the recognition is text recognition, the first information is a piece of text information, and so is the second information corresponding to the operation voice.
For example, if the operation voice is "open device A", the operation step requires the first object to issue the voice command "open device A". The voice information in the period corresponding to the operation step is acquired, and it is identified whether this voice information includes the voice command "open device A". The voice information may include other speech as well; as long as the command "open device A" is issued within the period corresponding to the operation step, the first object can be considered to have completed the operation step as required, and the operation step is correct.
Taking text recognition of the voice information as an example, the second information is the text "open device A". When the first information is text such as "open device A" or "now open device A", the first information is considered to include the second information, and the detection result is the first detection result; when the first information is text such as "open" or "open device B", the first information is considered not to include the second information, and the detection result is the second detection result.
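After speech-to-text processing, the inclusion test of the second information in the first information amounts to a substring check, as in this sketch (the function name and return strings are illustrative):

```python
def voice_detection_result(first_info, second_info):
    """The step is correct as long as the second information (the text
    of the required operation voice) appears anywhere in the first
    information (the recognized text), even amid other speech."""
    if second_info in first_info:
        return "first detection result"   # operation step correct
    return "second detection result"      # operation step wrong

# Other speech around the password does not matter:
result = voice_detection_result("all clear, open device A, proceeding",
                                "open device A")
```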
In the case where the operation information includes an operation object and an operation mode, it is necessary to determine whether the first object operates the operation object in the prescribed operation mode. The judgment of such an operation step therefore includes two parts: first, determining whether the first object is operating the operation object; second, determining whether the first object operates it in the corresponding operation mode.
For the first part, it is determined whether the first object is operating the operation object. Because the first object wears the head-mounted device that acquires the images, the acquisition range of the head-mounted device is essentially consistent with the working range of the first object; therefore, it suffices to acquire the images in the period corresponding to the operation step and judge those images.
Specifically, image recognition may be performed on the image set of the period corresponding to the operation step to obtain at least one object, where the image set includes at least one image. Then, it is judged whether the operation object is included in the at least one object. If the at least one object includes the operation object, a first mode in which the first object operates the operation object can be determined from the image set, and the detection result corresponding to the operation step can be determined from the first mode and the operation mode. If the at least one object does not include the operation object, the first object did not operate the operation object at all, and there is no mode of operating it; in this case, the detection result corresponding to the operation step is determined to be the second detection result, which indicates that the operation step of the first object is wrong.
Fig. 5 is a schematic diagram of image recognition according to an embodiment of the present application, in which the operation object is a tire, the operation mode is unloading the tire, and image 51 is an image in the image set of the operation step. The image set may include one or more images; fig. 5 illustrates the case where the image set includes one image 51. As shown in fig. 5, image recognition is first performed on image 51 to obtain at least one object.
Image recognition of image 51 may be accomplished by a neural network model. Beforehand, the neural network model needs to be trained on multiple sets of training samples, where each set includes a sample image and corresponding marking data; the sample image may or may not include the operation object, and the marking data is obtained by marking the operation object in the sample image. If the sample image includes the operation object, the corresponding marking data records information such as the size, shape, position, and name of the operation object; if the sample image does not include the operation object, the corresponding marking data does not include such information.
For any set of training samples, the sample image can be input into the neural network model to obtain an output recognition result, and the parameters of the neural network model are then adjusted according to the difference between the recognition result and the marking data. This processing is performed on every set of training samples, continuously adjusting the neural network model until it converges, yielding the trained neural network model. The trained model has the capability of recognizing the operation object; at this point, inputting image 51 into the trained model allows the operation object to be recognized in it.
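Since the patent does not fix a network architecture, the adjust-until-convergence procedure described above can be illustrated with a deliberately tiny stand-in: a single-parameter model trained by gradient descent on squared error against the marking data. Everything below (the model, the loss, the learning rate) is an assumption for illustration only.

```python
def train(model_params, samples, lr=0.1, epochs=200):
    """Minimal supervised training loop in the spirit of the description:
    for each (sample, label) pair, compute the model output, compare it
    with the marking data, and adjust the parameter to reduce the
    difference. A one-parameter linear 'detector' stands in for the
    neural network model, which the patent does not specify."""
    w, = model_params
    for _ in range(epochs):
        for x, y in samples:
            pred = w * x
            grad = 2 * (pred - y) * x   # d/dw of squared error (pred - y)^2
            w -= lr * grad
    return [w]

# Toy samples whose true relation is y = 2x (an "object present" score).
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, = train([0.0], samples)
```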
After image recognition is performed on image 51, an object 52 may be obtained, where the object 52 is a tire; this indicates that image 51 includes the operation object. If the operation object is not recognized in image 51, it means that the first object did not perform the corresponding operation, and the operation step of the first object is wrong.
If the image set includes a plurality of images, image recognition is performed on each image, so that one or more objects can be obtained. When no image in the image set has the operation object in its recognition result, it indicates that the first object did not execute the corresponding operation, and the operation step of the first object is wrong.
For example, while the first object performs the tire-unloading operation, the tire may be recognized in the corresponding images, and the vehicle may also be recognized; after the first object has finished unloading the tire, the tire may no longer be recognized in the corresponding images because the unloading is complete. That is, the objects recognized in different images may differ. But as long as the first object performs the tire-unloading step, the tire is included in at least one image of the image set. Thus, if no image in the image set includes a tire, it indicates that the first object did not perform the tire-unloading operation.
Alternatively, when image 51 is recognized, the first object may be considered to be operating the operation object as long as the operation object is recognized in image 51; or the position of the operation object in image 51 may further be acquired, and it may be judged whether that position lies within a preset range. If it lies outside the preset range, for example at an edge of image 51, the first object can be considered not to be operating the operation object, and the operation step of the first object is wrong.
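One way to realize this optional "preset range" check is a bounding-box test such as the following sketch; the box format and the 10% edge margin are assumptions, since the embodiment only states that positions outside the preset range (e.g. at the image edge) do not count as operating the object.

```python
def object_in_frame_center(bbox, image_size, margin=0.1):
    """Treat a detection near the image edge as 'not being operated':
    the bounding-box center must fall inside the central region of the
    frame, with margins expressed as a fraction of width/height."""
    (x, y, w, h), (W, H) = bbox, image_size
    cx, cy = x + w / 2, y + h / 2
    return (margin * W <= cx <= (1 - margin) * W and
            margin * H <= cy <= (1 - margin) * H)
```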
In the case where the operation object is included in the images, the corresponding operation mode needs to be judged further. Specifically, image recognition is performed on each image in the image set to obtain object information corresponding to each image, and the first mode in which the first object operates is then determined from the object information corresponding to each image.
The object information may include the positional relationship between the operation object and a preset object, or the position of the operation object in the image; the contents of the object information may differ according to the actual operation. The first mode may be a first positional relationship between the operation object and the preset object, or a first motion track of the operation object, determined according to the actual object information.
The first positional relationship may be a positional range between the operation object and the preset object, for example the distance between their positions; it may also be the relative position between the operation object and the preset object. For example, if the operation step is that a worker wipes a certain device with a cloth, the preset object is the cloth, the object information is the positional relationship between the device and the cloth, and the corresponding first mode is the first positional relationship between the device and the cloth. Here the first positional relationship is the relative position between the operation object and the preset object, namely that the cloth should be on the device.
The first motion track is a track determined from the position of the operation object recognized in each image of the image set and the shooting time of each image. For example, if the operation step is a worker unloading a tire, the object information is the position of the tire in each image, and the first mode is the first motion track of the tire.
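Deriving the first motion track from the per-image recognition results can be sketched as below, assuming each recognition yields a shooting time and an (x, y) position for the operation object.

```python
def build_trajectory(detections):
    """Order the per-image positions of the operation object by shooting
    time to obtain the first motion track described above. Each
    detection is a (shooting_time, (x, y)) pair."""
    return [pos for _, pos in sorted(detections)]

# Frames may arrive unordered; the track follows shooting time.
frames = [(2.0, (30, 12)), (0.0, (10, 10)), (1.0, (20, 11))]
track = build_trajectory(frames)
```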
After the first mode is determined, the detection result may be determined according to the first mode and the operation mode. The operation mode comprises a second position relation between the operation object and the preset object and/or a second motion track of the operation object.
Specifically, if the operation mode includes the second position relationship, the first position relationship and the second position relationship may be subjected to matching processing to obtain a first matching result, and the detection result is obtained according to the first matching result.
Or if the operation mode comprises the second motion trail, the first motion trail and the second motion trail can be subjected to matching processing to obtain a second matching result, and the detection result is obtained according to the second matching result.
Or if the operation mode comprises the second position relation and the second motion track, the first position relation and the second position relation can be subjected to matching processing to obtain a first matching result, the first motion track and the second motion track are subjected to matching processing to obtain a second matching result, and the detection result is obtained according to the first matching result and the second matching result.
The following description takes as an example the case where the first mode is a first motion track of the operation object and the operation mode includes a second motion track of the operation object. Fig. 6 is a schematic diagram of a judgment result provided by an embodiment of the present application; as shown in fig. 6, the operation object is a tire, and the operation step performed is unloading the tire.
Image recognition may be performed on the multiple images to identify the position of the tire in each image, and the first motion track of the tire, such as the track 61 illustrated in fig. 6, is obtained from the chronological order of the images.
The tire is to be unloaded from one position to another, and the second motion track is the track 62, around which a track range can be set. If the track 61 lies within this track range, the detection result is that the operation step of the first object is correct; otherwise, the operation step of the first object is considered wrong. In fig. 6, the track 61 lies within the track range set by the track 62, so the operation step of the first object is considered correct.
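The track-range test just described can be sketched as a per-point tolerance check; a fixed Euclidean tolerance between time-aligned points is one assumed concrete realization of the "track range", which the patent leaves open.

```python
import math

def trajectory_within_range(first_track, second_track, tolerance):
    """Check whether the detected first motion track (track 61) stays
    inside the band defined around the preset second motion track
    (track 62). Tracks are equal-length lists of (x, y) points sampled
    at matching times."""
    return all(math.dist(p, q) <= tolerance
               for p, q in zip(first_track, second_track))
```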
The above embodiments describe how to determine the detection result of an operation step for the different contents that the operation information may include. After the detection result of the operation step is obtained, it may be recorded. Further, after the detection result corresponding to the operation step is determined, the host device may send a second voice instruction to the loudspeaker to control the loudspeaker to broadcast the detection result corresponding to the operation step, so that the first object knows whether its operation is compliant. Optionally, for operation steps that require an on-site reminder, the reminder can be given on site. For example, if the detection result indicates that the operation step of the first object is wrong, the host device may send a voice prompt instruction to the loudspeaker and control the loudspeaker to broadcast a voice alert about the operation step, prompting the first object to correct it in time.
S45, judging whether all operation steps in the preset flow are judged to be finished, if yes, executing S46, and if not, executing S47.
S43-S44 constitute the judgment process for any one operation step. After the judgment process of one operation step is finished, it needs to be judged whether all operation steps in the preset flow have been judged.
S46, storing the detection result corresponding to each operation step.
After all operation steps in the preset flow have been judged, the detection result corresponding to each operation step needs to be stored. Specifically, the host device may acquire a first video from the head-mounted device, where the first video is obtained by shooting the operations performed by the first object in each operation step and records the process of the first object executing the complete workflow.
Then, the first video is marked according to the detection result corresponding to each operation step to generate a second video. In the second video, the detection result of each operation step is marked on the time period of that operation step. After the second video is stored, each operation step of the first object can conveniently be reviewed later, and the basis on which each detection result was determined can be found.
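The marking of the first video with per-step detection results to form the second video can be sketched as metadata annotation; actual overlay rendering or muxing into a video file is omitted, and the tuple layout is an assumption.

```python
def annotate_video(steps):
    """Mark each operation step's time period in the recording with its
    detection result, producing the annotation track of the 'second
    video'. Each step is a (name, start, end, result) tuple."""
    return [{"step": n, "period": (s, e), "result": r}
            for n, s, e, r in steps]

marks = annotate_video([("unload tire", 0.0, 42.5, "correct"),
                        ("announce", 42.5, 50.0, "wrong")])
```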
S47, switching to the judging flow of the next operation step, and executing S43.
If not all operation steps in the preset flow have been judged, the process switches to the judgment flow of the next operation step. The switch may be triggered actively by the first object, for example through a key on the host device or by voice. The switch may also be triggered automatically by the host device: for example, the host device sets a maximum execution duration for a certain operation step, and once this duration is exceeded, the host device automatically switches to the judgment flow of the next operation step regardless of the detection result of the first object executing that operation step.
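The two switching triggers described above (an active trigger by the first object, or an automatic timeout by the host device) can be combined as in this sketch; the parameter names are illustrative.

```python
def should_switch(step_started_at, now, max_duration, user_triggered):
    """Switch to the next operation step's judgment flow either on an
    active trigger (key press / voice) or automatically once the step's
    maximum execution duration has elapsed, whatever the detection
    result. Times are in seconds."""
    return user_triggered or (now - step_started_at) >= max_duration
```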
According to the data processing method provided by the embodiment of the application, a preset flow is first obtained. The preset flow includes at least one operation step and operation information corresponding to each operation step, and the operation information includes at least one of an operation object, an operation mode, and an operation voice, so that how each operation step needs to be completed can be known from the operation information. Then, detection information corresponding to the operation step is obtained, the detection information being collected by a sensor while the first object executes the operation step. According to the operation information corresponding to the operation step and the corresponding detection information, a detection result corresponding to the operation step can be determined, thereby establishing whether the first object performed the operation step correctly or incorrectly. With the scheme provided by the embodiment of the application, the working process of the first object does not need to be supervised manually: the operation steps of the first object are captured by the sensor and compared with the corresponding operation information to obtain the detection result, so that a large amount of human resources is not consumed and the efficiency is high.
Fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, as shown in fig. 7, including:
The first obtaining module 71 is configured to obtain a preset flow, where the preset flow includes at least one operation step and operation information corresponding to each operation step, and the operation information includes at least one of the following: an operation object, an operation mode and an operation voice;
A second obtaining module 72, configured to obtain detection information corresponding to the operation step, where the detection information is obtained by a sensor collecting data on the first object, and the detection information includes voice information or at least one image;
The processing module 73 is configured to determine a detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step, where the detection result is used to indicate that the operation step of the first object is correct or incorrect.
In one possible implementation, the processing module 73 is specifically configured to:
determining contents included in operation information corresponding to any one operation step of the at least one operation step;
And carrying out matching processing on the operation information and detection information corresponding to the operation step according to the content included in the operation information to obtain the detection result.
In a possible implementation manner, the operation information includes the operation voice; the processing module 73 is specifically configured to:
identifying the voice information in the detection information to obtain first information;
acquiring second information corresponding to the operation voice;
If the first information comprises the second information, determining that the detection result is a first detection result, wherein the first detection result is used for indicating that the operation step of the first object is correct;
and if the first information does not include the second information, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is wrong.
In a possible implementation manner, the operation information includes the operation object and an operation mode; the processing module 73 is specifically configured to:
performing image recognition on an image set in the detection information to obtain at least one object, wherein the image set comprises at least one image;
if the at least one object comprises the operation object, determining a first mode of operating the operation object by the first object according to the image set, and determining the detection result according to the first mode and the operation mode;
And if the at least one object does not comprise the operation object, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is wrong.
In one possible implementation, the processing module 73 is specifically configured to:
respectively carrying out image recognition on each image in the image set to obtain object information corresponding to each image, wherein the object information comprises at least one of the following: the position relation between the operation object and a preset object, or the position of the operation object in the image;
And determining the first mode according to the object information corresponding to each image, wherein the first mode comprises a first position relation between the operation object and the preset object and/or a first motion track of the operation object.
In a possible implementation manner, the operation mode includes a second position relation between the operation object and the preset object and/or a second motion track of the operation object; the processing module 73 is specifically configured to:
matching the first position relation with the second position relation to obtain a first matching result, and obtaining the detection result according to the first matching result; or alternatively
Matching the first motion track with the second motion track to obtain a second matching result, and obtaining the detection result according to the second matching result; or alternatively
Matching the first position relation with the second position relation to obtain a first matching result; matching the first motion trail with the second motion trail to obtain a second matching result; and obtaining the detection result according to the first matching result and the second matching result.
In one possible implementation manner, the first obtaining module 71 is specifically configured to:
Acquiring a control instruction, wherein the control instruction is a key control instruction or a voice control instruction;
and acquiring the preset flow according to the control instruction.
In one possible implementation, the sensor is a head-mounted device disposed on the head of the first object, the head-mounted device including a sound pickup and a camera; the second obtaining module 72 is specifically configured to:
acquiring an audio stream from the sound pickup, and obtaining voice information corresponding to an operation step according to the audio stream and a period corresponding to the operation step; or
acquiring a video stream from the camera, and obtaining at least one image corresponding to the operation step according to the video stream and the period corresponding to the operation step.
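The per-step slicing described above can be sketched as follows. Representing the continuously captured stream as timestamped items and the operation step as a half-open time period is a simplifying assumption for demonstration.

```python
# Sketch: slice a continuously captured audio or video stream into the
# segment belonging to one operation step, using the step's time period.

def slice_stream(frames, period):
    """frames: list of (timestamp, payload); period: (start, end)."""
    start, end = period
    # Keep every item whose timestamp falls inside [start, end).
    return [payload for ts, payload in frames if start <= ts < end]

video_stream = [(0.0, "img0"), (1.0, "img1"), (2.5, "img2"), (4.0, "img3")]
step_period = (1.0, 3.0)          # period assigned to one operation step
step_images = slice_stream(video_stream, step_period)
```

The same helper applies unchanged to audio chunks, since only the timestamps matter.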
In a possible implementation manner, for any operation step, the processing module 73 is further configured to, before acquiring detection information corresponding to the operation step:
and sending a first voice instruction to the loudspeaker, wherein the first voice instruction is used for controlling the loudspeaker to broadcast the operation information corresponding to the operation step.
In a possible implementation manner, for any operation step, the processing module 73 is further configured to, after determining a detection result corresponding to the operation step:
and sending a second voice instruction to the loudspeaker, wherein the second voice instruction is used for controlling the loudspeaker to broadcast the detection result corresponding to the operation step.
In one possible implementation, the processing module 73 is further configured to:
acquiring a first video, wherein the first video is obtained by shooting the operations executed by the first object in each operation step;
and marking the first video according to the detection result corresponding to each operation step to generate a second video.
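A minimal sketch of marking the first video with per-step detection results to generate the second video might be the following. Tagging each frame with a step identifier and attaching a text label, instead of drawing a real overlay (e.g. with OpenCV), are simplifying assumptions.

```python
# Sketch: annotate the recorded "first video" with each operation step's
# detection result to produce a "second video".

def mark_video(frames, step_results):
    """frames: list of (step_id, frame); step_results: {step_id: result}."""
    marked = []
    for step_id, frame in frames:
        label = step_results.get(step_id, "unknown")
        marked.append((step_id, frame, label))  # attach the result label
    return marked

first_video = [(1, "f0"), (1, "f1"), (2, "f2")]
results = {1: "correct", 2: "error"}
second_video = mark_video(first_video, results)
```

In practice the label would be rendered onto each frame and the frames re-encoded into a new video file.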
The device provided by the embodiment of the application can be used for executing the technical scheme of the embodiment of the method, and the implementation principle and the technical effect are similar, and are not repeated here.
Fig. 8 is a schematic diagram of the hardware structure of a host device according to an embodiment of the present application. As shown in Fig. 8, the host device includes: at least one processor 81 and a memory 82, where the processor 81 and the memory 82 are connected by a bus 83.
Optionally, the host device further comprises a communication component. For example, the communication component may include a receiver and/or a transmitter.
In a specific implementation, at least one processor 81 executes computer-executable instructions stored in the memory 82, so that the at least one processor 81 performs the data processing method as described above.
The specific implementation process of the processor 81 can be referred to the above method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.
In the embodiment shown in Fig. 8, it should be understood that the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
The memory may comprise high-speed RAM, and may further comprise non-volatile memory (NVM), such as at least one magnetic disk memory.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus in the drawings of the present application is not limited to only one bus or one type of bus.
Fig. 9 is a schematic structural diagram of a data processing system according to an embodiment of the present application. As shown in Fig. 9, the system includes a head-mounted device 91 and a host device 92, where:
The head-mounted device 91 is disposed on the head of a first object, and is configured to collect data on the first object to obtain detection information corresponding to each operation step in a preset flow, and to send the detection information to the host device;
The host device 92 is configured to acquire the detection information and perform the data processing method as described above.
The present application also provides a computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, implement the data processing method as described above.
The computer readable storage medium described above may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. A readable storage medium can be any available medium that can be accessed by a general purpose or special purpose computer.
An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. In the alternative, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (ASIC). Alternatively, the processor and the readable storage medium may reside as discrete components in a device.
The division of the units is merely a division by logical function; in actual implementation, there may be other division manners. For example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between components may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be in electrical, mechanical, or other form.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, essentially or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (13)

1. A method of data processing, comprising:
acquiring a preset flow, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, and the operation information comprises at least one of the following: an operation object, an operation mode, and an operation voice;
Acquiring detection information corresponding to the operation steps, wherein the detection information is acquired by a sensor on a first object and comprises voice information or at least one image;
Determining a detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step, wherein the detection result is used for indicating that the operation step of the first object is correct or incorrect;
The sensor is a head-mounted device arranged on the head of the first object, the head-mounted device comprises a sound pickup and a camera, and the head-mounted device moves along with the head of the first object and covers the visual range of the first object; during the judging process of the preset flow, the sound pickup continuously acquires the audio stream of the first object, and the camera continuously acquires the video stream of the first object; the obtaining the detection information corresponding to the operation step includes:
acquiring the audio stream from the sound pickup, and obtaining voice information corresponding to the operation step according to the audio stream and a period corresponding to the operation step; or
acquiring the video stream from the camera, and obtaining at least one image corresponding to the operation step according to the video stream and the period corresponding to the operation step.
2. The method of claim 1, wherein determining the detection result corresponding to the operation step based on the operation information and the detection information corresponding to the operation step comprises:
determining contents included in operation information corresponding to any one operation step of the at least one operation step;
And carrying out matching processing on the operation information and detection information corresponding to the operation step according to the content included in the operation information to obtain the detection result.
3. The method according to claim 2, wherein the operation information includes the operation voice; according to the content included in the operation information, performing matching processing on the operation information and detection information corresponding to the operation step to obtain the detection result, including:
identifying the voice information in the detection information to obtain first information;
acquiring second information corresponding to the operation voice;
If the first information comprises the second information, determining that the detection result is a first detection result, wherein the first detection result is used for indicating that the operation step of the first object is correct;
and if the first information does not include the second information, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is wrong.
5. The method according to claim 2, wherein the operation information includes the operation object and the operation mode; and performing, according to the content included in the operation information, matching processing on the operation information and the detection information corresponding to the operation step to obtain the detection result includes:
performing image recognition on an image set in the detection information to obtain at least one object, wherein the image set comprises at least one image;
if the at least one object comprises the operation object, determining a first mode of operating the operation object by the first object according to the image set, and determining the detection result according to the first mode and the operation mode;
And if the at least one object does not comprise the operation object, determining that the detection result is a second detection result, wherein the second detection result is used for indicating that the operation step of the first object is wrong.
5. The method of claim 4, wherein determining a first manner in which the first object operates the operation object from the set of images comprises:
respectively performing image recognition on each image in the image set to obtain object information corresponding to each image, wherein the object information comprises at least one of the following: a position relation between the operation object and a preset object, or a position of the operation object in the image;
And determining the first mode according to the object information corresponding to each image, wherein the first mode comprises a first position relation between the operation object and the preset object and/or a first motion track of the operation object.
6. The method according to claim 5, wherein the operation mode includes a second positional relationship between the operation object and the preset object and/or a second movement trace of the operation object; according to the first mode and the operation mode, determining the detection result includes:
matching the first position relation with the second position relation to obtain a first matching result, and obtaining the detection result according to the first matching result; or
matching the first motion track with the second motion track to obtain a second matching result, and obtaining the detection result according to the second matching result; or
matching the first position relation with the second position relation to obtain a first matching result, matching the first motion track with the second motion track to obtain a second matching result, and obtaining the detection result according to the first matching result and the second matching result.
7. The method according to any one of claims 1-6, wherein obtaining the preset flow comprises:
Acquiring a control instruction, wherein the control instruction is a key control instruction or a voice control instruction;
and acquiring the preset flow according to the control instruction.
8. The method according to any one of claims 1-6, wherein for any one of the operation steps, before obtaining the detection information corresponding to the operation step, the method further comprises:
and sending a first voice instruction to the loudspeaker, wherein the first voice instruction is used for controlling the loudspeaker to broadcast the operation information corresponding to the operation step.
9. The method according to any one of claims 1-6, wherein for any one of the operation steps, after determining the detection result corresponding to the operation step, the method further comprises:
and sending a second voice instruction to the loudspeaker, wherein the second voice instruction is used for controlling the loudspeaker to broadcast the detection result corresponding to the operation step.
10. The method according to any one of claims 1-6, further comprising:
acquiring a first video, wherein the first video is obtained by shooting an operation executed by the first object in each operation step;
and marking the first video according to the detection result corresponding to each operation step to generate a second video.
11. A data processing apparatus, comprising:
The first acquisition module is used for acquiring a preset flow, wherein the preset flow comprises at least one operation step and operation information corresponding to each operation step, and the operation information comprises at least one of the following: an operation object, an operation mode, and an operation voice;
The second acquisition module is used for acquiring detection information corresponding to the operation steps, wherein the detection information is acquired by a sensor on the first object and comprises voice information or at least one image;
The processing module is used for determining a detection result corresponding to the operation step according to the operation information and the detection information corresponding to the operation step, wherein the detection result is used for indicating that the operation step of the first object is correct or incorrect;
the sensor is a head-mounted device arranged on the head of the first object, the head-mounted device comprises a sound pickup and a camera, and the head-mounted device moves along with the head of the first object and covers the visual range of the first object; during the judging process of the preset flow, the sound pickup continuously acquires the audio stream of the first object, and the camera continuously acquires the video stream of the first object; the second obtaining module is specifically configured to:
acquiring the audio stream from the sound pickup, and obtaining voice information corresponding to the operation step according to the audio stream and a period corresponding to the operation step; or
acquiring the video stream from the camera, and obtaining at least one image corresponding to the operation step according to the video stream and the period corresponding to the operation step.
12. A host device, comprising:
a memory for storing a program;
A processor for executing the program stored in the memory, the processor being for executing the data processing method according to any one of claims 1 to 10 when the program is executed.
13. A data processing system comprising a head-mounted device and a host device, wherein:
The head-mounted device is disposed on the head of a first object, and is configured to collect data on the first object to obtain detection information corresponding to each operation step in a preset flow, and to send the detection information to the host device;
the host device is configured to acquire the detection information and perform the data processing method according to any one of claims 1 to 10.
CN202110069463.3A 2021-01-19 2021-01-19 Data processing method, device, equipment and system Active CN112908324B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110069463.3A CN112908324B (en) 2021-01-19 2021-01-19 Data processing method, device, equipment and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110069463.3A CN112908324B (en) 2021-01-19 2021-01-19 Data processing method, device, equipment and system

Publications (2)

Publication Number Publication Date
CN112908324A CN112908324A (en) 2021-06-04
CN112908324B true CN112908324B (en) 2024-07-30

Family

ID=76115772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110069463.3A Active CN112908324B (en) 2021-01-19 2021-01-19 Data processing method, device, equipment and system

Country Status (1)

Country Link
CN (1) CN112908324B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115880963A (en) * 2022-11-24 2023-03-31 中广核核电运营有限公司 Behavior specification training box, behavior specification detection method, device and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN108900000A (en) * 2018-07-20 2018-11-27 北京百度网讯科技有限公司 Electric switching monitoring method, device, equipment and computer-readable medium
CN111554284A (en) * 2020-04-24 2020-08-18 广东电网有限责任公司东莞供电局 Switching operation monitoring method, device, equipment and storage medium

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
JP4850640B2 (en) * 2006-09-06 2012-01-11 公益財団法人鉄道総合技術研究所 Railway equipment maintenance inspection support system and program
CN204302701U (en) * 2014-11-13 2015-04-29 国家电网公司 High-voltage switch gear five-defence block speech logic control device
DE102014017385B4 (en) * 2014-11-24 2016-06-23 Audi Ag Motor vehicle device operation with operator correction
CN207220267U (en) * 2017-07-29 2018-04-13 国网江西省电力公司电力科学研究院 A kind of intelligent safety helmet for being used for power transformation tour and grid switching operation
CN111248572A (en) * 2020-03-19 2020-06-09 广东电网有限责任公司 Detection method based on intelligent safety helmet and intelligent safety helmet
CN111881751A (en) * 2020-06-24 2020-11-03 浙江浙能电力股份有限公司萧山发电厂 Intelligent monitoring system and method for electric operation ticket individual soldiers

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN108900000A (en) * 2018-07-20 2018-11-27 北京百度网讯科技有限公司 Electric switching monitoring method, device, equipment and computer-readable medium
CN111554284A (en) * 2020-04-24 2020-08-18 广东电网有限责任公司东莞供电局 Switching operation monitoring method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112908324A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
CN109300471B (en) Intelligent video monitoring method, device and system for field area integrating sound collection and identification
CN110300285B (en) Panoramic video acquisition method and system based on unmanned platform
CN110659397B (en) Behavior detection method and device, electronic equipment and storage medium
CN110784628A (en) Image data acquisition processing method and system, intelligent camera and server
US20210224752A1 (en) Work support system and work support method
CN112908324B (en) Data processing method, device, equipment and system
CN111581436A (en) Target identification method and device, computer equipment and storage medium
US20210192264A1 (en) Vehicle damage detection method based on image analysis, electronic device and storage medium
CN109298783B (en) Mark monitoring method and device based on expression recognition and electronic equipment
CN112052774B (en) Behavior detection method and device
CN110782905A (en) Positioning method, device and system
CN110807394A (en) Emotion recognition method, test driving experience evaluation method, device, equipment and medium
CN111723767B (en) Image processing method, device and computer storage medium
WO2020195706A1 (en) Recording system and recording method
CN110853364B (en) Data monitoring method and device
JP2019086895A (en) Drive recorder system, vehicle to be used thereof, information collection device, and method
CN111695445A (en) Face recognition method, device, equipment and computer readable storage medium
CN111240923A (en) Automatic test method and device for recurring problems of vehicle navigation system and storage medium
CN114429642B (en) Abnormal behavior identification method of comment expert in remote comment video conference process
CN110647884A (en) Whistling snapshot method and related device
CN114584740A (en) Art remote examination real-time monitoring system and method, storage medium and electronic equipment
CN112419638B (en) Method and device for acquiring alarm video
CN111145558B (en) Illegal behavior identification method based on high-point video monitoring
CN112861816A (en) Abnormal behavior detection method and device
WO2020003480A1 (en) Computer system, repair recording method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant