WO2022146166A1

WO2022146166A1 - Platform for step-by-step augmented reality technical instructions

Info

Publication number: WO2022146166A1
Application number: PCT/RU2020/000785
Authority: WO
Inventors: Дмитрий Анатольевич КУЗЬМЕНКО; Наталья Сергеевна ЛЕВЧЕНКО; Юрий Муратович НАБОКОВ; Егор Алексеевич НАРЫШКИН; Алексей Александрович ОСТРОВЕРХОВ; Тимофей Юрьевич САВИН; Иван Владимирович СОКОЛОВСКИЙ; Сергей Леонидович СОЛЯНИК; Юлия Петровна СОЛЯНИК
Original assignee: Общество С Ограниченной Ответственностью "Спайдер Груп"
Priority date: 2020-12-30
Filing date: 2020-12-30
Publication date: 2022-07-07

Abstract

The invention relates to the field of using artificial intelligence to identify types and models of technical appliances. The technical result of the claimed invention is that of providing faster and better quality visualization of instructions for different technical models in augmented reality (AR), reducing the time taken to search for and output instructions, and obviating the need for assistance from a technical specialist. This technical result is achieved in that a method for displaying instructions for a technical appliance with the aid of AR includes: capturing an image of an appliance; identifying the type and model of said technical appliance; extracting and correlating special points; displaying instructions for the identified model of appliance.

Description

PLATFORM FOR STEP-BY-STEP AUGMENTED REALITY INSTRUCTIONS FOR EQUIPMENT

FIELD OF TECHNOLOGY

The invention relates to the field of computer image analysis, in particular to the use of artificial intelligence (AI) to determine the type and model of a technical object by analyzing the image coming from the camera of the user's device, and to the subsequent demonstration of instructions for that object using augmented reality (AR) technology, with hints visually anchored to the real object.

The presented solution can be used, at least in domestic conditions, when interacting with home appliances, as well as in industrial production when interacting with devices, including those requiring detailed study of regulatory documents.

BACKGROUND OF THE INVENTION

US Pat. No. 9,324,229 B2, April 26, 2016, describes a head-mounted display by which a user looking at a suitably marked instrument panel can be provided with an overlay image with instructions for the operation and maintenance of the equipment directly superimposed on the instrument panel. The head-mounted display provides the user with an augmented view of the object being viewed; a tracking mechanism, such as a camera, repeatedly determines the position and orientation of the head-mounted display relative to the object being viewed; and the computer system provides information for the augmented view and repeatedly updates the augmented view of the object being viewed based on the determined position and orientation of the display. The tracking mechanism determines its position using one or more markers or beacons on the viewed object. Markers can be active or passive, including light-emitting diodes (LEDs) that emit invisible light.

Patent KR 102171691 B1, October 29, 2020, describes an augmented-reality-based method and system for servicing a 3D printer, which includes the steps of: recognizing an auxiliary object of the 3D printer based on previously stored data when the 3D printer is viewed through the camera of a user terminal; forming a virtual frame that includes the auxiliary object; returning camera coordinate information for a certain point of the virtual frame; and outputting computer graphics with information about the recognized auxiliary object, based on the coordinate information, overlaid in augmented reality on the image captured and displayed on the screen of the user terminal.

The closest analogue of the claimed invention is the technical solution disclosed in application WO 2015125066 A1, August 27, 2015, which describes a system that facilitates the maintenance of equipment during field work by mobile technicians. The system includes a server configured to store hardware configurations and service protocols, an equipment maintenance logger that communicates with the server, and a plurality of smart glasses worn by field technicians. The system provides the field technician with interactive instructions that are displayed on the smart glasses to help with field work.

However, this solution lacks the ability to recognize the type and model of equipment using computer technology, and the field technician determines the type and model of equipment manually on site, or performs maintenance on equipment whose type and model is already known and recognition is not required.

In addition, existing analogues rely mainly on a 3D model when creating augmented reality. Using a 3D model implies creating a digital twin of the device. Preparing such a model takes considerable time and money, and it is difficult to keep up to date: any change to the original device, even a cosmetic one, requires the 3D model to be reworked, with the associated time and cost. By contrast, the flat marker used in the claimed invention makes it possible to work with three-dimensional objects without creating a 3D copy of the object, which reduces time and cost. In addition, a flat marker makes it possible to work both with interfaces and with each side of the object (each side uses its own marker).

Also, existing analogues provide solutions, each of which is designed to service only a certain type of equipment. The claimed invention is intended for maintenance of various types of equipment.

The technical problem to be solved by the claimed invention is to create a comprehensive, scalable solution that determines the types and models of various technical objects through image analysis and then demonstrates instructions for the recognized model using augmented reality (AR) technology. In addition to tools for issuing step-by-step instructions, the solution contains tools for creating content and for determining the type and model of an object, which makes it possible to dispense with the assistance of a technical specialist in the service process. It also contains an AI-based tracking algorithm that does not require the creation of a 3D model and, at the same time, makes it possible to work with three-dimensional objects and to track and bind labels on different sides of the object. A further problem to be solved is high-quality identification of the model and high-quality tracking (binding of labels).

SUMMARY OF THE INVENTION

The technical result of the claimed invention is faster and higher-quality visualization of instructions for various models of equipment in AR mode, a reduction in the time taken to search for and display instructions to the end user for various models of equipment, and elimination of the need for a technician to assist the end user in maintaining and operating the equipment. This ultimately simplifies maintenance and operation, speeds up maintenance work, reduces to zero the errors made by the end user when performing maintenance operations and operating the equipment, and contributes to a longer equipment life.

The specified technical result is achieved in that the method for demonstrating instructions for a technical object using augmented reality comprises the following steps: capturing an image of the technical object using the end user device; recognizing the type and model of the technical object based on the current image frame; transferring to the end user device a data set for the recognized model, containing at least a reference image, the coordinates of interface elements, a list of instructions and detailed information for the instructions; extracting at least key points and descriptors for the reference image and the current frame, and matching the key points by their descriptors; building a homography matrix and using it to project the coordinates of the interface elements onto the current frame; and displaying instructions for the recognized model of the technical object on the current frame on the end user device.

The system for demonstrating instructions for a technical object using augmented reality comprises: an end user device containing an image capture means and a means for displaying information to the user; a classification module; a dataset of training images; a tracking module; and an updatable catalog of instructions. These components capture an image of the technical object using the image capture means; recognize the type and model of the technical object from the current image frame by means of the classification module; project the coordinates of the interface elements on the reference image of the recognized model onto the current frame by means of the tracking module; and display instructions for the recognized model using augmented reality, where instruction step labels are linked to the real technical object displayed on the end user's device.

In the system, the end user device can be a smartphone, tablet, or AR glasses.

In the system, the image capture means can be a built-in camera.

In the system, the technical object can be household, professional or industrial equipment.

In the system, the classification module may include an equipment type determination module and an equipment model determination module.

The system can additionally determine the brand of equipment.

In the system, the equipment type determination module and the tracking module can be implemented on the end user device, while the equipment model determination module and the catalog of instructions can be implemented on the server. The system can store images of equipment for training the equipment model determination module in a dataset of training images.

The system can recognize the model of the object of technology using neural networks.

The system can carry out initial training of neural networks based on 10 or more different photographs of a technical object.

In the system, data about the object of technology can come from various sources, at least from the internal product team and from end users.

In the system, each time when receiving data from the user about the successful or unsuccessful definition of the model, additional training of networks can be carried out.

The system can carry out, by means of the tracking module, at least tracking through the detection and short-term tracking of the technical object.

The system may further comprise an instruction creation interface, wherein the instruction catalog is replenished using the instruction creation interface.

The claimed invention provides a system in which it is possible to create and visualize instructions for a scalable catalog of machinery and equipment.

The end user simply points the phone at the equipment, and the system itself determines the type, brand and model of the equipment. The user does not have to work out how to determine the model or, where that is not possible, seek the help of a technical specialist.

If the model is defined, the system immediately gives the user a list of instructions for it, otherwise it offers a manual search in the catalog. After the user selects an instruction, the system displays instructions using AR technology, where instruction step labels are tied to a real object.

In order to simplify the maintenance and operation of equipment that the end user uses in their daily life, for example, at home or at work, the user is shown instructions on a mobile phone using AR technology.

The claimed invention has its own tracking algorithm based on AI technology. Unlike many other solutions, this algorithm does not require the creation of a 3D model and, at the same time, makes it possible to work with volumetric objects and to track and bind labels on different sides of the object. Initially, the side of the object is identified, and an arrow (or text hint) shows the user exactly where to point the camera (at which side of the object).

The system includes the following components:

- neural networks for recognition of objects of technology;

- image dataset for training neural networks;

- interface for creating instructions for equipment models;

- tracking module (AR).

Instructions use photographs of the object (one or several) as a reference for object identification and subsequent demonstration, which greatly simplifies the data preparation process. Using AI algorithms as object detection tools, instead of relying on a narrowly specialized expert to determine the type and model of equipment, reduces the time and effort needed to prepare and maintain the input data for recognizing the type and model of equipment.

To identify models, the solution uses pre-trained neural networks; each time data is received from the user about the successful or unsuccessful determination of a model, the networks can be additionally trained. This process improves the results of object identification.

The initial training of the neural networks is based on at least 10 different photographs of the object. The main goal in terms of model identification is to construct a neural network capable of learning from just a few photos (Single Shot Learning). That is, to recognize a model of equipment previously unknown to the network, only a few photographs of the equipment are needed instead of hundreds or thousands.

The collection of the model data required for training the neural networks is distributed. The data comes from various sources: from the internal product team and from the end users of the solution. This allows the solution to develop faster in two directions, extensive (parallel collection of information about several models at once) and intensive (photos of the same model are received from different participants, in different quality, lighting and camera angles), without the involvement of a narrowly specialized expert.

The claimed invention also includes an algorithm for creating instructions, which involves several steps: marking all the elements of the object's interface on one or more photos, creating instructions using a simple visual editor. The instruction creator simply specifies the sequence of object controls and adds text comments as needed.

The algorithm for creating instructions takes into account the observation, made while preparing sets of instructions, that instruction descriptions are largely typical. The solution unifies the process of creating instructions: the equipment model is marked up only once, when it is first added, and does not need to be marked up again each time an instruction is created for it. The process itself is further accelerated by an auto-generated list of suitable options for each action and its value, so that creating an instruction resembles assembling it from building blocks. This also makes it easy to edit instructions and keep them up to date.

The process of searching for a model of equipment and displaying instructions is carried out as follows.

The camera of the end user's device is pointed at the technical object. A frame is captured on the user's mobile device and classified by type of equipment. The mobile device transmits the video stream to the server in the form of frames; the video server receives the video stream and parses it into separate frames, and the frames are stored on the Redis server.

The neural network algorithm recognizes the brand and specific model of equipment. The neural network is implemented as a separate service (daemon), which can run either on the server itself or on separate recognition servers dedicated to this purpose. The number of neural network servers is not limited: they can be connected to the system as needed, which allows the solution to scale horizontally. Each neural network server takes new frames from the Redis server for processing, removing each frame from the queue; every frame is marked with a label (stamp). After object recognition is completed, the neural network server returns its result to the Redis server with the same label (stamp) that was assigned to that frame. The list of available instructions for the recognized model is then displayed on the end user's device; the user opens a specific instruction, the data required for initialization is received, and the instruction step is determined.
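The stamped frame queue described above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory class stands in for the Redis server, and the names `FrameBroker`, `recognition_worker` and the sample recognition result are hypothetical and do not come from the application.

```python
import queue
import uuid


class FrameBroker:
    """In-memory stand-in for the Redis server (an assumption for illustration)."""

    def __init__(self):
        self._pending = queue.Queue()  # frames waiting for recognition
        self._results = {}             # stamp -> recognition result

    def submit_frame(self, frame_bytes):
        """Store a frame and return the stamp that identifies it."""
        stamp = uuid.uuid4().hex
        self._pending.put((stamp, frame_bytes))
        return stamp

    def take_frame(self):
        """Remove the next frame from the queue (called by a recognition server)."""
        return self._pending.get_nowait()

    def publish_result(self, stamp, result):
        """Return a result under the same stamp that the frame was given."""
        self._results[stamp] = result

    def fetch_result(self, stamp):
        """Hand the result for a stamp to the caller, or None if not ready."""
        return self._results.pop(stamp, None)


def recognition_worker(broker, recognize):
    """Process one queued frame; `recognize` is a placeholder for the neural network."""
    stamp, frame = broker.take_frame()
    broker.publish_result(stamp, recognize(frame))


broker = FrameBroker()
stamp = broker.submit_frame(b"raw-frame-bytes")
recognition_worker(broker, lambda frame: {"brand": "ExampleBrand", "model": "X-100"})
result = broker.fetch_result(stamp)
```

Because a result is keyed by the stamp rather than by the worker that produced it, any number of recognition servers could take frames from the same queue, which is the horizontal scaling property the description refers to.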

The tracking module looks for objects associated with the instruction step on frames and keeps track of them.

Top-level description of the tracking module:

The tracking module first extracts the descriptors of the frame received from the client, then compares them with the descriptors corresponding to the selected step, finds the homography matrix, and maps the marker image objects onto the received frame. Marker image objects are selected rectangular fragments of the image. Descriptors are a deep description of the frame points found using SuperPoint, a pre-trained neural network that extracts the key points of a given frame; they are extracted in the same way for the marker and for the incoming frame. That is, a frame is given as input, and the output is a set of points that the neural network considers "special", together with the descriptions it has extracted for them. The result of the work of the neural network is the coordinates of the desired object on the frame.

In this case, the model is defined on the server, and the object is tracked on the mobile device.

DESCRIPTION OF THE DRAWINGS

The implementation of the invention will be described hereinafter in accordance with the accompanying drawings, which are presented to explain the essence of the invention and in no way limit the scope of the invention.

The claimed invention is illustrated by figures 1-5, which depict:

Fig. 1 - context diagram of the solution:

(1) - end user device, (2) - equipment type identification module, (3) - tracking module, (4) - server part, (5) - equipment model identification module, (6) - instruction catalog, (7) - interface for creating instructions, (8) - dataset of training images;

Fig. 2 - algorithm of the tracking module;

Fig. 3 - algorithm of the classification module;

Fig. 4 - option for displaying an instruction step to the end user;

Fig. 5 - general diagram of a computing device.

DETAILED DESCRIPTION OF THE INVENTION

In the following detailed description of the implementation of the invention, numerous implementation details are provided to provide a clear understanding of the present invention. However, one skilled in the art will appreciate how the present invention can be used, both with and without these implementation details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to unduly obscure the features of the present invention.

Furthermore, it will be clear from the foregoing that the invention is not limited to the present implementation. Numerous possible modifications, changes, variations and substitutions that retain the spirit and form of the present invention will be apparent to those skilled in the subject area.

Figure 1 shows the overall design of the solution, consisting of 4 main components:

(1) - end user device;

(4) - server part;

(7) - interface for creating instructions;

(8) - dataset of training images.

The end user device (1) can be a smartphone, tablet or AR glasses. The main requirement for the device is the ability to capture a frame using the built-in camera, as well as the subsequent output of information to the user. The component is designed to capture an image and display instructions for the model.

The server part (4) includes an equipment model determination module (5) and a catalog of instructions (6). The equipment model determination module (5) is implemented on the basis of an algorithm for classifying objects and refining their coordinates and an algorithm for generating candidates. The instruction catalog (6) is stored in the cloud. The server part component (4) is designed to implement the model determination logic separately from the end user device.

The instruction creation interface (7) may be a web platform that facilitates the process of creating instructions. An alternative to the web platform is to import pre-prepared instructions in csv / xml format by means of a script.
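A minimal sketch of such an import script for the csv case is given below. The application does not define a file layout, so the column names (`instruction`, `step`, `control`, `action`, `comment`) and the sample rows are assumptions made purely for illustration.

```python
import csv
import io


def load_instructions(csv_text):
    """Group CSV rows into {instruction name: [steps ordered by step number]}."""
    instructions = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        step = {
            "step": int(row["step"]),
            "control": row["control"],
            "action": row["action"],
            "comment": row["comment"],
        }
        instructions.setdefault(row["instruction"], []).append(step)
    # Rows may arrive in any order, so sort each instruction by step number.
    for steps in instructions.values():
        steps.sort(key=lambda s: s["step"])
    return instructions


# Hypothetical file contents; the column layout is not taken from the source.
SAMPLE = """instruction,step,control,action,comment
Quick wash,2,start_button,press,Start the cycle
Quick wash,1,mode_dial,turn,Select the quick programme
"""

catalog = load_instructions(SAMPLE)
```

Here `catalog["Quick wash"]` holds the two steps in execution order, ready to be stored in the instruction catalog (6).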

The training image dataset (8) can be a cloud space or any other medium that stores equipment images in JPG and/or PNG formats for training the equipment model determination module (5).

The interaction of the main components in Fig.1 can be carried out using wired and/or wireless communication using REST and/or HTTP data transfer protocols.

Fig. 2 shows the algorithm of the tracking module.

To track an object, it must first be specified or, in this particular case, found. The problem is solved by matching the key points of the frame and the template. Since the object can be detected in this way, tracking can be performed by detecting the object on every frame. This approach is called tracking through detection.

In the claimed solution, this detection algorithm is used only to set the object of interest at the start of work, when the interface element or side changes, or when tracking is lost. The tracking itself is carried out using short-term tracking methods.
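The switching rule between full detection and short-term tracking can be sketched as a small state holder. The class name `TrackerState`, the mode strings and the (side, element) tuples are hypothetical; the sketch only decides which mode applies to a frame and does not implement detection or tracking themselves.

```python
class TrackerState:
    """Decides, frame by frame, whether full detection or short-term tracking runs."""

    def __init__(self):
        self.initialised = False
        self.current_target = None  # (side of the object, interface element)

    def next_mode(self, target, tracking_lost=False):
        """Return "detect" at start, on a target change, or after tracking loss."""
        needs_detection = (
            not self.initialised
            or target != self.current_target
            or tracking_lost
        )
        self.initialised = True
        self.current_target = target
        return "detect" if needs_detection else "short_term"


state = TrackerState()
modes = [
    state.next_mode(("front", "power_button")),                      # start of work
    state.next_mode(("front", "power_button")),                      # same target
    state.next_mode(("front", "power_button"), tracking_lost=True),  # tracking lost
    state.next_mode(("back", "socket")),                             # side changed
    state.next_mode(("back", "socket")),                             # same target
]
```

In this sketch the expensive detection step runs on three of the five frames, and the remaining frames are left to short-term tracking.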

As shown in Figure 2, the algorithm consists of the following steps:

1. The input of the tracking module consists of a reference image for a given model of equipment, in single-channel format with a resolution of 640 by 480 pixels, the coordinates of the interface elements on this image, and the current frame.

2. For the reference image and the frame, key points and descriptors are extracted using a pre-trained SuperPoint neural network.

3. Feature points are matched by descriptors using a pre-trained SuperGlue neural network.

4. A homography matrix is built to obtain the coordinates of interface features of interest in the input image.

In computer vision, any two images of the same flat object in space are related by a homography. Given a set of points on the reference image and a set of corresponding points in the scene, the correspondence between them can be found in the form of a homography matrix H using the RANSAC algorithm. The algorithm estimates a homography from randomly selected point correspondences and repeats this until a sufficiently good fit between the coordinates is achieved.

After the homography matrix has been calculated, a perspective transformation of the coordinates is performed: the homography matrix is multiplied by the homogeneous coordinates of the points on the reference image, and the result is divided by its third component. This operation yields the desired coordinates on the frame.

Obtaining the homography matrix and perspective transformation of coordinates is performed using the functions of the OpenCV library.

5. Short-term tracking of the object is carried out.

6. The coordinates of the interface elements on the reference photo are projected onto the current frame.

7. The coordinates of the interface elements on the current frame are obtained.
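The projection in steps 4, 6 and 7 can be illustrated with a few lines of plain Python. The sketch assumes that the homography matrix has already been estimated and only shows how a rectangle marked on the reference image is carried onto the frame; the matrix values and the rectangle are made up for the example. In OpenCV the usual counterparts would be `cv2.findHomography` (with the `cv2.RANSAC` flag) and `cv2.perspectiveTransform`, although the application does not name the exact functions it uses.

```python
def project_point(h, x, y):
    """Apply a 3x3 homography `h` (nested lists) to the point (x, y)."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    # Divide by the third homogeneous component to get pixel coordinates.
    return xh / w, yh / w


def project_rectangle(h, rect):
    """Project the four corners of (x_min, y_min, x_max, y_max) onto the frame."""
    x0, y0, x1, y1 = rect
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return [project_point(h, x, y) for x, y in corners]


# Hypothetical homography: scale by 2, then shift by (10, 20).
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0, 20.0],
     [0.0, 0.0, 1.0]]

button = (100, 50, 140, 80)  # rectangle marked on the reference image
corners = project_rectangle(H, button)
```

All four corners are projected rather than only two, because under a general homography a rectangle on the reference image becomes an arbitrary quadrilateral on the frame.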

Detailed description of the tracking module:

The following are passed to the tracking module: an instruction, an instruction step, and a frame received from the user. Descriptors are extracted from the received frame and pre-prepared descriptors corresponding to the instruction step are loaded.

Next, the marker descriptors and the descriptors of the received frame are passed to the matcher, which matches the points of the two descriptor sets and outputs a homography matrix between the images. Descriptors are built according to the same principle both for marker images and for incoming frames from the user. Using the resulting matrix, the marked-up objects are mapped from the marker image onto the submitted frame, so all objects of interest are mapped along with it. Once an object has been mapped from the marker onto the submitted frame, its coordinates are extracted in the space of the user's frame. These coordinates are transferred to the client, and rectangles are drawn on the client side according to the transferred coordinates.
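To make the matching step concrete, the sketch below pairs descriptors by mutual nearest neighbour. This is a deliberately simplified stand-in and not SuperGlue, which is the learned matcher named in the description; the two-dimensional descriptors and the `max_distance` threshold are assumptions chosen for illustration.

```python
import math


def mutual_nearest_matches(desc_a, desc_b, max_distance=0.7):
    """Return pairs (i, j) where desc_a[i] and desc_b[j] are each other's nearest."""

    def nearest(query, pool):
        distances = [math.dist(query, d) for d in pool]
        best = min(range(len(pool)), key=distances.__getitem__)
        return best, distances[best]

    matches = []
    for i, descriptor in enumerate(desc_a):
        j, dist = nearest(descriptor, desc_b)
        back, _ = nearest(desc_b[j], desc_a)
        # Keep the pair only if the match is symmetric and close enough.
        if back == i and dist <= max_distance:
            matches.append((i, j))
    return matches


# Toy 2-D descriptors; real SuperPoint descriptors are much longer vectors.
marker_desc = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
frame_desc = [(0.9, 0.1), (0.1, 0.9), (5.0, 5.0)]

matches = mutual_nearest_matches(marker_desc, frame_desc)
```

The matched index pairs identify which key point on the marker corresponds to which key point on the frame, and these point pairs are what the homography estimation takes as input.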

The instruction output interface is rendered according to the step.

Description of the classification module.

By classification we mean assigning objects (observations, events) to one of a set of previously known classes. The classification task reduces to finding an algorithm (decision function) that assigns the equipment shown in the input image to a certain equipment type, brand and model number. In this particular solution, artificial neural networks are used to solve the classification problem.

For the task of classifying objects, any convolutional artificial neural network is structurally divided into two parts. The first consists of convolutional and pooling layers and forms a feature matrix from the original image. The second part of the network is the classifier, which takes the set of features and produces a vector of probabilities, one for each class.
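The last stage of such a classifier, turning raw scores into a probability vector and a decision, can be sketched as follows. Softmax is a common choice for this stage and is assumed here; the application does not state which function is used. The class list, the scores and the `min_confidence` threshold are likewise hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels, min_confidence=0.5):
    """Return (label or None, probability vector) for one frame."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = labels[best] if probs[best] >= min_confidence else None
    return label, probs


TYPES = ["washing machine", "kettle", "microwave oven"]
label, probs = classify([2.0, 0.5, 0.1], TYPES)
```

Returning the whole probability vector alongside the label matches the description, in which the vector itself is sent to the server side together with the frame.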

A description of the operation of the classification module is shown in Fig.3.

The brand of the equipment is determined from the limited set of images available. Given that almost all of the logos considered contain text, the best solution for classifying equipment by brand is to implement the module in two stages: text recognition on the frame, followed by comparison of the text with logo templates. Training is continuous and is carried out for a large number of classes with a small sample size (1-10 photographs).
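The second stage, comparing recognized text with logo templates, can be sketched with fuzzy string matching from the Python standard library. The text recognition stage is assumed to have already produced a string; the brand names are placeholders, and both the similarity measure and the `cutoff` value are assumptions rather than details from the application.

```python
import difflib


def match_brand(recognized_text, brand_templates, cutoff=0.6):
    """Return the template closest to any recognized word, or None below `cutoff`."""
    tokens = recognized_text.lower().split()
    best_brand, best_score = None, 0.0
    for token in tokens:
        for brand in brand_templates:
            score = difflib.SequenceMatcher(None, token, brand.lower()).ratio()
            if score > best_score:
                best_brand, best_score = brand, score
    return best_brand if best_score >= cutoff else None


BRANDS = ["Acme", "Globex", "Initech"]  # placeholder logo templates

# "ACNE" simulates a recognition error in one letter of the logo text.
brand = match_brand("ACNE 1200 rpm", BRANDS)
no_brand = match_brand("no logo here", BRANDS)
```

Fuzzy comparison tolerates small recognition errors in the logo text, while the cutoff keeps unrelated text on the frame from being reported as a brand.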

The process of creating instructions.

Using the interface for creating instructions (position (7) in Fig. 1), you can replenish the catalog as follows.

When adding an equipment model:

Using a working device, one or more photos of the equipment model are uploaded, and its properties (type of equipment, brand, model number) are specified. The actor defines the marker area of the device and marks in it all the necessary control and/or monitoring elements. For each of them, the actor specifies the type (for example, button, lever, switch, etc.) and how to interact with it (for example, press the button).

The actor saves the equipment model with its set of control elements in the catalog. After saving, the equipment model can have an identifier that links it to data received from the SCADA systems of industrial enterprises. This helps to visualize the data received from SCADA in relation to the corresponding control and/or monitoring element.

When adding instructions to a specific model of equipment:

The name of the instruction is specified.

The instruction step and the required control and/or monitoring element are indicated. The actor specifies a textual description of the instruction step, either by using the chain of blocks “action - control - value” from existing directories or by entering it manually.

The previous step is repeated for each step in the instruction. The actor saves the instruction in the catalog.
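One possible data model for a catalog entry built this way is sketched below. All class and field names (`Control`, `InstructionStep`, `EquipmentModel` and so on) and the sample washing machine are hypothetical; the application does not specify how the catalog is stored.

```python
from dataclasses import dataclass, field


@dataclass
class Control:
    name: str
    kind: str    # e.g. "button", "lever", "switch"
    rect: tuple  # (x_min, y_min, x_max, y_max) on the marker image


@dataclass
class InstructionStep:
    action: str      # the "action - control - value" chain of blocks
    control: str
    value: str = ""

    def describe(self):
        text = f"{self.action} {self.control}"
        return f"{text}: {self.value}" if self.value else text


@dataclass
class EquipmentModel:
    equipment_type: str
    brand: str
    model_number: str
    controls: dict = field(default_factory=dict)
    instructions: dict = field(default_factory=dict)

    def add_control(self, control):
        self.controls[control.name] = control

    def add_instruction(self, name, steps):
        # Steps may only refer to controls marked up when the model was added.
        unknown = [s.control for s in steps if s.control not in self.controls]
        if unknown:
            raise ValueError(f"controls not marked up: {unknown}")
        self.instructions[name] = list(steps)


washer = EquipmentModel("washing machine", "ExampleBrand", "X-100")
washer.add_control(Control("mode_dial", "switch", (40, 60, 120, 140)))
washer.add_control(Control("start_button", "button", (300, 80, 340, 120)))
washer.add_instruction("Quick wash", [
    InstructionStep("turn", "mode_dial", "Quick 30"),
    InstructionStep("press", "start_button"),
])
descriptions = [step.describe() for step in washer.instructions["Quick wash"]]
```

Keeping the marked-up controls on the model and referring to them by name from each step reflects the point made in the description: the model is marked up once, and every instruction reuses that markup.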

Detailed description of the process of searching for instructions on the model.

Detailing the process of searching for a model of equipment and displaying instructions:

1. The camera of the end user device is aimed at the object.

2. Based on the image in the camera's video stream, the current frame is determined. For an object on the frame, the probability vector of belonging to each of the given types of equipment is determined. Information about the frame and the probability vector is transmitted to the server side.

3. When a frame arrives at the server, the neural network algorithm sequentially determines the brand and model of the equipment. If the equipment model is not recognized, only the brand and category of the equipment are reported to the end user (based on the probability vector). After the equipment model has been determined, the server sends a data set to the end user device: a reference image of the equipment model, the coordinates of its buttons, and a list of instructions for this model.

4. The list of available instructions for the recognized model is displayed on the end user's device.

5. The user initiates a jump to a particular instruction.

6. After an instruction is selected, the detailed information on the instruction and the marker image (marked-up image) are loaded. Here, markup means finding the coordinates of the rectangles that bound the parts of the image that are significant at the current step; such a part can be a button on a washing machine, a kettle or any other equipment. Next, the marker descriptors are passed to the marker tracking component for image detection.

7. The user determines the step of the instruction; the marker is then matched when the camera is pointed at the object of interest to the user.

8. Information about the displayed step of the instruction is retrieved: the coordinates of the rectangle on the marker image (marked coordinates in the space of the marker image).

9. The marker image and the image submitted by the user are tensors (matrices) with dimensions (3, 640, 480), which are converted from RGB to grayscale to give tensors with dimensions (1, 640, 480). A homography is then constructed that maps the marker image to the space of the submitted frame, and through this homography the rectangle coordinates are converted from the space of the marker to the space of the frame submitted by the user.

10. The transformed coordinates of the rectangles are superimposed on the user's video sequence coming from the camera in real time.

11. The user goes through the instruction within the set of its steps.
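The channel reduction mentioned in step 9 can be shown on a tiny image. The sketch assumes the widely used BT.601 luma weights (0.299, 0.587, 0.114); the application does not say which conversion formula is applied, and a 2 by 3 image is used here in place of the real frame size to keep the example readable.

```python
def rgb_to_gray(image):
    """Convert a (3, H, W) nested-list image to (1, H, W) with BT.601 weights."""
    red, green, blue = image
    gray = [
        [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(red, green, blue)
    ]
    return [gray]


def shape(tensor):
    """Return (channels, rows, columns) of a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


# Three channels, each 2 rows by 3 columns: one white pixel, the rest black.
image = [
    [[255, 0, 0], [0, 0, 0]],  # red channel
    [[255, 0, 0], [0, 0, 0]],  # green channel
    [[255, 0, 0], [0, 0, 0]],  # blue channel
]
gray = rgb_to_gray(image)
```

The conversion only collapses the channel axis, so `shape(gray)` keeps the same rows and columns as `shape(image)` with a single channel.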

Fig. 4 shows an option for displaying an instruction step to the end user, for example, step 2/11 "Display" and step 5/11 "Socket".

The examples given are special cases and do not exhaust all possible implementations of the claimed invention.

Fig. 5 shows a general diagram of a computing device (N00) that provides the data processing necessary to implement the claimed solution.

In general, the device (N00) contains components such as: one or more processors (N01), at least one memory (N02), data storage means (N03), I/O interfaces (N04), I/O means (N05), networking means (N06).

The processor (N01) of the device performs the basic computing operations necessary for the operation of the device (N00) or the functionality of one or more of its components. The processor (N01) executes the necessary machine-readable instructions contained in the main memory (N02).

Memory (N02), as a rule, is made in the form of RAM and contains the necessary software logic that provides the required functionality.

The data storage means (N03) can be implemented in the form of HDD or SSD disks, a RAID array, network storage, flash memory, optical storage media (CD, DVD, MD, Blu-ray discs), etc. The means (N03) provides long-term storage of various types of information.

Interfaces (N04) are standard means for connecting and working with the server part, for example, USB, RS232, RJ45, LPT, COM, HDMI, PS/2, Lightning, FireWire, etc. The choice of interfaces (N04) depends on the specific version of the device (N00), which can be a personal computer, mainframe, server cluster, thin client, smartphone, laptop, etc.

A keyboard is to be used as a data I/O means (N05) in any implementation of the system. The keyboard hardware can be of any known type: either a built-in keyboard, as on a laptop or netbook, or a separate device connected to a desktop computer, server, or other computing device. The connection can be wired, with the keyboard cable plugged into a PS/2 or USB port on the system unit of the desktop computer, or wireless, with the keyboard exchanging data over a wireless communication channel, for example a radio channel, with a base station that is in turn connected directly to the system unit, for example to one of its USB ports. In addition to the keyboard, the I/O means can include a joystick, display (touchscreen), projector, touchpad, mouse, trackball, light pen, speakers, microphone, etc.

The networking means (N06) are selected from devices that provide network reception and transmission of data, for example an Ethernet card, WLAN/Wi-Fi module, Bluetooth module, BLE module, NFC module, IrDA, RFID module, GSM modem, etc. The means (N06) provide data exchange over a wired or wireless data transmission channel, for example WAN, PAN, LAN, Intranet, Internet, WLAN, WMAN or GSM, 3G, 4G, 5G.

The components of the device (N00) are connected via a common data bus (N10).

The present application materials disclose a preferred implementation of the claimed technical solution, which should not be construed as limiting other, particular embodiments that do not go beyond the scope of the requested legal protection and are obvious to specialists in the relevant field of technology.

It should be clear to a person skilled in the art that various variations of the proposed method and system do not change the essence of the invention, but only determine its specific embodiments and applications.

Claims

1. A method for demonstrating instructions for a technical object using augmented reality, comprising the following steps:

- capturing an image of the technical object using an end user device;

- recognizing the type and model of the technical object based on the current image frame;

- transferring to the end user device a data set for the recognized model of equipment, containing at least a reference image, coordinates of interface elements, a list of instructions, and detailed information for the instructions;

- extracting at least key points and descriptors for the reference image and the current frame, and matching the key points by their descriptors;

- building a homography matrix to obtain the coordinates of the interface elements on the current frame, and projecting the coordinates of the interface elements onto the current frame;

- displaying instructions for the recognized model of the technical object on the current frame on the end user device.

2. A system for demonstrating instructions for a technical object using augmented reality for implementing the method according to claim 1, comprising:

- an end user device containing an image capture means and a means for displaying information to the user;

- a classification module;

- a training images dataset;

- a tracking module;

- an updated catalog of instructions;

in which the image of the technical object is captured by the image capture means; based on the current image frame, the model of the technical object is recognized by means of the classification module; the coordinates of the interface elements on the reference image of the recognized model of the technical object are projected onto the current frame by means of the tracking module; and instructions for the recognized model of the technical object are displayed using augmented reality, with the instruction step labels linked to the real technical object displayed on the end user device.

3. The system according to claim 2, characterized in that the end user device is a smartphone, tablet or AR glasses.

4. The system according to claim 2, characterized in that the image capture means is a built-in camera.

5. The system according to claim 2, characterized in that the technical object is an item of household, professional or industrial equipment.

6. The system according to claim 2, characterized in that the classification module includes at least an equipment type determination module and an equipment model determination module.

7. The system according to claim 6, characterized in that the brand of the equipment is additionally determined.

8. The system according to claim 6, characterized in that the equipment type determination module and the tracking module are implemented on the end user device, and the equipment model determination module and the catalog of instructions are implemented on the server.

9. The system according to claim 6, characterized in that images of technical objects for training the equipment model determination module are stored in the training images dataset.

10. The system according to claim 2, characterized in that the model of the technical object is recognized using neural networks.

11. The system according to claim 10, characterized in that the initial training of the neural networks is based on 10 or more different photographs of the technical object.

12. The system according to claim 11, characterized in that the data about the technical object comes from various sources, at least from the internal product team and from end users.

13. The system according to claim 11, characterized in that each time data on the successful or unsuccessful identification of the model is received from the user, additional training of the networks is carried out.

14. The system according to claim 2, characterized in that at least tracking by detection and short-term tracking of the technical object are carried out by means of the tracking module.

15. The system according to claim 2, characterized in that the system additionally contains an interface for creating instructions, and the catalog of instructions is replenished using the interface for creating instructions.
