WO2022185156A1 - Marker for artificial neural networks, related computer-implemented method for the recognition and interpretation and related system
- Publication number: WO2022185156A1
- Application: PCT/IB2022/051625
- Authority: WIPO (PCT)
Classifications
- G06K19/06037 — Record carriers with optically detectable marking, multi-dimensional coding
- G06K19/0614 — Constructional details, the marking being selective to wavelength, e.g. color barcode or barcodes only visible under UV or IR
- G06K7/1417 — Methods for optical code recognition, 2D bar codes
- G06K7/1443 — Methods for optical code recognition, locating of the code in an image
- G06K7/1456 — Methods for optical code recognition, determining the orientation of the optical code with respect to the reader and correcting therefor
Definitions
- the method comprises a step of decoding of the marker 1 by means of a third neural network 22.
- the third neural network 22 is configured to convert the inner area 3 into a binary code 4.
- the method comprises a step of rotation of the cut-out marker 1 so that: the first key point 18 is positioned in the upper left corner; the second key point 19 is positioned in the upper right corner; the third key point 20 is positioned in the lower right corner; the fourth key point 21 is positioned in the lower left corner.
- the rotation step allows the marker 1 to be placed in the readout position.
- the method also comprises a step of rectification of the cut-out marker 1, by means of homographic transformation.
- the rectification step is performed so as to correct any distortion of the cut-out marker 1, caused by the perspective of the image 13, and to cut it again to remove those parts which are not relevant for the next decoding step.
- the rotation step and the rectification step are performed by means of a rotation and rectifying unit 23.
- the decoding step comprises at least one step of associating each white element 2 of the inner area 3 with a binary value of 1 and each black element 2 of the inner area 3 with a binary value of 0, in order to generate a binary code 4.
- Prior to the detection step of the marker 1, the method comprises a training step of at least one of the first neural network 15, the second neural network 16 and the third neural network 22.
- the training step comprises: a first training step by means of an artificial dataset; and a second training step by means of the artificial dataset combined with a real dataset.
- the training step is performed for all three neural networks 15, 16, 22.
- the artificial dataset is automatically created starting from random images 13 on which a random number of markers 1 are superimposed with random positions, angles, perspective, illumination, saturation, transparency, etc., by means of image creation software.
- the neural networks 15, 16, 22 are, therefore, trained on this artificial dataset to obtain a first working version of the recognition and interpretation system.
- the real dataset is created from real images 13 where the real markers 1 have been printed, cut out and placed realistically in the scene.
- In semi-automatic annotation, given a real image 13, the position of the markers 1 and, in particular, the position of the relevant key points 18, 19, 20, 21 is first estimated by the first version of the system and then corrected manually by a human annotator where necessary.
- the second training step is performed by means of the first version of the system on a final dataset obtained by the union of the artificial dataset and the real dataset, to obtain a final version of the recognition and interpretation system.
- the present invention also relates to a recognition and interpretation system of markers for artificial neural networks.
- the system comprises: at least one image acquisition unit 14 configured to acquire at least one image 13 containing at least one marker 1 as described above; and at least one processing and control unit 15, 16, 17, 22, 23 configured to perform the method as described above.
- the image acquisition unit 14 may comprise at least one of either a camera or a video camera.
- the image acquisition unit 14 may be of the fixed type, i.e., stably arranged in a predefined area, or of the movable type, such as a camera/video camera of a smart-phone, tablet, or the like.
- the processing and control unit 15, 16, 17, 22, 23 comprises: at least a first neural network 15 configured to perform the detection step; at least one cutting and resizing unit 17 configured to perform the cutting step and the resizing step; at least a second neural network 16 configured to perform the determination step; at least one rotation and rectifying unit 23 configured to perform the rotation step and the rectification step; and at least a third neural network 22 configured to perform the decoding step.
- The described invention achieves the intended objects; in particular, thanks to the presence of an outer edge provided with angular portions of different colors and of a black inner edge, the present marker is univocally recognizable inside an image by an artificial neural network.
- the marker, method and system according to the invention allow for fast and efficient decoding of the marker by an artificial neural network regardless of its placement within an image.
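The artificial-dataset creation described above (random images 13 on which markers 1 are superimposed at random positions and orientations) can be sketched as follows. This is a minimal illustration: the fixed sample size, the all-zero placeholder marker and the restriction to quarter-turn rotations are assumptions, not details taken from the document, and the real generator also varies perspective, illumination, saturation, transparency, etc.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_artificial_sample(marker, size=96):
    """Superimpose a marker onto a random background image at a
    random position and with a random number of quarter-turn
    rotations; return the image and the known bounding box as
    the training annotation."""
    background = rng.integers(0, 256, (size, size, 3), dtype=np.uint8)
    m = np.rot90(marker, k=int(rng.integers(0, 4)))  # random orientation
    h, w = m.shape[:2]
    top = int(rng.integers(0, size - h))
    left = int(rng.integers(0, size - w))
    background[top:top + h, left:left + w] = m
    return background, (top, left, top + h, left + w)

marker = np.zeros((12, 12, 3), dtype=np.uint8)  # placeholder marker image
sample, box = make_artificial_sample(marker)
```

A loop over such samples would yield the kind of annotated artificial dataset on which the first working version of the system is trained.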
Abstract
The marker (1) for artificial neural networks comprises a plurality of elements (2) defining a matrix structure and provided with one inner area (3), the elements (2) of which define a matrix code, and with at least one outer edge (6), external to the inner area (3), comprising four angular portions (7, 8, 9, 10) of different colors.
Description
MARKER FOR ARTIFICIAL NEURAL NETWORKS, RELATED COMPUTER-IMPLEMENTED METHOD FOR THE RECOGNITION AND INTERPRETATION AND RELATED SYSTEM
Technical Field
The present invention relates to a marker for artificial neural networks, a related computer-implemented method for the recognition and interpretation, and a related system.
Background Art
In the electronic industry, the use of markers, also called fiducial markers, is well known; these are expressed as images and are recognized by electronic systems to provide information such as, e.g., in the case of augmented reality, a transition from 2D to 3D space.
Marker-based recognition relies on detecting, within an acquired image, characteristic patterns identified by the strong contrast of their edges (usually square or round, black on a white background) with the rest of the image. On the basis of the representation of the edges and of the figure inside them, markers can be classified into different subgroups. Usually, the edges are recognized by locating the areas of the image where the pixel gradients change dramatically from one value to another. Discrimination between one marker and another is carried out on the basis of an encoding of the value enclosed between the identified edges.
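The gradient-based edge detection described above can be illustrated with a minimal sketch; the `edge_map` function, its threshold, and the toy image are illustrative assumptions, not part of any particular marker system.

```python
import numpy as np

def edge_map(gray, threshold=128):
    """Mark pixels where the horizontal or vertical intensity
    gradient changes sharply, a minimal stand-in for the
    edge-based marker localization described above."""
    g = gray.astype(np.int32)
    gx = np.abs(np.diff(g, axis=1, prepend=g[:, :1]))  # horizontal gradient
    gy = np.abs(np.diff(g, axis=0, prepend=g[:1, :]))  # vertical gradient
    return np.maximum(gx, gy) >= threshold

# A white image with a black square: edges appear along the square's border.
img = np.full((8, 8), 255, dtype=np.uint8)
img[2:6, 2:6] = 0
edges = edge_map(img)
```

Real detectors add smoothing, thinning, and contour extraction on top of this raw gradient threshold, but the principle of locating abrupt pixel-value transitions is the same.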
Among the various types of fiducial markers, the ArUco markers, which are square binary markers, are known, for example.
The main problems ascertained in the use of markers are due to the fact that the identification and reading of the marker depend directly on the quality of the acquired image.
In addition, the marker may be distorted by the perspective of the image and, therefore, difficult to recognize and/or decode.
It follows that the algorithms for their recognition must be programmed ad hoc in order to allow rapid and effective decoding.
This issue is even more acute in the field of artificial neural networks.
An artificial neural network is a computational model composed of artificial “neurons”, loosely inspired by a simplified biological neural network, and is used to solve Artificial Intelligence engineering problems in different technological fields such as computer science, electronics and other subjects.
Neural networks enable computers to solve problems independently and to improve their capabilities.
To achieve this, the neural network is “trained” so that it can provide results that are as reliable as possible. A neural network must, therefore, be able to autonomously process the information related to the positioning and the distortion of the marker.
The difficulties encountered in the detection of markers in images make the markers themselves susceptible to improvement.
Description of the Invention
The main aim of the present invention is to devise a marker for artificial neural networks which is uniquely recognizable within an image by an artificial neural network.
Another object of the present invention is to devise a marker for artificial neural networks, a related computer-implemented method for the recognition and interpretation and a related system which enable rapid and efficient decoding of the marker by an artificial neural network regardless of its placement within an image.
Another object of the present invention is to devise a marker for artificial neural networks, a related computer-implemented method for the recognition and interpretation, and a related system which allow the mentioned drawbacks of the prior art to be overcome with a simple, rational, easy-to-use, effective and affordable solution.
The aforementioned objects are achieved by the present marker for artificial neural networks having the characteristics of claim 1.
Brief Description of the Drawings
Other characteristics and advantages of the present invention will become more apparent from the description of a preferred, but not exclusive, embodiment of a marker for artificial neural networks, a related computer-implemented method for the recognition and interpretation, and a related system, illustrated by way of an indicative, yet non-limiting example in the accompanying tables of drawings wherein:
Figures 1 and 2 are schematic representations of a marker according to the invention;
Figure 3 is a schematic representation illustrating the steps of the method according to the invention.
Embodiments of the Invention
With particular reference to these figures, reference numeral 1 globally indicates a marker for artificial neural networks.
In the context of the present disclosure, the expression “artificial neural network” refers to a computational model composed of artificial “neurons”, loosely inspired by the simplification of a biological neural network.
The marker 1 comprises a plurality of elements 2 defining a matrix structure and provided with at least one inner area 3 the elements 2 of which define a matrix code.
In particular, the inner area 3 comprises a plurality of elements 2 of white or black color.
Each of these elements 2 may be associated with a value of 0 or 1, depending on its color. In other words, the inner area 3 encodes a binary code 4 for the representation of a predefined piece of information.
The inner area 3 is decodable by a computer system capable of converting the binary code 4 into the aforementioned information.
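The encoding described above (white element = 1, black element = 0) can be sketched as follows. The 4 x 4 inner area and the row-major reading order are illustrative assumptions; the document fixes neither the size S nor the scan order.

```python
# RGB triplets for the two element colors of the inner area.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)

# A hypothetical 4 x 4 inner area, not a code defined in the document.
inner_area = [
    [WHITE, BLACK, BLACK, WHITE],
    [BLACK, WHITE, WHITE, BLACK],
    [WHITE, WHITE, BLACK, BLACK],
    [BLACK, BLACK, WHITE, WHITE],
]

def decode_inner_area(area):
    """Convert the inner area's elements into a binary string:
    1 for each white element, 0 for each black one."""
    return "".join("1" if cell == WHITE else "0"
                   for row in area for cell in row)

binary_code = decode_inner_area(inner_area)  # "1001011011000011"
```

A computer system would then map this binary code 4 to the predefined piece of information it represents, e.g. via a lookup table.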
According to the invention, the marker 1 comprises at least one outer edge 6, external to the inner area 3, comprising four angular portions 7, 8, 9, 10 of different colors. The presence of an outer edge 6 with angular portions 7, 8, 9, 10 of different colors makes the marker 1 easily recognizable also by an artificial neural network.
An artificial neural network, in fact, is able to autonomously process the information and provide a univocal result. To achieve this result, the neural network must be “trained” so that it can provide results that are as reliable as possible.
The angular portions 7, 8, 9, 10 of different colors surrounding the inner area 3 uniquely identify the marker 1 and make the identification thereof extremely fast and unequivocal.
The outer edge 6 is also defined by a plurality of elements 2 arranged externally to the inner area 3.
The outer edge 6 has a thickness equal to one of the elements 2. In other words, the elements 2 of which the outer edge 6 is composed are arranged side by side on a single level.
It cannot, however, be ruled out that the outer edge 6 has a different thickness.
In the embodiment shown in the figures, the angular portions 7, 8, 9, 10 are each defined by three elements 2, arranged in an “L” pattern and forming the outer corners of the marker 1.
It cannot, however, be ruled out that the angular portions 7, 8, 9, 10 are composed of a different number of elements 2.
Usefully, the angular portions 7, 8, 9, 10 comprise: a first angular portion 7, the elements 2 of which are green; a second angular portion 8, the elements 2 of which are blue; a third angular portion 9, the elements 2 of which are white; a fourth angular portion 10, the elements 2 of which are red.
More in detail, colors are defined according to the RGB (Red Green Blue) model in which the color green is represented by the code (0, 255, 0), the color blue is represented by the code (0, 0, 255), the color white is represented by the code (255, 255, 255) and the color red is represented by the code (255, 0, 0).
The outer edge 6 also comprises a plurality of side portions 11 the elements 2 of which are white in color (RGB=(255, 255, 255)).
Each side portion 11 is bounded between two angular portions 7, 8, 9, 10.
In a predefined readout position, the inner area 3 represents a precise succession of elements 2 which encode a particular code.
Should the marker 1 be positioned rotated with respect to the readout position and the inner area 3 be decoded according to this rotated position, it would inevitably provide an incorrect code and, consequently, incorrect information. The arrangement of the angular portions 7, 8, 9, 10 allows the orientation of marker 1 to be determined and the inner area 3 to be correctly decoded.
In the readout position: the first angular portion 7 is arranged in the upper left corner; the second angular portion 8 is arranged in the upper right corner; the third angular portion 9 is arranged in the lower right corner; the fourth angular portion 10 is arranged in the lower left corner.
The marker 1 also comprises at least one inner edge 12, positioned between the inner area 3 and the outer edge 6, the elements 2 of which are black (RGB=(0, 0, 0)).
The inner edge 12 allows the angular portions 7, 8, 9, 10 to be effectively recognized thanks to the chromatic contrast between the various colors of the latter and the black color of the inner edge itself.
The inner edge 12 in turn has a thickness equal to one of the elements 2. Similarly to the outer edge 6, the elements 2 of which the inner edge 12 is composed are arranged side by side on a single level.
It cannot, however, be ruled out that the inner edge 12 has a different thickness. Consequently, if the inner area 3 is of the type of an S x S matrix, the marker 1 has a matrix structure (S+4) x (S+4).
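The overall (S+4) x (S+4) layout described above, an S x S inner area, a one-element black inner edge, and a one-element outer edge with L-shaped colored angular portions and white side portions, can be sketched as follows, assuming the readout position. This is an illustration of the described structure, not a reproduction of the patent figures.

```python
GREEN, BLUE = (0, 255, 0), (0, 0, 255)
WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (255, 0, 0)

def build_marker(inner_area):
    """Assemble the marker matrix: inner area in the centre,
    a black inner-edge ring around it, and an outer edge whose
    corners are three-element L-shaped portions in the readout
    arrangement (green, blue, white, red clockwise from upper left)."""
    s = len(inner_area)
    n = s + 4
    m = [[WHITE] * n for _ in range(n)]          # outer edge starts white
    for i in range(1, n - 1):                    # black inner-edge ring
        for j in range(1, n - 1):
            m[i][j] = BLACK
    for i in range(s):                           # copy the inner area
        for j in range(s):
            m[i + 2][j + 2] = inner_area[i][j]
    # L-shaped angular portions of three elements each.
    for color, (ci, cj) in [(GREEN, (0, 0)), (BLUE, (0, n - 1)),
                            (WHITE, (n - 1, n - 1)), (RED, (n - 1, 0))]:
        di = 1 if ci == 0 else -1
        dj = 1 if cj == 0 else -1
        m[ci][cj] = m[ci][cj + dj] = m[ci + di][cj] = color
    return m

marker = build_marker([[WHITE, BLACK], [BLACK, WHITE]])  # 6 x 6 structure
```

With a 2 x 2 inner area the result is the expected 6 x 6 matrix, consistent with the (S+4) x (S+4) rule stated above.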
According to a further aspect, the present invention also relates to a computer-implemented method for the recognition and interpretation of a marker for artificial neural networks as described above.
The present method is, therefore, performed by means of a computerized recognition and interpretation system.
The method according to the invention comprises at least one step of acquisition of an image 13 containing at least one marker 1.
More in detail, the image 13 may contain a plurality of markers 1.
The acquisition of the image 13 is performed by means of an image acquisition unit 14.
The image acquisition unit 14 may comprise at least one of either a camera or a video camera.
The image acquisition unit 14 may be of the fixed type, i.e., stably arranged in a predefined area, or of the movable type, such as e.g. a camera/video camera of a smart-phone, tablet, or the like.
Next, the method comprises one step of detection of the marker 1 within the image 13 by means of a first neural network 15.
The first neural network 15 is configured to recognize the marker 1 within the image 13.
In particular, the image 13 is analyzed in order to find sub-areas of the image plane in which the possible presence of markers is estimated.
More in detail, the first neural network 15 is configured to recognize at least one of: shape, color, pattern.
The method then comprises one step of determination of the orientation of the marker 1 by means of a second neural network 16.
In more detail, the step of determination is performed in order to determine the arrangement of the marker 1 within the image 13, i.e. to determine whether it is rotated with respect to the readout position.
Conveniently, prior to the determination step, the method comprises at least one step of cutting the marker 1 with respect to a background of the image 13 and at least one step of resizing of the cut-out marker 1.
The cutting step is performed in such a way as to isolate the marker 1 from elements in the image 13 which are unrelated to the marker itself.
The resizing step is performed so as to give all the cut-out markers 1 the same predefined size.
The cutting step and the resizing step are performed by means of a cutting and resizing unit 17.
Once the marker 1 has been cut out and resized in this way, the determination step is performed.
Advantageously, the determination step comprises at least one step of locating at least one key point 18, 19, 20, 21 in the cut-out marker 1.
More specifically, the determination step comprises a locating step of a plurality of key points 18, 19, 20, 21.
The key points 18, 19, 20, 21 comprise: a first key point 18 corresponding to the first angular portion 7; a second key point 19 corresponding to the second angular portion 8; a third key point 20 corresponding to the third angular portion 9; a fourth key point 21 corresponding to the fourth angular portion 10. Specifically, each key point 18, 19, 20, 21 corresponds to a point of the marker 1 wherein the relevant angular portion 7, 8, 9, 10 meets the inner edge 12.
The second neural network 16 is configured to determine the coordinates of the key points 18, 19, 20, 21 according to a Cartesian system. In other words, the second neural network 16 assigns the coordinates (x,y) to each key point 18, 19, 20, 21 so as to establish the relevant position thereof and to determine the possible rotation of the marker 1 with respect to the readout position.
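Once the key point coordinates are available, the rotation of the marker with respect to the readout position can be inferred from where the first key point lies. The following sketch is illustrative only (function name, corner ordering and the use of `np.rot90` are assumptions, not the patent's implementation): it finds the corner nearest to the first key point and returns the number of counterclockwise quarter-turns needed to bring it back to the upper left.

```python
import numpy as np

def rotations_to_readout(kps, size):
    """kps: [(x, y)] for key points 1..4; size: side of the square cut-out.

    Returns the number of 90-degree counterclockwise rotations that bring
    the first key point back to the upper left corner (readout position).
    """
    # Corners in clockwise order: TL, TR, BR, BL.
    corners = np.array([(0, 0), (size, 0), (size, size), (0, size)], float)
    # Index of the corner where the first key point currently sits:
    # k clockwise quarter-turns moved it away from TL, so k CCW turns undo that.
    k = int(np.argmin(np.linalg.norm(corners - np.asarray(kps[0], float), axis=1)))
    return k
```

A rotation step could then use `np.rot90(crop, k=rotations_to_readout(kps, size))`, since `np.rot90` rotates counterclockwise.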
Finally, the method comprises a step of decoding of the marker 1 by means of a third neural network 22.
The third neural network 22 is configured to convert the inner area 3 into a binary code 4.
Conveniently, prior to the decoding step, the method comprises a step of rotation of the cut-out marker 1 so that: the first key point 18 is positioned in the upper left corner; the second key point 19 is positioned in the upper right corner; the third key point 20 is positioned in the lower right corner; the fourth key point 21 is positioned in the lower left corner.
Substantially, the rotation step allows the marker 1 to be placed in the readout position.
The method also comprises a step of rectification of the cut-out marker 1, by means of homographic transformation.
The rectification step is performed so as to correct any distortion of the cut-out marker 1, caused by the perspective of the image 13, and to cut it again to remove those parts which are not relevant for the next decoding step.
The rotation step and the rectification step are performed by means of a rotation and rectifying unit 23.
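A homographic transformation of the kind used in the rectification step can be estimated from the four key points. The sketch below is a minimal numpy illustration (the function names and point sets are assumptions): it solves for the 3x3 homography mapping four source points to four destination points via the Direct Linear Transform; in practice a library routine such as OpenCV's `cv2.getPerspectiveTransform` / `cv2.warpPerspective` would typically be used instead.

```python
import numpy as np

def homography(src, dst):
    """Exact 3x3 homography mapping 4 src points onto 4 dst points (DLT)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography vector spans the nullspace of A: last right-singular vector.
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_point(H, p):
    """Apply the homography to a 2D point (homogeneous coordinates)."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]
```

Mapping the four detected key points onto the corners of a fixed square removes the perspective distortion of the cut-out marker 1.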
The decoding step comprises at least one step of associating each white element 2 of the inner area 3 with a binary value of 1 and each black element 2 of the inner area 3 with a binary value of 0, in order to generate a binary code 4.
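The association step above reduces to a per-element threshold once the rectified inner area has been sampled into an S x S grid. A minimal sketch, assuming grayscale cell means and a mid-range threshold (both illustrative choices):

```python
import numpy as np

def decode_inner_area(cells: np.ndarray) -> str:
    """cells: S x S grayscale means sampled from the rectified inner area."""
    bits = (cells > 127).astype(int)   # white element -> 1, black element -> 0
    return "".join(map(str, bits.flatten()))
```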
Prior to the detection step of the marker 1, the method comprises a training step of at least one of the first neural network 15, the second neural network 16 and the third neural network 22.
The training step comprises: a first training step by means of an artificial dataset; and a second training step by means of the artificial dataset combined with a real dataset.
In more detail, the training step is performed for all three neural networks 15, 16, 22.
The artificial dataset is automatically created starting from random images 13 on which a random number of markers 1 are superimposed with random positions, angles, perspective, illumination, saturation, transparency, etc., by means of image creation software.
These images 13 are annotated fully automatically, the positions of the markers 1 on the image plane and the coordinates of the relevant key points 18, 19, 20, 21 being known to the image creation software.
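The automatic annotation is possible precisely because the generator places the markers itself. The following is a simplified sketch of that idea, under stated assumptions: it only randomizes the paste position (a real generator, as described above, would also randomize angle, perspective, lighting, saturation and transparency), and the annotation format is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sample(background: np.ndarray, marker: np.ndarray):
    """Paste a marker at a random position and emit its ground-truth annotation."""
    H, W = background.shape[:2]
    h, w = marker.shape[:2]
    y = int(rng.integers(0, H - h + 1))
    x = int(rng.integers(0, W - w + 1))
    img = background.copy()
    img[y:y + h, x:x + w] = marker
    # The annotation is known exactly, since the generator chose the placement.
    annotation = {
        "bbox": (x, y, w, h),
        "key_points": [(x, y), (x + w, y), (x + w, y + h), (x, y + h)],  # TL, TR, BR, BL
    }
    return img, annotation
```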
The neural networks 15, 16, 22 are, therefore, trained on this artificial dataset to obtain a first working version of the recognition and interpretation system.
The real dataset is created from real images 13 where the real markers 1 have been printed, cut out and placed realistically in the scene.
Then, there is a semi-automatic annotation of the real dataset, performed by means of the first version of the system.
Semi-automatic annotation means that, given a real image 13, the position of such markers 1 and, in particular, the position of the relevant key points 18, 19, 20, 21, is first estimated by the first version of the system and then corrected manually (if necessary) by a human annotator.
Finally, the second training step is performed by means of the first version of the system on a final dataset obtained by the union of the artificial dataset and the real dataset, to obtain a final version of the recognition and interpretation system.
According to a third aspect, the present invention also relates to a recognition and interpretation system of markers for artificial neural networks.
The system comprises: at least one image acquisition unit 14 configured to acquire at least one image 13 containing at least one marker 1 as described above; and at least one processing and control unit 15, 16, 17, 22, 23 configured to perform the method as described above.
Specifically, the image acquisition unit 14 may comprise at least one of either a camera or a video camera.
The image acquisition unit 14 may be of the fixed type, i.e., stably arranged in a predefined area, or of the movable type, such as a camera/video camera of a smart-phone, tablet, or the like.
The processing and control unit 15, 16, 17, 22, 23 comprises: at least a first neural network 15 configured to perform the detection step; at least one cutting and resizing unit 17 configured to perform the cutting step and the resizing step; at least a second neural network 16 configured to perform the determination step; at least one rotation and rectifying unit 23 configured to perform the rotation step and the rectification step; and at least a third neural network 22 configured to perform the decoding step.
It has in practice been ascertained that the described invention achieves the intended objects and in particular the fact is underlined that the present marker, thanks to the presence of an outer edge provided with angular portions of different colors and of an inner edge in black color, is univocally recognizable inside an image by an artificial neural network.
Furthermore, the marker, method and system according to the invention allow for fast and efficient decoding of the marker by an artificial neural network regardless of its placement within an image.
Claims
1) Marker (1) for artificial neural networks, comprising a plurality of elements (2) defining a matrix structure and provided with at least one inner area (3) the elements (2) of which define a matrix code, characterized by the fact that it comprises at least one outer edge (6) to said inner area (3) comprising four angular portions (7, 8, 9, 10) of different colors.
2) Marker (1) according to claim 1, characterized by the fact that said angular portions (7, 8, 9, 10) comprise: a first angular portion (7), the elements (2) of which are green; a second angular portion (8), the elements (2) of which are blue; a third angular portion (9), the elements (2) of which are white; a fourth angular portion (10), the elements (2) of which are red.
3) Marker (1) according to claim 2, characterized by the fact that: said first angular portion (7) is arranged in the upper left corner; said second angular portion (8) is arranged in the upper right corner; said third angular portion (9) is arranged in the lower right corner; said fourth angular portion (10) is arranged in the lower left corner.
4) Marker (1) according to one or more of the preceding claims, characterized by the fact that it comprises at least one inner edge (12), positioned between said inner area (3) and said outer edge (6), the elements (2) of which are black.
5) Marker (1) according to one or more of the preceding claims, characterized by the fact that said inner area (3) comprises a plurality of elements (2) of white or black color.
6) Computer-implemented method for the recognition and interpretation of a marker (1) according to one or more of the preceding claims, characterized by the fact that it comprises the following steps: acquisition of an image (13) containing at least one marker (1); detection of said marker (1) inside said image (13) by means of a first neural network (15); determination of the orientation of said marker (1) by means of a second neural network (16); decoding of said marker (1) by means of a third neural network (22).
7) Method according to claim 6, characterized by the fact that, prior to said determination step, it comprises at least one step of cutting said marker (1) with respect to a background of said image (13) and at least one step of resizing of said cut-out marker (1).
8) Method according to claim 6 or 7, characterized by the fact that said determination step comprises at least one step of locating at least one key point (18, 19, 20, 21) in said cut-out marker (1), wherein: a first key point (18) corresponds to said first angular portion (7); a second key point (19) corresponds to said second angular portion (8); a third key point (20) corresponds to said third angular portion (9); a fourth key point (21) corresponds to said fourth angular portion (10).
9) Method according to one or more of claims 6 to 8, characterized by the fact that, prior to said decoding step, it comprises a step of rotation of said cut-out marker (1) so that: said first key point (18) is positioned in the upper left corner; said second key point (19) is positioned in the upper right corner; said third key point (20) is positioned in the lower right corner; said fourth key point (21) is positioned in the lower left corner.
10) Method according to one or more of claims 6 to 9, characterized by the fact that said decoding step comprises at least one step of associating each white element (2) of said inner area (3) with a binary value of 1 and each black element (2) of said inner area (3) with a binary value of 0, in order to generate a binary code (4).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IT102021000004982A IT202100004982A1 (en) | 2021-03-03 | 2021-03-03 | MARKER FOR ARTIFICIAL NEURAL NETWORKS, RELATED METHOD IMPLEMENTED BY COMPUTER RECOGNITION AND INTERPRETATION AND RELATED SYSTEM |
IT102021000004982 | 2021-03-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022185156A1 true WO2022185156A1 (en) | 2022-09-09 |
Family
ID=76159721
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2022/051625 WO2022185156A1 (en) | 2021-03-03 | 2022-02-24 | Marker for artificial neural networks, related computer-implemented method for the recognition and interpretation and related system |
Country Status (2)
Country | Link |
---|---|
IT (1) | IT202100004982A1 (en) |
WO (1) | WO2022185156A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120039529A1 (en) * | 2010-08-14 | 2012-02-16 | Rujan Entwicklung Und Forschung Gmbh | Producing, Capturing and Using Visual Identification Tags for Moving Objects |
US20190294936A1 (en) * | 2016-12-20 | 2019-09-26 | Universidad De Alicante | Method for detecting and recognizing long-range high-density visual markers |
US20190384954A1 (en) * | 2018-06-18 | 2019-12-19 | Abbyy Production Llc | Detecting barcodes on images |
Also Published As
Publication number | Publication date |
---|---|
IT202100004982A1 (en) | 2022-09-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22710439; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 17.01.2024) |