WO2020023467A1

WO2020023467A1 - Unique identification of freely swimming fish in an aquaculture environment

Info

Publication number: WO2020023467A1
Application number: PCT/US2019/042958
Authority: WO
Inventors: Bryton SHANG; Thomas HOSSLER
Original assignee: Aquabyte, Inc.
Priority date: 2018-07-24
Filing date: 2019-07-23
Publication date: 2020-01-30

Abstract

Approaches for unique identification of freely swimming fish in a fish farm enclosure include and end-to-end deep learning pipeline for unique fish identification. The pipeline for unique fish identification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of fish images.

Description

INTERNATIONAL PATENT APPLICATION

FOR

UNIQUE IDENTIFICATION OF FREELY SWIMMING FISH IN AN AQUACULTURE ENVIRONMENT TECHNICAL FIELD

[0001] The present disclosure is directed to unique identification of freely swimming fish in an aquaculture environment.

BACKGROUND

[0002] The growth rate of world human population is applying substantial pressure on the planet’s natural food resources. Aquaculture will play a significant part in feeding this growing human population.

[0003] Aquaculture is the farming of aquatic organisms (fish) in both coastal and inland areas involving interventions in the rearing process to enhance production. Aquaculture has experienced dramatic growth in recent years. The United Nations Food and Agriculture

Organization estimates that aquaculture now accounts for half of the world’s fish that is used for food.

[0004] Fish farm production technology is underdeveloped, when compared to the state of the art in other food production processes. Techniques that improve the production processes in fish farms using new perception and prediction techniques would be appreciated by fish farmers.

[0005] United States Patent Application No. 2005/0011470 broadly describes a system for uniquely identifying subjects from a target population that operates to acquire, process and analyze digital images to create data which is sufficient to uniquely identify an individual in a population of interest. However, the system requires manual handling of fish or requires fish to swim through an optically transparent tube.

[0006] An approach that does not require the manual handling of fish, which can cause stress and damage to the fish, and does not require fish to swim through an optically transparent tube, which may be commercially impractical in open water fish farming enclosure environments, would be appreciated by fish farmers.

[0007] The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely because of their inclusion in this section.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] For a more complete understanding of the present disclosure and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, in which like reference numerals represent like parts:

[0009] FIG. 1 is a schematic diagram of an example aquaculture environment in which the present system for uniquely identifying freely swimming fish may operate.

[0010] FIG. 2 is a flowchart of a clustering-approach of the present system for unique fish identification.

[0011] FIG. 3 is a flowchart of a component-approach for a feature extraction step of the clustering-approach of the present system for unique fish identification.

[0012] FIG. 4 depicts various landmark points on a two-dimensional lateral view of a fish.

[0013] FIG. 5 depicts various landmark areas of a two-dimensional lateral view of a fish.

[0014] FIG. 6 is a flow diagram of an end-to-end deep learning approach of the present system for unique fish identification.

[0015] FIG. 7 is a schematic diagram of an end-to-end deep learning approach of the present system for unique fish identification.

[0016] FIG. 8 depicts the four freckle dimensions involved in a first biometric approach to unique fish identification.

[0017] FIG. 9 is a flowchart of a first biometric approach to unique fish identification.

[0018] FIG. 10 depicts a constellation of a second biometric approach unique fish identification.

[0019] FIG. 11 is a block diagram of an example computer system with which the present system for uniquely identifying freely swimming fish may be implemented.

DETAIFED DESCRIPTION

[0020] FIG. 1 is a schematic diagram of aquaculture environment 100 for uniquely identifying freely swimming fish 102 in fish farming enclosure 104. Environment 100 includes a high-resolution, light sensitive digital camera 106 within a waterproof housing immersed underwater in the fish farming enclosure 104. [0021] In some implementations, camera 106 is an approximately 12-megapixel color or monochrome camera with a resolution of approximately 4096 pixels by 3000 pixels, and a frame rate of approximately 1 to 8 frames per second. Although different cameras with different capabilities and higher frame rates may be used according to the requirements of the particular implementation at hand. For example, a stereo camera may be used to capture stereo (e.g., left and right) images that may be processed for unique fish identification.

[0022] Selection of the camera lens(es) for camera 106 may be based on an appropriate baseline and focal length to capture images of a fish freely swimming in front of camera 106 where the fish is close enough to the lens(es) for proper pixel resolution and feature detection in the captured image, but far enough away from camera 106 such that a fish can fit entirely in the image frame. For example, 8-millimeter focal length lenses with high line pair count (lp/mm) can be used such that the image pixels can be resolved. The baseline of camera 106 may have greater variance such as, for example, within the range of 6 to 12-millimeter baseline.

[0023] Fish farming enclosure 104 may be a sea net pen framed by a plastic or steel cage that provides a substantially inverted conical, circular, or rectangular cage, or cage of other desired dimensions. Fish farming enclosure 104 may hold a number of fish of a particular type (e.g., salmon). The number of fish held may vary depending on a variety of factors such as the size of fish farming enclosure 104 and the maximum stocking density of the particular fish caged. For example, fish farming enclosure for salmon may be 50 meters in diameter, 20-50 meters deep, and hold up to approximately 200,000 salmon assuming a maximum stocking density of 10 to 25 kg/m3.

[0024] While in some implementations the techniques for unique fish identification disclosed herein are applied to a sea-pen environment, the techniques are applied to other fish farming enclosures in other embodiments. For example, the techniques may be applied to fish farm ponds, tanks, or other like fish farm enclosures.

[0025] Camera 106 may be attached to a winch system that allows camera 106 to be relocated underwater in the fish farming enclosure 104 to capture images of fish from different locations within fish farming enclosure 104. For example, the winch system may allow camera 106 to move around the perimeter and the interior of the fish farming enclosure 104 and at various depths within fish farming enclosure 104 to capture images of sea lice on both lateral sides of fish 102. The winch system may also allow control of pan and tilt of camera 106. [0026] The winch system may be operated manually by a human controller such as, for example, by directing user input to an above-water surface winch control system. Alternatively, the winch system may operate autonomously according to a winch control program configured to adjust the location of camera 106 within the fish farming enclosure 104, for example, in terms of location on the perimeter of the cage and depth within fish farming enclosure 104.

[0027] The autonomous winch control system may adjust the location of camera 106 according to a series of predefined or pre-programmed adjustments and /or according to detected signals in fish farming enclosure 104 that indicate better or more optimal locations for capturing images of fish 102 relative a current position and/or orientation of camera 106. A variety of signals may be used such as, for example, machine learning and computer visions techniques applied to images captured by camera 106 to detect schools or clusters of fish currently distant from camera 106 such that a location that is closer to the school or cluster can be determined and the location, tilt, and / or pan of camera 106 adjusted to capture more suitable images of the fish. The same techniques may be used to automatically determine that the camera 106 should remain or linger in a current location and /or orientation because camera 106 is currently in a good position to capture suitable images of fish 102 for unique fish identification or other purposes.

[0028] It is also possible to illuminate fish 102 in the fish farming enclosure 104 with ambient lighting in the blue-green spectrum (450nm to 570nm). This may be useful to increase the length of the daily sample period during which useful images of fish 102 in the fish farming enclosure 104 may be captured. For example, depending on the current season (e.g., winter), time of day (e.g., sunrise or sunset), and latitude of the fish farming enclosure 104, only a few hours during the middle of the day may be suitable for capturing useful images without using ambient lighting. This daily period may be extended with ambient lighting.

[0029] The fish farming enclosure 104 may be configured with wireless cage access point 108 A for transmitting images captured by the camera 106 and other information wirelessly to barge 110 or other water vessel that is also configured with wireless access point 108B. Barge 110 may be where on-site fish farming process control, production, and planning activities are conducted.

[0030] Barge 110 may house computer image processing system 112. At high-level, computer image processing system 112 is able to determine, with a high degree of accuracy, whether a particular fish in an image captured by camera 106 has“seen” before. Techniques for making this determination are described in greater detail below with respect to FIG. 2 and FIG.

3.

[0031] While camera 106 can be communicatively coupled to image processing system 112 wirelessly via wireless access points 108, camera 106 can be communicatively coupled to image processing system 112 by wire such as, for example, via a wired fiber connection between fish farming enclosure 104 and barge 110.

[0032] While image processing system 112 can be located remotely from camera 106 and connected by wire or coupled wirelessly, image processing system 112 can be a component of the camera 106. In this implementation, camera 106 may be configured within an on-board graphics processing unit (GPU) or other on-board processor or processors capable of executing image processing system 112. In both implementations where system 112 is integrated with the camera 106 and implementations where system 112 and camera 106 are remote from each other, output of image processing system 112 based on processing images captured by camera 106 may be uploaded to the cloud or otherwise over the internet via a cellular data network, satellite data network, or other suitable data network to an online service configured to provide the estimates or information derived by the online service therefrom in a web dashboard or the like (e.g., in a web browser, a mobile application, a client application, or other graphical user interface.) System 112 may also be locally coupled to a web dashboard or the like to support on-site fish farming operations and analytics.

[0033] While FIG. 1 shows image processing system 112 being contain on barge 110 and barge 110 is present in environment 100, there is no requirement of the present invention that image processing system 112 be contained on barge 110 or that barge 110 be present in aquaculture environment 100. Instead, camera 106 may contain image processing system 112 or be coupled by wire to a computer system that contains image processing system 112. The computer system may be affixed above the water surface to net pen 104 and may include wireless data communications capabilities for transmitting and receiving information over a data network (e.g., the Internet).

[0034] As another alternative, image processing system 112 may be located in the cloud (e.g., on the internet). In this configuration, camera footage captured by camera 106 is uploaded over a network (e.g., the internet) to system 112 in the cloud for processing there. Barge 110 or other location at the fish farm may have a personal computing device (e.g., a laptop computer) for accessing a web application over the network. The web application may drive a graphical user interface (e.g., web browser web pages) at the personal computing device where the graphical user interface presents results produced by system 112 such as analytics, reports, etc. generated by the web application based on the unique identification of fish 102 in fish farming enclosure 104.

[0035] Although not shown in FIG. 1, barge 110 may include a mechanical feed system that is connected by physical pipes to the fish farming enclosure 104. The feed system may deliver food pellets via the pipes in doses to the fish in fish farming enclosure 104. The feed system may include other components such as a feed blower connected to an air cooler which is connected to an air controller and a feed doser which is connected to a feed selector that is connected to the pipes to fish farming enclosure 104. The unique fish identifications performed by image processing system 112 may be used as input to the feed system for determining the correct amount of feed in terms of dosage amounts and dosage frequency, thereby improving the operation of the feed system.

[0036] As well as being useful for determining the correct amount of feed, the unique fish identifications generated by image processing system 112 are also useful for determining more optimal feed formulation. Feed formulation includes determining the ratio of fat, protein, and other nutrients in the food pellets fed to fish 102. Using unique fish identifications performed by image processing system 112 for fish in a particular fish farming enclosure, precise feed formulations for the fish in that fish farming enclosure may be determined. It is also possible to have different formulations for the fish in different fish farming enclosures based on individual biomass estimates and growth rates associated with uniquely identified fish. For example, individual biomass estimates of fish 102 in fish farming enclosure 104 may be generated based on unique fish identifications by image processing system 112 and input to an onsite (e.g., on barge 110) food pellet mixer that uses the individual biomass estimates to automatically select the ratio of nutrients to mix together in the food pellets that are delivered to fish 102 in the fish farming enclosure 104. Unique fish identifications by system 112 enables individual biomass estimates, and thus reduces double counting such as by including multiple biomass estimates for the same fish in a total biomass estimate calculation of fish 102 in the fish farming enclosure 104 that might occur if the same fish repeatedly swims in front of camera 106. As such, the total biomass estimate is more accurate and the ratio of nutrients delivered to fish 102 more targeted and precise.

[0037] As an alternative, the individual biomass estimates may be used to select feed to dispense in fish farming enclosure 104 from one or more different silos of pelletized feed. The different silos of feed may have different predetermined nutrient mixes and / or different pellet sizes. The individual biomass estimates may be used to automatically select which silo or silos to dispense feed from depending on various factors including for example the average estimated biomass of fish 102 in fish farming enclosure 104 calculated based on the individual biomass estimates.

[0038] Along the same lines, in addition to being useful for feed dosage optimization and feed formulation optimization, the individualized biomass estimates generated by image processing system 112 are also useful for determining optimal harvest times and maximizing sale profit for fish farmers. For example, fish farmers may use individual biomass estimates to determine how much of different fish sizes they can harvest and bring to market. For example, the different fish sizes may be distinguished in the market by 1 -kilogram increments. Thus, individual biomass estimates are important to fish farmers to accurately determine which market bucket (e.g., the 4kg to 5kg bucket, the 5kg to 6kg bucket, etc.) the fish in fish farming enclosure fall into to. Having individual biomass estimates would also improve fish farmers’ relationship downstream in the market such as with slaughterhouse operators and fish futures markets.

[0039] Additionally, along the same lines, individualized biomass estimates are useful for compliance with governmental regulations. For example, in Norway, a salmon farming license may impose a metric ton limit. Individual biomass estimates generated according to techniques disclosed herein may be useful for ensuring compliance with such licenses.

[0040] There are other benefits to fish farmers made possible with unique fish identifications including, as suggested above, reducing the variance of total biomass estimates, reducing double counting of fish, monitoring the progression / growth / health of individual fish over time and in response to various different environment and feed conditions, generating an accurate count of the fish in fish farming enclosure, tracking fish on an individual basis up to harvest time, and individualized feeding and medication of fish.

[0041] For example, individual biomass estimates derived based on unique fish identification allows the derivation of more granular and precise growth distributions and growth models. This is derivation is made possible because the unique fish identification provides for a better understanding of fish growth on an individual basis, as opposed to just an entire population basis.

CLUSTERING-APPROACH

[0042] FIG. 2 is a flowchart of process 200 of a clustering-approach for unique fish identification. Process 200 consists of two major steps: feature extraction 210 from images of fish, followed by application 220 of a clustering algorithm.

[0043] For feature extraction 210, a component-based approach may be used. FIG. 3 is a flowchart of process 300 of a component-based approach for feature extraction 210. Process 300 includes the steps of detecting 310 key / landmark points in images of fish, extracting 320 local regions in the image containing the fish’s landmark points detected 310, extracting 330 local binary pattern (EBP) features and histogram of oriented gradient (HOG) features from each extracted 320 local region, principal component analysis is applied 340 to the extracted 330 features for dimensionality reduction, the features from each local region are concatenated 350, and linear discriminant analysis is applied 360 to the resulting feature vector.

[0044] The landmark points detected 310 in an image of a fish may include one or more of those shown in FIG. 4: (1) the posterior most part of the eye, (2) the posterior point of the neurocranium (where scales begin), (3) the origin of the pectoral fin, (4) the origin of the dorsal fin, (5) the origin of the pelvic fin, (6) the posterior end of the dorsal fin, (7) the origin of the anal fin, (8) the origin of the adipose fin, (9) the anterior attachment of the caudal fin to the tail, (10) the posterior attachment of the caudal fin to the tail and (11) the base of the middle caudal rays.

[0045] The input to operation 310 may be an image of a substantially lateral view of a fish captured by cameras 106, or a cropped portion thereof. For example, the cropped portion may correspond to a rectangular (bounding box) portion of the image containing the fish as detected by a convolution neural network trained for fish detection and image segmentation thereof. The output of operation 310 may indicate a set of one or more X, Y coordinates where each X, Y coordinate identifies the location of a detected landmark point in the input image. Note the set may be empty if no landmark points are detected. The landmark points may be detected based on a statistical model of appearance of a fish from a substantially lateral perspective. For example, an active shape model may be trained based on database of annotated images of freely swimming fish from a substantially lateral perspective with the landmark points annotated. More information on active shape models is described in the paper by T.F. Cootes, C. J. Taylor, D.H. Cooper and J. Graham,“Active shape models - their training and application,” Computer Visions and Image Understanding (61): 38-59 (1995).

[0046] As used herein, substantially lateral, as in a substantially lateral view of a fish in an image captured by camera 106, refers to exactly lateral where the yaw of the fish along is zero degrees relative to the baseline of the cameras and the roll of the fish is zero degrees and includes approximately lateral where the yaw and roll of the fish are not zero degrees but are such that the same side of both the anterior end (the head) and the posterior end (the tail) are captured in the image. The yaw of the fish may be measured relative to the dorsoventral axis of the fish having its origin at the center of gravity directed toward the ventral side of the fish, perpendicular to the anteroposterior axis. The roll of the fish may be measure relative to the anteroposterior axis having its origin at the center of gravity and directed toward the anterior end of the fish. For example, a fish captured in an image may have a yaw of up to 30 degrees and a roll of up to 30 degrees and still be substantially lateral if the same side of both the anterior end (the head) and the posterior end (the tail) are captured in the image.

[0047] At operation 320, local regions in the image where the landmark points are detected 310 are extracted. For example, all of the local regions, a subset, or a superset thereof, that are depicted in FIG. 5 may extracted 320 based on the detected 310 landmark points depicted in FIG. 4. For example, the extracted 320 local regions may include the head area (SL)-(A), or a portion thereof, between the start of the (SL) the standard body length at the anterior end of the fish and (A) the body depth at the origin pectoral fin, the pectoral area (A)-(B), or a portion thereof, between the (A) the body depth at the origin pectoral fin and (B) the body depth at the origin of the dorsal fin is, the anterior dorsal area (B)-(C), or a portion thereof, between the (B) body depth at the origin of the dorsal fin and (C) the body depth at the end of the dorsal fin, the posterior dorsal area (C)-(D), or a portion thereof, between the (C) the body depth at end of the dorsal fin and (D) the body depth at origin of the anal fin, the anal area (D)-(E), or a portion thereof, between (D) the body depth at origin of the anal fin and (E) the least depth of the caudal peduncle, and the tail area (E)-(SL), or a portion thereof, between the (E) the least depth of the caudal peduncle and the end of the (SL) the standard body length at the posterior end of the fish.

[0048] At operation 330, various features are extracted 330 from each local region extracted 320. The features extracted 330 can include local binary pattern (LBP) features and / or histogram of oriented gradient (HOG) features. Other local features may be used in addition to, or instead of, LBP and / or HOG features, such as, for example, scale-invariant feature transform (SIFT) features, oriented fast and rotated brief (ORB) features, and / or HAAR-like features.

[0049] At operation 340, principal component analysis is applied to the extracted 330 features for dimensionality reduction.

[0050] At operation 350, the features extracted 330 from each local region extracted 320 that remain after applying 340 PCA are concatenated 350 to form a feature vector for the fish.

[0051] At operation 360, linear discriminant analysis is applied 360 to the resulting feature vector for the fish for further dimensionality reduction.

[0052] Returning to FIG. 2, at operation 220, a clustering algorithm is applied to a set of feature vectors, or a full or approximate k nearest neighbors’ graph or similarity matrix computed thereon. Each feature vector corresponds to one fish detected in an image for which feature extraction 210 is performed. Feature extraction 210 may be performed on fish detected in multiple images to produce a set of feature vectors. A variety of different clustering algorithms may be applied 220 to the set of feature vectors including k-means, spectral and rank-order.

[0053] With k-means cluster, the goal is to minimize the total square distance of the set of feature vectors to the nearest of C cluster centers. Here, C may be the approximate or estimated number of fish 102 in the fish farming enclosure 104. As a practical matter, an approximate k- means approach may be used such as Lloyd’s algorithm. More information on Lloyd’s algorithm is available in the paper by Lloyd, Stuart P.,“Least squares quantization in PCM,” IEEE

Transactions on Information Theory, 28 (2): 129-137 (1982).

[0054] With spectral clustering, an adjacency matrix is constructed from the set of feature vectors, describing the set of feature vectors as a graph. The graph can be fully connected where each value in the adjacency matrix is the similarity between corresponding samples. Otherwise, a spare adjacency matrix may be constructed, by either retaining all edges with a similarity above a threshold, or retaining a mixed number of edges with the greatest weights. After the full or sparse adjacency matrix is computed, the normalized Laplacian may be computed, followed by the top C eigenvectors of the normalized Laplacian, and then a new matrix is formed having columns of the computed eigenvalues. Finally, considering each row of this new matrix a new sample corresponding to the one of the original feature vectors in the set of feature vectors, k-means clustering is carried out on the new matrix. More information on spectral clustering can be found in the paper U. Von Luxburg,“A tutorial on spectral clustering,” Statistics and Computing, 17(4):395-416, 2007.

[0055] With rank-order clustering, a form of agglomerative hierarchical clustering leveraging a sophisticated distance metric is used. The overall procedure is as followed: (1) given a distance metric, (2) initialize all feature vectors to be separate clusters, and (3) iteratively merge the two closest clusters together. This requires a cluster- to-cluster distance metric. For example, the distance between two clusters may be considered to be the minimum distance (e.g., as measured by the cosine distance, for example) between any two feature vectors in the clusters.

[0056] Process 200 may be performed for a sample of images of fish 102 captured in the fish farming enclosure 104 over a period of time such as one or a few days to obtain the clusters of feature vectors which each cluster corresponds to a unique fish. Once the clusters are established, the identity of a particular fish in the fish farming enclosure 104 captured by the cameras 106 thereafter may be determined by obtaining a feature vector for the particular fish from the image of the fish according to the feature extraction step 210, and determining the cluster the feature vector is closest to according to a vector-to- cluster distance metric. For example, the vector- to- cluster distance metric may be measured as the cosine similarity between the feature vector and a centroid vector of the cluster.

END-TO-END DEEP LEARNING PIPELINE

[0057] FIG. 6 is a flowchart of an end-to-end deep learning pipeline for unique fish identification. The pipeline for unique fish identification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of fish images.

[0058] At operation 610, a deep convolutional neural network (DCNN) is trained as a classification task where the network learning to classify a given image of a fish to its correct identity label. The training may be based on a real or synthetically generated training dataset with substantially lateral fish images and corresponding identity labels. A softmax loss function may be used training the network.

[0059] At operation 620, sample images of fish are obtained based on images of freely swimming fish 102 captured by cameras 106 in the fish farming enclosure 104. The images produced by cameras 106 may themselves be processed through a convolutional neural network for the purpose of detecting and segmenting out (e.g., via a bounding box or segmentation mask) any fish in the images. The images may then be cropped to the area or areas of the image which a substantially lateral view of a fish is located.

[0060] At operation 630, pairs of sample images obtained 620 are input to the trained DCNN in a Siamese configuration as shown in FIG. 7. As a result, feature vectors are obtained for each image. The features vectors are normalized to unit length and a similarity score is computed 640 on the unit length normalized feature vectors that provides a measure of distance or how close the features lie in an embedded space. If the similarity score is greater than a predefined threshold, then the pair of images are judged to be of the same fish. The similarity score may be computed 640 as the L2 distance between the unit length normalized feature vectors or by using cosine similarity.

[0061] Operations 630 and 640 may be repeated for pairs of sample images obtained 620 to identify the unique fish among the fish captured in the sample images.

[0062] It is also possible to use exogenous data to aid a clustering-based or end-to-end machine learning-based algorithm for unique fish identification. For example, the length and width of the fish as determined based on a depth map or a disparity map generated from stereo images captured by stereo cameras 106 can be used to develop confidence that a previously identified fish has been re-identified. In addition, the time distance between two identifications may be used as a factor in the confidence that the two identifications are of the same fish with generally greater time distances resulting in less confidence and shorter time distances resulting in greater confidence. This time distance information may also be coupled with information about the positions of cameras 106 when identifications are made. For example, if cameras 106 are constantly moving around the fish farming enclosure 104 at a certain velocity, then it may be unlikely that the same fish would be captured by the cameras 106 on opposite sides of the fish farming enclosure 104 within a certain period of time depending on the certain velocity of the cameras 106 and the typical swimming speed and patterns of fish 102 in the fish farming enclosure 104.

BIOMETRIC APPROACH

[0063] Fish can be recognized individually by their looks. For example, the freckle pattern on the head and / or body of the fish can be used to identify each fish in a typical day’s production of fish for slaughter. In an implementation, the freckle pattern on fish 102 in fish farming enclosure 104 is read by computer vision and identified using a computer algorithm. [0064] Two different computer algorithms are disclosed for identifying or verifying individual fish based on biometrics. For fish 102 in fish farming enclosure 104, spots / freckles on the head and body of fish 102 are used as physiological characteristics to be identified to enable identifying or verifying individual fish based on biometrics.

BIOMETRIC APPROACH - POLAR COORDINATES ON FISH HEAD [0065] In a first biometric approach, two reference points are identified on a fish that remain constant in all images of the fish. The two reference points are then used to create a coordinate system. A first reference point is the center of the eye. A second reference point is the skull of the fish. Both of these points are assumed to be constant.

[0066] With these two reference points, a coordinate system can be created with the eye as an origin. In particular, a coordinate system the center of the eye as the origin can be created by drawings a line along the skull, and the drawing a new line from the center of the eye perpendicular to the skill line and with the shortest possible distance from the skull line. The perpendicular line can then be taken as a zero-degree line. Each freckle position in the coordinate system is recorded, and the position, together with information regarding the size and shape of each freckle, can be placed in a database. By doing so, a computer database can be constructed to look up fish by its detected freckles and find the fish with the most similar pattern. Thus, a database can be constructed that contains information regarding unique patterns that can be used to confirm the identity of a fish.

[0067] In a possible implementation, four“freckle” dimensions for each freckle detected in an image of a fish captured by camera 106 are used. The four freckle dimensions as depicted in FIG. 8 are:

• Rx = The radius from the center of the eye (e.g., 804) to the freckle (e.g., 806).

• a_x = The angle from the a- zero-line as origin in the positive clock-wise direction.

• A_Sx = The area of the freckle (e.g., 806).

• Fhx = The Heywood circularity factor of the freckle (e.g., 806), where:

F_hx ⁼

, where P represents the circumference of the freckle (e.g., 806).

[0068] The above four freckle dimensions, the center of the eye (e.g., 804), the location of a freckle (e.g., 806) can be determined with the aid of computer vision techniques applied to an image (e.g., 800) or images captured by camera 106. In a possible implementation, camera 106 is a stereo camera and disparity map information obtained by disparity map processing a pair of left and right images captured by camera 106 is used to aid in determining the four freckle dimensions for a given freckle detected.

[0069] In a possible implementation, computer vision techniques search for freckles in limited search area 802 of image 800 of fish, which can be cropped image of a full image of the fish captured by camera 106. For example, limited search area 802 can roughly correspond to the head area of the fish.

[0070] FIG. 9 is flowchart of first biometric approach 900 for identifying or verifying individual fish based on biometrics.

[0071] At operation 910, computer vision techniques are applied to an image or images captured by camera 106 to find the center of the eye of a fish in the image(s). The center of the eye will be the origin of a new coordination system.

[0072] At operation 920, the shortest distance between the center of the eye and the forehead of the fish by drawing a line perpendicular on the shortest distance vector (a- zero-line). The straight line along the forehead of the fish is the head line (e.g., as in FIG. 8).

[0073] At operation 930, the search area for freckles is set as an area where the angle a is between zero (0) degrees and approximately two-hundred and seventy degrees (270) and the bow line defined by the shadow cast by the gills.

[0074] At operation 940, the four dimensions Rx, a_x , A_sx , and Fhx for each freckle detected using computer vision techniques is obtained.

[0075] In a possible implementation, once four freckle dimensions for a target freckle are obtained as by approach 900, a fish number can be obtained from the database by looking up the fish number associated with four freckle dimensions that are most similar to the four freckle dimensions obtained for the target freckle

BIOMETRIC APPROACH - STAR CONSTELLATIONS

[0076] In a second biometric approach that uses pattern recognition, two images captured by camera 106 are compared to each other. The two images are placed over each other and then a search is conducted to find a positioning of the two images that matches up the pattern.

[0077] A constellation of freckles on a salmon can be the same regardless of a fixed point.

Th constellation pattern can be moved around the fish until the right position is found. This gives a larger degree of freedom in the search functions and the recognition of patterns can easily adapt to changes and inaccuracies, such as the fish placement in image, fish movement and twisting.

[0078] To recognize and“draw” the constellations, two freckles are first found. One of the two freckles is considered the origin and the other freckle“C” A line can be drawn between these two freckles. This line can make a forty-five (45) degree angle in an x/y- coordinate system, which can now be constructed from the chosen origin freckle. After the origin freckle and coordinate system is set, two other points can be found within the coordinate system boundaries. The limit of the coordinate system can be set to the distance x,y from the origin freckle to freckle “C”. These two new points are named“A” and“B”. Together, the two freckles and the two points make up a constellation.

[0079] A constellation can be randomly chosen based on these criteria. A constellation can be verified as unique in a database by creating it from an image of a fish and then checking the coordinates of the constellation in a database of constellations.

[0080] A constellation can be considered as four dots, and the distance between the dots and their relative position to each other can be represented as a set of vectors. The vectors allow the distance between dots to be scalable, so that the approach is more robust and can handle change in image resolution and fishes twisting and turning when imaged by camera 106.

[0081] FIG. 10 depicts image 1000 of a fish showing constellation detection within limited search area 1002. In this example, freckles C, and Origin and points / dots A and B are part of a constellation.

[0082] A simple way to separate freckles from the rest of the fish skin in an image captured by camera 106 is by using global thresholding on the pixel values. Each pixel is made up of an amount of red, blue, and green. Camera 106 can perceive this. By combining the different color values, a single value can be obtained that indicate how light or dark the pixel is. By defining limits as to how dark or bright a pixel has to be considered part of a freckle, a“freckle-only”- version of the original image can be created that contains only the freckles on the salmon.

[0083] Unfortunately, determining this global value for pixel values is difficult as a global limit seldom exists. This is because the intensity of fish skin varies. For example, the belly is often much lighter than the back of the fish with the freckles in between. Only looking at total darkness in each pixel risks inclusion of irrelevant pixels such as dark fins of the fish, dark back and shadow along the gills. These pixels are noise and a potential error source when counting and positioning freckle pattern constellations. To account for this, the second biometric approach uses deep learning to learn what a freckle is based on a learning set. By doing so, the image can be cleaned up before sending to a memorizing algorithm.

[0084] When training a deep learning algorithm what a freckle is, it is helpful to define what the algorithm supposed to be looking for. The learning set for the deep learning algorithm can be made by a human or machine, and can be done by marking freckles on a set of fish pictures.

Each pixel that is marked can then be a part of a freckle on the fish.

[0085] The automatic classification of freckles can be done by using a convolutional neural network. This type of neural network looks at the pixel to be classified and the nearest X number (-168) neighboring pixels to determine whether or not the pixel is part of a freckle or not.

[0086] The picture can be normalized before it is given as input to the network. Normalizing the picture is done by deducting the average value of the pixels and dividing by the variance in the picture. No further picture editing is needed (like as segmentation or background deduction).

[0087] The neural network can be trained to create a binary mask (“a freckle-only photo”) with the same dimension as the input photo, where values approaching 1 means that the corresponding pixel in the input photo is a freckle, and values closer to 0 indicate that the pixel is not a freckle.

[0088] From the binary photo that is created by the neural network, a list of freckle positions is made. The freckles are described as x, y-coordinates relative to the top left-hand corner of the photo. This list of dots can be used to identify the fish.

[0089] Once deep learning algorithm has converted a normal picture of a fish to a list of x, y- coordinates that describe the position of the fish’s freckles relatively to the picture frame, the list can be used to describe the individual fish in a way that identifies it from all other fish in the fish farm enclosure.

[0090] It is useful to identify fish in a way so that the rotation and placement of the fish in the picture is inessential to identifying the fish. To describe this possible variance in rotation position in the picture the freckles can be described in a coordinate system that follows the fish and its rotation in the picture. Ideally, the coordinate system should be automatically placed on the fish, and placed correctly every time, as a misplaced coordinate system will describe all freckles wrong due to a false set of reference. [0091] This is solved by first using a large amount of different coordinate systems, and later describing a small subset of all the freckles in each coordinate system. By doing so, most of the freckles on each fish are described in the correct frame of reference, even though some coordinate systems will be misplaced.

[0092] In a possible implementation, the same amount of coordinate systems as there are freckles detected on the fish is constructed. Each of the coordinate systems is used only to describe the position of two freckles. For each freckle on the fish, a search for three nearest freckles is conducted. This forms a constellation of freckles. For each constellation two of the freckles (namely freckles A and B) are used to construct a coordinate system. The remaining freckles (namely freckles C and D) are described in this coordinate system using their x, y- coordinates.

[0093] Each constellation consisting of four freckles results in an index vector which is [Cx, Cy, Dx, Dy] The vector is then made discrete, by setting each element in the vector to be an integer between 1 and N. This vector can be used to look up the data in a list with the dimension (N x N x N x N). The fish ID which the constellation belongs to is then added to this position of the list. Instead of a list of freckles per fish, a list of all constellations results, where each fish with a given constellation is at a given position in the list.

[0094] Given four freckles of a constellation detected in an image, an index vector for the constellation can be generated.

[0095] First, the four freckles are named A, B, C, and D. Freckles A and B are used to define a coordinate system. Freckles C and D are used to create the index vector. The distance between all of the freckles and the two freckles that are farthest away from each other are named freckles A and B. The other two freckles are named C and D.

[0096] For both freckles that are considered to be named A and B, the distance to the nearest freckle is measured. The freckle that is closest to another freckle is named A, and is considered the origin of the coordinate system.

[0097] With freckle A as the origin, the vector AB is made a unit vector and it is made to stand 45 degrees on the x-axis. The coordinates are normalized, and the points rotated to fit into a new coordinate system. The coordinate system now decided, and the freckle closest to the origin is named C and the other dot D.

[0098] The preliminary index vector can now be found, and will be [Cx, Cy, Dx, Dy] [0099] Because s 4-dimensional table only can contain integers, the coordinate system can be made discrete. The limits for the discrete coordinate system is defined as the unit circle that passes through A and B

[0100] The coordinate system can be made discrete using N (e.g., 10) number of steps.

[0101] The index vector has now been established and is used to index a table. The ID of the fish which the freckles belonged to are put at the given index.

[0102] To train a deep convolutional artificial neural network what a freckle is, images of fish with freckles annotated in the images can be used.

[0103] A fish may be lookup in an NxNxNxN matrix. Multiple fish IDs may be stored per constellation, but only one ID may receive a significant hit on the right number of constellations.

[0104] When all freckle patterns are logged into a database the individual fish can be recognized at a later time. This can be done by taking a new photo of the individual. The freckles on the new photo are recognized by the deep learning, and constellations of dots are used to produce index vectors, like above. Every index vector, which describes a unique constellation of dots, is used to look up the individual in the matrix. For every slot in the matrix we will find a list of fish IDs that hold just this constellation/index vector.

[0105] The same constellation may not produce the same index vector every time it is calculated/recorded, due to noise and inaccuracies in the discretization process. Despite this, the likelihood to identify fish correctly is high. Furthermore, the likelihood for another fish to have the exact same constellation is significantly lower. Thus, a given fish being identified will receive a significantly higher amount of hits than the rest of the fish population.

[0106] FIG. 11 is a block diagram that illustrates a computer system 1100 with which some embodiments of the present invention may be implemented. Computer system 1100 includes a bus 1102 or other communication mechanism for communicating information, and a hardware processor 1104 coupled with bus 1102 for processing information. Hardware processor 1104 may be, for example, a general-purpose microprocessor, a central processing unit (CPU) or a core thereof, a graphics processing unit (GPU), or a system on a chip (SoC).

[0107] Computer system 1100 also includes a main memory 1106, typically implemented by one or more volatile memory devices, coupled to bus 1102 for storing information and instructions to be executed by processor 1104. Main memory 1106 also may be used for storing temporary variables or other intermediate information during execution of instructions by processor 1104. Computer system 1100 may also include a read-only memory (ROM) 1108 or other static storage device coupled to bus 1102 for storing static information and instructions for processor 1104. A storage device 1110, typically implemented by one or more non-volatile memory devices, is provided and coupled to bus 1102 for storing information and instructions.

[0108] Computer system 1100 may be coupled via bus 1102 to a display 1112, such as a liquid crystal display (LCD), a light emitting diode (LED) display, or a cathode ray tube (CRT), for displaying information to a computer user. Display 1112 may be combined with a touch sensitive surface to form a touch screen display. The touch sensitive surface is an input device for communicating information including direction information and command selections to processor 1104 and for controlling cursor movement on display 1112 via touch input directed to the touch sensitive surface such by tactile or haptic contact with the touch sensitive surface by a user’s finger, fingers, or hand or by a hand-held stylus or pen. The touch sensitive surface may be implemented using a variety of different touch detection and location technologies including, for example, resistive, capacitive, surface acoustical wave (SAW) or infrared technology.

[0109] An input device 1114, including alphanumeric and other keys, may be coupled to bus 1102 for communicating information and command selections to processor 1104.

[0110] Another type of user input device may be cursor control 1116, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 1104 and for controlling cursor movement on display 1112. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.

[0111] Instructions, when stored in non-transitory storage media accessible to processor 1104, such as, for example, main memory 1106 or storage device 1110, render computer system 1100 into a special-purpose machine that is customized to perform the operations specified in the instructions. Alternatively, customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or hardware logic which in combination with the computer system causes or programs computer system 1100 to be a special-purpose machine.

[0112] A computer-implemented process may be performed by computer system 1100 in response to processor 1104 executing one or more sequences of one or more instructions contained in main memory 1106. Such instructions may be read into main memory 1106 from another storage medium, such as storage device 1110. Execution of the sequences of instructions contained in main memory 1106 causes processor 1104 to perform the process. Alternatively, hard-wired circuitry may be used in place of or in combination with software instructions to perform the process.

[0113] The term“storage media” as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media (e.g., storage device 1110) and/or volatile media (e.g., main memory 1106). Non-volatile media includes, for example, read-only memory (e.g., EEPROM), flash memory (e.g., solid-state drives), magnetic storage devices (e.g., hard disk drives), and optical discs (e.g., CD-ROM). Volatile media includes, for example, random-access memory devices, dynamic random-access memory devices (e.g., DRAM) and static random-access memory devices (e.g., SRAM).

[0114] Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the circuitry that comprise bus 1102. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.

[0115] Computer system 1100 also includes a network interface 1118 coupled to bus 1102. Network interface 1118 provides a two-way data communication coupling to a wired or wireless network link 1120 that is connected to a local, cellular or mobile network 1122. For example, communication interface 1118 may be IEEE 802.3 wired“ethernet” card, an IEEE 802.11 wireless local area network (WLAN) card, a IEEE 802.15 wireless personal area network (e.g., Bluetooth) card or a cellular network (e.g., GSM, LTE, etc.) card to provide a data

communication connection to a compatible wired or wireless network. In any such

implementation, communication interface 1118 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

[0116] Network link 1120 typically provides data communication through one or more networks to other data devices. For example, network link 1120 may provide a connection through network 1122 to a local computer system 1124 that is also connected to network 1122 or to data communication equipment operated by a network access provider 1126 such as, for example, an internet service provider or a cellular network provider. Network access provider 1126 in turn provides data communication connectivity to another data communications network 1128 (e.g., the internet). Networks 1122 and 1128 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 1120 and through communication interface 1118, which carry the digital data to and from computer system 1100, are example forms of transmission media.

[0117] Computer system 1100 can send messages and receive data, including program code, through the networks 1122 and 1128, network link 1120 and communication interface 1118. In the internet example, a remote computer system 1130 might transmit a requested code for an application program through network 1128, network 1122 and communication interface 1118. The received code may be executed by processor 1104 as it is received, and/or stored in storage device 1110, or other non-volatile storage for later execution.

[0118] In the foregoing specification, the embodiments have been described with reference to numerous specific details that may vary from implementation to implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

1. A system for unique fish identification of freely swimming fish in a net pen in an aquaculture environment, the system comprising:

a camera for immersion underwater in the net pen and for capturing images of freely swimming fish in the net pen; and

an image processing system of, or operatively coupled to, the camera, the image processing system comprising one or more processors, storage media, and one or more programs stored in the storage media and configured for execution by the one or more processors, the one or more programs comprising instructions configured for:

obtaining and storing a set of sample digital images captured by, or derived from digital images captured by, the camera;

using a trained convolutional neural network to classify a first image region in a first digital image, of the set of sample digital images, as containing an image of a first fish;

using the trained convolutional neural network to classify a second image region in a second digital image, of the set of sample digital images, as containing an image of a second fish;

creating a first cropped digital image based on the first image region in the first digital image;

creating a second cropped digital image based on the second image region in the second digital image;

generating a first feature vector based on the first cropped digital image;

generating a second feature vector based on the second cropped digital image;

measuring a similarity between the first feature vector and the second feature vector; and determining whether the first fish and the second fish are a same fish based on the similarity measured.

2. The system of Claim 1 , the one or more programs comprising instructions further configured for:

measuring a cosine similarity between the first feature vector and the second feature vector; and determining whether the first fish and the second fish are a same fish based on the cosine similarity measured.

3. The system of Claim 1, the one or more programs comprising instructions further configured for:

measuring a Euclidean distance between the first feature vector and the second feature vector; and

determining whether the first fish and the second fish are a same fish based on the Euclidean distance measured.

4. The system of Claim 1 , the one or more programs comprising instructions further configured for:

determining whether the first fish and the second fish are a same fish based on the similarity measured being within a threshold similarity.

5. The system of Claim 1, the one or more programs comprising instructions further configured for:

determining whether the first fish and the second fish are a same fish based on exogenous data.

6. The system of Claim 5, wherein:

the camera is a stereo camera;

the exogenous data includes a first length and a first width of the first fish;

the exogenous data further includes a second length and a second width of the second fish;

the one or more programs comprises instructions further configured for determining the first length and the first width of the first fish based on a first disparity map generated based on the first digital image;

the one or more programs comprises instructions further configured for determining the second length and the second width of the second fish based on a first disparity map generated based on the second digital image; and

the one or more programs comprising instructions further configured for determining whether the first fish and the second fish are a same fish based on the first length, the first width, second length, and the second width.

7. The system of Claim 5, wherein: the exogenous data includes a time distance between a first time when the first digital image is captured by the camera and a second time when the second digital image is captured by the camera; and

the one or more programs comprising instructions further configured for determining whether the first fish and the second fish are a same fish based on the time distance.

8. The system of Claim 7, wherein:

the exogenous data indicates a first camera position of the camera in the net pen when the first digital image is captured and a second camera position of the camera in the net pen when the second digital image is captured; and

the one or more programs comprising instructions further configured for determining whether the first fish and the second fish are a same fish based on the time distance, the first camera position, and the second camera position.

9. The system of Claim 1, wherein:

the trained convolutional neural network is a first trained convolutional neural network; the one or more programs comprising instructions further configured for:

using a second trained convolutional neural network in a Siamese configuration to generate the first feature vector based on the first cropped digital image, and

using the second trained convolutional neural network in the Siamese configuration to generate the second feature vector based on the second cropped digital image.

10. A computer- implemented comprising for unique fish identification of freely swimming fish in a net pen in an aquaculture environment, the method comprising:

obtaining and storing a set of sample digital images captured by, or derived from digital images captured by, a camera;

creating a first cropped digital image based on the first image region in the first digital image; creating a second cropped digital image based on the second image region in the second digital image;

generating a first feature vector based on the first cropped digital image;

generating a second feature vector based on the second cropped digital image;

measuring a similarity between the first feature vector and the second feature vector; determining whether the first fish and the second fish are a same fish based on the similarity measured; and

wherein the stereo camera is immersed underwater in a net pen and captures images of freely swimming fish in the net pen; and

wherein the method is performed by an image processing system of, or operatively coupled to, the camera, the image processing system comprising one or more processors, storage media, and one or more programs stored in the storage media and configured for execution by the one or more processors, the one or more programs comprising instructions executed by the one or more processors to perform the method.

11. One or more non-transitory computer-readable media storing one or more programs comprising instructions configured to perform a method recited in Claim 10.