SE2150231A1 - Improved forest surveying - Google Patents

Improved forest surveying

Info

Publication number
SE2150231A1
Authority
SE
Sweden
Prior art keywords
tree
image
forestry
camera
surveying system
Prior art date
Application number
SE2150231A
Other languages
Swedish (sv)
Inventor
Johannes Ulén
Krister Tham
Linus Mårtensson
Sebastian Haner
Original Assignee
Katam Tech Ab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Katam Tech Ab filed Critical Katam Tech Ab
Priority to SE2150231A priority Critical patent/SE2150231A1/en
Publication of SE2150231A1 publication Critical patent/SE2150231A1/en

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 11/00 - Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C 11/04 - Interpretation of pictures
    • G01C 11/06 - Interpretation of pictures by comparison of two or more pictures of the same area
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B64 - AIRCRAFT; AVIATION; COSMONAUTICS
    • B64U - UNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U 10/00 - Type of UAV
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B64 - AIRCRAFT; AVIATION; COSMONAUTICS
    • B64U - UNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U 10/00 - Type of UAV
    • B64U 10/10 - Rotorcrafts
    • B64U 10/13 - Flying platforms
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 11/00 - Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C 11/36 - Videogrammetry, i.e. electronic processing of video signals from a single source or from different sources to give parallax or range information
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/05 - Geographic models
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/30 - Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/38 - Registration of image sequences
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/50 - Depth or shape recovery
    • G06T 7/55 - Depth or shape recovery from multiple images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/60 - Type of objects
    • G06V 20/64 - Three-dimensional objects
    • G06V 20/653 - Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces
    • A - HUMAN NECESSITIES
    • A01 - AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01G - HORTICULTURE; CULTIVATION OF VEGETABLES, FLOWERS, RICE, FRUIT, VINES, HOPS OR SEAWEED; FORESTRY; WATERING
    • A01G 23/00 - Forestry
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B64 - AIRCRAFT; AVIATION; COSMONAUTICS
    • B64U - UNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U 2101/00 - UAVs specially adapted for particular uses or applications
    • B64U 2101/30 - UAVs specially adapted for particular uses or applications for imaging, photography or videography
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B64 - AIRCRAFT; AVIATION; COSMONAUTICS
    • B64U - UNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U 2101/00 - UAVs specially adapted for particular uses or applications
    • B64U 2101/40 - UAVs specially adapted for particular uses or applications for agriculture or forestry operations
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01S - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 - Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/02 - Systems using the reflection of electromagnetic waves other than radio waves
    • G01S 17/06 - Systems determining position data of a target
    • G01S 17/42 - Simultaneous measurement of distance and other co-ordinates
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01S - RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 19/00 - Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S 19/38 - Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S 19/39 - Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system, the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S 19/42 - Determining position
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10032 - Satellite or aerial image; Remote sensing

Abstract

A surveying apparatus (100) comprising a controller (CPU), the controller (CPU) being configured to: receive an image stream representing a video sequence; determine a camera pose for a second image in the image stream relative to a first image in the image stream; match the first image with the second image, based on the camera pose; and generate a three dimensional model based on the image match.

Description

IMPROVED FOREST SURVEYING
TECHNICAL FIELD
The present invention generally relates to methods, devices and computer programs for forest inventory management, such as forest surveying.
BACKGROUND
In today's forestry industry, the trees in a forest are handled as groups of trees, assuming that all trees within a group have more or less the same characteristics. With today's inventory methods, a group of trees is normally quite big (> 1 ha). The assumption that all trees share the same characteristics is very rough and there is no control of each individual tree. This means that the ground potential is not fully utilized, which in the end leads to production loss, and is why more precise forest inventory management methods and devices are needed.
However, forest inventory management is a highly time-consuming and costly task, whether it is done with traditional measurement tools or with modern solutions such as airborne solutions, laser scanners or cloud servers.
Such solutions are both expensive and complex to implement. For example, using laser scanners requires a large investment and specially trained surveyors. Utilizing cloud servers to compute the forest status also most often requires advanced measuring apparatus and an internet connection, something which is not always available in rural areas, especially not in developing countries. Aerial solutions also require a large investment and specially trained staff, as an aeroplane needs to be purchased or rented.
The most commonly utilized tools are still hand-held analog devices that have not changed for the last several decades. While these tools are fairly fast, the measurement is imprecise, lacks traceability and is highly subjective in the selection of measurement sites. As such, their use requires experience and still cannot be fully trusted.
Solutions have been proposed based on taking pictures of a forest from various angles and then matching the pictures to generate a three dimensional model from which trees may be identified. However, such solutions suffer from the vast computational resources required and are as such not suitable for field work. Execution times of hours are discussed, and then when being executed on work stations, making such solutions only feasible for providing an analysis after a field survey has been done and the camera used has been returned to an office setup.
There is thus a longstanding need for cheaper and more reliable devices and methods that do not require a large investment or specially trained staff, and whose results may be trusted.
SUMMARY
The inventors have realized that, instead of limiting the data to be processed in order to reduce the computational resources needed, the data may be increased so that tools such as SLAM may be used to provide a faster matching of one picture to another, thereby effectively reducing the required computational resources. The inventors therefore propose utilizing video sequences to generate the three dimensional models.
The inventors base this proposal on the insightful realization that techniques such as SLAM can be used also for surveying, not only for controlling autonomous vehicles. Thus, by incorporating this technology from the field of controlling autonomous vehicles into the field of forest surveying, an improved manner is achieved that is capable of being executed even on a contemporary smartphone. This solves the long-standing problem of how to survey forest areas more efficiently without requiring vast computational and/or human resources, by insightful reasoning and by incorporating technologies from remote technical fields.
The problems of the prior art, namely too long computational times, heavy computing resources and/or special equipment such as lasers, which are usually heavy to carry around and require vehicles for proper mounting and transport, have thus been overcome by the inventors. Instead of seeking solutions to filter down or reduce the data set to be processed, or finding better matching algorithms, the inventors have realized that by doing the opposite, and increasing the data to be processed, tools normally only used in other, remote fields may be used.
Furthermore, as the teachings herein are proposed to be supplemented only by sensors commonly found in smartphones, such as motion sensors (accelerometers) and/or positioning sensors, such as Global Positioning System sensors or GNSS sensors, and as the teachings herein rely on video camera recording, video cameras being common in smartphones, the teachings herein enable forestry surveys to be performed using a simple smartphone (or tablet), which greatly reduces the investment and also the maintenance needed for performing forestry surveys.
It is therefore provided an apparatus to overcome or at least mitigate or reduce the problems discussed herein, the apparatus being a forestry surveying apparatus comprising a controller, the controller being configured to: receive an image stream representing a video sequence; determine a camera pose for a second image in the image stream relative to a first image in the image stream; match the first image with the second image, based on the camera pose; and generate a three dimensional model based on the image match; wherein the video sequence and the three dimensional model represent forestry related objects.
It is also provided a method for forestry surveying, the method comprising: receiving an image stream representing a video sequence; determining a camera pose for a second image in the image stream relative to a first image in the image stream; matching the first image with the second image, based on the camera pose; and generating a three dimensional model based on the image match; wherein the video sequence and the three dimensional model represent forestry related objects.
It is also provided a computer-readable medium comprising computer program instructions that, when loaded into a controller, cause the method according to the teachings herein to be executed.
It should be noted that even though the techniques discussed herein have been disclosed as being performed in a handheld device, possibly simultaneously with making the recordings, they may also be performed after the recordings have been made, by uploading them to a (remote) server.
The manner taught herein also solves a problem of how to match two surveyed areas, as discussed below, and it is therefore an object of the present invention to provide a method for matching a first area to a second area, wherein said first and second areas correspond to surveyed areas and each comprises at least one object, the method comprising: receiving said first area; receiving said second area; finding a first set of objects in said first area; finding a matching second set of objects in said second area; and stitching together said first area with said second area by overlaying said first and second sets.
It is also an object of the teachings herein to provide a computer program comprising computer executable instructions which, when downloaded and executed by a processor of a device, cause the device to perform a method as above and also as below.
It is also an object of the teachings herein to provide a device for matching a first area to a second area, wherein said first and second areas correspond to surveyed areas and each comprises at least one object, the device comprising a processor arranged for receiving said first area; receiving said second area; finding a first set of objects in said first area; finding a matching second set of objects in said second area; and stitching together said first area with said second area by overlaying said first and second sets.
As the inventors have also realized, the teachings herein may also be used in other surveying areas, and it is therefore provided a surveying apparatus comprising a controller, the controller being configured to: receive an image stream representing a video sequence; determine a camera pose for a second image in the image stream relative to a first image in the image stream; match the first image with the second image, based on the camera pose; and generate a three dimensional model based on the image match.
It is also provided a method for surveying, the method comprising: receiving an image stream representing a video sequence; determining a camera pose for a second image in the image stream relative to a first image in the image stream; matching the first image with the second image, based on the camera pose; and generating a three dimensional model based on the image match.
It is also provided a computer-readable medium comprising computer program instructions that, when loaded into a controller, cause the method according to the teachings herein to be executed.
BRIEF DESCRIPTION OF THE DRAWINGS
The above, as well as additional objects, features and advantages of the present invention, will be better understood through the following illustrative and non-limiting detailed description of preferred embodiments of the present invention, with reference to the appended drawings, wherein:
Figure 1A is a schematic view of a user equipment configured according to an embodiment of the teachings herein;
Figure 1B is a schematic view of the components of a user equipment configured according to an embodiment of the teachings herein;
Figure 2 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 3 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 4 is a schematic view of one example use of a user equipment according to one embodiment of the teachings herein;
Figure 5 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 6 is a schematic view of one example use of a user equipment according to one embodiment of the teachings herein;
Figure 7 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 8 shows a drone and UE system adapted according to one embodiment of the teachings herein;
Figure 9 shows a drone and UE system adapted in use according to one embodiment of the teachings herein;
Figure 10 shows a schematic view of a computer-readable product according to one embodiment of the teachings herein;
Figure 11 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 12 is a schematic view of one example use of a user equipment according to one embodiment of the teachings herein;
Figure 13 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 14 is a schematic view of one example use of a user equipment according to one embodiment of the teachings herein;
Figure 15 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 16 is a schematic view of one example use of a user equipment according to one embodiment of the teachings herein;
Figure 17 is a general flowchart of a method according to one embodiment of the teachings herein;
Figure 18 is a schematic view of one example use of a user equipment according to one embodiment of the teachings herein;
Figure 19 shows a schematic view of a first area and a second area that are to be stitched together to form a composite area according to one embodiment;
Figure 20 is a flow chart illustrating a method for a device according to an embodiment;
Figure 21 shows a schematic view of a first area and a second area that are to be stitched together to form a composite area according to one embodiment; and
Figure 22 shows a schematic view of a combination of generating stands and planning a route according to one embodiment.
DESCRIPTION
The inventors have realized that by relying on some specific assumptions about objects, such as trees, it is possible to simply and elegantly extract the main features of objects, such as trees, for example the width and location (at least the relative position of one object in reference to other objects), from a simple video film, possibly filmed with a smartphone.
The inventors have further ingeniously combined trialed video and image processing techniques, which have been selected and combined in a manner that enables the analysis to be performed using only limited computational power, so that the analysis may be made by a smartphone, even in real time while the video is being recorded.
Using the teachings herein, as invented by the inventors, it is thus possible to conduct forest inventory by simply filming (sections of) a forest with a smartphone, a task that does not require expensive equipment or specially trained staff, and that produces results that are within acceptable accuracy and can thus be trusted.
It should be noted that the manner taught herein may also be executed with any camera having or being connected to a processing unit. Examples of such arrangements are smartphones, tablets, laptop computers, video cameras connected (or configured to be connected) to a tablet, a laptop computer, a smartphone or other processing terminal, and surveillance cameras, to mention a few examples. Such arrangements will hereafter be referred to as a user equipment (UE), and an example of such a UE will be given with reference to figures 1A and 1B.
Figure 1A shows an example of a User Equipment 100, in this embodiment a smartphone 100. Another example of a UE is a tablet computer. Figure 1B shows a schematic view of the components of a UE 100. The UE 100 comprises a user interface (UI), which in the example of figures 1A and 1B comprises a display 110 and one or more physical buttons 120. The display 110 may be a touch display, and the user interface may thus also comprise virtual keys (not shown). The UI is connected to a controller which is configured for controlling the overall operation of the UE 100. The controller may be a processor or other programmable logical unit. The controller may also be one or more such programmable logical units, but for the purposes of this application the controller will be exemplified as being a Central Processing Unit (CPU). The controller CPU is connected to or arranged to carry a computer readable memory for storing instructions and also for storing data. The memory MEM may comprise several memory circuits that may be local to the UE or remote. Local memories are examples of non-transitory mediums. Remote memories are non-transitory in themselves, but present themselves to the UE as transitory mediums.
The UE 100 further comprises or is arranged to be connected to a camera 130 for receiving an image stream, which image stream is to be processed by the controller CPU and at least temporarily stored in the memory MEM. As the camera 130 records a video sequence, the video sequence may simultaneously be displayed on the display 110.
The UE 100 may also comprise sensors, such as an accelerometer 140 configured to provide the controller with sensor data, either to be processed by the controller or (at least partially) pre-processed. In one embodiment, this enables the controller to determine or follow movements of the camera, both as regards lateral movements and changes in angles, that is, the pose of the camera. A pose is thus a position and a direction or angle of a camera, resulting in six (6) degrees of freedom indicating how a camera is moved and/or rotated, making it possible to determine how the camera is moved and/or rotated from one pose to another pose. Other examples of sensors include, but are not limited to, GNSS devices, time of flight sensors, compasses, and gyros, to name a few.
In one embodiment, as realized by the inventors, this enables the controller to compare a movement in the SLAM data to the sensor detected movements of the camera so that a scale may be determined (as in step 592).
The UE 100 may also comprise positional sensors, such as a global navigational system sensor configured to provide the controller with position data, either to be processed by the controller or (at least partially) pre-processed. This enables the controller to determine or follow the actual position of the camera. This position is determined in an external coordinate system (external to the SLAM data cloud), such as a Global Navigation Satellite System (GPS or GNSS).
As mentioned above, the inventors of the teachings herein have realized that it is possible to reduce the computational resources needed to perform the image matching of some prior art solutions by replacing the series of images with a video or image stream. The difference between an image stream and a series of images is thus that the image stream comprises images that are taken at regular time intervals, where the time intervals are very short, for example representing 30 images per second or higher, whereas a series of images are images taken at irregular intervals, where the intervals are generally longer, in the order of an image per 10 seconds or more, even minutes.
The inventors have realized that by actually increasing the data to be processed, the computational resources needed may be significantly reduced, as there will be a stricter correlation between each image, which will make the image matching faster and more efficient. Thus, by stepping from photographic single shot series to video streams, and thereby increasing the data to be processed by a huge factor, the image matching may be done more effectively, thereby also reducing the computational resources actually needed. This is made possible by the realization that by using techniques such as SLAM (Simultaneous Localization And Mapping), the camera's position may be determined, whereby the matching between subsequent images becomes much simpler as their positional relationship is known.
A high level description of a manner for executing the present invention for providing forestry related parameters will be given with reference to Figure 2, which shows a flowchart for a general method according to an embodiment of the present invention. The UE receives 210 a video sequence (or an image stream), possibly along with sensor data, determines 220 the camera's position and angle, i.e. the pose of the camera, performs image matching 230 between the respective images in the video sequence using the camera's position, and thereby generates 240 a three dimensional model of the filmed area and any objects therein. The three dimensional model may then be used to determine 250 various forestry related parameters.
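By way of illustration, the pose determination of step 220 could be sketched as follows, assuming OpenCV and a calibrated camera; the calibration matrix K below is an assumed placeholder, not a value from the application. The sketch recovers the rotation and translation between two consecutive frames, which is what later makes the image matching simpler, the positional relationship being known. Note that the translation is recovered only up to scale, which is why a separate scale determination (step 592 below) is needed.
```python
# Minimal sketch (not the application's own code): relative camera pose
# between two consecutive video frames, using OpenCV. K is an assumed
# placeholder calibration matrix for a 1280x720 camera.
import cv2
import numpy as np

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

def relative_pose(frame1, frame2):
    g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    # Track corners from frame1 into frame2; at 30 frames per second the
    # inter-frame motion is small, which keeps the tracking cheap.
    p1 = cv2.goodFeaturesToTrack(g1, maxCorners=500, qualityLevel=0.01,
                                 minDistance=8)
    p2, status, _err = cv2.calcOpticalFlowPyrLK(g1, g2, p1, None)
    ok = status.ravel() == 1
    p1, p2 = p1[ok], p2[ok]
    # The essential matrix encodes the camera motion; RANSAC discards
    # mistracked points.
    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    return R, t  # rotation and unit-scale translation: the relative pose
```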
The sensor data may relate to positional information, such as GNSS or GPS coordinates or other data. The sensor data may also or alternatively relate to motion information, such as accelerometer or gyro data.
This manner may be used to survey a forest area, thereby overcoming the problems discussed in the background section. This manner may also be used to determine other forestry related parameters, such as the amount of timber in a pile of logs or the amount of chippings in a pile of chippings.
In further embodiments, comparable to those of figures 15, 16, 17 and 18, the teachings herein may also be used, as realized by the inventors, for analyzing and determining the content of piles of gravel, piles of sand, piles of grain, piles of potatoes (or other agriculture-related piles), and even for determining the content of blocks of stone or ore and of goods and cargo packing.
A high level description of a manner for executing the present invention will be given with reference to Figure 3, which shows a flowchart for a general method according to an embodiment of the present invention, and to Figure 4, which shows a schematic view of a UE, such as a smartphone, configured to work according to the present invention in a forest environment. A more detailed description will follow below.
The UE 100 is configured to receive a video sequence (an image stream) of a currently surveyed area comprising a number of objects, such as a forest area comprising trees T. The video sequence may comprise meta data, such as when capturing the video sequence (comprising an image stream) with a (possibly built-in) camera 130. The video sequence may be received along with time-stamped sensor data or other sensor data, the sensor data being (a portion of) the meta data, in a first step 310. The sensor data may relate to positional information, such as GNSS or GPS coordinates or other data. The sensor data may also or alternatively relate to motion information, such as accelerometer or gyro data.
The image stream is generated or captured so that multiple angles of the same area are captured. This may be achieved by simply walking the UE holding the camera along a path P through an area, possibly sweeping the camera back and forth, thereby also increasing the size of the surveyed area captured by the camera's angle of view AV. At the same time, the meta data may be generated. One example is movement data that is generated by the accelerometer 140.
It should be noted that the camera may be brought through a forest in many different ways. For example, it can be carried by a user or operator that walks through a forest. It can also be positioned on a vehicle, such as a forestry vehicle or an all-terrain vehicle (ATV). The camera may also be carried by a drone, commercial or hobby.
However, unlike prior art solutions, the teachings herein benefit from not requiring vast computational resources and are as such suitable for use in a handheld device that is simply walked through a forest, unlike the prior art solutions which are, in some manner, dependent on bigger and heavier equipment, be it sensors or computational resources.
In one embodiment the camera is configured to use a wide-angle lens. If a wider angle than the one originally configured for the camera is desired, the camera may be equipped with a different lens or an add-on lens. A wider angle enables more trees or objects to be recorded in each camera angle and frame. If a lens such as a fish-eye lens is used, the manners herein may be configured to take this into account by compensating for angular distortion at the edges of a frame.
The image stream is then analyzed by the controller CPU of the UE 100 by determining 320 the pose of the camera and through a series of image processing algorithms 330, to generate a three dimensional (3D) model of the forest area currently being surveyed in a fourth step 340.
It should be noted that the image stream may be analyzed at the time it is received, such as at the time it is captured, or at a later stage, such as when it is being viewed.
The algorithms used may include for example image segmentation, SLAM, depth map estimation, point cloud generation and merging thereof, edge detection, cluster detection, deep learning, and more, to estimate relative and absolute positions and parameters of trees in the video, as well as information on the surrounding terrain, for example obtained by receiving a current position from a Global Navigation Satellite System (GNSS) and checking that position in a map program.
As the 3D model has been generated, stems of the objects or trees in the image stream are identified 350. Making the assumptions that a tree stem generally has two straight and parallel sides and that the sides end and start at approximately the same level makes it fairly easy to identify the stems and therefore also the trees, the number of trees, the width of the trees and the (relative) location of the trees, as is done in one embodiment. In one embodiment, it may also be determined whether the color of the tree stem is different from the background to identify the stems.
By realizing that a simple 3D model may be generated even in a smartphone, and by identifying the trees in the 3D model, a relatively accurate estimation of the tree density in an area may be provided in a manner that is simple and cheap to execute and that does not require expensive specialized equipment.
In order to provide visual feedback of the tree density and such, but also to provide a correction mode, the detected trees may be presented 360 on the display 110 of the UE 100 by simply overlaying graphical representations, such as cylinders, of the detected trees on the video being recorded or viewed.
As the image stream is recorded or captured, it is beneficial if the camera is moved so as to capture multiple angles. This may be achieved by simply instructing a user or operator to record a video when walking through the forest, either in a recommended pattern through a single measurement zone, or as a free walk throughout the forest.
In one embodiment, the video recording comprises a plurality of video recordings, some of which may have been done at different times, and some of which capture different angles of a tree or several trees. In this manner, the information available for a given tree may be supplemented over time, as the actual position of a tree may be determined, whereby the tree may be identified as being the same in different recordings.
By using the meta data, the position of the captured or surveyed area may be positioned 370 within the forest, especially if the sensor data includes GNSS positions.
Statistics may then be generated 380 over several recordings by comparing them based on their corresponding positions, to find statistics such as tree density distributions, diameter distributions, terrain flatness index, and similar.
During the recording, the algorithms may be run in real time to produce information on the current measuring location. This information can be used to determine, among other things, the size of the area that has been covered when measuring a zone. The real-time data can provide an error estimate in the test. Once this error estimate is below a tolerance level, an indication can be provided to the user that the current measurement is done.
When walking through the forest, as proposed by the present invention, instead of only targeting selected test areas, several benefits are achieved. One benefit is that the camera gets closer to several more trees than it would when focusing on surveying fixed test areas. The closer view of more trees, giving a more detailed view of the trees, may be used to increase the accuracy of the image processing, in that it will be given better grounds for determining trees (more high resolution trees with color and texture to compare with, as is done in some embodiments).
Furthermore, the same error tolerance can be used to split a measurement into multiple individual measurements, and provide relevant information on the area the user is currently in rather than averaged measures for the whole path, though both may be relevant. This enables an operator or system designer to, depending on different criteria, split up or partially process a measurement in distinct processes which can then be referred to individually or communally.
In one embodiment, the UE is configured to recommend new locations that will optimize coverage of the forest based on finished recordings and their positions. This may be done in different manners. In one embodiment, the UE is configured to select the location with the largest distance to all previous recordings, each recording being associated with a location through the meta data for that recording; a sketch of this selection rule is given below. In such an embodiment, the measurements on the forest are determined as complete when the largest distance to another measurement zone is below a threshold level. Alternatively or additionally, the UE may be configured to determine the coverage of previous measurements by mapping an area starting at the location of each recording and extending in the directions covered by the camera (based on the view angle of the camera in relation to the position, a compass direction and movements determined through an accelerometer or compass, all comprised in the meta data). The sensors may comprise an accelerometer and/or a compass for providing a camera angle, which may also be part of the meta data. The UE may then be arranged to identify areas having a congruent form and an area exceeding a threshold value and propose such areas as a next area to be surveyed. When no more areas exceeding the threshold are found, the measurements are deemed complete.
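A minimal sketch of the "largest distance to all previous recordings" rule could look as follows, assuming planar coordinates in metres and an assumed completion threshold:
```python
# Sketch of the location recommendation rule: among candidate locations,
# propose the one whose nearest previous recording is farthest away, and
# regard coverage as complete when that distance falls below a threshold.
import numpy as np

def recommend_next(candidates, recorded, done_threshold=50.0):
    """candidates: (N, 2), recorded: (M, 2) planar coordinates in metres;
    assumes at least one previous recording exists."""
    d = np.linalg.norm(candidates[:, None, :] - recorded[None, :, :], axis=2)
    nearest = d.min(axis=1)          # distance to closest previous recording
    if nearest.max() < done_threshold:
        return None                  # measurements deemed complete
    return candidates[nearest.argmax()]
```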
In one embodiment, the UE is configured to recommend locations by providing or suggesting a path through the forest and requesting a number of measurements along this path, such that both the total distance traversed is minimized, and the largest distance to a measurement zone within the forest is below the distance threshold after the whole path is walked. Such a recommendation would also take into account terrain, paths through the forest and forest boundaries from a GIS system, so that the recommended path is both traversable and relevant.
Figure 5 shows a flowchart for a more detailed method according to one embodiment of the present invention. Figures 6A to 6F show schematic views of partial and final processing results performed using the method of figure 5. As for the method of figures 2 and 3, the UE 100 receives a video sequence 510 of a forestry related area, such as a portion or area of a forest, see figure 6A showing the display of a UE 100 showing a video of a forest area including some trees T. The video sequence comprises a plurality of image frames, or simply frames, that are analyzed individually (at least some of them) by, for each analyzed frame, running an algorithm to provide a camera position 520 and a point cloud 530.
The video sequence is possibly accompanied by sensor data. In one embodiment, a sparse point cloud is used. In one embodiment a dense point cloud is used. In one embodiment a combination of a sparse and a dense point cloud is used. The sensor data may relate to positional information, such as GNSS or GPS coordinates or other data. The sensor data may also or alternatively relate to motion information, such as accelerometer or gyro data.
In one embodiment a SLAM algorithm is used to provide the point cloud and the camera position. SLAM (Simultaneous Localization And Mapping) is the computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of a unit's location within it.
SLAM originates in robotics, and the inventors have thus realized that by combining algorithms traditionally used in the remote field of robotics, an improved forestry management may be achieved. Furthermore, as some SLAM algorithms rely on multiple angles of an object having been captured, but as most tablets and smartphones and other small, handheld devices mostly only cover one camera angle (apart from having two cameras arranged in opposite directions), the inventors have overcome this by moving the camera around, thereby simulating more than one sensor, as multiple angles are captured by the same camera/sensor. Moving the camera around while filming/recording has several benefits compared to taking the same video from two vantage points, in that only one user action is required and more than two angles are captured simultaneously. Also, the relationship between the two positions need not be given, as that information will be included in the continuous image stream captured. Furthermore, as the manner herein is highly suitable for small device cameras, it is also highly suitable for manual recording (or possibly drone recording). Such manual recording is very difficult to achieve without the camera being moved in different directions. A person holding a camera while walking through a forest is highly unlikely to keep the camera stable. The proposed manner of recording the video sequence thus further increases the efficiency of the manner in a positive circle.
A base plane and an up direction in the point cloud are determined 540. In one embodiment this is done by filtering points and fitting a base plane to the filtered points. In one embodiment this is done by using a RANSAC algorithm. Random sample consensus (RANSAC) is an iterative method to estimate parameters of a mathematical model from a set of observed data that contains outliers, when outliers are to be accorded no influence on the values of the estimates; here, the base plane is the mathematical model, and the outliers and inliers are points in the point cloud. Figure 6B shows a base plane 610.
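By way of illustration, a RANSAC base-plane fit could be sketched as follows (plain NumPy; the iteration count and inlier distance are assumed values, not taken from the application):
```python
# Minimal RANSAC sketch: repeatedly fit a plane to three random points
# and keep the plane supported by the most inliers in the cloud.
import numpy as np

def fit_base_plane(points, iters=200, inlier_dist=0.05, seed=None):
    """points: (N, 3) array. Returns (unit normal n, offset d), n.x + d = 0."""
    rng = np.random.default_rng(seed)
    best_n, best_d, best_count = None, None, -1
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:              # degenerate (collinear) sample
            continue
        n /= norm
        d = -n @ p0
        count = np.sum(np.abs(points @ n + d) < inlier_dist)
        if count > best_count:       # keep the best-supported plane so far
            best_n, best_d, best_count = n, d, count
    return best_n, best_d
```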
A height map is generated 550, in one embodiment by dividing the point cloud into 2D cells along a plane, and finding a median distance from each point cloud point in the corresponding cell to the plane. When detecting trees, points that are near the ground, as specified by the plane and height map, will be filtered out.
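A minimal sketch of such a height map, assuming the point cloud has already been rotated so that the fitted base plane lies at z = 0 (the cell size and ground threshold below are assumed values):
```python
# Simplified height-map sketch: median height per 2D ground cell, then a
# filter keeping only points well above their cell's ground level.
import numpy as np

def height_map(points, cell=0.5):
    """points: (N, 3) with the ground at z = 0. Returns {(i, j): median z}."""
    ij = np.floor(points[:, :2] / cell).astype(int)
    cells = {}
    for key, z in zip(map(tuple, ij), points[:, 2]):
        cells.setdefault(key, []).append(z)
    return {key: float(np.median(zs)) for key, zs in cells.items()}

def above_ground(points, hmap, cell=0.5, min_height=0.3):
    """Keep points more than min_height above their cell's ground estimate."""
    keep = []
    for p in points:
        key = (int(np.floor(p[0] / cell)), int(np.floor(p[1] / cell)))
        if p[2] - hmap.get(key, 0.0) > min_height:
            keep.append(p)
    return np.array(keep)
```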
To identify 575 the trees, clusters of points are detected 570 in the remaining points. In one embodiment this is done by, for all points, finding all other points that are within a distance threshold to the current point. Each such cluster is assumed to be a tree. Seen from the base plane, there will be a higher density of points at the locations/positions where there is a tree, as the points can be on top of one another along the stem of the tree. These points may be seen as a cluster of points which can be identified through a density mapping, similar to the height map, but instead of determining a median point it is determined how many points are part of each grid cell. Those grid cells that have significantly more points than their surroundings are initially regarded to be trees. A filtering may then be used to finally determine whether it is a tree or not. A cluster is thus initially regarded to be a tree. Figure 6D shows examples of such point clusters 630.
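The density mapping could be sketched as follows; the cell size and density factor are assumed values, and comparing against the mean cell count is a simplification of comparing against the local surroundings:
```python
# Density-mapping sketch: count above-ground points per ground cell and
# treat cells with markedly more points than average as tree candidates
# (a stem stacks many points vertically over a small footprint).
import numpy as np

def tree_candidate_cells(points, cell=0.25, factor=3.0):
    """points: (N, 3) above-ground points, ground at z = 0."""
    ij = np.floor(points[:, :2] / cell).astype(int)
    counts = {}
    for key in map(tuple, ij):
        counts[key] = counts.get(key, 0) + 1
    mean_count = np.mean(list(counts.values()))
    return [key for key, n in counts.items() if n > factor * mean_count]
```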
To provide an approximation of the tree, a geometrical approximation, which may be displayed as a graphical indication, is fitted to each cluster 580. In one embodiment this is performed by using a RANSAC algorithm. The geometrical approximation may be a cylinder, two parallel lines, a rectangular body, a cone, or two lines that in their extensions are converging. Figure 6E shows examples of such graphical approximations (possibly indicated by graphical indications) 640 having been applied to the point clusters 630.
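The application names RANSAC-fitted cylinders or line pairs; as a deliberately simpler stand-in, the following sketch fits a circle (a Kåsa least-squares fit) to the ground-plane projection of one cluster, yielding a stem centre and a diameter estimate:
```python
# Kåsa circle fit (illustrative stand-in for the cylinder/line fitting):
# solve x^2 + y^2 + a*x + b*y + c = 0 in least squares.
import numpy as np

def fit_stem_circle(xy):
    """xy: (N, 2) points of one cluster projected onto the ground plane."""
    A = np.column_stack([xy[:, 0], xy[:, 1], np.ones(len(xy))])
    rhs = -(xy[:, 0] ** 2 + xy[:, 1] ** 2)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    cx, cy = -a / 2.0, -b / 2.0                 # circle centre
    r = np.sqrt(cx ** 2 + cy ** 2 - c)          # circle radius
    return (cx, cy), 2.0 * r                    # stem centre and diameter
```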
To provide 590 a refined 3D model, the detected trees are filtered through an image segmentation algorithm, such as a deep learning network, an artificial neural network or a convolutional neural network, which determines whether a pixel is part of a tree, based on various parameters. Examples of such parameters are image patches (an image patch being defined as a rectangular subset of the image). Figure 6F shows examples of such refined detected trees 650.
In one embodiment, the image segmentation algorithm may be arranged to concurrently or subsequently also determine other relevant information, such as the type of tree.
Below, examples will be given of parameters that may be provided using the teachings herein, and of how they may be used.
It should be noted that SLAM is not the only technique necessary or possible to use for analyzing the video recording. As the video recording provides a more fluent indication of the change of camera position, it also lends itself to alternative image analysis such as deep recurrent convolutional neural networks (RCNN). Using such methods, the forest survey may produce a rough estimate without performing the SLAM analysis.
Also, the above (and below) described generation of a base plane and a height map may be optional and is not essential, but does provide a clear benefit in that it is easier to find the clusters representing trees.
As an optional step, the UE may be configured to provide a scale correction measurement, based on sensor information (GPS or otherwise achieved position in combination with SLAM). This may be done by estimating the scale factor between the accelerometer and the second derivative of the corresponding camera/SLAM position through a filter, such as a Kalman filter, or by a matching algorithm that aligns the accelerometer data from the sensors with acceleration data determined from the SLAM-determined movements. As the acceleration from the sensors is absolute/real, the acceleration in the model may be aligned to provide a scale. The scale correction measurement is used to calibrate 590 the SLAM algorithm used.
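A simplified sketch of this scale estimation, assuming the accelerometer data is synchronised with the SLAM positions, gravity-compensated and uniformly sampled (a real implementation would rather filter, e.g. with a Kalman filter, as stated above):
```python
# Scale sketch: compare the magnitude of the measured IMU acceleration
# with the second derivative of the (unit-less) SLAM camera positions,
# and solve for the least-squares scale factor mapping model units to metres.
import numpy as np

def estimate_scale(slam_positions, imu_accel, dt):
    """slam_positions, imu_accel: (N, 3) arrays sampled every dt seconds."""
    slam_accel = np.gradient(np.gradient(slam_positions, dt, axis=0), dt, axis=0)
    a = np.linalg.norm(slam_accel, axis=1)   # model acceleration (model units/s^2)
    b = np.linalg.norm(imu_accel, axis=1)    # measured acceleration (m/s^2)
    return float(a @ b / (a @ a))            # b is approximately scale * a
```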
Figure 7 shows a flowchart of a more detailed method according to the present invention.
In a first step 510 an image stream is received of a first forest area. The image stream includes meta data recorded by sensors in connection with recording the image stream. Examples of such sensors are GNSS sensors, accelerometers, digital compasses, gyrometers (providing both direction changes and linear movements, possibly to complement the GPS positions) and time of flight sensors. The image stream may be received from an internal camera or an external camera as the image stream is being recorded or captured. Alternatively, the image stream may be received as a previously captured or recorded image stream.
The image stream has thus been generated in an earlier or simultaneous step 500, by a user or operator moving around with the camera to capture the ground and trees from different positions, thus capturing multiple angles. It should be noted that although the description herein is focused on a user or operator moving the camera, in one embodiment where the camera is carried by an automated vehicle, such as a preprogrammed drone (see figures 9A and 9B) or a robotic work tool traversing the forest, the automated vehicle is programmed to move the camera so that multiple angles are covered. It should also be noted that the user or operator need not move the camera around manually; the camera may be carried by a remote controlled vehicle controlled by the operator. The camera may also be mounted on a vehicle (directly) controlled by the user or operator.
It should be noted that although only one camera is mentioned to be utilized herein, the teachings may also be applied, and with added benefit, to constellations with multiple cameras.
The added benefit lies in that, as the teachings herein provide a manner for determining how one camera has moved, the same movement may be applied to the whole constellation, thus enabling more video recording angles but without requiring a linear, or worse, increase in additional processing.
In one embodiment the image stream is recorded by walking in a closed loop, such as an ellipse, a square/rectangle, back and forth, or any other closed-loop pattern, to facilitate a more accurate motion tracking in the SLAM algorithm to be applied.
In one embodiment the image stream is recorded by walking in a straight line, or in a curved line, with the camera angle being at a right angle to the direction of movement.
As the image stream has been received, the camera pose and/or position is estimated 520 through Simultaneous Localization and Mapping (SLAM) algorithms. This is for example done by tracking the motion of a set of points between images or frames in the image stream, constructing key frames and storing the point locations seen in the key frames, and then determining the camera position based on the movement respective to each key frame. A key frame is a concept that is used in searching. A key frame comprises the camera position, the video image and other meta data, such as the position of the key frame in relation to adjacent key frames, the preceding and succeeding key frames, and the positions of each known point at this time. This enables a reconstruction of at least parts of the camera sequence, if it is later determined that an error has been introduced. It can also be used to re-step to a measurement set if a reference is temporarily lost.
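As a purely hypothetical illustration of the key frame record just described (the field names below are this sketch's own choices, not taken from the application):
```python
# Hypothetical key frame record: pose, the stored frame, the known point
# positions at this time, and links to adjacent key frames that allow
# re-stepping if a reference is temporarily lost.
from dataclasses import dataclass, field
from typing import Dict, Optional
import numpy as np

@dataclass
class KeyFrame:
    pose: np.ndarray                  # camera position/orientation (4x4 matrix)
    image: np.ndarray                 # the video frame kept with the key frame
    points: Dict[int, np.ndarray] = field(default_factory=dict)  # id -> 3D point
    prev_id: Optional[int] = None     # preceding key frame
    next_id: Optional[int] = None     # succeeding key frame
```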
A key concept of SLAM techniques is that they analyze movements of objects in subsequent images, and recreate the camera's movement and the position of the object(s) in space. It should be noted that other SLAM techniques, apart from those disclosed herein in detail, may also be used for implementing the teachings herein.
As the camera position has been determined, a point cloud is determined 530 by estimating a depth map based on the calculated camera poses/positions and corresponding frames, fusing each point in the depth map into a larger point cloud, and generating new points in the cloud when there is no corresponding point to fuse with.
The depth map may be constructed in a manner similar to tracking an image point in a sequence of video images, but with a high density, for example by following each image point. Due to the SLAM process, the camera movement is known and the depth of each point is determined based on the point displacement.
As the camera movement is known, as well as the distance from the camera to a point, it is possible to add a point to a common coordinate system, that of the camera(s), and it is then also possible to find points that are close to one another. Points that are very close to other points may be regarded as being the same point and can therefore be fused.
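The fusion rule could be sketched as follows, with an assumed fusion radius; a k-d tree keeps the nearest-point lookup cheap (SciPy is assumed here):
```python
# Fusion sketch: a new point very close to an existing cloud point is
# regarded as the same point and fused (averaged); otherwise it is added
# as a new point in the cloud.
import numpy as np
from scipy.spatial import cKDTree

def fuse_points(cloud, new_points, radius=0.02):
    """cloud: (N, 3) existing points; new_points: (M, 3). Returns merged cloud."""
    tree = cKDTree(cloud)
    dist, idx = tree.query(new_points)       # nearest existing point per new point
    fused = cloud.copy()
    added = []
    for p, d, i in zip(new_points, dist, idx):
        if d < radius:
            fused[i] = (fused[i] + p) / 2.0   # fuse near-duplicates
        else:
            added.append(p)                   # no correspondence: new point
    return np.vstack([fused] + added) if added else fused
```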
Also, as the image stream has been received, possibly in conjunction with determining the point cloud, a ground model is determined 545 by fitting a single plane to a set of points with a high enough confidence.
The confidence may for example be determined based on the distance of the movement until the point was found in the depth map, or on how well one point's movement corresponds to the movement of the surrounding points.
Also, as the image stream has been received, possibly in conjunction with determining the point cloud, trees are identified in the image stream and possibly associated with a location and/or various tree attributes. The trees are identified by splitting the point cloud into a 2D grid from the normal direction of the ground plane, filtering out all points that are within a threshold distance to the plane, finding all clusters with many remaining points (i.e. points that are outside a threshold distance to the plane), and assuming these are trees. The trees may then optionally be approximated, for example by fitting lines, rectangles, cylinders or other geometrical shapes to these clusters. Other embodiments may use other algorithms for identifying trees, such as in one embodiment utilizing edge detection and/or tree segmentation, for example being based on a deep learning network, such as an Artificial Neural Network (ANN) or a Convolutional Neural Network (CNN).
A three dimensional (3D) model is thereby provided of the test area with the trees approximated by, for example, lines. The inventors have realized that by approximating a view of the forest by a 3D model, many measurements may be made in the 3D model at a high enough accuracy, thus providing reliable and objective results that may be trusted.
As the trees have been identified and approximated in the 3D model, a sample plot may be determined 591 by determining all locations along the plane representing the ground which have been accurately recorded by the camera, that is, that have been recorded with a confidence higher than a threshold value, such that it is likely all trees were detected, marking this area as a sample plot and determining its size.
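To illustrate what a well-defined sample plot enables, the following sketch derives stems per hectare and basal area from the detected stem diameters and the plot size (the density measure itself is defined in the next paragraph); the example figures in the trailing comment are invented for illustration:
```python
# Sample plot metrics sketch: stems per hectare and basal area (the sum
# of stem cross-section areas) per hectare, from detected stem diameters.
import numpy as np

def plot_metrics(stem_diameters_m, plot_area_m2):
    """stem_diameters_m: diameters (m) of stems inside the sample plot."""
    d = np.asarray(stem_diameters_m)
    per_ha = 10_000.0 / plot_area_m2              # 1 ha = 10 000 m^2
    basal_area = np.sum(np.pi * (d / 2.0) ** 2)   # m^2 within the plot
    return {"stems_per_ha": len(d) * per_ha,
            "basal_area_m2_per_ha": basal_area * per_ha}

# e.g. 14 stems of about 0.3 m diameter on a 400 m^2 plot gives
# 350 stems/ha and roughly 24.7 m^2/ha of basal area.
```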
The sample plot may be used to determine a basal area and thereby a density of trees. As the trees have been identified, and their approximate width has also been determined through their detected stems, the tree density of the area may also be determined through a simple divisional operation. The tree density of an area may be defined as the volume of tree (or timber) per area, such as cubic meters of tree per hectare (m3/ha).
The UE may also be configured to calculate an absolute scale 592. In one embodiment this is done by determining an average distance between the camera and the ground plane. The measured height and the determined average distance thereby give the scale factor, as they represent the same height.
In one embodiment, the UE may be configured to determine the scale by identifying an object having a known size in one or more frames. The known object may be a scale indicator that is placed in the area by the operator. The scale may then be determined once the distance to the object is determined in the 3D model.
In one embodiment this is done by comparing the accelerometer movement to the movement determined using the SLAM technology.
The scale may be utilized for optimizing the size of the point cloud that is generated when determining the point cloud 530.
The UE may also be configured to make further tree analysis 593, possibly through segmentation as discussed above, such as determining tree curvatures, determining tree species and detecting tree branches. These factors may be used to identify trees (curvatures and branches) and also to identify the tree type or tree species, providing a more detailed survey result without the user or operator having to know different tree types.
The UE may also be configured to display 594 the recorded video with an overlay of trees, ground model and sample plot boundary, to provide visual feedback, but also to offer an opportunity to give feedback, such as by receiving corrective commands that amend the 3D model.
The UE may also be configured to use stored camera poses and fitted tree information to render trees, or graphical objects or indications representing trees, on top of the video sequence, enabling a user or operator to identify which trees have been detected by the UE. Other components or parameters may also be rendered in a similar fashion as necessary/desirable.
The UE may thus also be configured to receive user commands 595 for correcting the 3D model. The user commands may for example relate to a selection of, and subsequent correction of, trees, their shapes and/or positions.
As a correction or series of corrections has been received and executed, meaning that the 3D model is changed, other features and parameters may be recalculated, and the 3D model may be redisplayed. The user may then provide a final confirmation of the model. Alternatively, the 3D model may be changed at any subsequent time.
As a 3D model has been provided, forestry relevant parameters may be determined 596 based on the 3D model that would otherwise be difficult to determine without the proper training and knowledge, and then only in a time consuming manner. The parameters, and how they are determined, will be disclosed and discussed further below. The parameters may be used in subsequent determinations and/or presented to the user or operator.
The 3D model, possibly along with the parameters and any measurements made, may be saved for later retrieval. The storing may be done locally or externally, such as in a server or in a cloud server.
The UE may also be configured to align the test plot or area geographically by comparing it to a second measurement that at least overlaps the test area, by comparing specific patterns of the test plot and finding matches. The Swedish patent application SE 1630035-2 discloses one manner of doing so that may be used beneficially in combination with the teachings herein.
The UE may also be configured to align the test plot or area to other adjacent test plots, also by comparing the test plots to be aligned with at least one second set and, by finding matching patterns, identifying the test plots' relative positions. The Swedish patent application SE 1630035-2 discloses one manner of doing so that may be used beneficially in combination with the teachings herein.
The UE may also be configured to identify and provide a (GNSS) position for each individual tree in a test plot. By comparing the test plot with a second test plot and finding patterns, it is possible to identify a single tree, and if one or both of the test plots is associated with an accurate position, the position of each tree within that plot may also be determined. The Swedish patent application SE 1630035-2 discloses one manner of doing so that may be used beneficially in combination with the teachings herein.
The geographical position may be displayed for the user with relevant map information and, if needed, a more detailed 3D modelling and plot stitching may be performed on a remote server (and/or locally in the background), which may refine the model and the determination of the forest parameters.
The individual trees, their positions, as well as the boundaries for the test area and many important parameters may thus be provided to a user by simply filming a plot area from different angles, such as when moving a camera through the area. The manner taught herein only uses algorithms that are capable of being performed and executed on a handheld device, such as a smartphone, thanks to the clever assumptions and realizations made by the inventors.
As mentioned above, the teachings herein may also be used to provide parameters that are useful in the forestry industry. Some such parameters, and how they are determined, will be discussed below.
It should be noted that while these descriptions of how to determine these parameters are given with reference to the teachings herein as regards generating a 3D model, they may also be used with other manners of generating a 3D model.
One parameter that may be determined more easily using the teachings herein is the actual plot area. The plot area is used to estimate the total tree volume or volume per hectare in a forest stand (a forest stand being a limited forest area with a relatively homogenous tree population). When doing so, it is inefficient to measure every single tree and the corresponding ground area. To simplify this procedure, it has been agreed through the Nordic forest management standard to limit the forest inventory to measuring only a number of forest plot areas. Detailed measurements are performed on each plot, which are then used for calculating an average for the entire stand. Common tools for performing the measurements are relascopes, calipers, measurement tapes and height meters (clinometers).
When recording a forest plot, for example through photographing, videographing or laser scanning etc., the recorder captures a large number of trees, both trees that are located close to the recorder and trees located far away from the camera. A tree at a long distance will most often not be recorded with as high an accuracy as a tree close to the recorder, which means less information about the tree is available, resulting in a lower confidence in the tree attribute description. When generating a 3D model of a recorded forest plot, such as through a method described above, the UE or other system, such as a remote server, sorts out which trees are located too far away from the recorder to provide any meaningful data.
Figure 11 shows a flowchart of a general method according to an embodiment of the teachings herein. A 3D model of a recorded plot area is generated 1110. The UE then defines 1120 a border or "hull" that defines the actual plot area. Inside the plot, all included trees are well-defined and carry a high accuracy of forestry attributes or parameters. The border thus provides for a well-defined plot area.
With a well-defined area of the plot, it is possible to determine a number of forest parameters, such as the number of trees per hectare. The trivial solution is to define a circular or rectangular plot, but in order to maximize the area of the plot, a “convex hull” around the outer trees could be applied (see Figure 12). However, if the plot area is not homogeneous, i.e. there are certain unknown areas within the plot area, this needs to be taken into consideration when calculating e.g. the number of trees per hectare. Another problem is to define the area without running into boundary value problems (see figure 12).
As mentioned above, in one embodiment the UE is arranged to be carried by or be part of an Unmanned Aerial Vehicle or System (UAV or UAS), hereafter referred to as a drone. Figure 8 shows a view of a drone 100B carrying a camera 130 as well as other sensors 135. Examples of such sensors are accelerometers, GNSS devices, time of flight sensors, compasses and gyros, to name a few. The drone 100B comprises a controller for controlling the overall operation of the drone 100B, including flight control and camera control. The controller is connected to a memory for storing instructions and data related to the operation and filming of the drone.
The drone 100B may be remote controlled, or it may follow a set flight path, or execute a combination of receiving remote control instructions while executing a flight path.
The drone 100B comprises a wireless interface through which it receives control instructions, and possibly also transmits a video sequence as it is recorded, for possible storage at another location.
The drone 100B may also be arranged for autonomous operation, wherein the drone would receive an indication of an area to be surveyed and then by itself determine a flight pattern for covering that area. The area may be specified by coordinates, or by being demarcated by, for example, radio frequency, wireless communication, magnetic or light emitting posts or other markers. The drone may cover the area in a random pattern, in a regular pattern, or in a combined random and regular pattern. In one embodiment, the drone is arranged to fly at or close to ground level, as in under the level where the tree crowns (on an average) start.
The drone could possibly be arranged to, by itself, determine when an area has been covered sufficiently, for example by determining that the flight pattern, possibly in combination with the camera angle and camera movement, has covered enough of the area to be surveyed.
The drone 100B may be part of a drone system 105 as shown in figure 8, where a drone 100B is connected to a UE 100A, such as the UE 100 of figures 1A and 1B, for receiving remote control instructions and/or for providing video streams, thereby enabling the UE 100A to remotely control the drone 100B for recording a video sequence. In one embodiment, the drone 100B is configured to perform some of the processing, while the UE (and/or a server) performs the remainder of the processing.
The drone system may thus be used to execute a walk-through, or rather fly-through, of a survey area, whereby the drone is controlled to fly through an area to be surveyed while recording a video film or sequence of the area from different angles, as shown in figure 9 where a drone 100B is controlled by a UE 100A to fly through a survey area T.
A survey area is in one embodiment a forest area. The survey area may alternatively be an agricultural area, for example a vineyard or a fruit tree farm area.
In one embodiment, which takes full advantage of the benefits of the teachings herein and utilizes the fact that drones are becoming cheaper and cheaper, and that the teachings herein do not require a camera of higher quality (a normal everyday camera is sufficient), the drone is used to carry the camera 130 to record the video, which is then transmitted (possibly in real time) to the UE 100A, whereby the video recording is analyzed as per the above. This makes it possible for the operator, using only equipment that is already available (a smartphone) or at least does not require a huge investment (drones starting at around €200), to survey a forest and see the results simultaneously, which enables the operator to revisit or recircle areas without having to actually traverse the forest area himself, getting instant results in one session.
Figure 10 shows a schematic view of a computer-readable product 10 according to one embodiment of the teachings herein. The computer-readable product is configured to carry or store a computer program or computer program instructions 11 along with application related data. The computer-readable product 10 may be a data disc as in figure 5 or a Universal Serial Bus memory, a memory card or other commonly known computer readable products, these being examples of transitory mediums. The computer-readable product 10 may be inserted or plugged in or otherwise connected to a computer-readable product reader 12 configured to read the information, such as the program instructions 11, stored on the computer-readable product 10 and possibly execute the instructions, or to connect to a device configured to execute the instructions, such as a UE 100 as the one disclosed in figures 1A and 1B. The UE 100 may thus connect wirelessly or through a wired connection to a computer-readable product reader 12 (this being an example of a non-transitory medium) to receive the computer instructions 11. The UE 100 may in one embodiment comprise the computer-readable product reader 12 to receive the computer instructions 11.
In this manner a smartphone of standardized model may be upgraded to incorporate the teachings herein, by loading the computer program instructions into the controller (and/or memory) of the smartphone (or other UE) and causing the controller to execute the computer program instructions.
As mentioned above, the base area (or ground area) of the surveyed area may be determined. Figure 11 shows a flowchart for a general method - or partial method - for determining the density of trees (or other objects) in a test area, and figure 12 shows a schematic view of a test area. The test area comprises a plurality of (or at least one) detected tree(s) 1230 and possibly a number of undetected trees 1240, i.e. trees not having a high enough confidence to be detected.
In one embodiment the base area 1220 is determined 1110 - after having received and analyzed a video recording, possibly along with sensor data, as disclosed herein 1100 - by determining the convex hull 1210 of the detected trees 1230. The convex hull is determined by including all detected trees while only “turning” in the same direction. The concept of determining a convex hull or envelope is known to a skilled person and will not be discussed in further detail herein, a convex hull being defined generally as the smallest convex set in a Euclidean plane or Euclidean space that contains the set of detected trees.

In one embodiment the base area 1220 equals the convex hull. In one embodiment the base area 1220 equals the convex hull with an addition of an additional area. The additional area may be determined as the convex hull plus a margin. The size of the margin may be absolute or relative. The relative margin may be relative to the size of the area or the size of a detected tree or an average of detected trees 1230. The relative margin may also or alternatively be relative to the distance between two detected trees or an average of distances between detected trees 1230.
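As an illustration only, the base-area computation may be sketched as follows in Python; the hull routine (Andrew's monotone chain), the example coordinates and the 10% relative margin are assumptions made for this example rather than part of the teachings as such.

```python
# Minimal sketch: base area as the convex hull of detected stem positions,
# with an illustrative relative margin. Tree positions are assumed to be
# 2D ground-plane coordinates in metres.

def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # only "turning" in the same direction

def polygon_area(vertices):
    """Shoelace formula over ordered hull vertices."""
    n = len(vertices)
    return abs(sum(vertices[i][0] * vertices[(i + 1) % n][1]
                   - vertices[(i + 1) % n][0] * vertices[i][1]
                   for i in range(n))) / 2.0

detected_trees = [(0.0, 0.0), (8.0, 1.0), (9.0, 7.5), (2.0, 9.0), (4.5, 4.0)]
hull = convex_hull(detected_trees)
base_area = polygon_area(hull) * 1.10  # convex hull plus a 10% relative margin
print(f"base area: {base_area:.1f} m^2")
```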
The height of a tree may also be determined 1120. The height may be determined as the height of the detected tree stem. The height may also or alternatively be determined as the height of the detected tree stem plus a ratio of the height of a detected tree crown. The ratio of the tree crown that is to be added to the detected tree stem height depends on the type of tree.
In one embodiment, the height of a tree may be determined as follows.
Firstly, the three dimensional model is generated 1100 as per any method herein. As the stems of most tree types are free from leaves and as such are clean from an image handling perspective, it will be possible to detect many trees with a high confidence, i.e. most trees will be detected. In order to accomplish this, all points above a certain height are filtered out, thus allowing the manner herein to focus on the “clean” stems, providing for a faster and more accurate generation of the 3D model of the stems. The height above which points are filtered out basically corresponds to the level where the tree crowns start. This height may be set as a parameter, or it may be determined automatically by identifying the trees being surveyed (more on identifying tree types below). The height may thus vary from one part of the area to another, as the trees in the area vary. The height may be set as a number or by selecting or inputting a tree type corresponding to a typical height. Examples of heights are 4, 5, 6, 7 or 8 meters, just to give some examples. It would be apparent to a skilled person that this height may indeed vary greatly. Even though this filtering of points over a certain height is mentioned in relation to determining heights of trees, it should be clear that it relates to generating the three dimensional model and as such is relevant to all embodiments disclosed herein.
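A minimal sketch of this crown filtering step, assuming the point cloud is held as a numpy array whose z component is the height above the determined base plane (the array contents and the 5 m crown start height are invented for the example):

```python
import numpy as np

# Stand-in point cloud: (N, 3) model coordinates, z above the base plane.
points = np.random.rand(1000, 3) * np.array([20.0, 20.0, 12.0])

crown_start = 5.0  # assumed height where the tree crowns start, in metres
stem_points = points[points[:, 2] < crown_start]  # keep only the "clean" stems
```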
As the three dimensional model has been generated, the tree stems are extrapolated 1121 through and over the height where the tree crowns are assumed to start, i.e. through the ceiling made up by the tree crowns, through which details may be visibly obscured. The extrapolation is based on the assumption that stems are substantially straight.
To facilitate the extrapolation, and to enable capture of a highest top, the extrapolation may be supplemented by further video recording 1122, possibly in combination with further sensor readings, this time aimed at the tops of the trees, or at least their crowns. To enable a height to be calculated correctly, and for matching an upper portion of a tree (such as the top of the tree or where the tree ends, the highest visible point of the tree or simply a point in the crown area of the tree, to give a few examples) to a lower (detected) portion, the further video recording (and sensor reading) may be performed in a portrait mode providing a view encompassing as much as possible of the tree in one frame, possibly including some swiveling to capture the whole tree. In such an embodiment, the initial video recording may be made in a landscape mode, thereby capturing as many tree stems as possible in each frame. Alternatively or additionally, the initial video recording may be done so that it includes some segments where the full tree lengths are visible. These segments then constitute at least part of the further video recording.
The further video recording is analyzed in much the same manner using the same techniques as the initial recording, but where the upper portion and/or tree crowns are paired 1123 with the already detected (lower) portions, facilitating the identification of where the stems are. The pairing may be achieved by comparing camera poses and positions within a frame of the initial recording and in a frame of the further recording.
The estimated heights are visually indicated 1124 and presented on the display by extending the graphical indications used to indicate the tree stems so that they mark the whole height of the tree(s), and the operator may be given an opportunity to correct the detected tree height.
As the height has been detected, the scaling determined previously is used, possibly in combination with the distances in the 3D model, to determine the actual height 1125 of the detected tree(s), as represented by their respective cylinders.
The type of tree may be determined through image recognition, or it may be input by an operator. It may also or alternatively be determined by downloading forestry data relevant for the geographical location of the test area or from previous forestry surveys. The detected tree crown may be shown to the operator using a graphical indicator being a round or oval graphical object.
In one embodiment, such as where detected trees are estimated by diverging/converging parallel lines or other such approximation, the tapering and/or even curvature of a detected tree may be determined 1130 using the teachings herein. As the tree stem is detected, the width of the tree stem is also implicitly detected or determined as being the distance between the lines (grouping of points) indicating the tree stem.
As the tapering is determined and the height is known, the volume of the detected tree(s) may also be determined 1140.
Following this, the volume of trees or usable timber per acre (or other measurement) may simply be determined 1150 as the sum of the volumes of the detected trees divided by the determined base area.
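Purely to illustrate the arithmetic, the sketch below treats each detected stem as a conic frustum between a butt and a top diameter obtained from the detected tapering; the diameters, heights and base area are invented example values.

```python
import math

def frustum_volume(d_butt, d_top, height):
    """Stem volume (m^3) from end diameters (m) and height (m), as a frustum."""
    r1, r2 = d_butt / 2.0, d_top / 2.0
    return math.pi * height * (r1 * r1 + r1 * r2 + r2 * r2) / 3.0

# (butt diameter, top diameter, height) per detected tree -- example values
trees = [(0.32, 0.18, 22.0), (0.27, 0.15, 19.5), (0.40, 0.22, 25.0)]
total_volume = sum(frustum_volume(*t) for t in trees)

base_area_ha = 0.05  # determined base area in hectares (example value)
print(f"{total_volume / base_area_ha:.1f} m^3 per hectare")
```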
If the tapering and also the curvature and general shape of the detected trees are determined, the quality of the tree - as related to the forestry industry - may also or alternatively be determined 1160. The exact manner of determining the quality of a tree varies from operator to operator and different operators may choose different manners, and as there are many alternatives available the exact manner of doing so will not be discussed in detail herein, other than noting that the quality reflects how much timber may be retrieved from the tree. For example, a circular tree stem has a higher quality than an irregular or oval one; a bent or curved tree stem has a lower quality than a straight tree stem.
It should be noted that this is only one manner of determining the density of trees, and many others exist. For example, there are known algorithms and even tables for performing such determinations. Such algorithms and tables may be based on information such as the type of trees, and such information may be determined by the UE or it may be received from a remote location. A UE according to the teachings herein may thus also be configured to determine the density based on such algorithms and/or tables.
As the teachings herein only require very little in the way of computing resources, more complicated factors and parameters, such as curvature, may also be determined within a realistic time frame, thus enabling important parameters such as the quality of the timber to be determined within a realistic time frame and using only cheap and readily available equipment.
The inventors have also realized that the exact position of a detected tree may be determined. Figure 13 shows a flowchart for a general method - or partial method - for determining the position of trees (or other objects) in a test area and figure 14 shows a schematic view of a test area. The test area comprises a plurality of (or at least one) detected tree(s) T.
In one embodiment the position of the UE 100 is determined 1310 - after having received and analyzed a video recording as disclosed herein 1300 - by receiving and processing location signals. The location signals may be received through a GNSS system, whereby the position is determined as the geographical coordinates provided by the GNSS system. Alternatively or additionally, the location signals may be received through wireless communication with base stations, whereby the position is determined utilizing triangulation from base stations.
During the analysis, a relative position of the camera and how it changes in time has been determined. Using the determined position of the camera (UE 100) at one time and relating this to the relative position of the camera at the same time, the exact movement and position of the camera (UE 100) may be determined 1320. As a scale and the detected trees' relative location(s) have been determined previously, the distance and direction from the camera to a tree may be determined 1330, based upon which the exact location of the tree is determined 1340. It could be noted that in one embodiment, the whole movement of the camera along with several GNSS determined positions is used to provide an accurate determination of the absolute position of the camera, based on a calculated average of positions.

The teachings herein thus enable an operator to detect trees and actually determine their exact individual locations rather accurately, using only a simple GPS receiver (or other GNSS device) such as is commonly found in most contemporary smartphones and tablets.
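The sketch below illustrates, under simplifying assumptions, how steps 1320-1340 could combine an absolute camera fix with the scaled model-space offset to a tree; the flat 2D local frame, the single heading angle aligning the model frame to the world frame, and all names are assumptions made for the example.

```python
import numpy as np

def tree_position(cam_world_xy, cam_model_xy, tree_model_xy, scale, heading_rad):
    """Absolute 2D tree position from one camera fix.

    cam_world_xy  -- camera position in a local metric world frame (from GNSS)
    cam_model_xy  -- camera position in the (unitless) model frame
    tree_model_xy -- tree position in the model frame
    scale         -- metres per model unit, determined as described above
    heading_rad   -- rotation aligning the model frame to the world frame
    """
    offset = (np.asarray(tree_model_xy) - np.asarray(cam_model_xy)) * scale
    c, s = np.cos(heading_rad), np.sin(heading_rad)
    world_offset = np.array([c * offset[0] - s * offset[1],
                             s * offset[0] + c * offset[1]])
    return np.asarray(cam_world_xy) + world_offset
```

In line with the averaging mentioned above, the same computation may be repeated for several GNSS fixes along the camera path and the results averaged.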
As has been mentioned in the above, the manner taught herein may also be used to determine the amount of timber in a log pile. Determining the amount of timber in a log pile is especially troublesome when the logs in the pile are of different lengths. The inventors propose to record a video sequence capturing a view surrounding the pile in so far as that the depth, the height and the width of the pile get recorded from different angles. This may be achieved by a person simply walking the camera around the pile, possibly bringing or sweeping the camera back and forth. This also ensures that logs of different lengths are recorded and later detected accurately, provided that they are visible for optical detection.
Figure 16 shows how a UE 100 holding a camera is brought along a path P so that its angle of view AV covers a pile of logs L. Figure 15 shows a flowchart for a method of determining the volume of timber in a log pile. As discussed above with reference to figure 3, the UE receives a video sequence 310, determines camera poses 320 and performs image matching 330 to generate a 3D model 340. The individual logs may then be identified by detecting 1550 the cross sections or cutting areas at one end of a log L and then detecting or extrapolating to the corresponding 1555 cross section at the other end of the log. The individual log may thus be determined 1560. As the cross sections and the length of each log are thereby known, the volume of timber in the log pile may also be determined 1570. As can be seen, the logs L in the pile may be of different lengths and also of different or even varying thickness (width). By utilizing the teachings herein and video recording also the backside of the pile, a manner is provided for determining the individual lengths of the logs and their individual (approximate) variance in width, which provides for a more accurate estimation of the amount of timber in the pile.
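As a sketch of the volume step 1570 only, each log may for example be approximated as a conic frustum between its two detected cross sections; the cross-section areas and lengths below are invented example values.

```python
import math

def log_volume(area_end1, area_end2, length):
    """Frustum volume (m^3) from the two end cross-section areas (m^2)."""
    return length * (area_end1 + area_end2 + math.sqrt(area_end1 * area_end2)) / 3.0

# (end area 1, end area 2, length) per detected log -- example values
logs = [(0.070, 0.052, 4.9), (0.055, 0.041, 3.1), (0.080, 0.060, 4.9)]
pile_volume = sum(log_volume(*log) for log in logs)
print(f"pile volume: {pile_volume:.2f} m^3 of solid wood")
```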
As has been mentioned in the above, the manner taught herein may also be used to determine the amount of chippings in a pile of chippings. The inventors propose to record a video sequence capturing a view surrounding the pile in so far as that the depth, the height and the width of the pile get recorded from different angles. This may be achieved by a person simply walking the camera around the pile, possibly bringing or sweeping the camera back and forth.
Figure 18 shows how a UE 100 holding a camera is brought along a path P so that its angle of view AV covers a pile of chippings. Figure 17 shows a flowchart for a method of determining the volume of chippings in a pile. As discussed above with reference to figure 3, the UE receives a video sequence 310, determines camera poses 320 and performs image matching 330 to generate a 3D model 340.

The volume of the pile may then be determined by integrating along the height of the pile P. This integration may be approximated by determining 540 a base plane and generating 550 a height map. A grid is overlaid 1760 on the height map and, by knowing the area of each sector of the grid, the volume may be determined 1770 by multiplying the sector area by the height of each sector in the grid.
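A minimal sketch of the grid integration 1760/1770, assuming the height map is available as a numpy array of per-cell heights above the base plane (grid resolution and data are invented for the example):

```python
import numpy as np

cell_size = 0.25                     # grid resolution in metres (example value)
height_map = np.random.rand(80, 80)  # stand-in for the real height map (m)

# Volume = sum over grid sectors of (sector area x sector height).
volume = float(height_map.sum()) * cell_size * cell_size
print(f"pile volume: {volume:.1f} m^3")
```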
The inventors have furthermore identified one more problem that may be solved by the teachings herein. The problem relates to surveying large forest areas. The proposed solution may also be applied to other forms of geo-surveying and its use is not restricted to forestry related surveying.
Traditionally, when surveying large areas, the large area will be surveyed in parts or partial areas, which partial areas are then stitched together to form the large area. This technique is also used to stitch together old surveys with new surveys. To identify the different partial areas, markers are used. The markers, being uniquely marked with a shape or an identifier, will identify the positions of the partial areas as well as how the areas should be aligned to one another if more than one marker is used, simply by aligning the markers in the different surveys, or rather the results of the surveys.

Using markers not only requires manufacturing, transporting and installing/mounting the markers and making sure that the markers are visible or otherwise identifiable at the time of a (partial) survey, but also requires planning in where to locate them, how many should be used and what kind should be used. As there are many different surveying techniques available, there are also many different marker standards available.
The logistics and planning involved become a problem especially in remote areas and areas that have not previously been surveyed, and this may require a lot of manpower and also take a long time, as some places where a marker should be put may be very difficult to reach. The inventors have realized that these problems can be solved or at least mitigated by the fact that trees, and especially groups of trees, are unique in their appearance and also in their individual placement. This is especially true for unplanned forests, where trees and such grow in irregular patterns.

Thus, by using the teachings herein, which provide a manner for identifying and marking the relative position of individual trees or other objects in a 3D model of an area, a first (partial) area (as represented by its 3D model) may be matched to a second (partial) area (as represented by its 3D model) as relates to relative position and alignment, by finding a set of trees, comprising at least one tree, in said first (partial) area and matching this set to a set of objects in said second (partial) area; this is what the inventors are therefore proposing.
For a set comprising more than one tree, this manner only requires one set to be matched, as the arrangement of trees (or other objects) within the set will also provide for how the first and second areas are aligned with relation to one another.
Figure 19 shows a schematic and exemplary view of a first (partial) area A1 and a second (partial) area A2 that are to be stitched together to form a composite area A. It should be noted that the manner may be used for stitching together areas from different surveys as well as partial areas from the same survey. In the following, the areas will simply be referred to as areas, which includes both partial areas of one survey as well as areas from different surveys. Figure 20 shows a flowchart for a method according to the teachings herein, where a first area A1 and a second area A2 are to be matched and stitched together. The areas A1 and A2 have previously been analyzed to find the individual relative positions of the objects, in this example being trees T. The individual relative positions are thus the positions of the objects with regards to one another. In one embodiment, the size of at least one object is also provided by the analysis. The method thus begins with receiving 2000 a first area A1 (or rather the 3D model representing the area, or the video recording along with any sensor data to be analyzed to generate the 3D model) and a second area A2 (or rather the 3D model representing the area, or the video recording along with any sensor data to be analyzed to generate the 3D model), where the individual positions of the objects in the areas are given. As the areas have been received, a set of objects S1 is to be found 2010 in the first area A1. The set may be chosen at random as simply being a set of at least one tree. However, in one embodiment the set is selected to be a set of objects at an edge of the area. This makes a match to another area more likely, as the two areas most likely overlap along an edge.
In one embodiment the set is selected as a set of objects having an irregular pattern, wherein irregular herein means that it is not similar to the surrounding patterns of objects.
As a first set S1 has been found, a second set S2 is found 2020 in the second area A2.
The sets are found to be niatching by con1paring characteristics of the objects. Thecharacteristics may be the size of the obj ect(s), the individual position(s), the actual positionof an object, tree species (kind of trees), branch structure, shape, profile, vertical boletransaction, barch texture or pattern, tree height and/or other characteristics such as discussed above.
In one embodiment, a time factor is also applied to accommodate for continued growth. For example, a tree having a specific height or width at one year will have an expected height or width at a subsequent year, which may be determined using normal growth rates. As such, profiles, heights and/or widths may be adapted accordingly during the comparison to find matches.
In one embodiment, the comparison is done without relation to the scale, where only relative characteristics, such as relative sizes and/or distances between objects, are taken into account, or alternatively or additionally, only the actual position of an object is taken into account. This provides for a manner of rescaling one or both of the two areas to provide a scaled stitching into a composite area.
The scaling may be done by scaling the area(s) to be scaled so that the corresponding set corresponds to the other set. Or, when a set is found by finding matching set(s) in the two areas, the scaling is done so that such a set may be found.
The scaling may also be done according to scaling information received along with the area information. Such scaling information may also be determined by comparing the absolute positions of two objects in an area.
As the two sets S1 and S2 have been found in the two areas A1, A2, the relative positions of the two areas may be determined 2030 by overlaying the two sets S1 and S2, and the two areas may be stitched 2040 together to form a composite area A.

In one embodiment, a rotation of one or both of the areas is also performed before stitching in order to overlay the two sets, see figure 11. The rotation may be done by rotating the area(s) to be rotated so that the corresponding set corresponds to the other set. Or, when a set is found by finding matching set(s) in the two areas, the rotation is done so that such a set may be found.
The rotation may also be done according to compass or other directional information received along with the area information. Such directional information may also be determined by comparing the absolute positions of two objects in an area.
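One way to realize the combined scaling, rotation and translation from two matched sets is the classic closed-form similarity (Umeyama) solution; the sketch below is one assumed implementation, not the only possible manner.

```python
import numpy as np

def similarity_from_matches(S1, S2):
    """Return (s, R, t) so that s * R @ p + t maps S1 points onto S2.

    S1, S2 -- (N, 2) arrays of matched tree positions, N >= 2, same order.
    """
    mu1, mu2 = S1.mean(axis=0), S2.mean(axis=0)
    X, Y = S1 - mu1, S2 - mu2
    U, D, Vt = np.linalg.svd(Y.T @ X / len(S1))  # cross-covariance SVD
    C = np.eye(2)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        C[1, 1] = -1.0                           # avoid a reflection
    R = U @ C @ Vt
    s = np.trace(np.diag(D) @ C) * len(S1) / (X ** 2).sum()
    t = mu2 - s * R @ mu1
    return s, R, t
```

Applying the returned transform to every object position of the first area overlays the two sets 2030, after which the areas may be stitched 2040.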
In this context an area may be taken to correspond to a plurality or set of objects. As the objects have a position, they also cover an area, whereby the two terms effectively may be regarded as interchangeable.
As previously stated, this may be used to stitch partial areas for forest surveys, but it may also be used for other surveys. It may be used to stitch areas belonging to the same overall survey, or to stitch newer surveys to older surveys.
By stitching together more than one area, the scaling may be determined more accurately. In part because it will be based on more data, and in part because one of the areas may have a more easily determined (such as explicitly indicated) scale.
In several of the described embodiments, Deep Learning may be utilized to provide improved results. Deep Learning provides a toolset for a multitude of image classification related problems, wherein a set of inputs is converted into a set of outputs using a trained image processing network. The outputs can be numbers, image masks, or other data. An example of such a network is U-Net, but many alternatives exist.
Training a deep learning network consists of creating a set of desired outputs for a set of inputs, and then, for each input and output, determining the output error and adjusting the network accordingly. For example, the Tensorflow software package provides implementations and literature for such solutions. These normally run on a server, and the trained model is then installed for processing on target devices.
In one embodiment, we train a deep learning network to classify each pixel into, for example, background, tree boundary or tree interior, and output this as an image mask. By combining this with our detected trees and camera positions, we can refine our cylinders to better determine tree diameters and 3D positions, and filter out non-trees that have been erroneously detected as trees.
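As an illustration only, such a mask could be used to filter false detections as sketched below; the footprint input (the cylinder's projected pixels in the image) and the class id are hypothetical names introduced for the example.

```python
import numpy as np

TREE_INTERIOR = 2  # example class id in the segmentation mask

def keep_detection(mask, cylinder_footprint, min_overlap=0.6):
    """Keep a cylinder only if enough of its image projection is tree interior.

    mask               -- (H, W) integer class mask from the network
    cylinder_footprint -- (H, W) boolean array, the cylinder's projected pixels
    """
    covered = mask[cylinder_footprint] == TREE_INTERIOR
    return covered.size > 0 and covered.mean() >= min_overlap
```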
In another embodiment, we use the previously described image mask with a space carving algorithm to determine all volumes in our video recording which are decisively not part of a tree. The remaining volumes will thus be further refined 3D shapes representing trees at a higher detail level than cylinders. By analyzing the difference between a straight cylinder approximating a tree and the generated volume, we may determine the quality of said tree in the form of curvature, shape, and twists.
In one embodiment, the inventors propose to train a deep learning network to identify and label individual trees, such that an image mask will produce a new number for each unique tree in the output image mask. This would allow for further refinement of a cylinder tree diameter in a single image.
In one embodiment, we train a deep learning network to identify tree types, outputting an image mask where each pixel is a unique number specifying the tree type. By combining this with our detected trees and camera positions, we note the type of tree contained in our tree cylinder to determine the cylinder tree type.
In one embodiment, 3D point clouds and various image masks representing trees are generated. As seen above, it is then possible to identify each point belonging to a unique tree in an image without identifying clusters, and to ensure only those pixels are taken into account when a tree cylinder is generated. Simultaneously, the included points would then also provide additional metadata pertaining to such a tree when the tree is determined.
In one embodiment, a deep learning network is trained to identify sky and non-sky pixels. The height of a tree can then be determined as the point on an infinite cylinder where it touches the height map, up until the point in any image with a known camera position where a slice in the cylinder as projected in said image only contains sky pixels.

As is shown, various metadata for a target object such as a tree can be extracted from an image mask in the same manner, given that an annotated output can be provided for an input, and therefore any such data can be decisively connected using the methodology we have shown in this invention. It is therefore not meaningful for us to describe each such metadata type in this invention; instead we make note of its inclusion as such.
In one embodiment, a deep learning network is trained to detect tree cross sections in an image, such that each cross section is provided a unique ID in the image. This directly enables counting cut trees in a pile. By matching two such images together from each side of a pile, the tapering of each tree in such a pair can be determined. This would then provide a high-quality volume detection for a tree pile together with our SLAM based scale correct position tracking as previously described.
In one embodiment, a deep learning network may be trained to detect a single optimal circle in a greyscale image. This may be used as an alternative embodiment to determine trees from density images with clusters of points projected to a plane, as described elsewhere.
The inventors also propose a manner for determining when a plot has enough data to be sufficiently and accurately modeled. In one embodiment, a maximum recording time is set. In one embodiment, the UE is configured to determine when changes in output parameters for the plot are below a threshold. In one embodiment, the UE is configured to determine when the scale in our SLAM system, as detected e.g. through sensor data, is stable enough over time.
In one embodiment, multiple such parameters may be combined in a single threshold detector, such that all of them need to be fulfilled before a plot is determined to be complete.
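A minimal sketch of such a combined detector, with invented threshold values and input names:

```python
def plot_complete(elapsed_s, param_delta, scale_drift,
                  min_time=60.0, max_delta=0.01, max_drift=0.02):
    """True only when every completeness criterion is fulfilled.

    elapsed_s   -- recording time so far (s)
    param_delta -- relative change of the output parameters since last check
    scale_drift -- relative instability of the SLAM scale over a time window
    """
    return (elapsed_s >= min_time and
            param_delta < max_delta and
            scale_drift < max_drift)
```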
The inventors also propose a manner for performing Route Planning.
Route planning is the task of determining where in our target environment further surveys need to be made, given a set of constraints. Such constraints may be a set of GNSS coordinates describing the boundary of the forest, a set of existing surveys and their parameters, and optionally external data such as existing map data describing various stands or biomes in a single forest. Additionally, constraints may be described as a target plot density, an error magnitude estimate that must be reduced below a threshold, or a set number of surveys that need to be conducted for each stand, or for the forest as a whole. Another optional input may be a time limit.
Based on the given inputs, one embodiment would simply find each point in the forest at the maximum distance from all other survey points, then repeat this recursively and brute-force optimize a path between the generated points from the current user coordinates. This may be done for the whole forest, or for each stand, or based on another, similar criterion.
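A brute-force sketch of this strategy, with illustrative names and a greedy nearest-neighbour path standing in for a full path optimization:

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def plan_route(candidates, surveyed, start, n_new):
    """Pick n_new points, each farthest from all survey points so far,
    then order them greedily from the user's current coordinates.
    Assumes at least one existing survey point."""
    surveyed = list(surveyed)
    picked = []
    for _ in range(n_new):
        best = max(candidates, key=lambda p: min(dist(p, s) for s in surveyed))
        picked.append(best)
        surveyed.append(best)  # later picks keep their distance from it
    route, here = [], start
    while picked:              # greedy nearest-neighbour ordering
        nxt = min(picked, key=lambda p: dist(here, p))
        route.append(nxt)
        picked.remove(nxt)
        here = nxt
    return route
```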
In another embodiment, a set of map layers may be generated over one or more parameters provided for the stand or forest, and suitable target survey points may be generated through a set of functions applied to said map layers. For example, one such map layer may show the basal area of a forest, smoothed out to fill the layer based on known data, and the derivative of said map layer may have a set of local maxima, indicating where - between two known surveys - there is a large difference in basal area, and thus where more survey data would be useful. Similarly, the distance measurement described previously could also be provided as such a map layer, and the two multiplied together may provide an even better estimate.
The inventors also propose to join individual survey plots into stands, where a set of desirable parameters in the plots are similar, or otherwise correlated. In one embodiment, such a division may be determined by connecting survey areas with the same major tree type.
Using our survey plot data, we may optimize over desirable properties to produce stands in a number of ways according to operator expectations. One logical plot division would be the previously mentioned major tree type stands. In our solution, we would simply find all surveys fulfilling a property, and join them in sets such that they are connected. Boundaries between stands are determined as a midpoint between surveys belonging to each corresponding stand. Stands that correlate but cannot be connected in a 2D layer are separate stands.
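A sketch of the connected major-tree-type join, assuming a neighbour relation between plots is available (the union-find bookkeeping and all names are illustrative):

```python
def build_stands(plots, adjacent):
    """Merge adjacent plots with the same major tree type into stands.

    plots    -- {plot_id: major_tree_type}
    adjacent -- iterable of (plot_id, plot_id) neighbour pairs
    """
    parent = {p: p for p in plots}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in adjacent:
        if plots[a] == plots[b]:           # same major tree type
            parent[find(a)] = find(b)      # connect into one stand

    stands = {}
    for p in plots:
        stands.setdefault(find(p), []).append(p)
    return list(stands.values())
```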
In another example, a stand may be divided based on both major tree type and basal area ranges.
In another embodiment, we may generate stands over all or multiple parameters automatically, such that a set of survey plots is chosen to represent a set of stands, and then all connected plots that fulfill a similarity criterion to such a plot, based on a threshold or operator criteria, are joined into a stand with that survey plot.
In another embodiment, an operator may provide a set of existing stands, and let the software update those stands based on survey plot parameters within each stand.
In one embodiment, an operator may provide two or more nearby plots, and the software determines similar properties in these plots, then suggests other plots that may be related to these plots. If no similarities are found, the software provides a warning.
In one embodiment, random pairs of nearby plots are selected, and similarities between each pair are determined similarly to the above. If the plots are dissimilar, the pair is discarded. This may be done recursively if each pair is then treated as a single plot, or a small stand. The process can then be repeated until a target number of stands has been reached.
Figure 22 shows a schematic view of a combination of generating stands and planning a route.
A UE is travelling along a path moving through points P1 to P8. As the analysis is run, the area covering points P1-P4 is noted to be of a similar type, i.e. a homogeneous area, and the area is treated as one stand. However, as the area around point P5 is reached, the analysis detects that some parameters change, and that the area by point P5 is different from the previously surveyed area. To ensure a more accurate measurement, the general area of and surrounding point P5 is therefore surveyed more thoroughly by adding survey areas, as represented in figure 22 by a closer arrangement of points P5-P8 compared to P1-P4.
The present invention is further described according to the following items:

1. A forestry surveying apparatus (100) comprising a controller (CPU), the controller (CPU) being configured to: receive an image stream representing a video sequence; determine a camera pose for a second image in the image stream relative to a first image in the image stream; match the first image with the second image, based on the camera pose; and generate a three dimensional model based on the image match; wherein the video sequence and the three dimensional model represent forestry related objects.

2. The forestry surveying apparatus (100) according to item 1, wherein the controller is further configured to: receive sensor data and determine the camera pose and/or generate the three dimensional model based on the received sensor data, wherein the sensor data relates to one or more of positional information, such as GNSS data, and motion data, such as accelerometer or gyro data.

3. The forestry surveying apparatus (100) according to item 1 or 2, wherein the controller is configured to determine the camera pose by utilizing SLAM.

4. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to: determine a movement between a first and a second camera pose; receive accelerometer data; and determine a scale by comparing the accelerometer data to the determined movement between the first and the second camera pose.
5. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to: detect at least one tree stem in said three dimensional model.

6. The forestry surveying apparatus (100) according to item 5, wherein the controller is further configured to: receive positional information; and position said detected tree stems utilizing said positional information, wherein the controller is further configured to receive said positional information along with said video sequence.

7. The forestry surveying apparatus (100) according to item 5 or 6, wherein the controller is further configured to detect said tree stem by generating said three dimensional model by generating a point cloud; determining a base plane in said point cloud; generating a height map in said point cloud; and filtering points in the height map in said point cloud; and thereby detecting clusters of points in the remaining points in said point cloud as being tree stems.

8. The forestry surveying apparatus (100) according to item 7, wherein the controller is further configured to generate said point cloud by determining said camera pose and then determining how much a pixel has moved relative to a movement of the camera pose.

9. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to determine a sample plot by determining a hull around the detected trees.

10. The forestry surveying apparatus (100) according to any preceding item, wherein the controller is further configured to determine the height of a tree by receiving a further video recording, wherein said further video recording may be included in the video recording; detecting an upper portion of a tree; pairing the upper portion to a detected tree; extrapolating the height of the tree to include the upper portion; scaling the detected tree; and determining the height of the tree including the upper portion.

11. The forestry surveying apparatus (100) according to items 9 and 10, wherein the controller is further configured to determine a density of trees based on the determined height of a tree.

12. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to determine a volume of timber in a log pile by, in the three dimensional model, identifying a cross section and a corresponding cross section of at least one log, and based on this determine the volume of the at least one log, wherein the video recording represents a pile of logs.

13. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to determine a volume of chippings in a pile by, in the three dimensional model, determining a base plane and an up direction; generating a height map; and integrating the area over the height map to determine the volume; wherein the video recording represents a pile of chippings.

14. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to determine a volume of timber in a surveyed area by: determining a base area; determining a height of detected trees; determining a width of detected trees; and based on this determining the volume of timber by dividing the sum of the volumes of the individual detected trees by the base area.
15. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to determine a quality of a detected tree by determining a shape of the detected tree.

16. The forestry surveying apparatus (100) according to any previous item, wherein the controller is further configured to determine a location of an individual detected tree (T) by: determining the location of the camera (130); determining the distance from the camera (130) to the tree; and based on this determining the location of the tree (T).

17. The forestry surveying apparatus (100) according to any previous item, wherein the device is further configured to find a first set (S1) in a first plurality of objects (A1), find a matching second set (S2) in a second plurality of objects (A2) and to stitch together the first plurality of objects (A1) with the second plurality of objects (A2) by overlaying the first set (S1) and the second set (S2).

18. The forestry surveying apparatus (100) according to any previous item, wherein the forestry surveying apparatus (100) is handheld.

19. The forestry surveying apparatus (100) according to item 18, wherein the forestry surveying apparatus (100) is a User Equipment such as a smartphone or a computer tablet.
20. The forestry surveying apparatus (100) according to any previous item, wherein the forestry surveying apparatus (100) is a forestry surveying system (105) comprising a User Equipment (100A) and an unmanned aerial vehicle (100B).

21. A method for forestry surveying, the method comprising: receiving an image stream representing a video sequence; determining a camera pose for a second image in the image stream relative to a first image in the image stream; matching the first image with the second image, based on the camera pose; and generating a three dimensional model based on the image match; wherein the video sequence and the three dimensional model represent forestry related objects.

22. A computer-readable medium comprising computer program instructions that, when loaded into a controller, cause the method according to item 21 to be executed.

23. A surveying apparatus (100) comprising a controller (CPU), the controller (CPU) being configured to: receive an image stream representing a video sequence; determine a camera pose for a second image in the image stream relative to a first image in the image stream; match the first image with the second image, based on the camera pose; and generate a three dimensional model based on the image match.

24. A method for surveying, the method comprising: receiving an image stream representing a video sequence; determining a camera pose for a second image in the image stream relative to a first image in the image stream; matching the first image with the second image, based on the camera pose; and generating a three dimensional model based on the image match.
25. A computer-readable medium comprising computer program instructions that, when loaded into a controller, cause the method according to item 24 to be executed.

Claims (22)

1. A forestry surveying system (105) comprising an unmanned aerial vehicle (100B) and a controller (CPU), the controller (CPU) being configured to: receive an image stream representing a video sequence; determine a camera pose for a second image in the image stream relative to a first image in the image stream; match the first image with the second image, based on the camera pose; and generate a three dimensional model based on the image match; wherein the video sequence and the three dimensional model represent forestry related objects, wherein the controller is further configured to: receive sensor data and determine the camera pose and/or generate the three dimensional model based on the received sensor data, wherein the sensor data relates to one or more of positional information, such as GNSS data, and motion data, such as accelerometer or gyro data, and wherein the controller is configured to determine the camera pose by utilizing SLAM.
2. The forestry surveying system (105) according to claim 1, wherein the unmanned aerial vehicle (100B) comprises a camera arranged to record said image stream.
3. The forestry surveying system (105) according to claim 1 or 2, wherein the forestry surveying system (105) further comprises a User Equipment (100A) comprising said controller (CPU).
4. The forestry surveying system (105) according to claim 1, 2 or 3, wherein the forestry surveying system (105) further comprises a server comprising said controller (CPU).
5. The forestry surveying system (105) according to claims 2 and 3, wherein the server is configured to receive said image stream from said User Equipment (100A), which in turn is configured to receive said image stream from said unmanned aerial vehicle (100B).
6. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to: determine a movement between a first and a second camera pose; receive accelerometer data; and determine a scale by comparing the accelerometer data to the determined movement between the first and the second camera pose.
7. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to: detect at least one tree stem in said three dimensional model.
8. The forestry surveying system (105) according to claim 7, wherein the controller is further configured to: receive positional information; and position said detected tree stems utilizing said positional information, wherein the controller is further configured to receive said positional information along with said video sequence.
9. The forestry surveying system (105) according to claim 7 or 8, wherein the controller is further configured to detect said tree stem by generating said three dimensional model by generating a point cloud; determining a base plane in said point cloud; generating a height map in said point cloud; and filtering points in the height map in said point cloud; and thereby detecting clusters of points in the remaining points in said point cloud as being tree stems.
10. The forestry surveying system (105) according to claim 9, wherein the controller is further configured to generate said point cloud by determining said camera pose and then determining how much a pixel has moved relative to a movement of the camera pose.
11. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to determine a sample plot by determining a hull around the detected trees.
12. The forestry surveying system (105) according to any preceding claim, wherein the controller is further configured to determine the height of a tree by receiving a further video recording, wherein said further video recording may be included in the video recording; detecting an upper portion of a tree; pairing the upper portion to a detected tree; extrapolating the height of the tree to include the upper portion; scaling the detected tree; and determining the height of the tree including the upper portion.
13. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to determine a volume of timber in a log pile by, in the three dimensional model, identifying a cross section and a corresponding cross section of at least one log, and based on this determine the volume of the at least one log, wherein the video recording represents a pile of logs.
14. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to determine a volume of chippings in a pile by, in the three dimensional model, determining a base plane and an up direction; generating a height map; and integrating the area over the height map to determine the volume; wherein the video recording represents a pile of chippings.
15. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to determine a volume of timber in a surveyed area by: determining a base area; determining a height of detected trees; determining a width of detected trees; and based on this determining the volume of timber by dividing the sum of the volumes of the individual detected trees by the base area.
16. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to determine a quality of a detected tree by determining a shape of the detected tree.
17. The forestry surveying system (105) according to any previous claim, wherein the controller is further configured to determine a location of an individual detected tree (T) by: determining the location of the camera (130); determining the distance from the camera (130) to the tree; and based on this determining the location of the tree (T).
18. The forestry surveying system (105) according to any previous claim, wherein the device is further configured to find a first set (S1) in a first plurality of objects (A1), find a matching second set (S2) in a second plurality of objects (A2) and to stitch together the first plurality of objects (A1) with the second plurality of objects (A2) by overlaying the first set (S1) and the second set (S2).
19. The forestry surveying system (105) according to any of claims 3 to 18, wherein the User Equipment (100A) is handheld.
20. The forestry surveying system (105) according to claim 19, wherein the User Equipment is a smartphone or a computer tablet.
21. A method for forestry surveying, the method comprising: receiving an image stream representing a video sequence; determining a camera pose for a second image in the image stream relative to a first image in the image stream; matching the first image with the second image, based on the camera pose; and generating a three dimensional model based on the image match; wherein the video sequence and the three dimensional model represent forestry related objects, wherein the method further includes: receiving sensor data and determining the camera pose and/or generating the three dimensional model based on the received sensor data, wherein the sensor data relates to one or more of positional information, such as GNSS data, and motion data, such as accelerometer or gyro data, and wherein the camera pose is determined by utilizing SLAM.
22. A computer-readable medium comprising computer program instructions that, when loaded into a controller, cause the method according to claim 21 to be executed.