Andrew Jones


KYFromAbove: Downloading and Processing LIDAR data


LIDAR data is incredibly important because it allows us to create detailed and accurate maps of the Earth's surface and objects on it. By using laser beams to measure distances, LIDAR can create 3D models of forests, cities, and even the ocean floor with high precision. This data is crucial for urban planning, managing natural resources, studying climate change, and understanding geological processes. It helps scientists, engineers, and planners make informed decisions about infrastructure, conservation efforts, disaster response, and more. In essence, LIDAR data provides a valuable perspective on our world that helps us protect the environment, plan for the future, and improve our understanding of Earth's complex systems.


In this tutorial, LIDAR data from KYFromAbove will be downloaded, processed, and modeled. KYFromAbove is a government-sponsored geoportal "focused on building and maintaining a current basemap for the Commonwealth that can meet the needs of its users at the state, federal, local, and regional level" (KYFromAbove, 2024). The goal is to create hillshade data that demonstrates elevation change and lighting in one particular LIDAR tile. To provide additional clarity, the workflow for downloading and using LIDAR data will be the focus of this tutorial. If LIDAR data from outside Kentucky is to be used, the same general steps can be applied. Some possible sources of LIDAR data can be found here.


It should be noted that the "Spatial Analyst" extension for ArcGIS Pro is required to create a hillshade. If this extension is not enabled, the intermediary raster dataset can be exported to QGIS for hillshade creation.


Begin by navigating to KYFromAbove. Scroll down to "Download Point Cloud Data" and select "View" (Figure 1).


Point Cloud Data
Figure 1. Finding Point Cloud Data on KYFromAbove

An index grid map of Kentucky will be displayed, with each grid tile containing compressed LIDAR (LAZ) data for the corresponding area. The tiles are named according to their position, following the format NxxxExxx, where “x” is a digit. Zoom in on the grid map and click the desired tile to display its metadata and download link. The latest version of the data can be downloaded using the FTP link. Any tile can be used for this project, as all tiles are processed in the same way, although a tile with some buildings is recommended so that the results of the project are more impressive. Tile N169E188 has been selected for this tutorial (Figure 2).
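
When many tiles are downloaded, it can help to validate their names before processing. The snippet below is a minimal sketch of that check. The three-digit width of each index is an assumption based on the example tile N169E188, and the function name is hypothetical rather than part of any KYFromAbove tooling.

```python
import re

# Pattern for tile names of the form NxxxExxx, where each x is a digit.
# The fixed three-digit width is an assumption based on the tile N169E188.
TILE_PATTERN = re.compile(r"^N(\d{3})E(\d{3})$")


def parse_tile_id(tile_id):
    """Return (north_index, east_index) for a tile name, or raise ValueError."""
    match = TILE_PATTERN.match(tile_id.strip().upper())
    if match is None:
        raise ValueError(f"Not a valid NxxxExxx tile name: {tile_id!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_tile_id("N169E188"))  # (169, 188)
```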


LIDAR Web Map
Figure 2. Selecting a LIDAR Index Grid to Download on the ArcGIS Web Map

Next, the LAZ data must be decompressed. A LAZ file is a compressed format used to store LIDAR data, while a LAS file is the uncompressed version of a LAZ file and is compatible with GIS software. LAS files are typically compressed into LAZ files due to the large amount of space LAS files occupy.


To decompress the LAZ dataset, the "Convert LAS" tool in ArcGIS Pro should be used. It may be helpful to create a dedicated folder for the uncompressed LAS data first. Input the LAZ dataset, specify the target folder for the LAS output, set "Compression" to "No Compression", and leave LAS Options at their default settings (Figure 3). The tool should then be run.


Decompressing LAZ
Figure 3. Converting LAZ to LAS on ArcGIS Pro

The newly uncompressed LAS dataset from the target folder should be added to ArcGIS Pro. When zoomed out, it will appear as a red square overlaid on part of Bowling Green. Upon zooming in, however, it will be displayed as a dense set of multicolored points, with blue points representing lower elevation and red points representing higher elevations (Figure 4).


LAS from a Distance
LAS up close
Figure 4. LAS Data from a Distance and LAS Data up close

To use this LAS data with elevation and raster functions, it must be referenced by a LAS dataset (.lasd). Search for the "Create LAS Dataset" tool in the geoprocessing toolbox. The LAS data should be input, and the remaining options can be left at their default settings.


As a side note, if multiple LAS files are available, they can be input and combined into one large LAS dataset. Technically, a LAS dataset for all of downtown Bowling Green could be created in this way, although it would be time-consuming. The tool should then be run to create the LAS Dataset (Figure 5).


Creating LAS Dataset
Figure 5. Creating a LAS Dataset

Working with a LAS Dataset


Some time should be taken to review the LAS Dataset options (LAS Dataset Layer, Data, Classification), which appear at the top ribbon when the LAS Dataset is selected in the table of contents. The most noteworthy options are found under "LAS Dataset Layer," where the density of the LAS points can be adjusted, as well as the symbology and LAS Point parameters. The symbology settings can be modified to display different point, surface, and line options—observe how these settings reveal different types of information about the physical landscape.


"LAS Points" refers to the classification of the LIDAR data: classifications can include all elevations (such as building and treetop heights), ground elevations only, non-ground elevations, or first return points. Under "Data," numerous options for analysis are available, including the creation of information on concepts or objects such as power lines, buildings, statistics, area and volume, outliers, surface derivatives, and visibility.


To create a hillshade, the LIDAR data will need to be transformed into raster data. It should be ensured that the "LAS Points" setting is set to "All Points" so that both buildings and treetops are captured. The "LAS Dataset to Raster" tool should be searched for and selected in the geoprocessing toolbox. This tool has several important parameters that require thorough review, so time should be taken to examine them.


Creating a raster
Figure 6. Creating a Raster from the LAS Dataset

LIDAR to Raster: Some Information on Raster Datasets


As a quick break from the workflow, the next few steps will describe some parameters used in creating raster datasets, along with general information on the raster data format. When it comes time to fill in the parameters, a line-by-line approach to the geoprocessing tool will be taken, and a brief summary of each parameter's options will be provided.


A raster dataset is a grid of cells (pixels) arranged in rows and columns. The grid can be square or rectangular, but irregular shapes are not feasible because they cannot be easily stored in a row and column format. Each cell in the grid holds a value, such as an elevation, and these cell values are what quantify the raster.


The range of values that can be stored in a raster cell depends on its bit depth. An 8-bit raster can contain 256 distinct values, a 16-bit raster can contain 65,536 distinct values, and a 32-bit raster can contain 4,294,967,296 distinct values. With the LAS Dataset to Raster tool, a 32-bit signed raster has been created.


Raster data also exists in signed and unsigned formats. Signed rasters can contain negative values, whereas unsigned rasters contain only zero and positive values. For instance, an 8-bit unsigned raster's values range from 0 to 255, whereas an 8-bit signed raster's values range from -128 to 127.
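
The relationship between bit depth and value range can be worked out directly: an n-bit integer cell has 2^n distinct values, which run from 0 to 2^n - 1 when unsigned and from -2^(n-1) to 2^(n-1) - 1 when signed. The short sketch below illustrates this arithmetic; the function names are chosen only for illustration.

```python
def value_count(bits):
    """Number of distinct values an integer cell of this bit depth can hold."""
    return 2 ** bits


def value_range(bits, signed):
    """Smallest and largest value for a signed or unsigned integer cell."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for bits in (8, 16, 32):
    print(bits, value_count(bits), value_range(bits, False), value_range(bits, True))
```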


At this point, each parameter in Figure 6 can be examined and discussed. Raster extensions include TIFF, BMP, GIF, IMG, GRID, JPG, PNG, or BIL. For this tutorial, the "Output raster" (Figure 6) should be set to a .img raster file — simply add the ".img" file extension to the end of the "Output Raster" name. IMG is a proprietary file format owned by ERDAS, a company specializing in remote sensing data capture. IMG rasters are typically of high quality, but they require more storage space. Table 1 below provides a basic overview of the uses for different file formats. An exhaustive list can be found here.


Table 1. Image Files as Raster Datasets
Image File Extensions | Compression Type | Ideal Use
IMG, TIFF, GRID, BIL | Uncompressed | Negative values are supported by these image file formats when the raster is over 8 bits in size. This allows for broader applications of the raster data, though it comes at the cost of increased storage space usage.
JPG, GIF, PNG, BMP | Compressed | Compressed image file formats are more space efficient. This comes at the cost of reduced resolution quality.

When looking at "Interpolation Type," it should be noted that there are several methods by which a raster can be interpolated. The primary choices are "Binning" or "Triangulation" (Table 2).


Table 2. Binning vs Triangulation Interpolation
Interpolation Type | Description
Binning | The value of each output cell is determined from the LAS points that fall within that cell.
Triangulation | A surface is built using Delaunay triangulation: the points become the nodes of a network of triangular facets that covers the surface, and this network is then rasterized. Triangulation is best applied when the point density of the LAS dataset is low relative to the cell size, typically when the cell size is less than three to four times the average spacing between points.

The parameter should be left as default with "Binning." It should be noted that when switching between "Binning" and "Triangulation," the parameters below them are changed. Table 3 below summarizes each of the parameters for "Binning."


Table 3. Binning Options
Cell Assignment
  • Average: uses the mean of the points in the cell
  • IDW: uses inverse distance weighted interpolation to calculate cell values -- the greater a point's distance from the cell center, the smaller its influence on the cell value
  • Maximum: uses the highest point value in the cell
  • Minimum: uses the lowest point value in the cell
  • Nearest: uses the value of the point nearest to the cell center
Void Fill
  • None: leaves empty cells empty
  • Simple: uses the mean of the nearest cells that have values to assign a value to the empty cell
  • Linear: triangulates across the void area and uses linear interpolation to fill the empty cell
  • Natural Neighbor: uses natural neighbor interpolation, a more sophisticated alternative to the linear void fill method, for a smoother interpolation result
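
To make the binning idea concrete, the following is a minimal pure-Python sketch, not the ArcGIS Pro implementation. It drops a handful of hypothetical (x, y, z) points into a small grid and reduces each cell with the "Average," "Maximum," "Minimum," or "Nearest" rule. IDW and the void fill methods are omitted, so cells without points are simply left empty (None).

```python
import math


def bin_points(points, x_min, y_min, cell_size, n_cols, n_rows, method="average"):
    """Reduce the (x, y, z) points falling in each grid cell to a single value.

    Row 0 is the southernmost row here, an assumption made for simplicity.
    Cells that receive no points stay None (no void fill is applied).
    """
    cells = [[[] for _ in range(n_cols)] for _ in range(n_rows)]
    for x, y, z in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            cells[row][col].append((x, y, z))

    def reduce_cell(cell, row, col):
        if not cell:
            return None
        zs = [p[2] for p in cell]
        if method == "average":
            return sum(zs) / len(zs)
        if method == "maximum":
            return max(zs)
        if method == "minimum":
            return min(zs)
        if method == "nearest":
            # Take the z value of the point closest to the cell center.
            cx = x_min + (col + 0.5) * cell_size
            cy = y_min + (row + 0.5) * cell_size
            return min(cell, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))[2]
        raise ValueError(f"Unknown method: {method}")

    return [[reduce_cell(cells[r][c], r, c) for c in range(n_cols)]
            for r in range(n_rows)]


# Hypothetical points: three fall in the lower-left cell, one in the upper-right.
points = [(1, 1, 500.0), (4, 4, 520.0), (6, 2, 510.0), (12, 12, 530.0)]
for method in ("average", "maximum", "minimum", "nearest"):
    print(method, bin_points(points, 0, 0, 10, 2, 2, method))
```

With these points, the lower-left cell evaluates to 510.0 (average), 520.0 (maximum), 500.0 (minimum), and 520.0 (nearest), which shows how much the choice of cell assignment can change a cell that contains both ground and rooftop returns.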

Likewise, Table 4 below summarizes each of the parameters for "Triangulation".


Table 4. Triangulation Options
Interpolation Methods
  • Linear: uses linear interpolation across each triangle of the triangulated surface to determine cell values
  • Natural Neighbor: uses natural neighbor interpolation, a more sophisticated alternative to the linear method, for a smoother interpolation result
Thinning Type
  • No Thinning: uses all of the LAS points
  • Window Size: this “thins” the points by keeping one point per window, which may make the raster faster to process
Selection Method (Window Size Only)
  • Maximum: uses the highest point in the window size
  • Minimum: uses the lowest point in the window size
  • Closest to Mean: uses the point whose value is closest to the mean of the points in the window size

For "Cell Assignment" use "Nearest" and for "Void Fill Method" use "Linear". These parameters will help preserve the edges of buildings, structures, and trees.


Under "Output Data Type," the options "Floating" or "Integer" are available. For the purposes of this tutorial, either option can be selected, though the choice ultimately depends on the intended use of the raster data. A summary of the differences is presented in Table 5 below.


Table 5. Floating and Integer Valued Rasters
Output Data Type | Description
Floating | "Floating" rasters store decimal values, making them ideal for displaying elevation data. The caveat is that, because the values are continuous rather than countable, an attribute table cannot be created for the raster. Generally, this makes floating-point rasters ideal for data visualization, but less useful for analytical workflows.
Integer | Integer rasters retain an attribute table, as their values are countable. If a TIN model for 3D modeling with building footprints is to be created, or if elevation values need to be extracted to a feature layer, selecting an "Integer" raster type may be more appropriate. The disadvantage of integer rasters is that they cannot display data with the same level of detail as floating-point rasters.
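
The trade-off in Table 5 can be illustrated with a few made-up elevation values. In the sketch below, the floating-point values are all distinct, so a value-and-count table would have one row per cell, while the integer version collapses into a short table at the cost of sub-foot detail. Rounding to the nearest whole number is an assumption made for illustration; the tool's exact conversion rule is not covered here.

```python
from collections import Counter


def attribute_table(values):
    """Return (value, count) rows, similar in spirit to a raster attribute table."""
    return sorted(Counter(values).items())


# Hypothetical elevations in feet.
elevations_ft = [512.37, 512.41, 512.92, 513.08, 513.49, 540.75]

# Assumption: conversion to integer rounds to the nearest whole number.
as_integer = [int(round(z)) for z in elevations_ft]

print(attribute_table(as_integer))          # [(512, 2), (513, 3), (541, 1)]
print(len(attribute_table(elevations_ft)))  # 6
```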

"Sampling Type" offers two choices: "Cell Size" and "Observations" (Table 6). Both options include the "Sampling Value" and "Z Factor" sub-options. Since a square grid of LIDAR data is used, "Cell Size" should be selected as the "Sampling Type."


For "Sampling Value," a lower sampling value produces a higher resolution raster. The default value of ten can be left as is, although other values (such as 1, 5, 25, or 100) may be tested by running the tool to observe how the resolution of the output raster is affected.


The "Z Factor" is used to convert the z (elevation) units so that they match the x and y units when the two are measured differently. For example, if the x and y units are in feet and the z coordinate is in meters, the Z factor would be set to 3.28084 to convert the z units from meters to feet. It is also used to apply vertical exaggeration in 3D modeling, making vertical features more prominent. The Z factor can be left at the default value of one, as no conversion of measurements or 3D modeling is required.
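
The conversion value can be derived rather than memorized: the Z factor is the number of x and y units contained in one z unit. The sketch below assumes a small, hand-written lookup of unit lengths in meters; the dictionary and function names are hypothetical.

```python
# Length of one unit expressed in meters (international foot and US survey foot).
METERS_PER_UNIT = {
    "meter": 1.0,
    "foot": 0.3048,
    "us_survey_foot": 1200 / 3937,
}


def z_factor(z_unit, xy_unit):
    """Factor that rescales z values into the units used for x and y."""
    return METERS_PER_UNIT[z_unit] / METERS_PER_UNIT[xy_unit]


print(round(z_factor("meter", "foot"), 5))  # 3.28084
print(z_factor("foot", "foot"))             # 1.0
```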


Table 6. Sampling Types for Raster Datasets
Sampling Type | Description
Observations | The number of cells that divide the lengthiest side of the LAS dataset extent will be used. This method is ideal if the raster is rectangular rather than square shaped.
Cell Size | The cell size of the output raster will be used. This is the default.
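
The two sampling types describe the same grid in different ways, as the following sketch suggests. The extent dimensions are hypothetical, and rounding the row and column counts up to whole cells is an assumption rather than documented tool behavior.

```python
import math


def cell_size_from_observations(width, height, observations):
    """Cell size implied by dividing the longest side into `observations` cells."""
    if observations <= 0:
        raise ValueError("observations must be positive")
    return max(width, height) / observations


def grid_shape(width, height, cell_size):
    """(rows, columns) needed to cover the extent; partial cells are rounded up."""
    return math.ceil(height / cell_size), math.ceil(width / cell_size)


# Hypothetical rectangular extent measured in feet.
width_ft, height_ft = 5000.0, 3000.0

print(grid_shape(width_ft, height_ft, 10.0))  # (300, 500)
size = cell_size_from_observations(width_ft, height_ft, 250)
print(size, grid_shape(width_ft, height_ft, size))  # 20.0 (150, 250)
```

Halving the sampling value in this example would quadruple the number of cells, which is why lower sampling values produce sharper but larger rasters.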

Under the "Environments" tab, there are a few more settings (Figure 7).


Environments
Figure 7. Looking at the Environment Settings of the Create Raster from LAS Dataset Tool

The parameters "Output Coordinates," "Raster Analysis," and "Geodatabase" are straightforward, as they refer to the output coordinate system of the raster, the cell size and snapping behavior used during raster analysis, and whether a specific geodatabase configuration keyword is required. Particular attention should be given to the "Raster Storage" parameter, as it will affect the appearance of the raster dataset. The "Pyramid" option is checked by default: pyramids are built to potentially expedite the drawing speed of the raster dataset. Pyramids are essentially down-sampled versions of the raster dataset, which can then be displayed more quickly in ArcGIS Pro. When zooming in, the details from the original raster will be displayed. This is useful for large rasters, as it prevents ArcGIS from needing to load every detail at once.


Specific "Pyramid levels" can also be specified. If the field is left empty, all pyramids will be built. If a specific number is entered, that number of pyramids will be constructed. Below "Pyramid levels" is the "Skip first" option, which is unchecked by default—this would skip the creation of the first pyramid level of the raster.
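
The idea behind pyramid levels can be sketched in a few lines. The example below assumes that each level halves the resolution of the previous one and that values are averaged in 2x2 blocks; both are simplifications, since the actual values depend on the chosen resampling technique, and the point at which ArcGIS Pro stops building levels is not modeled here.

```python
def downsample_2x(grid):
    """Halve a grid by averaging 2x2 blocks; an odd trailing row or column is dropped."""
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
            for c in range(0, len(grid[0]) - 1, 2)
        ]
        for r in range(0, len(grid) - 1, 2)
    ]


def build_pyramids(grid, levels=None):
    """Build reduced-resolution copies of a grid.

    With levels=None, levels are built until the grid cannot be halved again
    (an assumption standing in for "all pyramids will be built").
    """
    pyramids = []
    current = grid
    while len(current) >= 2 and len(current[0]) >= 2:
        if levels is not None and len(pyramids) == levels:
            break
        current = downsample_2x(current)
        pyramids.append(current)
    return pyramids


# A small 4x4 test grid holding the values 0 through 15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
for level in build_pyramids(grid):
    print(level)
```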


The "Resampling Techniques" used when building the pyramids are described in Table 7 below. This parameter affects how the values of neighboring cells are used to assign a value to the output raster cell.


Table 7. Resampling Methods for Raster Datasets
Method | Description
Nearest Neighbor | The Nearest Neighbor method assigns the value of the closest cell to the output cell. This method is ideal for discrete data, such as land use, that uses integer values, as it does not smooth out the data.
Bilinear Interpolation | Bilinear Interpolation is typically used for continuous datasets, such as elevation, as the weighted distance average of the four nearest cell values is used to determine the value of a new cell.
Cubic Convolution | Cubic Convolution is similar to the bilinear interpolation method, with the difference being that the weighted distance average of the nearest 16 cell values is used. This results in a less distorted raster, which is ideal for continuous data, though it comes at the cost of higher processing time. Output cell values can also fall outside the range of the input cells.
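
The difference between the first two methods in Table 7 can be seen by sampling a tiny grid at a location that falls between cell centers. The sketch below treats row and column positions as fractional cell-center indices, which is an assumption made for illustration; cubic convolution is left out to keep the example short.

```python
import math


def nearest_neighbor(grid, row, col):
    """Value of the cell whose center is closest to the fractional position."""
    r = min(max(int(math.floor(row + 0.5)), 0), len(grid) - 1)
    c = min(max(int(math.floor(col + 0.5)), 0), len(grid[0]) - 1)
    return grid[r][c]


def bilinear(grid, row, col):
    """Distance-weighted average of the four cells surrounding the position."""
    r0, c0 = int(math.floor(row)), int(math.floor(col))
    r1 = min(r0 + 1, len(grid) - 1)
    c1 = min(c0 + 1, len(grid[0]) - 1)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr


grid = [[10.0, 20.0],
        [30.0, 40.0]]

print(nearest_neighbor(grid, 0.25, 0.75))  # 20.0
print(bilinear(grid, 0.25, 0.75))          # 22.5
```

Nearest neighbor returns a value that already exists in the grid, which is why it suits discrete data, whereas bilinear interpolation produces a new in-between value that suits continuous surfaces such as elevation.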

"Raster Statistics" is checked by default. Raster statistics consist of standard descriptive statistics (i.e., mean, maximum, minimum, standard deviation) calculated from the cell values of each band in a raster. These statistics are necessary for certain operations, such as applying a contrast stretch or classifying data. Even if unchecked, raster statistics can still be calculated later. Below are the "X skip factor," "Y skip factor," and "Statistics ignore value(s)." The X and Y skip factors, as the names imply, impose a skip distance between samples along the x and y axes. This can be used to limit or manage samples during raster statistics creation. "Statistics ignore value(s)" is a semicolon-separated list of values that should be excluded from raster statistics.

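As a rough illustration of how skip factors and ignore values affect raster statistics, the sketch below computes the statistics over a small hypothetical grid. It assumes that a skip factor of n means every n-th cell is sampled and uses -9999 as a made-up NoData placeholder; it is not a reproduction of the ArcGIS Pro calculation.

```python
import statistics


def raster_statistics(grid, x_skip=1, y_skip=1, ignore_values=()):
    """Descriptive statistics over a sampled subset of the grid's cells."""
    samples = [
        value
        for row in grid[::y_skip]    # sample every y_skip-th row
        for value in row[::x_skip]   # sample every x_skip-th column
        if value not in ignore_values
    ]
    return {
        "count": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.fmean(samples),
        "std": statistics.pstdev(samples),
    }


# Hypothetical elevations, with -9999 standing in for a NoData placeholder.
grid = [
    [500, 502, 504, 506],
    [501, -9999, 505, 507],
    [502, 504, 506, 508],
    [503, 505, 507, 509],
]

print(raster_statistics(grid, ignore_values=(-9999,)))
print(raster_statistics(grid, x_skip=2, y_skip=2, ignore_values=(-9999,)))
```

Without the ignore value, the placeholder would drag the minimum and mean far below any real elevation, which in turn would distort a contrast stretch based on those statistics.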

Finally, the "Compression" field is available. Typically, the default value "LZ77" can be left unchanged, as this compression method is compatible with a wide range of raster types and preserves all cell values in the raster. A detailed breakdown on raster compression can be found here.


With the parameters set, the "LAS Dataset to Raster" tool should be run. An output similar to Figure 8 below should have been created. It should be noted that the raster appears pixelated with stark edges around trees and structures; this is due to the selected parameters. If more intensive resampling methods and cell assignment techniques had been applied, a smoother gradient between elevation values would have been produced.


Like the LAS Dataset, when the created raster is selected in the Table of Contents, a unique "Raster Layer" option will appear on the top ribbon of ArcGIS Pro. Here, the Symbology method, Stretch Type, and Resampling Type can be altered. No changes are necessary, but it is worthwhile to explore and test these options.

Visually analyzing the raster, several notable features can be observed. The top-left part of the raster shows bright white structures—these are buildings on the Western Kentucky University (WKU) campus, located on historic Vinegar Hill. The bright white color denotes higher elevation. East of the WKU campus, high-elevation tree tops in the College Hill Historic District can be seen, which connects to Reservoir Hill (not included in the raster). To the south, a solid black line dotted with moderate-sized structures cuts the raster somewhat diagonally. This is the US-31W Bypass, which once marked the edge of Bowling Green in the 1950s and 1960s and now serves as a major thoroughfare through the city. In the south-central part of the raster, a large structure that sits lower than WKU (as indicated by its darker color) is visible. This is the TC Cherry Elementary School, which serves much of downtown and southern Bowling Green.


Elevation Raster
Figure 8. The Elevation Raster

Creating a Hillshade from a Raster Dataset


The "Hillshade" tool should be searched for in the geoprocessing toolbox. Similar to the process of creating a raster dataset, several parameters need to be considered when generating a hillshade (Table 8).


Table 8. Hillshade Dataset Creation Parameters
Setting | Description
Azimuth | The angle from which the light source interacts with the hillshade. It is measured clockwise starting at north. The default azimuth is 315°, or a light source simulated from the northwest.
Altitude | The angle of illumination above the horizon: the value ranges from 0° to 90°. 0° is on the horizon, whereas 90° is directly above the hillshade.
Model shadows | A checkbox is available to determine whether shadows should be modeled. If unchecked, only local illumination angles will be considered in the hillshade. If checked, shadows will also be considered. In either case, output values range from 0 to 255, with 0 representing the darkest areas and 255 representing the brightest areas. By default, this option is unchecked.
Z factor | The z-factor is an adjustment factor applied to vertical units when they differ from the horizontal coordinate units on the surface. If all the units (x, y, and z) are the same, the z-factor remains at its default value of one. However, if, for example, the x and y units are in feet and the z coordinate is in meters, the z-factor would be set to 3.28084 to convert the z units from meters to feet. The z-factor is also used when symbolizing to exaggerate three-dimensional features.
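
To show how the parameters in Table 8 interact, the following is a minimal sketch of a commonly documented hillshade calculation for a single cell, using a 3x3 window of elevations and Horn's gradient estimate. It is assumed, not verified, that this matches the output of the ArcGIS Pro tool exactly, and shadow modeling is not included. The window values are hypothetical.

```python
import math


def hillshade_cell(window, cell_size, azimuth=315.0, altitude=45.0, z_factor=1.0):
    """Hillshade value (0 to 255) for the center cell of a 3x3 elevation window.

    Window rows run north to south and columns run west to east.
    """
    (a, b, c), (d, _center, f), (g, h, i) = window

    # Convert the lighting angles: altitude to a zenith angle, and the
    # compass azimuth (clockwise from north) to a mathematical angle.
    zenith = math.radians(90.0 - altitude)
    azimuth_math = math.radians((360.0 - azimuth + 90.0) % 360.0)

    # Rate of elevation change in x and y (Horn's method).
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell_size)

    slope = math.atan(z_factor * math.hypot(dz_dx, dz_dy))
    aspect = math.atan2(dz_dy, -dz_dx)
    if aspect < 0:
        aspect += 2 * math.pi

    shade = 255.0 * (
        math.cos(zenith) * math.cos(slope)
        + math.sin(zenith) * math.sin(slope) * math.cos(azimuth_math - aspect)
    )
    return max(0.0, shade)


flat = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
faces_northwest = [[0, 1, 2], [1, 2, 3], [2, 3, 4]]  # ground drops toward the NW
faces_southeast = [[4, 3, 2], [3, 2, 1], [2, 1, 0]]  # ground drops toward the SE

print(round(hillshade_cell(flat, 1.0)))  # 180
print(hillshade_cell(faces_northwest, 1.0) > hillshade_cell(faces_southeast, 1.0))  # True
```

With the default azimuth of 315°, the slope facing northwest is brightly lit while the slope facing southeast falls to zero, and raising the z-factor steepens the computed slope and strengthens that contrast.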

The default azimuth value of 315° may seem oddly specific, but there is a reason for this, grounded in psychology. Viewers generally assume that surfaces are lit from above, with a slight leftward bias. If something is lit from below, the "Crater Illusion effect" can occur, causing elevated structures to appear as depressions. A simple example of this effect using dots can be seen here. The azimuth should generally be left at the default value unless there is a specific reason to alter it.


The altitude of the lighting source can be left at its default value, or it can be adjusted to simulate the current position of the sun. Information about the current position of the sun relative to a chosen location can be calculated here.


The choice to model shadows is primarily an aesthetic decision. It may be enabled to determine whether it is relevant to the data.


The default z-factor is set to convert z units from meters to feet, which will exaggerate features to some degree. For the purposes of this tutorial, the default value has been maintained.


Creating a Hillshade
Figure 9. Creating a Hillshade from the Raster Dataset

Naturally, many things can be done with the hillshade, raster elevation dataset, or the LIDAR data. These datasets can be further analyzed to extract valuable insights, such as creating 3D models, conducting terrain analysis, or identifying patterns in the landscape.
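
For readers who prefer scripting, the workflow in this tutorial could plausibly be chained together with ArcPy. The sketch below is untested against the tutorial's data and should be read as an outline under stated assumptions: all paths are hypothetical, ArcPy is only importable inside an ArcGIS Pro Python environment with the Spatial Analyst extension, and the hillshade z-factor of 1 assumes that the x, y, and z units already match.

```python
def binning_interpolation(cell_assignment="NEAREST", void_fill="LINEAR"):
    """Build the interpolation string passed to the LAS Dataset To Raster tool."""
    allowed_assignment = {"AVERAGE", "MINIMUM", "MAXIMUM", "IDW", "NEAREST"}
    allowed_fill = {"NONE", "SIMPLE", "LINEAR", "NATURAL_NEIGHBOR"}
    if cell_assignment not in allowed_assignment:
        raise ValueError(f"Unknown cell assignment: {cell_assignment}")
    if void_fill not in allowed_fill:
        raise ValueError(f"Unknown void fill: {void_fill}")
    return f"BINNING {cell_assignment} {void_fill}"


def run_workflow(laz_path, las_folder, lasd_path, dem_path, hillshade_path):
    """Decompress LAZ, build a LAS dataset, rasterize it, and create a hillshade."""
    import arcpy  # only available inside an ArcGIS Pro Python environment

    arcpy.CheckOutExtension("Spatial")
    try:
        # Step 1: decompress the LAZ file into the (hypothetical) target folder.
        arcpy.conversion.ConvertLas(laz_path, las_folder, compression="NO_COMPRESSION")
        # Step 2: reference the uncompressed LAS files in a LAS dataset.
        arcpy.management.CreateLasDataset(las_folder, lasd_path)
        # Step 3: rasterize elevation using the tutorial's binning choices.
        arcpy.conversion.LasDatasetToRaster(
            lasd_path, dem_path, "ELEVATION", binning_interpolation(),
            "FLOAT", "CELLSIZE", 10, 1,
        )
        # Step 4: hillshade with the default azimuth and altitude, no shadows.
        shaded = arcpy.sa.Hillshade(dem_path, 315, 45, "NO_SHADOWS", 1)
        shaded.save(hillshade_path)
    finally:
        arcpy.CheckInExtension("Spatial")


print(binning_interpolation())  # BINNING NEAREST LINEAR
```

Keeping the interpolation string in its own helper makes it easy to rerun the rasterization step with different cell assignment and void fill combinations and compare the results.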


List of Figures and Tables


Figure 1. Finding Point Cloud Data on KYFromAbove

Figure 2. Selecting a LIDAR Index Grid to Download on the ArcGIS Web Map

Figure 3. Converting LAZ to LAS on ArcGIS Pro

Figure 4. LAS Data from a Distance and LAS Data up close

Figure 5. Creating a LAS Dataset

Figure 6. Creating a Raster from the LAS Dataset

Figure 7. Looking at the Environment Settings of the Create Raster from LAS Dataset Tool

Figure 8. The Elevation Raster

Figure 9. Creating a Hillshade from the Raster Dataset


Table 1. Image Files as Raster Datasets

Table 2. Binning vs Triangulation Interpolation

Table 3. Binning Options

Table 4. Triangulation Options

Table 5. Floating and Integer Valued Rasters

Table 6. Sampling Types for Raster Datasets

Table 7. Resampling Methods for Raster Datasets

Table 8. Hillshade Dataset Creation Parameters


References


Map Viewer. (n.d.). https://kygeonet.maps.arcgis.com/apps/mapviewer/index.html?webmap=b5ff91df6309491090c20333c8f58f52

Raster file formats—ArcGIS Pro | Documentation. (n.d.). https://pro.arcgis.com/en/pro-app/latest/help/data/imagery/supported-raster-dataset-file-formats.htm

GISGeography. (2024, July 12). Top 6 Free LiDAR data Sources. GIS Geography. https://gisgeography.com/top-6-free-lidar-data-sources/

Raster Compression—ArcMap | Documentation. (n.d.). https://desktop.arcgis.com/en/arcmap/latest/manage-data/raster-and-images/raster-compression.htm

Akiyoshi Kitaoka's Illusion Pages (北岡明佳の錯視のページ). (n.d.). http://www.psy.ritsumei.ac.jp/akitaoka/cratorRamachandran01.jpg

SunCalc sun position and sun phases calculator. (n.d.). https://www.suncalc.org/