EO - Pt.1
EO (Earth Observer) is an ongoing project that aims to layer geospatial data from disparate sources and topics into a unified analysis and visualization platform. Currently, EO is focused on using Landsat, 3DEP and Dynamic World data to quantify temporal changes in land surface temperature, vegetation health and density (NDVI), terrain elevation and urban materials.
Landsat Toolkit
This toolkit is being developed to streamline the acquisition, preprocessing, and spatial and temporal analysis of multispectral raster data from the USGS's Landsat 8 and 9 missions. The toolkit also incorporates other spatial features such as slope, elevation, catalogued features (water bodies, land formations, forests, built structures, etc.), and socioeconomic and demographic data such as population, income and various other indicators. The intent of this project is to provide comprehensive descriptions of land characteristics and changing land dynamics over time. One example use of the toolkit is determining the suitability of land for purposes like homesteading or small-scale farming by analyzing how land surface temperature and vegetation health have changed in a given area over time, and whether terrain characteristics such as topography are suitable for the intended land use.
Data Acquisition and Preprocessing in Earth Engine
While the majority of tools are being developed from the ground up, Earth Engine is the current means of data acquisition and basic preprocessing. Earth Engine is a powerful cloud-based tool that allows us to harness parallel processing in the cloud, thus removing the limitations of local processing capabilities. Earth Engine provides access to a wide variety of data collections, including the USGS’s Landsat 8/9 Level 2, Collection 2, Tier 1 collection.
Temporal Filtering
This process is designed to analyze imagery on a year-over-year, pixel-wise basis. Input images consist of composite/mosaic imagery created from spatially and temporally filtered imagery. Temporal filtering is conducted in Earth Engine at the time of data retrieval. The user specifies the years and the window of time within each year for which to retrieve imagery. For example, in the DFW urban heat island analysis, imagery from June through September of each year between 2015 and 2022 was filtered to generate the composite/mosaic imagery.
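As a rough illustration of the temporal filter, the sketch below builds one date window per year for the DFW example. The function name `seasonal_windows` and its parameters are hypothetical and not part of the toolkit. The end date is exclusive, which matches how Earth Engine's `filterDate` treats its end argument.

```python
from datetime import date

def seasonal_windows(start_year, end_year, start_month, end_month):
    """Return one (start, end) ISO date pair per year; the end date is exclusive."""
    windows = []
    for year in range(start_year, end_year + 1):
        start = date(year, start_month, 1)
        # Exclusive end: the first day of the month after end_month.
        if end_month == 12:
            end = date(year + 1, 1, 1)
        else:
            end = date(year, end_month + 1, 1)
        windows.append((start.isoformat(), end.isoformat()))
    return windows

# June through September for each year from 2015 to 2022.
windows = seasonal_windows(2015, 2022, 6, 9)
for start, end in windows:
    print(start, end)
```

Each pair could then be passed to an Earth Engine date filter (for example `collection.filterDate(start, end)`) before building that year's composite.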
Spatial Filtering
This process takes what we refer to as the "main roi", which captures the full extent of the analysis area. To filter imagery, we construct a bounding box in Earth Engine using bounding point coordinates given by the user. The points are used to construct the lower-left and upper-right corners of the bounding box. This main roi is then applied to the filtered collection and used to clip the collection to images within the bounds of the main roi.
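A minimal sketch of the corner handling, assuming the user supplies two (longitude, latitude) points in either order. The function name `bounding_box` and the coordinates are illustrative only and are not taken from the toolkit.

```python
def bounding_box(point_a, point_b):
    """Return (west, south, east, north) from two (lon, lat) corner points."""
    (lon_a, lat_a), (lon_b, lat_b) = point_a, point_b
    west, east = sorted((lon_a, lon_b))
    south, north = sorted((lat_a, lat_b))
    if west == east or south == north:
        raise ValueError("corner points must differ in both longitude and latitude")
    return (west, south, east, north)

# Illustrative coordinates only; not the actual DFW analysis extent.
main_roi = bounding_box((-97.5, 32.5), (-96.5, 33.1))
print(main_roi)
```

In Earth Engine, the resulting values could be passed as a list to `ee.Geometry.Rectangle([west, south, east, north])` and used with `filterBounds` on the collection.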
Tiles
Raster files can be extremely large, making them difficult to process with local hardware. To improve local processing, large GeoTiff files exported from Earth Engine are broken into smaller tiles. The default tile dimension is 1000x1000 pixels, and the tile dimensions can be adjusted by the user. For example, if a GeoTiff file measures 5000x5000 pixels, it will be split into 25 GeoTiffs of 1000x1000 pixels each.
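The tiling arithmetic can be sketched as follows. This only computes pixel windows and does not read or write any raster files; the function name `tile_windows` and the tuple layout are assumptions made for illustration. Edge tiles are allowed to be smaller when the image size is not a multiple of the tile size.

```python
def tile_windows(width, height, tile_size=1000):
    """Return (row, col, x_off, y_off, tile_width, tile_height) for each tile."""
    windows = []
    for row, y_off in enumerate(range(0, height, tile_size)):
        for col, x_off in enumerate(range(0, width, tile_size)):
            # Clamp the last tile in each direction to the image edge.
            tile_width = min(tile_size, width - x_off)
            tile_height = min(tile_size, height - y_off)
            windows.append((row, col, x_off, y_off, tile_width, tile_height))
    return windows

tiles = tile_windows(5000, 5000)
print(len(tiles))
```

For a 5000x5000 image with the default tile size this yields the 25 tiles described above.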
Log.json
During the tiling process, the geographic coordinates of the tile bounding box are logged to a log.json file which is referenced in later processes. The log.json file also tracks processing times for each tile.
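One possible shape for a log entry is sketched below. The actual schema of log.json is not described here, so the keys `bounds` and `processing_seconds`, the tile ID format, and the coordinates are all hypothetical.

```python
import json
import time

def log_tile(log, tile_id, bounds, started):
    """Record a tile's bounding box and elapsed processing time in seconds."""
    west, south, east, north = bounds
    log[tile_id] = {
        "bounds": {"west": west, "south": south, "east": east, "north": north},
        "processing_seconds": round(time.perf_counter() - started, 3),
    }

log = {}
started = time.perf_counter()
# ... tile processing would happen here ...
log_tile(log, "1-3", (-96.8, 32.8, -96.7, 32.9), started)

# Serialize the log; this string is what would be written to log.json.
log_json = json.dumps(log, indent=2)
print(log_json)
```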
Tracking the geographic bounding box of each tile is a critical part of layering geospatial data from other sources. The use of reference bounding boxes is exemplified when 3DEP terrain data is integrated into the model. For example, if terrain data is added for a custom region that overlaps multiple Landsat tiles, the reference bounding boxes are used to quickly determine which Landsat tiles the terrain tile overlaps. The overlapped tiles are then loaded and used to combine Landsat and terrain data into a composite tile. In this example we see that the terrain tile overlaps Landsat tiles 1-3 and 2-3.
NOTE: While this is the current process for determining whether a point from a Landsat tile is within 3DEP tile bounds, this process is being updated for better efficiency. I will be updating the data storage schema for efficient union and overlap filtering.
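The bounding-box lookup described above can be sketched as a simple rectangle intersection test. This assumes each tile's bounds are available as (west, south, east, north) tuples, for instance as read from log.json; the coordinates and function names are made up for illustration, and boxes that only touch along an edge are not counted as overlapping.

```python
def overlaps(a, b):
    """True if two (west, south, east, north) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def overlapping_tiles(tile_bounds, query):
    """Return the sorted IDs of tiles whose bounds overlap the query box."""
    return sorted(tile_id for tile_id, box in tile_bounds.items()
                  if overlaps(box, query))

# Hypothetical Landsat tile bounds keyed by tile ID.
landsat_tiles = {
    "1-2": (-96.9, 32.8, -96.8, 32.9),
    "1-3": (-96.8, 32.8, -96.7, 32.9),
    "2-2": (-96.9, 32.7, -96.8, 32.8),
    "2-3": (-96.8, 32.7, -96.7, 32.8),
}
# Hypothetical terrain tile that straddles two Landsat tiles.
terrain_tile = (-96.78, 32.75, -96.72, 32.85)

print(overlapping_tiles(landsat_tiles, terrain_tile))
```

This linear scan checks every tile; a spatial index would scale better for large tile counts.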
Landsat Band Combinations for Spectral Analysis
The Landsat toolkit conducts spatial and temporal analysis of time-series geospatial data to quantify how spectral characteristics have changed over a given time frame. The recommended time frame is ten years in order to provide enough time-series data for regression and LSTM predictions. The multispectral nature of Landsat means we can use different band combinations to derive spectral indices. The bands captured by the Operational Land Imager (OLI) are used to derive spectral indices such as the Normalized Difference Vegetation Index (NDVI), which is used to quantify the health and density of vegetation. This is possible because chlorophyll (the pigment responsible for photosynthesis in plants) absorbs red light, which the OLI captures in band 4, while healthy leaf tissue strongly reflects near-infrared light, which the OLI captures in band 5. Other spectral indices are used to differentiate between natural and urban materials like concrete and asphalt, which allows us to measure where new urban development has occurred. The Thermal Infrared Sensor (TIRS) on Landsat is used for measuring land surface temperature and atmospheric temperature.
Band combinations for deriving spectral indices
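As a small worked example of a band combination, NDVI is computed as (NIR - Red) / (NIR + Red), which for Landsat 8/9 means band 5 and band 4. The sketch below assumes the inputs are already surface reflectance values and uses made-up numbers; the function name `ndvi` is illustrative.

```python
def ndvi(nir, red):
    """NDVI from near-infrared and red reflectance; None when undefined."""
    total = nir + red
    if total == 0:
        return None  # avoid division by zero on empty or masked pixels
    return (nir - red) / total

# Made-up reflectance values: vegetation reflects far more NIR than red,
# while materials like concrete reflect the two bands more evenly.
print(ndvi(0.5, 0.1))
print(ndvi(0.3, 0.25))
```

Values near 1 indicate dense, healthy vegetation, while values near 0 are typical of bare or built surfaces.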
Landsat Spatial & Temporal Analysis Process
Resampling is an important part of this process because it allows us to fill in gaps in the data and spatially aggregate the data to a desired spatial resolution. For example, if the analysis covers a large geographic area where the 30-meter spatial resolution of the original imagery is not necessary and a lower spatial resolution is sufficient, the resampling process allows the data to be simplified to a degree specified by the user. The composite imagery may have gaps in the data and/or hard contrasts where image tiles were merged together by Earth Engine's simple composite algorithm. These can disrupt the analysis, so to mitigate potential issues, resampleLandsat.py applies a Gaussian kernel over the image and smooths the data within the kernel. Then, a spatial aggregation window of a size determined by the user is applied over the image to resample and simplify it. For example, if the user defines the resampling window size to be 3x3, a 3x3 pixel sliding window is applied over the imagery and takes the mean of the pixel values within the window for all applicable spectral indices (NDVI, land surface temperature). As the sliding window moves across the imagery, a real-world coordinate is generated using the row and column indexes of the window and the distance of the window from the northwest corner of the GeoTiff.
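The aggregation and coordinate steps can be sketched in plain Python as below. This is not the code in resampleLandsat.py: the Gaussian smoothing step is omitted, the window is assumed to advance by its own size (non-overlapping), the raster is assumed to be north-up with square pixels, and the origin and pixel size are illustrative values in projected meters.

```python
def resample(grid, window, origin_x, origin_y, pixel_size):
    """Aggregate a 2D grid with non-overlapping window x window means.

    Returns a list of (x, y, mean) where (x, y) is the window center,
    measured from the northwest corner (origin_x, origin_y).
    None values are treated as gaps and ignored.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - window + 1, window):
        for c in range(0, cols - window + 1, window):
            values = [grid[r + i][c + j]
                      for i in range(window) for j in range(window)
                      if grid[r + i][c + j] is not None]
            if not values:
                continue  # the whole window is a gap
            x = origin_x + (c + window / 2) * pixel_size
            y = origin_y - (r + window / 2) * pixel_size  # y decreases southward
            out.append((x, y, sum(values) / len(values)))
    return out

# A 3x6 grid with one gap, 30 m pixels, and a made-up northwest corner.
grid = [
    [1, 1, 1, 4, 4, 4],
    [1, 1, 1, 4, None, 4],
    [1, 1, 1, 4, 4, 4],
]
result = resample(grid, 3, 500000.0, 3600000.0, 30.0)
print(result)
```

With a 3x3 window over 30 m pixels, each output value represents a 90 m cell, and the gap in the second window is simply left out of its mean.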
This data can now be aggregated in different ways to provide spatial insights. One application of this analysis process is to quantify the impact of different building typologies on urban heat islands. If you want to know what the biggest contributors to urban heat islands are in a city, this analysis process allows you to do just that. In this example, GIS parcel data for the city of Dallas, TX is used to pool the temporally analyzed data. If a pixel falls within the boundaries of a parcel polygon, the data associated with the pixel is added to an array in the data object for the parcel's building type. After all pixels are pooled by parcel, the building typology data objects are summarized by taking the mean, standard deviation and min/max of land surface temperature and NDVI. Now we can see the average land surface temperature and NDVI of each parcel type within the city.
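The pooling and summary step might look like the sketch below. It assumes the point-in-polygon test has already assigned each pixel a building type, so only the grouping and statistics are shown. The building type names and values are invented inputs for illustration, not results from the Dallas analysis, and the population standard deviation is an arbitrary choice here.

```python
from collections import defaultdict
from statistics import mean, pstdev

def summarize_by_type(samples):
    """Summarize (building_type, lst, ndvi) samples per building type."""
    pooled = defaultdict(lambda: {"lst": [], "ndvi": []})
    for building_type, lst, ndvi in samples:
        pooled[building_type]["lst"].append(lst)
        pooled[building_type]["ndvi"].append(ndvi)
    summary = {}
    for building_type, values in pooled.items():
        summary[building_type] = {
            name: {"mean": mean(v), "std": pstdev(v), "min": min(v), "max": max(v)}
            for name, v in values.items()
        }
    return summary

# Invented samples: (building type, land surface temperature in C, NDVI).
samples = [
    ("warehouse", 42.0, 0.10),
    ("warehouse", 44.0, 0.14),
    ("single_family", 36.0, 0.45),
]
summary = summarize_by_type(samples)
print(summary["warehouse"]["lst"])
```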
Examples of mean LST and NDVI and the rate of change of LST and NDVI. Example location: Dallas, Texas.
Example from a Power BI dashboard showing average land surface temperature by parcel type.
Interface
The interface of this project is in development and is currently being run as a React application using DeckGL for the mapping components and Recharts for chart components. The example shown here is a rough interface prototype that allows the user to select individual tiles with a dropdown menu. When the user selects a tile, the application loads a JSON file that contains resampled Landsat spectral data. The DeckGL components use vertical bars to visualize normalized mean land surface temperature and NDVI. The histograms on the right use Recharts and show a summary of the distribution of land surface temperature and NDVI values within the selected tile.
3DEP Toolkit
Colorized USGS elevation data rendered with a DeckGL pointcloud layer in a React web application.