A Kepler-based Three Tier Architecture applied
to LiDAR Interpolation and Analysis
Efrat Frank, Ilkay Altintas
San Diego Supercomputer Center, UCSD

R. Haugerud,
U.S.G.S

The Computational Challenge:

LiDAR Introduction

Survey

LiDAR (Light Detection And Ranging, a.k.a. ALSM,
Airborne Laser Swath Mapping) produces point cloud
datasets that require high-performance processing of
high point density data.
LiDAR generates massive data volumes - billions of
returns are common.
Distribution of these volumes of point cloud data to
users via the internet represents a significant
challenge.
Processing and analysis of these data requires
significant computing resources not available to
most geoscientists.
Interpolation of these data challenges typical GIS/
interpolation software.
Our tests indicate that ArcGIS, Matlab and
similar software packages struggle to interpolate
even a small portion of these data.
Traditionally: Popularity > Resources

Figure labels: Process & Classify; Point Cloud (x, y, zn, ...); Interpolate / Grid; Analyze / Do Science
(D. Harding, NASA)

LiDAR Processing via Kepler

Increasing Usage of Technology
in Geosciences

An extensible, easy to use, workflow design and prototyping tool
On-the-fly creation of workflow instances from workflow templates
Integrating heterogeneous local and remote tools in a single interface:
Gridding and Imaging services via Web and Grid services
GIS services
Remote tools via SSH, SCP and GridFTP
Relational and spatial databases access
Direct access to data and tools from remote repositories
Reusable generic and domain specific actors
Support for High Performance Computations:
Job submission and monitoring
Logging of execution trace and registering intermediate products
Data provenance and failure recovery
Portal accessibility.
GEON LiDAR Workflow is deployed on the GEON portal
Reverse engineering of traditional approach
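
The on-the-fly creation of workflow instances from templates, listed above, can be pictured with a small sketch. It is illustrative only: the template text, the parameter names (`bbox`, `algorithm`, `resolution`, `output_format`) and the `instantiate_workflow` helper are hypothetical, and Kepler's actual workflow description format is not reproduced here.

```python
from string import Template

# Hypothetical workflow template. Element and parameter names are
# illustrative and do not reproduce Kepler's actual workflow format.
WORKFLOW_TEMPLATE = Template(
    "<workflow name='lidar-$job_id'>\n"
    "  <step name='subset' bbox='$bbox'/>\n"
    "  <step name='interpolate' algorithm='$algorithm' resolution='$resolution'/>\n"
    "  <step name='visualize' format='$output_format'/>\n"
    "</workflow>"
)


def instantiate_workflow(job_id, params):
    """Fill the template with user selections to obtain one workflow instance."""
    # substitute() raises KeyError when a required parameter is missing,
    # so an incomplete request fails before anything is submitted.
    return WORKFLOW_TEMPLATE.substitute(job_id=job_id, **params)


instance = instantiate_workflow(
    "0001",
    {
        "bbox": "-123.1,38.4,-122.9,38.6",
        "algorithm": "idw",
        "resolution": "1.0",
        "output_format": "ascii-grid",
    },
)
print(instance)
```

Keeping the template separate from the parameters means one template can serve many user requests, with each request differing only in the values substituted.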

Online data acquisition and access
Managing large databases
Indexing data on spatial and temporal attributes
Quick subsetting operations
Large scale resource sharing and management
Collaborative and distributed applications
Parallel gridding algorithms on large data sets using
high performance computing
Integrate data with other related data sets, e.g.
geologic maps, and hydrology models
Provide easy-to-use user interfaces from portals
and scientific workflow environments
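
Spatial indexing and quick subsetting, listed above, can be illustrated with a toy in-memory tile index. This is a sketch under stated assumptions: the poster's workflow performs subsetting with spatial database queries, whereas the `build_tile_index` and `subset` helpers below are hypothetical stand-ins that only show why an index avoids scanning every return.

```python
from collections import defaultdict
from math import floor


def build_tile_index(points, tile_size):
    """Group (x, y, z) points by the square tile that contains them."""
    index = defaultdict(list)
    for x, y, z in points:
        index[(floor(x / tile_size), floor(y / tile_size))].append((x, y, z))
    return index


def subset(index, tile_size, xmin, ymin, xmax, ymax):
    """Return points inside the bounding box, visiting only overlapping tiles."""
    selected = []
    for tx in range(floor(xmin / tile_size), floor(xmax / tile_size) + 1):
        for ty in range(floor(ymin / tile_size), floor(ymax / tile_size) + 1):
            # .get() avoids creating empty tiles in the defaultdict.
            for x, y, z in index.get((tx, ty), []):
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    selected.append((x, y, z))
    return selected


points = [(0.5, 0.5, 10.0), (1.5, 0.5, 11.0), (5.2, 7.9, 30.0), (9.9, 9.9, 42.0)]
index = build_tile_index(points, tile_size=2.0)
print(subset(index, 2.0, 0.0, 0.0, 2.0, 2.0))
```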

Figure labels (workflow phases):
Configuration phase
Subset: DB2 query on DataStar
Interpolate: Grass RST, Grass IDW, GMT
Visualize: Global Mapper, FlederMaus, ArcIMS
Analyze: move, process
Visualize: move, render, display
Other labels: Portal; Monitoring/Translation; Scheduling/Output Processing; Grid

GEON's Solution: A Three-Tier
Architecture for LiDAR Processing
GOAL: Efficient three-tier architecture for LiDAR
interpolation and analysis using GEON infrastructure
and tools
GEON Portal - front end layer
Kepler Scientific Workflow System - control layer
Kepler is used as a batch execution engine
GEON Grid - computation layer
Use scientific workflows to glue/combine different tools
and the infrastructure
The architecture provides efficient and reliable
LiDAR data analysis
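
The separation into front end, control and computation layers described above can be sketched as follows. This is a minimal illustration, not the GEON implementation: `JobRequest`, `ComputeLayer`, `WorkflowEngine` and `portal_submit` are hypothetical names, and the compute layer here only records what it was asked to do instead of dispatching to remote resources.

```python
from dataclasses import dataclass, field


@dataclass
class JobRequest:
    """What the front end layer (portal) collects from the user."""
    bbox: tuple
    algorithm: str
    resolution: float


class ComputeLayer:
    """Stand-in for the computation layer."""

    def run(self, step, params, upstream):
        # A real deployment would dispatch to remote tools here; this
        # sketch only records the step, its parameters and its input.
        return {"step": step, "params": params, "input": upstream}


@dataclass
class WorkflowEngine:
    """Stand-in for the control layer: runs the steps in order as a batch."""
    compute: ComputeLayer
    log: list = field(default_factory=list)

    def execute(self, request):
        steps = [
            ("subset", {"bbox": request.bbox}),
            ("interpolate", {"algorithm": request.algorithm,
                             "resolution": request.resolution}),
            ("visualize", {}),
        ]
        data = None
        for name, params in steps:
            data = self.compute.run(name, params, data)
            self.log.append(name)
        return data


def portal_submit(engine, bbox, algorithm, resolution):
    """Front end entry point: build a request and hand it to the control layer."""
    return engine.execute(JobRequest(bbox, algorithm, resolution))


engine = WorkflowEngine(ComputeLayer())
result = portal_submit(engine, (0, 0, 1, 1), "idw", 1.0)
print(engine.log)
```

The portal never calls the compute layer directly, so tools in the computation layer can be swapped without changing the front end.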

LiDAR Job Management
and Monitoring
The GEON LiDAR Workflow (GLW) is exposed to a high risk of component failures
Long running process
Distributed computational resources under diverse
controlling authorities
Kepler provides transparent/background error handling
using provenance data
A unified interface to follow up on the status of submitted jobs
View job metadata
Zoom to a specific bounding box location
Track errors
Modify a job and re-submit
View the processing results
In the future, register desired workflow products
Useful for publication
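
The job-tracking capabilities listed above (view metadata, track errors, modify and re-submit) can be sketched with a small in-memory monitor. All names here (`Status`, `Job`, `JobMonitor`) are hypothetical and chosen for illustration; the sketch does not reproduce the portal's actual job management interface.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    FAILED = "failed"
    DONE = "done"


@dataclass
class Job:
    job_id: int
    bbox: tuple
    algorithm: str
    status: Status = Status.SUBMITTED
    errors: list = field(default_factory=list)


class JobMonitor:
    """Keeps job metadata so status and errors can be inspected later."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def submit(self, bbox, algorithm):
        job = Job(self._next_id, bbox, algorithm)
        self._jobs[job.job_id] = job
        self._next_id += 1
        return job

    def record_failure(self, job_id, message):
        job = self._jobs[job_id]
        job.status = Status.FAILED
        job.errors.append(message)

    def resubmit(self, job_id, **changes):
        """Copy an existing job's settings, apply changes, submit as a new job."""
        old = self._jobs[job_id]
        return self.submit(changes.get("bbox", old.bbox),
                           changes.get("algorithm", old.algorithm))

    def get(self, job_id):
        return self._jobs[job_id]


monitor = JobMonitor()
first = monitor.submit((0, 0, 1, 1), "spline")
monitor.record_failure(first.job_id, "interpolation step did not complete")
retry = monitor.resubmit(first.job_id, algorithm="idw")
print(monitor.get(first.job_id).status, retry.algorithm)
```

Re-submission creates a new job instead of overwriting the failed one, so the error history of the original stays available.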

Figure labels (workflow data flow): Client / GEON Portal; Map and Attributes; Functions and Parameters; submit; Parameter XML; Create Workflow Description; DB2 Spatial query; x, y, z and attribute; NFS Mounted Disk; raw data; process output; Grass; Render Map; ArcInfo; ArcSDE; ArcIMS
Output formats: Binary grid, ASCII grid, Text file, Tiff/Jpeg/Gif

Map onto the grid
Grass surfacing algorithms:
Spline
IDW
block mean
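
Two of the surfacing methods listed above, IDW and block mean, are simple enough to sketch directly. The code below is a textbook-style illustration and not the GRASS implementation: the function names are hypothetical, `idw` weights every sample instead of a limited neighborhood, and it assumes at least one sample is supplied.

```python
from math import floor, hypot


def idw(points, x, y, power=2.0):
    """Inverse distance weighted estimate of z at (x, y) from (x, y, z) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, pz in points:
        distance = hypot(px - x, py - y)
        if distance == 0.0:
            return pz  # query lies exactly on a sample: use its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * pz
        weight_total += weight
    return weighted_sum / weight_total


def block_mean(points, cell_size):
    """Average z of the points falling in each grid cell, keyed by (col, row)."""
    sums = {}
    for px, py, pz in points:
        key = (floor(px / cell_size), floor(py / cell_size))
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + pz, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0), (2.0, 2.0, 40.0)]
print(idw(samples, 1.0, 1.0))            # equidistant samples: close to their mean
print(block_mean(samples, cell_size=4.0))
```

Block mean touches each point once, which makes it cheap for dense point clouds, while IDW produces a value at any location at the cost of visiting many samples per grid node.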

Example of LiDAR data acquired along the Northern San Andreas fault in Sonoma County, California.
Left: Hillshade produced from the first return surface DEM (Digital Elevation Model) derived from the
LiDAR data. In this heavily forested region the first return surface largely shows the tree canopy top.
Right: Hillshade of the last return surface DEM for the same area shown in left image. The multiple
returns offered by the LiDAR workflow allow for virtual deforestation and the creation of a bare-earth
model of the ground surface. Note San Andreas fault and roads not visible in the first return hillshade.
LiDAR data represents an important new tool for the study of the Earth's surface, especially in regions
where heavy vegetation makes traditional techniques such as aerial photography ineffective.
(Source: Christopher J. Crosby, J. Ramon Arrowsmith, GEON, ASU)

Figure labels (workflow diagram): Compute Cluster; ASCII grid; Download data; Kepler Workflow

http://geongrid.org

Future Plans
Improve overall performance using advanced processing tools
Parallel interpolation, enhanced visualization
Extend built-in failure recovery and reporting features
Additional portal execution and registration support
Utilize provenance information for workflow product registration
Create graphical illustration of job progress / location in the
workflow to demonstrate the distributed nature of the system
ULTIMATE GOAL: Make it useful to a wide range of earth science users!

Contributors
Efrat Jaeger-Frank, Ilkay Altintas, Chaitan Baru, Ashraf Memon, Viswanath Nandigam, (GEON,
San Diego Supercomputer Center, UCSD)
Christopher J. Crosby, Jefferey S. Conner, J. Ramon Arrowsmith (GEON, ASU)

Kepler includes contributors from GEON, SEEK, SDM Center, Ptolemy II, ROADNet, CIPRes and Resurgence supported by NSF ITRs 0225673 (GEON), 022567 (SEEK), DOE DE-FC02-01ER25486 (SciDAC/SDM), and DARPA F33615-00-C-1703 (Ptolemy).
