HydroShare: Advancing Collaboration through Hydrologic Data and Model Sharing
David Tarboton, Ray Idaszak, Jeffery Horsburgh, Dan Ames, Jon Goodall, Larry Band, Venkatesh Merwade, Alva Couch, Jennifer Arrigo, Rick Hooper, David Valentine
http://www.hydroshare.org
OCI-1148453 OCI-1148090

CUAHSI HIS Challenges
– Publishing data requires access to or setting up a HydroServer
– Accessing data requires HydroDesktop
– Generally limited to time series at a point
[Diagram: Desktop; Catalog; Server]

A digital divide
Researchers: Experimentalists, Modelers
Big Data and HPC: #!/bin/bash, vi, chmod, #PBS -l nodes=4:ppn=8, grep, awk, mpiexec
How can we best structure data and computer models to enable the use of high-performance and data-intensive computing by discipline scientists coming to this problem without extensive computational knowledge and algorithmic experience?
Gateways, Web Interfaces, CyberGIS

Can sharing data and models be as easy as sharing photos on Facebook or videos on YouTube?

Can finding data and models be as easy as shopping on Amazon?
Possible Filters; Available Formats; Items; Recommendations; Prices (perhaps usage)

Cloud Computing
Applications; Models; Storage; Computation; Services
Wikipedia: "Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet)."
Google, Amazon, Microsoft, Apple, DropBox
XSEDE, Condor, BOINC

HydroShare is a web-based collaborative system to support analysis, modeling and data publication
[Diagram: Collaboration; Analysis; Observers and instruments; Data; Models; Publication, Archival, Curation]

http://beta.hydroshare.org Currently in beta testing

HydroShare Functionality to be Developed
1. A new, web-based system for advancing model and data sharing
2. Add sharing features to HydroDesktop
3. Access more types of hydrologic data using standards-compliant data formats and interfaces
4. Enhance catalog functionality to broaden discovery to different data types
5. New model sharing and discovery functionality
6. Facilitate and ease access to, and use of, high performance computing
7. New social media and collaboration functionality
8. Links to other data and modeling systems

Upload

Support additional types of data
Resource Types
– Time Series
– Geographic feature set
– Other
– Referenced HIS time series
– Geographic Raster
– Multidimensional Space Time dataset
– River geometry
– Sample based observations (ODM2 and CZO)
– Documents
– Tabular objects
– HydroDesktop Project package
– Scripts
– Models
– Model Components
– Referenced data sets from other (non-HIS) sources
Tools
– Uploaders to facilitate loading of resources
– Viewers to visualize the resource
– Exporters to download the resource
– Best practice tools for hydrologic data preprocessing and analysis
Requires a Resource Data Model
– Documented resource content specification that dictates how the resource is stored in HydroShare
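The idea of a resource data model with a documented content specification can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Resource` fields, the `CONTENT_SPEC` table and the file extensions are hypothetical and are not taken from the actual HydroShare data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class ResourceType(Enum):
    # A subset of the resource types named on the slide.
    TIME_SERIES = "Time Series"
    GEOGRAPHIC_FEATURE_SET = "Geographic feature set"
    GEOGRAPHIC_RASTER = "Geographic Raster"
    RIVER_GEOMETRY = "River geometry"
    MODEL = "Model"
    OTHER = "Other"


@dataclass
class Resource:
    # Hypothetical fields; the real resource data model may differ.
    title: str
    resource_type: ResourceType
    files: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


# Hypothetical content specification: file extensions each type accepts.
CONTENT_SPEC = {
    ResourceType.TIME_SERIES: {".csv"},
    ResourceType.GEOGRAPHIC_FEATURE_SET: {".shp", ".shx", ".dbf", ".prj"},
    ResourceType.GEOGRAPHIC_RASTER: {".tif"},
}


def validate(resource):
    """Return the files that do not match the content specification."""
    allowed = CONTENT_SPEC.get(resource.resource_type)
    if allowed is None:  # no specification for this type: accept anything
        return []
    return [f for f in resource.files
            if not any(f.lower().endswith(ext) for ext in allowed)]
```

An uploader tool could call `validate` before storing a resource, and a viewer or exporter could rely on the same specification to know what files to expect.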

Imagine the Possibilities
Steps 1 to 3: Observe; Publish and Catalog; Discover; Analyze/Model (in Desktop or Cloud)
[Diagram: Collaboration; Observers and instruments; HydroServer Data (ODM); Analysis; Models; Publication, Archival, Curation]
HydroShare to support integrated collaborative analysis, modeling and data publication

Imagine the Possibilities
4. Share the results (Data and Models)
[Diagram: Collaboration; Observers and instruments; HydroShare Data resource store; Analysis; Models; Publication, Archival, Curation]
HydroShare to support integrated collaborative analysis, modeling and data publication

Imagine the Possibilities
5. Group Collaboration using HydroShare
6. Preparation of a paper
[Diagram: Collaboration; Analysis; Observers and instruments; Data; Models; Publication, Archival, Curation]
HydroShare to support integrated collaborative analysis, modeling and data publication

Imagine the Possibilities
7. Submittal of paper, review, archival of electronic paper with data, methods and workflow
[Diagram: Collaboration; Analysis; Observers and instruments; Data; Models; Publication, Archival, Curation]
HydroShare to support integrated collaborative analysis, modeling and data publication
DataONE, EarthCube,

HydroShare Modeling
[Figure: Flow vs. Time plot; x, y, t axes]
– Data: Links to national and global data sets of essential terrestrial variables (e.g. NASA NEX, HydroTerre)
– Preconfigured models and modeling systems as services (CI-WATER)
– Tools to preprocess and configure inputs (TauDEM and CyberGIS)
– Tools for visualization and analysis
– Standards for interoperability and information exchange, to couple models (OpenMI, CSDMS BMI)
– Automated reasoning based on context, purpose, data resources (Aaron Byrd)

A specific example
Big snow year. Will my city flood?
– Click to delineate watershed (model domain)
– Generate model package from Essential Terrestrial Variables
– Generate suite of input scenarios
– Execute model and view results
[Plots: P vs. Time; Flow vs. Time]

But there is more
What if I could express my decision needs to the system and have it reason and deduce which models need to run, then configure and run them based on the inputs available, the precision needed, and the resources and time available?

Resource Repository Centric Paradigm for Modeling and Analysis
[Diagram: Resource Repository at the center, connected to Models, Visualization Tools, Analysis Tools, Data Loaders, Data Discovery Tools]
Enable multiple models to use common “best practice” tools

E.g. SWATShare
A web-based tool for publishing, sharing, and accessing Soil and Water Assessment Tool (SWAT)
www.water-hub.org/swat-tool

Model pre and post processing workflow
[Diagram: Input Files, PreProcessing, Model, Post Processing, Output Files; Resource Repository connected to Models, Visualization Tools, Analysis Tools, Data Loaders, Data Discovery Tools]
Each model interacts with information in the common data store.
The modeler does not need to be concerned with, and can take advantage of, standardized analysis, visualization, loading and discovery tools.
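The workflow around a common data store can be sketched in a few lines. This is a toy illustration under stated assumptions: `ResourceRepository` is a hypothetical in-memory stand-in for the repository, and `toy_runoff_model` is a placeholder that returns a fixed fraction of precipitation; neither reflects HydroShare code or any real hydrologic model.

```python
class ResourceRepository:
    """In-memory stand-in for the common data store (hypothetical)."""

    def __init__(self):
        self._store = {}

    def put(self, name, data):
        self._store[name] = data

    def get(self, name):
        return self._store[name]


def preprocess(precip_mm):
    # Shared preprocessing tool: replace missing values with zero.
    return [0.0 if p is None else float(p) for p in precip_mm]


def toy_runoff_model(precip_mm, runoff_coeff=0.3):
    # Placeholder "model": runoff is a fixed fraction of precipitation.
    return [runoff_coeff * p for p in precip_mm]


def postprocess(runoff_mm):
    # Shared analysis tool: summary statistics for the model output.
    return {"total": sum(runoff_mm), "peak": max(runoff_mm)}


def run_workflow(repo, input_name, output_name):
    """Read inputs from the repository, run the model, store the outputs."""
    inputs = preprocess(repo.get(input_name))
    outputs = toy_runoff_model(inputs)
    repo.put(output_name, outputs)
    return postprocess(outputs)
```

Because the pre and post processing steps only talk to the repository, a different model could be swapped in for `toy_runoff_model` without changing the shared tools.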

Architecture and Development

Drupal – Content Management System
– Extensible Open Source Content Management Framework for Publication, written in PHP
– Over 14,000 user-contributed modules
– Themed and Styled Presentation of HydroShare Resources with in-page visualization
– Off-the-shelf modules provide a Social Experience surrounding Hydrologic Data: Comments, Ratings, Group Behavior
– Custom module development supports HydroShare Data Model, GeoAnalytics and iRODS Integration

Enterprise iRODS
Distributed Data Grid Middleware
[Diagram: Users; Client; iCAT; Rule Engine; R. Server; MSVC; R. Server]
E-iRODS in HydroShare
– Storage of HydroShare Resources
– Replicated across multiple institutions
– Access to Computation
– Access to Indexing for Discovery
Components
– Metadata Catalog holding virtual file system information and associated metadata
– Extensible number of ‘Resource Servers’ which may provide connectivity to storage resources
– Integrated Rule Engine for Policy Driven Data Management triggered by Data Management Activities
– Extensibility via Microservices (MSVC) – Plugins providing functionality to the Rule Engine
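The interplay of a metadata catalog, resource servers and a rule engine that fires on data management activities can be illustrated with a toy model. This is not the iRODS API: the class `ToyDataGrid`, the `post_put` event name and the `replicate_everywhere` policy are all hypothetical names chosen for this sketch.

```python
class ToyDataGrid:
    """Toy model of a catalog, resource servers and a rule engine.

    Not the iRODS API; all names here are hypothetical.
    """

    def __init__(self, server_names):
        self.servers = {name: {} for name in server_names}  # storage per server
        self.catalog = {}  # logical path -> replicas and metadata
        self.rules = {}    # event name -> list of policy callables

    def add_rule(self, event, action):
        self.rules.setdefault(event, []).append(action)

    def _fire(self, event, path):
        for action in self.rules.get(event, []):
            action(self, path)

    def put(self, path, data, server, metadata=None):
        # Store the object, register it in the catalog, then trigger rules.
        self.servers[server][path] = data
        self.catalog[path] = {"replicas": [server],
                              "metadata": dict(metadata or {})}
        self._fire("post_put", path)


def replicate_everywhere(grid, path):
    # Policy plugin: copy a newly stored object to every other server.
    entry = grid.catalog[path]
    source = entry["replicas"][0]
    for name, storage in grid.servers.items():
        if name not in entry["replicas"]:
            storage[path] = grid.servers[source][path]
            entry["replicas"].append(name)
```

Registering `replicate_everywhere` for the `post_put` event gives replication across institutions as a policy, without the uploading client having to know about it.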

A community project
http://www.cuahsi.org
– 109 US University members
– 7 affiliate members
– 20 international affiliate members
– 3 corporate members
(as of January 2013)
Informatics Standing Committee
Users Committee

Community Governance
CUAHSI Board; Standing Committee on Informatics; Oversight; HydroShare Executive Committee; CUAHSI User Community
Community / User Requirements
– Surveys
– Conferences
– Workshops
– Embed UI with “Help us make our software better”
Specification; Requests; Prioritization; Decision Making; Prototype
HydroShare Development Team: Implementation (Agile)
– Hydrologic Information System (HIS)
– Integrated Rule-Oriented Data System (iRODS)
– Drupal
Released Software
HydroShare Evaluation
– Metrics
– End-user involvement
– Quantitative and qualitative measurement
Sustainability

HydroShare project team
– USU
– RENCI/UNC
– CUAHSI
– BYU
– Tufts
– UVA
– Texas
– Purdue
– SDSC
OCI-1148453 OCI-1148090
2012-2017

User driven use cases
– Annotate uploaded hydrology models using an ontology
– Register a Package with HydroShare
– Add data resource for a model
– Notify Me When Related Resources Are Registered
– Register a Resource with HydroShare
– Evaluate Load Reduction Scenarios
– Suggest a Resource Related to the Current Resource
– Building an Intelligent Digital Watershed (IDW)
– Contribute to a Community Dataset
– Define Relationships between Resources
– Discover a Community Dataset to which I Can Contribute
– Execute a Model in HydroShare
– Register a Workflow with HydroShare
– Register a Community Dataset
– Download a Model, Execute It, and Share the Model and Results
– Define a Composite Resource
– Crowd sourcing modeling tasks
– Automated Visualization (thumbnails)
– User displays HydroShare Gallery
– Existing User Logs into HydroShare
– New User Creates a HydroShare User Account
– User Sets Personal Preferences
– User is provided a personal Dashboard
– User Chooses to “Follow” Another User
– User Chooses to “Follow” a Group
– User Views His/Her Personal Content
– User Uploads a Resource
– User Deletes a Resource
– User Shares a Resource in HydroShare
– User Publishes a Resource to DataONE
– User Publishes a Resource to the CUAHSI Water Data Center
– User Exports a Resource to their Local Machine
– User Searches / Filters / Sorts their Personal Resources
– User Views Details Page for a Resource
– User Groups Resources into a “Folder” or “Collection”
– User “Opens” a Resource
– User Edits Metadata Description for a Resource
– User Adds a Comment to a Resource
– User Rates / Reviews a Resource
– User Derives a New Resource from an Existing Resource
– User Executes a Resource
– User Explores / Searches Available HydroShare Resources
– User “Pins” a Discovered Resource to a “Resource Collection”
– User Filters Discovered Resources
– User Imports Data from Externally Hosted Resources
– User Searches For Collaboration Groups
– User Views Group Details
– User Creates a Collaboration Group
– User Requests Group Membership
– User Creates a Comment on a Collaboration Group
– User Creates a Discussion in a Collaboration Group Discussion Forum
– User Edits a Collaboration Group’s Description
– User Searches / Filters / Sorts a Group’s Resources
– User Views Documentation and Gets Support
– User Views / Subscribes to the HydroShare Blog
– User Exports a HydroShare Resource Citation into Mendeley or Zotero
– User Transfers Ownership of a Resource to Another User
– User Receives HydroShare Social Media Notifications via Mobile Device
– User Views Access / Download Statistics for a Resource
– User Views HydroShare Resources via Mobile Devices
– Searching and/or browsing HydroShare
– Translate data automatically for HydroShare operations.
– Translate data automatically for export.
– Publish translated data.
– Translate replicated data.
– Registration of a new HydroShare Tool
– Editing a Published (with DOI) resource
– User Creates New “Model Package” Resource
– User Transfers Ownership of a Group to Another User
– User Develops a Client for HydroShare
– Summarize hydrologic model input parameters for a user defined region
– Discover specialist / Promote specialized services
– Visualize Time Series
– Upload a Model

Metrics
Use Metric
– Number of registered users
– Number of active users
– Number of logons
– Average duration of session
– Number of resources stored
– Size of resources stored (GB)
– Number of resources downloaded
– Number of compute jobs run
– CPU hours of compute resources used
– Number of host institutions
– Github HydroShare code repository owners and members
Number: 35, 15, 15
Breakdowns: Total use; Use by user type; Use by resource type; Use by Geographic Location (State, Country)
User Types: University Faculty, University Professional or Research Staff, Post-Doctoral Fellow, University Graduate Student, University Undergraduate Student, Commercial/Professional, Government Official, School Student Kindergarten to 12th Grade, School Teacher Kindergarten to 12th Grade, Other, Unspecified
Resource Types: Time Series, Geographic Feature Set, Geographic Raster, Multidimensional Space Time Array, River Geometry, Model, Workflow, Other
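Breakdowns such as use by user type or by resource type amount to grouping a usage log by one attribute. A minimal sketch follows, assuming a hypothetical log with one record per session; the `SESSIONS` records and their field names are invented for illustration and are not HydroShare data.

```python
from collections import Counter

# Hypothetical usage log: one record per session.
SESSIONS = [
    {"user_type": "University Faculty", "resource_type": "Time Series", "minutes": 30},
    {"user_type": "University Graduate Student", "resource_type": "Model", "minutes": 45},
    {"user_type": "University Faculty", "resource_type": "Geographic Raster", "minutes": 15},
]


def use_by(sessions, key):
    """Count sessions grouped by one attribute, e.g. user or resource type."""
    return Counter(s[key] for s in sessions)


def average_duration(sessions):
    """Average duration of session, in minutes."""
    return sum(s["minutes"] for s in sessions) / len(sessions)
```

Calling `use_by(SESSIONS, "user_type")` or `use_by(SESSIONS, "resource_type")` gives the two breakdowns from the same log.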

Collaborative Open Development http://github.com/organizations/hydroshare http://hydrodesktop.codeplex.com

Summary
A collaborative website for the sharing of hydrologic data and models
To expand data sharing capability of CUAHSI HIS
– Additional data classes
– Models, scripts, tools and workflows
Community Participation
Interoperability
Standards
Open Development
To boldly go where no one has gone before

Thanks to a lot of people
– USU
– RENCI/UNC
– CUAHSI
– BYU
– Tufts
– USC
– Texas
– Purdue
– SDSC
HydroShare team: Dave Tarboton, Ray Idaszak, Dan Ames, Jeff Horsburgh, Jon Goodall, Larry Band, Venkatesh Merwade, Jeff Heard, Carol Song, Alva Couch, David Valentine, Rick Hooper, Jennifer Arrigo, David Maidment, Tim Whiteaker, Alex Bedig, Laura Christopherson, Pabitra Dash, Tian Gan, Tony Castronova, Karl Gustafson, Stephen Jackson, Cuyler Frisby, Stephanie Mills, Brian Miles, Jon Pollak, Stephanie Reeder, Ash Semien, Yaping Xiao, Lan Zhao
http://www.cuahsi.org/hydroshare.aspx
OCI-1148453 OCI-1148090

Next Class

Representing River Geometry in HydroShare
[Figure panels: LiDAR; Cross Sections Attached to River Network; Cross Sections; Hydraulic Calculations]

Modular design, linking river geometry, catchment geometry, network topology, and time series observations
Data is linked by common reference points along the river, which can be represented as point or cross section shapefiles and shown on a map.
Based on the OGC HY_Features model
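Linking data through common reference points can be sketched as records that share a point identifier. The classes and field names below (`ReferencePoint`, `CrossSection`, `TimeSeries`, `point_id`, `distance_km`) are hypothetical and chosen for illustration; they are not the HY_Features classes or a HydroShare schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencePoint:
    point_id: str
    river_id: str
    distance_km: float  # position along the river (assumed convention)


@dataclass
class CrossSection:
    point_id: str       # reference point the section is attached to
    stations_m: list    # horizontal offsets across the channel
    elevations_m: list  # bed elevation at each offset


@dataclass
class TimeSeries:
    point_id: str       # reference point where the variable is observed
    variable: str
    values: list


def resources_at(point, cross_sections, series):
    """Collect everything linked to one reference point."""
    return {
        "cross_sections": [c for c in cross_sections
                           if c.point_id == point.point_id],
        "time_series": [s for s in series if s.point_id == point.point_id],
    }
```

With this layout, a hydraulic calculation at a location only needs the reference point: the matching cross section and observed time series are found through the shared identifier.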


