Main.DataDictionary r1.16 - 14 Oct 2005 - 22:10 - JeremyCothran

Overview

Welcome to the SEACOOS Data Dictionary twiki page. This twiki has been created to provide a reference point for the data dictionary project.

The online data dictionary itself is located at http://nautilus.baruch.sc.edu/seacoos_dd

The source code (tarball) can be downloaded from http://nautilus.baruch.sc.edu/seacoos_dd/seacoos_dd.tar
with a few installation notes at http://nautilus.baruch.sc.edu/twiki_dmcc/bin/view/Main/DataDictionary#Site_installation_instructions

To get a briefing on the aims of the data dictionary project, please see the following MS-Word document: TheaimoftheSEACOOSdatadictionary.doc

Following are some ad-hoc instructions for the project.

Instructions

Login

Select your username from the drop-down menu box and click "login".

Register

When registering, provide your username, institution and e-mail. Once you submit your registration you will be returned to the login page.

Home

This is the default view displayed after login. Here you will see a listing of all current variables, sorted alphabetically by scientific category and variable name.

Actions

This column is common to all rows in the data dictionary. It allows the following functions to be performed:

Edit

This link allows the modification and submission of a current variable name. This is one of the main methods of providing feedback for an existing variable name. Click Update once you have completed your suggested change.

Review

This link takes you to a page where you can approve or disapprove the selected variable. Approving or disapproving has no effect on the system; it is only a tally sheet meant to help speed up discussion and consideration of new variables.

Delete

This link allows the deletion of a previously added variable.

Recently Added Variables

Shows the 10 most recently added or modified variables, including your own recent edits.

My Variables

Displays a list of all variables that have been created under your account.

Add Variable

Allows the addition of new variables.

Search Variable

Allows the user to search variables and export them into either MS-Excel (.xls) or CSV format.

Query Builder

Provides functionality to specify search and sorting criteria for output.

Export

Once a query is generated, two export options are available. Note that these links are listed at the bottom right of the page, so scroll down to see them:

Export to Excel

Exports the current query results to MS-Excel (.xls) format.

Export to CSV

Exports the current query results to comma-separated values (.csv) format.

Logoff

Logs the current user off.

Additional instructions for non-Seacoos participants

If you would like to use this data dictionary as a non-Seacoos participant, register under the organization 'Other'. When entering rows, use the 'Standard' field to hold a combined Standard and Version notation that you would like to sort by (for example, MyStandard.v1.0).

Issues

(add problems that you are experiencing while using the application here)

Top Priority

1. add (one-to-many relation) associative table to other standards and conventions as we have started in the Excel v1.3.

For the current implementation, I've decided to try to keep things as simple as possible trying to use a single table for everything. It's possible to normalize or break our existing table up into smaller, more specific tables, but I didn't think the conceptual or referential complexity was worth it at this time. The relation between our standard and other standards can be captured in the 'Equivalent Standard Name' field (Equiv.Std.Name) using the suggested programmatic syntax seen in the fields.

2. populate tables with current Excel data (either scripted or manual entry)

Done.

3. table that associates certain fields in DD with netCDF attribute names for automated creation of netCDF files (for example, "CDL Coordinate Attribute" in DD database = "axis" in CDL and "Geophysical Valid Range" in DD database = "valid_range" in CDL)

Delaying this table breakdown for the same reasons listed in Issue #1.
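
Until such a table exists, the association in item 3 could be sketched as a simple lookup. The following is a minimal Python illustration only (the application itself is PHP). The two field-to-attribute pairs come from the item above; the helper name `cdl_attributes`, the variable names, and the sample values are hypothetical.

```python
# Association between data dictionary (DD) field names and netCDF/CDL
# attribute names. Only these two pairs are given in the text above.
DD_TO_CDL = {
    "CDL Coordinate Attribute": "axis",
    "Geophysical Valid Range": "valid_range",
}


def cdl_attributes(var_name, dd_row):
    """Return CDL attribute lines for one variable (hypothetical helper).

    dd_row maps DD field names to values; fields without a known CDL
    equivalent are skipped.
    """
    lines = []
    for dd_field, cdl_name in DD_TO_CDL.items():
        if dd_field not in dd_row:
            continue
        value = dd_row[dd_field]
        if isinstance(value, str):
            rendered = '"%s"' % value
        else:
            # Numeric sequences become a comma-separated CDL value list.
            rendered = ", ".join(str(v) for v in value)
        lines.append("%s:%s = %s ;" % (var_name, cdl_name, rendered))
    return lines


# Hypothetical usage: a depth coordinate variable with an axis attribute.
print(cdl_attributes("depth", {"CDL Coordinate Attribute": "Z"}))
```

Keeping the mapping in one dictionary mirrors the single-table approach described under Issue #1; it could later move into a database table without changing the calling code.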

Next Level Priority

4. choose view by accepted version.

Done. This is the default view when first logging in.

5. add (many-to-many relation) when user sets disapprove and later approves, have vote changed.

Done.
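
The behaviour in item 5 amounts to keeping at most one vote per user and variable, so a later vote replaces an earlier one. A minimal Python sketch of that idea, assuming an in-memory dictionary in place of the real database table; the names `record_vote` and `tally` are hypothetical and do not come from the application source.

```python
def record_vote(votes, user, variable, approve):
    """Store one vote per (user, variable); a later vote replaces the earlier one."""
    votes[(user, variable)] = bool(approve)


def tally(votes, variable):
    """Return (approvals, disapprovals) for one variable."""
    approvals = sum(1 for (_, v), ok in votes.items() if v == variable and ok)
    disapprovals = sum(1 for (_, v), ok in votes.items() if v == variable and not ok)
    return approvals, disapprovals


# Hypothetical usage: a user disapproves, then changes their mind.
votes = {}
record_vote(votes, "some_user", "sea_surface_temperature", False)
record_vote(votes, "some_user", "sea_surface_temperature", True)
print(tally(votes, "sea_surface_temperature"))
```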

6. allow user to edit their user information

Not done. For now, send an email to jcothran@carocoops.org if you want your user data deleted or changed.

7. fix registration form to return to login screen (home)

Done.

Requested Features

(add features that you would like to see implemented in future versions here)

Site installation instructions

If you would like to download the source code and install or modify this application for your own use, the source code is available at
http://nautilus.baruch.sc.edu/seacoos_dd/seacoos_dd.tar

Anyone is welcome to modify this wiki page to provide better or alternate documentation regarding this application.

The current install is a PHP application running on a Linux Enterprise server using a PostgreSQL 7.3.1 database.

After unpacking the tarball:

In the 'sql' directory, as the 'postgres' user, run the following commands to create the necessary database instance and tables. Modify the 'INSERT' statements as needed for the initial population of some of the tables. You can ignore the error messages that correspond to dropping tables or indexes which do not yet exist.

  • createdb seacoos_dd
  • createlang plpgsql seacoos_dd
  • psql -U postgres -d seacoos_dd -f seacoos_dd.sql

The 'php' directory contains all the PHP source code which drives the application.

  • the application begins at 'dd_login.php' forwarded from index.html
  • in the file 'global_vars.php' are the settings for the database instance and user assumptions, the default standard name and version number
  • the support functions are located in the 'dd_functions.php' file

The 'images' and 'stylesheet' directories affect the website display styling.

The browser link you start from is index.html (which immediately forwards to php/dd_login.php), at whatever location you have set it up on your system.

Existing vocabularies and search engines

GCMD

NASA Global Change Master Directory http://gcmd.nasa.gov/

NASA Global Change Master Directory keywords http://gcmd.gsfc.nasa.gov/Resources/valids/keyword_list.html

Looking at the GCMD keywords, this seems like something we could map the Seacoos data dictionary to. FGDC seems to be following GCMD's lead in using their keyword list in the Category > Topic > Term > Variable format, and ISO 19115 (see the iso19115.ppt attachment) is a fairly small superset (numbering about 20 elements) which maps fairly well to the GCMD categories. Here's an example of both the GCMD and ISO thesauri included in the FGDC XML elements.

<theme>
    <themekt>GCMD</themekt>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Coastal Processes &gt; Sea Surface Height</themekey>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Bathymetry/Seafloor Topography &gt; Bathymetry</themekey>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Ocean Temperature &gt; Sea Surface Temperature</themekey>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Ocean Temperature &gt; Water Temperature</themekey>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Salinity/Density &gt; Salinity</themekey>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Ocean Winds &gt; Surface Winds</themekey>
</theme>
<theme>
    <themekt>ISO 19115 Topic Category</themekt>
    <themekey>climatologyMeteorologyAtmosphere</themekey>
    <themekey>inlandWaters</themekey>
    <themekey>oceans</themekey>
</theme> 
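
As a rough illustration of how such keywords could be read programmatically, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It assumes the `<theme>` elements are wrapped in a single root element (called `<metadata>` here only to make the fragment parseable); the function name `keywords_by_thesaurus` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example above, wrapped in an assumed root element.
SAMPLE = """<metadata>
<theme>
    <themekt>GCMD</themekt>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Ocean Temperature &gt; Sea Surface Temperature</themekey>
    <themekey>EARTH SCIENCE &gt; Oceans &gt; Salinity/Density &gt; Salinity</themekey>
</theme>
<theme>
    <themekt>ISO 19115 Topic Category</themekt>
    <themekey>oceans</themekey>
</theme>
</metadata>"""


def keywords_by_thesaurus(xml_text):
    """Map each thesaurus name (themekt) to its keywords, split into levels."""
    result = {}
    root = ET.fromstring(xml_text)
    for theme in root.iter("theme"):
        thesaurus = theme.findtext("themekt", default="").strip()
        keys = result.setdefault(thesaurus, [])
        for key in theme.findall("themekey"):
            # GCMD keywords separate hierarchy levels with ">";
            # ISO topic categories have a single level.
            keys.append([part.strip() for part in (key.text or "").split(">")])
    return result


parsed = keywords_by_thesaurus(SAMPLE)
print(parsed["GCMD"][0])
```

Splitting each GCMD keyword into its Category > Topic > Term > Variable levels is what would make a level-by-level mapping to data dictionary entries possible.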

My questions for use of the GCMD are the following:

  • Would it be possible to provide a numeric index for database or internationalization purposes, similar to ISO?
  • While there are subtopics for model output, data services, platforms, and instruments, I would also like to see topics for processes like QA/QC.
  • Should the GCMD as a thesaurus be extended (subclassed) further for additional detail? If so, how does one bridge this thesaurus with other thesauri or extend into more detailed taxonomies (biological, for instance)? A lookup table? RDF?
  • Earlier concerns have been listed at this FGDC link from 1996.

Others

NOAA NODC Ocean Archive System http://www.nodc.noaa.gov/Archive/Search/

Ontologies/controlled vocabularies (I don't think any of these would qualify as a 'true' ontology - maybe ASFA):

NOAA NODC Ocean Archive System Authority Tables http://www.nodc.noaa.gov/search/prod/authority.html

British Oceanographic Data Centre Parameters Dictionary http://www.bodc.ac.uk/documents/bodc_params_download.html

Aquatic Sciences and Fisheries Abstracts (ASFA) Thesaurus http://www4.fao.org/asfa/asfa.htm

USGS Augmented Place Names http://geo-nsdi.er.usgs.gov/place

USGS Geographic Names Information System (GNIS) http://geonames.usgs.gov/gnishome.html

The main parameter ontologies/controlled vocabularies seem to be GCMD (US and some foreign) and BODC (mostly foreign). ASFA is used extensively by marine science libraries and library index publishers.

March 24, 2005

Meant to relay to the group that I've been having some background discussion with Luis Bermudez (new PhD hire at http://marinemetadata.org ) and Surya Durbha ( http://www.gri.msstate.edu/gri-lr.php ), who are both active with semantic web/ontology framework technologies using RDF and OWL. Using his tools, Luis has converted the Seacoos draft data dictionary, as well as several other popular dictionaries, into ontologies at http://marinemetadata.org/examples/mmihostedwork/ontologieswork/ontologies/

Mechanisms for resolving, crosswalking, and mapping between these data dictionaries and format conventions are critical, as groups need to be able to dynamically establish meaningful vocabularies and format conventions for themselves while also being able to link into a larger discovery and usage framework.

What I'm looking forward to, in continuing the programmatic bridging of provider datasets, is the use and reuse of data dictionaries and format conventions, combined with crosswalks, mappings, and conversions between them where needed. Data dictionaries are critical in establishing a consensus meaning and the final vocabulary (standard names, for example) by which data is referenced. Format conventions are critical in allowing some flexibility in how metadata and data are represented while still supporting the data dictionary vocabulary and programmatic use and development.

I am also trying to bring more XML and ontology (RDF/OWL) development into the existing flow, for the advantages they offer in programmatic syntax, namespaces, validation (XML schemas), and knowledge representation, especially with regard to metadata that is less bandwidth- and storage-intensive.

Condensed Seacoos observations data dictionary as XML and XML Schema

I wanted to present the data dictionary terms we are currently using in our aggregation process, limited to what I think are the critical attributes (standard_name, definition, units) for cross-walking our observation-type data dictionary with others. This smaller listing is presented as an XML file, with an accompanying XML schema describing its structure.
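
To give a feel for the shape of such a condensed listing, here is a minimal Python sketch that serializes terms with the three attributes named above. The element names `data_dictionary` and `term`, the function name `build_dictionary_xml`, and the sample row are assumptions for illustration; the actual layout is defined by the linked XML file and schema.

```python
import xml.etree.ElementTree as ET


def build_dictionary_xml(terms):
    """Serialize (standard_name, definition, units) tuples to an XML string."""
    root = ET.Element("data_dictionary")  # hypothetical root element name
    for standard_name, definition, units in terms:
        term = ET.SubElement(root, "term")  # hypothetical element name
        ET.SubElement(term, "standard_name").text = standard_name
        ET.SubElement(term, "definition").text = definition
        ET.SubElement(term, "units").text = units
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample row; not taken from the actual dictionary.
print(build_dictionary_xml([
    ("sea_surface_temperature", "illustrative definition text", "celsius"),
]))
```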

xml file

xml schema

csv file

Attachment | Size | Date | Who | Comment
iso19115.ppt | 108.5 K | 28 Jan 2005 - 16:09 | JeremyCothran |
seacoos_dd_small.csv | 4.2 K | 14 Oct 2005 - 22:28 | JeremyCothran | NA


Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.