Skip to topic | Skip to bottom
Main.PlatformToDatabaser1.1 - 28 Jan 2005 - 21:28 - JeremyCothrantopic end

Start of topic | Skip to actions
This page was excerpted from an email(July 2004) to another tech asking for sample code of how I go from our platform data streams to populating the relational database.

See also


Attached 'buoy2.sql' which is postgresql sql used to create one of our buoy tables (this is buoy2, 1 table per buoy) a year and a half ago. Also attached a sample buoy2 input file that is received from the modem every 2 hours along with the perl script '' used to parse and store the data fields to the database. 'processBuoys.ini' is the support text file used for setting environment parameters across scripts. Also included a telnet script ' used to poll our water level stations just to give more ideas about how quick it is to write perl script for grabbing and parsing data for insertion to the database.

You could use the above sql 'by platform' kind of table layout, but I'd say to go instead with a 'by variable' approach to your tables with separate tables for variables, and support tables platforms and sensors. So for example, you might have this postgresql for a 'water_temperature' table:

CREATE TABLE water_temperature
 row_entry_date DATETIME,
 row_update_date DATETIME,  /* optional? */
 station_id VARCHAR(32) NOT NULL,
 sensor_id VARCHAR(32),
 measurement_date DATETIME,
 measurement_value FLOAT,
 qc_level INT,
 qc_flag VARCHAR(32));

The SERIAL type is like an autocount datatype and will create a row_id_seq sequence that will get autoincremented and used to store in that field for every insert.

I added 'row_update_date' even though I don't think it should be necessary to update what are essentially final measurements.

The water level station tables are set up on a 'by variable' basis with station_id a unique key for each station. The label for station_id can serve as a foreign key into a station table which lists each of the stations and their attributes. Note that a station could potentially be a fixed or moving platform.

I added sensor_id because some stations will measure the same variable at different points like ADCP where you might have 10 bins each measuring water current and direction at different depths. The definition of sensor_id on a sensor table is dependent on the station_id. You could have one sensor table, but you'd want the station id as part of the primary key to distinguish between identical sensors at different stations.

Note that a variable doesn’t always have to be a singular measurement - it might be a combination of two measurands like 'wind' which might include 'speed' and 'direction'.

qc_level and qc_flag are similar to what I mentioned was discussed at the QARTOD meeting . qc_level is simply an integer in the set [0,1,2,3,-9] to denote what the level of qc processing has been performed on the data (good, bad, suspect, ...). qc_flag could hold one or several qc flags to denote the qc results as performed on the data using a given flagging scheme such as that offered by NDBC

There are obviously lots of other support tables you can build, but if you have this kind of basic schema broken down by station/sensor/variable, it should make getting the data into models or GIS easier since those tend to be more variable than platform focused.

-- JeremyCothran - 28 Jan 2005
to top

I Attachment Action Size Date Who Comment
buoy2.sql manage 1.2 K 28 Jan 2005 - 21:34 JeremyCothran  
Buoy2-20040514_00-29-57.dat manage 0.3 K 28 Jan 2005 - 21:34 JeremyCothran manage 97.5 K 28 Jan 2005 - 21:34 JeremyCothran  
processBuoys.ini manage 0.3 K 28 Jan 2005 - 21:34 JeremyCothran manage 2.9 K 28 Jan 2005 - 21:35 JeremyCothran  

You are here: Main > PlatformToDatabase

to top

Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding DMCC? Send feedback