XeniaDesign

This is a page to list some Xenia design considerations, benchmarks, and tips as related to the relational database schema.

Other designs to consider

If you have a relational database schema you'd like linked here, send an email to jeremy.cothran[at]gmail.com and I'll include it.

Coastal Services Center (CSC)

http://www.csc.noaa.gov/cir/standards.html


Hey Jeremy

Just a quick piece of a diagram from the Marine Data Model group out of the ESRI domain. It's a pretty handy and simple way to model point-based time series with geography in mind. The FeatureID field points off to the spatial record elsewhere. We're going to use a variation of this in some DTL testing, so we can report on it with some real-life experience. It's been around for quite some time - maybe you are already well aware of it.

Hi Daniel,

Thanks for the diagram, hadn't seen it before.

The latest schema for Xenia at http://nautilus.baruch.sc.edu/twiki_dmcc/bin/view/Main/XeniaPackageV2 is different in that my time and location (and lots of other metadata) are all included in the multi_obs table. My reference to multi_obs.m_type_id (measurement type id) does branch off to the vector/scalar lookup tables. My 'platform_handle'(similar in function to MarineID?) is what the identifier connecting the observations directly back to the platform(also indirectly via the multi_obs.sensor_id which points back to the platform_id).

My design choice in not having the higher level of normalization (most SQL queries should be able to refer to just the multi_obs table and not need to use views or joins) is geared more towards making the SQL easier to read, and hopefully a little faster too, at the cost of an acceptably small amount of extra storage. Also, some of the SQL functionality in further tools/ports is better supported for simple selects on a single table than for views/joins across tables.
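For illustration, a typical query under this design touches only the multi_obs table - a minimal sketch along the lines below, where the column names follow the descriptions above and the platform_handle value is hypothetical (the authoritative layout is at XeniaPackageV2):

-- recent measurements of one type for one platform, no views or joins needed
select platform_handle, m_date, m_value
from multi_obs
where platform_handle = 'carocoops.CAP2.buoy'  -- hypothetical handle
  and m_type_id = 41
  and m_date > '2007-01-01 00:00:00'
order by m_date;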

Bill Howe

Bill works with http://stccmop.org

Since the discussion below, I have implemented my approach to vectors as detailed at XeniaPackageV2.


Thanks for the additional feedback and ideas, enjoyed reading through them and hadn't thought of using a column array type:

> The critical thing in my mind is to use views to insulate applications
> from the physical database so we can experiment with all these various
> options without breaking things.

If you plan to use views, then I agree on the above - but we've been talking with NWS some lately and the folks we talked with there are MySQL based. So for them or others who might take the database in-house, my tendency is to go with a lowest-common-denominator approach: waste the extra storage (not sure how smart Postgresql is in handling storage of null column values) in favor of spelling out the extra columns, which may or may not be used. It's an uglier 'cram everything in one table' approach, but I'm hoping the payoff is a consistency of interface similar to the view-based consistency-of-interface idea (although a more hard-coded approach).
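A rough sketch of what that 'spell out the extra columns' shape looks like - the column names here are only illustrative, and the actual layout is documented at XeniaPackageV2:

create table multi_obs_sketch (
  row_id          serial primary key,
  platform_handle varchar(100),
  sensor_id       int,
  m_date          timestamp,
  m_type_id       int,
  m_value         double precision,
  m_value_2       double precision,  -- extra vector components, left null for scalars
  m_value_3       double precision
);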

Keep me posted if you have a view/array based approach that works well ongoing - the devil is always in the details and actual practice for any approach.

Enjoyed a quick scan of the whitepaper and database schema also - always interesting to see what different schemas are being suggested and how they might hold up in practice. Don't know if you've seen the Boeing ERD at http://csc.noaa.gov/cir/archives.html#sept06

Regarding the last question about usage, not much repeated use yet, but we are doing a redundant xenia database setup which supports the application obs at http://carocoops.org/carolinas, and Pete Geincke with the Great Lakes, I believe, adopted the xenia schema for his work back in June/July. And the Xenia stuff will be my default solution moving forward, so I'll be replacing some db's and creating new ones with the schema mentioned, working out bugs on the way.

Jeremy

> -----Original Message-----
> From: Bill Howe
> Sent: Friday, November 10, 2006 2:04 PM
> To: Cothran, Jeremy
> Cc: Paul Turner; Charles Seaton
> Subject: Re: [Fwd: Xenia package powerpoint]
>
>
> > I ran the above query and as expected the result was the empty set (no
> > rows returned). This is correct as each sensor_id acts as a container
> > for its corresponding measurements on the multi_obs table and the
> > m_type_id should always be the same.
> >
> Gotcha; this was my assumption.
>
> > Good question and this isn't covered in the powerpoint and not yet in
> > the online documentation (I'll try to add some notes on this today) and
> > I haven't done a test implementation trying to handle vectors yet. I
> > confess that I've gone back and forth on exactly how to handle vectors
> > and that I'm still on the fence some, but the following seem to be the
> > two options:
> >
> > ==
> > 1) keep everything as a scalar and provide a collection lookup table
> > which associates multiple scalars as part of the same vector - vectors
> > are a derived product from the basic table
> >
> > 2) extend the sensor and multi_obs tables to add an arbitrary (7
> > extra?) number of m_type_id, m_value columns like
> >
> Another option is to have a scalar_value table and a vector2_value
> table, and so on. This way you don't have superfluous null values in
> your table, nor do you have to reassemble vectors using expensive
> joins. However, it still seems a bit ugly.
>
> With postgres (not MySQL), there are a couple of other options. One
> is to create a user-defined type "measurement value" that can act as a
> scalar or vector. In fact, postgres already supports arrays of
> primitive types, so a custom type may not be necessary. This is what
> I'm experimenting with currently.
>
> The critical thing in my mind is to use views to insulate applications
> from the physical database so we can experiment with all these various
> options without breaking things. For example, consider a simplified
> measurement table
>
> create table measurement (
> t float,
> x float,
> y float,
> z float,
> sensorid text,
> variable text,
> value float[],
> PRIMARY KEY (t, x, y, z, sensorid, variable) );
>
> we can build a "velocity" view for 2D vectors like this:
>
> create view velocity as (
> select t, x, y, z, sensorid, value[1] as u, value[2] as v
> from measurement
> where variable = 'velocity'
> );
>
> This way, applications don't need to know about our ugly array-based
> representation, and we don't have the disadvantages of the other
> options (superfluous null values with Option 2, and expensive joins
> with Option 1, or multiple tables).
>
> Postgres is snazzy enough to allow inserts and updates to this view,
> too, with some extra work and potential performance ramifications.
>
> > I'm also thinking along the same lines regarding quality control
> > columns, that is to add additional optional columns like qc_level_2,
> > qc_flag_2, etc for additional separate qc which might be performed by
> > other parties where we might want to keep track of multiple qc
> > evaluations. I'll also be adding a qc_index_... column which will
> > provide a lookup into another qc history/metadata table.
> >
> I'm feeling ambivalent about how to model quality. There's a paper by
> Jim Gray (a Turing award winning database
> researcher) that advocates separate tables for different quality
> levels in a sensor database, but I'm not convinced.
> Your approach seems pretty good.
>
> (Here's that paper:
> http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-2006-90)
>
> I also liked your last slide, where you suggested that lots of
> institutions could install your schema and associated software as a
> sort of turnkey solution for managing observation data, and that these
> instances could talk to each other. Have you been able to get
> multiple institutions talking to each other in this manner?
>
> Bill
>
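As a side note on the 'extra work' mentioned above for making the velocity view writable, a minimal sketch of the idea in Postgres is an INSTEAD rule that rewrites inserts on the view into inserts on the underlying measurement table (this is an added illustration, not part of the original email):

create rule velocity_insert as on insert to velocity
do instead
  insert into measurement (t, x, y, z, sensorid, variable, value)
  values (new.t, new.x, new.y, new.z, new.sensorid, 'velocity',
          array[new.u, new.v]);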

Benchmark calculations

General discussion

A collection of copied article discussions

scalability

case 1

from http://www.varlena.com/GeneralBits/27.php

Table Size and Performance [GENERAL] Postgresql capabilities question 03-Apr-2003

The size of a table is not as important for performance as a well tuned server and good indexing. A couple of tens, no, say hundreds of thousands of rows is a perfectly reasonable size for a table and many will still call it small.

The question about porting a table of 95000 rows brought up the following testimonials and cautions:

I have a number of 1,000,000-plus row tables (very plus in some cases) running on some nasty low-end (Celerons with 5400rpm IDE drives, Netras) and performance is quite adequate for typical use. -- Steve

Almost laughably small :) Yeah, it's those sequential and tsearch index scans that kill it but selective queries fly. -- Nigel

You're absolutely correct that there are many other factors that determine performance aside from row count. That being said, I have tables with over a million entries on actively used systems that perform really well on queries that utilize an index, and acceptably well on queries that require a sequential scan. -- Ryan

If the queries are selective and typically indexed, or you have enough RAM then I'd say there won't be much difference between splitting the tables or combining them into one.

95000 rows could be considered small. You might even have enough RAM to cache the whole DB- can estimate from DB footprint on disk. --Link

We use postgresql for lots of stuff that's more than 90,000 rows. We have a table we use to merge log files from multiple servers and then sort them by time.

The size of this table on the hard drive is 226,689,024 bytes. It's one of the smaller tables we play with on this server. Please note that this is our well tuned production server. A workstation with default settings and less memory might not be quite as fast as that, but you're still looking at sub second response times on well indexed datasets with less than say 5,000,000 rows or so. --Scott

Contributors: John Wells jb at sourceillustrated.com, Steve Atkins steve at blighty.com, Nigel J. Andrews nandrews at investsystems.co.uk, Ryan Mahoney ryan at paymentalliance.net, Lincoln Yeoh lyeoh at pop.jaring.my, scott.marlowe scott.marlowe at ihs.com

case 2

from http://archives.postgresql.org/pgsql-sql/2002-04/msg00318.php

We are currently working on a Transcriptome project, which is a follow-on to the human genome project, in which we systematically look across all parts of the genome to see what is expressed in the form of RNA. It is publicly funded by the National Cancer Institute and the data is made publicly available at: http://www.netaffx.com/transcriptome/

We currently have about 100GB of data and will soon grow to a multi-terabyte system. We have tables of up to 1 billion rows and have been able to get ~1 million row queries to run in about 5 min. We've been very pleased with postgres. After a couple of major outages in our server room, it came back up flawlessly each time. So it has been an invaluable asset for this project. We run 7.2 on Red Hat on a 2-processor machine with SAN, and we have a 128-node linux cluster which will make analysis runs against the database.

case 3

Note that the article below uses table partitioning to achieve its scalability - this is similar to what I plan to do with my archive strategy, although since I'm leaning more towards potential database replication across database instances, the solution is more homegrown - I plan to provide a script which gives the admin control of which tables are replicated and when.

from http://enfranchisedmind.com/blog/archive/2006/11/04/166

Had something of a victory this week, and I've really got to give kudos to PostgreSQL for it. There's something magical about taking a piece of technology, pushing it well beyond what you would rationally expect it to be able to handle, and having it not just work, but work well. These sorts of experiences tend to make one a rabid fan of the technology in question. Well, postgres just did that for me.

Here's the set up. I got the following problem dumped on me Tuesday. See, there's this table we wanted to have in the database. Now, each row isn't that large - we needed a file date the record was created, a symbol (4-8 characters), an expiration date, and two volatility quotients (floats). So we're only looking at 40-50 bytes per record. But here's the problem: we're already generating something like a million records a day, and this is expected to increase, and they're wanting to put six months worth of data, probably more, into the database. So I'm looking at a single table that'll start with over 100 million rows, and could easily grow to half a billion rows. Worse yet, I need to access the table in at least two different ways - by file date and by some combination of symbol and expiration date, with the latter being time critical (it's an interactive application - a human is waiting on this data).

Now here's the punchline - the machine this monstrosity is going on. It's an HP DL-145 2.2GHz Dual Core Opteron with 4G of memory, a nice machine - but only a single 70G 7200RPM SATA drive as storage. Doing this on $300,000 worth of hardware I'd expect - but doing it on $3,000 worth of hardware? On a machine we just had kicking around? What do they think I am, a miracle worker?

Um, yeah. Fortunately for me, I was using Postgresql. One miracle, to order, coming up. Follow me for the details of how I did the (seemingly) impossible.


Also from the same web article. Note, concerning item #2 below, that we are not batching our inserts in transactions or using COPY with Xenia, mainly because we are inserting a small amount of data against a recent time window and want things like duplicate timestamps to fail out on an individual row basis.

The next person who claims "Postgres is slow" is going to get an earful from me. Maybe Postgres 7.0 was slow, I don't know - but I know for damned sure that Postgres 8.1 isn't. There are, I think, three things that make Postgres seem slow:

1) The default configuration. I joke that the default configuration of Postgres was designed for a 386 with 16 meg of memory. This isn't exactly true (it's more like a 486 with 32M of memory), but it makes my point. The default configuration of postgres is seriously minimal - it's good for getting the database up and making sure it's running, but it's not going to provide decent performance at all. For that, you need to go in and tune it a little bit, telling it how many resources to use. Which is good - you may be running the database on a machine also running other things, and may want to limit the resource utilization of the database. But if you're reading a performance comparison between postgres and some other database, and it starts with "we tested both databases with their default configuration..." you can stop reading at that point.

2) Lack of transactional awareness. Transactions change things. Especially with Postgres, where every command that isn't part of another transaction is its own transaction, and transactions are ACID. So, in Postgres, when you do a single INSERT statement, it is its own transaction, and doesn't complete until it's written out to disk. So you can only do as many INSERTs per second as you can do seeks per second. This means you're only going to get a few hundred INSERTs per second if you're lucky, if you're not aware of transactions. Which could easily give you the impression that Postgres is dog-slow. However, once you're aware of transactions, you can speed things up enormously by doing more per transaction. Grouping 100 inserts into a single transaction makes inserts about 100x faster, as the cost of the transaction is spread over all 100 inserts - and this still keeps you ANSI SQL compliant. If you're willing to ditch ANSI SQL and be Postgres specific, you can use things like the COPY command, which is significantly faster yet.

3) Most performance tests don't really push the scalability of a database. OK, your database is really fast on tables with 10,000 rows - so what? On modern hardware, just about any database will be fast enough on small tables. Performance becomes important when you're dealing with large things - either really large tables or lots and lots of connections. If the latter, transactions stop being optional - the database has to be consistent despite the asynchronous actions of the users. And if it's the former, well - doing well on small tables does not necessarily mean that you do well on large tables. A table with 100 million rows in it is a radically different beast than a table with 10 thousand rows.
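As a concrete illustration of the transaction-batching idea in item 2 above, here is a minimal sketch reusing the simplified measurement table from the earlier email (the values are made up):

begin;
insert into measurement (t, x, y, z, sensorid, variable, value)
  values (1.0, -79.1, 33.3, 0.0, 'buoy_01', 'water_temperature', '{21.5}');
insert into measurement (t, x, y, z, sensorid, variable, value)
  values (2.0, -79.1, 33.3, 0.0, 'buoy_01', 'water_temperature', '{21.6}');
-- ... the rest of the batch; one transaction cost is shared across all rows ...
commit;

-- Postgres-specific and faster still: bulk load from a file
copy measurement (t, x, y, z, sensorid, variable, value) from '/tmp/batch.csv' with csv;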

update speed tests

A useful update-speed benchmark, which should also relate somewhat to insert speeds:
http://archives.postgresql.org/pgsql-general/2006-10/msg00662.php

The attached image update_speed_results.png is of particular interest, showing update speed under a variety of maintenance schemes.


The numbers you presented didn't really offer any strong grounds for
believing that there's an O(N) growth rate --- as far as I can see your
results are only barely out of the measurement-noise category.  Can you
run some tests where the issue is not in the least significant digit
of the available numbers?

But assuming for the moment that you've got hold of a real problem...

The actual insertion of a row should be essentially a constant-time
operation, since we just stick it into the last page of the table
(or any page with sufficient free space).  Insertion of index entries
for the row would have cost that depends on the number of existing
table entries, but for btree indexes I'd expect the cost to vary as
O(log2(N)) not O(N).  I do not think you've presented enough evidence
to prove that you're seeing linear rather than log-N cost growth.

Most of the serious insertion-cost problems we've seen lately have
to do with the costs of checking foreign key references ... but those
normally vary with the size of the referenced table, not the table into
which you're inserting.  Besides which you mentioned nothing about
foreign keys ... or any other schema details as far as I saw ...

                        regards, tom lane 

It was really some time ago since I experimented with this. My last experiment was on PG 7.2 or 7.3. I was inserting circa 800,000 rows. Inserting without transactions took 25 hrs. Inserting with 10,000 rows per transaction took about 2.5 hrs, so the speedup was 10x. I have not experimented with the transaction batch size, but I suspect that 1,000 would not show much speedup.

> 2. Vacuuming also makes no difference for a heavy insert-only table, only slows it down.

Makes sense. Since my application was dumping all records each month and inserting new ones,
vacuum was really needed, but no speedup.

> 3. Table size plays no real factor.

This link may be useful; it contains additional links to various postgresql performance tests.
http://archives.postgresql.org/pgsql-general/2006-10/msg00662.php

Regards,

Richard Broersma Jr. 

in-house insert speed test

I used a perl script (mk_load.pl, attached) to create a batch of insert statements in the 100,000+ row range. The script produces a file called load.sql, which a second perl script (test_load.pl, attached) runs against the database, reporting the server time difference in seconds between the start and end time. The time recorded is the server time and not the process thread time, but the server this was run on had no real competing load during the testing. The multi_obs table tested against was reindexed (reindex table multi_obs;) and vacuum analyzed (vacuum full verbose analyze;) before testing.
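The generated load.sql is essentially a long run of single-row insert statements along these lines (the column list and values are illustrative - the actual multi_obs table carries additional metadata columns):

insert into multi_obs (m_date, m_type_id, sensor_id, m_value)
  values ('2007-01-01 01:00:00', 41, 1001, 21.5);
insert into multi_obs (m_date, m_type_id, sensor_id, m_value)
  values ('2007-01-01 01:06:00', 41, 1001, 21.6);
-- ... 100,000+ rows of the same shape ...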

The results were the following:

100,000 rows inserted in 148 seconds = 675 insert/second = 1.48 millisecond/insert
An additional
500,000 rows inserted in 604 seconds = 827 insert/second = 1.21 millisecond/insert

Insert rates of around 1-2 milliseconds seem reasonable for our Dell Poweredge 2850 server when the table indexes are used correctly. 1-2 ms corresponds to 1000-500 inserts/second (or 3,600,000-1,800,000 inserts/hour). It was also noted that the very first insert in a batch may incur a higher millisecond cost (say 200 ms); my guess is that a prepare/execute plan is cached and reused against subsequent insert statements of the same type.

If we approximate the existing Seacoos data load as 1,000 platforms with each platform reporting around 10 observations hourly, then this corresponds to 10,000 inserts per hour. At the slower rate of say 2 ms per insert, the current load of 10,000 rows/hour divided by a potential of 1,800,000 rows/hour is about 0.0056.

So at roughly half of one percent of capacity, we really aren't taxing the database in terms of input load at this time, and there should be plenty of capacity for further platforms and observations.

At an existing load of 10,000 rows/hour we would expect 10,000 rows/hour x 24 hours/day x 365 days/year = 87,600,000 rows (roughly 88 million) in a year. A table of this size should still be well within PostgreSQL's capabilities, although I'll be developing a partition/replication/selection strategy by month/year which accommodates growing data depending on the growth rate.
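One possible shape for that month/year partition strategy, using Postgres table inheritance and constraint exclusion (a sketch of the general technique described in the partitioning article above, not the finished Xenia approach):

-- child table holding one month of multi_obs rows
create table multi_obs_2007_01 (
  check (m_date >= '2007-01-01' and m_date < '2007-02-01')
) inherits (multi_obs);

create index i_multi_obs_2007_01 on multi_obs_2007_01 (m_type_id, m_date, sensor_id);

-- with constraint_exclusion on, queries against multi_obs that filter on m_date
-- can skip child tables whose check constraint rules them out
set constraint_exclusion = on;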

prepare/execute statements
http://www.petefreitag.com/item/454.cfm
I also tried a 'prepare/execute' version of these insert loads (mk_load_prepare.pl, attached), but didn't notice much if any improvement.
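For reference, the prepare/execute variant looks roughly like the sketch below (the column list is illustrative, as above):

prepare obs_insert (timestamp, int, int, double precision) as
  insert into multi_obs (m_date, m_type_id, sensor_id, m_value)
  values ($1, $2, $3, $4);

execute obs_insert ('2007-01-01 01:00:00', 41, 1001, 21.5);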

performance tips

See also this link for some earlier documented PostgreSQL notes and tips:

http://archives.postgresql.org/pgsql-sql/2002-04/msg00318.php backup link

storage cost of null columns

from http://www.thescripts.com/forum/thread420697.html

Storage cost of a null column

What is the storage cost of a null entry in a column? i.e. does a null entry of type integer, float8 or text consume the same amount of storage as one that is filled?

I ask because I have satellite data which is transmitted via a dodgy RF link that drops data packets. This means I have a number of columns in a table that are null. Moreover, the operations people decided to use a compression scheme whereby non-changing bit/integer values are not output at regular intervals, which also adds a considerable number of null entries into the columns.

Because of this, we made a decision that we would have hundreds of 2-column tables (timestamp, value) and use unions, intersections, and joins to get what was needed. Unfortunately, this has made application programming a real nightmare as we are often forced to reconstruct a snapshot frame for the range of times either in C or have the app create temporary tables in SQL and insert the relevant data prior to selecting it.

As it stands, we've ordered a new disk array and provided that the storage costs are not that high, I will probably be reorganising all this next week. If anyone has any other suggestions, I'd be very keen to hear them.

Cheers, Randall November 23rd, 2005 12:26 AM

Re: Storage cost of a null column Randall writes:
> What is the storage cost of a null entry in a column?

If you have even one, all the rest in that row are free, so your scheme sounds reasonable.

Null columns are simply not stored. There is a bitmap at the start of the row with one bit per column, indicating which ones are null or not null. However we omit the bitmap if all columns are nonnull in a particular row. So it's reasonable to consider the cost of the first null as being the size of the bitmap (N bits for an N-column table, rounded up). The rest are free.

> Because of this, we made a decision that we would have hundreds of 2
> column tables (timestamp, value) and use unions, intersections,
> and

Narrow tables are a dead loss if you're concerned about storage space -- you'll get eaten by the per-row overhead, which is a minimum of 28 bytes per row.

regards, tom lane

postgresql bit cast and comparison

I was thinking that for the xenia qc_flag column, which might be a single integer instead of separate columns, we should be able to cast the single integer column value to bit for flag comparison, using something like the documentation below.

http://developer.postgresql.org/pgdocs/postgres/functions-bitstring.html
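A hedged sketch of the idea, assuming qc_flag is stored as a plain integer on multi_obs and each flag occupies one bit (the bit position tested here is made up):

-- rows whose third flag bit is set
select row_id, qc_flag
from multi_obs
where (qc_flag::bit(8) & B'00000100') = B'00000100';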

Index usage - make sure queries are using index scans when expected and not sequential scans

I found out that the reason for my dog-slow trigger was that my select statement was doing a sequential scan instead of using the available index. A sequential scan is a full read of the entire table being queried instead of using the index to locate the rows of interest. Sequential scans are ok for

  • small tables (say less than 1,000 rows)
  • queries that actually plan to read most of the rows in a table

but an index scan is preferable when only a proportionally small number of the table's rows are being queried.

The main indexes I have on the multi_obs table are:

Indexes:

"multi_obs_pkey" PRIMARY KEY, btree (row_id)
"i_multi_obs" UNIQUE, btree (m_type_id, m_date, sensor_id)

If you make sure to reference m_type_id in your select statements (which several of my selects, including the one used for plotting, don't - test examples below), the query will use the index and return very quickly (a few milliseconds) versus a sequential scan taking several hundred milliseconds. Index usage is sensitive to the column order of the index, so m_type_id, the leading column in this case, should be referenced to invoke the index.


without m_type_id referenced (note the 'Seq Scan' in the Query Plan)

explain analyze select row_id,to_char(m_date,'MI') 
from multi_obs
where m_date > '2007-01-01 01:00:00' and m_date < '2007-01-01 02:00:00';

QUERY PLAN 
---------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on multi_obs (cost=0.00..18823.74 rows=117 width=12) (actual time=19.677..719.102 rows=3599 loops=1)

Filter: ((m_date > '2007-01-01 01:00:00'::timestamp without time zone) AND (m_date < '2007-01-01 02:00:00'::timestamp without time zone))
Total runtime: 719.949 ms

with m_type_id referenced (note the 'Index Scan' in the Query Plan)

explain analyze select row_id,to_char(m_date,'MI') 
from multi_obs
where m_type_id = 41
and m_date > '2007-01-01 01:00:00' and m_date < '2007-01-01 02:00:00'; 

QUERY PLAN 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using i_multi_obs on multi_obs (cost=0.00..210.00 rows=117 width=12) (actual time=0.042..9.241 rows=3599 loops=1)

Index Cond: ((m_type_id = 41) AND (m_date > '2007-01-01 01:00:00'::timestamp without time zone) AND (m_date < '2007-01-01 02:00:00'::timestamp without time zone))
Total runtime: 9.949 ms

random_page_cost setting

Did some research and there is a setting labeled random_page_cost in the postgresql.conf file which you may want to change from its default of 4.

#postgresql.conf
#random_page_cost = 4           # units are one sequential page fetch cost 

Set this to 1 (1 is the lowest recommended minimum, I believe; higher numbers favor sequential scans) and it will lower the query planner's cost estimate for using an index, making it more likely to favor index scans over sequential scans.

Found out about this tuning setting from this thread
http://archives.postgresql.org/pgsql-performance/2004-11/msg00333.php

You can do 'pg_ctl reload' to reload the postgresql.conf file while the database is running.

Rerunning the '100 hour' tail queries after making this change should show it using the index scan at around 200ms instead of 1000+ms.

If you're in the situation where you want to enable/disable sequential scans for comparison, you can run the following command from your psql session or script using true/false settings (this only affects the session, not the permanent setup):

set enable_seqscan=false;

You can also test the random_page_cost setting to see its effect:

set random_page_cost=1;  -- default is 4

The following links were also helpful:

http://archives.postgresql.org/pgsql-hackers/2005-10/msg01180.php
http://www.pgadmin.org/docs/dev/pg/runtime-config-query.html
http://revsys.com/writings/postgresql-performance.html backup link

-- JeremyCothran - 08 Feb 2007

Attachments:

update_speed_results.png      7.7 K    13 Feb 2007 - 19:42  JeremyCothran
mdm.timeseries.JPG            295.4 K  13 Feb 2007 - 19:50  JeremyCothran
msg00318.php.htm              28.8 K   14 Feb 2007 - 18:42  JeremyCothran
postgresql-performance.html   28.3 K   14 Feb 2007 - 18:42  JeremyCothran
mk_load.pl.txt                1.0 K    14 Feb 2007 - 19:13  JeremyCothran
test_load.pl.txt              0.2 K    14 Feb 2007 - 19:13  JeremyCothran
mk_load_prepare.pl.txt        0.8 K    14 Feb 2007 - 19:13  JeremyCothran
