Discussion:
PG_DUMP very slow because of STDOUT ??
Andras Fabian
2010-07-09 11:48:15 UTC
Permalink
Hi,

I am new to the mailing list (as a writer ... I have been reading for a long time), but we have been using PostgreSQL in our company for quite some time now.
Lately, we rolled out some new servers with vastly improved hardware (an 8-disk RAID 1+0 instead of the old 2-disk RAID 1, etc.), a new OS (Ubuntu 10.04 Server instead of Debian Lenny) and a newer PostgreSQL version (8.4.4 instead of 8.3.8). And everything seemed to be much better - as expected - until I saw the first full backup run-times.
For example:

- One of our old machines needed approximately 2 ½ hours to make a dump (with "-Fc --compress=9") of our approximately 100 GByte DB.

- One of our new machines, with a copy of the same 100 GByte DB, needed 6 ½ to 7 hours to dump it.

A first check of our MRTG monitoring showed that during the backup the machine was not saturated in any way (IO wait was below 5%, CPUs were almost idling), so you got the feeling the machine was sleeping ... But of course it did do the backup, only very slowly.

Now I investigated further. I did some detailed monitoring with collectl (http://collectl.sourceforge.net/) and saw that the machine was often doing some slow reading and every 20-30 seconds wrote a few MBytes to the backup disk. So almost everything suggests that we are very, very far from any hardware limitation. I also checked some file writes/reads on the RAID, and could confirm that a plain dd of a multi-GByte file only took a snap.

OK, then I checked what pg_dump was doing. In the process list of PostgreSQL (viewed via pgAdmin) I could see how pg_dump works (I hadn't cared much about it until now): it does a COPY to STDOUT. Now I wanted to narrow down the problem, and tried to reproduce that COPY statement on one of our 8 GByte tables (that table only has INT, BIGINT and BOOLEAN fields, so no potential BLOB problem!). And now comes the interesting part.

- When I did "COPY public.bigtable (column1, column2, column3) TO '/var/tmp/test.dump';" it is FAST. It writes the 3 GByte dump file in about 2 ½ minutes (as expected with this hardware)!

- When I did "time sudo -u postgres /var/software/pgsql1/bin/psql -h /tmp/pgsql1 -p 4306 -d database_name -f copy-test.sql > /var/tmp/test.dump" (and copy-test.sql contains: "COPY public.cachereleaseartist (releasebyterritoryid, artistid, isdisplayartist, artistroleclusterid, artistroletypeid, iscontainedinrelease, iscontainedinresource) TO STDOUT;") ... I couldn't wait until it ended. After 20 minutes the test.dump file was merely at 1 GByte (far from the target size of 3 GByte).
I also monitored both statements via collectl, and could confirm that the direct COPY to file produced some reasonable IO activity, while the COPY via STDOUT was idling like mad, with occasional bursts before falling asleep again. This would also make clear why pg_dump is so dog slow: if it gets its data via STDOUT at that speed, then it falls asleep too ...

And to make things even worse: I did the same test on our old hardware too. And BOTH the direct COPY to file AND the COPY via STDOUT were equally fast (they took 5 ½ minutes ... but OK, that's old iron compared to the new machine)! No difference between file and STDOUT (as expected)!

Now I ask, what's going on here? Why is COPY via STDOUT so much slower on our new machine?

- Has anything changed between PostgreSQL 8.3.8 and PostgreSQL 8.4.4 in this respect? Maybe some STDOUT buffer flushing in COPY? Buffer handling? Buffer sizes?

- Or is the OS to blame? As I said, we changed from Debian Lenny (kernel 2.6.26-2) to Ubuntu 10.04 Server (kernel 2.6.32-22), both 64-bit. And if so, would there be ways for PostgreSQL to adapt to the new OS's behavior regarding STDOUT (if there is such a thing)?

And for the Hardware:

- The old machine was an 8-core with 16 GByte RAM, BUT only one software RAID 1 with two disks for everything (I know, that was a little bit meager).

- The new machine has 8 cores (16 with hyper-threading), 24 GByte RAM, a RAID 1+0 with 8 disks for data only, and XLOG/backup/OS on a second RAID 1 with 2 disks, on a hardware RAID controller with battery-backed cache (so, obviously, this machine should be faster than the old one).

Can someone shed some light on this STDOUT madness? Or give me some directions/hints in which I could further research?

Thank you very much!

Andras Fabian
Atrada Trading Network AG
Tom Lane
2010-07-09 13:38:47 UTC
Permalink
Andras Fabian <***@atrada.net> writes:
> Now I ask, what's going on here? Why is COPY via STDOUT so much slower on our new machine?

Something weird about the network stack on the new machine, maybe.
Have you compared the transfer speeds for Unix-socket and TCP connections?

On a Red Hat box I would try using oprofile to see where the bottleneck
is ... don't know if that's available for Ubuntu.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Andras Fabian
2010-07-09 13:47:29 UTC
Permalink
> Something weird about the network stack on the new machine, maybe.
> Have you compared the transfer speeds for Unix-socket and TCP connections?

Hmm, no ... but how do Unix-socket / TCP connections relate to STDOUT here (sorry, maybe this is obvious to some, but not to me at the moment)?
And doing some profiling came to my mind too, but I haven't had much experience in that field so far (at least not with basic OS-related stuff). I might check out oprofile (yes, it seems to exist for Ubuntu) next week (now it's time to leave for the weekend :-)
And maybe others will have some more ideas until then ... (or have experienced this or similar weirdness too).

Andras Fabian

Tom Lane
2010-07-09 14:08:54 UTC
Permalink
Andras Fabian <***@atrada.net> writes:
>> Something weird about the network stack on the new machine, maybe.
>> Have you compared the transfer speeds for Unix-socket and TCP connections?

> Hmm, no ... but how do Unix-socket / TCP connections relate to STDOUT here (sorry, maybe this is obvious to some, but not to me at the moment)?

Well, COPY TO STDOUT really means "send the data across the network
connection to the client". It sounds to me like your problem is slow
data transfer between the database backend and pg_dump (or psql).
Not sure what could be causing it, unless perhaps you've got a kernel
packet filter that's performing an unreasonable amount of processing
per packet.

regards, tom lane

Craig Ringer
2010-07-09 20:11:29 UTC
Permalink
On 09/07/10 22:08, Tom Lane wrote:
> Andras Fabian <***@atrada.net> writes:
>>> Something weird about the network stack on the new machine, maybe.
>>> Have you compared the transfer speeds for Unix-socket and TCP connections?
>
>> Hmm, no ... but how do Unix-socket / TCP connections relate to STDOUT here (sorry, maybe this is obvious to some, but not to me at the moment)?
>
> Well, COPY TO STDOUT really means "send the data across the network
> connection to the client". It sounds to me like your problem is slow
> data transfer between the database backend and pg_dump (or psql).
> Not sure what could be causing it, unless perhaps you've got a kernel
> packet filter that's performing an unreasonable amount of processing
> per packet.

... and that should be showing up as high "system" CPU time, or it seems
to on my machines with lots of tcp/ip traffic and heavy netfilter rules.
If they're using unix sockets packet filtering shouldn't come into play
anyway.

Andras: Are you using UNIX sockets or TCP/IP to connect to PostgreSQL?
If you specify any host to connect to, even "localhost", you're using
TCP/IP. If you don't specify a host, you're using unix sockets.

Whichever you are doing, try the other one and see if performance is
different.
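
A minimal sketch of that comparison, assuming a POSIX shell and reusing the thread's placeholder names ("database_name", copy-test.sql); adjust port and host to your installation (the thread used -p 4306):

```shell
#!/bin/sh
# Hedged sketch: run the same COPY script over both connection types and
# compare the timings. "database_name" and copy-test.sql are placeholders
# from the thread, not real objects on your system.
run_copy() {  # $1 = connection flags; an empty string means Unix-domain socket
    # shellcheck disable=SC2086
    time psql $1 -d database_name -f copy-test.sql > /dev/null
}

if command -v psql >/dev/null 2>&1 && [ -f copy-test.sql ]; then
    run_copy ""               # Unix-domain socket (no host given)
    run_copy "-h localhost"   # TCP/IP over loopback
else
    echo "psql or copy-test.sql not found; shown for illustration only"
fi
```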

Idea: One notable difference between COPY to file and 'copy to stdout"
to a receiving process over the network is that the latter involves the
interaction of two processes. When a kernel change is involved and
you're seeing low performance with lots of apparent idleness, that
immediately makes me think "process scheduler".

Of course, that might be completely off-track, but I've seen weird
performance issues caused by scheduler changes before.

--
Craig Ringer

Andras Fabian
2010-07-09 20:25:49 UTC
Permalink
Hi guys,

well, definitely thanks for these first ideas. They at least point in a direction which I hadn't considered before (to be honest, I didn't even think of associating STDOUT with socket stuff).
With my experiments, I think I have essentially tried both socket types. Once I connected normally via pgAdmin to my server (this would definitely qualify as TCP/IP, I think :-), and I also ran it locally on the machine via "psql" (see the example in my original post), so in that case it must have been Unix sockets. And both cases behaved similarly when issuing the COPY TO STDOUT (i.e.: sloooowly).
And as for the difference between COPY to file and COPY TO STDOUT ... when I did the same experiment on the old machine/kernel, there was almost no difference in execution time (almost identical, and fast as expected) - that test was done from "psql" on the command line ... so Unix sockets.
So, of course, there might be something with our new kernel. But then many others could have this "problem" too! Because we are using the stock kernel of the latest Ubuntu 10.04 LTS 64-bit server (nothing self-compiled, just as it comes from Canonical). Maybe someone with an Ubuntu 10.04 (server!) machine and PostgreSQL on it could try to reproduce it too.
By the way, how could I experiment with the scheduler? I remember there were some magic tricks via the /proc file system or something like that to "tweak" it on the fly ...
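
For reference, the scheduler knobs exposed under /proc can be listed like this (a sketch; which sched_* files exist varies by kernel version, so any particular name, e.g. sched_latency_ns, is an assumption):

```shell
#!/bin/sh
# Sketch: list the scheduler tunables the running kernel exposes under
# /proc/sys/kernel/. Read-only; writing to these files changes scheduler
# behavior on the fly and should be done with care.
show_sched_tunables() {
    found=0
    for f in /proc/sys/kernel/sched_*; do
        [ -r "$f" ] || continue
        printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
        found=1
    done
    [ "$found" -eq 1 ] || echo "no sched_* tunables visible on this kernel"
}
show_sched_tunables
```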

Andras Fabian

Andras Fabian
2010-07-12 08:45:09 UTC
Permalink
Hi Tom (or others),

are there some recommended settings/ways to use oprofile in a situation like this? I got it working and have seen a first profile report, but then managed to completely freeze the server on a second try with different oprofile settings (the next tests will go against the newly installed - and identical - new servers).

Andras Fabian

Andras Fabian
2010-07-12 13:03:49 UTC
Permalink
This STDOUT issue gets even weirder. Now I have set up our two new servers (identical hw/sw), as I would have needed to anyway. After getting PG running, I also set up the same test scenario as on our problematic servers and started the COPY-to-STDOUT experiment. And you know what? Both new servers are performing well. No hanging, and the 3 GByte test dump was written in around 3 minutes (as expected). To make things even more complicated ... I went back to our production servers. Now the first one - which I froze up with oprofile this morning and which needed a REBOOT - is performing well too! It needed 3 minutes for the test case ... WTF? BUT the second production server, which did not have a reboot, is still behaving badly.
Now I tried to dig deeper (without killing a production server again) ... and came to comparing the outputs of ps (first with the '-fax' parameter, then '-axl'). Now I have found something interesting:
- all fast servers show the COPY process as being in the state Rs ("runnable (on run queue)")
- on the still slow server, this process is in 9 out of 10 samples in Ds ("uninterruptible sleep (usually IO)")
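
That kind of state sampling can be scripted, e.g. (a sketch; PID 2819 below is the hypothetical COPY backend from the thread):

```shell
#!/bin/sh
# Sketch: sample a process's state repeatedly and tally the results, to see
# how often it sits in "D" (uninterruptible sleep) versus "R" (runnable).
sample_states() {  # $1 = pid, $2 = number of samples, $3 = seconds between samples
    i=0
    while [ "$i" -lt "$2" ]; do
        ps -o stat= -p "$1"
        sleep "$3"
        i=$((i + 1))
    done | sort | uniq -c
}

# Hypothetical usage against the COPY backend's PID from the thread:
# sample_states 2819 10 1
```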

Now, this "Ds" state seems to be something unhealthy - especially if it is there almost all the time - as far as my first reads on Google show (and although it points to IO, there is seemingly only very little IO, and IO wait is minimal too). I have also done "-axl" with ps, which brings up the following line for our process:
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
1 5551 2819 4201 20 0 5941068 201192 conges Ds ? 2:05 postgres: postgres musicload_cache [local] COPY

Now, as far as I understood from my Google searches, the WCHAN column shows where in the kernel my process is hanging. Here it says "conges". Now, can somebody tell me what "conges" means? Or do I have other options to get more info out of the system (maybe without oprofile - it has already burned my hand :-)?

And yes, I now see a reboot as a possible "fix", but that would not ensure that the problem will not resurface. So, for the time being, I will leave my second production server as is ... so I can further narrow down the potential causes of this strange STDOUT slowdown (especially if someone has a tip for me :-)

Andras Fabian

(in the meantime my "slow" server finished the COPY ... it took 46 minutes instead of the 3 minutes on the fast machines ... a slowdown of factor 15).




Greg Smith
2010-07-12 18:54:53 UTC
Permalink
Andras Fabian wrote:
> - all fast servers show the COPY process as being in the state Rs ("runnable (on run queue)")
> - on the still slow server, this process is in 9 out of 10 samples in Ds ("uninterruptible sleep (usually IO)")
>

I've run into significant PostgreSQL performance regressions due to
issues with the Linux scheduler before, specifically when running a
single really intensive client program. You might be seeing something
similar here. I wrote a reference-link-heavy blog entry about that at
http://notemagnet.blogspot.com/2008/05/pgbench-suffering-with-linux-2623-2626.html
which you might find useful; one of the batch scheduler tweaks alluded
to there might improve things. Regressions in newer kernels are the
norm rather than the exception, and given the general lack of quality
control in Ubuntu 10.04 I have avoided any performance testing of it
yet. I was going to give it six months after release before I even
thought about that, in hopes more bugs are squashed, but I'm not
optimistic about this distribution for server use at all right now.

There's more information about using oprofile at
http://wiki.postgresql.org/wiki/Profiling_with_OProfile that might help
you dig into the underlying spot it's stuck at.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
***@2ndQuadrant.com www.2ndQuadrant.us


Scott Marlowe
2010-07-13 01:43:20 UTC
Permalink
On Mon, Jul 12, 2010 at 7:03 AM, Andras Fabian <***@atrada.net> wrote:
> This STDOUT issue gets even weirder. Now I have set up our two new servers (identical hw/sw), as I would have needed to anyway. After getting PG running, I also set up the same test scenario as on our problematic servers and started the COPY-to-STDOUT experiment. And you know what? Both new servers are performing well. No hanging, and the 3 GByte test dump was written in around 3 minutes (as expected). To make things even more complicated ... I went back to our production servers. Now the first one - which I froze up with oprofile this morning and which needed a REBOOT - is performing well too! It needed 3 minutes for the test case ... WTF? BUT the second production server, which did not have a reboot, is still behaving badly.

I'm gonna take a scientific wild-assed guess that your machine was
rebuilding RAID arrays when you started out, and you had massive IO
contention underneath the OS level, resulting in such a slowdown.
Note that you mentioned ~5% IO wait. That's actually fairly high if
you've got 8 to 16 cores or something like that. It's much better to
use iostat -xd 60 or something like that and look for IO utilization
at the end of the lines.

Again, just a guess.

Andras Fabian
2010-07-13 06:31:18 UTC
Permalink
Hi Scott,

Although I can't guarantee 100% that there was no RAID rebuild at some point, I am almost sure that wasn't the case. Two machines - the ones which were already in production - exhibited this problem, and both of them had already been up for some weeks. And the reboot rather "fixed" one of them instead of making it worse (as your theory would suggest): the problem "disappeared" (but I don't know for how long). Now only one of the production machines has the issue ... the one which wasn't rebooted. Strange, strange. Nevertheless, thank you for your idea ... this is exactly the way I try to approach the problem: by making theories and trying to prove or disprove them :-)
Now I will try to further investigate along the tips from Craig and Greg.

Andras Fabian

Scott Marlowe
2010-07-13 07:45:12 UTC
Permalink
On Tue, Jul 13, 2010 at 12:31 AM, Andras Fabian <***@atrada.net> wrote:
> Hi Scott,
>
> Although I can't guarantee 100% that there was no RAID rebuild at some point, I am almost sure that wasn't the case. Two machines - the ones which were already in production - exhibited this problem, and both of them had already been up for some weeks. And the reboot rather "fixed" one of them instead of making it worse (as your theory would suggest): the problem "disappeared" (but I don't know for how long). Now only one of the production machines has the issue ... the one which wasn't rebooted. Strange, strange. Nevertheless, thank you for your idea ... this is exactly the way I try to approach the problem: by making theories and trying to prove or disprove them :-)
> Now I will try to further investigate along the tips from Craig and Greg.

Was there maybe a kernel update that hadn't been applied by reboot?

Andras Fabian
2010-07-13 07:53:07 UTC
Permalink
Hi Scott,

No, we didn't have a kernel update (it is still the stock Ubuntu 10.04 Server kernel ... 2.6.32.2). And in the meantime - this morning - I have discovered that the rebooted server is slowing down again! It is not at the level of the non-rebooted server (about 45 minutes for the 3 GByte file) ... it "only" needs 22 minutes, but that is already quite a bit away from the optimum of 3 minutes (or less). So, definitely, something is "deteriorating" in the system ...

And I also did some readings with iostat -xd 5 ... The target drive to which the STDOUT output is directed is below 1% utilization (mostly around 0.2-0.4%), with rare peaks around 2-3% when it does a little bit more. And this is maybe one of the more interesting observations: it seems to periodically "flush" a bit more out, just to fall asleep again (with minimal write activity). The drive from which the reads come (the one with PG's data files ... the 8-disk RAID 10) has a little bit more activity (utilization 6-8%), but that data is also in concurrent use by some apps reading from the DB (just normal traffic on the DB).

Andras Fabian

Craig Ringer
2010-07-13 09:05:02 UTC
Permalink
On 13/07/2010 3:53 PM, Andras Fabian wrote:
> Hi Scott,
>
> No, we didn't have a kernel update (it is still the stock Ubuntu 10.04 Server kernel ... 2.6.32.2). And in the meantime - this morning - I have discovered that the rebooted server is slowing down again! It is not at the level of the non-rebooted server (about 45 minutes for the 3 GByte file) ... it "only" needs 22 minutes, but that is already quite a bit away from the optimum of 3 minutes (or less). So, definitely, something is "deteriorating" in the system ...
>
> And I also did some readings with iostat -xd 5 ... The target drive to which the STDOUT output is directed is below 1% utilization (mostly around 0.2-0.4%), with rare peaks around 2-3% when it does a little bit more. And this is maybe one of the more interesting observations: it seems to periodically "flush" a bit more out, just to fall asleep again (with minimal write activity). The drive from which the reads come (the one with PG's data files ... the 8-disk RAID 10) has a little bit more activity (utilization 6-8%), but that data is also in concurrent use by some apps reading from the DB (just normal traffic on the DB).

I don't think it'll be particularly helpful in this case, but you never
know, so: another information-collecting tool is "blktrace". This can
let you observe in detail exactly what's being done on a given block
device. It's helpful when you have weird I/O patterns and can't figure
out why, as it'll often reveal some process continually poking away at
some file it doesn't need to, thrashing away on a disk-backed mmap()ed
tempfile, or the like.

--
Craig Ringer

Craig Ringer
2010-07-13 03:03:23 UTC
Permalink
On 12/07/10 21:03, Andras Fabian wrote:
> This STDOUT issue gets even weirder. Now I have set up our two new servers (identical hw/sw), as I would have needed to anyway. After getting PG running, I also set up the same test scenario as on our problematic servers and started the COPY-to-STDOUT experiment. And you know what? Both new servers are performing well. No hanging, and the 3 GByte test dump was written in around 3 minutes (as expected). To make things even more complicated ... I went back to our production servers. Now the first one - which I froze up with oprofile this morning and which needed a REBOOT - is performing well too! It needed 3 minutes for the test case ... WTF? BUT the second production server, which did not have a reboot, is still behaving badly.
> Now I tried to dig deeper (without killing a production server again) ... and came to comparing the outputs of ps (first with the '-fax' parameter, then '-axl'). Now I have found something interesting:
> - all fast servers show the COPY process as being in the state Rs ("runnable (on run queue)")
> - on the still slow server, this process is in 9 out of 10 samples in Ds ("uninterruptible sleep (usually IO)")
>
> Now, this "Ds" state seems to be something unhealthy - especially if it is there almost all the time - as far as my first reads on Google show (and although it points to IO, there is seemingly only very little IO, and IO wait is minimal too). I have also done "-axl" with ps, which brings up the following line for our process:
> F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
> 1 5551 2819 4201 20 0 5941068 201192 conges Ds ? 2:05 postgres: postgres musicload_cache [local] COPY


Your wchan column isn't wide enough to show the full function name, but
I'd say it's related to some form of throttling or congestion control.

Get a wider view of that column to find out what the full function name
is. Grepping the kernel source for it can then tell you a lot about
where in the kernel it is and what might be going on.

Try:

ps ax -O wchan:40

to get a decently wide view of that col.

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Andras Fabian
2010-07-13 08:05:26 UTC
Permalink
Craig, thanks for that ps tip (you think you have used ps for such a long time, and it still has some new tricks available).

And here is the more readable line:

26390 congestion_wait D ? 00:00:26 postgres: postgres musicload_cache [local] COPY

So the kernel function it is always idling in seems to be congestion_wait ... I am already looking around on Google and in the kernel source (for 2.6.32.2):
http://lxr.linux.no/#linux+v2.6.32.2/mm/backing-dev.c#L762
/**
* congestion_wait - wait for a backing_dev to become uncongested
* @sync: SYNC or ASYNC IO
* @timeout: timeout in jiffies
*
* Waits for up to @timeout jiffies for a backing_dev (any backing_dev) to exit
* write congestion. If no backing_devs are congested then just wait for the
* next write to be completed.
*/
So, obviously, for some reason we are waiting too much for a backing_dev ... whichever it is at the moment. Because, as I just wrote to Scott Marlowe, the disk system is almost idling (I have seen disk utilization on the drive to which I write below 1%).

So, the question seems to be why and where this "idling" happens.
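One way to keep an eye on this idling is a small sampling loop (a sketch: pass the COPY backend's PID as the first argument; it defaults to the current shell only so the snippet is runnable anywhere):

```shell
# Sample the scheduler state (STAT) and the full wait-channel symbol of one
# process once a second. A process that is almost always in state "D" with
# wchan=congestion_wait shows exactly the pattern discussed above.
PID=${1:-$$}
for i in 1 2 3 4 5; do
    ps -o pid=,stat=,wchan:40= -p "$PID"
    sleep 1
done
```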

Just as a test, I have tried a very simple piping example (which should go through STDOUT too ... or am I wrong?).
- "dd if=/dev/zero of=file_10GB bs=1024 count=10M" created a 10 GB test file on source drive (sdb)
- "time cat file_10GB > /var/tmp/test2.dump" ... pipe the file to target drive (/var/tmp is on sda)
This takes only a bit over 2 minutes to complete ... and although one sees congestion_wait in PS from time to time, it is not dominant, and there are other EXT4- or IO-related functions shown. ... But maybe this mini-experiment is lacking some important steps that are in the path of a COPY-to-STDOUT from PostgreSQL.

Andras Fabian



Craig Ringer
2010-07-13 09:00:53 UTC
On 13/07/2010 4:05 PM, Andras Fabian wrote:
> Craig, thanks for that PS tip (you think, you have used PS for such a long time, but it still has some new tricks available).

> So, obviously, for some reason we are waiting too much for a backind_device ... which ever it is at the moment. Because, as I just wrote to Scott Marlowe, the disk system is almost idling (have seen disk utilization on the drive to which I write below 1%).

A quick search suggests that most calls into congestion_wait are in the
memory management subsystem, and are involved in situations where the
system is struggling for memory. However, that includes memory consumed
by I/O buffers, writeback for I/O, etc, so it'd also be consistent with
very slow I/O causing write throttling as the system tried to write
already buffered data to disk.

Most other calls are in file system drivers.

At this point I'd be taking a closer look at "vmstat 1" and "iostat 1"
output, plus "top", to see if any interesting clues about system-wide
issues turned up.

I'd also be trying to perform each step of the problem operation in
isolation as much as possible, so as to see if I could find out what
particular part was causing the slowdown. Comparing "\copy" to "COPY ...
TO STDOUT", invoking "COPY ... TO STDOUT" with a standalone backend
writing output to an on disk file and to /dev/null, etc.

> So, the question seems to be, why and where this "idling happens".

You can potentially find out more by getting a trace of the kernel
function call stack for the backend process. The kernel call stack of a
process at any given time can be obtained by reading /proc/$pid/stack .

This will tell you not only what call it's waiting in in the kernel, but
what function(s) called it, and even the code offset within each function.
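For example, a rough way to grab it (the PID 26390 is just the example from earlier in the thread; reading another process's kernel stack usually needs root):

```shell
# Dump the kernel call stack of a backend process. Find the PID with
# "ps ax | grep COPY" first; /proc/<pid>/stack is usually readable by
# root only, so fall back to a note when permission is denied.
PID=${1:-26390}
cat /proc/"$PID"/stack 2>/dev/null \
    || echo "cannot read /proc/$PID/stack (wrong PID, or not root)"
```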

> Just as a test, I have tried a very simple piping example (which should go trough STDOUT too ... or am I wrong).
> - "dd if=/dev/zero of=file_10GB bs=1024 count=10M" created a 10 GB test file on source drive (sdb)
> - "time cat file_10GB> /var/tmp/test2.dump" ... pipe the file to target drive (/var/tmp is on sda)

Isn't your issue suspected to be with network transfers over unix
sockets and/or tcp/ip, rather than with pipes?

Try "socat" if you want to test unix socket performance and/or tcp/ip
socket performance. It's an amazing sysadmin/network swiss army knife.
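A rough unix-socket throughput check with socat could look like this (a sketch; the socket path is arbitrary, and the result only says whether the socket machinery itself is healthy):

```shell
# Push 256 MB of zeros through a unix-domain socket: one socat drains the
# socket into /dev/null, a second one feeds it from a pipe. On a healthy
# box this should run at hundreds of MB/s; wrap the sender in "time" to see.
SOCK=/var/tmp/throughput_test.sock
if command -v socat >/dev/null 2>&1; then
    rm -f "$SOCK"
    socat -u UNIX-LISTEN:"$SOCK" /dev/null &   # receiver: socket -> /dev/null
    sleep 1                                    # give the listener time to bind
    dd if=/dev/zero bs=1M count=256 2>/dev/null \
        | socat -u STDIN UNIX-CONNECT:"$SOCK"  # sender: pipe -> socket
    wait
    rm -f "$SOCK"
else
    echo "socat not installed"
fi
```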

--
Craig Ringer

Andras Fabian
2010-07-13 09:18:47 UTC
Hi Craig,

Yes, a first look at /proc/pid/stack shows something that smells like memory management ... at least up to the point where congestion_wait is called.
------------------------------------------------------------------
[<ffffffff8110d750>] congestion_wait+0x70/0x90
[<ffffffff81101cb7>] shrink_inactive_list+0x667/0x7e0
[<ffffffff81101ec1>] shrink_list+0x91/0xf0
[<ffffffff811020b7>] shrink_zone+0x197/0x240
[<ffffffff81102886>] __zone_reclaim+0x146/0x260
[<ffffffff81102ab7>] zone_reclaim+0x117/0x150
[<ffffffff810f8fd4>] get_page_from_freelist+0x544/0x6c0
[<ffffffff810f98c9>] __alloc_pages_nodemask+0xd9/0x180
[<ffffffff81131822>] kmalloc_large_node+0x62/0xb0
[<ffffffff81135df9>] __kmalloc_node_track_caller+0x109/0x160
[<ffffffff814665f0>] __alloc_skb+0x80/0x190
[<ffffffff81462504>] sock_alloc_send_pskb+0x1c4/0x320
[<ffffffff81462675>] sock_alloc_send_skb+0x15/0x20
[<ffffffff814f54d5>] unix_stream_sendmsg+0x275/0x3e0
[<ffffffff8145f5ab>] sock_sendmsg+0x10b/0x140
[<ffffffff8145f765>] sys_sendto+0x125/0x180
[<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
------------------------------------------------------------------

I also looked at "whats happening" with strace, where you see traffic like:
------------------------------------------------------------------
....
[ 7fc153d925b2] sendto(8, "\tt\tt\nd\0\0\0\0378241003\t974369\tt\t\\N\t10"..., 8192, 0, NULL, 0) = 8192
[ 7fc153d83460] read(3, "\0\0\0\0\0\0\0\0\0\0\4\0\260\1\260\t\0 \4 \0\0\0\0\310\237d\0\220\237d\0"..., 8192) = 8192
[ 7fc153d83460] read(3, "\0\0\0\0\0\0\0\0\0\0\4\0\260\1\260\t\0 \4 \0\0\0\0\310\237d\0\220\237d\0"..., 8192) = 8192
[ 7fc153d925b2] sendto(8, "0\tt\tt\nd\0\0\0\0378241003\t830278\tt\t\\N\t1"..., 8192, 0, NULL, 0) = 8192
[ 7fc153d83460] read(3, "\0\0\0\0\0\0\0\0\0\0\4\0\260\1\260\t\0 \4 \0\0\0\0\310\237d\0\220\237d\0"..., 8192) = 8192
[ 7fc153d83460] read(3, "\0\0\0\0\0\0\0\0\0\0\4\0\260\1\260\t\0 \4 \0\0\0\0\310\237d\0\220\237d\0"..., 8192) = 8192
[ 7fc153d925b2] sendto(8, "\tt\tt\nd\0\0\0 8241006\t1114684\tt\t\\N\t1"..., 8192, 0, NULL, 0) = 8192
[ 7fc153d83460] read(3, "\0\0\0\0\0\0\0\0\0\0\4\0\260\1\260\t\0 \4 \0\0\0\0\310\237d\0\220\237d\0"..., 8192) = 8192
[ 7fc153d83460] read(3, "\0\0\0\0\0\0\0\0\0\0\4\0\260\1\260\t\0 \4 \0\0\0\0\310\237d\0\220\237d\0"..., 8192) = 8192
[ 7fc153d83460] read(3, "\0\0\0\0\0\0\0\0\0\0\4\0\260\1\260\t\0 \4 \0\0\0\0\310\237d\0\220\237d\0"..., 8192) = 8192
....
------------------------------------------------------------------

But still no definitive clue about the reasons. What is also quite interesting is that when I start my COPY-to-STDOUT experiment, it is running quite fast in the beginning. Sometimes up to 400 MBytes, sometimes up to 1.4 GBytes (I couldn't find anything this correlates with) ... and then, suddenly, it begins to stall. From there on, it only advances slowly, with all the congestion_wait going on ... Hmm, maybe it really has something to do with the state of the memory ... (this would go well with the fact that a freshly rebooted server does not have the problem in the beginning).

Ah, well, and here is what speaks against the file system / drive argument: in my initial tests, where I did a COPY-to-FILE, I never had problems (only COPY-to-STDOUT gives them).

Well, I will try to gather some more information with your other hints (quite a lot of stuff - and one learns a lot of new tricks in the process :-) ...

Andras Fabian



Craig Ringer
2010-07-13 10:17:12 UTC
On 13/07/10 17:18, Andras Fabian wrote:

> But still no definitive clue about the reasons. What is also quite interesting is, that when I start my COPY-to-STDOUT experiment, it is running quite fast in the beginning. Sometimes up to 400 Mbytes, sometimes up to 1.4 GBytes (I didn't find a real reason which I could be correlated to this) ... and then, suddenly it begins to stall. From there on, it only advances slowly with all the congestion_wait going on ... Hmm, maybe it has really something to do with the state of the memory ... (this would go well with the fact, that a freshly rebooted server is not having the problem in the beginning).

Is there any chance something is buffering the output of the COPY ... TO
STDOUT? Whether it's a pipe/socket buffer that grows to a certain limit,
then begins to be limited by the ability of the receiver to empty it, or
whether it's the receiver buffering the whole COPY in memory before
writing it out, I don't know.

Do you see anything growing in memory use on the system while the COPY
is in progress? Any of the memory use stats in "free -m" changing?
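e.g. a crude logging loop alongside the COPY (a sketch; "watch -n2 free -m" does the same interactively):

```shell
# Print a timestamped memory summary every 2 seconds; diffing successive
# samples shows whether "free" is draining into page cache or into a process.
for i in 1 2 3; do
    date
    free -m 2>/dev/null || echo "free not available"
    sleep 2
done
```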

--
Craig Ringer

Andras Fabian
2010-07-13 10:26:28 UTC
Wait, now, here I see some correlation! Yes, it seems to be the memory! When I start my COPY-to-STDOUT experiment I had some 2000 MByte free (well, the server has 24 GByte ... maybe other PostgreSQL processes used up the rest). Then I could monitor via "ll -h" how the file nicely grew (obviously no congestion), and in parallel see in "free -m" how the "free" memory went down. Then it reached a level below 192 MByte, and congestion began. Now it is going back and forth around 118-122-130 ... Obviously the STDOUT thing ran out of some memory resource.
Now I "only" need to find out what is running out, why, and how I can prevent that. Could there be some extremely big STDOUT buffering in play?

I will need to do some more investigation. And thanks for your very good pointers which help me to narrow the focus down!

Andras Fabian

Craig Ringer
2010-07-13 10:51:18 UTC
On 13/07/2010 6:26 PM, Andras Fabian wrote:
> Wait, now, here I see some correlation! Yes, it seems to be the memory! When I start my COPY-to-STDOUT experiment I had some 2000 MByte free (well ,the server has 24 GByte ... maybe other PostgreSQL processes used up the rest). Then, I could monitor via "ll -h" how the file nicely growed (obviously no congestion), and in parallel see, how "free -m" the "free" memory went down. Then, it reached a level below 192 MByte, and congestion began. Now it is going back and forth around 118-122-130 ... Obviously the STDOUT thing went out of some memory resources.
> Now I "only" who and why is running out, and how I can prevent that.

> Could there be some extremely big STDOUT buffering in play ????

Remember, "STDOUT" is misleading. The data is sent down the network
socket between the postgres backend and the client connected to that
backend. There is no actual stdio involved at all.

Imagine that the backend's stdout is redirected down the network socket
to the client, so when it sends to "stdout" it's just going to the
client. Any buffering you are interested in is in the unix or tcp/ip
socket (depending on how you're connecting), in the client, and in the
client's output to file/disk/whatever.

--
Craig Ringer

Andras Fabian
2010-07-13 10:57:03 UTC
OK, so here I should - maybe - look around the sockets. Hmm. Well, in the case of my experiments we are talking about Unix sockets, as I am only connecting locally to the server (no real networking involved). Are there any ideas where such a Unix socket could impose such extreme buffering? And can/could I control its behavior at all? (Or would it be a thing which can only be controlled from C code ... which would fall back to PostgreSQL as the initiator.)
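As a first step, the kernel-wide socket buffer defaults can at least be inspected (a sketch; as far as I know, AF_UNIX stream sockets take their send-buffer size from net.core.wmem_default unless the program overrides it with SO_SNDBUF, but whether these limits are what actually throttles things here is exactly the open question):

```shell
# Show the system-wide default and maximum socket buffer sizes, in bytes.
for f in wmem_default wmem_max rmem_default rmem_max; do
    printf '%s = ' "$f"
    cat /proc/sys/net/core/"$f" 2>/dev/null || echo "unavailable"
done
```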

Andras Fabian

Craig Ringer
2010-07-13 12:11:26 UTC
On 13/07/10 18:57, Andras Fabian wrote:
> OK, so here I should - maybe - look around the sockets. Hmm. Well, in the case of my experiments we are talking about Unix sockets, as I am only connecting locally to the server (not real networking involved). Are there any ideas, where such a Unix Socket could impose such extreme buffering ??? And can/could I control its behavior at all?? (or would it be a thing, which can only be controlled from C-Code ... which would fall back to PostgreSQL as the initiator).

Confirm your issue is actually the socket between client and server first.

Are you sure the client isn't buffering the data? Or a pager like
"less"? Or terminal scrollback?

Run "top" and sort by memory use by pressing Shift-M. Run your query.
Look at the output and see if anything grows lots.

You really need to start looking at if this is memory consumption, and
what's using it. Don't get fixated on one answer until you've excluded
other possibilities.

--
Craig Ringer

Andras Fabian
2010-07-13 12:45:55 UTC
Hi Craig,

Now I did the test with top too. Free RAM is around 900. And there don't seem to be other processes eating memory away (looked at it with top/htop). The other procs having more RAM did have it before (mostly some postgres processes), and don't grow it at an exorbitant rate. One could see with free that most of the time, when memory is eaten away, it goes directly to cache. I also observe that the lower "free" is, the slower COPY-to-STDOUT becomes ... and, linearly, the number of times one can see congestion_wait in PS grows. So, the more "free" available (even when it is not eaten away anymore ...), the more responsive COPY-to-STDOUT.

Well, I have now also rerun a few times my plain COPY-to-FILE test. That one seems to behave similarly with the RAM (when monitored via "free -m") ... but it never gets into congestion. So, maybe, just maybe ... could this be some kind of race condition? I mean, the file cache going for RAM (or the memory mapped file ???) and the SOCKET going for some memory too, and then they somehow overwhelm the memory manager (which shouldn't happen). ... Well, this might be a very wild speculation, but still.

I now also looked at the stack before the congestion_wait. There are some "interesting" calls leading to it, like "shrink_list" and "zone_reclaim" ... after the call of "get_page_from_freelist". Sounds like some heavy memory juggling going on (to get new free pages) ... whatever the reason for this is.

Andras Fabian


Andras Fabian
2010-07-13 14:16:23 UTC
I think I have found the solution. Yes, I now can get consistently high throughput with COPY-to-STDOUT, even if free -m only shows me 82 MBytes (so no, this solution is not cleaning the cache). Always around 2 3/4 minutes.

I have compared all the /proc/sys/vm settings on my new machines and the old machines (which never had problems), and of course found some differences, some new settings etc. (of course, a lot can change between 2.6.26 and 2.6.32). And there was one which stood out from the rest, because its name reminded me of some functions which I have always seen in the kernel stack while having congestion_wait:

- zone_reclaim_mode
(yes, in the kernel stack there was always also a call to "zone_reclaim").

Interestingly, on the old machine this was set to "0" and on the new machine - obviously per Ubuntu default - to "1" ... What this all means is briefly described here:

http://www.linuxinsight.com/proc_sys_vm_zone_reclaim_mode.html

Then I thought, let's give it a try. And I set it to "0" on the new server too ... and voila, COPY-to-STDOUT is running at high speed. I can even switch back and forth between 0 and 1 and watch how congestion_wait comes back or disappears.

Now, someone with deep kernel know-how could perhaps explain to me in detail what exactly is at odds here.

But for me it is now obvious that I will put "zone_reclaim_mode = 0" into my startup settings.
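For reference, applying and persisting that looks roughly like this (needs root; the key only exists on kernels built with NUMA support):

```shell
# Apply immediately (lost on reboot):
sysctl -w vm.zone_reclaim_mode=0
# ... or equivalently:
# echo 0 > /proc/sys/vm/zone_reclaim_mode

# Persist across reboots:
echo "vm.zone_reclaim_mode = 0" >> /etc/sysctl.conf

# Verify the current value:
cat /proc/sys/vm/zone_reclaim_mode
```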

And tomorrow I will see how my nightly backup runs with this setting.

Andras Fabian




Craig Ringer
2010-07-14 03:10:37 UTC
On 13/07/10 22:16, Andras Fabian wrote:
> I think I have found the solution. Yes, I now can get constantly high throughput with COPY-to-STDOUT, even if free -m only shows me 82 Mbytes (so no, this solution is not cleaning the cache). Always around 2 3/4 minutes.
>
> I have compared all the /proc/sys/vm settings on my new machines and the old machines (which never had problems), and of course found some differences, some new settings etc. (of course, lot of changes can happen between 2.6.26 and 2.6.32). And there was one, which stood out from the mass, because its name reminded me of some functions which I have always seen in the kernel stack while having congestion_wait:
>
> - zone_reclaim_mode
> (yes, in the kernel stack there was always also a call to "zone_reclaim").
>
> Interestingly, on the old machine this was set to "0" and on the new machine - obviously per Ubuntu default - to "1" ... What these all means is shortly described here:
>
> http://www.linuxinsight.com/proc_sys_vm_zone_reclaim_mode.html
>
> Then I though, lets give it a try. Nd I set it to "0" on the new server too ... and voila, it is running at high speed in COPY-to-STDOUT. I can even switch back and forth between 0 and 1 and see, how congestion_wait comes back or disappears.
>
> Now, someone with big kernel know-how could try to describe me in detail, what exactly could be at odds here.
>
> But for me it is now obvious, that I will put in my start up settings a change of "zone_reclaim_mode = 0".
>
> And tomorrow I will see how my nightly backup runs with this setting.

It sounds like it's time for a post to the Linux Kernel Mailing List,
and/or a Launchpad bug against the Ubuntu kernel.

Make sure to have your asbestos underwear on if posting to LKML ;-)

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Scott Marlowe
2010-07-14 03:32:37 UTC
On Tue, Jul 13, 2010 at 9:10 PM, Craig Ringer
<***@postnewspapers.com.au> wrote:
>> And tomorrow I will see how my nightly backup runs with this setting.
>
> It sounds like it's time for a post to the Linux Kernel Mailing List,
> and/or a Launchpad bug against the Ubuntu kernel.
>
> Make sure to have your asbestos undewear on if posting to LKML ;-)

Note that I am testing a new 48 core machine with gobs of ram under
ubuntu 10.04, so I'll do some testing to see if I get the same kind of
issue, especially once I subscribe one of these machines to the slony
farm and get some action going on it to fill its buffers.

Note that the page posted earlier that described zone_reclaim_mode
said that when the memory is mostly used for file caching it's a good
idea to set it to 0. Even though they said that they were talking
about file servers, I'd assume that applies to pgsql as well, since
it's normal to use most of the extra memory as file system cache.

Andras Fabian
2010-07-14 06:39:39 UTC
Hi guys,

I just wanted to tell you that the original problem - namely that backups (with pg_dump) were much slower on the new hardware - IS now solved. After setting "zone_reclaim_mode = 0", my backups last night were fast as expected ... for my 100 GByte DB it now took only 1 hour (instead of 6 1/2 with "zone_reclaim_mode = 1").
And again, thank you all for the good tips (especially about the tools to analyze the issue - I have learned a lot in the process) which helped me to pin the problem down!

I will consider posting it to LKML (as time permits) or to Ubuntu ...

By the way, would it maybe make sense to put this information into the PostgreSQL documentation ... as a kind of cautionary warning? It could make life easier for some admins in the future (especially if Ubuntu or other distributions ship different settings here).

Andras Fabian

-----Ursprüngliche Nachricht-----
Von: Scott Marlowe [mailto:***@gmail.com]
Gesendet: Mittwoch, 14. Juli 2010 05:33
An: Craig Ringer
Cc: Andras Fabian; pgsql-***@postgresql.org
Betreff: Re: [GENERAL] PG_DUMP very slow because of STDOUT ??

On Tue, Jul 13, 2010 at 9:10 PM, Craig Ringer
<***@postnewspapers.com.au> wrote:
>> And tomorrow I will see how my nightly backup runs with this setting.
>
> It sounds like it's time for a post to the Linux Kernel Mailing List,
> and/or a Launchpad bug against the Ubuntu kernel.
>
> Make sure to have your asbestos undewear on if posting to LKML ;-)

Note that I am testing a new 48-core machine with gobs of RAM under
Ubuntu 10.04, so I'll do some testing to see if I get the same kind of
issue, especially once I subscribe one of these machines to the Slony
farm and get some action going on it to fill its buffers.

Note that the page posted earlier that described zone_reclaim_mode
said that when the memory is mostly used for file caching it's a good
idea to set it to 0. Even though they said that they were talking
about file servers, I'd assume that applies to pgsql as well, since
it's normal to use most of the extra memory as file system cache.

Andras Fabian
2010-07-14 07:18:49 UTC
Permalink
Well, I have searched a bit more about the zone_reclaim_mode thing, and obviously I am not the first one having problems with it. It seems some logic was introduced into the kernel which decided that our machine must be a NUMA architecture (which it is not), and as a consequence switched zone_reclaim_mode to 1. But now some people have discovered that this makes no sense on some modern processors, which - although they look like NUMA to software - are not really NUMA. And thus they want to go back to setting zone_reclaim_mode = 0 by default.

Here is a lengthy discussion about the topic (so no need for me to start it again at LKML - and obviously it is not Ubuntu-specific :-)

http://lkml.org/lkml/2009/5/12/586

(and yes, our hardware has Core i7 CPUs, which seem to make many kernels think the machine is NUMA - this comes up often in the LKML thread).
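[A hedged sketch of how to check whether a given box is affected, by asking the kernel what it decided. These are the standard Linux /proc and /sys paths; they may be absent in containers or on kernels built without NUMA support:]

```shell
#!/bin/sh
# Show the current zone_reclaim_mode, if this kernel exposes it.
f=/proc/sys/vm/zone_reclaim_mode
if [ -r "$f" ]; then
    echo "zone_reclaim_mode = $(cat "$f")"
else
    echo "zone_reclaim_mode not exposed by this kernel"
fi

# Count the NUMA nodes the kernel has detected; more than one node
# means the kernel is treating the machine as NUMA.
nodes=$(ls -d /sys/devices/system/node/node* 2>/dev/null | wc -l)
echo "NUMA nodes detected: $nodes"
```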

Andras Fabian

-----Ursprüngliche Nachricht-----
Von: Craig Ringer [mailto:***@postnewspapers.com.au]
Gesendet: Mittwoch, 14. Juli 2010 05:11
An: Andras Fabian
Cc: pgsql-***@postgresql.org
Betreff: Re: AW: AW: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

On 13/07/10 22:16, Andras Fabian wrote:
> I think I have found the solution. Yes, I can now get consistently high throughput with COPY-to-STDOUT, even if free -m only shows me 82 MBytes (so no, this solution is not the cache-cleaning one). Always around 2 3/4 minutes.
>
> I have compared all the /proc/sys/vm settings on my new machines and the old machines (which never had problems), and of course found some differences, some new settings etc. (of course, lot of changes can happen between 2.6.26 and 2.6.32). And there was one, which stood out from the mass, because its name reminded me of some functions which I have always seen in the kernel stack while having congestion_wait:
>
> - zone_reclaim_mode
> (yes, in the kernel stack there was always also a call to "zone_reclaim").
>
> Interestingly, on the old machine this was set to "0" and on the new machine - obviously per Ubuntu default - to "1" ... What these all means is shortly described here:
>
> http://www.linuxinsight.com/proc_sys_vm_zone_reclaim_mode.html
>
> Then I thought, let's give it a try. And I set it to "0" on the new server too ... and voila, it is running at high speed in COPY-to-STDOUT. I can even switch back and forth between 0 and 1 and watch congestion_wait come back or disappear.
>
> Now, someone with big kernel know-how could perhaps describe to me in detail what exactly is at odds here.
>
> But for me it is now obvious, that I will put in my start up settings a change of "zone_reclaim_mode = 0".
>
> And tomorrow I will see how my nightly backup runs with this setting.

It sounds like it's time for a post to the Linux Kernel Mailing List,
and/or a Launchpad bug against the Ubuntu kernel.

Make sure to have your asbestos underwear on if posting to LKML ;-)

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Greg Smith
2010-07-14 13:10:14 UTC
Permalink
Andras Fabian wrote:
> Here is a lengthy discussion about the topic (so no need for me to start it again at LKML - and obviously it is not Ubuntu specific :-)
> http://lkml.org/lkml/2009/5/12/586

Once there's some sort of resolution there, you might want to create an
Ubuntu bug report anyway and push for the fix to be backported to the
LTS version you're running. The response you get to that should be an
interesting study in how Ubuntu treats bug fixes for your company.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
***@2ndQuadrant.com www.2ndQuadrant.us


Scott Marlowe
2010-07-15 05:18:35 UTC
Permalink
On Tue, Jul 13, 2010 at 8:16 AM, Andras Fabian <***@atrada.net> wrote:
> I think I have found the solution.


> - zone_reclaim_mode
> (yes, in the kernel stack there was always also a call to "zone_reclaim").

Thanks so much for this. I too just got bit by the zone_reclaim_mode
slowing my file transfers to a crawl. I can't believe that option is
enabled by default. It appears to be a fabulously capable foot-gun.

Andras Fabian
2010-07-13 11:29:02 UTC
Permalink
Now I have found an unorthodox way to make a slow machine run COPY-to-STDOUT fast. I empty the server's cache memory, which makes "free" in "free -m" jump up to 14 GBytes (well, I just noticed that most of the memory on the server is in "cache" ... up to 22 GBytes). I just entered:

" sync;echo 3 > /proc/sys/vm/drop_caches"

Running the COPY-to-STDOUT test after this immediately went through in a snap (2 1/2 minutes). I also see that something related to the file is "nicely" mapped into cache memory, because as soon as I delete the file (with "rm"), that immediately frees up 3 GBytes of the cache.

This seems to prove that a memory issue is/was behind the slowdown. But still the question remains: why and how can this happen? I mean, at some point the memory manager must have made a very wrong decision, if this is the result of its "normal" work ... And how writing through the socket affects this, I don't understand (because I still see the case where a normal COPY-to-FILE didn't slow down at the same time the COPY-to-STDOUT was crawling).

So, potentially, maybe ... as a quick fix I could clean the caches in the backup script that starts each night. But is this a safe idea at all? Or could there be adverse implications (yes, of course, some queries that got their data from the file cache before would now need to repopulate it)?

Or is there a way to influence the Linux memory manager so that it behaves a bit more conservatively (or just differently, in a positive way)?
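[A defensive sketch of that quick fix, under the assumption that the nightly script runs as root just before pg_dump. The pg_dump line and its target path are hypothetical placeholders. Dropping caches loses no data - sync flushes dirty pages first - but warm caches have to be repopulated from disk afterwards:]

```shell
#!/bin/sh
# Optionally drop the page cache before the nightly dump.
drop_caches() {
    sync                                # flush dirty pages to disk first
    if [ -w /proc/sys/vm/drop_caches ]; then
        # Free pagecache, dentries, and inodes.
        echo 3 > /proc/sys/vm/drop_caches 2>/dev/null \
            || echo "drop_caches write failed" >&2
    else
        echo "not root, skipping drop_caches" >&2
    fi
    return 0
}

drop_caches
# pg_dump -Fc -Z9 -f /backup/nightly.dump mydb   # hypothetical path and DB name
```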

Andras Fabian


-----Ursprüngliche Nachricht-----
Von: Craig Ringer [mailto:***@postnewspapers.com.au]
Gesendet: Dienstag, 13. Juli 2010 12:51
An: Andras Fabian
Cc: pgsql-***@postgresql.org
Betreff: Re: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

On 13/07/2010 6:26 PM, Andras Fabian wrote:
> Wait, now, here I see some correlation! Yes, it seems to be the memory! When I start my COPY-to-STDOUT experiment I had some 2000 MByte free (well, the server has 24 GByte ... maybe other PostgreSQL processes used up the rest). Then I could monitor via "ll -h" how the file nicely grew (obviously no congestion), and in parallel see with "free -m" how the "free" memory went down. Then it reached a level below 192 MByte, and congestion began. Now it is going back and forth around 118-122-130 ... Obviously the STDOUT thing ran out of some memory resource.
> Now I "only" need to find out who and why is running out, and how I can prevent that.

> Could there be some extremely big STDOUT buffering in play ????

Remember, "STDOUT" is misleading. The data is sent down the network
socket between the postgres backend and the client connected to that
backend. There is no actual stdio involved at all.

Imagine that the backend's stdout is redirected down the network socket
to the client, so when it sends to "stdout" it's just going to the
client. Any buffering you are interested in is in the unix or tcp/ip
socket (depending on how you're connecting), in the client, and in the
client's output to file/disk/whatever.

--
Craig Ringer

Steve Clark
2010-07-13 11:36:36 UTC
Permalink
On 07/13/2010 07:29 AM, Andras Fabian wrote:
> Now I have found an unorthodox way to make a slow machine run COPY-to-STDOUT fast. I empty the server's cache memory, which makes "free" in "free -m" jump up to 14 GBytes (well, I just noticed that most of the memory on the server is in "cache" ... up to 22 GBytes). I just entered:
>
> "sync; echo 3 > /proc/sys/vm/drop_caches"
>
> Running the COPY-to-STDOUT test after this immediately went through in a snap (2 1/2 minutes). I also see that something related to the file is "nicely" mapped into cache memory, because as soon as I delete the file (with "rm"), that immediately frees up 3 GBytes of the cache.
>
> This seems to prove that a memory issue is/was behind the slowdown. But still the question remains: why and how can this happen? I mean, at some point the memory manager must have made a very wrong decision, if this is the result of its "normal" work ... And how writing through the socket affects this, I don't understand (because I still see the case where a normal COPY-to-FILE didn't slow down at the same time the COPY-to-STDOUT was crawling).
>
> So, potentially, maybe ... as a quick fix I could clean the caches in the backup script that starts each night. But is this a safe idea at all? Or could there be adverse implications (yes, of course, some queries that got their data from the file cache before would now need to repopulate it)?
>
> Or is there a way to influence the Linux memory manager so that it behaves a bit more conservatively (or just differently, in a positive way)?
>
> Andras Fabian
>
>
> -----Ursprüngliche Nachricht-----
> Von: Craig Ringer [mailto:***@postnewspapers.com.au]
> Gesendet: Dienstag, 13. Juli 2010 12:51
> An: Andras Fabian
> Cc: pgsql-***@postgresql.org
> Betreff: Re: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??
>
> On 13/07/2010 6:26 PM, Andras Fabian wrote:
>> Wait, now, here I see some correlation! Yes, it seems to be the memory! When I start my COPY-to-STDOUT experiment I had some 2000 MByte free (well, the server has 24 GByte ... maybe other PostgreSQL processes used up the rest). Then I could monitor via "ll -h" how the file nicely grew (obviously no congestion), and in parallel see with "free -m" how the "free" memory went down. Then it reached a level below 192 MByte, and congestion began. Now it is going back and forth around 118-122-130 ... Obviously the STDOUT thing ran out of some memory resource.
>> Now I "only" need to find out who and why is running out, and how I can prevent that.
>
> > Could there be some extremely big STDOUT buffering in play ????
>
> Remember, "STDOUT" is misleading. The data is sent down the network
> socket between the postgres backend and the client connected to that
> backend. There is no actual stdio involved at all.
>
> Imagine that the backend's stdout is redirected down the network socket
> to the client, so when it sends to "stdout" it's just going to the
> client. Any buffering you are interested in is in the unix or tcp/ip
> socket (depending on how you're connecting), in the client, and in the
> client's output to file/disk/whatever.
>
> --
> Craig Ringer
>
Have you posted this problem/issue to the Linux kernel mailing list? You may get some rude responses about
the posting, but someone will give you a tip about what is causing the problem.

I know there are some sysctls that affect memory management. I use the following two, based on a recommendation
of Linus, to help interactivity for a desktop user.
/sbin/sysctl vm.dirty_background_ratio=3
/sbin/sysctl vm.dirty_ratio=5
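[If those help, they can be pinned across reboots as well; a sketch of the corresponding /etc/sysctl.conf fragment (the values are Steve's, the file placement is the usual Linux convention):]

```shell
# /etc/sysctl.conf fragment -- start background writeback earlier and
# throttle writers sooner, at the cost of more frequent small writes:
vm.dirty_background_ratio = 3
vm.dirty_ratio = 5
```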




--
Stephen Clark
NetWolves
Sr. Software Engineer III
Phone: 813-579-3200
Fax: 813-882-0209
Email: ***@netwolves.com
www.netwolves.com


Andras Fabian
2010-07-13 11:35:19 UTC
Permalink
I have just rechecked one of our old-generation machines, which never had/has this problem (there the backup of a 100 GB database - to a 10 GByte dump - still goes through in about 2 hours). They seem to have this high caching ratio too (one of the machines says it has 15 GByte in cache out of 16) ... but still, they manage to gracefully give back RAM to whoever needs it (i.e. the backup process with its COPY-to-STDOUT, or others). Well, the difference is of course 6 kernel versions (old machine 2.6.26 ... new machine 2.6.32) ... or who knows what kernel settings ...

Andras Fabian


-----Ursprüngliche Nachricht-----
Von: Craig Ringer [mailto:***@postnewspapers.com.au]
Gesendet: Dienstag, 13. Juli 2010 12:51
An: Andras Fabian
Cc: pgsql-***@postgresql.org
Betreff: Re: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

On 13/07/2010 6:26 PM, Andras Fabian wrote:
> Wait, now, here I see some correlation! Yes, it seems to be the memory! When I start my COPY-to-STDOUT experiment I had some 2000 MByte free (well, the server has 24 GByte ... maybe other PostgreSQL processes used up the rest). Then I could monitor via "ll -h" how the file nicely grew (obviously no congestion), and in parallel see with "free -m" how the "free" memory went down. Then it reached a level below 192 MByte, and congestion began. Now it is going back and forth around 118-122-130 ... Obviously the STDOUT thing ran out of some memory resource.
> Now I "only" need to find out who and why is running out, and how I can prevent that.

> Could there be some extremely big STDOUT buffering in play ????

Remember, "STDOUT" is misleading. The data is sent down the network
socket between the postgres backend and the client connected to that
backend. There is no actual stdio involved at all.

Imagine that the backend's stdout is redirected down the network socket
to the client, so when it sends to "stdout" it's just going to the
client. Any buffering you are interested in is in the unix or tcp/ip
socket (depending on how you're connecting), in the client, and in the
client's output to file/disk/whatever.

--
Craig Ringer

Andras Fabian
2010-07-13 12:07:21 UTC
Permalink
Hmm, I continued some testing. Now, strangely, the congestion_wait occurs even though free -m shows me about 1500 MBytes (before that I tried to "fill up" the cache by doing some plain "cat from_here > to_there" ... which pushed free down to 1500). But, interestingly, the value also doesn't get lower. So it stays around 1500 ... seemingly the same scenario as before, when it stopped around 120-130 ... So, obviously still some struggling for memory, but not as hard as when it is completely down. Now the file grows quite steadily ... and congestion_wait is not always present in PS ... (finished after 6 1/2 minutes instead of the optimum of around 2 1/2).

Something very fishy about the memory manager ....

Andras Fabian

-----Ursprüngliche Nachricht-----
Von: Andras Fabian
Gesendet: Dienstag, 13. Juli 2010 13:35
An: 'Craig Ringer'
Cc: pgsql-***@postgresql.org
Betreff: AW: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

I have just rechecked one of our old-generation machines, which never had/has this problem (there the backup of a 100 GB database - to a 10 GByte dump - still goes through in about 2 hours). They seem to have this high caching ratio too (one of the machines says it has 15 GByte in cache out of 16) ... but still, they manage to gracefully give back RAM to whoever needs it (i.e. the backup process with its COPY-to-STDOUT, or others). Well, the difference is of course 6 kernel versions (old machine 2.6.26 ... new machine 2.6.32) ... or who knows what kernel settings ...

Andras Fabian


-----Ursprüngliche Nachricht-----
Von: Craig Ringer [mailto:***@postnewspapers.com.au]
Gesendet: Dienstag, 13. Juli 2010 12:51
An: Andras Fabian
Cc: pgsql-***@postgresql.org
Betreff: Re: AW: AW: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

On 13/07/2010 6:26 PM, Andras Fabian wrote:
> Wait, now, here I see some correlation! Yes, it seems to be the memory! When I start my COPY-to-STDOUT experiment I had some 2000 MByte free (well, the server has 24 GByte ... maybe other PostgreSQL processes used up the rest). Then I could monitor via "ll -h" how the file nicely grew (obviously no congestion), and in parallel see with "free -m" how the "free" memory went down. Then it reached a level below 192 MByte, and congestion began. Now it is going back and forth around 118-122-130 ... Obviously the STDOUT thing ran out of some memory resource.
> Now I "only" need to find out who and why is running out, and how I can prevent that.

> Could there be some extremely big STDOUT buffering in play ????

Remember, "STDOUT" is misleading. The data is sent down the network
socket between the postgres backend and the client connected to that
backend. There is no actual stdio involved at all.

Imagine that the backend's stdout is redirected down the network socket
to the client, so when it sends to "stdout" it's just going to the
client. Any buffering you are interested in is in the unix or tcp/ip
socket (depending on how you're connecting), in the client, and in the
client's output to file/disk/whatever.

--
Craig Ringer

A***@globaldatapoint.com
2010-07-13 12:08:59 UTC
Permalink
Hi All,

We are a data-based company and are migrating from Oracle to PostgreSQL; I am doing a POC for this. We have a business requirement to send the data in XML files to our clients. The XML file size is around 700MB and growing.

I have been able to generate sample XML files and store them (still to test with full files).

However I have the following problems

1) It took 35 minutes to store 1 GB of data in the TEXT column
(create table test_text_nearly_1_GB as select repeat('1', 1024*1024*1023)::text as nearly_1_GB)

2) It took 30 minutes to write the same to the OS using the COPY command
Copy test_text_nearly_1_GB to '$HOME/test_text_nearly_1_GB.xml'

However, when I tested writing a table of size 3GB with 150000 rows, it wrote to the OS in 20 minutes.

I am looking for a solution to
1) generate and write the 1GB file to the OS in less than 10 minutes
2) write the results of a query to a file on the OS
3) handle the case when the file size becomes > 1GB

Please help.


Regards,
Atul Goel
This e-mail is confidential and should not be used by anyone who is not the original intended recipient. Global DataPoint Limited does not accept liability for any statements made which are clearly the sender's own and not expressly made on behalf of Global DataPoint Limited. No contracts may be concluded on behalf of Global DataPoint Limited by means of e-mail communication. Global DataPoint Limited Registered in England and Wales with registered number 3739752 Registered Office Middlesex House, 34-42 Cleveland Street, London W1T 4LB

Craig Ringer
2010-07-13 12:26:10 UTC
Permalink
It's helpful if you can avoid replying to an existing post when starting a new
thread. Your mail client puts "In-reply-to" headers in the message, which
confuses threaded mail clients.

I've replied to this message properly in a new thread to avoid confusing
things too much.

--
Craig Ringer

A***@globaldatapoint.com
2010-07-13 12:43:02 UTC
Permalink
Sure I will take care.

Regards,
Atul Goel


-----Original Message-----
From: Craig Ringer [mailto:***@postnewspapers.com.au]
Sent: 13 July 2010 13:26
To: Atul Goel
Cc: pgsql-***@postgresql.org
Subject: Re: [GENERAL] Writing XML files to Operating System

It's helpful if you can avoid replying to an existing post when starting a new
thread. Your mail client puts "In-reply-to" headers in the message, which
confuses threaded mail clients.

I've replied to this message properly in a new thread to avoid confusing
things too much.

--
Craig Ringer

Martin Gainty
2010-07-18 14:39:31 UTC
Permalink
Good Morning



Admittedly Oracle comes prepackaged with about 100 packaged goodies ... reading and writing XML is the most useful.


I experienced similar headaches when moving an Oracle DB to MySQL and discovered my 100 or so XML output scripts
could not be ported out of Oracle ... aggh!

The only solution I can think of: you're going to have to back all the XML routines out of the Oracle DB,

then create an XSD XML Schema Document (Apache Axis contains about 12 of these you can use as your template),
then persist the individual simple and complex entities from the XSD structure to entity beans.

Ultimately you will implement your persistence with an available J2EE server such as WebLogic, Glassfish or Tomcat (with chili beans).

When the webapp has finished processing you will want to persist your entity beans to the database and output the XML results to disk with FOP ... or, in my case, write to disk and transmit to p/l.


Looks like you've just added 12 line-items to the globaldatapoint bill!


Martin Gainty
______________________________________________





> From: ***@globaldatapoint.com
> To: pgsql-***@postgresql.org
> Subject: [GENERAL] Writing XML files to Operating System
> Date: Tue, 13 Jul 2010 12:08:59 +0000
>
> Hi All,
>
> We are a data based company and are migrating from Oracle to Postgresql. For this purpose I am doing POC for the same. We have a business requirement to send the Data in XML files to our clients. The file size of XMLs is around 700MB and is growing.
>
> I have been able to generate sample XML files and store the same(Still to test for full files).
>
> However I have the following problems
>
> 1) It took 35 minutes to store 1 GB data in the TEXT column
> (create table test_text_nearly_1_GB as select repeat('1', 1024*1024*1023):: text as nearly_1_GB)
>
> 2) It took 30 Minutes to write the same to OS using COPY command
> Copy test_text_nearly_1_GB to '$HOME/test_text_nearly_1_GB.xml'
>
> However when I tested to write a table of size 3GB with 150000 rows it wrote to OS in 20 minutes.
>
> I am looking for a solution to
> 1) generate and write the 1GB file in OS in less than 10 minutes.
> 2) write the file to OS the results of a query
> 3) Find a solution when the file size become > 1GB
>
> Please help.
>
>
> Regards,
> Atul Goel
> This e-mail is confidential and should not be used by anyone who is not the original intended recipient. Global DataPoint Limited does not accept liability for any statements made which are clearly the sender's own and not expressly made on behalf of Global DataPoint Limited. No contracts may be concluded on behalf of Global DataPoint Limited by means of e-mail communication. Global DataPoint Limited Registered in England and Wales with registered number 3739752 Registered Office Middlesex House, 34-42 Cleveland Street, London W1T 4LB
>
> --
> Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

Andras Fabian
2010-07-13 10:20:21 UTC
Permalink
I had another observation before your last mail:

I have compared the /proc/pid/sched stats of a normal and a slow machine. And there were two counters that really stuck out:
- se.sleep_start
- se.block_start
On a normal machine, both counters remained at 0 all the time while doing COPY-to-STDOUT. On a "slow" machine, those counters - but always only one of them at a time - showed some really big numbers while the process was stuck in congestion_wait.

Andras Fabian

-----Ursprüngliche Nachricht-----
Von: Craig Ringer [mailto:***@postnewspapers.com.au]
Gesendet: Dienstag, 13. Juli 2010 11:01
An: Andras Fabian
Cc: Tom Lane; pgsql-***@postgresql.org
Betreff: Re: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

On 13/07/2010 4:05 PM, Andras Fabian wrote:
> Craig, thanks for that PS tip (you think you have used PS for such a long time, but it still has some new tricks available).

> So, obviously, for some reason we are waiting too much on some backing device ... whichever it is at the moment. Because, as I just wrote to Scott Marlowe, the disk system is almost idling (I have seen disk utilization on the drive to which I write below 1%).

A quick search suggests that most calls into congestion_wait are in the
memory management subsystem, and are involved in situations where the
system is struggling for memory. However, that includes memory consumed
by I/O buffers, writeback for I/O, etc, so it'd also be consistent with
very slow I/O causing write throttling as the system tried to write
already buffered data to disk.

Most other calls are in file system drivers.

At this point I'd be taking a closer look at "vmstat 1" and "iostat 1"
output, plus "top", to see if any interesting clues about system-wide
issues turned up.

I'd also be trying to perform each step of the problem operation in
isolation as much as possible, so as to see if I could find out what
particular part was causing the slowdown. Comparing "\copy" to "COPY ...
TO STDOUT", invoking "COPY ... TO STDOUT" with a standalone backend
writing output to an on disk file and to /dev/null, etc.

> So, the question seems to be why and where this "idling" happens.

You can potentially find out more by getting a trace of the kernel
function call stack for the backend process. The kernel call stack of a
process at any given time can be obtained by reading /proc/$pid/stack .

This will tell you not only what call it's waiting in in the kernel, but
what function(s) called it, and even the code offset within each function.
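As a concrete sketch of that suggestion ($PID is whatever the dumping backend's process id is, e.g. taken from pg_stat_activity; reading another process's kernel stack usually requires root):

```shell
#!/bin/sh
# Sample the kernel-side call stack of one process a few times, so the
# function the process keeps sleeping in (e.g. congestion_wait) shows up.
PID=${1:-$$}                      # default to this shell if no pid given
for i in 1 2 3; do
    cat "/proc/$PID/stack" 2>/dev/null \
        || echo "cannot read /proc/$PID/stack (need root?)"
    echo "---"
    sleep 1
done
```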

> Just as a test, I have tried a very simple piping example (which should go trough STDOUT too ... or am I wrong).
> - "dd if=/dev/zero of=file_10GB bs=1024 count=10M" created a 10 GB test file on source drive (sdb)
> - "time cat file_10GB> /var/tmp/test2.dump" ... pipe the file to target drive (/var/tmp is on sda)

Isn't your issue suspected to be with network transfers over unix
sockets and/or tcp/ip, rather than with pipes?

Try "socat" if you want to test unix socket performance and/or tcp/ip
socket performance. It's an amazing sysadmin/network swiss army knife.

--
Craig Ringer

Greg Smith
2010-07-13 14:29:03 UTC
Permalink
Andras Fabian wrote:
> So the kernel function it is always idling on seems to be congestion_wait ...

Ugh, not that thing again. See
http://www.westnet.com/~gsmith/content/linux-pdflush.htm ; that chunk of
code has cost me weeks worth of "why isn't the kernel writing things the
way I asked it?" trouble in the past. I know the kernel developers have
been fiddling with pdflush again recently, they might have introduced
yet another bug into how it handles heavy write volume. You can reduce
dirty_ratio and dirty_background_ratio to try and improve things, but
the congestion code will thwart any attempt to make them really low.

You might monitor what shows up as "Dirty:" in /proc/meminfo to see if
that lines up with the slow periods; example of what bad output looks
like at
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html
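A minimal way to do that monitoring (sampling just a few times here; in practice one would loop for the duration of the dump and correlate the numbers with the stalls):

```shell
#!/bin/sh
# Print the kernel's current dirty and writeback totals from
# /proc/meminfo, sampled once per second.
for i in 1 2 3; do
    awk '/^(Dirty|Writeback):/ { printf "%s %s %s  ", $1, $2, $3 }
         END { print "" }' /proc/meminfo
    sleep 1
done
```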

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
***@2ndQuadrant.com www.2ndQuadrant.us


Andras Fabian
2010-07-13 14:35:16 UTC
Permalink
Hi Greg,

hmmm, that's true. Those settings, for example, were much higher too (on the Ubuntu server) than on our old machine.
The new machine has:
- dirty_ratio = 20 (old has 10)
- dirty_background_ratio = 10 (old has 5)

But obviously setting vm.zone_reclaim_mode=0 "fixes" the problem too (it was "1" on the new machine and "0" on the old). See my latest post to Craig.

I hope using vm.zone_reclaim_mode=0 doesn't have other dire consequences :-)

Andras Fabian

-----Ursprüngliche Nachricht-----
Von: Greg Smith [mailto:***@2ndquadrant.com]
Gesendet: Dienstag, 13. Juli 2010 16:29
An: Andras Fabian
Cc: Craig Ringer; Tom Lane; pgsql-***@postgresql.org
Betreff: Re: [GENERAL] PG_DUMP very slow because of STDOUT ??

Andras Fabian wrote:
> So the kernel function it is always idling on seems to be congestion_wait ...

Ugh, not that thing again. See
http://www.westnet.com/~gsmith/content/linux-pdflush.htm ; that chunk of
code has cost me weeks worth of "why isn't the kernel writing things the
way I asked it?" trouble in the past. I know the kernel developers have
been fiddling with pdflush again recently, they might have introduced
yet another bug into how it handles heavy write volume. You can reduce
dirty_ratio and dirty_background_ratio to try and improve things, but
the congestion code will thwart any attempt to make them really low.

You might monitor what shows up as "Dirty:" in /proc/meminfo to see if
that lines up with the slow periods; example of what bad output looks
like at
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
***@2ndQuadrant.com www.2ndQuadrant.us


Steve Clark
2010-07-13 14:40:05 UTC
On 07/13/2010 10:35 AM, Andras Fabian wrote:
> Hi Greg,
>
> hmmm, thats true. Thos settings for example were much higher too (on the Ubuntu server), than on our old machine.
> New machine has:
> - dirty_ratio = 20 (old has 10)
> - dirty_background_ratio = 10 (old has 5)
>
> But obviously setting vm.zone_reclaim_mode=0 "fixes" the problem to (which was "1" on new machine and "0" on old). See my latest post to Craig.
>
> I hope using vm.zone_reclaim_mode=0 doesn't have other dire consequences :-)
It looks to me that vm.zone_reclaim_mode value is related to NUMA machines that
have "local" memory per node and shouldn't be used at all in your environment.

>
> Andras Fabian
>
> -----Ursprüngliche Nachricht-----
> Von: Greg Smith [mailto:***@2ndquadrant.com]
> Gesendet: Dienstag, 13. Juli 2010 16:29
> An: Andras Fabian
> Cc: Craig Ringer; Tom Lane; pgsql-***@postgresql.org
> Betreff: Re: [GENERAL] PG_DUMP very slow because of STDOUT ??
>
> Andras Fabian wrote:
>> So the kernel function it is always idling on seems to be congestion_wait ...
>
> Ugh, not that thing again. See
> http://www.westnet.com/~gsmith/content/linux-pdflush.htm ; that chunk of
> code has cost me weeks worth of "why isn't the kernel writing things the
> way I asked it?" trouble in the past. I know the kernel developers have
> been fiddling with pdflush again recently, they might have introduced
> yet another bug into how it handles heavy write volume. You can reduce
> dirty_ratio and dirty_background_ratio to try and improve things, but
> the congestion code will thwart any attempt to make them really low.
>
> You might monitor what shows up as "Dirty:" in /proc/meminfo to see if
> that lines up with the slow periods; example of what bad output looks
> like at
> http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html
>


--
Stephen Clark
NetWolves
Sr. Software Engineer III
Phone: 813-579-3200
Fax: 813-882-0209
Email: ***@netwolves.com
www.netwolves.com


Craig Ringer
2010-07-13 12:52:37 UTC
[ Er, oops. I mucked up replying on a new thread - ugh! Sorry, should've
left well alone. ]

Atul Goel wrote:

> > We are a data based company and are migrating from Oracle to
> > Postgresql. For this purpose I am doing POC for the same. We have a
> > business requirement to send the Data in XML files to our clients.
> > The file size of XMLs is around 700MB and is growing.
> >
> > I have been able to generate sample XML files and store the
> > same(Still to test for full files).
> >
> > However I have the following problems
> >
> > 1) It took 35 minutes to store 1 GB data in the TEXT column
> > (create table test_text_nearly_1_GB as select repeat('1',
> > 1024*1024*1023):: text as nearly_1_GB)

That's a fairly inefficient way to generate the data to insert. You'll
probably find that much of the time was spent in repeat() not anywhere
interesting to your query.

I'd be more interested in how long COPY ... FROM took, personally.

> > 2) It took 30 Minutes to write the same to OS using COPY command
> > Copy test_text_nearly_1_GB to '$HOME/test_text_nearly_1_GB.xml'

That's a write rate of 34MB/min, or half a meg a second. Not pretty.
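The arithmetic behind that figure, spelled out (assuming roughly 1 GB written in 30 minutes):

```shell
# 1 GB in 30 minutes:
echo "$((1024 / 30)) MB/min"                      # integer megabytes per minute
awk 'BEGIN { printf "%.2f MB/s\n", 1024/30/60 }'  # same rate per second
```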

Where's the load during the COPY? Mostly CPU? Or mostly disk I/O?

Are you writing the output to the same disk the database is on? (Not
that it should make this much difference).

> > 3) Find a solution when the file size become > 1GB
That's going to be interesting.

Do you really need to store whole XML documents this size in the
database, rather than broken up into structures that can be worked with
usefully in the database? If so, PostgreSQL might not be your best choice.

--
Craig Ringer

A***@globaldatapoint.com
2010-07-13 12:54:54 UTC
> Atul Goel wrote:
>
> > > We are a data based company and are migrating from Oracle to
> > > Postgresql. For this purpose I am doing POC for the same. We have a
> > > business requirement to send the Data in XML files to our clients.
> > > The file size of XMLs is around 700MB and is growing.
> > >
> > > I have been able to generate sample XML files and store the
> > > same (still to test for full files).
> > >
> > > However I have the following problems
> > >
> > > 1) It took 35 minutes to store 1 GB data in the TEXT column (create
> > > table test_text_nearly_1_GB as select repeat('1',
> > > 1024*1024*1023):: text as nearly_1_GB)
>
> That's a fairly inefficient way to generate the data to insert. You'll probably find that much of the time was spent in repeat(), not anywhere interesting to your query.
>
> I'd be more interested in how long COPY ... FROM took, personally.
>
> > > 2) It took 30 Minutes to write the same to OS using COPY command
> > > Copy test_text_nearly_1_GB to '$HOME/test_text_nearly_1_GB.xml'
>
> That's a write rate of 34MB/min, or half a meg a second. Not pretty.
>
> Where's the load during the COPY? Mostly CPU? Or mostly disk I/O?
>
> Are you writing the output to the same disk the database is on? (Not that it should make this much difference).
>
> > > 3) Find a solution when the file size become > 1GB
>
> That's going to be interesting.
>
> Do you really need to store whole XML documents this size in the database, rather than broken up into structures that can be worked with usefully in the database? If so, PostgreSQL might not be your best choice.
>
> --
> Craig Ringer

Hi All,

I am posting this request again with the correct subject.

We are a data based company and are migrating from Oracle to Postgresql. For this purpose I am doing a POC. We have a business requirement to send the data in XML files to our clients. The file size of the XMLs is around 700MB and is growing.

I have been able to generate sample XML files and store the same (still to test for full files).

However, I have the following problems:

1) It took 35 minutes to store 1 GB of data in the TEXT column (create table test_text_nearly_1_GB as select repeat('1', 1024*1024*1023)::text as nearly_1_GB)

2) It took 30 minutes to write the same to the OS using the COPY command: Copy test_text_nearly_1_GB to '$HOME/test_text_nearly_1_GB.xml'

However, when I tested writing a table of size 3GB with 150000 rows, it wrote to the OS in 20 minutes.

I am looking for a solution to:

1) generate and write the 1GB file to the OS in less than 10 minutes.

2) write the results of a query to a file on the OS.

3) handle the case when the file size becomes > 1GB.

Please help.
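One hedged sketch for the >1GB case: store the XML as ordered row-sized chunks and split the COPY stream client-side, so no single value or output file has to cross the 1 GB limit. The table and column names below (xml_chunks, chunk, seq) are illustrative assumptions, not from this thread:

```shell
# Real usage would stream the chunks out of PostgreSQL and split them:
#   psql -d mydb -Atc "COPY (SELECT chunk FROM xml_chunks ORDER BY seq) TO STDOUT" \
#     | split -b 900m - export_part_
#
# The same pipeline shape, demonstrated with synthetic data in place of psql
# (5000 lines of 12 bytes = 60000 bytes, split into 20000-byte pieces):
printf 'row-of-data\n%.0s' $(seq 1 5000) | split -b 20000 - part_
ls part_*
```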


Atul Goel
SENIOR DEVELOPER

Global DataPoint
Middlesex House, 34-42 Cleveland Street
London W1T 4LB, UK
T: +44 (0)20 7079 4827
M: +44 (0)7846765098
www.globaldatapoint.com<http://www.globaldatapoint.com/>

This e-mail is confidential and should not be used by anyone who is not the original intended recipient. Global DataPoint Limited does not accept liability for any statements made which are clearly the sender's own and not expressly made on behalf of Global DataPoint Limited. No contracts may be concluded on behalf of Global DataPoint Limited by means of e-mail communication. Global DataPoint Limited Registered in England and Wales with registered number 3739752 Registered Office Middlesex House, 34-42 Cleveland Street, London W1T 4LB