Discussion:
PostgreSQL Portable
Daniel Begin
2014-09-10 21:00:01 UTC
Permalink
First, I am a Newbie regarding PostgreSQL .



I just started to look at PostgreSQL to implement a large GIS DB (1Tb). The
data must reside in an external disk with eSATA connection and may be moved
to different locations (and Windows desktops/laptops). I was looking to
install PostgreSQL and PostGIS extensions on each PC (setting-up the proper
PGDATA directory to the external disk) until I read about PostgreSQL and
PgAdmin Portable .



http://sourceforge.net/projects/pgadminportable/

http://sourceforge.net/projects/postgresqlportable/



Is that a viable alternative considering the expected size of the DB? Any
comments or proposal would be appreciated J

Daniel
John R Pierce
2014-09-10 21:19:45 UTC
Permalink
Post by Daniel Begin
I just started to look at PostgreSQL to implement a large GIS DB
(1Tb). The data must reside in an external disk with eSATA connection
and may be moved to different locations (and Windows
desktops/laptops). I was looking to install PostgreSQL and PostGIS
extensions on each PC (setting-up the proper PGDATA directory to the
external disk) until I read about PostgreSQL and PgAdmin Portable ...
http://sourceforge.net/projects/pgadminportable/
http://sourceforge.net/projects/postgresqlportable/
Is that a viable alternative considering the expected size of the DB?
Any comments or proposal would be appreciated J
a 1TB database on a single disk drive, presumably 7200rpm, will not
perform very well under any sort of concurrency, or doing any sort of
operation that requires aggregating a lot of rows.

also, Windows disables writeback caching on external disks, this will
greatly slow down update transactions.
--
john r pierce 37N 122W
somewhere on the middle of the left coast
George Neuner
2014-09-11 07:16:42 UTC
Permalink
On Wed, 10 Sep 2014 14:19:45 -0700, John R Pierce
Post by John R Pierce
also, Windows disables writeback caching on external disks, this will
greatly slow down update transactions.
Not exactly. By default, write caching is disabled for external
drives to support quick disconnect, i.e. yanking the device without
unmounting it.

If the driver permits it and you [or your users] can be trusted to
perform a safe unmount via the OS *before* disconnecting the device,
then you can enable write caching for the device using the device
manager. [Note that the device must be connected for it to be visible
in the device manager.]

Most USB disks and Flash devices do support write caching. If you are
willing to live dangerously, you can get better write performance.

George
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Craig Ringer
2014-09-12 03:33:55 UTC
Permalink
Post by George Neuner
If the driver permits it and you [or your users] can be trusted to
perform a safe unmount via the OS *before* disconnecting the device,
then you can enable write caching for the device using the device
manager. [Note that the device must be connected for it to be visible
in the device manager.]
It shouldn't be living dangerously, actually.

While I haven't tested it myself, writeback caching on the external
drive should be safe so long as it continues to honour explicit disk
flush requests.

That's why we have the WAL and do periodic checkpoints. If you yank the
drive mid-write you'll lose uncommitted transactions and might have
slower startup next time around, but it should otherwise not be overly
problematic.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
George Neuner
2014-09-12 13:34:09 UTC
Permalink
Hi Craig,

On Fri, 12 Sep 2014 11:33:55 +0800, Craig Ringer
Post by Craig Ringer
Post by George Neuner
If the driver permits it and you [or your users] can be trusted to
perform a safe unmount via the OS *before* disconnecting the device,
then you can enable write caching for the device using the device
manager. [Note that the device must be connected for it to be visible
in the device manager.]
It shouldn't be living dangerously, actually.
While I haven't tested it myself, writeback caching on the external
drive should be safe so long as it continues to honour explicit disk
flush requests.
That's why we have the WAL and do periodic checkpoints. If you yank the
drive mid-write you'll lose uncommitted transactions and might have
slower startup next time around, but it should otherwise not be overly
problematic.
For the most part you're correct, but recall that WAL itself can be
made asynchronous [see fsync() and synchronous_commit() settings] and
the periodic OS sync also may be disabled - which doesn't affect WAL
handling but may(?) affect the background writer.

Even having synchronous WAL the most recent transactions can be lost
if the log device fails *during* a write. That's why, if we use
external devices at all, we tend to use closely coupled devices - disk
array, wired SAN, etc. - that aren't very likely to be physically
disconnected. And uninterruptible power all around 8-)

A portable device can be reasonably safe if treated properly, but it
never will be quite as safe as an internal device.

George
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Rémi Cura
2014-09-12 13:44:23 UTC
Permalink
Hey,
I had many external hard drive crash (savage unplug, power off, pc forced
restart).
The server on the virtual machine was never hurt, nor the data.

Cheers,
Rémi-C
Post by George Neuner
Hi Craig,
On Fri, 12 Sep 2014 11:33:55 +0800, Craig Ringer
Post by Craig Ringer
Post by George Neuner
If the driver permits it and you [or your users] can be trusted to
perform a safe unmount via the OS *before* disconnecting the device,
then you can enable write caching for the device using the device
manager. [Note that the device must be connected for it to be visible
in the device manager.]
It shouldn't be living dangerously, actually.
While I haven't tested it myself, writeback caching on the external
drive should be safe so long as it continues to honour explicit disk
flush requests.
That's why we have the WAL and do periodic checkpoints. If you yank the
drive mid-write you'll lose uncommitted transactions and might have
slower startup next time around, but it should otherwise not be overly
problematic.
For the most part you're correct, but recall that WAL itself can be
made asynchronous [see fsync() and synchronous_commit() settings] and
the periodic OS sync also may be disabled - which doesn't affect WAL
handling but may(?) affect the background writer.
Even having synchronous WAL the most recent transactions can be lost
if the log device fails *during* a write. That's why, if we use
external devices at all, we tend to use closely coupled devices - disk
array, wired SAN, etc. - that aren't very likely to be physically
disconnected. And uninterruptible power all around 8-)
A portable device can be reasonably safe if treated properly, but it
never will be quite as safe as an internal device.
George
--
http://www.postgresql.org/mailpref/pgsql-general
Steve Atkins
2014-09-10 21:29:01 UTC
Permalink
First, I am a Newbie regarding PostgreSQL …
I just started to look at PostgreSQL to implement a large GIS DB (1Tb). The data must reside in an external disk with eSATA connection and may be moved to different locations (and Windows desktops/laptops). I was looking to install PostgreSQL and PostGIS extensions on each PC (setting-up the proper PGDATA directory to the external disk) until I read about PostgreSQL and PgAdmin Portable …
http://sourceforge.net/projects/pgadminportable/
http://sourceforge.net/projects/postgresqlportable/
Is that a viable alternative considering the expected size of the DB? Any comments or proposal would be appreciated .
Adding postgis to that, if it's not already included, might take some work. Not impossible but you'd be making some unneeded work for yourself.

The external disk isn't going to be blindingly fast, however you use it. For Windows in particular, it's probably going to be more conservative in caching the external drive than it would an internal one. Any large or unindexed queries are likely to be a bit painful.

I do use an external drive for some work, though, and it's usable. I have all of postgresql and the tools I use installed on the drive, with nothing for that instance installed on my laptop. I just have the external drives bin directory early in my PATH, so I can plug in the drive and do pg_ctl start, and it all works. That's on a mac, I'm sure you could do the same with Windows.

Cheers,
Steve
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Steve Crawford
2014-09-10 21:50:15 UTC
Permalink
First, I am a Newbie regarding PostgreSQL …
I just started to look at PostgreSQL to implement a large GIS DB
(1Tb). The data must reside in an external disk with eSATA connection
and may be moved to different locations (and Windows
desktops/laptops). I was looking to install PostgreSQL and PostGIS
extensions on each PC (setting-up the proper PGDATA directory to the
external disk) until I read about PostgreSQL and PgAdmin Portable …
http://sourceforge.net/projects/pgadminportable/
http://sourceforge.net/projects/postgresqlportable/
Is that a viable alternative considering the expected size of the DB?
Any comments or proposal would be appreciated J
Daniel
It appears you are looking to take the PostgreSQL data directory from
machine to machine on an external drive. I fear you will run into some
potential problems:

1. Performance (mentioned by others).

2. OS mismatch. Have you ensured that all client machines are running
identical setups? The underlying files are not guaranteed portable
between OS versions and 64/32-bit. In fact they probably won't be.

3. Backups. What happens when one user screws up the database?

Perhaps you could explain further the genesis of this requirement. The
message list is littered with questions like this asking how to
implement a certain solution when, given an understanding of the reason
the question is being asked, a far better solution exists. This happens
even more often when the person asking is a "newbie."

Cheers,
Steve
Rémi Cura
2014-09-11 06:40:21 UTC
Permalink
Hey,
I'm working in GIS field and I had the same problems.
Solution I found, which has been working for the past year :
virtual box on external drive !
This way you can have an independent OS (Linux for easy
postgres/postgis/whatever gis you want).

I find it very comfortable because my server is separated from guest os. So
I can take the disk and work on any pc with virtual box installed (require
admin right), and I have all GIS tools on the server, so the virtual
machine is very self contained.
It is also easy to backup (but very slow due to huge iso file).

I use a USB2 okay-ish disk. Guest win XP 64 / win seven 32 ; Host Ubuntu
12.04 32b.
About perfo : I do complex queries. Perf are OK for my use case (about same
as a dedicated XP 32bit).

Using the external disk to hold a table space is a __very__ bad idea.
As soon you do some upgrade/the disk get disconnected/anything happen, you
are really screwed.
(I had the issue. Without backup you can't do much without very strong
postgres skills)

Cheers,
Rémi-C
First, I am a Newbie regarding PostgreSQL 

I just started to look at PostgreSQL to implement a large GIS DB (1Tb).
The data must reside in an external disk with eSATA connection and may be
moved to different locations (and Windows desktops/laptops). I was looking
to install PostgreSQL and PostGIS extensions on each PC (setting-up the
proper PGDATA directory to the external disk) until I read about PostgreSQL
and PgAdmin Portable 

http://sourceforge.net/projects/pgadminportable/
http://sourceforge.net/projects/postgresqlportable/
Is that a viable alternative considering the expected size of the DB? Any
comments or proposal would be appreciated J
Daniel
It appears you are looking to take the PostgreSQL data directory from
machine to machine on an external drive. I fear you will run into some
1. Performance (mentioned by others).
2. OS mismatch. Have you ensured that all client machines are running
identical setups? The underlying files are not guaranteed portable between
OS versions and 64/32-bit. In fact they probably won't be.
3. Backups. What happens when one user screws up the database?
Perhaps you could explain further the genesis of this requirement. The
message list is littered with questions like this asking how to implement a
certain solution when, given an understanding of the reason the question is
being asked, a far better solution exists. This happens even more often
when the person asking is a "newbie."
Cheers,
Steve
Karsten Hilbert
2014-09-11 07:14:55 UTC
Permalink
Post by Rémi Cura
It is also easy to backup (but very slow due to huge iso file).
- rsync when you plug in
- make a copy
- rsync to copy continously during work
(note that this rsynced copy will be inconsistent,
it only serves to speed up the last step)
- final rsync to copy after you shut down the
virtual machine before you unplug the external disk
- delete the initial rsync (which was the copy
_before_ this session)

Should be acceptably fast even with large VMs.

Karsten
--
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Craig Ringer
2014-09-12 03:32:21 UTC
Permalink
Post by Steve Crawford
2. OS mismatch. Have you ensured that all client machines are running
identical setups? The underlying files are not guaranteed portable
between OS versions and 64/32-bit. In fact they probably won't be.
You can just run 32-bit Pg on both the 32-bit and 64-bit hosts without
problems.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-general mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Daniel Begin
2014-09-11 12:05:16 UTC
Permalink
Wow, I was not expecting so many skillful feedbacks - Thanks to all



I am not closing the point yet since, as Steve Crawford suggested, the
solution I am looking for (as newbie) might not be optimal !-) So here is
more context.



All PCs run W7/64b (different hardware) and I will be the only user
accessing the DB. Once the setup completed, the DB will mainly be used for
reading the data (requests). The results will be used for statistical
analysis/data representation.



Thank again.

Daniel
Loading...