[Bizgres-general] Performance - WAL bypass + parse
Luke Lonergan
llonergan at greenplum.com
Thu Jun 2 04:39:25 GMT 2005
Mark,
On 6/1/05 6:56 PM, "Mark Kirkwood" <markir at paradise.net.nz> wrote:
> Decided to check this out after seeing the message on -hackers,
> very nice, more speed is always good!
Cool! Glad to see someone else who has the pain...
> One thing I am wondering about - what do you see if you load files
> bigger than the RAM size of the machine (e.g. 4G in your case)? Does the
> performance difference still persist? I am raising this because:
Sure - often.
> i) It is a more realistic DW scenario
> ii) 146M is only 6x the disk cache of 3 drives (assuming 8M for each)
> iii) You don't get much chance to measure the impact of the bgwriter or
> checkpointer (speed may be throttled on these?)
Understood. We routinely load large data sets (100GB+). We've traced the issues
to CPU consumption without question, but I'm happy to prove it.
The big issue is that we're nowhere near saturating the I/O subsystem when
loading or scanning data with Postgres. That has nothing to do with the
Executor or the I/O interface; it's just a lack of optimized code paths in
unexpected places. We're going to fix that :-)
> If you guys have not got the time to do some experiments along these
> lines, I could look into it, however I don't have such flash HW ... :-).
Well - the fastest case is the single column case with parse improvements
and WAL bypass, so I'll run that on a 3GB file (1.5x memory, it's a 2GB
machine).
Input file size: 2,909,128,332 bytes
Sample row:
card following server to includes 128 mesh to any away free 2 Therefore the
turn visual includes can Find a drastically fast com Digital Mbyte
------- with fast parse and WAL bypass -----------
Database directory size after loading: 3,757,084,000 bytes
Time to load using psql copy: 104.191 seconds
Rate = 2909 MB / 104.191 s = 27.92 MB/s
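For anyone who wants to redo the arithmetic on their own runs, here is a minimal sketch in C. It assumes decimal megabytes (1 MB = 1,000,000 bytes), which is what reproduces the 27.92 figure above. The function names are made up for illustration and are not part of the attached test program. The second helper compares the database directory size against the input file size, which the numbers above allow but which was not stated explicitly.

```c
#include <assert.h>
#include <stdio.h>

/* Load rate in decimal megabytes per second (1 MB = 1,000,000 bytes). */
double load_rate_mb_per_s(double input_bytes, double seconds)
{
    assert(seconds > 0.0);
    return input_bytes / 1e6 / seconds;
}

/* Ratio of database directory size after loading to raw input file size. */
double expansion_ratio(double db_bytes, double input_bytes)
{
    assert(input_bytes > 0.0);
    return db_bytes / input_bytes;
}

/* Prints both derived figures for the run reported in this message. */
void print_load_summary(void)
{
    const double input_bytes = 2909128332.0; /* input file size */
    const double db_bytes    = 3757084000.0; /* database directory after load */
    const double seconds     = 104.191;      /* psql copy time */

    printf("rate: %.2f MB/s\n", load_rate_mb_per_s(input_bytes, seconds));
    printf("on-disk expansion: %.2fx\n", expansion_ratio(db_bytes, input_bytes));
}
```

Calling print_load_summary() from a main() should give a rate of about 27.92 MB/s and an on-disk size of roughly 1.29x the input file.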
I've attached the test program we use, FYI. The data generator defaults to
15 columns. If you want to change that, edit the file data-generator/main.c
and change the lines that look like this:
numcols = 1;
col_types[0] = VARCHAR; col_mins[0] = 24; col_maxes[0] = 26;
Adjust these to suit your needs. If you change the number of columns, you
should also change the table definition in create_db.sh and the ctl file
generation in load_data.sh.
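As a rough illustration of that kind of edit, the sketch below sets up two VARCHAR columns instead of one. Everything around the assignments is an assumption made so the example compiles on its own: the enum, the array declarations, MAX_COLS, and both function names are hypothetical stand-ins, not copies of what is in data-generator/main.c. It also assumes that col_mins and col_maxes bound the generated value length for a VARCHAR column, which the real generator may or may not do in this way.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_COLS 15 /* assumption: sized to the generator's default column count */

/* Hypothetical stand-ins for declarations in data-generator/main.c.
   Only VARCHAR is mentioned in the message, so it is the only type here. */
enum col_type { VARCHAR };

int numcols;
enum col_type col_types[MAX_COLS];
int col_mins[MAX_COLS];
int col_maxes[MAX_COLS];

/* Example edit: two VARCHAR columns. The second column's bounds are
   arbitrary illustrative values. */
void configure_two_columns(void)
{
    numcols = 2;
    col_types[0] = VARCHAR; col_mins[0] = 24; col_maxes[0] = 26;
    col_types[1] = VARCHAR; col_mins[1] = 8;  col_maxes[1] = 12;
}

/* Assumed semantics: pick a value length for column c that lies within
   [col_mins[c], col_maxes[c]], inclusive. */
int pick_length(int c)
{
    assert(c >= 0 && c < numcols);
    assert(col_mins[c] <= col_maxes[c]);
    return col_mins[c] + rand() % (col_maxes[c] - col_mins[c] + 1);
}
```

With a two-column setup like this, the table definition in create_db.sh and the ctl file generated by load_data.sh would each need a matching second column.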
- Luke
-------------- next part --------------
A non-text attachment was scrubbed...
Name: IVP.tgz
Type: application/octet-stream
Size: 12962 bytes
Desc: not available
Url : http://pgfoundry.org/pipermail/bizgres-general/attachments/20050602/28e5281c/IVP.obj