[oclug]PostgreSQL Hardware
Brad Barnett
bb at L8R.net
Sat Oct 26 20:27:26 EDT 2002
On Sat, 26 Oct 2002 18:43:31 -0400
Milan Budimirovic <milan.budimirovic at sympatico.ca> wrote:
> Brad Barnett wrote:
>
> > - you can not predict the load this server will be under by looking
> > only at the amount of commits/writes it will perform
>
> No, but it's a pretty good indicator. And if the data size is very large
> for each row of results, then I would be **much** more worried about
> bandwidth than database performance, given that this is a web-based
> application.
>
> > - you can not predict the load this server will have unless you know
> > more about the specific situation, how large the queries are, and how
> > often they are made
> >
> > How do you find this to be a "not necessarily" case?
>
> Like I said, more users does not necessarily mean decreased performance.
Right, it doesn't. This isn't how you used "not necessarily" before. You
used "not necessarily" against statements which I summarized above. Your
reply, however, doesn't conflict with what I was saying.
>
>
> >
> > > up. If you don't have a lot of people using the system you will end
> > > up with all of your data being read from disk instead of from
> > > memory, simply because nothing has been cached yet. A somewhat-busy
> > > server is often faster than one that's idling.
> >
> > It depends on how randomized the queries are. I'm glad you used the
> > word"often", but often isn't the right word. What about servers with
> > terabytes of data, and random queries? What about servers with 100
> > megs of ram, and a gig of data, and random queries.
> >
> > There are probably just as many databases out there that fit the above
> > scenario, as ones that fit into the "always polling the same data"
> > branch.
>
> Again, this is a **web application**, Cold Fusion in this case, which
> almost certainly means that you will get a lot of very similar queries
> on indexed columns, and the amount of data extracted will be limited by
> bandwidth considerations.
Just because something is a web application, it does not mean it will
always be querying a lot of simular queries. Do you think that google is
constantly polling it's database for simular queries? No! They certainly
have a variety of "top searches", but they also have a wide variety of
them.
There are other areas, and yes with web applications, that can have a
highly randomized set of queries.
>
> >
> > >
> > > Anyway, Rod is right. This does not look like a huge amount of data
> > > for the hardware being dedicated to the task. The important thing is
> > > that
> >
> > Again, I find it amusing when you have no idea how often this server
> > is being read from. What if these 700 clients reading are always
> > reading at once, and constantly pulling down massive records.. up to
> > 70k of them?
> >
> > Pretending to be able to guess what our friend needs, without enough
> > data to be able to guess at the load is just plain silly.
> >
>
> Brad, it's the old thing where if you see hoofprints in the snow you
> don't start looking for zebras. Well, at least **I** don't.
>
> ;=)
>
Sure. I have no problem with that, but most of Rod's comments were along
the line of "Everything everyone else made in this thread is wrong, but
I'll make some very specific comments and be right". You backed up his
statements.
>
> > > the columns that appear in the "WHERE" clauses in your SQL
> > > statements be properly indexed, and even there there are potential
> > > tradeoffs in terms of faster reads vs. slower writes. But even the
> > > worst-case scenario you proposed should not be catastrophic.
> >
> > Again, you simply have no idea until you know about the frequency of
> > the writes and reads, the amount of data transfered with each
> > operation on average, and the consistancy of the requests.
>
> Actually, I do. I have databases that are about that size, with a
> similar number of users, and on top of that my application also does
> full text keyword searching on 500,000 flat files. And my servers are a
> notch slower than what Tim has proposed here. I would be shocked if he
> needed a lot more horsepower.
What if he starts doing what he should, and off loads the weight using
Perl procedures with postgres? I doubt he will max out those CPUs, but
when it comes to ram I don't see a problem with the ever helpful MORE!
Furthermore, we still don't even know the consistancy of the reads.
Give me more data first, then make statements like "Those machines are
overkill!".
Anyhow, we haven't really gotten down to the nub of it all. PIVs are the
sweet price point right now. That is, but a PIII or a PIV, and the PIV
will cost you less / pow. Taking that server and downgrading it to
something else doesn't even really make sense economically.
More information about the OCLUG
mailing list