[oclug] PostgreSQL Hardware
Brad Barnett
bb at L8R.net
Sun Oct 27 07:23:40 EST 2002
On Sun, 27 Oct 2002 01:48:35 -0400
Rod Giffin <rod at giffinscientific.com> wrote:
> On Saturday 26 October 2002 20:15, Brad Barnett wrote:
> > I am glad that you at least qualify that statement above. I can
> > think of many ways that the web server would _not_ run out of
> > resources before the database server. Again, and this is the entire
> > point I am stressing here, we don't even know what this pair will be
> > used for. The entire _purpose_ of many of the features in databases
> > today is to take the load off of the web server.
>
> No kidding. Ya don't say. Geez Brad - what do you think came first:
> A) the over worked application server - or
> B) Extensions to SQL that offload work onto the database server.
>
> Do you think it might be remotely possible that the software engineers
> did that because they noticed that their application servers were chugging
> and huffing and puffing away while the database servers sat there
> twiddling their thumbs?
"Did that." "What came first." I'm glad you agree with me, because these
days the database server can often hit the wall first. Of course, as I've
said all along, this depends on the circumstances. It is quite possible
to write code on a web server that will hit that bottleneck first.
The thing is, a web server and a client from time gone by are typically
quite different beasts. There are many cases where the web server has
very little in the way of actual php code to execute.
Apache (just in case you hadn't heard) is very good on its own at serving
up a ton of pages, without hitting a CPU or resource bottleneck. As soon
as you get into more dynamic database-driven web pages, more and more of
that load moves onto the database server.
>
> > An example is the simple WHERE statement you list below.
>
> Just a minor point here. Milan mentioned WHERE clauses. I mentioned
> indexing and database normalization.
Ah.. yeah, it does get difficult to keep track of who said what sometimes
;)
>
> > You seem to think yourself quite the authority on databases. Nothing
> > wrong with that, but I'd love to have a design contest with you
> > sometime, from the raw data stage to hardware. I have a feeling I'd
> > beat the pants off of you, but this is just friendly competition
> > speaking. Nothing personal ;)
>
> Question for you... what the hell is the "raw data stage" in software
> design?
I don't know. I was referring to taking a chunk of raw data, and then
building the database for it, not about software design. You've mentioned
in the past that you need to design a database _for_ the data, so you
obviously know what I'm talking about in that respect.
If we were going to build a database server in competition, one of the
most important things is how we transpose that raw data into tables in the
database. What indexes we build. What type of tables we use, and so on.
Obviously this will have more impact than hardware will on the speed and
efficiency of the database itself, so I'd want to start from the "raw data
stage".
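To make the point about indexes concrete, here is a minimal sketch of how
adding an index changes the way a WHERE lookup is executed. It uses Python's
built-in sqlite3 module purely as a stand-in (this is an assumption for
illustration; the thread is about PostgreSQL, whose planner and EXPLAIN output
differ). The table name, column names, and row counts are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

# Hypothetical "raw data" loaded into a single table (in-memory database).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("cust%d" % (i % 1000), float(i)) for i in range(10000)],
)

sql = "SELECT total FROM orders WHERE customer = ?"

# Without an index, the lookup has to scan the whole table.
plan_before = query_plan(conn, sql, ("cust42",))

# With an index on the filtered column, the lookup becomes an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = query_plan(conn, sql, ("cust42",))

print("before:", plan_before)
print("after: ", plan_after)
```

The same query goes from a full table scan to an index search without any
change in hardware, which is the kind of design decision being argued about
here.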
>
> And, no I don't think myself to be an authority on anything. I just do
> what I do.
Excellent, my respect for you just went up a thousand fold. ;) The more
authoritative someone is in a field, it seems the more they form some sort
of tunnel vision. I've seen experts in topics before, who only seem to be
able to handle that one particular aspect of their field, to the detriment
of all else.
>
> > You are aware, btw, that google doesn't even store their data on hard
> > disks, but has everything stored in ram? They don't care if their
> > data is lost, because in a worst case scenario, if all of their
> > replicated machines with the same data are lost at the same time, they
> > will snag that data back within 30 days.
>
> Google's cluster architecture is a rather well documented and not very
> fancy affair. They have at the moment something over 10,000 single CPU
> systems in their cluster, with between 256 MB and 1 GB of RAM, and two 40
> or 80 GB (Maxtor seems to be their favourite) hard disks. Does that
> sound like a RAM based system to you? Oh. They also run RedHat Linux.
>
This is old news. Last I heard they were up to 18k machines. Not
surprising with all of the extra features they've added lately, and the
doubling in the number of sites they have indexed, over the years.
Interestingly enough, they seem to have gotten rid of the 'cache' feature
for many web pages. It appears that they are doing so for pages with very
dynamic content, though not every page with dynamic content seems to be
treated that way.
I admit I am unable to find the recent article that stated what I said
above. No matter. In retrospect, and being more awake now, I realise
that obviously moving everything from hard drive to ram (in their case)
would increase the number of machines they would need to store their data.
That doesn't change things, however. It is not a *poor* use of resources
to have your tables in ram, if they can fit there. Yes, access from a
cache in ram will be faster than access from a hard drive, no matter how
fast the raid is.
With indexes completely cached in ram (which you should _always_ have
enough ram to do if possible, that is your first priority), there is less
hard drive thrashing during a query. However, it *is* faster to have the
whole database cached in ram.
Again, with only a 2 gigabyte database, 3 gigs of ram would allow for
this. One could get rid of the scsi raid, and go with a cheaper mirrored
IDE raid. Performance would remain the same, since reads would be using
cached data, while writes could take their time to commit to the hard
drive.
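The arithmetic behind that claim can be sketched as follows. The 2 gigabyte
database and 3 gigs of ram come from the paragraph above; the amount of memory
reserved for the operating system and the database processes (512 MiB here) is
purely an assumption for illustration, and the function name is hypothetical.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

def fits_in_ram(db_bytes, ram_bytes, reserved_bytes):
    """Check whether a database could be fully cached in RAM.

    reserved_bytes is memory set aside for the OS and server processes.
    Returns (fits, headroom_bytes); headroom is negative if it does not fit.
    """
    available = ram_bytes - reserved_bytes
    return available >= db_bytes, available - db_bytes

# 2 GiB database, 3 GiB of RAM, 512 MiB reserved (assumed figure).
fits, headroom = fits_in_ram(2 * GIB, 3 * GIB, reserved_bytes=GIB // 2)
print("fits:", fits, "headroom MiB:", headroom // (1024 ** 2))
```

Under those assumptions the whole database fits with about half a gigabyte to
spare; a larger reserve or a growing database changes the answer quickly.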
Again though, you think this is a bad thing to do.
> Ironically, a quick Google search would have prevented your foot from
> entering between your maxilla and your mandible.
>
> > You don't think google has a flawed policy, do you?
>
> No, I don't.
>
> > Maybe you should talk to them.
>
> Actually ... naw. Never mind. We won't go down that road.
>
Yes Mr. Peacock, don't spread those tail feathers ;P
> Rod.
>
> P.S. I'm kinda glad I had the opportunity (thanks to Ottawa Hydro) to
> retype this message. The earlier version was much more colorful :)
> However, it has gone to the big bit bucket in the sky. I shall miss it
> though.
Well, we've all been down that road before. I'm glad to see that Ottawa
Hydro has its own problems. I was cursing Quebec Hydro when I first moved
over to Hull. I seemed to have more frequent power failures than I had in
Ottawa, but your above paragraph makes me think that perhaps it is just
'the other side of the fence' syndrome talking. ;)