A reader from
That was then, this is now
I recently got an email from a dear old dinosaur with 50 years experience of tuning Oracle databases (hed done a lot of overtime and on-call time) who wanted to tear a strip off me for suggesting (in a presentation that Id recently done in Kazakhstan) that it was easy to test one of his pet theories to see if it was relevant to a particular member of the audience.
All this
new-fangled testing, analyzing, and careful thinking is a waste of time. Ive
fixed more than 5,000 databases in the last week alone using my methods, and
had IT managers from
I felt a little twinge of sympathy for the old chap. Things have changed. The Oracle world moves on and it must be tough when your advanced tuning skills of yesteryear are now part of Basic Design and Configuration 101 and you have to start learning all over again. It can be hard to keep up, and sometimes its so much easier to fight the rearguard action, cling to the forlorn hope, and make the little things sound more impressive than they really are. (And, of course, you do get lucky from time to time and find someone who hasnt taken Basic Design and Configuration 101).
Changing times:
Ten
years ago
Databases were small, disks were small and
people had lots of them, Oracle had a few interesting features but it was
largely just tables and indexes that people dealt with. Most databases
performed badly and had I/O problems because of bad design, inappropriate
indexing strategies, and bad code.
Today
Databases have grown by two or three orders
of magnitude (note one order of
magnitude means multiplying by 10), the capacity of a typical disk has
grown by one or two orders of magnitude, but disk access times are typically
only three or four times as fast as they used to be, and most people dont have
many of them. Oracle is stuffed full of interesting technical features, but
many systems dont use them and quite a number of the systems that do use new
features tend to use them inappropriately. Most databases perform badly and
have I/O problems because of bad design, inappropriate indexing strategies, and
bad code. Some databases still work reasonably quickly because you can stuff
enormous amounts of memory into the box, add lots of very fast CPUs, and hide the
problems for a while.
Mythtakes
Some old
dinosaurs try to protect their intellectual high ground by pumping out
declarations full of sound and fury, signifying nothing at best they do no
harm, at worst they only muddy the waters further. Some young bucks are too
keen to fight their new scientific corner, and end up making statements that
are just as extreme, over-generalised, and suspect as those of the old
dinosaurs. Both have the potential for causing problems for naïve users.
Old Methods
Back in the
bad old days, state of the art skills included:
reducing all segments to one extent,
rebuilding all indexes regularly,
wetting your pants if the cache hit ratio
dropped below 99.99999%,
making sure that small tables werent
indexed.
For a
history of some of the odd realities, odder beliefs and the reasons behind
them, you could read Oracle Insights: Tables of the Oak Table,
in particular Dave Ensors chapter, which will tell you such abstruse details
as where AFIEDT.BUF came from, and may even mention the reason for the original
2K block size.
There were two main drawbacks to many of these state of the art ideas. First, they didnt apply after Oracle 6 came out (if they ever did), and second they did actually appear to work some of the time. (Yes, that sentence reads exactly as I intended). So whats wrong with a snappy method that works some of the time? Nothing - if it works for you; but if you tell ten other people to do it and it only works for one of them then youve just made 9 people unhappy and wasted a lot of their time.
Alas, some of these old methods did work for some people. But the information that would allow individuals to decide in advance whether a strategy was a good idea, a positively bad idea, or simply just a waste of time and effort, was not available.
New versions of old
methods
Oracle
Corp. keeps updating their software, adding features designed to reduce the
workload on the DBA, the developer, and the database. Sometimes the features
work well, sometimes they are inappropriate, sometimes they are sub-optimal and
sometimes they are abused. Almost invariably, the marketing literature is ahead
of its time.
Is
everything that Oracle Corp. puts into their code right for you? If an old
method seems to be embedded in new code, does that prove that the old method
was always
a good
idea? Ill let you work that out for yourself by considering a few examples.
Oracle 8i
introduced on-line index rebuilds does this mean you should rebuild
every index on a regular basis (old method), or does it just give you a
facility for minimising interruptions on a 24 x 7 system for those special
cases where you can see that an index rebuild would positively benefit
performance ? In passing, bear in mind that on a busy 24 x 7 system an on-line
index rebuild has to acquire an exclusive lock on the underlying
table for a brief period to start the rebuild, and another to end the rebuild
its likely to be a lot better than holding a lock on the table for the
duration of the rebuild, but its not entirely threat-free after all and the
rebuild is not necessarily cheap on resources (although Oracle has at least
three possible mechanisms for doing the rebuild and will pick the one it thinks
it the cheapest).
Oracle 9i
introduced on-line table reorganization does this mean you should
physically rebuild tables on a regular basis? Table rebuilding is
actually a strategy I have suggested from time to time but usually by cloning
read-only partitions of a partitioned table because in the real-world the
cost of an online table rebuild is too high to allow, except in extreme
circumstances, as it includes recreating all the indexes and using materialized
view log mechanisms to keep the rebuilt table in synch whilst
everything else is going on. On-line table reorganization is best viewed as a
facility to allow old systems to migrate into new features such as partitioned
tables with minimum down-time.
Oracle 10g
introduced automatic memory management regularly monitoring v$db_cache_advice etc. to shift memory dynamically
between several of the large memory structures in the shared memory segment.
Does this mean that increasing the buffer cache to reduce physical I/O is
automatically the right thing to do?
Bear in mind Oracle 10g also supplies you with the tuning advisor and the access
advisor to help you improve your indexing and SQL in a balanced fashion perhaps you should only
enable automatic memory management after you have tuned the SQL. Bear in
mind, also, that the mechanisms of automatic memory management specifically exclude
the non-standard db cache sizes (see 10g Admin Guide Ch. 2 P.36) There is at least one author who
thinks that the ability to specify multiple block sizes may be the most
important feature of 9i does their exclusion from automatic memory management
mean that Oracle Corp. thinks he is talking rubbish? Bear in mind, finally,
that the algorithm tries to avoid removing memory from the shared pool
because it kicks the system to death (thats my paraphrasing the manual
describes the issue a little differently). So should we
follow the line but thats what Oracle does when even Oracle cannot do it
well? Perhaps not.
New Methods
Tuning by
observing the work done and time lost at the process level the so-called
wait interface method - is the new science.
I could
invent a silly story at this point perhaps I could tell you how I spent 72
hours watching a ratio-junkie struggle to fix a problem by tweaking
parameters, plugging in memory chips swapping CPUs, creating materialized
views, rebuilding indexes, and refuelling every three hours on pizza and
high-caffeine coffee; then I could explain how I stepped in and solved the
problem a mere 30 seconds after typing execute dbms_support.start_trace_in_session(...etc
). I wont insult your intelligence by doing so
you would obviously conclude that it was pure invention. Extreme bottlenecks
and extreme overloads are easily spotted by anyone with a basic level of
trouble-shooting skills, irrespective of their preferred method of operation.
I will say
that, once you have dealt with the trivial errors, the method of observing
business-related processes to see where the work is going and where the time is
lost helps you identify the important threats more easily, recognise some of the
slightly less obvious errors in design (or index strategies), and list options
for addressing the issues based on a realistic cost/risk/benefit estimate. And
yes sometimes the answer is: this particular I/O problem will go away for
about six months if you add a little over 48GB of RAM to the db_keep_cache_size (as I said to someone just three
weeks ago) and the other three alternatives are
Finally I
have to point out that a scientific approach is not a new method. Academic
and theoretical are terms that sometimes appear in the writings of this
industry as if they were criticisms. But have you seen how many academic papers
Oracle Corp. presents at events like the International Conference on Large
Databases - without Academia there would be no Oracle. The approach of
evaluating a hypothesis by predicting a test result then doing the test has
been around for centuries. If you are a professional trouble-shooter, you owe
it to your clients to go through that discovery process in your own labs so that
you dont have to spend your clients money learning how Oracle works. If you
are a full-time DBA, your employers owe you, and themselves, some time for you
to do some testing and experimentation that might be relevant to your (their)
systems so that you can predict problems before they actually happen.
Epilog
What of the
daft old duffer? Treated with a little warmth and compassion, it turns out he
can be a charming chap. He has seen the error of his ways, and now tries hard
to copy those more enlightened individuals around him. Of course he does slip
back into old habits from time to time but then, there are times when we all
hanker after the good old days when everything was simple.
About the author: Don Corleone runs a small family business out of NYC, mainly
involved in shooting trouble. He is a shy, retiring individual and doesnt like
to make a big fuss about his level of skill, because his published work speaks
for itself. The author reserves the right to leave any outrageous claims
unsubstantiated, and also reserves the right to have invented any characters,
events, and figures that might suit the general narrative thrust of this
article.
That was then A reader writes |
Oct 2004 |




