`
wsql
  • 浏览: 11712957 次
  • 性别: Icon_minigender_1
  • 来自: 深圳
文章分类
社区版块
存档分类
最新评论

"Caching a table in Memory"

 
阅读更多

Caching a table in Memory", version 8i

Submitted on 30-May-2000 3:31 Central time zone
Last updated 1-Jun-2012 6:48

You Asked

Hi, 
Is it enough if we use 
ALTER TABLE  CACHE; to push the table to cache ?. 
Do I need to do anything else to have the table in the Memory ?
 
Thx, VJ 

and we said...

That command simply makes a notation in the data dictionary that blocks from this table 
should be handled differently when they get into the cache.  Usually, when we full scan 
an object, we put the blocks onto the least recently end of the list.  These blocks are 
candidates for "aging" from the buffer cache.  By altering the table to 'cache', we put 
the blocks onto the most recently used end -- making them less prone to being aged out of 
the buffer cache.

From the sql reference manual we see:

...
for data that is accessed frequently, specifies that the blocks retrieved for this table 
are placed at the most recently used end of the LRU list in the buffer cache when a full 
table
scan is performed. This attribute is useful for small lookup tables. 
...

So, no -- issueing the alter table ... cache command does not put the table into memory, 
you must full scan the table for that to happen.

Consider looking into setting up multiple buffer pools and associating this table with 
its own buffer pool if you are dead serious about caching it.  Bufrer pools are 


Schema objects are referenced with varying usage patterns; therefore, their cache 
behavior may be quite different. Multiple buffer pools enable you to address these 
differences. You can use a KEEP buffer pool to maintain objects in the buffer cache
and a RECYCLE buffer pool to prevent objects from consuming unnecessary space in the 
cache. When an object is allocated to a cache, all blocks from that object are placed in 
that cache. Oracle maintains a DEFAULT buffer pool for objects that have
not been assigned to one of the buffer pools.


Bear in mind tho... there is really no true way to a have a purely "in memory" table -- 
even cached tables are subject to aging from the buffer cache. 

Reviews
5 starsFinding which tables to cache October 27, 2003 - 6pm Central time zone
Reviewer: A reader
Hi Tom,

What is your approach to find which tables to cache in SGA?

In a properly tuned system, this caching should be unnecessary and won't give much benefit, is it 
correct?

Thanks, 


Followup October 27, 2003 - 6pm Central time zone:

you normally should not ever need to use this, it only comes into play when you full scan a table 
that exceeds the short table threshold -- causing us to cache it differently.

it is rare to do that.
you would know you are doing it.

i've used it once in my experience. 

5 starsHow about alter table ... storage (buffer_pool keep)? October 27, 2003 - 7pm Central time zone
Reviewer: A reader
Thanks Tom.

Do you mean that I should use "alter table ... storage (buffer_pool keep)" instead of "alter table 
... cache", for small frequently-accessed look-up tables? 


Followup October 27, 2003 - 7pm Central time zone:

not really -- they were always "fully cachable" as is.


I ascribe to the 99/1 rule (variation on the 80/20 rule).  the buffer cache is fairly efficient.  
If it was meant to be cached, and the memory exists, it will be cached.  small, frequently accessed 
lookup tables would be in the cache by "popular demand" 99% of the time already. 

4 stars October 27, 2003 - 11pm Central time zone
Reviewer: Dave from Colorado
One of the situations for getting involved in setting the buffer pool attribute might be in a 
partitioned fact table -- you might choose to set the most recent few partitions to "keep", and the 
older ones to "recycle", thus helping to prevent a more rare scan of older data from spoiling the 
day of the majority of users who access recent data. 


3 starswhat about large tables October 28, 2003 - 4am Central time zone
Reviewer: Duncan from UK
From what was said above am I to take it that putting large tables in the buffer keep pool is going 
to be pointless because if they take up most of the pool then they'll get aged out as soon as a new 
object requires the space in memory to be cached? 


Followup October 28, 2003 - 8am Central time zone:

no?  

that is not what it means. 


blocks are aged out of buffer pools based on how often they are used.  So, if you put large tables 
in a keep pool AND you use their blocks lots (more then other blocks), they'll stay in there. 

5 starsWhat about Performance October 28, 2003 - 9am Central time zone
Reviewer: A reader from Santo Domingo, Dominican Republic
Is there any sample about the performance impact of caching tables. 


Followup October 28, 2003 - 10am Central time zone:

blocks are always cached.

the alter table cache command is somewhat of a misnomer.  it simply changes the way a full scan of 
a LARGE table would be treated in the buffer cache.  It doesn't "cache a table".

Performance is relative.

alter table cache can kill performance.
alter table cache can enhance positively performance.
alter table cache can have zero effect on performance.

it is a tool, that can sometimes - in rare cases (eg: example posted by another reader -- Dave -- 
above) be used. 

3 starsshort table threshold ? February 16, 2004 - 1pm Central time zone
Reviewer: Christo Kutrovsky from Ottawa, ON Canada
Tom,

You mention the "short table trheshold" what is it ? According to Oracle documentations it should 
be 5 database blocks.
However I have a table that uses 935 blocks, and is 37 000 rows, and when I full scan it the "table 
scans (short tables)" statistic is incremented.

What are your observations?
 


5 starsThanks February 16, 2004 - 2pm Central time zone
Reviewer: Christo Kutrovsky from Ottawa, ON Canada
Thanks.

 


4 starsExcellent Exchange April 26, 2004 - 10am Central time zone
Reviewer: Kyle from VA, USA
Thanks for posting this! 


5 starsCache a table value to be viewed by anyone May 17, 2004 - 5pm Central time zone
Reviewer: A reader
Hi Tom, This is a question for you :)

I have a value that is in a table, but this value is not going to change in all database life.
I can keep the table in memory but even then, Oracle will have block gets to get that value.
I can use a package begiend, a better solution, but per session there will be a read to the table 
in memory.

My question is can I put a value in memory for public access. So I read the value at database 
startup and from then it is read from memory from all session?

Thanks Tom
 


Followup May 18, 2004 - 7am Central time zone:

that package variable is the solution you are looking for.  You are over analyzing the situation.  
the "read to the table in memory" isn't anything to be concerned about. 

5 starsThanks Tom May 18, 2004 - 9am Central time zone
Reviewer: A reader


2 starsdifference between alter table.....cache and buffer_pool_keep(...) December 14, 2004 - 5am Central time zone
Reviewer: Anurag from INDIA
Hi Tom,

I am not clear about the difference between putting table to cache with alter 
table ....cache command and using execute ....keep('table')

Can you please explain the difference actually what happens during both?

best regards,

Anurag 


 


Followup December 14, 2004 - 9am Central time zone:

what is "execute ....keep('table')" exactly?


if you mean altering the table to be in the keep pool, then read on.

alter table cache's behavior is described above, for the original answer.


putting a table into the keep pool just changes the buffer cache where the blocks are cached. 
Instead of the blocks going into the default buffer pool -- they go into the keep buffer pool.  
Otherwise, the behaviour is precisely the same as just a "normal table".  All it does is move where 
the blocks are placed -- the keep pool instead of the default pool. 

3 starsmore about keep pool December 15, 2004 - 2am Central time zone
Reviewer: Anurag from INDIA
Thanks Tom,

Well, I may be cloudy at my concepts. How the block of same table behaves if it is kept in keep 
pool rather cached. Or is it the same  behaviour wise. Can you please illustrate this.

 


Followup December 15, 2004 - 1pm Central time zone:

when you full scan a long table (bigger than say DUAL) blocks resulting from that full scan are 
eligible for aging out of the buffer cache right away.

This prevents a single big full scan from WIPING OUT your buffer cache.  When the full scan needs 
more space in the buffer cache for the blocks it is reading, the ones that get pushed out are the 
ones the full scan just put in there -- instead of all of the other data.

If you alter table t CACHE, you simply change that behaviour.  The blocks read from that long full 
table scan are not subject to that behaviour.  they WILL not be the first ones "out".  So, a big 
full scan (say by accident) would in effect flush your buffer cache if the table was "cache".


With a keep pool -- an orthogonal concept from CACHE -- the two have really not much to do with 
eachother at all -- you are saying "when you do put a block from this table into the buffer cache, 
please put it over here, in this keep pool -- NOT in the default cache"



So, if a table is set to point to a keep pool, it behaves NOT ANY DIFFERENTLY than a table not 
pointing to the keep pool, it is just that its blocks will be put in the keep buffer pool instead 
of the default.


Don't even try to compare these two things, they have not much to do with eachother at all.

CACHE -- where in the "LRU" (conceptually speaking) the block read in from disk goes in the buffer 
cache

KEEP/RECYCLE/DEFAULT -- three buffer pools you may optionally set up.
 

5 stars December 15, 2004 - 7am Central time zone
Reviewer: Shimon Tourgeman


5 starsnamed caches? January 11, 2005 - 8am Central time zone
Reviewer: A reader
hi tom,

why doesn't oracle provide named caches? eg. if there are say 10 applications in the database 
wouldn't it be more efficient for each application to have also 10 seperate caches? the lio / pio 
behavior & impacts would be per application. they wouldn't influence each other.

regards,
max
 


Followup January 11, 2005 - 10am Central time zone:

No, in my experience what you would achieve is "gotta buy tons more ram in order to do this since 
each application refuses to share what they have with anyone else"


there are up to 7 'caches' now, more than two (actually, in most all cases more than one) seems 
overkill. 

4 starslook up table in cache memory January 30, 2005 - 6pm Central time zone
Reviewer: prathima from USA


3 starsCache VS. Keep Pool February 6, 2005 - 2pm Central time zone
Reviewer: Randall from CA USA
Tom;

Recently you recommended considering setting up a "keep" pool.  It is my understanding that there 
are no keep or recycle pools for alternate buffer caches, only for the default block size cache.  
Therefore designating a table as "keep" will break if the table is moved to an alternate block size 
tablespace later for tuning purposes.  However, creating it as a CACHE type table should work 
regardless of the block size.  At least that is what I am telling my students these days.  What do 
you think?
 


Followup February 7, 2005 - 4am Central time zone:

where did I recomend that?


but the keep/recycle pools are just "alternate" buffer caches -- like a different sized block one 
would be... eg: another way to have a "keep pool" would be to create a non-standard blocksize 
tablespace and move the object into that.  You achieve basically the same effect. 

4 stars February 8, 2005 - 3pm Central time zone
Reviewer: A reader
hi Tom,

Table UBER_INTERVAL currently has only 146 blocks and since it is a small table, Oracle is doing  a 
FTS for a query on this table, even though the same can be achieved in fewer 'consistent gets' by 
using a bitmap index.

FTS consistent gets : 152
bitmap index consistent gets : 4

My question : How do I artificially make this table NOT to belong to this 'small table' category ? 

Even after setting the 'number of blocks' to 10000 using the below dbms_stats, there is no 
effect(it is still using a FTS, for the query) and the consistent read is still 151 in sql_trace, 
not sure why the dbms_Stats is not having any effect 

begin
dbms_Stats.set_table_stats(ownname=>'MEC_USER',tabname=>'UBER_INTERVAL_C',numblks=>10000);
end;

SELECT uber_interval
  FROM uber_interval_c a
 WHERE interval_type_code = :"SYS_B_0"

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.06       0.15          0        304          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.08       0.29          0        151          0           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.14       0.45          0        455          0           2

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 20  

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  TABLE ACCESS FULL OBJ#(1837479) (cr=151 r=0 w=0 time=299577 us)

db version 9203
db_cache_size        big integer 738197504

I guess there should be some undocumented parameter for setting this small table threshold ? Why is 
the dbms_Stats.set_table_stats not having any effect( as shown in sql_trace ?


thanks
Anto
 


Followup February 9, 2005 - 1am Central time zone:

do you care to share a test case (create table, insert into table select..., create index, 
dbms_stats.gather, etc) 

just like I would.


and why are you using cursor_sharing -- seems like you have a bug in the code that needs fixing 
soon so you can turn that OFF. (this is probably more along the lines of "over binding is my 
problem" rather than stats.  oracle will not full scan a table regardless of the size if the 
cardinality to be retrieved is small -- so, i'm thinking this is "cursor sharing" -- NOT "small 
table threshold"


set table stats, i don't see how you can say "as shown in sql_trace", i see nothing in there that 
would indicate anything?

 

4 stars February 8, 2005 - 3pm Central time zone
Reviewer: A reader
corrrection  : 

the db version is 9204 not 9203

and I did a flush of shared_pool before the 2nd run(after setting dbms_Stats) 


5 stars February 9, 2005 - 10am Central time zone
Reviewer: A reader
Hi Tom,

You are right, after setting the following (cardinality for the column in where condition)

begin
dbms_Stats.set_column_stats(ownname=>'MEC_USER',tabname=>'UBER_INTERVAL_C',colname=>'INTERVAL_TYPE_C
ODE',distcnt=>1000);
end;

the optimizer did use the bitmap index instead of going for a full table scan(FTS). Here are some 
other things I noted

a) Cursor_sharing(whether it was the default (exact) or Similar) - did not make any difference to 
the above ( I did flush the shared pool each time, otherwise the cursor_sharing was not having any 
effect)

b) I had to increase both the numblks(for table) as well as distcnt( for the column in the where 
condition), for the optimizer to choose the index instead of FTS. So I assume the 'Small table 
threshold' was actually coming into play here. Is my assumption right ?

For my previous post, it should have been tkprof not sql_trace. Also my question(why tkprof was 
still showing 151 for cr- consistent reads, even after setting table_stats) was stupid, since 
sql_trace or tkprof always shows the actual values irrespective of the table or column stats. Sorry 
about that

Thanks for your help

Anto 


Followup February 9, 2005 - 2pm Central time zone:

so basically, the optimizer just didn't have "correct data"

give it the right stuff and it works.  give it incomplete or stale stuff and it doesn't


small table threshold is not coming into play.

cardinalities, rows retrieved, amount of data to process - they are.

small table threshold has to do with how the blocks are placed into the cache. 

5 stars February 9, 2005 - 10am Central time zone
Reviewer: A reader
And raising question in your site is far far better than raising TARs with oracle - both from the 
point of view of response time as well as for getting the workarounds/usefulness.  

Metalink.oracle.com is really useful,but not raising TARs with oracle, maybe it depends on the type 
of support our client is having with oracle. 

thanks
Anto 


4 stars February 9, 2005 - 3pm Central time zone
Reviewer: A reader
I was under the impression that if the number of blocks in a table is below some threshold value, 
Oracle will always do a FTS, irrespective of statistics, even if index access might be cheaper. 

Anto

 


Followup February 9, 2005 - 3pm Central time zone:

FALSE.. bzzt.

provably false:

ops$tkyte@ORA9IR2> create table t ( x int primary key, y int );
 
Table created.
  
ops$tkyte@ORA9IR2> insert into t values ( 1, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>TRUE );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> set linesize 121
ops$tkyte@ORA9IR2> select * from t where x = 1;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=6)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=1 Card=1 Bytes=6)
   2    1     INDEX (UNIQUE SCAN) OF 'SYS_C005654' (UNIQUE)
 
 


it has never been true... 

5 stars February 9, 2005 - 5pm Central time zone
Reviewer: A reader
Thanks, Tom for the clear example.

Not sure how that wrong notion came into my head

 


4 starsWhy index scan ? October 1, 2005 - 4am Central time zone
Reviewer: A Reader from India
Hi,

Why index scan is cheaper for 1 record from 1 block table.


Thanks 


Followup October 1, 2005 - 9pm Central time zone:

cheaper than what? 

3 starsRe: October 2, 2005 - 3am Central time zone
Reviewer: A Reader from INIDA
Cheaper than Full Table Scan. 


Followup October 2, 2005 - 10am Central time zone:

test it and see.

tablespace ASSM is auto segment space managed
tablespace MSSM is manual segment space managed

Less LIO's with the index - we can read the single index block, the single table block. 

With ASSM we read all blocks below the high water mark which is always advanced "high" (part of the 
design) as well as blocks that represent the extent map.

With MSSM we read all of the blocks below the high water mark as well as blocks that represent the 
extent map.


ops$tkyte@ORA9IR2> create table t ( x int constraint t_pk primary key, y int ) tablespace assm;
                                                                                                    
    
Table created.
                                                                                                    
    
ops$tkyte@ORA9IR2> insert into t values ( 1, 2 );
                                                                                                    
    
1 row created.
                                                                                                    
    
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly statistics
ops$tkyte@ORA9IR2> select  * from t where x = 1;
                                                                                                    
    
                                                                                                    
    
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
                                                                                                    
    
ops$tkyte@ORA9IR2> select  * from t where x = 1;
                                                                                                    
    
                                                                                                    
    
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table t;
 
Table dropped.
 
ops$tkyte@ORA9IR2> create table t ( x int constraint t_pk primary key, y int ) tablespace mssm;
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( 1, 2 );
 
1 row created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly statistics
ops$tkyte@ORA9IR2> select  * from t where x = 1;
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ops$tkyte@ORA9IR2> select  * from t where x = 1;
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
 

5 starsThanks October 3, 2005 - 2am Central time zone
Reviewer: A Reader from India


4 starsused buffer count October 3, 2005 - 9am Central time zone
Reviewer: Markus from Austria
dear tom,

you mentioned a few comments before:
"When the full scan needs more space in the buffer cache for the blocks it is reading, the ones 
that get pushed out are the ones the full scan just put in there -- 
instead of all of the other data."

this sounds like the fts reuses "his own" buffers. say our buffer cache is 10000 blocks in size. 
say we are starting a fts over a table 3000 blocks in size. what happens? how many of the original 
10000 block buffers will be effectivly aged out and used by this fts?
 


Followup October 3, 2005 - 11am Central time zone:

depends on how full the cache is to begin with and what else is going on.

No fixed "fast rule" 

3 starsad "used buffer count" October 4, 2005 - 5am Central time zone
Reviewer: Markus from Austria
dear tom,

thanks for your answer. could you please explain how a fts decides to reuse "his own" buffers? 


Followup October 4, 2005 - 3pm Central time zone:

it is based on the way they are added to the cache, they go out at the ends, not in the middle - so 
when a buffer needs to be reused - they look like the oldest buffers. 

3 starsad "used buffer count" October 5, 2005 - 5am Central time zone
Reviewer: Markus from Austria
tom,

i thought about that. it think the fts is done "db_multiblock_read_count"-wise:

1. the session searches for free buffers
2. the free buffers are pinned
3. one time "db_multiblock_read_count" datablocks are read into this buffers
4. some operations occur on this data
5. the buffers are marked free
6. the session searches for free buffers (finding the ones recently used by itself)
7. ...

it that the way it works? or am i totally wrong? 


Followup October 5, 2005 - 7am Central time zone:

not really, think of it like this


when we full scan a large table (we have thresholds to define "large" as a percent of the buffer 
cache), we'll put the blocks on the "OLD" end of the list, instead of the "NEW".  OLD blocks age 
out faster.  Since these new, yet marked as OLD blocks are on the OLD end of the list, they will be 
the first to go when we need more space.  They will not kick the "NEW" blocks off of the list.... 

2 starsad "used buffer count" October 5, 2005 - 10am Central time zone
Reviewer: Markus from Austria
hi tom,

thanks for your answer but you misunderstood. my question was and is still related to your reply 
found above "...the ones that get pushed out are the ones the full scan just put in there...". let 
me try to refine.

i know that the fts places its blocks at the cold end of the lru list per definition. but how can 
it happen that the same fts which places its blocks into buffers at the cold end of the lru list 
can REUSE these block buffers recently used by ITSELF? the fts is aging out blocks read-in by 
itself!?

that's why i was thinking the data is read "db_multiblock_read_count" wise. because it has to be 
read in chunks. the fts would not be able to reuse buffers recently used by its own otherwise.

please, could you explain on this, especially this "chunk" algorithm?



 


Followup October 5, 2005 - 11am Central time zone:

you have a buffer cache....

you are full scanning....

In order to do so, you need a free block.  So, where do you get it?  From the "cold end" as you 
say.  You use it, you put it on the "cold end" (large table full table scan....)

Later you need another block - where do you get it?  from the cold end - you know, right where you 
put the last stuff....


The fts is aging out blocks read in by itself - precisely.

so what if it is read in in chunks - you read 8 blocks - what do you need?  You need 8 free blocks 
- go bump out others you already read and processed. 

2 starsstill "used buffer count" October 6, 2005 - 5am Central time zone
Reviewer: Markus from Austria
yes, yes, yes, come on. could you please go into depth on that?

what i mean is that there must be an algorithm like "read in a chunk of blocks" - "process this 
chunk of blocks" - "read in another chunk of blocks" (LEADING TO A REUSE OF THE BLOCK BUFFERS 
RECENTLY USED BY ITSELF) - "process this chunk of blocks" - "read in ..."

a fts is done with multiblock i/o not single block i/o per definition. that's why i thought it 
might be done "db_multiblock_read_count" wise.

who decides how many blocks are read in? how many blocks are read in per chunk? is it 
"db_multiblock_read_count", is it a threshold, is it a formular? because only this "chunk wise 
thing" leads to the reuse of block buffers recently used by itself. could you please explain this 
"chunk wise thing" algorithm? 


Followup October 6, 2005 - 7am Central time zone:

no no no no no, because - it frankly doesn't really matter to us.

All *we* need to understand is that when we full scan - we won't wipe out the buffer cache.  that 
is it pretty much.

The internals - they change.  We've used the term "lru" to conceptually describe the process, well, 
it isn't an LRU, hasn't been since 8i, it uses a touch count.


We can conceptually understand what happens, the actual mechanics, not relevant and no I won't go 
further into it.


db_file_multiblock_read_count is documented to be the deciding that that controls the maximum size 
of a read, yes.  That is what it does.

there is no true "chunk wise thing", all things are done at the block level in the buffer cache - a 
multi-block IO leads to a wait for "db file SCATTERED read" meaning the blocks are SCATTERED in the 
buffer cache (we hash their DBA's data block addresses to figure out what LIST of cached blocks - 
of which there are many - to put them on.  They are not treated as a "chunk", they do not travel 
together).


If you want to understand some of the mechanics of the cache, how Oracle uses memory and such - I 
did write about that in some depth in my latest book (and in expert one on one Oracle - but things 
change, that one only covers up to 8i, the new one up to 10g) 

3 starsad "used buffer count" October 6, 2005 - 10am Central time zone
Reviewer: Markus from Austria
hi tom,

ok, i will take a look at your book "expert one on one oracle", first edition, i have. should i 
simply search for "cache management" or is there a special page number i should look at?

by the way, it would be hot - i think - if we could redirect blocks read in by a fts to the recycle 
buffer "on the fly"(*) using a hint. eg. "select ...". might be a nice option for hybrid systems.

(*) NO, i don't want to change the definition "on the fly" using dynamic sql. ;) 


Followup October 6, 2005 - 11am Central time zone:

why? we'd never be able to FIND the darn things again if you told us dynamically what cache to put 
them in.

that, and it is not necessary, since, well, the algorithm already says "don't flush out other stuff 
- flush out the full scanned blocks first anyway"


Suggest reading the first 1/3 of Expert :) it is sort of my version of the concepts guide.  Buffer 
cache stuff is sprinkled throughout. 

4 starsRelation of KEEP pool & large table sizes November 9, 2005 - 11am Central time zone
Reviewer: A reader
Many performance manuals indicate that if a "large" table is scanned/used frequently, it may 
benefit to configure a KEEP buffer.  I understand as you say "if you put large tables in a keep 
pool AND you use their blocks lots (more then other blocks), they'll stay in there."  Assuming the 
size of a particular table is larger than the KEEP pool itself, what impact does this impose?  Is 
there still a benefit to utilize the KEEP pool? 


Followup November 11, 2005 - 10am Central time zone:

the caching of the blocks depends on how the blocks are retrieved.  a large table read via single 
block IO (index reads) will be cached one way (the blocks will not age out really fast). 


I would not suggest the arbitrary use of the KEEP pool unless you had identified a problem (IO 
wise) that could be fixed by using it.

Have you? 

2 starsFTS of large table November 12, 2005 - 8pm Central time zone
Reviewer: A reader
I was just curious from a theoretical perspective.  Assuming the blocks are retrieved via a full 
table scan and the table itself is much larger than the size of the KEEP pool.  Would it really 
benefit to configure the KEEP pool for this table? 


Followup November 13, 2005 - 10am Central time zone:

yes, no, maybe, it depends.


Do the blocks need to be kept cached?  should they be cached?  are you using parallel query (more 
common with large full table scans) - then the cache doesn't really matter (direct io).


some of the blocks would be found in the buffer cache after the full scan.  so, some of the blocks 
may be available for other queries.  

In general, I'd rather have a reason for using the non-default pools, rather than hypothesize all 
of the possible reasons you may or may not have for using it.

Here is something I've written on this recently in Expert Oracle Database Architecture:


Block Buffer Cache

So far, we have looked at relatively small components of the SGA. Now we are going to look at one 
that is possibly huge in size. The block buffer cache is where Oracle stores database blocks before 
writing them to disk and after reading them in from disk. This is a crucial area of the SGA for us. 
Make it too small and our queries will take forever to run. Make it too big and we’ll starve other 
processes (e.g., we won’t leave enough room for a dedicated server to create its PGA, and we won’t 
even get started).

In earlier releases of Oracle, there was a single block buffer cache, and all blocks from any 
segment went into this single area. Starting with Oracle 8.0, we had three places to store cached 
blocks from individual segments in the SGA:

    *    Default pool: The location where all segment blocks are  normally cached. This is the 
original—and previously only—buffer pool.

    *    Keep pool: An alternate buffer pool where by convention you would assign segments that 
were accessed fairly frequently, but still got aged out of the default buffer pool due to other 
segments needing space.

    *    Recycle pool: An alternate buffer pool where by convention you would assign large segments 
that you access very randomly, and which would therefore cause excessive buffer flushing but would 
offer no benefit because by the time you wanted the block again it would have been aged out of the 
cache. You would separate these segments out from the segments in the default and keep pools so 
that they would not cause those blocks to age out of the cache.


Note that in the keep and recycle pool descriptions I used the phrase “by convention.” There is 
nothing in place to ensure that you use neither the keep pool nor the recycle pool in the fashion 
described. In fact, the three pools manage blocks in a mostly identical fashion; they do not have 
radically different algorithms for aging or caching blocks. The goal here was to give the DBA the 
ability to segregate segments to hot, warm, and do not care to cache areas. The theory was that 
objects in the default pool would be hot enough (i.e., used enough) to warrant staying in the cache 
all by themselves. The cache would keep them in memory since they were very popular blocks. You 
might have had some segments that were fairly popular, but not really hot; these would be 
considered the warm blocks. These segments' blocks could get flushed from the cache to make room 
for some blocks you used infrequently (the “do not care to cache” blocks). To keep these warm 
segments blocks cached, you could do one of the following:

    *    Assign these segments to the keep pool, in an attempt to let the warm blocks stay in the 
buffer cache longer.

    *    Assign the “do not care to cache” segments to the recycle pool, keeping the recycle pool 
fairly small so as to let the blocks come into the cache and leave the cache rapidly (decrease the 
overhead of managing them all).

This increased the management work the DBA had to perform, as there were three caches to think 
about, size, and assign objects to. Remember also that there is no sharing between them, so if the 
keep pool has lots of unused space, it won’t give it to the overworked default or recycle pool. All 
in all, these pools were generally regarded a very fine, low-level tuning device, only to be used 
after most all other tuning alternatives had been looked at (if I could rewrite a query to do 
one-tenth the I/O rather then set up multiple buffer pools, that would be my choice!).

Starting in Oracle9i, the DBA had up to four more optional caches, the db_Nk_caches, to consider in 
addition to the default, keep, and recycle pools. These caches were added in support of multiple 
blocksizes in the database. Prior to Oracle9i, a database would have a single blocksize (typically 
2KB, 4KB, 8KB, 16KB, or 32KB). Starting with Oracle9i, a database can have a default blocksize, 
which is the size of the blocks stored in the default, keep, or recycle pool, as well as up to four 
nondefault blocksizes, as explained in Chapter 3. 
The blocks in these buffer caches are managed in the same way as the blocks in the original default 
pool—there are no special algorithm changes for them either. Let’s now move on to cover how the 
blocks are managed in these pools.
 

5 starsIs this a compelling reason for using seperate caches? December 16, 2005 - 6am Central time zone
Reviewer: Jack Douglas from Maidenhead, UK
Hi Tom,

We have a large table (95% of the whole database) that is often queried via an index. The problem 
we have is that even though we access only a very small percentage of the table in a query, it is 
still a large number of blocks relative to the buffer size. 

Because we are not doing a full scan am I right that potentially the buffer cache will be flushed 
by these queries?

Is it true that the small table threshold or 'cache' setting will not prevent this happening to any 
of the blocks currently in the buffer?

If so, does that make this an immediate candidate for a 'recycle' pool?

Thanks for your input,
Jack 


Followup December 16, 2005 - 8am Central time zone:

small table thresholds/cache settings are relevant for full scans.


nothing makes anything an immediate candidate for the recycle pool.  Have you diagnosed a physical 
IO problem on your system regarding other objects (beyond this one)

Have you considered an index organized table - to reduce the number of blocks required to satisfy 
an index range scan (instead of a possible

a) 3 or 4 blocks to read index
b) plus number of blocks = number of rows retrieved for table access by index rowid in worst case

you would have

a) 3 or 4 blocks to read index to find first row
b) plus as few blocks as it can take to store the data

? 

5 stars December 16, 2005 - 9am Central time zone
Reviewer: David Aldridge http://oraclesponge.blogspot.com from Colorado Springs
We were discussing here 
http://tinyurl.com/cyyyu
the effect of read ahead caching on FTS of small tables. Do you think that buffer caching of small 
tables might be a "valid" method of avoiding i/o inefficiency due to read ahead kicking in towards 
the end of the table scan and inadvertantly reading past the HWM? Or is that crazy talk? 


Followup December 16, 2005 - 12pm Central time zone:

this reminds me sort of "separate indexes from tables" in a way ;)

Might be more reasonable to place "large segments you plan on multi-block IO'ing together and 
separate them from segments you plan o single-block IO'ing" at the volume level in order to prevent 
read ahead from kicking in.

Reason I say this is - it was observed that separating indexes from data helped a system once. 
turned out it was the separation of single block IO'ed objects from multi-block IO'ed objects - and 
it was the suppression of "magic read ahead algorithms" on the single block IO'ed things that made 
the difference. 

5 starsIs this a compelling reason for using seperate caches? - followup December 16, 2005 - 11am Central time zone
Reviewer: Jack Douglas from Maidenhead, UK
No, at the moment I only suspect a PIO problem on other objects. I will focus my attention on 
proving this one way or the other. If it proves to be the case would you then suggest considering 
splitting the cache?

Strangely enough it is IO that prevents us using an IOT. The queries at the end of the script are 
like a typical query on our system - the fact table is queried by dimension1, dimension2 and also 
by dimension1, dimension3:



SQL*Plus: Release 10.2.0.1.0 - Production on Fri Dec 16 16:28:33 2005

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle9i Release 9.2.0.6.0 - Production
JServer Release 9.2.0.6.0 - Production

16:28:33 TRACKER@oracle> create cluster t_cluster (id1 integer, partial_id2 integer, id3 integer);

Cluster created.

Elapsed: 00:00:00.01
16:28:37 TRACKER@oracle> create index k_t_cluster on cluster t_cluster;

Index created.

Elapsed: 00:00:00.01
16:28:37 TRACKER@oracle> create table t1 (id1 integer, partial_id2 integer, id2 integer, id3 
integer, dummy char (100)) cluster t_cluster (id1, partial_id2, id3);

Table created.

Elapsed: 00:00:00.01
16:28:38 TRACKER@oracle> alter table t1 add constraint pk_t1 primary key (id1, id2, id3);

Table altered.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create index nu_t1 on t1 (id1, id3);

Index created.

Elapsed: 00:00:00.03
16:28:38 TRACKER@oracle> create table t2 (id1 integer, partial_id2 integer, id2 integer, id3 
integer, dummy char (100), constraint pk_t2 primary key (id1, partial_id2, id3, id2)) organization 
index;

Table created.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create unique index u_t2 on t2 (id1, id2, id3);

Index created.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create index nu_t2 on t2 (id1, id3);

Index created.

Elapsed: 00:00:00.03
16:28:38 TRACKER@oracle> create table t3 (id1 integer, id2 integer, id3 integer, dummy char (100), 
constraint pk_t3 primary key (id1, id3, id2)) organization index;

Table created.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create index nu_t3 on t3 (id1, id3);

Index created.

Elapsed: 00:00:00.03
16:28:38 TRACKER@oracle> --
16:28:38 TRACKER@oracle> begin
16:28:38   2    for j in 1..10 loop
16:28:38   3      for i in 1..50 loop
16:28:38   4        insert into t1 (id1, partial_id2, id2, id3, dummy)
16:28:38   5        select j, round (rownum / 50), rownum, i, 'A'
16:28:38   6        from dba_objects
16:28:38   7        where rownum <= 100;
16:28:38   8        --
16:28:38   9        insert into t2 (id1, partial_id2, id2, id3, dummy)
16:28:38  10        select j, round (rownum / 50), rownum, i, 'A'
16:28:38  11        from dba_objects
16:28:38  12        where rownum <= 100;
16:28:38  13        --
16:28:38  14        insert into t3 (id1, id2, id3, dummy)
16:28:38  15        select j, rownum, i, 'A'
16:28:38  16        from dba_objects
16:28:38  17        where rownum <= 100;
16:28:38  18      end loop;
16:28:38  19    end loop;
16:28:38  20  end;
16:28:38  21  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:33.71
16:30:12 TRACKER@oracle> --
16:30:12 TRACKER@oracle> commit;

Commit complete.

Elapsed: 00:00:00.03
16:30:12 TRACKER@oracle> --
16:30:12 TRACKER@oracle> create table tx as select distinct id1, partial_id2 from t1;
create table tx as select distinct id1, partial_id2 from t1
             *
ERROR at line 1:
ORA-00955: name is already used by an existing object


Elapsed: 00:00:00.01
16:30:12 TRACKER@oracle> --
16:30:12 TRACKER@oracle> analyze table t1 compute statistics;

Table analyzed.

Elapsed: 00:00:02.57
16:30:15 TRACKER@oracle> analyze table t2 compute statistics;

Table analyzed.

Elapsed: 00:00:01.53
16:30:16 TRACKER@oracle> analyze table t3 compute statistics;

Table analyzed.

Elapsed: 00:00:01.50
16:30:18 TRACKER@oracle> analyze table tx compute statistics;

Table analyzed.

Elapsed: 00:00:00.03
16:30:18 TRACKER@oracle> --
16:30:18 TRACKER@oracle> select sum (blocks) as blocks from (select blocks from dba_tables where 
table_name = 'T1' union all select leaf_blocks from dba_indexes where table_name = 'T1');
--

    BLOCKS
----------
      2591

Elapsed: 00:00:00.73
16:30:18 TRACKER@oracle> select sum (leaf_blocks) as blocks from dba_indexes where table_name = 
'T2';
--

    BLOCKS
----------
      2764

Elapsed: 00:00:00.04
16:30:18 TRACKER@oracle> select sum (leaf_blocks) as blocks from dba_indexes where table_name = 
'T3';
--

    BLOCKS
----------
      1864

Elapsed: 00:00:00.03
16:30:19 TRACKER@oracle> set autotrace trace explain statistics;
16:30:19 TRACKER@oracle> --
16:30:19 TRACKER@oracle> select min (dummy) from (select * from t1 where id1 = 5 and id2 = 50 union 
select * from t1 natural join tx where id1 = 5 and id3 = 25);

Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=57 Card=1 Bytes=102)
   1    0   SORT (AGGREGATE)
   2    1     VIEW (Cost=57 Card=150 Bytes=15300)
   3    2       SORT (UNIQUE) (Cost=57 Card=150 Bytes=16600)
   4    3         UNION-ALL
   5    4           TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=14 Car
          d=50 Bytes=5400)

   6    5             INDEX (RANGE SCAN) OF 'PK_T1' (UNIQUE) (Cost=3 C
          ard=50)

   7    4           NESTED LOOPS (Cost=4 Card=100 Bytes=11200)
   8    7             TABLE ACCESS (FULL) OF 'TX' (Cost=3 Card=3 Bytes
          =12)

   9    7             TABLE ACCESS (CLUSTER) OF 'T1' (Cost=1 Card=33 B
          ytes=3564)

  10    9               INDEX (UNIQUE SCAN) OF 'K_T_CLUSTER' (NON-UNIQ
          UE)





Statistics
----------------------------------------------------------
         29  recursive calls
          0  db block gets
         70  consistent gets
          0  physical reads
          0  redo size
        479  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

16:30:19 TRACKER@oracle> select min (dummy) from (select * from t2 where id1 = 5 and id2 = 50 union 
select * from t2 natural join tx where id1 = 5 and id3 = 25);

Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=45 Card=1 Bytes=102)
   1    0   SORT (AGGREGATE)
   2    1     VIEW (Cost=45 Card=150 Bytes=15300)
   3    2       SORT (UNIQUE) (Cost=45 Card=150 Bytes=16600)
   4    3         UNION-ALL
   5    4           INDEX (UNIQUE SCAN) OF 'PK_T2' (UNIQUE) (Cost=1 Ca
          rd=50 Bytes=5400)

   6    5             INDEX (RANGE SCAN) OF 'U_T2' (UNIQUE) (Cost=1 Ca
          rd=50)

   7    4           NESTED LOOPS (Cost=4 Card=100 Bytes=11200)
   8    7             TABLE ACCESS (FULL) OF 'TX' (Cost=3 Card=3 Bytes
          =12)

   9    7             INDEX (RANGE SCAN) OF 'PK_T2' (UNIQUE) (Cost=1 C
          ard=33 Bytes=3564)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        168  consistent gets
          0  physical reads
          0  redo size
        479  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

16:30:19 TRACKER@oracle> select min (dummy) from (select * from t3 where id1 = 5 and id2 = 50 union 
select * from t3 where id1 = 5 and id3 = 25);

Elapsed: 00:00:00.25

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=46 Card=1 Bytes=102)
   1    0   SORT (AGGREGATE)
   2    1     VIEW (Cost=46 Card=150 Bytes=15300)
   3    2       SORT (UNIQUE) (Cost=46 Card=150 Bytes=15900)
   4    3         UNION-ALL
   5    4           INDEX (UNIQUE SCAN) OF 'PK_T3' (UNIQUE) (Cost=19 C
          ard=50 Bytes=5300)

   6    5             INDEX (RANGE SCAN) OF 'NU_T3' (NON-UNIQUE) (Cost
          =19 Card=5000)

   7    4           INDEX (UNIQUE SCAN) OF 'PK_T3' (UNIQUE) (Cost=1 Ca
          rd=100 Bytes=10600)

   8    7             INDEX (RANGE SCAN) OF 'NU_T3' (NON-UNIQUE) (Cost
          =1 Card=100)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      15342  consistent gets
          0  physical reads
          0  redo size
        479  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed
 


Followup December 16, 2005 - 1pm Central time zone:

... If it proves to be the case 
would you then suggest considering splitting the cache? ...

no, not without more information.  I would first be looking at decreasing the IO's needed on the 
other thing.


so you are using a cluster and have it clustered by dimension1? 

5 starsoops December 16, 2005 - 11am Central time zone
Reviewer: Jack Douglas from Maidenhead, UK
Sorry Tom,

Table tx create failed in my script above because I ran it twice and forgot to drop it. If you run 
it, it should work though...

Perhaps I should also have mentioned that I agree with your logic about IOTs in general and that 
the PL/SQL part of the script simulates the way data is added to our particular database (1 large 
dataload per week, dimension3/id3 is a week number). I am not trying to generalise about IOTs 
versus heaps/clusters. 


Followup December 16, 2005 - 1pm Central time zone:

no worries - the mentioning of the IOT was to achieve clustering - which maybe you are already 
doing but via a cluster right?   

5 starsyes December 16, 2005 - 3pm Central time zone
Reviewer: Jack Douglas from Maidenhead, UK
Yes we use the cluster to the same effect as an IOT - just to keep certain rows together that we 
query together. 

We cluster on a function of the dimensions rather than on any one of them (actually id1, round (id2 
/ 50), id3) because we query both id1, id3 and id1, id2 and this is a kind of halfway house between 
the two. Clustering on either one would be great for one side of the union in the query but 
disastrous for the other - hence the relatively huge IO on the third query.

But basically, yes, as you say, we do it for the same reason we would use an IOT. Unfortunately 
what you gain with an IOT is lost when you add secondary indexes in our case due to the primary key 
duplication. 


5 stars December 17, 2005 - 11am Central time zone
Reviewer: David Aldridge http://oraclesponge.blogspot.com from Colorado Springs
>> this reminds me sort of "separate indexes from tables" in a way ;)

Might be more reasonable to place "large segments you plan on multi-block IO'ing 
together and separate them from segments you plan o single-block IO'ing" at the 
volume level in order to prevent read ahead from kicking in.

Reason I say this is - it was observed that separating indexes from data helped 
a system once. turned out it was the separation of single block IO'ed objects 
from multi-block IO'ed objects - and it was the suppression of "magic read ahead 
algorithms" on the single block IO'ed things that made the difference. <<

Ah, that's an interesting point on the index/table setup ... yes, if there is a way to segregate 
the segments according to whether we want read ahead applied or not then that would do the trick. 
Much depends on how read ahead works on a particular system though I'd think. 


4 starsDefining Multiple Buffer Pools May 3, 2006 - 4pm Central time zone
Reviewer: Yoav
Hi Tom,

In 8.1.7.4 when defining multiple buffer pool its needed to specify two attributes for each buffer 
pool:
* The number of buffers in the buffer pool
* The number of lru latches allocated to the buffer pool
for example :

BUFFER_POOL_KEEP=(BUFFERS:10000, LRU_LATCHES:2)

Could you please explain how to calculate the size of "BUFFERS" and "LRE_LATCHES" ?

Regards.
 


Followup May 4, 2006 - 1am Central time zone:

are you really realy really sure you need to use this (I mean, if you don't know how big it is - 
that should be something you "know" since you are going to use this to tune a very specific problem 
with?) 

4 starsMultiple Pools May 5, 2006 - 12am Central time zone
Reviewer: Yoav
Hi Tom,
Thanks for your input.
We have 3rd party application,and we are suffering from heavy logical i/o activity.
(e.g:db file sequential read show 80% in thae last statspack on a unix with 8 cpus).
For now we cant touch the application code . 

So i read in expert-one-on-one page 80 :
"...a buffer pool ,large enough to our 'lookup' tables in memory. for example, when oracle read a 
blocks from this table,they always get cached in this special pool.
... A buffer that is set up to cache blocks like this is known as KEEP pool..."

And also read : 
http://asktom.oracle.com/pls/ask/f?p=4950:8:13335700705400115196::NO::F4950_P8_DISPLAYID,F4950_P8_CR
ITERIA:6265095774206
and thought that increasing the buffer cache and using multiple pools ,m a y b e, help to reduce 
this the logical i/o.

1. So back to my last question, i think that "BUFFER:"   represent the SUM(BYTE) from dba_tables + 
SUM(BYTE) from
dba_indexes . Is it right ?
2. I actualy dont know how much "LRE_LATCHES:" to assign
   for the KEEP pool and for the RECYCLE pool.
   I hope you can advice about that.

Regards. 


Followup May 5, 2006 - 1am Central time zone:

....
and we are suffering from heavy logical i/o 
activity.
(e.g:db file sequential read show 80% in thae last statspack on a unix with 8 
cpus).
.........

that would not be heavy logical i/o perhaps - that would be PHYSICAL IO.  and 80% - that is just a 
ratio, 80% of WHAT?


I'm not going to touch the buffer pool here - you haven't identified as far as I can tell what the 
problem is, what objects you might slide into these pools and how that might help you. 

3 starsMultiple buffer pools May 5, 2006 - 4am Central time zone
Reviewer: Yoav
Hi Tom,
I based my estimation on the statspack. since this thread is not about the 
statspack i didnt want to add any copy/paste proof from the statspack. 

Hir some more information:
1. I wrote  heavy logical i/o based on the logical reads per second (2054 per second)
   which is about 7 time more then the physical reads (300 per second)
2. From the "Top 5 Wait Events" 80% of  the total waits event    was due to "db file sequential 
read"
   (I know that you want like it, but for fairness I'll say that the statspack run for about 40 
hours - but in this case its just help me to show that i have 
    a consistent problem with "db file sequential read" wait event).     
3. The best thing to do, was to change the appliaction code.
   I know that touching the buffer cache want change the fact that 1 of 4 transaction still will 
ended with rollback. 
   But ss i said Since its not possible ,at this point in time  to change the appliaction code,
   dont you think that increasing the buffer cache and using multiple pools can help ?     

Cache Sizes
~~~~~~~~~~~
           db_block_buffers:     524288          log_buffer:262144
              db_block_size:       4096    shared_pool_size:89128960

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:            129,876.68             29,747.96
              Logical reads:              2,054.39                470.55
              Block changes:                949.55                217.49
             Physical reads:                299.29                 68.55
            Physical writes:                 72.85                 16.69
                 User calls:                329.95                 75.58
                     Parses:                 37.97                  8.70
                Hard parses:                  0.10                  0.02
                      Sorts:              9,597.31              2,198.24
                     Logons:                  0.07                  0.02
                   Executes:                195.04                 44.67
               Transactions:                  4.37

  % Blocks changed per Read:   46.22    Recursive Call %:               16.25
 Rollback per transaction %:   24.00       Rows per Sort:                0.03

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                             Wait     % Total
Event                                               Waits  Time (cs)   Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read                        15,204,906    6,856,997   79.03
log file parallel write                         1,085,325    1,070,571   12.34
          -------------------------------------------------------------

Thanks. 


Followup May 5, 2006 - 6am Central time zone:

so, how did that tell you what segments would benefit from their own buffer pool.

That is my point, how the heck do you know what you want to put into this buffer pool.

40 hours, bah - ignore this statspack, you cannot do anything with it.


multiple buffer pools ain't going to help - unless and until you know the segments that would truly 
benefit from it - and I don't think you do.
 

5 starsTouch count algorithm and "alter table t1 cache" March 16, 2007 - 9am Central time zone
Reviewer: Guy Lambregts from Belgium
Tom,

I think to know what alter table t1 cache did - in release 7.x and 8.x - this with repect to MRU and LRU if the blocks of table t1 where accessed in a scattered way (FTS).

Also I think to know from release 8.1.5 onwards Oracle has introduced the touch count algorithm. With my own words : A block is read into the buffer cache where it is inserted at a midpoint, somwhere at the "border" of the cold and the warm region. The more a block is touched the more its touch count increases (x$bh.tch) the lesser the likelyhood the block will be "thrown" out of the cache. I know there are a set of hidden parameters, which I never ... touch.

Now my question : Is the alter table t1 cache still relevant with respect to this touch count philosophy ? I mean is there still any difference between alter table t1 cache and alter table t1 nocache ?

Kind Regards
Guy

Followup March 17, 2007 - 3pm Central time zone:

you should use alter table T cache since the introduction of the keep and recycle pools really.

but the effect is the same, a table with the cache attribute is considered a short table during a full scan regardless of its real size.


3 starskeep pool March 26, 2007 - 2am Central time zone
Reviewer: A reader
I have a table a with size 250 MB and full table scan happens on this table.

[1] Does it worth to keep table a in keep pool.

[2] e.g for a table which is not in keep pool, any dml updates data buffer cache and then write the data on disk at checkpoint.What happens when we keep a table in keep pool. How does inserts, deletes and updates work on this table a.

Followup March 26, 2007 - 7am Central time zone:

1) yes, no, maybe.

2) just like they would if you did not use a keep pool - the keep/recycle pools are just like the buffer cache itself, they are buffer caches.
4 starsHow big a table can be cached? February 21, 2008 - 2pm Central time zone
Reviewer: Sal from NJ
OLTP .. 10.2 version.

We have identified 4 tables that are the center of all activities in the database. Update, insert, 
delete, selects.. 

Some DBA's are suggesting to put them in keep pool. They tested it on a production copy and the 
bottlenck operations started working in seconds instead of 5+ minutes. But in their test they were 
the only one in the database..

The tables are 3 GB in size.. Our shared pool is 8 gb, they are saying lets increase it to 16 gb 
and put these tables worth 3 gb on keep pool.

Is there any limit to what can be cached? I have cached smaller lookup tables in past, but not sure 
about this here.

Any suggestions Sir?

thanks


Followup February 21, 2008 - 5pm Central time zone:

only the limits of machine addressability.
5 starsalter table cache candidate February 28, 2008 - 11am Central time zone
Reviewer: Gareth Adcock from England
Hello

I think that I may have a candidate for the 'alter table cache' command.

The database concerned (version 9.2.0.3.0) has a buffer cache of 2400 Mb.
I have a table which I think 'should' be cached most of the time but would appear to have been aged 
out most of the times that 

the application has read from it, resulting in a large ammount of physical reads.

The table I am looking at the most is at the bottom of the following list in terms of size 
(F554215). 

select segment_name,segment_type, ((blocks*8)/1024)/1024 GB
from dba_segments
where (owner,segment_name) in
(select owner,object_name from v$segment_statistics where statistic_name = 'physical reads' and 
value > 100000000)
order by blocks desc

SEGMENT_NAME      SEGMENT_TYPE      GB
-------------------------------------------------------------
F42199                TABLE                 15.902557373046875
F42199                TABLE                 13.0977630615234375
F4111                 TABLE                 10.4477691650390625
F0911                 TABLE                 10.242462158203125
F42119                TABLE                 1.8017578125
F0902                 TABLE                 .7207183837890625
F4611                 TABLE                 .4254913330078125
F4108                 TABLE                 .34760284423828125
F4211                 TABLE                 .11937713623046875
F554215               TABLE                 .0668182373046875

 

It is a 'hot' table rather than a 'keep warm' table.
It gets read a lot; the potential problem is that it gets read a lot form disk.


select owner,object_name,value from v$segment_statistics where statistic_name = 'physical reads' 
and value > 100000000 

(hundred million)
order by value desc

OWNER        OBJECT_NAME      VALUE 
----------------------------------------------
PRODDTA      F0911            2480884853
PRODDTA      F4111            1733224651
PRODDTA      F554215          1549742899
PRODDTA      F42119           982973397
PRODDTA      F42199           908307993
PRODDTA      F0902            296390653
CRPDTA       F42199           187409137
PRODDTA      F4108            181621627
PRODDTA      F4211            148565012
PRODDTA      F4611            118151835

So a relatively small table comes third in total physical reads.

select owner,object_name,value
from v$segment_statistics
where statistic_name = 'logical reads'
and value > 1000000000 (thousand million)
order by value desc

OWNER        OBJECT_NAME            VALUE       
----------------------------------------------------
PRODDTA      F0911                  4087505552      
SYS               I_OBJ1                 2717363136      
PRODDTA      F4111                  1909729328      
PRODDTA      F554215                1840557360      
PRODDTA      F4108_0                1626329632      
PRODDTA      F554108_7              1328939200      
PRODDTA      F41021_PK              1310094416      
PRODDTA      F59FS800               1146349296      
PRODDTA      CI_F0911_FIN_CUBE      1006994896 


So about 84% of the time the data gets read from disk.
I looked at the sql statements pertaining to this table,which were currently stored in memory.
SQL Analyzer showed up the likely suspect and I queried v$sql

select fetches, executions, first_load_time,last_load_time
from v$sql
where sql_text  = 
'SELECT  *  FROM PRODDTA.F554215 WHERE  ( XHMCU = :KEY1 ) '

FETCHES      EXECUTIONS      FIRST_LOAD_TIME          LAST_LOAD_TIME
----------------------------------------------------------------------------------------------------
------------
5587             5587            2008-01-14/06:14:02      2008-02-25/06:28:48


SELECT  count(*)  FROM PRODDTA.F554215
group by xhmcu

COUNT(*)      
--------------
175002        
2             
1

I'm guessing it's always going to do a full table scan.
This table gets updated a lot and I understand that this makes it unsuitable for a bitmap index, 
besides it's going to want to 

bring back almost all of the blocks almost all of the time.
A function based index may be an option for the rare values but these are almost never queried.

I ran the sql statement from my SQL Developer session.

Explian Plan:
OPERATION                OPTIMIZER    COST    CARDINALITY    BYTES
SELECT STATEMENT            CHOOSE    1907    174200        61144200            

        
TABLE ACCESS(FULL) PRODDTA.F554215    ANALYZED    1907    174200        61144200            

        



Autotrace:
            
recursive calls                        0
db block gets                        0
consistent gets                        8856
physical reads                        8730
redo size                    0
bytes sent via SQL*Net to client        1613
bytes received via SQL*Net from client        647
SQL*Net roundtrips to/from client        5
sorts (memory)                        2
sorts (disk)                        0 


In my novice opinion it would seem that this table could benefit from 'alter table cache' to slow 
down the aging out process.
If this is not a practical or a good idea would setting up a seperate keep buffer cache be worth 
looking into?

My first instinct is to trust the algorithms that Oracle uses to allocate resources but this one is 
puzzling me.


Thanks

Gareth


Followup February 28, 2008 - 11pm Central time zone:

SELECT count(*) FROM PRODDTA.F554215
group by xhmcu


I would erase that query in the application, that would be my fix.

I'll bet it is some silly logic like:

select the count
if the count > 0
then
do something



I would just code:

do something


period, end, you never need to count the rows. Besides, look at that query, the output is USELESS

SELECT count(*) FROM PRODDTA.F554215
group by xhmcu


a count grouped by a column that is not selected, so you get a bunch of random numbers in some arbitrary order representing something.

what a waste.
5 starsMy mistake February 29, 2008 - 4am Central time zone
Reviewer: Gareth from England
The actual query I am looking at is

SELECT  *  FROM PRODDTA.F554215 WHERE  ( XHMCU = :KEY1 )

Sorry, it's tucked away in:

select fetches, executions, first_load_time,last_load_time
from v$sql
where sql_text  = 
'SELECT  *  FROM PRODDTA.F554215 WHERE  ( XHMCU = :KEY1 ) '

I should have referenced it explicitly, my fault.

The count(*) was just to demonstrate the limited number of possible values in xhmcu and skewed 
nature of the data.

apologies Gareth


Followup March 1, 2008 - 10am Central time zone:

ok, now for a super silly question.

why do you need to full scan this over and over (I'm going to keep going back to the application, where 99.999% of all tuning needs be done in every case).

Why would you full scan this table *so often*, what is the logic there. Does it really make sense to do this (putting in ram isn't going to fix very much, if you are using regular OS files, it is already probably buffered in the OS file system cache).


don't forget - the v$ tables are since *forever* (since instance started, those IO's are cumulative)

Now, that said - looking at the dates on that - that is like two weeks worth of executes.

Why do you need to full scan this table 400 times a day?
5 starsThanks March 3, 2008 - 5am Central time zone
Reviewer: Gareth from England
Thanks for looking at this.

The short answer is that I don't know.

This is  third party application that bolts on to our main JDEdwards database.
I don't have any direct involvement in development other than to raise concerns after the event.  I 
am currently highlighting issues concerning the use of literals rather than bind variables in the 
code resulting in hundreds of almost identical sql statements in the shared pool.

I will add this to my list of issues.

Thanks again for taking the time to look at this.

Gareth


Followup March 3, 2008 - 7am Central time zone:

you could try setting up a keep pool (a couple times larger than the size of the table, especially if it is modified/updated)...

alter the table to be in the keep pool
alter the table 'cache'

that should reduce the physical IO to minimal - and if that is the cause of a problem (the IO's were), that problem would be reduced...
3 starsCaching a table in Memory September 5, 2008 - 6pm Central time zone
Reviewer: Ramani from USA
Hi tom
Thanks for the information regarding keeping / deleting tables in buffer_pool.

alter table mytable storage (buffer_pool keep);
alter table mytable storage (buffer_pool default);

Is there a way to check what tables are kept in buffer_pool ? from which dictionary view. 
I do not think I get the table names that are kept in buffer_pool from V$BUFFER_POOL.

Thanks in advance,
Ramani


5 starsALTER TABLE CACHE July 9, 2009 - 1am Central time zone
Reviewer: A reader
Single user environment.

SQL> 
SQL> SELECT * FROM v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE    11.1.0.6.0    Production
TNS for 32-bit Windows: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

Elapsed: 00:00:00.01
SQL> 
SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 1104M
sga_target                           big integer 1104M
SQL> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 400M
SQL> 
SQL> CREATE TABLE m AS
  2  SELECT rownum mID,
  3          MOD(rownum, 100) cnt1,
  4          MOD(rownum, 200) cnt2,
  5          TRUNC(dbms_random.value(1, 10000)) cnt3,
  6          RPAD('x', 30, 'x') char1,
  7          RPAD('x', 30, 'y') char2,
  8          RPAD('x', 300, 'z') char3
  9  FROM   dual
 10  CONNECT BY level <= 1000000;

Table created.

Elapsed: 00:00:16.75
SQL> 
SQL> CREATE TABLE i AS
  2  SELECT rownum iID,
  3          mID,
  4          MOD(rownum, 2) cnt1,
  5          MOD(rownum, 4) cnt2,
  6          RPAD('x', 30, 'x') char1
  7  FROM   m
  8  UNION ALL
  9  SELECT rownum + 1000001 iID,
 10          mID,
 11          MOD(rownum, 2) cnt1,
 12          MOD(rownum, 4) cnt2,
 13          RPAD('x', 30, 'x') char1
 14  FROM   m;

Table created.

Elapsed: 00:00:17.15
SQL> 
SQL> SELECT segment_Name, bytes/1024/1024 MB
  2  FROM   user_segments
  3  WHERE  segment_Name IN ('I', 'M');

SEGMENT_NAME                           MB
------------------------------ ----------
I                                     119
M                                     439

Elapsed: 00:00:00.01
SQL> 
SQL> 
SQL> ALTER TABLE m ADD CONSTRAINT m_pk PRIMARY KEY (mID);

Table altered.

Elapsed: 00:00:08.54
SQL> 
SQL> ALTER TABLE i ADD CONSTRAINT i_fk1 FOREIGN KEY (mID)
  2      REFERENCES m(mID);

Table altered.

Elapsed: 00:00:04.82
SQL> 
SQL> CREATE INDEX i_fk1 ON i(mID);

Index created.

Elapsed: 00:00:02.54
SQL> 
SQL> 
SQL> BEGIN
  2    DBMS_STATS.GATHER_TABLE_STATS(
  3       ownname      => user,
  4       tabname      => 'M',
  5       estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
  6       method_opt      => 'FOR ALL COLUMNS SIZE SKEWONLY',
  7       cascade      => TRUE);
  8  
  9    DBMS_STATS.GATHER_TABLE_STATS(
 10       ownname      => user,
 11       tabname      => 'I',
 12       estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
 13       method_opt      => 'FOR ALL COLUMNS SIZE SKEWONLY',
 14       cascade      => TRUE);
 15  END;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:21.51
SQL> 
SQL> 
SQL> set autotrace on
SQL> 
SQL> SELECT COUNT(*)
  2  FROM   m, i
  3  WHERE  m.mID   = i.mID AND
  4          i.cnt1  = 1 AND
  5          m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx';

  COUNT(*)
----------
   1000000

Elapsed: 00:00:05.78

Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 18686   (1)| 00:03:45 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |   995K|    41M|    18M| 18686   (1)| 00:03:45 |
|*  3 |    TABLE ACCESS FULL| I    |   995K|  7774K|       |  3200   (1)| 00:00:39 |
|*  4 |    TABLE ACCESS FULL| M    |   999K|    34M|       | 12258   (1)| 00:02:28 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("I"."MID"="M"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      69907  consistent gets
      55557  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL&gt; 
SQL&gt; ALTER TABLE m CACHE;

Table altered.

Elapsed: 00:00:00.00
SQL&gt; ALTER TABLE i CACHE;

Table altered.

Elapsed: 00:00:00.01
SQL&gt; 
SQL&gt; SELECT COUNT(*)
  2  FROM   m, i
  3  WHERE  m.mID   = i.mID AND
  4          i.cnt1  = 1 AND
  5          m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx';

  COUNT(*)
----------
   1000000

Elapsed: 00:00:05.40

Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 18686   (1)| 00:03:45 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |   995K|    41M|    18M| 18686   (1)| 00:03:45 |
|*  3 |    TABLE ACCESS FULL| I    |   995K|  7774K|       |  3200   (1)| 00:00:39 |
|*  4 |    TABLE ACCESS FULL| M    |   999K|    34M|       | 12258   (1)| 00:02:28 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("I"."MID"="M"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx')


Statistics
----------------------------------------------------------
        365  recursive calls
          0  db block gets
      69968  consistent gets
      55556  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         12  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL&gt; 
SQL&gt; -- --------------------------------------------------------------
SQL&gt; -- Run the SQL again.
SQL&gt; -- --------------------------------------------------------------
SQL&gt; /

  COUNT(*)
----------
   1000000

Elapsed: 00:00:05.29

Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 18686   (1)| 00:03:45 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |   995K|    41M|    18M| 18686   (1)| 00:03:45 |
|*  3 |    TABLE ACCESS FULL| I    |   995K|  7774K|       |  3200   (1)| 00:00:39 |
|*  4 |    TABLE ACCESS FULL| M    |   999K|    34M|       | 12258   (1)| 00:02:28 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("I"."MID"="M"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      69907  consistent gets
      55556  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL&gt; 
SQL&gt; spool off
</wbr></wbr></wbr></wbr></wbr>


Why are there so many physical IO's even after I marked the tables for caching in memory? If these were the only two tables in the database, is there anyway I can completely eliminate physical IO's giving that 1 GB is set aside for SGA and the 2 tables added up to be around 550 MB?


Followup July 14, 2009 - 1pm Central time zone:

well, you say the SGA is 1gb, but the SGA is made of many things. How big is your BUFFER Cache and remember - lots of other stuff has to fit in there.

and what was the physical IO to - tkprof with a row source operation would be infinitely better than this - this has no detail.
3 starsResult cache July 14, 2009 - 3pm Central time zone
Reviewer: A reader
Reader,

Since you're on 11g, you may want to look into using the result cache feature. Make sure the 
appropriate result cache init parameters are sized for the amount of data you want to cache. Rerun 
your example and add the  hint on your selects. The first select will still do 
all the IOs, subsequent executions of the same select should be extremely fast because the result 
is cached.

Give it a try, it rocks!




5 starsALTER TABLE CACHE Followup July 14, 2009 - 4pm Central time zone
Reviewer: A reader
Here's the tkprof with row source operation. It looks like the physical IO's are all from full 
scanning table m.

STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=69914 pr=55556 pw=55556 time=0 us)'
STAT #3 id=2 cnt=1000000 pid=1 pos=1 obj=0 op='HASH JOIN  (cr=69914 pr=55556 pw=55556 time=266516 
us cost=18706 size=44818048 card=1018592)'
STAT #3 id=3 cnt=1000000 pid=2 pos=1 obj=22718 op='TABLE ACCESS FULL I (cr=14350 pr=0 pw=0 
time=3888 us cost=3199 size=8148736 card=1018592)'
STAT #3 id=4 cnt=1000000 pid=2 pos=2 obj=22717 op='TABLE ACCESS FULL M (cr=55564 pr=55556 pw=55556 
time=245890 us cost=12257 size=35996760 card=999910)'


SQL> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_cache_size                        big integer 712M

SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 1104M
sga_target                           big integer 1104M

I tried setting db_cache_size to 0 so Oracle would automatically allocate memory.  However, the 
result was the same.



Followup July 15, 2009 - 11am Central time zone:

I see that pr=55556
I see also that pw (physical write) is...... 55556.


How about you do the trace with wait events being recorded - I'll bet the waits are "direct read temp" and "direct write temp"

Eg: table M is hashed but spills to disk.
5 starsTo "A Reader" - July 14, 2009 - 3pm US/Eastern July 14, 2009 - 5pm Central time zone
Reviewer: A reader
Thanks for the tip. However, that's not the point of the question. I was trying to get a better 
understanding of the behavior of caching a table.


Followup July 15, 2009 - 11am Central time zone:

I think the table is probably cached, but you are spilling to disk on a hash operation. Given that pr = pw anyway.
5 starsALTER TABLE CACHE Followup #2 July 15, 2009 - 12pm Central time zone
Reviewer: A reader
All waits are direct path reads.

WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2623184 block cnt=16 obj#=22717 
tim=5157349182372
WAIT #3: nam='direct path read' ela= 4 file number=8 first dba=2623200 block cnt=16 obj#=22717 
tim=5157349183354
WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2623216 block cnt=16 obj#=22717 
tim=5157349184092
...
...
...
WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2562560 block cnt=16 obj#=22717 
tim=5157351288487
WAIT #3: nam='direct path read' ela= 4 file number=8 first dba=2562576 block cnt=16 obj#=22717 
tim=5157351289457
WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2562592 block cnt=16 obj#=22717 
tim=5157351290425

********************************************************************************

SELECT COUNT(*)
FROM   m, i
WHERE  m.mID   = i.mID AND
       i.cnt1  = 1 AND
       m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.76       7.86      55556      69914          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.76       7.86      55556      69914          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 30  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=69914 pr=55556 pw=55556 time=0 us)
1000000   HASH JOIN  (cr=69914 pr=55556 pw=55556 time=266516 us cost=18706 size=44818048 
card=1018592)
1000000    TABLE ACCESS FULL I (cr=14350 pr=0 pw=0 time=3888 us cost=3199 size=8148736 
card=1018592)
1000000    TABLE ACCESS FULL M (cr=55564 pr=55556 pw=55556 time=245890 us cost=12257 size=35996760 
card=999910)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  reliable message                                1        0.00          0.00
  direct path read                             3757        0.00          0.01
  SQL*Net message from client                     2        0.00          0.00

********************************************************************************

</wbr>


Followup July 15, 2009 - 12pm Central time zone:

oh, that explains it, it isn't using a conventional path read at all - it is bypassing the buffer cache. So the alter cache is not useful at all in this case.

In 11g - we read via direct path over conventional path (buffer cache) based on statistics and setup. Direct path reads make sure the latest version of the block is on disk and then just reads away.


strange that the direct path read incremented the pw, that isn't right.
5 starsALTER TABLE CACHE Followup #3 July 15, 2009 - 1pm Central time zone
Reviewer: A reader
Can you elaborate on this statement "In 11g - we read via direct path over conventional path 
(buffer cache) based on statistics and setup.", specifically the setup part.

Also why would it do direct path reads on one table but not the other?


Followup July 15, 2009 - 2pm Central time zone:

the table sizes are different. Based on your object and system statistics, the optimizer opted for a direct path read.

Normally, a full scan in SERIAL mode would use a conventional path read. It would use your db file multiblock read count (should be set automatically by us in 10g and above to do the MAX multiblock read available on your system) to figure out how many blocks to read at a time. Sounds all good - all efficient - however....

Say we decided to read 32 blocks at a time. Further assume block 1, 10, 15, 20, 22 are in the cache already. We cannot use the image on disk for these guys so the IO looks like:

logical IO block 1.
multiblock read 2-9 into the cache, logical IO them out.
logical IO block 10
multiblock read 11-14 into the cache, logical IO them out.
logical IO block 15
multiblock read 16-19 into the cache, logical IO them out.
logical IO block 20
multiblock (well, single block really) read 21 into the cache, logical IO it out
logical IO block 22
multiblock read 23-32 into the cache, logical IO them out.


Using a direct read it would be

a) before running query, checkpoint the segment ensuring current version of data is on disk.

b) read blocks 1-32 into PGA and process them.



The size of the table (your one is bigger than the other) the multiblock read count (actual observed on the system - not the parameter) will influence whether we direct path or conventional path read the segment.
5 starsALTER TABLE CACHE Followup #4 July 15, 2009 - 2pm Central time zone
Reviewer: A reader
I ran the exact same test case on 10.2.0.4. Now it looks like neither tables are being cached?!

SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 1104M
sga_target                           big integer 1104M
SQL> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_cache_size                        big integer 0

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

PARSING IN CURSOR #48 len=127 dep=0 uid=47 oct=3 lid=47 tim=3302881257 hv=1790720527 ad='4c634a74'
SELECT COUNT(*)
FROM   m, i
WHERE  m.mID   = i.mID AND
       i.cnt1  = 1 AND
       m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx'
END OF STMT
PARSE #48:c=15625,e=18817,p=0,cr=56,cu=2,mis=1,r=0,dep=0,og=1,tim=3302881255
EXEC #48:c=0,e=20,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=3302881324
WAIT #48: nam='SQL*Net message to client' ela= 2 driver id=1413697536 ytes=1 p3=0 obj#=8862 
tim=3302881345
WAIT #48: nam='db file scattered read' ela= 10238 file#=16 block#=1826748 blocks=5 obj#=111458 
tim=3302892276
WAIT #48: nam='db file scattered read' ela= 23861 file#=16 block#=1826753 blocks=8 obj#=111458 
tim=3302917607
WAIT #48: nam='db file scattered read' ela= 427 file#=16 block#=1826762 blocks=7 obj#=111458 
tim=3302918388
WAIT #48: nam='db file scattered read' ela= 519 file#=16 block#=1826769 blocks=8 obj#=111458 
tim=3302919216
WAIT #48: nam='db file scattered read' ela= 439 file#=16 block#=1826778 blocks=7 obj#=111458 
tim=3302919994
...
...
WAIT #48: nam='db file scattered read' ela= 1198 file#=16 block#=1873645 blocks=16 obj#=111457 
tim=3311136789
WAIT #48: nam='db file scattered read' ela= 1305 file#=16 block#=1873661 blocks=16 obj#=111457 
tim=3311138410
WAIT #48: nam='db file scattered read' ela= 2228 file#=16 block#=1873677 blocks=16 obj#=111457 
tim=3311140953


********************************************************************************

SELECT COUNT(*)
FROM   m, i
WHERE  m.mID   = i.mID AND
       i.cnt1  = 1 AND
       m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          2           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.06       9.26      69891      69917          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.07       9.27      69891      69917          2           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 47  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=69917 pr=69891 pw=0 time=9264531 us)
1000000   HASH JOIN  (cr=69917 pr=69891 pw=0 time=8964016 us)
1000000    TABLE ACCESS FULL I (cr=14346 pr=14335 pw=0 time=2010330 us)
1000000    TABLE ACCESS FULL M (cr=55571 pr=55556 pw=0 time=4020519 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file scattered read                       4415        0.04          7.49
  SQL*Net message from client                     2        0.00          0.00
********************************************************************************

</wbr></wbr>


Followup July 15, 2009 - 3pm Central time zone:

probably the cache is just insufficient. By the time we read the end of the table we are finding we have to overwrite the "head" of the table. So when you go for the second run, the head of the table isn't there anymore, we read that in and overwrite more of the table - which we have to read and so on.

Not sure why this is surprising? You have well over 1/2gb of data, your SGA is just 1gb. There are many other 'pools' in there.


and unless you are the sole user on this system - there is a ton of other stuff going on.


5 stars July 15, 2009 - 3pm Central time zone
Reviewer: A reader
This is surprising because it's a single-user environment (my test database) and no tables are configured to specifically use a different pool. Shouldn't Oracle be able to stuff 1/2 GB of data into a 1 GB shared pool?

SQL> SELECT buffer_pool, COUNT(*)
  2  FROM   dba_tables
  3  GROUP  BY buffer_pool;

BUFFER_   COUNT(*)
------- ----------
               153
DEFAULT       1655

SQL> show parameter pool

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------
buffer_pool_keep                     string
buffer_pool_recycle                  string
global_context_pool_size             string
java_pool_size                       big integer 0
large_pool_size                      big integer 0
olap_page_pool_size                  big integer 0
shared_pool_reserved_size            big integer 30198988
shared_pool_size                     big integer 0
streams_pool_size                    big integer 0



Followup July 15, 2009 - 4pm Central time zone:

SQL> show sga

let us see how much is currently allocated to the buffer pool.
5 stars July 16, 2009 - 1am Central time zone
Reviewer: A reader
SQL> show sga

Total System Global Area 1157627904 bytes
Fixed Size                  1298016 bytes
Variable Size             218104224 bytes
Database Buffers          931135488 bytes
Redo Buffers                7090176 bytes


Followup July 16, 2009 - 11am Central time zone:

let me see the output of:

ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where isdefault = 'FALSE';

NAME||'='||VALUE
-------------------------------------------------------------------------------
processes=150
sessions=300
sga_max_size=1157627904
sga_target=1157627904
control_files=/home/ora10gr2/oradata/ora10gr2/control01.ctl, /home/ora10gr2/ora
data/ora10gr2/control02.ctl, /home/ora10gr2/oradata/ora10gr2/control03.ctl

db_block_size=8192
compatible=10.2.0.1.0
db_create_file_dest=/home/ora10gr2/oradata/ora10gr2
db_recovery_file_dest=/home/ora10gr2/oradata/fbra
db_recovery_file_dest_size=10737418240
undo_management=AUTO
undo_tablespace=UNDOTBS
undo_retention=5000
db_domain=
dispatchers=(protocol=tcp)
job_queue_processes=10
db_name=ora10gr2
open_cursors=300
os_authent_prefix=OPS$
pga_aggregate_target=419430400

20 rows selected.


and I'll see if I can reproduce - as it is not, I cannot. It would seem you have sufficient buffer cache.
5 stars July 16, 2009 - 11am Central time zone
Reviewer: A reader
SQL> select name || '=' || value from v$parameter where isdefault = 'FALSE'
  2  order  by name
  3  /

NAME||'='||VALUE
----------------------------------------------------------------------------------------------------
-------------------------------
aq_tm_processes=0
background_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\BDUMP
compatible=10.2.0.4.0
control_files=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL01.CTL, 
C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL02.CTL, C:\OR
ACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL03.CTL

core_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\CDUMP
db_block_size=8192
db_cache_size=209715200
db_create_file_dest=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB
db_domain=
db_file_multiblock_read_count=16
db_name=TEST_DB
disk_asynch_io=FALSE
fast_start_mttr_target=0
java_pool_size=0
job_queue_processes=200
large_pool_size=0
nls_length_semantics=BYTE
open_cursors=300
optimizer_index_caching=90
optimizer_index_cost_adj=25
pga_aggregate_target=419430400
processes=1000
query_rewrite_enabled=FALSE
remote_login_passwordfile=EXCLUSIVE
resource_manager_plan=
session_max_open_files=20
sessions=1500
sga_max_size=1157627904
sga_target=1157627904
shared_pool_size=0
sort_area_size=524288
star_transformation_enabled=FALSE
streams_pool_size=0
timed_statistics=TRUE
undo_management=AUTO
undo_retention=900
undo_tablespace=UNDOTBS1
user_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\UDUMP

38 rows selected.



Followup July 16, 2009 - 4pm Central time zone:

well, I disagree with the setting of most of your parameters - however on linux, I cannot reproduce your apparent findings. Now, that said, you differ from me in that the automatic SGA resizing has kicked in on your system - the buffer cache was made larger on your system.


That said, maybe someone else with a windows play system can test this out for us

ops$tkyte%ORA10GR2> set linesize 1000
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table m;

Table dropped.

ops$tkyte%ORA10GR2> drop table i;

Table dropped.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE TABLE m AS
  2  SELECT rownum mID,
  3          MOD(rownum, 100) cnt1,
  4          MOD(rownum, 200) cnt2,
  5          TRUNC(dbms_random.value(1, 10000)) cnt3,
  6          RPAD('x', 30, 'x') char1,
  7          RPAD('x', 30, 'y') char2,
  8          RPAD('x', 300, 'z') char3
  9  FROM   dual
 10  CONNECT BY level <= 1000000;

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE TABLE i AS
  2  SELECT rownum iID,
  3          mID,
  4          MOD(rownum, 2) cnt1,
  5          MOD(rownum, 4) cnt2,
  6          RPAD('x', 30, 'x') char1
  7  FROM   m
  8  UNION ALL
  9  SELECT rownum + 1000001 iID,
 10          mID,
 11          MOD(rownum, 2) cnt1,
 12          MOD(rownum, 4) cnt2,
 13          RPAD('x', 30, 'x') char1
 14  FROM   m;

Table created.

ops$tkyte%ORA10GR2> SELECT segment_Name, bytes/1024/1024 MB
  2  FROM   user_segments
  3  WHERE  segment_Name IN ('I', 'M');

SEGMENT_NAME                           MB
------------------------------ ----------
I                                     120
M                                     438

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'M' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'I' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where isdefault = 'FALSE' order  
by name;

NAME||'='||VALUE
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
aq_tm_processes=0
compatible=10.2.0.4.0
control_files=/home/ora10gr2/oradata/ora10gr2/control01.ctl, 
/home/ora10gr2/oradata/ora10gr2/control02.ctl, /home/ora10gr2/oradata/ora10gr2/control03.ctl
db_block_size=8192
db_cache_size=218103808
db_create_file_dest=/home/ora10gr2/oradata/ora10gr2
db_domain=
db_file_multiblock_read_count=16
db_name=ora10gr2
db_recovery_file_dest=/home/ora10gr2/oradata/fbra
db_recovery_file_dest_size=10737418240
disk_asynch_io=FALSE
dispatchers=(protocol=tcp)
fast_start_mttr_target=0
java_pool_size=0
job_queue_processes=200
large_pool_size=0
nls_length_semantics=BYTE
open_cursors=300
optimizer_index_caching=90
optimizer_index_cost_adj=25
os_authent_prefix=OPS$
pga_aggregate_target=419430400
processes=1000
query_rewrite_enabled=FALSE
remote_login_passwordfile=EXCLUSIVE
session_max_open_files=20
sessions=1500
sga_max_size=1157627904
sga_target=1157627904
shared_pool_size=0
sort_area_size=524288
star_transformation_enabled=FALSE
streams_pool_size=0
timed_statistics=TRUE
undo_management=AUTO
undo_retention=5000
undo_tablespace=UNDOTBS

38 rows selected.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter system flush buffer_cache;

System altered.

ops$tkyte%ORA10GR2> alter table m cache;

Table altered.

ops$tkyte%ORA10GR2> alter table i cache;

Table altered.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace on
ops$tkyte%ORA10GR2> SELECT COUNT(*)
  2    FROM   m, i
  3   WHERE  m.mID   = i.mID AND
  4          i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       |  5617   (3)| 00:00:42 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M|  5617   (3)| 00:00:42 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  1048   (5)| 00:00:08 |
|*  4 |    TABLE ACCESS FULL| M    |   995K|    34M|       |  3896   (2)| 00:00:29 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx')


Statistics
----------------------------------------------------------
        314  recursive calls
          0  db block gets
      69955  consistent gets
      69899  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          8  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA10GR2&gt;
ops$tkyte%ORA10GR2&gt; SELECT COUNT(*)
  2    FROM   m, i
  3   WHERE  m.mID   = i.mID AND
  4          i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       |  5617   (3)| 00:00:42 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M|  5617   (3)| 00:00:42 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  1048   (5)| 00:00:08 |
|*  4 |    TABLE ACCESS FULL| M    |   995K|    34M|       |  3896   (2)| 00:00:29 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      69917  consistent gets
          0  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA10GR2&gt; show sga

Total System Global Area 1157627904 bytes
Fixed Size                  1267116 bytes
Variable Size             301992532 bytes
Database Buffers          838860800 bytes
Redo Buffers               15507456 bytes
ops$tkyte%ORA10GR2&gt;
</wbr></wbr></wbr></wbr>

5 stars July 16, 2009 - 6pm Central time zone
Reviewer: A reader
"I disagree with the setting of most of your parameters"

Can you give me a few examples and the reason? Thanks.

Followup July 24, 2009 - 7am Central time zone:

because they are set - you shouldn't be setting very many, if any, non-default parameters beyond what MUST be set.

disabling features, setting pools to zero
4 starsCache July 17, 2009 - 2am Central time zone
Reviewer: Dheeraj from India
Tom,

Referring to the cache discussion in ther very first question.

I understand that the alter table ...cache, only effects the ageing of the table.

My question is,once we do the alter table...cache does it apply to the entire life of the table or 
is it session specific and goes off once the session is exit

Thanks,
Dheeraj


Followup July 24, 2009 - 7am Central time zone:

it is a table attribute that is in place until you change it. It survives database restarts and all.
3 starsTo: Dheera July 17, 2009 - 4am Central time zone
Reviewer: Oleksandr Alesinskyy from Germany
You alter TABLE, not session. this alteration (as was already pointed by Tom) is written into data 
dictionary - that means that it is permanent and not session-specific. Moreover, this option may be 
specified by table creation.


5 starsALTER TABLE CACHE Followup #5 July 17, 2009 - 1pm Central time zone
Reviewer: A reader
Tom,

The physical IO's have all disappered if there were no constraints and indexes on the tables. 
However, they come back as soon as constraints and indexes are introduced in the test.

These indexes/constraints were in the originally test case (ALTER TABLE CACHE   July 9, 2009 - 1am 
US/Eastern), but you have removed them from your test case.



Followup July 24, 2009 - 8am Central time zone:

give me the example, please don't make me attempt to try to reconstruct the example you think I should run...
3 starsA step back in Oracle 11g ? July 23, 2009 - 12pm Central time zone
Reviewer: Srinath from Dayton, OH
Hi Tom,
 SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jul 23 11:42:11 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Enter user-name: scott/********

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> CREATE TABLE m AS
  2  SELECT rownum mID,
  3          MOD(rownum, 100) cnt1,
  4          MOD(rownum, 200) cnt2,
  5          TRUNC(dbms_random.value(1, 10000)) cnt3,
  6          RPAD('x', 30, 'x') char1,
  7          RPAD('x', 30, 'y') char2,
  8          RPAD('x', 300, 'z') char3
  9  FROM   dual
 10  CONNECT BY level <= 1000000;

Table created.

SQL>
SQL>  CREATE TABLE i AS
  2  SELECT rownum iID,
  3          mID,
  4          MOD(rownum, 2) cnt1,
  5          MOD(rownum, 4) cnt2,
  6          RPAD('x', 30, 'x') char1
  7  FROM   m
  8  UNION ALL
  9  SELECT rownum + 1000001 iID,
 10          mID,
 11          MOD(rownum, 2) cnt1,
 12          MOD(rownum, 4) cnt2,
 13          RPAD('x', 30, 'x') char1
 14  FROM   m;

Table created.

SQL> set linesize 1000
SQL> SELECT segment_Name, bytes/1024/1024 MB
  2    FROM   user_segments
  3    WHERE  segment_Name IN ('I', 'M');

SEGMENT_NAME          MB
--------------------------------------------------------------------------------- ----------
I                     120
M                     440

SQL> exec dbms_stats.gather_table_stats( user, 'M' );

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_stats.gather_table_stats( user, 'I' );

PL/SQL procedure successfully completed.


SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jul 23 11:59:06 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Enter user-name: scott/tiger

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set line 1000
SQL> select name || '=' || value from v$parameter where isdefault = 'FALSE' orde
r
  2  by name;

NAME||'='||VALUE
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------
audit_file_dest=C:\APP\******\ADMIN\ORA11G\ADUMP
audit_trail=DB
compatible=11.1.0.0.0
control_files=C:\APP\******\ORADATA\ORA11G\CONTROL01.CTL, C:\APP\******\ORADATA\ORA11G
\CONTROL02.CTL, C:\APP\******\ORADATA\ORA11G\CONTROL03.CTL
db_block_size=8192
db_domain=
db_name=ora11g
db_recovery_file_dest=C:\app\******\flash_recovery_area
db_recovery_file_dest_size=2147483648
diagnostic_dest=C:\APP\******
dispatchers=(PROTOCOL=TCP) (SERVICE=ora11gXDB)

NAME||'='||VALUE
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------
memory_target=1719664640
open_cursors=300
processes=150
remote_login_passwordfile=EXCLUSIVE
undo_tablespace=UNDOTBS1

16 rows selected.

SQL> alter system flush buffer_cache;

System altered.

SQL> alter table m cache;

Table altered.

SQL>
SQL> alter table i cache;

Table altered.

SQL> set autotrace on
SQL> SELECT COUNT(*)
  2      FROM   m, i
  3     WHERE  m.mID   = i.mID AND
  4            i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time
   |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 22353   (1)| 00:04:29 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |   |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M| 22353   (1)| 00:04:29 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  3954   (1)| 00:00:48 |
|*  4 |    TABLE ACCESS FULL| M    |  1000K|    34M|       | 15167   (1)| 00:03:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx')


Statistics
----------------------------------------------------------
        314  recursive calls
          0  db block gets
      69936  consistent gets
      69896  physical reads
          0  redo size
        418  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         10  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL&gt; SELECT COUNT(*)
  2      FROM   m, i
  3     WHERE  m.mID   = i.mID AND
  4            i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time
   |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 22353   (1)| 00:04:29 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |   |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M| 22353   (1)| 00:04:29 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  3954   (1)| 00:00:48 |
|*  4 |    TABLE ACCESS FULL| M    |  1000K|    34M|       | 15167   (1)| 00:03:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxx<wbr>xxxxxx')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      69900  consistent gets
      69891  physical reads
          0  redo size
        418  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL&gt; show sga

Total System Global Area 1071333376 bytes
Fixed Size                  1334380 bytes
Variable Size             327156628 bytes
Database Buffers          738197504 bytes
Redo Buffers                4644864 bytes
SQL&gt; 
</wbr></wbr></wbr></wbr>
Can you please shed some light on these stats in Oracle 11g please ? (autotrace output of the second Sql call)
I would not expect any physical reads at all, but I cant believe what I'm seeing.

Followup July 26, 2009 - 7am Central time zone:

tkprof, undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache.
3 starswhy the change in Stats between 10g and 11g ? July 26, 2009 - 7pm Central time zone
Reviewer: Srinath from Dayton, OH
Tom,
    Why is there a change in the stats between 11g and 10g versions ? Is it like a new feature in 
11g database ?

Thanks.


Followup July 26, 2009 - 9pm Central time zone:


anyway - we went over this a lot above - I'm not sure why this is a surprise. We beat this to death actually. You just ran an example we provided, and that example was provided to show that what you are seeing would actually happen....

But yes, you should always expect version X to work differently in some respects from version Y - else version X would be version Y wouldn't it.
3 stars July 27, 2009 - 12pm Central time zone
Reviewer: A reader
Maybe I can rephrase Srinath's question regarding the change of behavior between 10g and 11g 
because I also have a similar question. This direct path reads or whatever new data access path 
that 11g is employing to retrieve data - why should it be a step down in terms of performance? If 
you ran the SQL in 10g and assuming the tables are fully cached, it will always be faster than 
running the same SQL in 11g because 11g is doing direct path reads.  In other words, if I upgrade 
my database to 11g, the application will suffer significant performance degradation.  So the 
question is how do we get around this problem?  Thanks.


Followup July 27, 2009 - 7pm Central time zone:

... t will always be faster than running
the same SQL in 11g because 11g is doing direct path reads. ...

bzzzt - false.

did you read where I said above:

... undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache. ....

parallel query has been doing this for years and years (since about 1994). A full scan reading from the buffer cache is horribly painful - we have to read a block at a time, not multiblocks - and it takes hundreds, if not thousands of cpu cycles/instructions to get a single block and dozens of latches (locks, serialization devices)


You are making an assumption (cache must always be faster than disk) that is not true.

prove to us that reading 1,000,000 blocks from the buffer cache on a normal system is "always going to be faster" than getting from disk.
4 starsFollowup to Your "July 24, 2009 - 7am US/Eastern" Followup July 27, 2009 - 12pm Central time zone
Reviewer: A reader
Tom, earlier in this thread, you mentioned that we shouldn't be setting too many initialization 
parameters - e.g. setting pools to zero. I was under the impression that if automatic memory 
management is used, by setting pools to zero, Oracle would automatically figure out over time 
what's the best settings for each of the pools based on the usage of the application.  Is this not 
correct understanding?  Can you point me to a thread on how automatic memory management should be 
used properly?  


Followup July 27, 2009 - 8pm Central time zone:

You need not set anything other then the parameter to control how much memory to use - SGA_TARGET (just automatic sga memory management) or MEMORY_TARGET (in 11g, auto tune the pga AND sga together).

You need not set the individual pools
3 starsSometimes we have to set parameters .... July 30, 2009 - 6am Central time zone
Reviewer: Nadya from Russia
Tom, "You need not set anything other then the parameter..." I thought so when we upgrade to 10.2.0.4.But two days later after upgrade the instanse was terminated by RECO:ORA-04031. Trace file:
...
"KGH: NO ACCESS " 1046641408
...
We have to set the pools manual.


4 starsCaching Tables September 23, 2010 - 12pm Central time zone
Reviewer: fakdaddy from MO
Curious about the comment

"Bear in mind tho... there is really no true way to a have a purely "in memory table -- even cached 
tables are subject to aging from the buffer cache. "

So for exaggeration purposes - I have a 1GB KEEP cache, and I assign 1x 100MB table to it - the 
blocks would be aged out ?? 

I have never configured a keep cache before as I figure oracle is cleverer than myself to mange. 
However we have a critical batch app (more critical than its day use) .... so we are considering a 
KEPP pool to cach high read/low size tables so they don't get aged out during the 12HOUR OLTP/DAY 
window. 



Followup September 27, 2010 - 9am Central time zone:

your keep cache is the same as your recycle cache is the same as the default cache as far as holding blocks in memory go. The keep cache is not a special "keep this in memory" cache, it manages blocks the same way as the other caches. It is just a 'naming convention', you put things you want to 'keep' in the nicely sized keep pool and things you don't in the smaller sized recycle pool.


The goal of the keep pool would be to make it have very few objects - just the things you would like to have in the cache as much as possible - so they are not competing with lots of other stuff for space.
5 starstouch count algorithm October 19, 2011 - 8am Central time zone
Reviewer: A reader
Dear Tom,

I was reading the internals of touch count algorithm. I have a question. In the document it was told that,

"Oracle only allows buffer’s touch count to be incremented, at most, once every 3 seconds.
When a touch count is incremented buffer pointer should move. But movement of buffer pointer is independent of touch count increment. Also for any activity in memory area oracle needs a latch for assuring cache consistency. But there is an exception here !! For updating touch count, Oracle does not use latch and buffer block can be modified while touch count is getting incremented. But more interesting is that, two processes may increment the touch count to same value, and when this happens Oracle assures the worst that could happen is the touch count is not actually incremented every time a buffer is touched and that no cache corruption will result.
"

Therefore,
1) Does it mean at every 3 seconds the touch count is increased only once for each buffer? If not why?
Also, I couldn't understand how the latch process increments the touch count? \
2) How it is independently working with every 3 seconds process of incrementation of touch count to solve the problems?
3) When was Touch count algorithm introduced? Does it overwrites all the properties of Modified LRU?

Followup October 19, 2011 - 6pm Central time zone:

1) first the latch processing. A latch is another name for a lock, a serialization device. In general, if I have a latch on something - you cannot get it (some latches can be shared, but generally for read only access to some memory segment).

Since we are not using a latch to protect this counter - you can and will have the condition whereby two (or more) processes attempt to increase the counter at the same time. Suppose you and I both read the same block at about the same time. We both want to update the touch count. Suppose the existing touch count was 42 - we would both see 42, and then set the count to 43. It really should have been "44" since we both read it - but it won't be. But this is OK - we haven't 'corrupted' anything and we still have a 'pretty good' count.

a touch count for a block is only incremented after about 3 seconds.

2) I don't know what you are trying to ask there.

3) it *is* a modified LRU type of algorithm. It was introduced back in the Oracle 8i days.


3 starsgoood October 20, 2011 - 12pm Central time zone
Reviewer: A reader


5 starsTable Caching February 21, 2012 - 12am Central time zone
Reviewer: A reader
hi Tom,

I have five-six tables having row count between 1000 to 25000. Is it advisable to keep the tables 
of such sizes in memory through CACHE?

What is the ideal size for a table with 20 columns with average column size as 20 bytes. The SGA is 
3GB

What are the drawbacks of the CACHE aaproach?

Please advise.


Followup February 21, 2012 - 7pm Central time zone:

no, do not use cache, if you use them - they'll be cached.


... What is the ideal size for a table with 20 columns with average column size as
20 bytes. ...


you are missing an important metric ;) the number of rows.


alter table t cache does not cause a table to be cached. It changes the way the blocks are managed in the cache when the blocks are read during a full scan only. If you are using index access, it will have no affect.
4 starsBuffer Cache & Parallel Query with Direct I/O May 30, 2012 - 5am Central time zone
Reviewer: Ian Wallace from Dublin
Hi Tom,

I've read a number of comments with interest in this thread, especially around the use of the buffer cache and parallel queries. In the thread about you state the following:

"Do the blocks need to be kept cached? should they be cached? are you using parallel query (more
common with large full table scans) - then the cache doesn't really matter (direct io).
"

"... undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache. ...."

And finally....

"parallel query has been doing this for years and years (since about 1994). A full scan reading from the buffer cache is horribly painful - we have to read a block at a time, not multiblocks - and it takes hundreds, if not thousands of cpu cycles/instructions to get a single block and dozens of latches (locks, serialization devices)"

This topic is very pertinent to us at the moment. We're designing a reporting solution whereby we have created one denormalized table which is composite partitioned RANGE - HASH, the theory being that with partition elimination we will always full scan a sub partition. We have a 4 node RAC cluster and we're hoping to take advantage of parallelism.

We're currently designing this solution with some external consultants who have suggested defining this table in the KEEP pool. The size of the KEEP pool would be circa 10GB whereas the size of our SGA is currently 30GB. Based on what you've said above would we actually make this of this pool, especially if parallel query is using direct I/O? Also could you provide a simple example showing how the use of parallel query negates the use of the buffer pool?

The database version is 10.2.0.2 on Red Hat Linux 4.

Many thanks,
Ian.


Followup May 30, 2012 - 7am Central time zone:

ask them "why", since full scans in general will flush the buffer cache and do direct IO from disk?

Just run a parallel query and tell me if you see logical IO's or not ;)

In 11g, there is a new in memory parallel query
http://docs.oracle.com/cd/E11882_01/server.112/e16638/memory.htm#i30761


but that doesn't apply to you.

and even then, I wouldn't be using the KEEP pool


5 stars May 30, 2012 - 11am Central time zone
Reviewer: Alexander
"ask them "why", since full scans in general will flush the buffer cache and do direct IO from disk?"

Did you mean "will NOT flush the buffer cache..."?

Or what the CACHE table setting somewhere implied that I missed?

Followup May 30, 2012 - 11pm Central time zone:

No, I meant it would do a segment level checkpoint - to get the blocks for the segment you are scanning onto disk so it can bypass the inefficiencies of the buffer cache and just do direct IO from disk.

if they didn't flush those blocks to disk - they might read the wrong version (there could be a committed version in the cache that is not on disk).



I should have said "parallel full scans" for 10g, but in 11g - just saying "full scans" is sufficient
5 stars May 30, 2012 - 1pm Central time zone
Reviewer: Ian Wallace from Dublin
Hi Tom,

Could you clarify what you mean when you say full scans in general will flush the buffer cache and do direct path I/O from disk. My understanding is that parallel query in 10g will just bypass the buffer cache and always do direct I/O?

Thanks,
Ian.

Followup May 31, 2012 - 12am Central time zone:

see right above, I clarified that.
4 stars'keep' pool June 1, 2012 - 12am Central time zone
Reviewer: A reader
Hi tom,

How is ORACLE managing 'keep' pool itself?

Let's see i use 'storage' clause to put many tables into 'keep' pool, whose total size can not be
accormordated by 'keep' pool size, which one will be 'out'? 
Is it following the same logic that full table scan one will be the first candidate?


Followup June 1, 2012 - 6am Central time zone:

the keep and recycle pool are identical in nature of the "buffer pool"

by default segments are cached in the buffer pool, the default pool

You can alter/create a segment and specify to use the keep or recycle pools instead. Once they are marked for those pools they will be cached in those pools instead of the default pool - but the way they are cached are identical to the default pool.

I wish the keep pool was named "non-default pool1" and the recycle pool was named "non-default pool2".


In your question:

Let's see i use 'storage' clause to put many tables into 'keep' pool, whose
total size can not be
accormordated by 'keep' pool size, which one will be 'out'?
Is it following the same logic that full table scan one will be the first
candidate?


just replace keep with 'default' (or recycle even) and the answer is the same.

when the X pool fills up, dbwr is called upon to make space in it using LRU like algorithms.
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics