Measuring MREADTIM system statistic, alternative way

[Why this post: GATHER_SYSTEM_STATS does not gather MREADTIM information from Direct Path Reads]

Oracle can be tuned in a lot of parts and places. One of these is when Oracle is going to choose between reading an index or doing a full table scan.

In this blog I’m not going into depth about all this, but one of the ‘parameters’ here is setting the MREADTIM system statistic to a ‘real life’ value. This value will tell Oracle how fast reading multiple blocks from disk is with all the overhead in between. How many multiple blocks is, is defined by the multi block read count (MBRC) setting. Together with SREADTIM, IOSEEKTIM and MBRC this will have influence in the execution path Oracle will choose.

You can adjust Oracle here to get better (lower cost) access paths, but that’s not something I prefer. I want to set the correct, ‘real life’ values and leave the adjustment to the optimizer_index_cost_adj parameter which you can use to give indexes more preference. My opinion is; if your hardware is slow, don’t fake it to be fast. You will shoot yourself in the foot.

Back to MREADTIM. One can measure this by making Oracle do a lot of Scattered Reads (usually from full table scans) and collect them with DBMS_STATS.GATHER_SYSTEM_STATS. But there is a catch, Oracle introduced Direct Path Reads since 11g, and it seems the GATHER_SYSTEM_STATS does not gather MREADTIM information from Direct Path Reads. You can try, but the outcome is 0ms…

Of course you can turn off Direct Path Reads with a ‘10949 trace event’, but isn’t that cheating?
Direct Path Reads skip SGA; 1. not to push out buffers which are costly to get again, and 2. not to have this overhead of going through the SGA. So why switch to Scattered Reads, measure them and then revert to tell Oracle not to use Scattered Reads. Oracle will use Direct Path Reads when the amount of data to be read is larger than 2% of the buffer cache and less than 5 times the buffer cache.

I have used another solution to measure/calculate MREADTIM and that is tracing the Direct Path Reads and average them. Another advantage is that can you can read the trace file yourself and see if you are burning through the storage cache. People sometimes get results MREADTIM is faster than the SREADTIM (for single blocks), but that’s because they are hitting the storage cache. My opinion is hitting the storage cache should be considered a bonus. ‘Thank you storage’, but I don’t want to rely on it. It will also give preference for Full Table Scans over using indexes this way, which I think is bad.

My steps to measure MREADTIM

  • Find a large table to trigger Direct Path Reads and enough data to get an accurate result (also burn cache)
  • Start your trace; Start your select which triggers a Full Table Scan and using Direct Path Reads; Stop your trace
  • Get your MBRC
  • Locate the .trc file
  • awk and grep the elapsed times from it and calculate MREADTIM
  • Set the MREADTIM

Find a large enough table

Go ahead and look for some bigger tables…

select owner, table_name, num_rows, blocks
from dba_tables
where num_rows is not null
order by num_rows desc;

Start your trace, do your select, stop your trace, get MBRC

Start tracing with the 10046 event for your session, start the select with hints to force Direct Path Reads and stop the trace…

-- start trace
alter session set events '10046 trace name context forever, level 8';

-- select for direct path reads
select /*+ full(t) noparallel(t) nocache(t) */ count(*) from <table> t where t.col = 'blog.oracle48.nl';

-- stop trace
alter session set events '10046 trace name context off';

-- get the MBRC amount
show parameter db_file_multiblock_read_count

NAME TYPE VALUE 
----------------------------- ------- ----- 
db_file_multiblock_read_count integer 128 

Locate .trc file

Get the trace file for your session where the information has been stored:

-- locate your trace file
select tracefile from v$process p
join v$session s on s.paddr = p.addr
join v$mystat m on m.sid = s.sid
where m.statistic# = 0;

awk, grep and calculate

The contents of the trace file will hopefully be filled with lot of these and we get the elapsed times (ela=) and block counts for the Direct Path Reads out with awk:

WAIT #179856: nam='direct path read' ela= 13697 file number=13 first dba=12800 block cnt=128 obj#=114681 tim=1464176572743564
-- column 9 and row 18 for direct path reads
$ awk -F'[ =]' '/direct path read/ { print $9"\t"$18 }' <path>/<file>.trc > dpr.txt

-- only get the 128 mbrc ones (can be skipped)
$ grep 128 dpr.txt > dpr_mbrc.txt

-- sum the elapsed times and mbrc
$ awk '{elapsed+=$1} END {print elapsed}' dpr_mbrc.txt
46295354
$ awk '{mbrc+=$2} END {print mbrc}' dpr_mbrc.txt
504960

Now divide 46295354 (total elapsed time) and 504960 (total 128 blocks) which is 91 microseconds per block. The MBRC is 128 so times 91 is 11.735 milliseconds for 128 blocks. This is the MREADTIM. 128 blocks of 8k is 1MB, so it’s 12ms for 1MB.

Set MREADTIM

Before setting the values you just calculated, think about if the value is correct, is 30 and you know you know your storage is slow or is it 4 and you know your storage is pretty fast. If it’s 2 and you know your storage is old and you should not get these values, then you probably have been hitting the cache!

Get the old values first, so you can reset to the old values if this are starting to go bad…

-- note the old values first
select * from sys.aux_stats$ order by sname, pname;
-- set the new MBRC and MREADTIM
exec DBMS_STATS.SET_SYSTEM_STATS (pname => 'MBRC', pvalue => '128');
exec DBMS_STATS.SET_SYSTEM_STATS (pname => 'MREADTIM', pvalue => '12');

That’s it… any comments, please post them.

Happy reading!

Tagged , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published.