Monday, January 22, 2018

RAID 10 vs RAID 5: Some Empirical Evidence

There are many trade-offs between RAID 10 (or 1+0) and RAID 5 in terms of cost and performance. RAID 10 is considered better for writes than RAID 5, which carries an overhead on writes (parity must be read, recalculated and written back for each write). In terms of cost, RAID 10 requires twice the disk storage of RAID 5 to get the same capacity, thus doubling the cost. RAID 10 is the preferred RAID level for Oracle, but in the real world this isn't always followed due to financial constraints, so it's not uncommon to find Oracle databases deployed on RAID 5. The metalink document 30286.1 lists the preferred RAID level for each file type (there's no mention of RAID 10 (1+0); the document needs updating).
This post shows, by way of empirical evidence, the change in physical reads and writes after a database was moved from RAID 5 to RAID 10. The DB was a two-node RAC (11.2.0.4) using ASM for storage. It is a production database running property software that could be considered to have OLTP characteristics. With RAID 5 it had two ASM disk groups, with the online redo logs and control files multiplexed between the two disk groups. Data files resided in one disk group and archive logs in the other. At the physical level, the RAID 5 array consisted of 4 x 400 GB disks with one hot spare.
The new RAID 10 configuration consisted of 7 x 600 GB disks with one hot spare. During the move, all redo logs, control files and data files were moved to an ASM disk group created using disks from the RAID 10 LUN. Archive logs continued to go to RAID 5.
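The capacity arithmetic behind these two configurations, along with the textbook write-penalty figures, can be sketched as below. This is a minimal illustration, not a measurement from this system: excluding one disk in each array as a hot spare is an assumption, and the 4-I/O RAID 5 penalty is the classic small-random-write cost, which controllers with write caches can mitigate.

```python
# Usable capacity and per-write back-end I/O cost for the two arrays
# described above (hot spares excluded from the capacity calculation).

def raid5_usable_gb(disks: int, size_gb: int) -> int:
    # One disk's worth of capacity is consumed by parity.
    return (disks - 1) * size_gb

def raid10_usable_gb(disks: int, size_gb: int) -> int:
    # Mirroring halves the raw capacity.
    return (disks // 2) * size_gb

# Classic back-end I/O cost per small random write:
RAID5_WRITE_PENALTY = 4   # read data, read parity, write data, write parity
RAID10_WRITE_PENALTY = 2  # write both mirror copies

print(raid5_usable_gb(4, 400))    # 1200 GB from 4 x 400 GB
print(raid10_usable_gb(6, 600))   # 1800 GB from 6 x 600 GB (7th is the spare)
```

Note how RAID 10 needs roughly twice the raw disk to deliver a given usable capacity, which is the cost side of the trade-off discussed above.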
Below is a comparison of physical reads, physical writes and log file sync waits for a one-month period before and after the change. The two periods showed no statistically significant difference in the mean values for physical reads, physical writes and log file sync waits at the 95% confidence level. Thus the two periods are considered to have done the same amount of "work". Physical read/write values were obtained from STATS$FILESTATXS (the DB is Standard Edition, so no AWR) for the data files in the main tablespace of the application. Statspack took hourly snapshots.
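The kind of significance test used throughout this post can be sketched as below: compare per-snapshot values from the two one-month periods and check whether their means differ at the 95% level. The sample values here are made up for illustration; the real ones would come from STATS$FILESTATXS snapshot deltas.

```python
# Welch's t test sketch: do two periods have the same mean per-snapshot value?
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

before = [1020, 980, 1005, 995, 1010, 990, 1000, 1015]  # hypothetical reads/snapshot
after  = [1008, 992, 1001, 997, 1012, 988, 1003, 1009]

t = welch_t(before, after)
# With a month of hourly snapshots the t distribution is close to normal,
# so |t| < 1.96 means no significant difference in the means at the 95%
# level, i.e. the two periods did a comparable amount of "work".
print(abs(t) < 1.96)  # prints True for these samples
```

The same comparison, applied to read times, write times and log file sync times, is what backs the "statistically significant" claims in the sections that follow.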

Physical Reads
Considering that each period had the same amount of physical reads, it appears that reads took less time on RAID 10 than on RAID 5. This was confirmed by a statistical significance test.

Physical Writes
Unlike the physical reads earlier, there was no statistically significant reduction in physical write times. This doesn't imply that RAID 10 and RAID 5 perform the same for writes in general; one would expect RAID 10 to outperform RAID 5, especially for writes. But there was no empirical evidence of it, at least for the sample periods examined (since this seems to contradict what's said about RAID 10 vs RAID 5 writes, several sample periods were compared, and all indicated no statistically significant reduction in write times). All that can be said is that, for this application at least, moving from RAID 5 to RAID 10 did not result in a significant reduction in physical write time (RAID 10 write times were neither higher nor lower than RAID 5 write times). However, as seen in the next section, another type of write (redo log writes) benefited significantly from the move to RAID 10.

Log File Sync Waits
Log file sync wait times represent the time spent flushing the log buffer to disk, i.e. to the redo log files (34592.1). This is the other significant write in the system, and it happens in real time. Though physical writes didn't show a reduction, there was a significant reduction in log file sync times, which means redo log file writes have benefited from the RAID change.

In summary, the empirical evidence is that RAID 10 outperforms RAID 5 for OLTP-like applications. However, there could be cases where there's no visible gain, as was the case here with physical writes.