We propose two hardware mechanisms to decrease energy consumption on massively parallel graphics processors for ray tracing while keeping performance high. First, we use a streaming data model and configure part of the L2 cache into a ray stream memory to enable efficient data processing through ray reordering. This increases the L1 hit rate and reduces off-chip memory accesses substantially. Second, we employ reconfigurable special-purpose pipelines that are constructed dynamically under program control. These pipelines use shared execution units (XUs) that can be configured to support the common compute kernels that are the foundation of the ray tracing algorithm, such as acceleration structure traversal and triangle intersection. This reduces the overhead incurred by memory and register accesses. These two synergistic features yield a ray tracing architecture that significantly reduces both power consumption and off-chip memory traffic when compared to a more traditional cache-only approach.
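The effect of ray reordering on cache hit rate can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the hardware described above: the LRU cache model, the notion of a per-ray "region" id, and the names `count_hits` and `reorder_into_streams` are all hypothetical and chosen for illustration.

```python
from collections import OrderedDict, defaultdict


def count_hits(accesses, capacity):
    """Count hits in a toy fully associative LRU cache holding `capacity` lines."""
    cache = OrderedDict()
    hits = 0
    for line in accesses:
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used line
    return hits


def reorder_into_streams(ray_regions):
    """Bucket rays by the (assumed) scene region they touch, then drain
    one bucket at a time so consecutive rays reuse the same data."""
    streams = defaultdict(list)
    for ray_id, region in enumerate(ray_regions):
        streams[region].append(ray_id)
    return [region for region, rays in streams.items() for _ in rays]


# Hypothetical workload: 8 rays cycling over 4 regions, with a 2-line cache.
incoherent = [0, 1, 2, 3, 0, 1, 2, 3]
coherent = reorder_into_streams(incoherent)

print(count_hits(incoherent, capacity=2))
print(count_hits(coherent, capacity=2))
```

In this model the same set of accesses produces more cache hits once rays touching the same region are processed back to back, which is the intuition behind buffering rays in a stream memory before processing them.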
@inproceedings{HPG13_koptaEnergyBandwEfficientRTArch,
author = {Daniel Kopta and Konstantin Shkurko and Josef Spjut and Erik Brunvand and Al Davis},
booktitle = {High-Performance Graphics 2013},
title = {{An Energy and Bandwidth Efficient Ray Tracing Architecture}},
pages = {121--128},
doi = {10.1145/2492045.2492058},
url = {https://doi.org/10.1145/2492045.2492058},
year = {2013}
}
This material is based upon work supported by the National Science Foundation under Grant No. CNS-1017457. The Vegetation and Hairball models are from Samuli Laine, and the Sibenik Cathedral model is from Marko Dabrovic.