Nvidia Flow is an adaptive sparse voxel fluid simulation library for real-time applications. It currently includes backends for D3D11 and D3D12 support. Simulation and rendering capabilities are included, although interfaces exist to allow custom rendering.
NvFlowContext provides the graphics API abstraction that the core Flow API operates against. Graphics API dependent functions and structures are provided to allow sharing between Flow and applications. For D3D11, the graphics API dependent functions and structures are in NvFlowContextD3D11.h.
NvFlowGrid is the core simulation object. It is responsible for the most of the memory allocated by Flow. Simulation data is stored in sparse 3D textures, with 64KB blocks, typically 32x16x16 with FP16x4 format. There are two primary sparse 3D texture channels, one with velocity, the other with temperature, fuel, burn, and smoke fields for combustion. Blocks are allocated dynamically in active regions, with allocation driven by both emitters and simulation values.
The sparse 3D textures can be layered. This allows multiple active simulations inside the same grid, each with unique simulation parameters. This allows the memory pool provided by the grid to be shared between multiple effects. Per layer parameters are supported using Grid Materials. Grid Materials are created and referenced in Flow emitters. Active materials are tracked, with layers being allocated only for active materials.
Native emitter and collision behavior is supported in the grid, with a unified interface. Spheres, capsules, boxes, convexes, and signed distance fields are supported. Performance for spheres, capsules, boxes, and convexes is optimized for scaling, with hundreds of shapes per frame being practical. The grid processes these as emit events, where a shape and bundle of parameters is applied to the grid as an impulse. This make emitters generally stateless from the perspective of the grid. Custom emitter callbacks are also supported. This is useful for many things, including two way particle interaction.
NvFlowVolumeRender is the core visualization object. It performs ray marching on data provided by a NvFlowGrid or any other source through the NvFlowGridExport interface. Simulation results are rendered by ray marching one of the sparse 3D texture channels. Multiple render modes are supported, with the most common mode using a color map to manipulate color and transparency based on temperature and density. The ray marching can be performed at resolutions independent of the application render target, to improve performance and avoid oversampling the density field. The rendered result is composited against an application provided render target, with support for compositing against conventional, multiple resolution shading, and lens matched shading render targets. The ray marching also supports early ray termination based on an application provided depth buffer, for proper occlusion.
Multiple render materials are supported, to allow volume rendering parameters defined per grid material. Multiple layer ray march rendering is also supported, so that overlapping layers do not have sorting artifacts. Multiple render materials per grid material is also supported, so that multiple components of the same layer can be visualized. For example, one render material might have a temperature driven color map to visualize flame, while a second render material might visualize fuel.
NvFlowGridExport and NvFlowGridImport serve as the core interfaces to read and write layered 3D sparse textures with Flow. The Flow Grid provides a grid export for read only access to simulation results. Many operations can be done with that export. For example, data can be moved between multiple GPUs using a NvFlowGridProxy. In this case, grid export provides the necessary read interface to encode necessary data and copy, but provides no way to feed that information into NvFlowVolumeRender on the other GPU. Grid import provides a way to write information to a layered sparse 3D texture, and then get a grid export interface for that written data.
NvFlowGridProxy serves as a very useful extension. It provides an interface to abstract potential data movement. Data movement currently supported includes GPU to GPU tranfers, and versioning for safe async compute operation. Also included is a passthrough mode, this allows the grid proxy to be used in all cases, reducing app side code paths.
NvFlowVolumeShadow is an extension to perform sparse voxel self shadowing. It takes a grid export as a input, and outputs a grid export, with the burn component overridden with shadow values. The NvFlowVolumeRender can then take this shadow component and modulate intensity during ray march.
NvFlowDevice is an extension to simplify the creation of Flow dedicated devices and queues. This is useful for multiple GPU and async compute, combined with NvFlowGridProxy. Support is included to create D3D12 devices and queues that can interoperate with a D3D11 application.
These parameters tend to have the largest impact on performance and memory consumption. The grid size and virtual dimensions together determine the cell size. Finer grid cells tend to produce more detail at higher computational cost. Resident scale sets the fraction of virtual grid cells that can simulate simultaneously, controlling how much memory should be allocated/made resident on the GPU. Allocating a fraction of the memory needed to simulate all virtual cells simultaneously means the grid can have more virtual cells than would practical fit on the GPU, recycling memory as an effect moves around.
Flow is also able to leverage Volume Tiled Resources on supporting hardware and operating systems. This allows the GPU hardware to perform the translation from virtual cell coordinates to the correct memory address. This provides significant performance benefit to the simulation shader perf. Overhead is added in the form of page table updates that the OS must perform, however, in most cases these updates can be performed concurrently to other GPU work. This support is experimental, and best used for grids that do not change shape quickly.
NvFlowGridMaterialParams apply uniformly to a given layer, controlling fundamental properties of the simulation. The damping and fade parameters allow the user to control how fast effects lose energy and fade out of visibility. Combustion parameters drive the behavior of fire, allowing the user to manipulate fuel to heat conversion rates, buoyancy, expansion, and cooling rates. Vorticity strength allows the user to control the degree of rotational flow in the simulation. The weights and thresholds for each channel allow the user to prioritize allocation. For example, in the typical fire case, only the temperature weight is non-zero, since the color map fades out regions that are not hot. Since cool region are not visible, disabling simulation there improves performance substantially, with minimal visible impact.
NvFlowGridEmitParams are used to control both emitter and collider behavior. Per channel couple rates control how aggressively the emitter attempts to move a grid cell’s value to emitter channel value. Zero couple rate allows channels to be selectively disabled, high couple rate allow for immediate grid cell override. A typical default couple rate like 0.5 allows the emitter to influence the grid, but also allows the grid simulate in the active region of the emitter, allowing smoother and more consistent behavior.
NvFlowGridEmitParams not only control simulation behavior, but also drive grid allocation. The main controls are allocation scale and allocation predict. Allocation scale determines what grid blocks should be forced to allocate, relative to the size and location of the emitter bounding box. An allocation scale of 0.0 will result in no forced allocation. This is useful for collision objects. An allocation scale of 1.0 is a good default when the user wants the grid allocate for an effect. A non-zero allocation scale is required for to get an inactive region to become active. Once a region is active, it will automatically expand the domain as needed based on activity in the simulation.
Making new blocks active in the grid takes time, especially in the Volume Tiled Resource case, where changes to the page table must go through the operating system. Allocation predict provides a mechanism to request allocations based on emitter velocity, greatly increasing the chance blocks are allocated as the emitter passes over them. This is very useful for fast moving objects on trajectories.
Due to numerical diffusion, it is best to establish a fixed time step for simulation updates, or at least minimum/maxmimum bounds on the timestep. Numerical diffusion has the effect of damping the simulation, so very short timesteps do not necessarily result in higher perceived visual fidelity. Emitters are handled as events with their own delta time value. This means they can be stepped independently of the grid simulation. Substepping emitters, especially fast moving ones, will improve visual appearance, since the emitter location and orientation will be advanced a shorter distance per emit event.