Unfortunately, APEX simulation does not gracefully handle many of the cases where it can run out of GPU memory. Even where an out-of-memory condition is handled, handling it just means halting simulation, since there is no way to simulate on the GPU without enough memory. The best course of action is therefore to avoid running out of GPU memory in your shipping application.
To cap maximum memory usage, you may need to limit how many grids and particles your application creates. You can then measure the application's peak memory usage. Since some users have less video memory than others, you will also want to know up front whether there is enough video memory to meet that peak. To find out, pre-allocate the CUDA heaps with CudaMemoryManager::reserve() when the application starts up; if the reservation fails, you can simply disable APEX particle effects in the application.
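The startup check described above can be sketched roughly as follows. This is a minimal illustration, not the APEX API: the Heap enum, the PeakBudget struct, the ReserveFn callback, and reservePeakMemory() are all hypothetical names introduced here. The callback is assumed to wrap whatever CudaMemoryManager::reserve() call your SDK version exposes, so check the exact signature in your headers.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the two heaps the application pre-allocates.
// This is NOT the real APEX enum; map it to the SDK's values in your wrapper.
enum class Heap { Gpu, PinnedHost };

// Hypothetical callback wrapping the real reservation call
// (CudaMemoryManager::reserve() in the SDK). Returns true on success.
using ReserveFn = std::function<bool(Heap, std::size_t)>;

// Peak usage per heap, as measured during development.
struct PeakBudget {
    std::size_t gpuBytes;
    std::size_t pinnedHostBytes;
};

// Tries to reserve the measured peak for both heaps at startup.
// Returns false as soon as one reservation fails; the caller is then
// expected to disable APEX particle effects for this session.
inline bool reservePeakMemory(const ReserveFn& reserve, const PeakBudget& budget) {
    return reserve(Heap::Gpu, budget.gpuBytes) &&
           reserve(Heap::PinnedHost, budget.pinnedHostBytes);
}
```

Because the two calls are chained with &&, the pinned-host reservation is not attempted once the GPU reservation has failed. The application would call reservePeakMemory() once during startup and use the result to decide whether particle effects are enabled.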
To figure out the high-water mark for memory usage in your application, you can query the CudaMemoryManager for stats. See the APEX samples (OpenAutomateInterface.cpp) for an example of how to do this. We recommend an on-screen display of memory usage that you can toggle during development, so you get instant feedback on how expensive your effects are. The two heaps you will want to monitor (and eventually pre-allocate) are CudaBufferMemorySpace::T_GPU and CudaBufferMemorySpace::T_PINNED_HOST. All APEX buffers are of type F_READ_WRITE.
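The bookkeeping behind such an on-screen display might look like the sketch below. It is illustrative only: HighWaterMark and formatOverlayLine() are hypothetical helpers, and the byte counts passed to record() are assumed to come from the CudaMemoryManager stats query (see OpenAutomateInterface.cpp for the actual call). Keep one tracker per heap, for example one for T_GPU and one for T_PINNED_HOST.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: remembers the latest and the largest usage seen
// for one heap. Feed it the "bytes in use" value from the stats query.
class HighWaterMark {
public:
    void record(std::size_t usedBytes) {
        current_ = usedBytes;
        peak_ = std::max(peak_, usedBytes);
    }
    std::size_t current() const { return current_; }
    std::size_t peak() const { return peak_; }

private:
    std::size_t current_ = 0;
    std::size_t peak_ = 0;
};

// Formats one line of text for a debug overlay, in MiB with one decimal.
inline std::string formatOverlayLine(const char* heapName, const HighWaterMark& hwm) {
    const double bytesPerMiB = 1024.0 * 1024.0;
    char buffer[128];
    std::snprintf(buffer, sizeof(buffer), "%s: %.1f MiB (peak %.1f MiB)",
                  heapName,
                  static_cast<double>(hwm.current()) / bytesPerMiB,
                  static_cast<double>(hwm.peak()) / bytesPerMiB);
    return std::string(buffer);
}
```

Calling record() once per frame is enough for a development overlay, and the peak values reported at the end of a play session give the sizes to pre-allocate at startup.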