Simulation

Callback Sequence

The simplest type of simulation callbacks are the events. Using callbacks the application can simply listen for events and react as required, provided the callbacks obey the rule that SDK state changes are forbidden. This restriction may be a bit surprising given that the SDK permits writes to an inactive back-buffer while the simulation is running. Event callbacks, however, are not called from within the simulation thread, but rather from inside fetchResults(). The key point here is that fetchResults() processes the buffered writes, meaning that writing to the SDK from an event callback can be a particularly fragile affair. To avoid this fragility it is necessary to impose the rule that SDK state changes are not permitted from an event callback.

Inside fetchResults(), among other things, the buffers are swapped. More specifically, this means that properties of each object's internal simulation state are copied to the API-visible state. Some event callbacks happen before this swap, and some after. The events that happen before are:

  • onTrigger
  • onContact
  • onConstraintBreak

When these events are received in the callback, the shapes, actors, etc. will still be in the state they were in immediately before the simulation started. This is preferable, because these events were detected early on during the simulation, before objects were integrated (moved) forward. For example, a pair of shapes that get an onContact() to report that they are in contact will still be in contact when the call is made, even though they may have bounced apart again after fetchResults() returns.

On the other hand, these events are sent after the swap:

  • onSleep
  • onWake

Sleep information is updated after objects have been integrated, so it makes sense to send these events after the swap.

To 'listen' to any of these events it is necessary to first subclass PxSimulationEventCallback so that the various virtual functions may be implemented as desired. An instance of this subclass can then be registered per scene with either PxScene::setSimulationEventCallback or PxSceneDesc::simulationEventCallback. Following these steps alone will ensure that constraint break events are successfully reported. One further step is required to report sleep and wake events: to avoid the expense of reporting all sleep and wake events, actors identified as worthy of sleep/wake notification require the flag PxActorFlag::eSEND_SLEEP_NOTIFIES to be raised. Finally, to receive onContact and onTrigger events it is necessary to set a flag in the filter shader callback for all pairs of interacting objects for which events are required. More details of the filter shader callback can be found in Section Collision Filtering.

Simulation memory

PhysX relies on the application for all memory allocation. The primary interface is via the PxAllocatorCallback interface required to initialize the SDK:

class PxAllocatorCallback
{
public:
    virtual ~PxAllocatorCallback() {}
    virtual void* allocate(size_t size, const char* typeName, const char* filename,
        int line) = 0;
    virtual void deallocate(void* ptr) = 0;
};

After the self-explanatory function argument describing the size of the allocation, the next three function arguments are an identifier name, which identifies the type of allocation, and the __FILE__ and __LINE__ location inside the SDK code where the allocation was made. More details of these function arguments can be found in the PhysXAPI documentation.

Note

An important change since 2.x: The SDK now requires that the memory that is returned be 16-byte aligned. On many platforms malloc() returns memory that is 16-byte aligned, but on Windows the system function _aligned_malloc() provides this capability.

Note

On some platforms PhysX uses system library calls to determine the correct type name, and the system function that returns the type name may call the system memory allocator. If you are instrumenting system memory allocations, you may observe this behavior. To prevent PhysX requesting type names, disable allocation names using the method PxFoundation::setReportAllocationNames().

Minimizing dynamic allocation is an important aspect of performance tuning. PhysX provides several mechanisms to control and analyze memory usage. These shall be discussed in turn.

Scene Limits

The number of allocations for tracking objects can be minimized by presizing the capacities of scene data structures, using either PxSceneDesc::limits before creating the scene or the function PxScene::setLimits(). It is useful to note that these limits do not represent hard limits, meaning that PhysX will automatically perform further allocations if the number of objects exceeds the scene limits.

16K Data Blocks

Much of the memory PhysX uses for simulation is held in a pool of blocks, each 16K in size. The initial number of blocks allocated to the pool can be controlled by setting PxSceneDesc::nbContactDataBlocks, while the maximum number of blocks that can ever be in the pool is governed by PxSceneDesc::maxNbContactDataBlocks. If PhysX internally needs more blocks than nbContactDataBlocks then it will automatically allocate further blocks to the pool until the number of blocks reaches maxNbContactDataBlocks. If PhysX subsequently needs more blocks than the maximum number of blocks then it will simply start dropping contacts and joint constraints. When this happens warnings are passed to the error stream in the PX_CHECKED configuration.

To help tune nbContactDataBlocks and maxNbContactDataBlocks it can be useful to query the number of blocks currently allocated to the pool using the function PxScene::getNbContactDataBlocksUsed(). It can also be useful to query the maximum number of blocks that can ever be allocated to the pool with PxScene::getMaxNbContactDataBlocksUsed.

Unused blocks can be reclaimed using PxScene::flushSimulation(). When this function is called any allocated blocks not required by the current scene state will be deleted so that they may be reused by the application. Additionally, a number of other memory resources are freed by shrinking them to the minimum size required by the scene configuration.

Scratch Buffer

A scratch memory block may be passed as a function argument to the function PxScene::simulate. As far as possible, PhysX will internally allocate temporary buffers from the scratch memory block, thereby reducing the need to perform temporary allocations from PxAllocatorCallback. The block may be reused by the application after the PxScene::fetchResults() call, which marks the end of simulation. One restriction on the scratch memory block is that it must be a multiple of 16K, and it must be 16-byte aligned.

In-place Serialization

PhysX objects cab be stored in memory owned by the application using PhysX' binary deserialization mechanism. See Serialization for details.

PVD Integration

Detailed information about memory allocation can be recorded and displayed in the PhysX Visual Debugger. This memory profiling feature can be configured by setting the trackOutstandingAllocations flag when calling PxCreatePhysics(), and raising the flag PxVisualDebuggerConnectionFlag::eMEMORY when connecting to the debugger with PxVisualDebuggerExt::createConnection().

Completion Tasks

A completion task is a task that executes immediately after PxScene::simulate has exited. If PhysX has been configured to use worker threads then PxScene::simulate will start simulation tasks on the worker threads and will likely exit before the worker threads have completed the work necessary to complete the scene update. As a consequence, a typical completion task would first need to call PxScene::fetchResults(true) to ensure that fetchResults blocks until all worker threads started during simulate() have completed their work. After calling fetchResults(true), the completion task can perform any other post-physics work deemed necessary by the application:

scene.fetchResults(true); game.updateA(); game.updateB(); ... game.updateZ();

The completion task is specified as a function argument in PxScene::simulate. More details can be found in the PhysAPI documentation.

Synchronizing with Other Threads

An important consideration for substepping is that simulate() and fetchResults() are classed as write calls on the scene, and it is therefore illegal to read from or write to a scene while those functions are running. For the simulate() function it is important to make the distinction between running and ongoing. In this context, it is illegal to read or write to a scene before simulate() exits. It is perfectly legal, however, to read or write to a scene after simulate() has exited but before the worker threads that started during the simulate() call have completed their work.

Note

PhysX does not lock its scene graph, but it will report an error in checked build if it detects that multiple threads make concurrent calls to the same scene, unless they are all read calls.

Note

Write operations that occur between simulate() and fetchResults() being called are buffered and applied during fetchResults(). Any state changes will override the results of the simulation. Read operations that occur between simulate() and fetchResults() being called will return the buffer state, i.e. the state at the beginning of the simulation or corresponding to any buffered state changes that will be applied during fetchResults().

Substepping

For reasons of fidelity simulation or better stability it is often desired that the simulation frequency of PhysX be higher than the update rate of the application. The simplest way to do this is just to call simulate() and fetchResults() multiple times:

for(PxU32 i=0; i<substepCount; i++)
{
    ... pre-simulation work (update controllers, etc) ...
    scene->simulate(substepSize);
    scene->fetchResults(true);
    ... post simulation work (process physics events, etc) ...
}

Sub-stepping can also be integrated with the completion task feature of the simulate() function. For an example of how this can be achieved, see SnippetSubStep.