The Compute Particles sample shows how OpenGL Compute Shaders can be used along with OpenGL rendering to create complex animations and effects entirely on the GPU.


APIs Used

  • glBindProgramPipeline
  • glUseProgramStages
  • glDispatchCompute
  • glMemoryBarrier

Shared User Interface

The Graphics samples all share a common app framework and certain user interface elements, centered on the "Tweakbar" panel on the left side of the screen, which lets you interactively control certain variables in each sample.

To show and hide the Tweakbar, simply click or touch the triangular button positioned in the top-left of the view.

Technical Details


This sample shows how compute shaders can be used to implement massively parallel animation tasks on the GPU, and pass the results to rendering. It demonstrates how vertex shaders can access any element of the compute shader's output buffer by indexing.

The high-level stages are as follows:

  1. The compute shader updates the positions and velocities of the particles on the GPU.

  2. The vertex shader accesses the resulting particle positions and expands them out to 4-vertex quads.

  3. The fragment shader applies a computed exponential fall-off alpha to draw smooth particles.

UI Options

The following options can be adjusted in the tweak UI:

  • The animate toggle enables and disables the compute shader invocation and thus the particle animations.

  • Enable attractor switches the magnitude of the moving attractor between zero and non-zero. When the attractor is off, only the noisy vector field affects the particles.

  • Sprite size adjusts the value sent to the vertex shader to set the size of the particle quads.

  • Noise strength adjusts the magnitude of the noise vector field that is added to the particle velocities each frame.

  • Noise frequency adjusts the scaling of the spatial frequency of the noise lookup.

  • Reset returns the particles to their initial positions.


The details of each of the high-level stages listed earlier are as follows:

Compute Shader

The compute shader operates on a pair of in/out buffers: one for positions and one for velocities. The velocities are used only by the compute shader, while the positions are read and written by the compute shader and then read by the vertex shader during rendering. The steps in the per-frame compute shader update are:

  1. The sources used to update each particle are read: these include previous position, previous velocity, and the position and strength of the attractor.

  2. Next, a vector-valued multi-octave (4 octaves) smooth noise value is computed and added to the velocity vector. This addition is scaled by the adjustable noise magnitude parameter.

  3. An attraction vector is computed with a quadratic falloff. In the code this appears as a scaling by the inverse cube of the distance, but the offset vector to the attractor is not normalized, so one factor of the distance accounts for the normalization. This vector is scaled by the adjustable attractor magnitude and added to the velocity.

  4. The velocity is added to the particle position as an integration step.

  5. The velocity is scaled by a damping factor to keep the system's energy in check.
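
The five steps above can be sketched on the CPU for a single particle. This is a minimal Python reference, not the sample's GLSL: the noise function is a stand-in, and the names update_particle, fbm3, noise3, the damping value of 0.95, the octave weighting, and the small-distance guard are all assumptions made for illustration. It does show why dividing the unnormalized offset by the cube of the distance yields a quadratic falloff.

```python
import math

def noise3(p, frequency):
    """Stand-in smooth vector noise (an assumption, not the sample's noise)."""
    x, y, z = (c * frequency for c in p)
    return (math.sin(y) + math.cos(z),
            math.sin(z) + math.cos(x),
            math.sin(x) + math.cos(y))

def fbm3(p, frequency, octaves=4):
    """Sum several noise octaves; doubling frequency and halving amplitude
    per octave is a common choice and is assumed here."""
    total = [0.0, 0.0, 0.0]
    amplitude = 1.0
    for _ in range(octaves):
        n = noise3(p, frequency)
        total = [t + amplitude * c for t, c in zip(total, n)]
        frequency *= 2.0
        amplitude *= 0.5
    return tuple(total)

def update_particle(pos, vel, attractor_pos, attractor_strength,
                    noise_strength, noise_frequency, damping=0.95):
    # Step 1: the inputs (previous position and velocity, attractor) are the arguments.
    # Step 2: add multi-octave noise, scaled by the noise magnitude.
    n = fbm3(pos, noise_frequency)
    vel = tuple(v + noise_strength * c for v, c in zip(vel, n))

    # Step 3: attraction. d / |d|^3 equals normalize(d) / |d|^2,
    # so the inverse cube on the unnormalized offset is a quadratic falloff.
    d = tuple(a - p for a, p in zip(attractor_pos, pos))
    dist = math.sqrt(sum(c * c for c in d))
    if dist > 1e-6:  # guard against division by zero (assumption)
        scale = attractor_strength / dist ** 3
        vel = tuple(v + scale * c for v, c in zip(vel, d))

    # Step 4: integrate by adding the velocity to the position.
    pos = tuple(p + v for p, v in zip(pos, vel))

    # Step 5: damp the velocity to keep the system's energy in check.
    vel = tuple(v * damping for v in vel)
    return pos, vel
```

On the GPU, each compute shader invocation would perform this update for one particle, reading from and writing to the position and velocity buffers rather than returning values.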

Vertex Shader

The vertex shader serves multiple purposes in this rendering pipeline. Not only does it project the vertices, it also serves as a simple "vertex amplifier," splitting the N particle positions into 4N vertices and offsetting them to the four corners of each particle.

  1. First, the particle index is computed by dividing the vertex ID by 4. Every four-vertex quad represents a single particle.

  2. The position of the particle is looked up in the compute shader buffer.

  3. The 0.0-1.0 U and V position of the vertex in the quad is computed via bitmasking of the vertex ID (i.e. its value mod 4). This is stored as quadPos, which is used directly as the texture coordinates.

  4. The quadPos is used to offset the particle center horizontally and vertically to form the correct quad corner.

  5. The resulting vertex is transformed and output.
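
Steps 1 through 4 above can be sketched as a small Python function that mimics what one vertex shader invocation does. The name expand_vertex, the particular bit-to-corner mapping (bit 0 selects U, bit 1 selects V), and centering the quad on the particle are assumptions for illustration; the sample's shader may order the corners differently. The final transform of step 5 is omitted.

```python
def expand_vertex(vertex_id, positions, sprite_size):
    """Return (corner_position, quad_pos) for one of the 4N quad vertices."""
    # Step 1: every four consecutive vertex IDs belong to one particle.
    particle = vertex_id >> 2  # same as vertex_id // 4

    # Step 2: look up the particle center in the compute shader's output.
    center = positions[particle]

    # Step 3: derive 0.0-1.0 U and V from the low two bits (vertex_id mod 4).
    # Assumed mapping: bit 0 selects U, bit 1 selects V.
    quad_pos = (float(vertex_id & 1), float((vertex_id >> 1) & 1))

    # Step 4: offset the center horizontally and vertically to reach the corner.
    corner = (center[0] + (quad_pos[0] - 0.5) * sprite_size,
              center[1] + (quad_pos[1] - 0.5) * sprite_size,
              center[2])
    return corner, quad_pos
```

For example, with two particles in positions, vertex IDs 0 to 3 produce the four corners of the first particle and IDs 4 to 7 the corners of the second, so no per-vertex attribute data is needed beyond the vertex ID.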

Fragment Shader

The fragment shader is extremely simple. It writes the passed-in varying color as the RGB channels and computes an exponential fall-off from the center of the particle, placing it in the alpha channel. It discards the fragment if the computed alpha is close to zero.
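
The fragment logic can be sketched as follows. This is a hedged Python illustration only: the function name shade_fragment, the Gaussian-style form exp(-falloff * r^2), and the falloff and cutoff constants are assumptions, since the text above states only that the fall-off is exponential and that near-zero alpha is discarded. Returning None stands in for the shader's discard.

```python
import math

def shade_fragment(color_rgb, quad_pos, falloff=16.0, cutoff=0.01):
    """Return (r, g, b, a) for a fragment, or None if it would be discarded."""
    # Squared distance from the quad center, with quad_pos in the 0.0-1.0 range.
    dx = quad_pos[0] - 0.5
    dy = quad_pos[1] - 0.5
    r2 = dx * dx + dy * dy

    # Exponential fall-off from the particle center (assumed functional form).
    alpha = math.exp(-falloff * r2)

    # Skip fragments that would contribute almost nothing.
    if alpha < cutoff:
        return None
    return (color_rgb[0], color_rgb[1], color_rgb[2], alpha)
```

With these assumed constants, a fragment at the quad center gets full alpha, while fragments at the quad corners fall below the cutoff and are dropped, which gives each quad a soft, round appearance.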