The code of the vertex and fragment shaders is specified by the
programmer. The vertex shader has built-in inputs gl_VertexID and
gl_InstanceID and built-in outputs gl_Position and gl_PointSize. The
fragment shader has built-in inputs gl_FragCoord, gl_FrontFacing, and
gl_PointCoord and built-in output gl_FragDepth. The fragment shader also
has a single programmer-named color output which is written to the frame
buffer for display. In between these two shaders the primitives are
assembled, clipped, projected, and rasterized, with values interpolated
to each resulting fragment.
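
As a minimal sketch of the built-ins listed above, the following GLSL ES 3.00 shader pair (held in JavaScript strings, the form in which WebGL2 receives shader source) draws points using no programmer-supplied inputs at all. The names pointVertexSource, pointFragmentSource, and fragColor are illustrative choices, and the particular placement and coloring formulas are arbitrary.

```javascript
// Vertex shader: reads only built-in inputs, writes only built-in outputs.
const pointVertexSource = `#version 300 es
void main() {
    // Spread vertices along a diagonal based on their index.
    float t = float(gl_VertexID) * 0.1;
    gl_Position = vec4(t - 0.5, t - 0.5, 0.0, 1.0);
    // Only matters when drawing points; later instances get larger points.
    gl_PointSize = 4.0 + float(gl_InstanceID);
}`;

// Fragment shader: one programmer-named color output (here "fragColor").
const pointFragmentSource = `#version 300 es
precision highp float;
out vec4 fragColor;
void main() {
    // gl_PointCoord is the position within the point, gl_FragCoord.z the depth.
    fragColor = vec4(gl_PointCoord, gl_FragCoord.z, 1.0);
}`;
```
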
The vertex shader may take additional programmer-specified inputs
called attributes. The values of these attributes are pulled from
special arrays in graphics memory called buffers. The set of values to
run the vertex shader on, together with how sets of vertices are to be
assembled into primitives, is specified by the specific draw command
used, as discussed below.
The vertex shader may produce additional outputs called varyings.
These are automatically interpolated by the rasterizer and their
interpolated values (still called varyings) are provided as additional
inputs to the fragment shader.
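
To make attributes and varyings concrete, here is a hedged sketch of a shader pair in GLSL ES 3.00, where attributes are declared as vertex shader `in` variables and varyings as a vertex shader `out` matched by a fragment shader `in` of the same name. The names position, normal, vNormal, and fragColor are assumptions for illustration, as is the choice to display the normal as a color.

```javascript
const normalVertexSource = `#version 300 es
in vec4 position;   // attribute: one value per vertex, pulled from a buffer
in vec3 normal;     // attribute
out vec3 vNormal;   // varying: written per vertex
void main() {
    gl_Position = position;
    vNormal = normal;
}`;

const normalFragmentSource = `#version 300 es
precision highp float;
in vec3 vNormal;    // varying: the rasterizer's interpolated value
out vec4 fragColor;
void main() {
    // Interpolating unit vectors does not keep them unit length.
    vec3 n = normalize(vNormal);
    fragColor = vec4(n * 0.5 + 0.5, 1.0);
}`;
```
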
Both shaders have access to global values called uniforms that are the
same for all vertices and fragments in a given draw command. Sending
values to the buffers tends to be significantly slower than rendering
from buffers that are already there, so there’s a preference for making
the buffers static, with values specified once and rendered many times;
changing the uniforms each frame can create per-frame motion with static
buffers. Uniforms are also used for large data like textures.
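
The static-buffer, changing-uniform pattern might look like the following sketch. It assumes a vertex shader that declares `uniform mat4 rotation;` and a vertex array object `vao` that was filled once during setup; drawFrame and its parameters are hypothetical names, not part of any library.

```javascript
// Hypothetical per-frame draw: the buffers behind `vao` never change,
// only the uniform does.
function drawFrame(gl, program, vao, indexCount, milliseconds) {
    const angle = milliseconds / 1000; // one radian per second
    const c = Math.cos(angle);
    const s = Math.sin(angle);
    // Rotation about the z axis, in the column-major order WebGL expects.
    const rotation = new Float32Array([
         c, s, 0, 0,
        -s, c, 0, 0,
         0, 0, 1, 0,
         0, 0, 0, 1,
    ]);
    gl.useProgram(program);
    const location = gl.getUniformLocation(program, 'rotation');
    gl.uniformMatrix4fv(location, false, rotation);
    gl.bindVertexArray(vao);
    gl.drawElements(gl.TRIANGLES, indexCount, gl.UNSIGNED_SHORT, 0);
    return rotation;
}
```

In a browser this would typically be called from a requestAnimationFrame callback, which supplies the time in milliseconds.
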
To run a shader program, the GPU needs to know
There are multiple ways to provide this data, each of which has multiple steps. We illustrate the first two with an example based on the following simple three-triangle bowl object:
Listed in rough order of likelihood to be what you want, from most likely to least likely, these are:
For most polygonal approximations of smooth surfaces:
For each scene object,
Array | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Position | 0 | 0 | -1 | ½ | -1 | 0 | ½ | 1 | 0 | -1 | 0 | 0 |
Normal | 0 | 0 | 1 | -⅓ | ⅔ | ⅔ | -⅓ | -⅔ | ⅔ | √5/3 | 0 | ⅔ |
Index | 0 | 1 | 2 | 0 | 2 | 3 | 0 | 3 | 1 |
For each scene object, call gl.drawElements.
This works well for almost any object type. It is a bit less efficient than the next option for drawing points or for drawing other primitives that do not share vertex attributes (such as flat-shaded polyhedra).
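
One plausible arrangement of the indexed approach for the bowl in the table above is sketched below. The function names setupIndexed and drawIndexed are hypothetical, and the sketch assumes the shader program declares 3-component attributes named position and normal.

```javascript
// Bowl data from the table above: 4 vertices, 3 triangles.
const r5 = Math.sqrt(5) / 3;
const bowl = {
    position: new Float32Array([0, 0, -1,  0.5, -1, 0,  0.5, 1, 0,  -1, 0, 0]),
    normal: new Float32Array([0, 0, 1,  -1/3, 2/3, 2/3,  -1/3, -2/3, 2/3,  r5, 0, 2/3]),
    index: new Uint16Array([0, 1, 2,  0, 2, 3,  0, 3, 1]),
};

// One-time setup: the vertex array object remembers every binding made
// while it is bound, including the element array buffer.
function setupIndexed(gl, program, mesh) {
    const vao = gl.createVertexArray();
    gl.bindVertexArray(vao);
    for (const name of ['position', 'normal']) {
        const buffer = gl.createBuffer();
        gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
        gl.bufferData(gl.ARRAY_BUFFER, mesh[name], gl.STATIC_DRAW);
        const location = gl.getAttribLocation(program, name);
        // 3 floats per vertex, tightly packed, starting at byte 0.
        gl.vertexAttribPointer(location, 3, gl.FLOAT, false, 0, 0);
        gl.enableVertexAttribArray(location);
    }
    const indexBuffer = gl.createBuffer();
    gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, indexBuffer);
    gl.bufferData(gl.ELEMENT_ARRAY_BUFFER, mesh.index, gl.STATIC_DRAW);
    gl.bindVertexArray(null);
    return vao;
}

// Per-frame draw: 9 index values make 3 triangles.
function drawIndexed(gl, vao, mesh) {
    gl.bindVertexArray(vao);
    gl.drawElements(gl.TRIANGLES, mesh.index.length, gl.UNSIGNED_SHORT, 0);
}
```
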
For most disconnected points or flat-shaded polygons:
For each scene object,
Array | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Position | 0 | 0 | -1 | ½ | -1 | 0 | ½ | 1 | 0 | 0 | 0 | -1 | ½ | 1 | 0 | -1 | 0 | 0 | 0 | 0 | -1 | -1 | 0 | 0 | ½ | -1 | 0 |
Normal | 0 | 0 | 1 | -⅓ | ⅔ | ⅔ | -⅓ | -⅔ | ⅔ | 0 | 0 | 1 | -⅓ | -⅔ | ⅔ | √5/3 | 0 | ⅔ | 0 | 0 | 1 | √5/3 | 0 | ⅔ | -⅓ | ⅔ | ⅔ |
For each scene object, call gl.drawArrays.
This works well for objects that do not share vertex attributes, such as points or flat-shaded polyhedra. If vertices and their attributes are used for multiple primitives, as is the case for virtually all polygonal approximations of smooth objects, the previous option is more efficient.
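
The non-indexed arrays in the table above can be derived from the indexed ones by copying each vertex's values once per triangle corner. The sketch below shows that expansion for the bowl's positions; expandIndexed and drawFlat are hypothetical helper names, and drawFlat assumes a vertex array object already set up with the expanded buffers.

```javascript
// Copy `size` values per vertex, once for every entry in the index list.
function expandIndexed(values, size, indices) {
    const out = new Float32Array(indices.length * size);
    indices.forEach((vertex, corner) => {
        for (let k = 0; k < size; k += 1) {
            out[corner * size + k] = values[vertex * size + k];
        }
    });
    return out;
}

const bowlPosition = [0, 0, -1,  0.5, -1, 0,  0.5, 1, 0,  -1, 0, 0];
const bowlIndex = [0, 1, 2,  0, 2, 3,  0, 3, 1];
// 9 corners with 3 values each: the 27-entry Position row.
const flatPosition = expandIndexed(bowlPosition, 3, bowlIndex);

// Per-frame draw: no element array buffer is involved.
function drawFlat(gl, vao, vertexCount) {
    gl.bindVertexArray(vao);
    gl.drawArrays(gl.TRIANGLES, 0, vertexCount);
}
```
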
For many copies of the exact same object:
If you have multiple copies of the same scene object in the scene such
that you can easily compute their placement using the same uniforms
coupled with an integer telling you which copy you’re drawing, then use
one of the previous two options but use the gl.drawElementsInstanced or
gl.drawArraysInstanced methods instead of the non-instanced options.
This is generally much faster than using the non-instanced options repeatedly, but unless you have identical objects positioned in some kind of fixed grid or pseudo-random scattering it is unlikely to be useful.
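
As an illustration of the fixed-grid case, the sketch below places each copy from gl_InstanceID alone. The uniforms columns and spacing, the attribute name position, and the function drawGrid are all assumptions made for this example; the uniforms would need to be set with gl.uniform1i and gl.uniform1f before drawing.

```javascript
const gridVertexSource = `#version 300 es
in vec4 position;
uniform int columns;     // hypothetical: number of copies per row
uniform float spacing;   // hypothetical: distance between neighboring copies
void main() {
    // gl_InstanceID is the integer saying which copy is being drawn.
    int row = gl_InstanceID / columns;
    int col = gl_InstanceID % columns;
    vec2 offset = vec2(float(col), float(row)) * spacing;
    gl_Position = position + vec4(offset, 0.0, 0.0);
}`;

// One call draws every copy; the last argument is the number of instances.
function drawGrid(gl, vao, indexCount, copies) {
    gl.bindVertexArray(vao);
    gl.drawElementsInstanced(gl.TRIANGLES, indexCount, gl.UNSIGNED_SHORT, 0, copies);
}
```
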
For very many distinct objects:
For a set of scene objects that will have the same set of vertex attributes,
Make an array of attribute values for each vertex of all scene objects, one after the other.
For example, if you have a 12-vertex sphere and a 30-vertex knob you’d put the vertices of the sphere in indices 0 through 12n-1 and of the knob in indices 12n through 42n-1, where n is the number of values per vertex. Technically you can interleave vertices of different objects, but doing so has no advantage and might impede cache performance.
Make an array of primitive connectivity for all scene objects, one after the other.
For example, if you have a 20-triangle sphere and a 50-triangle knob you’d put the sphere’s triangle indices in entries 0 through 59 and the knob’s in entries 60 through 209. You cannot interleave the triangle indices: they have to be grouped by scene object.
Make a vertex array object on the GPU to collect the next steps
Send each attribute values array to the GPU as an array buffer
Send the connectivity array to the GPU as an element array buffer
Bind that vertex array object
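For each scene object, call gl.drawElements with an offset locating the
object’s first entry in the index array and a count of the number of
index values. For example, the 50-triangle knob above starts at index
position 60 and has a count of 150. WebGL measures offset in bytes, so
with 2-byte UNSIGNED_SHORT indices the offset argument would be 120.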
This works well for almost any object type. It’s a bit more confusing to the programmer and makes for harder-to-maintain code, but it uses fewer buffers on the GPU and can be marginally faster and use slightly less GPU memory.
There’s also a similar shared-array, offset-and-count option for
gl.drawArrays, gl.drawElementsInstanced, and gl.drawArraysInstanced.
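
The bookkeeping for the shared-array approach with gl.drawElements can be sketched as follows. It assumes each input mesh is authored with its own indices starting from 0 (so they must be shifted when concatenated) and has a single 3-component position attribute; mergeMeshes and drawRange are hypothetical helpers. Because gl.drawElements takes its offset in bytes, the sketch multiplies the index position by the size of one index.

```javascript
// Concatenate meshes of the form {position: number[], index: number[]}.
function mergeMeshes(meshes) {
    const position = [];
    const index = [];
    const ranges = [];
    let vertexBase = 0;
    for (const mesh of meshes) {
        // Remember where this object's indices start and how many it has.
        ranges.push({ first: index.length, count: mesh.index.length });
        // Shift indices so they still refer to this mesh's own vertices.
        for (const i of mesh.index) index.push(i + vertexBase);
        for (const v of mesh.position) position.push(v);
        vertexBase += mesh.position.length / 3;
    }
    return {
        position: new Float32Array(position),
        index: new Uint16Array(index),
        ranges,
    };
}

// Draw one object out of the shared buffers (its VAO must be bound).
function drawRange(gl, range) {
    const bytesPerIndex = Uint16Array.BYTES_PER_ELEMENT; // 2
    gl.drawElements(gl.TRIANGLES, range.count, gl.UNSIGNED_SHORT,
        range.first * bytesPerIndex);
}
```

For a 20-triangle sphere followed by a 50-triangle knob, the knob's range would have first 60 and count 150, giving a byte offset of 120.
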
For saving GPU memory given several small attributes:
WebGL assumes all attributes are 4-vectors and at least nominally expands smaller attributes to that size automatically. If you have multiple attributes that collectively take up no more than 4 floats per vertex (for example, a 2D texture coordinate and a 1D shininess parameter) you can save some time and space by combining them into a larger vector when providing the buffer.
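
For the texture-coordinate-plus-shininess example, the combination could be done on the CPU before uploading, as in this sketch. The function packAttributes and the attribute name texcoordShininess are hypothetical.

```javascript
// Pack a 2D texture coordinate and a 1D shininess into one 3-vector per vertex.
function packAttributes(texcoord, shininess) {
    const vertexCount = shininess.length;
    const packed = new Float32Array(vertexCount * 3);
    for (let v = 0; v < vertexCount; v += 1) {
        packed[v * 3 + 0] = texcoord[v * 2 + 0];
        packed[v * 3 + 1] = texcoord[v * 2 + 1];
        packed[v * 3 + 2] = shininess[v];
    }
    return packed;
}

// The vertex shader would then declare a single attribute, for example
// `in vec3 texcoordShininess;`, and read `.xy` as the texture coordinate
// and `.z` as the shininess.
```

The packed array is uploaded as one array buffer and described with a single gl.vertexAttribPointer call of size 3, instead of two buffers and two attribute slots.
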