The projection stage is going to divide values by the w component of vertices, so it is necessary to ensure no w=0 vertices remain. It’s also good for both efficiency and correctness to discard geometry that would be off-screen when rendered. Both goals are achieved by clipping primitives in homogeneous coordinates.
Clipping is done with clipping planes. There are six clipping planes enabled by default, though some GPUs may allow the user to add more. The six are: \begin{matrix} -w &\le& x &\le& w\\ -w &\le& y &\le& w\\ -w &\le& z &\le& w\\ \end{matrix} Put another way, vertices are inside the clipping region if the following results in a vector of non-negative numbers: \begin{bmatrix} 1&0&0&1\\ -1&0&0&1\\ 0&1&0&1\\ 0&-1&0&1\\ 0&0&1&1\\ 0&0&-1&1\\ \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} This second form is useful because each of the six resulting numbers is a signed distance from one of the six clipping planes, and signed distances make finding intersection points easy.
The standard plane equation is Ax+By+Cz+D = 0. The Ax+By+Cz+D part provides the signed distance between the point (x,y,z) and the plane. For homogeneous points, this equation generalizes to Ax+By+Cz+Dw = 0 with distance formula Ax+By+Cz+Dw. Written as a matrix, that is \begin{bmatrix} A&B&C&D\\ \end{bmatrix} \begin{bmatrix} x\\y\\z\\w \end{bmatrix} Thus the matrix form of the constraints above is evaluating six homogeneous plane signed distance formulas at once.
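As a small illustration of the matrix form, the following Python sketch evaluates all six signed distances for one homogeneous vertex. The names `CLIP_PLANES`, `signed_distances`, and `is_inside` are hypothetical, chosen for this example; the rows are copied from the matrix above, and a vertex is treated as inside when every distance is non-negative.

```python
# Each row is (A, B, C, D) for one of the six default clipping planes,
# in the same order as the rows of the matrix above.
CLIP_PLANES = [
    ( 1,  0,  0, 1),  # x + w >= 0, i.e. -w <= x
    (-1,  0,  0, 1),  # w - x >= 0, i.e.  x <= w
    ( 0,  1,  0, 1),  # y + w >= 0, i.e. -w <= y
    ( 0, -1,  0, 1),  # w - y >= 0, i.e.  y <= w
    ( 0,  0,  1, 1),  # z + w >= 0, i.e. -w <= z
    ( 0,  0, -1, 1),  # w - z >= 0, i.e.  z <= w
]

def signed_distances(v):
    """Evaluate Ax + By + Cz + Dw for every clipping plane."""
    x, y, z, w = v
    return [a * x + b * y + c * z + d * w for a, b, c, d in CLIP_PLANES]

def is_inside(v):
    """A vertex is inside the clipping region if no distance is negative."""
    return all(d >= 0 for d in signed_distances(v))
```

For the vertex (4,1,-1,2), only the second distance (the x \le w plane) comes out negative, so that is the only plane the vertex violates.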
Vertices that violate any one of the inequalities are discarded. Vertices that satisfy all six of the inequalities are kept. Each edge that connects a kept vertex to a discarded vertex results in the creation of a new vertex that lies exactly on the clipping plane, potentially changing the number of vertices in the primitive.
The edge with endpoints (1,2,3,4) and (4,1,-1,2) crosses the x \le w clipping plane (because 1<4 but 4 \not<2).
Rewriting the plane x \le w as a plane distance equation we get 1x + 0y + 0z - 1w \le 0, so with this sign convention a negative distance means inside. Plugging in the two vertices, we get distances -3 and 2, respectively. Weighting each vertex by the other vertex's distance, our new point is thus \begin{split} &\;\dfrac{\big(2(1,2,3,4)\big)-\big(-3(4,1,-1,2)\big)}{2-(-3)}\\ =&\; \dfrac{(2,4,6,8)+(12,3,-3,6)}{5}\\ = &\;(2.8,1.4,0.6,2.8) \end{split} This point lies on the plane (2.8 = 2.8) and on the edge (being a convex combination of (1,2,3,4) and (4,1,-1,2), with weights 0.4 and 0.6).
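The same computation can be written as a short Python sketch. The function names `plane_distance` and `intersect_edge` are hypothetical, and the plane is given as the coefficient tuple (A, B, C, D) = (1, 0, 0, -1) used in the worked example. The formula does not care which sign means inside; it only needs the two distances to have opposite signs.

```python
def plane_distance(plane, v):
    """Signed distance Ax + By + Cz + Dw of homogeneous point v."""
    return sum(coef * comp for coef, comp in zip(plane, v))

def intersect_edge(p, q, plane):
    """Point where edge pq crosses the plane.

    Assumes p and q are on opposite sides, so dq - dp is nonzero.
    Each endpoint is weighted by the other endpoint's distance.
    """
    dp = plane_distance(plane, p)
    dq = plane_distance(plane, q)
    return tuple((dq * pi - dp * qi) / (dq - dp) for pi, qi in zip(p, q))

# The worked example: plane x - w = 0, distances -3 and 2.
new_point = intersect_edge((1, 2, 3, 4), (4, 1, -1, 2), (1, 0, 0, -1))
```

Here `new_point` is approximately (2.8, 1.4, 0.6, 2.8), up to floating-point rounding.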
Clipping is partly an optimization: it means out-of-view objects are never rendered. But it also has correctness properties, preventing division-by-zero errors during projection and removing numerical instabilities caused when dividing a large number by a small number.
A triangle clipped against one plane can result in 0, 1, or 2 triangles.
The general approach is as follows:
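One possible shape for such an approach is sketched below in Python. It is an illustration under stated assumptions, not necessarily the exact procedure intended here: planes are (A, B, C, D) tuples, a non-negative signed distance counts as inside (matching the matrix form above), and the names `dist`, `crossing`, and `clip_triangle` are hypothetical.

```python
def dist(plane, v):
    """Signed distance Ax + By + Cz + Dw; non-negative means inside."""
    return sum(coef * comp for coef, comp in zip(plane, v))

def crossing(p, dp, q, dq):
    """Point on edge pq whose signed distance is zero."""
    return tuple((dq * pi - dp * qi) / (dq - dp) for pi, qi in zip(p, q))

def clip_triangle(tri, plane):
    """Clip one triangle against one plane; returns 0, 1, or 2 triangles."""
    d = [dist(plane, v) for v in tri]
    inside = [x >= 0 for x in d]
    kept = sum(inside)
    if kept == 0:
        return []            # entirely outside: discard
    if kept == 3:
        return [tuple(tri)]  # entirely inside: keep unchanged
    # Rotate so the odd-one-out vertex comes first; this preserves winding.
    i = inside.index(True) if kept == 1 else inside.index(False)
    a, b, c = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
    da, db, dc = d[i], d[(i + 1) % 3], d[(i + 2) % 3]
    ab = crossing(a, da, b, db)  # new vertex on edge ab
    ca = crossing(c, dc, a, da)  # new vertex on edge ca
    if kept == 1:
        return [(a, ab, ca)]     # only the corner at a survives
    # a is cut off, leaving the quad ab, b, c, ca; split it in two.
    return [(ab, b, c), (ab, c, ca)]
```

A vertex lying exactly on the plane has distance zero and is treated as inside, so the denominator in `crossing` is never zero.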
A triangle clipped against multiple planes may result in many new triangles. There are sophisticated approaches that try to generate the minimum number of triangles in some kind of canonical order, but it is far more common to implement the following order-of-planes-dependent approach:
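A minimal Python sketch of one such order-dependent scheme follows, assuming the idea is to clip against the planes one at a time and feed the output of each plane into the next. The function name `clip_all` and its parameter `clip_one` are hypothetical; `clip_one(tri, plane)` stands for any routine that clips a single triangle against a single plane and returns a list of triangles.

```python
def clip_all(triangles, planes, clip_one):
    """Clip a list of triangles against each plane in turn.

    The output of one plane becomes the input of the next, so the
    particular triangles produced depend on the order of `planes`.
    """
    for plane in planes:
        survivors = []
        for tri in triangles:
            # Each call contributes 0, 1, or 2 triangles.
            survivors.extend(clip_one(tri, plane))
        triangles = survivors
        if not triangles:
            break  # nothing left to clip against the remaining planes
    return triangles
```

Since each plane can at most double the number of triangles, the result stays small in practice, and most triangles pass through all six planes unchanged.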