This post contains my derivation notes for a Jacobi iteration over a system of spherical rigid bodies linked into chains by “point” constraints (i.e. ball joints): in other words, a pearl necklace :). I’ve been working on a small framework project for comparing different GPU-based physics solvers, and this is meant to be a solver/scene that can serve as a baseline against which to compare the stability and performance of other solvers.
I’m sticking with a single, very simple constraint type for the scope of this derivation, both for concreteness and so that it maps directly to an actual simulation.
The result: 2.5k chains of 40 beads each (100k beads and constraints in total) at 60 fps on an RTX 4060.
I spent this weekend fascinated by the interesting patterns that can emerge from the Particle Life algorithm by Jeffrey Ventrella. While this algorithm needs no new implementations (there are many!) I decided to use it as an exercise in WebGL programming.
In the past few weeks, I learned about the Barnes-Hut algorithm for approximating long-range forces between many 2D particles from this video, and decided to try my own hand at a 3D implementation. My goal: to see how many stars I can fit into a single galaxy in real time, on my M1 MacBook Air. The video below was captured on that computer, with 262k stars. The code I wrote is here.
BIN-CSR is a sparse, symmetric matrix format, originally described by Daniel Weber et al. in their 2012 paper on GPU data structures. BIN-CSR is suitable for applications which use massively parallelized matrix operations, like some GPU-based finite element method solvers.
The diagonal elements of the matrix are stored separately, to enable fast preconditioning. Preconditioning is a step that speeds up the convergence of iterative system solvers like the conjugate gradient method.
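To illustrate why the separate diagonal array is convenient, here is a minimal sketch of a Jacobi preconditioner, which scales each entry of a residual vector by the inverse of the corresponding diagonal element. With the diagonal in its own contiguous array, this is a single linear pass. The function and array names here are illustrative, not taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Jacobi preconditioner sketch: z = D^{-1} r, where the diagonal D is
// stored as its own dense array (as in BIN-CSR). Names are illustrative.
std::vector<float> apply_jacobi_preconditioner(const std::vector<float>& diag,
                                               const std::vector<float>& r)
{
    assert(diag.size() == r.size());
    std::vector<float> z(r.size());
    for (std::size_t i = 0; i < r.size(); ++i)
        z[i] = r[i] / diag[i]; // divide each residual entry by its diagonal
    return z;
}
```

On the GPU the same operation maps to one thread per row, with no indirection through the sparse structure at all.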
The WebGL demos on this page aim to demonstrate the construction of a BIN-CSR matrix in an interactive way.
The following is an interactive representation of a full matrix. Black squares represent non-zero values, and they’re populated randomly as rows are added. Drag the slider to change the dimensions of the matrix.
Size
Density
The elements of this matrix are packed into padded bins. Each bin contains a fixed number of rows’ worth of data; that number is called the bin width. The following is an illustration of bin data packing. The slider can be used to adjust the bin width.
The above visual shows the theoretical grouping of the data. The actual layout in memory is optimized for coalesced memory access on the GPU. In practice, the data is stored in four separate arrays, visualized here:
The values in the pointer array (gray) are indices into the value and column arrays (black), indicating the starting index for each row. The pointer values are represented by gray connecting lines.
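To make the roles of these arrays concrete, here is a sketch of a matrix-vector product reading a plain pointer/column/value layout with a separately stored diagonal, ignoring the bin padding and interleaving described above for clarity. The array names (`ptr`, `col`, `val`, `diag`) are illustrative, not the paper's.

```cpp
#include <cstddef>
#include <vector>

// Sketch: y = A x over the four arrays described above, without bin
// packing. ptr[i]..ptr[i+1] delimit row i's off-diagonal entries in
// val/col; the diagonal lives in its own diag array.
std::vector<float> spmv(const std::vector<std::size_t>& ptr,
                        const std::vector<std::size_t>& col,
                        const std::vector<float>& val,
                        const std::vector<float>& diag,
                        const std::vector<float>& x)
{
    std::vector<float> y(diag.size());
    for (std::size_t i = 0; i < diag.size(); ++i) {
        float sum = diag[i] * x[i];            // diagonal term, stored separately
        for (std::size_t k = ptr[i]; k < ptr[i + 1]; ++k)
            sum += val[k] * x[col[k]];         // off-diagonal terms for row i
        y[i] = sum;
    }
    return y;
}
```

The bin-packed version reorders `val` and `col` so that consecutive GPU threads (one per row within a bin) read consecutive memory addresses, but the traversal logic per row is the same.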
The second installment in my progress on a ductile fracture physics engine. So far, it supports neither ductile nor fracturable materials, but elastodynamic deformation is a step in the right direction.
So far, it’s pretty much a jello cube, created mostly following the FEM derivations by Erleben in Physics Based Animation. More technical updates to follow.
In the following video a teapot made of 149,210 tetrahedra is generated from a regular surface-bounded mesh. Fifty tetrahedra are removed each frame. After each removal the mesh is traversed in search of “islands” - groups of adjacent tetrahedra which share no faces with other groups - which are indicated by color.
I am beginning work on an FEM-based model for deforming and fracturing physics objects. Such a model requires that objects be composed of assemblies of tetrahedra. This post contains a high-level overview of the data structures and algorithms I’m experimenting with.
Last year I faced a problem which required me to compute products (convolutions) between vectors, matrices, and vectors of matrices (rank-3 tensors).
I wanted to do this in C++, and the best option I could find seemed to be FTensor. Although it probably would have worked for my particular case, FTensor seems to have been written in an older style that predates variadic templates, and I wanted something a bit more generic.
In a quest for further flexibility I began to work on my own tensor library, Weyl.
Weyl is header-only and contains just one core class, tensor, whose entire source code is very short (at the moment, 367 lines including comments). Thanks to fancy-pants template expansion, large tensor convolutions are (usually!) compiled into a list of simple add and multiply operations.
In order to produce this result, we need the template function sum to expand to something equivalent to the following set of embedded loops (assuming that the values in the resulting tensor were initialized to zero).
for (i = 0; i < 3; ++i)          // i and j are a's contribution to c's rank indexes
  for (j = 0; j < 3; ++j)
    for (l = 0; l < 2; ++l)      // l and m are b's contribution to c's rank indexes
      for (m = 0; m < 4; ++m)
        for (k = 0; k < 5; ++k)  // k is the index of the dimension to sum over
          c[i][j][l][m] += a[i][j][k] * b[l][k][m];
This is essentially what weyl::tensor produces, and since the bounds of each loop are known, they are unrolled at compile time. You can see the sort of assembly it produces here on Godbolt. For brevity, the full source is not included in this example.