Face Culling (Unity)

  • The current Java version seems like have some king of culling, but the new Unity version - don't. If I guessed correctly the reason is higher complexity of geometry - different shapes, rotations, scales and so on. But without culling meshes will exponentially grow in size especially for large builds. Merging meshes into one (current approach) helps a bit, but only until amount of pieces will become really large.


    What are benefits of culling? Here is a small example - if you have a cube made of smaller cubes (for example 8 * 8 * 8 or 512 cubes) it will have all faces for each cube by default (6 * 512 = 3072 faces)


    But if we will remove (cull) all inner faces between cubes (as they are invisible) we will get only surface squares, for this cube this will be 8 * 8 * 6 = 384 faces, it is only 12,5% from original cube faces


    So, this is simple for cubes oriented by grid, but what about more complex geometry? Well, it's more difficult than for cubes. I tried to find solution for this problem in general and found this topic. There was suggested interesting and simple method - culling by normals. The problem itself is to define when object face is inside another object, and the easiest solution is to compare points of this object with nearest face of another object and comape with its normal. If the normal have the same direction as the distance to face - point is outside the object, but if not - it is inside.


    Solution for attached square faces can simpler - if both faces have oppositely directed normals than if they have same points (with distance between < 0.01 or other small value) - both faces should be culled, otherwise only small face should be culled.


    Culling will also be a good solution for transparent faces problem that was discussed here.

    If someone knows more successful and universal algorithms - feel free to suggest them. Let's help Red51 together ;)


    Images source article (about greedy meshing, another optimisation algorithm)

  • Having proper face culling would be indeed helpful: It would greatly reduce memory consumption (RAM and VRAM). It would certainly also improve the framerate, although I wouldn't expect massive gains here (considering that most faces won't be fully rendered anyway due to early depth testing).


    However, unfortunately it's almost impossible to implement this for the new version without killing the performance (in terms of world generation speed) :| Unlike the Java version (which had indeed face culling, at least for blocks), every element can have an arbitrary position, size and rotation now - so it's complicated to determine whether or not a face is fully occluded by another element. In a block world, this would be a lot easier to find out, because we just have to check if there is actually a block next to the current element - since blocks would always have the same size and orientation, they will definitely fully occlude a face.


    But in addition to this, it's also a lot slower to find neighbour elements in the new version: In the Java version, each block was stored in a flattened 3d array, so we could simply calculate the actual array index from the block position (which was always an integer position) - this way it was very easy and blazingly fast to find all neighbour blocks. In the new version, there is no way to find an element depending on its position, so we still have to go through the whole list until we find the appropriate element. This results in lots of comparisons.


    It would probably work on a small scale, but considering that some people create buildings consisting of thousands or even hundreds of thousands of construction elements, these calculations would quickly become a performance killer (in terms of generation performance) :dizzy:


    At the end of the day, I'm not sure if it's worth it to trade world generation speed for memory savings and smaller framerate improvements (as mentioned above) :/


    We want, however, offer a way to export buildings to .obj or .fbx files - there is would be certainly helpful to get rid of occluded faces (even if this is a costly operation). Once that's implemented, we could perform some runtime tests to find out the actual performance impact, but I'm skeptical about this...


    Merging meshes into one (current approach) helps a bit, but only until amount of pieces will become really large.

    Actually the current approach doesn't just merge meshes, the game does only generate a single mesh right from the beginning: It allocates a native buffer large enough to hold the vertices of all construction elements in a chunk, then the game goes through all elements and calculates their individual vertex positions (depending on the element rotation, size, surface offset etc) and stores them in the buffer. Eventually it generates a single mesh from the populated buffer. This is quite fast and also keeps memory allocations and garbage at a minimum ^^

  • But in addition to this, it's also a lot slower to find neighbour elements in the new version: In the Java version, each block was stored in a flattened 3d array, so we could simply calculate the actual array index from the block position (which was always an integer position) - this way it was very easy and blazingly fast to find all neighbour blocks. In the new version, there is no way to find an element depending on its position, so we still have to go through the whole list until we find the appropriate element. This results in lots of comparisons.

    You can use a very simple, but effective approach - temporal voxelisation. For example you have a grid of "temporal voxels", each voxel is a list of elements. When you need to calculate a mesh you firstly iterate all shapes and check what voxels they occupies (with adding this shapes to list). Then when you need to check face overlapping - you need to test only some voxel lists instead of full object list. This task can be also easily be paralleled as all lists will not change during this operation. After operation is complete you can empty lists and use in another mesh building

  • Voxelisation itself is simple and can be applied to any geometry. Here is small old example of my voxeliser. It uses mesh triangles as a source for voxelisation. Interior can be filled with simple flood fill algorithm. For suggested approach voxels can be large, so this stage will be very fast

  • This is basically a good idea and would certainly reduce the time needed to find neighbour elements (so we don't have to go through the whole list for every element). But this would still be problematic when it comes to small elements or many overlapping elements: Unless we use a very high resolution for voxelisation (which introduces other issues), we would still end up comparing too many elements in the worst case... if a user places a few hundreds or thousands of elements on a small space (to create a small but detailed sculpture, for example), I'm afraid the calculations would still be too costly (if a cell holds 1K elements, for example, this already results in 1M comparisons for face culling) :| Even if this only results in a slowdown of a few seconds per chunk, it quickly adds up on bigger view distances.


    This would certainly work on a smaller scale, but would be too problematic when it comes to detailed buildings consisting of numerous densly placed elements...


    But as mentioned, it's on our to-do list for our mesh exporter. Once it's implemented, it would be definitely interesting to see some actual results regarding the performance impact.

Participate now!

Don’t have an account yet? Create a new account now and be part of our community!