  • Slow Performance on Android TV

Some of our target devices for the game we're working on are various Android and Fire TV devices, which have notoriously lousy specs. Our game is currently very simple in terms of what is happening on screen, yet on Android TV devices we get abysmal performance, which seems to degrade over time as the Skeletons animate. In the GameScene we are getting just 15 FPS, dropping to near 0 over a few minutes.

I currently have just 3 skeletons in the GameScene: the Player, the Opponent, and the Crowd in the background. The Player and Opponent both have several animations for controlling them in the fight. The Crowd just has a small number of animations, layered on top of each other and programmatically adjusted for "excitement" by setting the Alpha value of one of the animation tracks per update and adjusting the TimeScale of that track as well. (Excitement wanes over time if "exciting" things aren't happening in the fight.)
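
The excitement logic described above can be sketched roughly as follows. This is a minimal, engine-agnostic sketch in Python, not the game's actual Unity code; the class name, the decay rate, and the time-scale range are all hypothetical. In spine-unity the two computed values would be written to the crowd track's Alpha and TimeScale once per update.

```python
from dataclasses import dataclass


@dataclass
class CrowdExcitement:
    """Hypothetical model of crowd excitement (0 = calm, 1 = fully excited)."""

    level: float = 0.0
    decay_per_second: float = 0.1  # assumed value, for illustration only

    def bump(self, amount: float) -> None:
        """Raise excitement when something exciting happens, clamped to 1."""
        self.level = min(1.0, self.level + amount)

    def update(self, dt: float) -> None:
        """Let excitement wane over time, clamped to 0."""
        self.level = max(0.0, self.level - self.decay_per_second * dt)

    def track_alpha(self) -> float:
        """Alpha for the layered 'excited' animation track."""
        return self.level

    def track_time_scale(self, min_scale: float = 0.5, max_scale: float = 1.5) -> float:
        """Time scale for that track, interpolated linearly from the level."""
        return min_scale + (max_scale - min_scale) * self.level


crowd = CrowdExcitement()
crowd.bump(0.8)    # an exciting hit lands
crowd.update(2.0)  # two quiet seconds pass
print(crowd.track_alpha(), crowd.track_time_scale())
```

Since both values are plain arithmetic on a single float, this part of the per-frame work is negligible next to mesh generation and rendering.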

To allow for future player customization, the Player currently sets its Skeleton's Skin and Attachment Colors (using either RegionAttachment.SetColor or MeshAttachment.SetColor, depending on the type of Attachment being colored). However, we only do this in the Player's OnAwake method and in OnValidate, so it shouldn't be causing any significant issues here, except maybe during loading.

I currently update all three skeletons from a 'GameController' script using SkeletonAnimation.Update and have set the SkeletonAnimation scripts to "Manual Update" so that we are only updating each skeleton once a frame.

Upon hooking up the Profiler to this Android TV device I've been testing on, I found that most of the slow-down appears to be coming directly from Spine, specifically:
SkeletonAnimation.LateUpdate
SkeletonRenderer.LateUpdate
SkeletonRenderer.LateUpdateMesh

This call chain takes nearly 55 ms to complete each frame!
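
For scale, here is a small back-of-the-envelope sketch in Python of what 55 ms per frame means. The helper names are made up for illustration, and the 60 FPS target is an assumption at this point in the thread.

```python
def fps_from_frame_time(frame_ms: float) -> float:
    """Best-case frames per second if every frame takes frame_ms milliseconds."""
    return 1000.0 / frame_ms


def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame, in milliseconds, at a given target frame rate."""
    return 1000.0 / target_fps


late_update_ms = 55.0  # figure read from the profiler above
budget_60 = frame_budget_ms(60)  # assumed target of 60 FPS

print(round(fps_from_frame_time(late_update_ms), 1))  # 18.2
print(round(budget_60, 2))                            # 16.67
print(round(late_update_ms / budget_60, 1))           # 3.3
```

In other words, a 55 ms step alone caps the frame rate at about 18 FPS and uses roughly 3.3 times the whole frame budget of a 60 FPS target.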

Here is a screenshot of the profiler showing these results:

Any help to solve this issue, and increase our performance on Android TV devices (and most likely old crappy Android phones and tablets as well) would be greatly appreciated!

Edit: A small extra detail is that the Player's SkeletonAnimation is rendered to a RenderTexture so that we can properly handle opacity, and this is essential for getting the look we want for our GameScene. Think Little Mac from Super Punch-Out!! and that is the effect we are making, aside from the fully opaque gloves (though now that I think about it, that would be very cool). Disabling the Player's SkeletonAnimation entirely, or more specifically not loading the Player Character's scene additively, gets us from around 10 FPS to 20 FPS from that single change alone.

Edit 2: After some more experiments, limiting the scene to a single SkeletonAnimation and setting the Animation Update setting to "In Update" rather than "Manual Update", I was able to get between 40-50 FPS depending on which SkeletonAnimation we are rendering. This is obviously not a solution though, because we need at least these three SkeletonAnimations rendered to the GameScene each frame (the Player, the Opponent, and the Crowd). This is leading me to wonder whether the artists are doing something wrong or inefficient when creating these Skeletons, or whether there are some tips and tricks I'm missing to squeeze the most performance out of these skeletons at run-time. As the Software Engineer of the team, I don't have a great understanding of how the artists have set up these Skeletons under the hood, but I can ask them, as well as provide exported JSONs, to help nail down the cause of these slowdowns.

  • Harald replied to this.

    Could you (or your artists) show the stats from the metrics view for each skeleton?
    http://en.esotericsoftware.com/spine-metrics

    As a general guideline, have a look at the performance section and see how it applies to your specific setup:
    http://en.esotericsoftware.com/spine-metrics#Performance

      ExNull-Tyelor [.. ] and by also setting the Animation Update setting to "In Update" rather than "Manual Update", I was able to get between 40-50 FPS depending on which SkeletonAnimation we are rendering.

      Did you perhaps confuse the two update modes? In Update would be the default, and Manual Update would mean that no automatic update happens and you need to manually call update on the SkeletonAnimation component yourself.

        Mario Could you (or your artists) show the stats from the metrics view for each skeleton?

        Hey, certainly! Here is the Metrics View for each of the three skeletons we have rendering on the GameScene:

        Player:

        Opponent:

        Crowd:

        Mario As a general guideline, have a look at the performance section and see how it applies to your specific setup: http://en.esotericsoftware.com/spine-metrics#Performance

        I've read through those guidelines before while we were still working in Cocos2d-x, but it never hurts to read through them again, maybe I missed something or forgot something crucial to the performance here (though these same skeletons ran quite well in Cocos2d-x compared to Unity).

        Harald Did you perhaps confuse the two update modes? In Update would be the default, and Manual Update would mean that no automatic update happens and you need to manually call update on the SkeletonAnimation component yourself.

        No, and I apologize if my question was confusing. I initially had the skeletons set to Manual Update mode and was calling Update on each skeleton myself within a controller class. What I meant was that I commented out the manual calls to Update in the controller class's Update method and switched the Skeletons to In Update instead. I'm not sure whether this actually affected performance at this point, however. I can confirm that lowering the number of skeletons rendered from 3 to 2 to 1 gave a performance improvement each time, though: nearly up to 45 FPS for just the Player skeleton.

        • Harald replied to this.

          Thanks for posting the additional info. In general, the skeleton metrics values are all on the high side, which might however be justified depending on your scene setup, with only three skeletons visible. Obviously it would always help to reduce the vertex, bone, constraint, etc. counts, and especially to avoid using clipping polygons.

          FYI: There is also a section on performance recommendations on the spine-unity documentation page here:
          https://esotericsoftware.com/spine-unity#Performance

          ExNull-Tyelor Opponent:

          Please note that the opponent skeleton is using a clipping polygon, and also has a high vertex count of 1900 vertices. If the clipping polygon could be avoided, this would be highly recommended, as mentioned in the documentation. As a quick test of whether disabling clipping improves the situation, you could change the Inspector to Debug mode and disable Use Clipping at the opponent's SkeletonAnimation component. (Afterwards you can change the Inspector back to Normal mode, of course.)

          Please also be sure to measure FPS and general timings with Deep profile setting disabled and with release builds, otherwise profiler augmentations might distort the actual timings.

            Harald Thanks for posting the additional info. In general, the skeleton metrics values are all on the high side, which however might be justified depending on your scene setup, having only three skeletons visible. Obviously it would always help to reduce the vertex, bone, constraint, etc. counts, and especially avoiding using clipping polygons.

            If you want a bit of extra context, check out Election Year Knockout. We are making a sequel to our game, and so the Spine characters are essentially the entire game, hence the high metric count.

            Harald FYI: There is also a section on performance recommendations on the spine-unity documentation page here:

            Thanks! I went and read through there, but didn't see any "a-ha!"s or gotchas that I think I might've missed there unfortunately...

            Harald You could have a quick try whether disabling clipping at the opponent SkeletonAnimation component improves the situation by changing the Inspector to Debug mode and at the SkeletonAnimation component disabling Use Clipping.

            Harald Please also be sure to measure FPS and general timings with Deep profile setting disabled and with release builds, otherwise profiler augmentations might distort the actual timings.

            I went and gave these a quick test, making sure I wasn't using a development build or profiling. With only the Opponent skeleton rendering, and nothing else, we get around 28 FPS on the Android TV regardless of whether clipping is enabled or disabled (we are targeting 60 FPS in game and 30 FPS in menus). The moment another SkeletonAnimation is added (either the Player or the Crowd) we lose about 10 FPS.

            Something else I'd like to point out is that the exact same Player and Crowd Skeletons, along with different Opponent Skeletons are currently used in Cocos2d-x and run at around 60 FPS on the same Android TV. I don't think any of us anticipated such a massive difference in performance for the Spine Runtimes between C++/Cocos2d-x and C#/Unity, and it's quite disheartening to be honest since most of our players on Election Year Knockout are on Android TV...

            • Mario replied to this.

              ExNull-Tyelor I wouldn't necessarily put the blame on the runtime itself, it sounds like the Unity renderer might be at fault here, especially if the Cocos2d-x version runs smoothly. Our Cocos2d-x renderer is actually a custom thing that directly calls into OpenGL.

              I'm not super familiar with Unity's renderer options, but my guess is whatever you selected might be too much for Android TV devices. These devices generally use SoCs with GPUs that are terrible when it comes to alpha blended overdraw and complex fragment shaders. If you're using a Unity renderer that uses a complex fragment shader internally, that might be what is causing the performance issues. What Unity version and renderer are you using?

                Mario I wouldn't necessarily put the blame on the runtime itself, it sounds like the Unity renderer might be at fault here, especially if the Cocos2d-x version runs smoothly. Our Cocos2d-x renderer is actually a custom thing that directly calls into OpenGL.

                If that were indeed the case I'd expect the bottleneck to be in the 'GPU Usage' and not the 'CPU Usage' in the Unity Profiler, however I see the exact opposite. Specifically in my first post you can see the calls on the CPU that are taking the most time are all stemming from SkeletonAnimation.LateUpdate with over 55 ms used to LateUpdate the Skeleton's mesh.

                Mario I'm not super familiar with Unity's renderer options, but my guess is whatever you selected might be too much for Android TV devices. These devices generally use SoCs with GPUs that are terrible when it comes to alpha blended overdraw and complex fragment shaders. If you're using a Unity renderer that uses a complex fragment shader internally, that might be what is causing the performance issues. What Unity version and renderer are you using?

                We're not using HDRP/URP or anything like that, just the default renderer that Unity is packaged with for a 2D Mobile Project template. Specifically we are using Unity 2021.3.20f1 at the moment, one of the currently supported and suggested versions of Unity. We have the Adaptive Performance package as well, but haven't done anything with that API yet. As far as I'm aware this should be one of the most lightweight setups for Unity rendering and Spine.

                We also aren't doing any complex blending or anything like that on our Crowd or our Player, and are just using the default Spine Shaders/Materials for those as well. On the Opponent we are using a slightly modified version of the Spine shader for outlines, but even with only the Player and Crowd instantiated we get about _ FPS (on a release build without profiling). The Crowd and Player do both have their slots colored programmatically in their Awake calls, to allow for player customization and for matching the crowd to the background colors, but this is only done once on Awake. We also did this exact same thing, on the exact same skeletons, in EYKO using Cocos2d-x, so it seems very unlikely that this could be the issue (especially since the bottleneck definitely appears to be in the CPU usage). 😞

                Edit: After profiling without Deep Profiling (as I had done for my very first post), I do indeed see that the issue is likely GPU bound, though I'm struggling to see how Unity's renderer is so much less efficient than Cocos2d-x's OpenGL renderer. I know that Unity's renderer is heavy, but I still assumed far better performance than this on TV. Removing the use of an outline (or custom rim-light shader) does actually get us about 10 extra FPS: from an unstable 20 FPS to around 30 FPS (though extremely unstable) with all three skeletons. However, we are still not getting 60 FPS. The profiler now indicates that we are indeed likely GPU bound, since the biggest CPU call is now Gfx.WaitForPresentOnGfxThread, which indicates that the CPU is waiting a while for the GPU to finish rendering. Why this might be the case I'm not sure, as without any Skeletons these scenes easily manage to hit a stable 60 FPS.

                With the nature of our game we can't really afford to change the way the artists create these skeletons, and I don't really see much way to simplify our renderer or shaders much more than they already are to get better performance, unless there is a simplified Spine shader suitable for our use case on Mobile/TV devices.

                Edit 2: I've opened a forum post here on the Unity Forums to hopefully get some opinions, suggestions, and recommendations from Unity-specific experts/developers.

                • Harald replied to this.

                  ExNull-Tyelor We're not using HDRP/URP or anything like that, just the default renderer that Unity is packaged with for a 2D Mobile Project template. Specifically we are using Unity 2021.3.20f1 at the moment, one of currently supported and suggested versions for Unity. We have the Adaptive Performance package as well, but haven't done anything with that API yet. As far as I'm aware this should be one of the most lightweight setups for Unity rendering and Spine.

                  Please note that the standard render pipeline is not the most lightweight pipeline. The Universal Render Pipeline (URP), which is the successor of the Lightweight Render Pipeline (LWRP), is usually the first recommendation for mobile devices. It's the lightweight counterpart to the High Definition Render Pipeline (HDRP).

                  Could you please share a screenshot of your Project Settings - Player settings? You could have a try whether using the IL2CPP scripting backend improves the situation.

                    Harald Could you please share a screenshot of your Project Settings - Player settings? You could have a try whether using the IL2CPP scripting backend improves the situation.

                    Certainly! (Sorry the image is so long, the spoiler tag doesn't shorten the empty space the image takes up on these forums, or I'd have hidden it in a spoiler.)

                    After setting our scripting backend to IL2CPP (which I thought I already did, but must've done only for iOS), as well as disabling the outline shader effect, we get... an unstable 30 FPS still with all three skeletons (though it might be more stable than without IL2CPP). Most of the time here is still being spent waiting for the GPU in the script as far as I can tell (Gfx.WaitForPresentOnGfxThread).

                    Some other testing has shown that using URP, with the default Spine URP Example 2D Scene from Spine, I only get about 15 FPS on the Android TV (while profiling), with most of the time being spent waiting for the GPU in the script (Gfx.WaitForPresentOnGfxThread).

                    After disabling all the point and directional lights in that scene I get around 57-59 FPS (while profiling) but with spikes down to 30 FPS, seemingly on a cyclical thing. Again these spikes seem to also be due to time being spent waiting for the GPU (Gfx.WaitForPresentOnGfxThread).

                    I'll report back here once I test all three skeletons rendering with URP next.

                    Edit: Welp. Trying URP with the three skeletons, Player, Opponent, and Crowd, is not boding well so far.

                    Using URP with the default Spine Skeleton shaders only gives us about 15 FPS now, whereas the default renderer with default shaders got us around 30 FPS (albeit unstably). Using URP with the Universal Render Pipeline/2D/Spine/Sprite shader gives us a whopping 6 FPS 😳 Then finally using URP with the Universal Render Pipeline/2D/Spine/SkeletonLit shader gives us 10 FPS. GPU Instancing doesn't seem to make a difference here either.

                    In all three of these tests there were no normal maps, emission maps, fragment shaders (outside of any you've used for your shaders in Spine's Unity Package), no light sources, etc. It was quite literally three skeletons, a camera, a global light 2d, and a single canvas with a single TextMeshPro on it to display the FPS averaged over the last 60 frames. About as simple of a scene as we could make for this test.
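
An FPS readout "averaged over the last 60 frames", as used for these tests, can be sketched like this. It is a minimal Python sketch of the idea, not the actual Unity script from the test scene; the class and method names are hypothetical.

```python
from collections import deque


class FpsCounter:
    """Rolling FPS over the most recent `window` frames (hypothetical helper)."""

    def __init__(self, window: int = 60) -> None:
        # deque with maxlen drops the oldest frame time automatically
        self._frame_times = deque(maxlen=window)

    def add_frame(self, dt_seconds: float) -> None:
        """Record the duration of the frame that just finished."""
        self._frame_times.append(dt_seconds)

    def average_fps(self) -> float:
        """Frames in the window divided by the time they took; 0.0 if empty."""
        total = sum(self._frame_times)
        return len(self._frame_times) / total if total > 0 else 0.0


counter = FpsCounter(window=60)
for _ in range(60):
    counter.add_frame(1.0 / 30.0)  # sixty frames of about 33.3 ms each
print(round(counter.average_fps()))  # 30
```

Dividing the frame count by the total elapsed time, rather than averaging per-frame 1/dt values, keeps a few very short frames from inflating the reading.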

                    After disabling all the point and directional lights in that scene

                    How many point and directional lights do you have? I assume you had no such thing in Cocos2d-x. The way those are implemented in Unity's renderers means your fragment shader is likely super complex, which would explain the high GPU side load on those low-performance Android TV chipsets.

                      Mario How many point and directional lights do you have? I assume you had no such thing in Cocos2d-x. The way those are implemented in Unity's renderers means your fragment shader is likely super complex, which would explain the high GPU side load on those low-performance Android TV chipsets.

                      We have no lighting in either our Cocos2d-x game or our Unity game. I was referring to Spine's URP 2D Example Scene which is preloaded with several point and directional lights. (Specifically the one that has two StretchymanURPs and the RaptorProURP skeleton in it.)

                      Edit: I've updated the post above with more testing using the URP with our three skeletons (the Player, Opponent, and Crowd). The results were... less than promising 😞

                      Edit 2: I'd also like to add that I just attempted to disable all post-processing, shadows, and lighting in the URP Asset data, and only get about 20 FPS with that setup (though it's still unstable, with spikes down to 15 FPS).

                      @ExNull-Tyelor Very sorry to hear that the situation did not improve with using the URP pipeline! To be sure, could you perhaps share the settings of the used Universal Render Pipeline Asset that is assigned at Project Settings - Graphics - Scriptable Render Pipeline Settings. Also please check whether under Project Settings - Quality there is no Render Pipeline Asset assigned which overrides the one in the Graphics section. If you haven't already, please be sure to set everything under the Lighting section to the minimal settings, to e.g. not use multiple secondary per-pixel lights.

                      Additionally, could you please share your Material settings you are using at your skeletons? Does the situation change if you are using an unlit URP shader (either Universal Render Pipeline/Spine/Skeleton)?

                        Harald To be sure, could you perhaps share the settings of the used Universal Render Pipeline Asset that is assigned at Project Settings - Graphics - Scriptable Render Pipeline Settings.

                        Sure! Here are the settings I used for my URP Asset during testing:

                        Then the settings in the Renderer Data Asset are:

                        Harald Also please check whether under Project Settings - Quality there is no Render Pipeline Asset assigned which overrides the one in the Graphics section.

                        I assigned the URP Asset to each Render Pipeline Asset slot for each Quality Level in Project Settings - Quality and not in the Project Settings - Graphics since I assumed this would give us greater control to have more complex Render Pipelines for Desktop builds vs Mobile builds. Should I assign it through the Graphics settings instead?

                        Harald Additionally, could you please share your Material settings you are using at your skeletons? Does the situation change if you are using an unlit URP shader (either Universal Render Pipeline/Spine/Skeleton)?

                        That was one of the shaders I was attempting to use for this experiment actually. The settings I'm using are:

                        I'm using the same general settings for the other two skeletons' materials as well (one for the Opponent and two for the Player).

                        • Harald replied to this.

                          Hate to be a bother, but any extra help or information you might be able to offer @Harald ?

                          A small test showed a rotating capsule was able to hit 60 FPS on this Android TV while using URP. Though the RenderFrame was taking nearly 10 ms 😳

                          • Misaki replied to this.

                            ExNull-Tyelor Sorry for the wait! Yesterday was a public holiday and Harald will be back today, so please wait a little longer for him.

                              Sorry for the late reply!

                              ExNull-Tyelor I assigned the URP Asset to each Render Pipeline Asset slot for each Quality Level in Project Settings - Quality and not in the Project Settings - Graphics since I assumed this would give us greater control to have more complex Render Pipelines for Desktop builds vs Mobile builds. Should I assign it through the Graphics settings instead?

                              No, that's fine as well, you just need to be sure which render pipeline asset will be used, to tweak the actually used asset.

                              While I don't think it will improve the situation, you could try changing the Mask Texture Channel of the Light Blend Styles entry Rim from R to None. I doubt that any unnecessary pass will be triggered if no lights are active at all, but this is just to be sure no unnecessary overhead is generated.

                              ExNull-Tyelor That was one of the shaders I was attempting to use for this experiment actually. The settings I'm using are:

                              Here you are using a lit 2D shader (truncated, but likely Universal Render Pipeline/2D/Spine/Skeleton Lit). Please use an unlit non-2D shader if you don't need any lighting applied. Like Universal Render Pipeline/Spine/Skeleton.

                                Misaki ExNull-Tyelor Sorry for the wait! Yesterday was a public holiday and Harald will be back today, so please wait a little longer for him.
                                Harald Sorry for the late reply!

                                Not a problem at all, and I apologize for trying to bother you all on your holiday!

                                Harald Here you are using a lit 2D shader (truncated, but likely Universal Render Pipeline/2D/Spine/Skeleton Lit). Please use an unlit non-2D shader if you don't need any lighting applied. Like Universal Render Pipeline/Spine/Skeleton.

                                Interestingly, changing the Skeleton's shaders to the one you suggested, Universal Render Pipeline/Spine/Skeleton, causes the Skeletons to stop rendering entirely.

                                Edit: Actually both Universal Render Pipeline/Spine/Skeleton and Universal Render Pipeline/Spine/Skeleton Lit cause the Skeleton to stop rendering entirely, as does Universal Render Pipeline/Spine/Sprite, though Universal Render Pipeline/Spine/Outline/Skeleton-OutlineOnly does render only the outline of the Skeleton. Only the Spine shaders under Universal Render Pipeline/2D/Spine/ (Skeleton Lit and Sprite) are actually rendering somewhat properly, but the Sprite shader creates outlines between all the symbols that aren't in the actual atlas file.

                                Harald While I don't think it will improve the situation, you could have a try changing Light Blend Styles entry Rim to Mask Texture Channel from R to None. I doubt that any unnecessary pass will be triggered if no lights are active at all, but just to be sure no unnecessary overhead is generated.

                                Doing this with the Lit Skeleton shaders doesn't seem to give any noticeable improvement in performance. Though I can't seem to use the un-lit shaders like you suggested above to see if there is a difference there...

                                • Harald replied to this.

                                  Not a problem at all, and I apologize for trying to bother you all on your holiday!

                                  No need to apologize, how should you know! 🙂

                                  ExNull-Tyelor Interestingly, changing the Skeleton's shaders to the one you suggested, Universal Render Pipeline/Spine/Skeleton, causes the Skeletons to stop rendering entirely.

                                  Terribly sorry, my mistake: for some reason I incorrectly assumed that the unlit Universal Render Pipeline/Spine/Skeleton shader would render normally with the 2D renderer, however it is not finding the respective shader pass and does not render anything at all. We have just pushed a commit to the 4.1 branch which adds an unlit URP 2D shader, available under Universal Render Pipeline/2D/Spine/Skeleton.

                                  A new Spine URP Shaders 4.1 UPM package is available for download here as usual:
                                  https://esotericsoftware.com/spine-unity-download
                                  Please let us know if this improves the situation.

                                    Harald Terribly sorry, my mistake, for some reason I incorrectly assumed that the unlit Universal Render Pipeline/Spine/Skeleton shader would render normally with 2D renderer, however it is not finding the respective shader pass and does not render anything at all. We have just pushed a commit to the 4.1 branch which adds an unlit URP 2D shader, available under Universal Render Pipeline/2D/Spine/Skeleton.

                                    A new Spine URP Shaders 4.1 UPM package is available for download here as usual:
                                    https://esotericsoftware.com/spine-unity-download
                                    Please let us know if this improves the situation.

                                    Hey no problem at all! I'm happy to report that after updating the package and then switching the shaders to Universal Render Pipeline/2D/Spine/Skeleton it basically doubles the frame rate from 15 FPS to 30-34 FPS on our Android TV device while profiling a Development Build!

                                    Still not the 60 FPS we had hoped for, but we can probably get away with having a lower FPS for TVs 😅

                                    Thank you and the rest of the Esoteric Software team so much for the help you have provided us! If you ever have any other help, suggestions, improvements or fixes that you think might help us out, we'd love to hear them as well.

                                    For a little bit more info, the frame debugger is showing that two of our meshes can't be SRP batched, but I'm not entirely sure why. Could it have something to do with this CBUFFER thing, since I assume the Android TV wouldn't support Vulkan?

                                    The Gfx.WaitForPresentOnGfxThread seems to be quicker on a cyclical pattern as well, which seems odd to me. You can see this on the CPU Usage with these Gfx.PresentFrame spikes as well, taking between 13ms and 28ms to complete, which causes Gfx.WaitForPresentOnGfxThread to fluctuate between 5ms and 22ms.

                                    Anywho, thank you again so much for the help you've already given us and for this un-lit 2d skeleton patch 😃

                                    • Harald replied to this.