This plugin was developed when I noticed that my iOS game had considerably worse audio latency than other music apps, such as GarageBand, installed on the same device. I confirmed this by creating a basic Unity app that just plays a sound on touch down, versus a basic Xcode iOS app / Android Studio app that also plays a sound on touch down. You can clone the project on GitHub to confirm this yourself.
If you can't feel the difference, I encourage you to do a simple sound-wave recording test. Keep your phone at the same distance from your computer's mic and tap the screen with your fingernail. The interval between the peak of the nail's sound and the peak of the response sound should immediately reveal the difference visually. Newer devices might have better latency, but the relative difference between Unity and a native app should hold on every device.
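Such a recording can also be analyzed programmatically. Here is a minimal sketch (the class name, threshold, and minimum-gap values are illustrative, not part of the plugin) that measures the interval between the two peaks:

```java
// Hypothetical sketch: estimate tap-to-sound latency from a mono recording.
// Assumes two distinct loud peaks: first the fingernail tap, then the app's
// response sound. Threshold and minimum gap are made-up tuning values.
public class LatencyFromRecording {
    /** Returns the interval in milliseconds between the first sample above
     *  `threshold` and the next one at least `minGapSamples` later,
     *  or -1 if no such pair of peaks exists. */
    public static double peakIntervalMs(float[] samples, int sampleRate,
                                        float threshold, int minGapSamples) {
        int firstPeak = -1;
        for (int i = 0; i < samples.length; i++) {
            if (Math.abs(samples[i]) >= threshold) {
                if (firstPeak < 0) {
                    firstPeak = i;          // the nail hitting the screen
                } else if (i - firstPeak >= minGapSamples) {
                    // the app's response sound
                    return (i - firstPeak) * 1000.0 / sampleRate;
                }
            }
        }
        return -1; // fewer than two separated peaks found
    }
}
```

The minimum-gap parameter keeps a single wide tap peak from being counted twice.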
Unity has an internal mixing and automatic audio track management system, backed by another layer, FMOD. That's why you can go crazy with a method like `AudioSource.PlayOneShot` without creating any audio track, and the sound magically overlaps itself as if you owned many `AudioSource`s. You don't even have to load the audio, and it just works! This is great design for a game engine.
But unfortunately, all of this adds more and more audio latency. For some genres of apps and games that need critical audio timing, this is not good. Naturally, the idea for a fix is to send audio data and call directly into native methods, bypassing Unity's audio path entirely.
I have researched the fastest native approach on each platform and found that on iOS it is `OpenAL` (Objective-C/Swift) and on Android it is `AudioTrack` (Java). For more information about the other alternatives and why they are not good enough, see the Implementation page.
But having to interface with multiple different sets of libraries separately from Unity is a pain, so Native Audio is here to help...
I developed this plugin as a solution for my own game, which needs every possible bit of faster feedback. Please watch this one-take video, which is my final proof that the plugin has a real benefit. A detailed write-up of this experiment is available here.
That video mainly pointed out that Native Touch helps a lot more with "perceived" latency than Native Audio. But that experiment still contains a wrong assumption: the baseline is `AVAudioPlayer` + iOS native input. Later, I found that using `OpenAL` on the native side instead of `AVAudioPlayer` on iOS improves the latency by a lot, as shown in this final experiment.
In summary, Native Audio helps a lot on iOS through `OpenAL` and helps a bit on Android through `AudioTrack`. On iOS you can additionally use Native Touch to reduce the perceived latency to the minimum possible while keeping Unity's convenience, but beware that iOS Native Touch is not easy to use. Unfortunately, Native Touch doesn't help on Android, so it is not available there.
Each device will see a different latency improvement depending on how it handles the things Unity adds. (If it handles them badly, we gain more by bypassing to the native side.) For starters, let's look at my iPod Touch Gen 5 and Nexus 5.
This is the experiment from before I realized that `OpenAL` is better than `AVAudioPlayer` on iOS. The baseline is `AVAudioPlayer` in an Xcode native project. The point is to use `AVAudioPlayer` from Unity, plus other tricks, to approach the baseline latency.
This project measures the interval from the peak of the nail's sound hitting the touchscreen to the peak of the response sound wave. Don't pay attention to the absolute interval, since this measurement is non-standard (a loopback-cable latency test on iOS usually yields something as low as 10 ms). Instead, focus on the difference from the native time.
|  | iOS Xcode Native | iOS Unity + Best Latency | iOS Unity + Native Audio | iOS Unity + Native Audio + Native Touch |
| --- | --- | --- | --- | --- |
| Difference from native | - | 45 ms | 33 ms | 16 ms |
| Latency reduced from the previous step | - | +45 ms | -12 ms | -17 ms |
| How much a particular step helps | - | -100% | 26.67% | 37.78% |
On iOS, adding Native Touch helps more than Native Audio. With both together, we have removed 64.45% of the total 45 ms difference from the ideal native time. In one of my tests I even managed to reach 100% and match native performance, but on average it is like this. And don't forget that an iPod Touch Gen 5 is quite old; newer devices might have even better native performance thanks to newer chips, etc.
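For reference, the percentage rows in these tables are simply each step's latency reduction divided by the initial gap from native time (45 ms on this iPod), rounded to two decimals. A tiny sketch of the arithmetic (names are illustrative):

```java
// Sketch of how the "how much a step helps" percentage is derived:
// the step's latency reduction as a share of the initial gap from native.
public class StepGain {
    public static double percentOfGap(double reducedMs, double totalGapMs) {
        return reducedMs / totalGapMs * 100.0;
    }
}
```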
|  | Android Studio Native | Android Unity + Best Latency | Android Unity + Native Audio |
| --- | --- | --- | --- |
| Difference from native | - | 34 ms | 28 ms |
| Latency reduced from the previous step | - | +34 ms | -6 ms |
| How much a particular step helps | - | -100% | 16.67% |
NOTE: If you don't use the "Best Latency" (small buffer size) audio setting in Unity before comparing with Native Audio, the improvement will "appear" larger.
For Android, the benefit of Native Audio is so small that I almost wanted to drop support altogether. But since Android is always fighting latency, there is a chance that on newer devices or future OS versions the native method will do a better job than this. (For example, an interesting flag was just added to the native API in Oreo.) At the same time, Unity might be late in taking advantage of some new native features. In the end, I decided to keep Android support.
On iOS, however, I later discovered that `OpenAL` is the ideal choice, not the `AVAudioPlayer` shown in the experiment above. I redid the nail-sound test. The numbers may differ from the previous tests, since the environment, iOS device model, battery level, position, distance to the mic, air temperature, etc. are not the same. (That's why I stressed that this kind of test can only be compared within the same test session.) But again, look at the further difference `OpenAL` makes: it is now even better than the old baseline (which uses `AVAudioPlayer`) that we were trying to reach earlier. (And this is without iOS Native Touch to skip Unity's input layer.)
Certain kinds of games and apps rely heavily on feedback sound. The keyword is not "audio application" but "feedback sound". For example, if you are making a music player, that is clearly an audio app, but audio latency won't affect the experience at all, because all you do is press play and listen. It's not as if the whole song is ruined when it starts a bit late. The core experience is the song itself, not timing.
But what if a feedback sound lags? It is not a concern for non-gameplay elements like a UI button that sounds when pressed, but imagine a drumming game where you have to hit at the correct moment. If you hit perfectly and the game says so, the sound still arrives late. If you hit early, the game punishes you, but the sound lands exactly on time. It's this kind of problem. Below are the 3 classes of musical apps.
Applications like a digital audio workstation (DAW) on a mobile phone, or live-performance musical apps like loopers and Launchpad, fall into this category. The app is interactive, but the references for "correct" timing are all controllable. Imagine you start a drum loop: each sound might be delayed depending on the device, but all delays are equal, resulting in a perfect sequence, albeit with a variable start time. When starting more loops, it is 100% possible for the software to compensate and match the beat currently playing. This class of application is immune to mobile audio latency.
Apps like GarageBand (in live playing mode) are in this category. The sound has to respond when you touch the screen. Latency can impact the experience, but if you are rehearsing by yourself, you may be able to ignore it: if you play perfectly, the output sounds will all have equal latency and will sound perfect, just slightly delayed.
There are many music games on mobile, like Cytus, Deemo, Dynamix, VOEZ, Lanota, etc. If there is sound feedback on hitting a note, this is the hardest class of the latency problem. Unlike the Sequencer class, even though the song is predictable and the game knows all the notes at every point in the song, you cannot predict whether a sound will play, since that depends on the player's performance. (Unless the sound is played regardless of hit, miss, or bad judgement, in which case this class reduces to the Sequencer class.) It is also harder than the Instrument class, since now there is a backing track playing as a reference, along with a visual indicator. If you hit on time according to the visuals or the music, you get a "Perfect" judgement, but the sound will be off the backing track. When this happens, even though you already got a Perfect, you will automatically adapt by hitting earlier to make the response sound match the song, and then you will no longer get the Perfect judgement. In the Instrument class, this might happen too if you are live-jamming with others, but there, adapting to hit early gets you accurate sound without being punished by a judgement as in games.
What I am making is a music game. Even a little latency is very obvious. Since there is a beat in the song for reference, players can tell right away that they are hearing two separate sounds (the beat in the song and the response sound) even when they score a Perfect.
To use Native Audio:

1. Put your audio files in the `StreamingAssets` folder so that the native side can see them.
2. Call `NativeAudio.Load("yourSound.wav");` and you will get back an audio pointer object.
3. Call `myNativeAudioPointer.Play(volume, pan);` and you will hear low-latency sound.
4. Optionally, call `myNativeAudioPointer.Prepare()` before `Play` to further minimize the latency. It automatically does the best "prepare" on each native side. (You can think of `Load` as the first prepare, but this one micro-optimizes even further.)
5. When you are done with a sound, call `myNativeAudioPointer.Unload()` to free its audio buffer on the respective native side.
Native Audio abstracts the differences between the native audio libraries only programmatically. I believe you, as a programmer, need a full understanding of exactly what happens on each platform when you call those abstractions. This is why I put explanations in the code describing, in a non-abstract way, the behavior on every supported platform. They will also show up in your IntelliSense.
A purchase also gives you all the source code. That is: Unity's managed C# side, iOS's `.h` files, and, since the `.aar` Android plugin file is not readable, I have also bundled (in a zip file) the Android Studio project that produces the `.aar` file used by Native Audio.
You are not wrong to think that this has to be less flexible than the audio functions in Unity. Some of these limitations could be lifted with `OpenAL`, but I needed time to find a way to do it.

[Android] By default, 8 `AudioTrack`s can play at once; this is your concurrency, and you can increase the number in the source code. However, Android has a device-wide limit of 32 `AudioTrack`s in total (counting even apps outside of your game), and the safe hard limit is around 24.

[iOS] All of your sound "buffers" share 32 "sources". That is, you can play sounds 32 times before Native Audio stops the earliest ones; the "amount" parameter is ignored. Every loaded audio can play over itself.
The current version serves the core purpose: improving latency by bypassing Unity. I can't deny that I have taken many rough shortcuts here and there. Here is what is supposed to be included in the next iterations:
- Load-time support for more formats (`.ogg`? `.mp3`? 48000 Hz `.wav`?). After loading, the audio becomes uncompressed as usual, like Unity's "Decompress on load": the input can be anything, but the result is like a `.wav`. Not hardcoding the format adds load time, but playback will be the same.
- ...the `StreamingAssets` of the main ones. I will have to make it look for all the parts you have.
- Currently on Android, `AudioTrack` is hardcoded to be instantiated at a 44100 Hz rate. In the next version, Native Audio will check the device's "native rate" (usually 44100 Hz or 48000 Hz) and instantiate the `AudioTrack` at that rate. This can potentially enable a special fast audio path on Android (read more on the Implementation page). If the native rate is not 44100 Hz, a routine will convert the sampling rate on the fly after loading, so we only need to prepare 44100 Hz audio.
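That on-the-fly conversion could be as simple as linear resampling. Here is a minimal sketch (the class name and the naive interpolation are assumptions, not the plugin's planned code; a real implementation would likely use a proper low-pass filter):

```java
// Sketch of on-the-fly sample-rate conversion: resample a 44100 Hz mono
// buffer to the device's native rate (e.g. 48000 Hz) by linear interpolation.
public class Resampler {
    public static float[] resample(float[] in, int fromRate, int toRate) {
        int outLen = (int) ((long) in.length * toRate / fromRate);
        float[] out = new float[outLen];
        for (int i = 0; i < outLen; i++) {
            double pos = (double) i * fromRate / toRate; // source position
            int i0 = (int) pos;
            int i1 = Math.min(i0 + 1, in.length - 1);
            double frac = pos - i0;
            // Blend the two nearest input samples.
            out[i] = (float) (in[i0] * (1 - frac) + in[i1] * frac);
        }
        return out;
    }
}
```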
- `Prepare()` on iOS is flawed in that it does not ensure the next `Play()` plays the correct audio if another audio overwrites the buffer first. I plan to change this to an approach similar to the Android side, but only once I can confirm it does not noticeably affect latency.
- `AudioUnit`, iOS's lowest-level audio technology; `OpenAL` is built just on top of it.