kantv-1.6.8
Overview
- sync the ggml-hexagon implementation from https://github.com/zhouwg/ggml-hexagon
- remove the dependency on QNN runtime libs in the APK, significantly reducing APK size
- make the HWACCEL_CDSP approach work as expected on Android phones equipped with high-end Qualcomm mobile SoCs
- maintain only one version of ggml-hexagon.cpp, making the workflow clearer and simpler
- refine the entire project
- refine JNI and UI code
- sync llama.cpp with upstream llama.cpp project
- upgrade QNN SDK to v2.33.0.250327
- upgrade Android NDK to android-ndk-r28
- fix a very long-term issue of "2D graphic benchmark does not work properly on Android phone": #163
- fix a stability issue ("AI-subtitle doesn't work") introduced in #281; this regression was unacceptable
- try Qwen3-4B, Qwen3-8B, DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, Gemma3-4B, and Gemma3-12B on Android phones
- make DeepSeek-R1-Distill-Qwen-1.5B work reliably on Android phones
- support multi-modal (image-to-text) inference through Gemma3 on Android phones
- add LLM settings on Android phones
- add in-APK LLM model download on Android phones (this feature requires direct access to HuggingFace; otherwise it is unavailable)
- allow tv.xml to be edited by APK users without a technical background
- refine and simplify UI code
- improve stability
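For illustration, a user-editable tv.xml channel list might look like the fragment below. The element and attribute names here are hypothetical: the actual schema is defined by the project and may differ.

```xml
<!-- Hypothetical tv.xml fragment: element names are illustrative only,
     not the project's actual schema -->
<tv>
  <channel>
    <name>Example News</name>
    <url>http://example.com/live/news.m3u8</url>
  </channel>
</tv>
```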
Status
The test model is Gemma3-4B.
The following features were validated on Android phones equipped with Snapdragon 8 Gen 3 and Snapdragon 8 Elite:
- TV playback
- TV playback + TV recording
- local playback with recorded video file
- TV playback + AI-subtitle
- TV playback + TV recording + AI-subtitle
- ASR inference (through whisper.cpp) benchmark with ggml backend & cDSP backend
- LLM inference (through llama.cpp) benchmark with ggml backend & cDSP backend
- multi-modal (image-to-text) LLM inference (through llama.cpp) benchmark with ggml backend & cDSP backend
- 2D graphic benchmark
- video encode benchmark
- video encode benchmark and create code-generated video file
- local playback with code-generated video file
- download LLM model (in local dev envs)
- edit tv.xml (i.e., customize tv.xml for personal needs or R&D activity)
Dev envs
QNN SDK is v2.33.0.250327, Hexagon SDK is v6.2.0.1, and Android NDK is android-ndk-r28.
The QNN SDK and Android NDK can be downloaded automatically through build-run-android.sh; the Hexagon SDK must be obtained with a Qualcomm Developer Account.
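As a sketch of the setup flow described above: only the script name comes from these notes, while the clone URL and working directory are assumptions.

```shell
# Sketch only: the repo URL is assumed to be the upstream kantv project.
git clone https://github.com/zhouwg/kantv.git
cd kantv
# The script is described as fetching the QNN SDK and Android NDK
# automatically; the Hexagon SDK must be installed manually beforehand.
./build-run-android.sh
```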
Running on Android phone
- Android 5.1 through Android 15 (or higher) should be supported.
- An Android smartphone equipped with one of the Qualcomm mobile SoCs below (Snapdragon 8 Gen 3 and Snapdragon 8 Elite are highly recommended) is required to verify/run the ggml-hexagon backend:
  - Snapdragon 8 Gen 1
  - Snapdragon 8 Gen 1+
  - Snapdragon 8 Gen 2
  - Snapdragon 8 Gen 3
  - Snapdragon 8 Elite
- An Android smartphone equipped with any mainstream high-end mobile SoC is highly recommended for the realtime AI-subtitle feature; otherwise unexpected behavior may occur.
Todo
- an automated CT (continuous testing) approach should be introduced in this project to validate every PR/release; whether AI can be used for this purpose is an open question
- there is an unsolved UI issue (a corner case can crash the app)
- this is a debug build rather than a release build (a workaround for a known issue)
- realtime text-to-image inference on Android phones: #301
- other items can be found in the roadmap