Supports running LLMs and building AI agents locally and efficiently on edge devices
Benchmark on Phi-3 on Google Pixel 6
20 tokens/s prefill and decoding speed for Phi-3 (data collected on Samsung S23)
CPU, GPU, and hybrid CPU + GPU inference
1.5-bit, 2-bit, 4-bit and 8-bit integer quantization
Android, iOS, macOS and Windows operating systems
Presented by: Ethan Wang, David Qian, Perry Cheng, Brian Guo