The annual Arm Tech Symposia concluded this afternoon in Shenzhen.

During the conference, Arm examined AI’s computing demands and outlined how to capture AI development opportunities across three core areas: hardware, software, and ecosystem. Attendees also explored technology innovations and AI development trends built on the Arm architecture.

In his Shenzhen keynote, James McNiven, VP of Product Management for Arm’s Client Line of Business, emphasized that Armv9, Arm’s latest architecture, was designed from the start to support AI computing. Through successive iterations and key technologies such as the Scalable Vector Extension (SVE), SVE2, and the Scalable Matrix Extension (SME), Arm continues to improve mobile AI experiences through architectural innovation and tight hardware-software integration.

Arm Tech Symposia

KleidiAI software was one of the conference highlights.

It integrates deeply with mainstream AI frameworks, giving developers a smooth development experience. When used with Arm Compute Subsystems (CSS), KleidiAI significantly improves compute performance by drawing on Arm acceleration technologies including Neon™, SVE2, and SME2.

KleidiAI is a library of high-performance compute kernels designed specifically for AI framework developers.

It helps developers get the most out of Arm CPUs across a range of devices while taking full advantage of key Arm architectural features such as Neon, SVE2, and SME2.

KleidiAI Integration

Additionally, KleidiAI is integrated into popular AI frameworks such as PyTorch, TensorFlow, and MediaPipe, optimizes performance for models such as Meta Llama 3 and Microsoft Phi-3, and is designed to be forward and backward compatible.

This approach ensures Arm can meet future market demands as more technologies are introduced.

KleidiAI’s integration significantly improves generative AI efficiency.

Data shows that Meta Llama 3 and Microsoft Phi-3 large language models running on llama.cpp with KleidiAI integrated achieved a 190% faster time to first token on the new Arm Cortex-X925 CPU than the reference implementation (llama.cpp without Kleidi software optimization).

Performance Improvement

Another major advantage of KleidiAI is its ease of integration.

Arm’s engineering team completed Llama 3 performance optimization testing in less than 24 hours.

Furthermore, KleidiAI is integrated into MediaPipe through XNNPACK to support the open-source Gemma LLM running on mobile devices. As a result, Gemma 2B’s time to first token on the Google Pixel 8 Pro smartphone was reduced by 25%.
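The first-token response figures quoted above use two different conventions ("190% faster" and "reduced by 25%"), which makes them hard to compare at a glance. The short Python sketch below converts between the two. It assumes that "N% faster" means the response rate is (1 + N/100) times the baseline rate and that "reduced by N%" refers to latency; the article does not state which definition Arm used, and the function names are illustrative only.

```python
def latency_fraction_from_speedup(percent_faster: float) -> float:
    """Fraction of baseline latency that remains after a speedup.

    Assumption: "N% faster" means the rate is (1 + N/100) x the baseline rate.
    """
    return 1.0 / (1.0 + percent_faster / 100.0)


def speedup_from_latency_reduction(percent_reduced: float) -> float:
    """Percent speedup implied by cutting latency by `percent_reduced` percent."""
    remaining = 1.0 - percent_reduced / 100.0
    return (1.0 / remaining - 1.0) * 100.0


# Under the stated assumption, "190% faster" leaves about 34.5% of the baseline latency.
print(round(latency_fraction_from_speedup(190), 3))   # 0.345

# A 25% latency reduction corresponds to roughly a 33.3% speedup.
print(round(speedup_from_latency_reduction(25), 1))   # 33.3
```

Read this way, the llama.cpp result would cut first-token latency to roughly a third of the baseline, while the Gemma 2B result is a more modest gain. If Arm defined "190% faster" differently, the conversion would change accordingly.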

Arm also collaborated with Unity to develop Sentis, an on-device AI inference engine that lets game developers create new AI gaming experiences on any device that supports the Unity game engine.

Unity Collaboration

As Arm’s fastest computing platform to date, Arm Client CSS delivers more than a 30% improvement in compute and graphics performance, enough to handle demanding Android workloads.

Arm Client CSS also increases AI inference speed by 59%, making it suitable for a broader range of AI/ML and computer vision workloads.

The core strength of Arm Client CSS is its CPU cluster, which Arm describes as its most powerful, efficient, and comprehensive, designed to strike the best balance between performance and energy efficiency.

With the new-generation Arm Cortex®-X CPU, the AI-optimized Arm Client CSS delivers the highest year-over-year IPC improvement, with a 36% performance increase; the new Arm Immortalis™ GPU improves graphics performance by 37%.

Performance Metrics

The Arm Immortalis-G925 GPU is Arm’s most powerful and efficient GPU, delivering a 37% performance improvement across multiple mobile games and a 34% improvement across a range of AI and ML networks.

The Immortalis-G925 primarily targets the flagship smartphone market.

Meanwhile, the new highly scalable GPU series, including Arm Mali™-G725 and Mali-G625 GPUs, targets a broad consumer electronics market from high-end phones to smartwatches and XR wearables.

Arm expects over 100 billion Arm devices with AI capabilities worldwide by the end of 2025.

Just as skyscrapers need solid foundations, the growth of AI technology, from sensors and smartphones to industrial IoT, automotive, and data centers, depends on powerful and efficient computing platforms.

Through persistent efforts in chip architecture and technological innovation, Arm is building the most reliable foundation for this “AI skyscraper” and will play an increasingly crucial role in this technological revolution.

By Kaiho
