Arun Ramachandran

Principal Member of Technical Staff - Machine Learning

About

Arun Ramachandran is a Principal Member of Technical Staff at AMD India, with 19 years of industry experience. His work focuses on large language model (LLM) inference, a domain he is also pursuing through his doctoral research at the Indian Institute of Science (IISc), Bengaluru. He has 15 patent applications filed with the USPTO, several of which have been granted. At AMD, he contributes to AMD PACE, an open-source research and advanced-development project.

As generative AI becomes integral to modern life, the cost of token generation rises. This session addresses the growing demand for higher tokens-per-watt efficiency. We will discuss emerging workloads (e.g., RAG), techniques (e.g., token pruning, speculative decoding, and quantization), and the role of hardware, as well as the challenges and opportunities in bridging research and practical deployment.
