ITRT (IT Research Trends)

Adaptive layer splitting for wireless large language model inference in edge computing: a model-based reinforcement learning approach

Research field: Networking

Paper keywords: #learning #efficient #wireless #efficiency #optimizing

Venue: Frontiers of Information Technology & Electronic Engineering

Abstract

Optimizing the deployment of large language models (LLMs) in edge computing environments is critical for enhancing privacy and computational efficiency. In the path toward efficient wireless LLM inference in edge computing, this study comprehensively analyzes the impact of different splitting points in mainstream open-source LLMs. Accordingly, this study introduces a framework taking inspiration from model-based reinforcement learning to determine the optimal splitting point across the edge and user equipment. By incorporating a reward surrogate model, our approach significantly reduces the computational cost of frequent performance evaluations. Extensive simulations demonstrate that this method effectively balances inference performance and computational load under varying network conditions, providing a robust solution for LLM deployment in decentralized settings.

📄 Paper information

Publication year	2025
Citations	0
Country of publication	China
Site	Springer

Yuxuan Chen (陈宇轩)

Rongpeng Li (李荣鹏)

Xiaoxue Yu (于小雪)

Zhifeng Zhao (赵志峰)

Honggang Zhang (张宏纲)

Related papers (330)
