Cloud-edge Inference

Our research focuses on the characteristics of inference tasks and system dynamics, proposing intelligent scheduling frameworks and algorithms that address different challenges and thereby improve the overall performance and service quality of edge intelligence systems.
We integrate mechanisms such as model partitioning, early exit, lightweight model design, and service migration to construct system processing strategies and scheduling algorithms, which have been validated on real hardware platforms and public datasets (a toy partitioning example is sketched below). We also apply methods such as linear programming, dynamic programming, game theory, and deep reinforcement learning to study the theoretical properties and algorithmic implementations of distributed scheduling decisions.
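
To make one of these mechanisms concrete, the following is a minimal sketch of how a model partition (split) point might be chosen for cloud-edge collaborative inference: earlier layers run on the edge device, the intermediate activation is uploaded, and the remaining layers run in the cloud. The function name, per-layer latency profiles, data sizes, and bandwidth are illustrative assumptions for exposition, not measurements from our systems or a description of our deployed algorithms.

```python
def best_partition(edge_ms, cloud_ms, upload_kb, bandwidth_kb_per_s):
    """Pick the split index k that minimizes estimated end-to-end latency.

    Layers 0..k-1 run on the edge, the data in upload_kb[k] is transmitted,
    and layers k..n-1 run in the cloud. k = 0 means full offloading;
    k = n means fully local execution (no upload in this toy model).
    """
    n = len(edge_ms)
    best_k, best_latency = None, float("inf")
    for k in range(n + 1):
        edge_part = sum(edge_ms[:k])                       # edge-side compute
        tx = 0.0 if k == n else upload_kb[k] / bandwidth_kb_per_s * 1000.0
        cloud_part = sum(cloud_ms[k:])                     # cloud-side compute
        total = edge_part + tx + cloud_part
        if total < best_latency:
            best_k, best_latency = k, total
    return best_k, best_latency


if __name__ == "__main__":
    # Illustrative (assumed) per-layer profiles for a 5-layer model.
    edge_ms = [4.0, 12.0, 20.0, 25.0, 6.0]     # per-layer latency on the edge device (ms)
    cloud_ms = [0.5, 1.5, 2.5, 3.0, 0.8]       # per-layer latency in the cloud (ms)
    # upload_kb[k] = data uploaded if the first k layers run on the edge;
    # index 0 is the raw input, index i (i >= 1) is the output of layer i-1.
    upload_kb = [300.0, 150.0, 40.0, 10.0, 2.0]
    k, latency = best_partition(edge_ms, cloud_ms, upload_kb, bandwidth_kb_per_s=1000.0)
    print(f"split after layer {k}, estimated latency {latency:.1f} ms")
```

In practice the profiles and bandwidth vary at runtime, which is why the scheduling problems we study combine such partitioning decisions with the dynamic methods mentioned above rather than a one-shot exhaustive search.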
These research efforts have been supported by national and industry-level key projects, through which our technologies have been transferred and applied in scenarios such as intelligent manufacturing, smart power, and Industry 4.0.