Abstract: In edge inference systems, efficient model deployment and accurate forecasting of online service requests are critical for maintaining optimal performance and resource utilization. This ...