In instruct mode, with SSE token streaming enabled and an MCP server connected, the last chunk of the LLM's response is rendered twice in KoboldLite. For example: "How may I help you today ...
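For context, below is a minimal sketch of the kind of client-side loop that consumes an SSE token stream, assuming KoboldCpp's /api/extra/generate/stream endpoint and a JSON payload with a token field per event. The endpoint URL, payload shape, and the streamCompletion function are illustrative assumptions, not KoboldLite's actual code; the final comment marks one common way this class of double-render bug can arise.

```typescript
// Hypothetical SSE consumer sketch; not KoboldLite's implementation.
async function streamCompletion(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:5001/api/extra/generate/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = ""; // holds a partially received SSE event between reads
  let output = ""; // accumulated response text

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by a blank line; keep any trailing partial
    // event in the buffer for the next read.
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? "";
    for (const event of events) {
      for (const line of event.split("\n")) {
        if (!line.startsWith("data: ")) continue;
        // Payload shape {"token": "..."} is an assumption here.
        output += JSON.parse(line.slice(6)).token ?? "";
      }
    }
  }
  // Pitfall: if the stream's final event was already consumed above, also
  // flushing the leftover buffer here would append the last chunk a second
  // time, which matches the double-render symptom described in this report.
  return output;
}
```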
current device: 0, in function ggml_backend_cuda_device_event_synchronize at ggml/src/ggml-cuda/ggml-cuda.cu:4947

This issue only seems to affect the Qwen 3.5 series ...