MIT researchers developed Attention Matching, a KV cache compaction technique that compresses LLM memory by 50x in seconds — ...
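The snippet gives no details of the Attention Matching method itself, but the general idea behind attention-based KV cache compaction can be sketched: keep only the small fraction of cached key/value entries that receive the most attention. Everything below (function name, shapes, the 2% keep ratio that corresponds to roughly 50x compression) is an illustrative assumption, not the MIT technique.

```python
import numpy as np

def compact_kv_cache(keys, values, attn_scores, keep_ratio=0.02):
    """Generic sketch of attention-based KV cache compaction.

    keys, values: (seq_len, d) cached tensors for one head/layer.
    attn_scores: (seq_len,) importance score per cached token.
    keep_ratio=0.02 keeps 2% of entries, i.e. ~50x compression.
    """
    seq_len = keys.shape[0]
    k = max(1, int(seq_len * keep_ratio))
    # indices of the k most-attended tokens, restored to original order
    idx = np.sort(np.argsort(attn_scores)[-k:])
    return keys[idx], values[idx]

# toy usage on random data
rng = np.random.default_rng(0)
K = rng.standard_normal((1000, 64))
V = rng.standard_normal((1000, 64))
scores = rng.random(1000)
Kc, Vc = compact_kv_cache(K, V, scores)
print(Kc.shape)  # (20, 64): 1000 cached tokens reduced to 20
```

Real systems differ in how the importance scores are obtained and whether evicted entries are merged rather than dropped; this sketch only shows the top-k selection skeleton.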
Huawei Data Storage officially launched a "3+1" AI Data Platform to tackle these challenges: knowledge generation and retrieval, built on high-accuracy multimodal knowledge for more precise retrieval ...
Micron is pushing a new SOCAMM2 module that targets the memory choke points showing up in long-context AI workloads. The company claims it has set a “new benchmark” by raising per-module capacity to ...
At MWC Barcelona 2026 the president of Huawei Data Storage Product Line shared Huawei's key insights and innovations ...
BARCELONA, Spain, March 5, 2026 /PRNewswire/ -- At the Huawei AI DC Innovation Forum at MWC Barcelona 2026, Huawei unveiled its AI Data Platform, designed to address the key challenges in adopting AI ...
CHATSWORTH, Calif. — July 18, 2025 — DDN today unveiled performance benchmarks that the company said demonstrate how its AI-optimized DDN Infinia platform eliminates GPU waste and delivers the fastest ...
A new technical paper titled “Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System” was published by researchers at Rensselaer Polytechnic Institute and IBM. “Large ...
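The paper's own placement algorithm is not described in this snippet, but the core idea of dynamic KV cache placement in a heterogeneous memory system can be illustrated generically: keep frequently accessed ("hot") entries in the fast tier (e.g. GPU HBM) and demote cold entries to the slow tier (e.g. host DRAM), migrating on access. The class, tier names, and promotion policy below are assumptions for illustration, not the RPI/IBM method.

```python
from dataclasses import dataclass, field

@dataclass
class TieredKVCache:
    """Toy sketch of dynamic KV placement across two memory tiers.

    Hot entries live in the fast tier; when a slow-tier entry is
    accessed, it is promoted and the coldest fast-tier entry is
    demoted. Purely illustrative of the placement idea.
    """
    fast_capacity: int
    fast: dict = field(default_factory=dict)   # stands in for GPU HBM
    slow: dict = field(default_factory=dict)   # stands in for host DRAM
    hits: dict = field(default_factory=dict)   # access counts per token id

    def put(self, token_id, kv):
        self.hits.setdefault(token_id, 0)
        if len(self.fast) < self.fast_capacity:
            self.fast[token_id] = kv
        else:
            self.slow[token_id] = kv

    def get(self, token_id):
        self.hits[token_id] = self.hits.get(token_id, 0) + 1
        if token_id in self.fast:
            return self.fast[token_id]
        kv = self.slow.pop(token_id)
        # promote the hot entry; demote the least-accessed fast entry
        if len(self.fast) >= self.fast_capacity:
            coldest = min(self.fast, key=lambda t: self.hits.get(t, 0))
            self.slow[coldest] = self.fast.pop(coldest)
        self.fast[token_id] = kv
        return kv

# toy usage: fast tier holds 2 entries, third insert spills to slow tier
cache = TieredKVCache(fast_capacity=2)
cache.put(0, "kv0"); cache.put(1, "kv1"); cache.put(2, "kv2")
cache.get(2)  # promotes token 2 into the fast tier, demotes a cold one
print(sorted(cache.fast), sorted(cache.slow))
```

A real system would make this decision per layer or per block using predicted reuse and measured bandwidth gaps between tiers, rather than simple hit counts.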