LLMs rely on memory-intensive mechanisms like the key-value (KV) cache to store and quickly retrieve data. FastGen optimizes KV cache usage, reducing LLM memory demands by up to 50% while maintai...
https://www.microsoft.com/en-us/research/blog/llm-profiling-guides-kv-cache-optimization/
LoftQ boosts LLM efficiency by streamlining the fine-tuning process, reducing computational demands while preserving high performance. Innovations like this can help make AI technology more energ...
Researcher Michel Galley explores how he and fellow researchers combined new and existing data to create MathVista, an open-source benchmark for measuring the mathematical reasoning capabilities ...
https://www.microsoft.com/en-us/research/podcast/abstracts-may-6-2024/
In this edition: Can LLMs transform natural language into formal method postconditions; Semantically aligned question + code generation for automated insight generation; Explaining CLIP performan...
https://www.microsoft.com/en-us/research/blog/research-focus-week-of-april-29-2024/
From AI and deep learning to innovations in infrastructure, researchers from Microsoft are bridging the gap between architecture, programming languages, and operating systems to advance the state...
Microsoft recently developed and released the Situated Interactive Guidance, Monitoring, and Assistance (SIGMA) system, an open-source research platform, to enable research and innovation at the ...
Energized by disruption, partner group product manager Rafah Hosn is helping to drive scientific advancement in AI for Microsoft. She talks about the mindset needed to work at the frontiers of AI...
https://www.microsoft.com/en-us/research/podcast/ideas-exploring-ai-frontiers-with-rafah-hosn/
SAMMO optimizes prompts for LLMs by leveraging their structure to guide optimization. This minimizes the time and effort needed to find performant prompts on a variety of tasks.
In this issue: New research on appropriate reliance on generative AI; Power management opportunities for LLMs in the cloud; LLMLingua-2 improves task-agnostic prompt compression; Enhancing COMET ...
https://www.microsoft.com/en-us/research/blog/research-focus-week-of-april-15-2024/
Microsoft at NSDI 2024: Discoveries and implementations in networked systems. Topics range from 5G, space, datacenters, and wide-area networking to applications in artificial intelligence, secur...