[vLLM Learning] Save Sharded State

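vLLM is a framework designed to accelerate large language model inference. It achieves near-zero waste of KV cache memory, which addresses the memory-management bottleneck.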

More vLLM documentation and tutorials in Chinese are available at → vllm.hyper.ai/

Source: examples/offline_inference/save_sharded_state.py

# SPDX-License-Identifier: Apache-2.0
"""
Saves each worker's model state dict directly to a checkpoint. This provides a
fast load path for large tensor-parallel models: each worker only needs to read
its own shard rather than the entire checkpoint.

Example usage:

python save_sharded_state.py \
    --model /path/to/load \
    --quantization deepspeedfp \
    --tensor-parallel-size 8 \
    --output /path/to/save

Then, the model can be loaded with

llm = LLM(
    model="/path/to/save",
    load_format="sharded_state",
    quantization="deepspeedfp",
    tensor_parallel_size=8,
)
"""
import dataclasses
import os
import shutil
from pathlib import Path

from vllm import LLM, EngineArgs
from vllm.utils import FlexibleArgumentParser

parser = FlexibleArgumentParser()
EngineArgs.add_cli_args(parser)
parser.add_argument("--output",
                    "-o",
                    required=True,
                    type=str,
                    help="path to output checkpoint")
parser.add_argument("--file-pattern",
                    type=str,
                    help="string pattern of saved filenames")
parser.add_argument("--max-file-size",
                    type=int,
                    default=5 * 1024**3,
                    help="max size (in bytes) of each safetensors file")


def main(args):
    engine_args = EngineArgs.from_cli_args(args)
    if engine_args.enable_lora:
        raise ValueError("Saving with enable_lora=True is not supported!")
    model_path = engine_args.model
    if not Path(model_path).is_dir():
        raise ValueError("model path must be a local directory")

    # Create LLM instance from arguments
    llm = LLM(**dataclasses.asdict(engine_args))

    # Prepare output directory
    Path(args.output).mkdir(exist_ok=True)

    # Dump worker states to output directory
    model_executor = llm.llm_engine.model_executor
    model_executor.save_sharded_state(path=args.output,
                                      pattern=args.file_pattern,
                                      max_size=args.max_file_size)

    # Copy metadata files (everything except weight files) to output directory
    for file in os.listdir(model_path):
        if os.path.splitext(file)[1] not in (".bin", ".pt", ".safetensors"):
            if os.path.isdir(os.path.join(model_path, file)):
                shutil.copytree(os.path.join(model_path, file),
                                os.path.join(args.output, file))
            else:
                shutil.copy(os.path.join(model_path, file), args.output)


if __name__ == "__main__":
    args = parser.parse_args()
    main(args)
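The help strings for `--file-pattern` and `--max-file-size` are terse, so here is a rough sketch of how a filename pattern and a per-file size cap could interact when one worker writes its shard. This is not vLLM's implementation: `plan_shard_files`, `EXAMPLE_PATTERN`, and the `{rank}`/`{part}` placeholders are hypothetical names chosen for illustration, and the tensor sizes are made up. It only plans which tensor names would go into which file and writes nothing to disk.

```python
from typing import Dict, List, Optional

# Hypothetical filename pattern with {rank} and {part} placeholders
# (an assumption for illustration, not taken from the script above).
EXAMPLE_PATTERN = "model-rank-{rank}-part-{part}.safetensors"


def plan_shard_files(
    tensor_sizes: Dict[str, int],
    rank: int,
    max_size: Optional[int] = None,
    pattern: str = EXAMPLE_PATTERN,
) -> Dict[str, List[str]]:
    """Greedily group one worker's tensors into files, given sizes in bytes.

    A new file is started whenever adding the next tensor would push the
    current file past max_size. A single tensor larger than max_size still
    gets a file of its own, so the cap is a target rather than a hard limit.
    """
    plan: Dict[str, List[str]] = {}
    part = 0
    current: List[str] = []
    current_size = 0
    for name, size in tensor_sizes.items():
        over_cap = max_size is not None and current_size + size > max_size
        if current and over_cap:
            # Close the current file and start the next part.
            plan[pattern.format(rank=rank, part=part)] = current
            part += 1
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        plan[pattern.format(rank=rank, part=part)] = current
    return plan


if __name__ == "__main__":
    # Made-up tensor sizes in bytes for a single worker (rank 0).
    sizes = {"embed": 400, "layer0": 300, "layer1": 300, "head": 500}
    for filename, names in plan_shard_files(sizes, rank=0, max_size=700).items():
        print(filename, names)
```

Because each planned filename carries the worker's rank, a worker loading the checkpoint would only need to open the files that match its own rank, which is the fast load path the docstring describes.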
