fix typo and detail

This commit is contained in:
Azure 2025-02-14 19:40:15 +00:00
parent 823b25eec9
commit 483182fc3a
3 changed files with 6 additions and 12 deletions

View File

@ -23,14 +23,13 @@ Our vision for KTransformers is to serve as a flexible platform for experimentin
<h2 id="Updates">🔥 Updates</h2>
* **Feb 10, 2025**: Support Deepseek-R1 and V3 on single (24GB VRAM)/multi gpu and 382G DRAM, up to 3~28x speedup. The detailed tutorial is [here](./doc/en/DeepseekR1_V3_tutorial.md).
* **Aug 28, 2024**: Support 1M context under the InternLM2.5-7B-Chat-1M model, utilizing 24GB of VRAM and 150GB of DRAM. The detailed tutorial is [here](./doc/en/long_context_tutorial.md).
* **Feb 10, 2025**: Support Deepseek-R1 and V3 on single (24GB VRAM)/multi gpu and 382G DRAM, up to 3~28x speedup. For detailed show case and reproduction tutorial, see [here](./doc/en/DeepseekR1_V3_tutorial.md).
* **Aug 28, 2024**: Decrease DeepseekV2's required VRAM from 21G to 11G.
* **Aug 15, 2024**: Update detailed [TUTORIAL](doc/en/injection_tutorial.md) for injection and multi-GPU.
* **Aug 15, 2024**: Update detailed [tutorial](doc/en/injection_tutorial.md) for injection and multi-GPU.
* **Aug 14, 2024**: Support llamfile as linear backend.
* **Aug 12, 2024**: Support multiple GPU; Support new model: mixtral 8\*7B and 8\*22B; Support q2k, q3k, q5k dequant on gpu.
* **Aug 9, 2024**: Support windows native.
<!-- * **Aug 28, 2024**: Support 1M context under the InternLM2.5-7B-Chat-1M model, utilizing 24GB of VRAM and 150GB of DRAM. The detailed tutorial is [here](./doc/en/long_context_tutorial.md). -->
<h2 id="show-cases">🌟 Show Cases</h2>
<div>
@ -105,11 +104,6 @@ Getting started with KTransformers is simple! Follow the steps below to set up a
To install KTransformers, follow the official [Installation Guide](https://kvcache-ai.github.io/ktransformers/).
Alternatively, you can install it directly via pip:
```bash
pip install ktransformers
```
<h2 id="tutorial">📃 Brief Injection Tutorial</h2>
At the heart of KTransformers is a user-friendly, template-based injection framework.

View File

@ -6,10 +6,10 @@
# Tutorial
- [Deepseek-R1/V3 Show Case](en/DeepseekR1_V3_tutorial.md)
- [Why Ktransformers So Fast](en/deepseek-v2-injection.md)
- [Why KTransformers So Fast](en/deepseek-v2-injection.md)
- [Injection Tutorial](en/injection_tutorial.md)
- [Multi-GPU Tutorial](en/multi-gpu-tutorial.md)
# Server
# Server(Temperary Deprected)
- [Server](en/api/server/server.md)
- [Website](en/api/server/website.md)
- [Tabby](en/api/server/tabby.md)

View File

@ -1,6 +1,6 @@
# How to Run DeepSeek-R1
In this document, we will show you how to run the local_chat.py script to test the DeepSeek-R1's performance. There are two versions:
In this document, we will show you how to install and run KTransformers on your local machine. There are two versions:
* V0.2 is the current main branch.
* V0.3 is a preview version only provides binary distribution for now.
* To reproduce our DeepSeek-R1/V3 results, please refer to [Deepseek-R1/V3 Tutorial](./DeepseekR1_V3_tutorial.md) for more detail settings after installation.