diff --git a/doc/SUMMARY.md b/doc/SUMMARY.md index c2461fc..fb7a6cd 100644 --- a/doc/SUMMARY.md +++ b/doc/SUMMARY.md @@ -9,7 +9,7 @@ - [Why KTransformers So Fast](en/deepseek-v2-injection.md) - [Injection Tutorial](en/injection_tutorial.md) - [Multi-GPU Tutorial](en/multi-gpu-tutorial.md) -# Server(Temperary Deprected) +# Server (Temporarily Deprecated) - [Server](en/api/server/server.md) - [Website](en/api/server/website.md) - [Tabby](en/api/server/tabby.md) diff --git a/doc/en/DeepseekR1_V3_tutorial.md b/doc/en/DeepseekR1_V3_tutorial.md index 9d8a05f..60dc033 100644 --- a/doc/en/DeepseekR1_V3_tutorial.md +++ b/doc/en/DeepseekR1_V3_tutorial.md @@ -83,7 +83,7 @@ Memory: standard DDR5-4800 server DRAM (1 TB), each socket with 8×DDR5-4800 #### Change Log - Longer Context (from 4K to 8K for 24GB VRAM) and Slightly Faster Speed (+15%):
Integrated the highly efficient Triton MLA Kernel from the fantastic sglang project, enable much longer context length and slightly faster prefill/decode speed -- We suspect the impressive improvement comes from the change of hardwre platform (4090D->4090) +- We suspect that some of the improvements come from the change of hardware platform (4090D->4090) #### Benchmark Results