Commit Graph

  • c71bf2c45c
    swift : fix build on xcode 15 (#3387) Jhen-Jie Hong 2023-09-29 13:25:13 +08:00
  • bc39553c90
    build : enable more non-default compiler warnings (#3200) Cebtenzzre 2023-09-28 17:41:44 -04:00
  • 0ccfc62a96
    ggml_tensor: update the structure comments. (#3283) Hua Jiang 2023-09-28 13:06:18 -07:00
  • 7f1a0fe709
    ggml : release the requested thread pool resource (#3292) Qu Zongfu 2023-09-29 03:51:52 +08:00
  • 16bc66d947
    llama.cpp : split llama_context_params into model and context params (#3301) slaren 2023-09-28 21:42:38 +02:00
  • 0512d66670
    ci : multithreaded builds (#3311) Eve 2023-09-28 19:31:04 +00:00
  • 0e76a8992c
    train : finetune LORA (#2632) xaedes 2023-09-28 20:40:11 +02:00
  • 2db94d98ed
    gguf : basic type checking in gguf_get_* (#3346) Cebtenzzre 2023-09-28 14:30:31 -04:00
  • ecf90b1a51
    gguf : make token scores and types optional (#3347) Cebtenzzre 2023-09-28 14:30:15 -04:00
  • 2619109ad5
    ci : disable freeBSD builds due to lack of VMs (#3381) Georgi Gerganov 2023-09-28 19:36:36 +03:00
  • ec893798b7
    llama : custom attention mask + parallel decoding + no context swaps (#3228) Georgi Gerganov 2023-09-28 19:04:36 +03:00
  • 45855b3f1c
    docs : mark code as Bash (#3375) Kevin Ji 2023-09-28 09:11:32 -04:00
  • 4aea3b846e
    readme : add Mistral AI release 0.1 (#3362) Pierre Alexandre SCHEMBRI 2023-09-28 14:13:37 +02:00
  • da0400344b
    ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (#3370) slaren 2023-09-28 12:08:28 +02:00
  • e519621010
    convert : remove bug in convert.py permute function (#3364) Zhang Peiyuan 2023-09-28 02:45:20 +08:00
  • ac43576124
    make-ggml.py : compatibility with more models and GGUF (#3290) Richard Roberson 2023-09-27 10:25:12 -06:00
  • 20c7e1e804
    gguf : fix a few general keys (#3341) Cebtenzzre 2023-09-27 12:18:07 -04:00
  • dc6897404e
    metal : reusing llama.cpp logging (#3152) Rickard Hallerbäck 2023-09-27 17:48:33 +02:00
  • 527e57cfd8
    build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342) Jag Chadha 2023-09-27 11:34:32 -04:00
  • ffe88a36a9
    readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340) BarfingLemurs 2023-09-27 11:30:36 -04:00
  • 99115f3fa6
    cmake : fix build-info.h on MSVC (#3309) DAN™ 2023-09-25 18:45:33 -04:00
  • 1726f9626f
    docs: Fix typo CLBlast_DIR var. (#3330) 2f38b454 2023-09-26 02:24:52 +08:00
  • a98b1633d5
    nix : add cuda, use a symlinked toolkit for cmake (#3202) Erik Scholz 2023-09-25 13:48:30 +02:00
  • c091cdfb24
    llama-bench : add README (#3317) slaren 2023-09-23 21:48:24 +02:00
  • 51a7cf5c6e
    examples : fix RoPE defaults to match PR #3240 (#3315) Cebtenzzre 2023-09-23 05:28:50 -04:00
  • bedb92b603
    scripts : use /usr/bin/env in shebang (#3313) Kevin Ji 2023-09-22 23:52:23 -04:00
  • bc9d3e3971
    Update README.md (#3289) Lee Drake 2023-09-21 13:00:24 -06:00
  • 36b904e200
    ggml-opencl.cpp: Make private functions static (#3300) shibe2 2023-09-21 22:10:26 +04:00
  • 324f3403d5
    zig : fix for updated c lib (#3259) Edward Taylor 2023-09-21 21:08:20 +12:00
  • f56c418ab0
    embedding : update README.md (#3224) yuiseki 2023-09-21 17:57:40 +09:00
  • 8185710a80
    CUDA: use only 1 thread if fully offloaded (#2915) Johannes Gäßler 2023-09-21 10:43:53 +02:00
  • 7eb41179ed
    readme : update hot topics Georgi Gerganov 2023-09-20 20:48:22 +03:00
  • a5661d7e71
    llama : allow gguf RoPE keys to be overridden with defaults (#3240) Cebtenzzre 2023-09-20 12:12:47 -04:00
  • 65c2c1c5ab
    benchmark-matmult : do not use integer abs() on a float (#3277) Cebtenzzre 2023-09-20 12:06:08 -04:00
  • 80834daecf
    flake : Restore default package's buildInputs (#3262) kang 2023-09-20 22:48:22 +09:00
  • a40f2b656f
    CI: FreeBSD fix (#3258) Alon 2023-09-20 15:06:36 +03:00
  • d119c04c15
    examples : fix benchmark-matmult (#1554) Georgi Gerganov 2023-09-20 10:02:39 +03:00
  • 8781013ef6
    make : restore build-info.h dependency for several targets (#3205) Cebtenzzre 2023-09-18 10:03:53 -04:00
  • 7ddf185537
    ci : switch cudatoolkit install on windows to networked (#3236) Erik Scholz 2023-09-18 02:21:47 +02:00
  • ee66942d7e
    CUDA: fix peer access logic (#3231) Johannes Gäßler 2023-09-17 23:35:20 +02:00
  • 111163e246
    CUDA: enable peer access between devices (#2470) Johannes Gäßler 2023-09-17 16:37:53 +02:00
  • 8b428c9bc8
    llama.cpp : show model size and BPW on load (#3223) slaren 2023-09-17 14:33:28 +02:00
  • 578d8c8f5c
    CUDA: fix scratch malloced on non-main device (#3220) Johannes Gäßler 2023-09-17 14:16:22 +02:00
  • b541b4f0b1
    Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) IsaacDynamo 2023-09-16 19:35:25 +02:00
  • 5dbc2b3213
    Enable build with CUDA 11.0 (make) (#3132) Vlad 2023-09-16 17:55:43 +03:00
  • b08e75baea
    Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170) goerch 2023-09-16 13:41:33 +02:00
  • e6616cf0db
    examples : add compiler version and target to build info (#2998) Cebtenzzre 2023-09-15 16:59:49 -04:00
  • 3aefaab9e5
    check C++ code with -Wmissing-declarations (#3184) Cebtenzzre 2023-09-15 15:38:27 -04:00
  • 69eb67e282
    fix build numbers by setting fetch-depth=0 (#3197) Cebtenzzre 2023-09-15 15:18:15 -04:00
  • 4fe09dfe66
    llama : add support for StarCoder model architectures (#3187) Meng Zhang 2023-09-16 03:02:13 +08:00
  • 80291a1d02
    common : do not use GNU zero-length __VA_ARGS__ extension (#3195) Cebtenzzre 2023-09-15 14:02:01 -04:00
  • c6f1491da0
    metal : fix bug in soft_max kernels (out-of-bounds access) (#3194) Georgi Gerganov 2023-09-15 20:17:24 +03:00
  • e3d87a6c36
    convert : make ftype optional in simple scripts (#3185) Cebtenzzre 2023-09-15 12:29:02 -04:00
  • 8c00b7a6ff
    sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192) Georgi Gerganov 2023-09-15 19:06:03 +03:00
  • 7e50d34be6
    cmake : fix building shared libs for clang (rocm) on windows (#3176) Engininja2 2023-09-15 06:24:30 -06:00
  • 235f7c193b
    flake : use pkg-config instead of pkgconfig (#3188) Evgeny Kurnevsky 2023-09-15 10:10:22 +02:00
  • a51b687657
    metal : relax conditions on fast matrix multiplication kernel (#3168) Georgi Gerganov 2023-09-15 11:09:24 +03:00
  • 76164fe2e6
    cmake : fix llama.h location when built outside of root directory (#3179) Andrei 2023-09-15 04:07:40 -04:00
  • c2ab6fe661
    ci : Cloud-V for RISC-V builds (#3160) Ali Tariq 2023-09-15 13:06:56 +05:00
  • 2d770505a8
    llama : remove mtest (#3177) Roland 2023-09-15 03:28:45 -04:00
  • 98311c4277
    llama : make quantize example up to 2.7x faster (#3115) Cebtenzzre 2023-09-14 21:09:53 -04:00
  • feea179e9f
    flake : allow $out/include to already exist (#3175) jneem 2023-09-14 13:54:47 -05:00
  • 769266a543
    cmake : compile ggml-rocm with -fpic when building shared library (#3158) Andrei 2023-09-14 13:38:16 -04:00
  • cf8238e7f4
    flake : include llama.h in nix output (#3159) Asbjørn Olling 2023-09-14 19:25:00 +02:00
  • 4b8560e72a
    make : fix clang++ detection, move some definitions to CPPFLAGS (#3155) Cebtenzzre 2023-09-14 13:22:47 -04:00
  • 83a53b753a
    CI: add FreeBSD & simplify CUDA windows (#3053) Alon 2023-09-14 20:21:25 +03:00
  • 5c872dbca2
    falcon : use stated vocab size (#2914) akawrykow 2023-09-14 10:19:42 -07:00
  • 990a5e226a
    cmake : add relocatable Llama package (#2960) bandoti 2023-09-14 14:04:40 -03:00
  • 980ab41afb
    docker : add gpu image CI builds (#3103) dylan 2023-09-14 09:47:00 -07:00
  • e394084166
    gguf-py : support identity operation in TensorNameMap (#3095) Kerfuffle 2023-09-14 10:32:26 -06:00
  • 4c8643dd6e
    feature : support Baichuan serial models (#3009) jameswu2014 2023-09-15 00:32:10 +08:00
  • 35f73049af
    speculative : add heuristic algorithm (#3006) Leng Yue 2023-09-14 09:14:44 -07:00
  • 71ca2fad7d
    whisper : tokenizer fix + re-enable tokenizer test for LLaMa (#3096) goerch 2023-09-13 15:19:44 +02:00
  • 1b6c650d16
    cmake : add a compiler flag check for FP16 format (#3086) Tristan Ross 2023-09-13 06:08:52 -07:00
  • 0a5eebb45d
    CUDA: mul_mat_q RDNA2 tunings (#2910) Johannes Gäßler 2023-09-13 11:20:24 +02:00
  • 84e723653c
    speculative: add --n-gpu-layers-draft option (#3063) FK 2023-09-13 08:50:46 +02:00
  • b52b29ab9d
    arm64 support for windows (#3007) Eric Sommerlade 2023-09-13 02:54:20 +01:00
  • 4f7cd6ba9c
    CUDA: fix LoRAs (#3130) Johannes Gäßler 2023-09-13 00:15:33 +02:00
  • 89e89599fd
    CUDA: fix mul_mat_q not used for output tensor (#3127) Johannes Gäßler 2023-09-11 22:58:41 +02:00
  • d54a4027a6
    CUDA: lower GPU latency + fix Windows performance (#3110) Johannes Gäßler 2023-09-11 19:55:51 +02:00
  • 1b0d09259e
    cmake : support build for iOS/tvOS (#3116) Jhen-Jie Hong 2023-09-11 19:49:06 +08:00
  • 8a4ca9af56
    CUDA: add device number to error messages (#3112) Johannes Gäßler 2023-09-11 13:00:24 +02:00
  • f31b6f4e2d
    metal : PP speedup (#3084) Kawrakow 2023-09-11 09:30:11 +02:00
  • 6eeb4d9083
    convert: remove most of the n_mult usage in convert.py (#3098) Erik Scholz 2023-09-10 17:06:53 +02:00
  • 21ac3a1503
    metal : support for Swift (#3078) kchro3 2023-09-09 02:12:10 -07:00
  • 4fd5477955
    metal : support build for iOS/tvOS (#3089) Jhen-Jie Hong 2023-09-09 16:46:04 +08:00
  • ec2a24fedf
    flake : add train-text-from-scratch to flake.nix (#3042) takov751 2023-09-08 17:06:26 +01:00
  • 7d99aca759
    readme : fix typo (#3043) Ikko Eltociear Ashimine 2023-09-09 01:04:32 +09:00
  • ba7ffbb251
    metal : Q3_K speedup (#2995) Kawrakow 2023-09-08 18:01:04 +02:00
  • e64f5b5578
    examples : make n_ctx warning work again (#3066) Cebtenzzre 2023-09-08 11:43:35 -04:00
  • 94f10b91ed
    readme : update hot topics Georgi Gerganov 2023-09-08 18:18:04 +03:00
  • b3e9852e47
    sync : ggml (CUDA GLM RoPE + POSIX) (#3082) Georgi Gerganov 2023-09-08 17:58:07 +03:00
  • cb6c44c5e0
    build : do not use _GNU_SOURCE gratuitously (#2035) Przemysław Pawełczyk 2023-09-08 14:09:21 +02:00
  • a21baeb122
    docker : add git to full-cuda.Dockerfile main-cuda.Dockerfile (#3044) hongbo.mo 2023-09-08 18:57:55 +08:00
  • 6ff712a6d1
    Update deprecated GGML TheBloke links to GGUF (#3079) Yui 2023-09-08 12:32:55 +02:00
  • ebc96086af
    ggml-alloc : correctly check mmap return value for errors (#3075) slaren 2023-09-08 04:04:56 +02:00
  • 7f412dab9c
    enable CPU HBM (#2603) Kunshang Ji 2023-09-08 09:46:56 +08:00
  • 6336d834ec
    convert : fix F32 ftype not being saved (#3048) Cebtenzzre 2023-09-07 14:27:42 -04:00
  • 00d62adb79
    fix some warnings from gcc and clang-tidy (#3038) Cebtenzzre 2023-09-07 13:22:29 -04:00
  • 4fa2cc1750
    make : improve test target (#3031) Cebtenzzre 2023-09-07 10:15:01 -04:00