Commit Graph

  • c71bf2c45c
    swift : fix build on xcode 15 (#3387) Jhen-Jie Hong 2023-09-29 13:25:13 +08:00
  • bc39553c90
    build : enable more non-default compiler warnings (#3200) Cebtenzzre 2023-09-28 17:41:44 -04:00
  • 0ccfc62a96
    ggml_tensor: update the structure comments. (#3283) Hua Jiang 2023-09-28 13:06:18 -07:00
  • 7f1a0fe709
    ggml : release the requested thread pool resource (#3292) Qu Zongfu 2023-09-29 03:51:52 +08:00
  • 16bc66d947
    llama.cpp : split llama_context_params into model and context params (#3301) slaren 2023-09-28 21:42:38 +02:00
  • 0512d66670
    ci : multithreaded builds (#3311) Eve 2023-09-28 19:31:04 +00:00
  • 0e76a8992c
    train : finetune LORA (#2632) xaedes 2023-09-28 20:40:11 +02:00
  • 2db94d98ed
    gguf : basic type checking in gguf_get_* (#3346) Cebtenzzre 2023-09-28 14:30:31 -04:00
  • ecf90b1a51
    gguf : make token scores and types optional (#3347) Cebtenzzre 2023-09-28 14:30:15 -04:00
  • 2619109ad5
    ci : disable freeBSD builds due to lack of VMs (#3381) Georgi Gerganov 2023-09-28 19:36:36 +03:00
  • ec893798b7
    llama : custom attention mask + parallel decoding + no context swaps (#3228) Georgi Gerganov 2023-09-28 19:04:36 +03:00
  • 45855b3f1c
    docs : mark code as Bash (#3375) Kevin Ji 2023-09-28 09:11:32 -04:00
  • 4aea3b846e
    readme : add Mistral AI release 0.1 (#3362) Pierre Alexandre SCHEMBRI 2023-09-28 14:13:37 +02:00
  • da0400344b
    ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (#3370) slaren 2023-09-28 12:08:28 +02:00
  • e519621010
    convert : remove bug in convert.py permute function (#3364) Zhang Peiyuan 2023-09-28 02:45:20 +08:00
  • ac43576124
    make-ggml.py : compatibility with more models and GGUF (#3290) Richard Roberson 2023-09-27 10:25:12 -06:00
  • 20c7e1e804
    gguf : fix a few general keys (#3341) Cebtenzzre 2023-09-27 12:18:07 -04:00
  • dc6897404e
    metal : reusing llama.cpp logging (#3152) Rickard Hallerbäck 2023-09-27 17:48:33 +02:00
  • 527e57cfd8
    build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342) Jag Chadha 2023-09-27 11:34:32 -04:00
  • ffe88a36a9
    readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340) BarfingLemurs 2023-09-27 11:30:36 -04:00
  • 99115f3fa6
    cmake : fix build-info.h on MSVC (#3309) DAN™ 2023-09-25 18:45:33 -04:00
  • 1726f9626f
    docs: Fix typo CLBlast_DIR var. (#3330) 2f38b454 2023-09-26 02:24:52 +08:00
  • a98b1633d5
    nix : add cuda, use a symlinked toolkit for cmake (#3202) Erik Scholz 2023-09-25 13:48:30 +02:00
  • c091cdfb24
    llama-bench : add README (#3317) slaren 2023-09-23 21:48:24 +02:00
  • 51a7cf5c6e
    examples : fix RoPE defaults to match PR #3240 (#3315) Cebtenzzre 2023-09-23 05:28:50 -04:00
  • bedb92b603
    scripts : use /usr/bin/env in shebang (#3313) Kevin Ji 2023-09-22 23:52:23 -04:00
  • bc9d3e3971
    Update README.md (#3289) Lee Drake 2023-09-21 13:00:24 -06:00
  • 36b904e200
    ggml-opencl.cpp: Make private functions static (#3300) shibe2 2023-09-21 22:10:26 +04:00
  • 324f3403d5
    zig : fix for updated c lib (#3259) Edward Taylor 2023-09-21 21:08:20 +12:00
  • f56c418ab0
    embedding : update README.md (#3224) yuiseki 2023-09-21 17:57:40 +09:00
  • 8185710a80
    CUDA: use only 1 thread if fully offloaded (#2915) Johannes Gäßler 2023-09-21 10:43:53 +02:00
  • 7eb41179ed
    readme : update hot topics Georgi Gerganov 2023-09-20 20:48:22 +03:00
  • a5661d7e71
    llama : allow gguf RoPE keys to be overridden with defaults (#3240) Cebtenzzre 2023-09-20 12:12:47 -04:00
  • 65c2c1c5ab
    benchmark-matmult : do not use integer abs() on a float (#3277) Cebtenzzre 2023-09-20 12:06:08 -04:00
  • 80834daecf
    flake : Restore default package's buildInputs (#3262) kang 2023-09-20 22:48:22 +09:00
  • a40f2b656f
    CI: FreeBSD fix (#3258) Alon 2023-09-20 15:06:36 +03:00
  • d119c04c15
    examples : fix benchmark-matmult (#1554) Georgi Gerganov 2023-09-20 10:02:39 +03:00
  • 8781013ef6
    make : restore build-info.h dependency for several targets (#3205) Cebtenzzre 2023-09-18 10:03:53 -04:00
  • 7ddf185537
    ci : switch cudatoolkit install on windows to networked (#3236) Erik Scholz 2023-09-18 02:21:47 +02:00
  • ee66942d7e
    CUDA: fix peer access logic (#3231) Johannes Gäßler 2023-09-17 23:35:20 +02:00
  • 111163e246
    CUDA: enable peer access between devices (#2470) Johannes Gäßler 2023-09-17 16:37:53 +02:00
  • 8b428c9bc8
    llama.cpp : show model size and BPW on load (#3223) slaren 2023-09-17 14:33:28 +02:00
  • 578d8c8f5c
    CUDA: fix scratch malloced on non-main device (#3220) Johannes Gäßler 2023-09-17 14:16:22 +02:00
  • b541b4f0b1
    Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) IsaacDynamo 2023-09-16 19:35:25 +02:00
  • 5dbc2b3213
    Enable build with CUDA 11.0 (make) (#3132) Vlad 2023-09-16 17:55:43 +03:00
  • b08e75baea
    Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170) goerch 2023-09-16 13:41:33 +02:00
  • e6616cf0db
    examples : add compiler version and target to build info (#2998) Cebtenzzre 2023-09-15 16:59:49 -04:00
  • 3aefaab9e5
    check C++ code with -Wmissing-declarations (#3184) Cebtenzzre 2023-09-15 15:38:27 -04:00
  • 69eb67e282
    fix build numbers by setting fetch-depth=0 (#3197) Cebtenzzre 2023-09-15 15:18:15 -04:00
  • 4fe09dfe66
    llama : add support for StarCoder model architectures (#3187) Meng Zhang 2023-09-16 03:02:13 +08:00
  • 80291a1d02
    common : do not use GNU zero-length __VA_ARGS__ extension (#3195) Cebtenzzre 2023-09-15 14:02:01 -04:00
  • c6f1491da0
    metal : fix bug in soft_max kernels (out-of-bounds access) (#3194) Georgi Gerganov 2023-09-15 20:17:24 +03:00
  • e3d87a6c36
    convert : make ftype optional in simple scripts (#3185) Cebtenzzre 2023-09-15 12:29:02 -04:00
  • 8c00b7a6ff
    sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192) Georgi Gerganov 2023-09-15 19:06:03 +03:00
  • 7e50d34be6
    cmake : fix building shared libs for clang (rocm) on windows (#3176) Engininja2 2023-09-15 06:24:30 -06:00
  • 235f7c193b
    flake : use pkg-config instead of pkgconfig (#3188) Evgeny Kurnevsky 2023-09-15 10:10:22 +02:00
  • a51b687657
    metal : relax conditions on fast matrix multiplication kernel (#3168) Georgi Gerganov 2023-09-15 11:09:24 +03:00
  • 76164fe2e6
    cmake : fix llama.h location when built outside of root directory (#3179) Andrei 2023-09-15 04:07:40 -04:00
  • c2ab6fe661
    ci : Cloud-V for RISC-V builds (#3160) Ali Tariq 2023-09-15 13:06:56 +05:00
  • 2d770505a8
    llama : remove mtest (#3177) Roland 2023-09-15 03:28:45 -04:00
  • 98311c4277
    llama : make quantize example up to 2.7x faster (#3115) Cebtenzzre 2023-09-14 21:09:53 -04:00
  • feea179e9f
    flake : allow $out/include to already exist (#3175) jneem 2023-09-14 13:54:47 -05:00
  • 769266a543
    cmake : compile ggml-rocm with -fpic when building shared library (#3158) Andrei 2023-09-14 13:38:16 -04:00
  • cf8238e7f4
    flake : include llama.h in nix output (#3159) Asbjørn Olling 2023-09-14 19:25:00 +02:00
  • 4b8560e72a
    make : fix clang++ detection, move some definitions to CPPFLAGS (#3155) Cebtenzzre 2023-09-14 13:22:47 -04:00
  • 83a53b753a
    CI: add FreeBSD & simplify CUDA windows (#3053) Alon 2023-09-14 20:21:25 +03:00
  • 5c872dbca2
    falcon : use stated vocab size (#2914) akawrykow 2023-09-14 10:19:42 -07:00
  • 990a5e226a
    cmake : add relocatable Llama package (#2960) bandoti 2023-09-14 14:04:40 -03:00
  • 980ab41afb
    docker : add gpu image CI builds (#3103) dylan 2023-09-14 09:47:00 -07:00
  • e394084166
    gguf-py : support identity operation in TensorNameMap (#3095) Kerfuffle 2023-09-14 10:32:26 -06:00
  • 4c8643dd6e
    feature : support Baichuan serial models (#3009) jameswu2014 2023-09-15 00:32:10 +08:00
  • 35f73049af
    speculative : add heuristic algorithm (#3006) Leng Yue 2023-09-14 09:14:44 -07:00
  • 71ca2fad7d
    whisper : tokenizer fix + re-enable tokenizer test for LLaMa (#3096) goerch 2023-09-13 15:19:44 +02:00
  • 1b6c650d16
    cmake : add a compiler flag check for FP16 format (#3086) Tristan Ross 2023-09-13 06:08:52 -07:00
  • 0a5eebb45d
    CUDA: mul_mat_q RDNA2 tunings (#2910) Johannes Gäßler 2023-09-13 11:20:24 +02:00
  • 84e723653c
    speculative: add --n-gpu-layers-draft option (#3063) FK 2023-09-13 08:50:46 +02:00
  • b52b29ab9d
    arm64 support for windows (#3007) Eric Sommerlade 2023-09-13 02:54:20 +01:00
  • 4f7cd6ba9c
    CUDA: fix LoRAs (#3130) Johannes Gäßler 2023-09-13 00:15:33 +02:00
  • 89e89599fd
    CUDA: fix mul_mat_q not used for output tensor (#3127) Johannes Gäßler 2023-09-11 22:58:41 +02:00
  • d54a4027a6
    CUDA: lower GPU latency + fix Windows performance (#3110) Johannes Gäßler 2023-09-11 19:55:51 +02:00
  • 1b0d09259e
    cmake : support build for iOS/tvOS (#3116) Jhen-Jie Hong 2023-09-11 19:49:06 +08:00
  • 8a4ca9af56
    CUDA: add device number to error messages (#3112) Johannes Gäßler 2023-09-11 13:00:24 +02:00
  • f31b6f4e2d
    metal : PP speedup (#3084) Kawrakow 2023-09-11 09:30:11 +02:00
  • 6eeb4d9083
    convert: remove most of the n_mult usage in convert.py (#3098) Erik Scholz 2023-09-10 17:06:53 +02:00
  • 21ac3a1503
    metal : support for Swift (#3078) kchro3 2023-09-09 02:12:10 -07:00
  • 4fd5477955
    metal : support build for iOS/tvOS (#3089) Jhen-Jie Hong 2023-09-09 16:46:04 +08:00
  • ec2a24fedf
    flake : add train-text-from-scratch to flake.nix (#3042) takov751 2023-09-08 17:06:26 +01:00
  • 7d99aca759
    readme : fix typo (#3043) Ikko Eltociear Ashimine 2023-09-09 01:04:32 +09:00
  • ba7ffbb251
    metal : Q3_K speedup (#2995) Kawrakow 2023-09-08 18:01:04 +02:00
  • e64f5b5578
    examples : make n_ctx warning work again (#3066) Cebtenzzre 2023-09-08 11:43:35 -04:00
  • 94f10b91ed
    readme : update hot topics Georgi Gerganov 2023-09-08 18:18:04 +03:00
  • b3e9852e47
    sync : ggml (CUDA GLM RoPE + POSIX) (#3082) Georgi Gerganov 2023-09-08 17:58:07 +03:00
  • cb6c44c5e0
    build : do not use _GNU_SOURCE gratuitously (#2035) Przemysław Pawełczyk 2023-09-08 14:09:21 +02:00
  • a21baeb122
    docker : add git to full-cuda.Dockerfile main-cuda.Dockerfile (#3044) hongbo.mo 2023-09-08 18:57:55 +08:00
  • 6ff712a6d1
    Update deprecated GGML TheBloke links to GGUF (#3079) Yui 2023-09-08 12:32:55 +02:00
  • ebc96086af
    ggml-alloc : correctly check mmap return value for errors (#3075) slaren 2023-09-08 04:04:56 +02:00
  • 7f412dab9c
    enable CPU HBM (#2603) Kunshang Ji 2023-09-08 09:46:56 +08:00
  • 6336d834ec
    convert : fix F32 ftype not being saved (#3048) Cebtenzzre 2023-09-07 14:27:42 -04:00
  • 00d62adb79
    fix some warnings from gcc and clang-tidy (#3038) Cebtenzzre 2023-09-07 13:22:29 -04:00
  • 4fa2cc1750
    make : improve test target (#3031) Cebtenzzre 2023-09-07 10:15:01 -04:00