Commit Graph

  • 5ffab089a5
    make : fix CPPFLAGS (#3035) Cebtenzzre 2023-09-07 10:13:50 -04:00
  • 15b67a66c2
    llama-bench : use two tokens in the warmup run for prompt evals (#3059) slaren 2023-09-07 15:52:34 +02:00
  • be8c9c245b
    metal : parallel RoPE on Metal (#3024) Kawrakow 2023-09-07 15:45:01 +02:00
  • be6beeb8d7
    metal : correct fix of kernel_norm (#3060) Kawrakow 2023-09-07 15:42:42 +02:00
  • c4f496648c
    metal : fix kernel_norm (fixes Falcon on Metal) (#3057) Georgi Gerganov 2023-09-07 15:49:09 +03:00
  • fec2fb19e4
    ggml : posixify madvise and pagesize (#3037) Przemysław Pawełczyk 2023-09-07 10:15:06 +02:00
  • 178b1850eb
    k-quants : fix zero-weight guard in Q6_K (ref #3040) Georgi Gerganov 2023-09-06 12:40:57 +03:00
  • ea2c85d5d2
    convert-llama-ggml-to-gguf: Try to handle files older than GGJTv3 (#3023) Kerfuffle 2023-09-06 02:49:11 -06:00
  • 9912b9efc8
    build : add LLAMA_METAL_NDEBUG flag (#3033) Cebtenzzre 2023-09-05 18:21:10 -04:00
  • 9e2023156e
    make : use new flag variables for recent changes (#3019) Cebtenzzre 2023-09-05 15:12:00 -04:00
  • de2fe892af
    examples : replace fprintf to stdout with printf (#3017) Cebtenzzre 2023-09-05 15:10:27 -04:00
  • c9c3220c48
    convert: fix convert.py not working with int filename_stem (#3028) Erik Scholz 2023-09-05 19:41:00 +02:00
  • d59bd97065
    Guard against all weights in a super-block being zero (#3010) Kawrakow 2023-09-05 09:55:33 +02:00
  • 35938ee3b0
    llama : update logic for number of threads when using BLAS Georgi Gerganov 2023-09-05 10:46:39 +03:00
  • 921772104b
    speculative : add grammar support (#2991) Georgi Gerganov 2023-09-05 08:46:17 +03:00
  • 2ba85c8609
    py : minor Georgi Gerganov 2023-09-04 22:50:50 +03:00
  • e36ecdccc8
    build : on Mac OS enable Metal by default (#2901) Georgi Gerganov 2023-09-04 22:26:24 +03:00
  • bd33e5ab92
    ggml-opencl : store GPU buffer in ggml_tensor::extra (#2994) slaren 2023-09-04 14:59:52 +02:00
  • 3103568144
    llama-bench : make cpp file non-executable (#2999) Cebtenzzre 2023-09-04 06:40:18 -04:00
  • 5b8530d88c
    make : add speculative example (#3003) Leng Yue 2023-09-04 03:39:57 -07:00
  • e4386f417f
    server : add a subtle loading animation to the edit box (#2466) Aarni Koskela 2023-09-04 10:28:55 +02:00
  • 35195689cd
    2x faster (rms) norm cuda kernels (3.7% e2e improvement) (#2985) Jiahao Li 2023-09-04 14:53:30 +08:00
  • cf9b08485c
    ggml-alloc : use virtual memory for measurement (#2973) slaren 2023-09-03 20:34:09 +02:00
  • 47068e5170
    speculative : PoC for speeding-up inference via speculative sampling (#2926) Georgi Gerganov 2023-09-03 15:12:08 +03:00
  • 8f429fa511
    perplexity : fix ETA by warming up the model with an empty run Georgi Gerganov 2023-09-03 13:42:56 +03:00
  • 6519e9c99c
    gguf(python): Fix special vocab handling when id < 0 (#2984) Kerfuffle 2023-09-03 04:38:43 -06:00
  • b7f2aa9e51
    metal : restore 363f0bf and fix reduce in F16_F32 kernels (#2986) Georgi Gerganov 2023-09-03 13:23:33 +03:00
  • 73a12a6344
    cov : disable comment in PRs (#2989) Alon 2023-09-03 13:19:01 +03:00
  • 3730134776
    llama : fix bpe tokenize from byte (#2889) opparco 2023-09-03 19:18:09 +09:00
  • d9151e6f57
    metal : revert 6af0bab until we fix it Georgi Gerganov 2023-09-03 12:40:56 +03:00
  • afc43d5f82
    cov : add Code Coverage and codecov.io integration (#2928) Alon 2023-09-03 11:48:49 +03:00
  • 6460f758db
    opencl : fix a bug in ggml_cl_pool_malloc() for ggml_cl_mul_mat_f32() (#2955) Wentai Zhang 2023-09-03 16:46:44 +08:00
  • ca82cf7bac
    metal : more optimizations (#2959) Kawrakow 2023-09-03 11:06:22 +03:00
  • 6a31a3bd98
    swift : add support for k-quants (#2983) kchro3 2023-09-02 23:21:05 -07:00
  • cff7b0bf07
    convert.py : BPE fixes (#2938) Kerfuffle 2023-09-02 23:52:13 -06:00
  • 340af42f09
    docs : add catai to README.md (#2967) Ido S 2023-09-03 08:50:51 +03:00
  • c42f0ec6b3
    examples : fix gpt-neox (#2943) momonga 2023-09-03 14:36:28 +09:00
  • 2753415afd
    swift : add missing c file to Package.swift (#2978) kchro3 2023-09-02 22:27:25 -07:00
  • bc054af97a
    make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS (#2886) Cebtenzzre 2023-09-03 01:26:59 -04:00
  • 3358c381f6
    logging: Fix creating empty file even when disabled (#2966) Kerfuffle 2023-09-02 11:53:55 -06:00
  • 52315a4216
    readme : update clblast instructions (#2903) bandoti 2023-09-02 09:53:18 -03:00
  • 8b56b4f2c3
    metal : show all Metal device instances in the system (#2952) Karsten Weiss 2023-09-02 14:29:09 +02:00
  • 21f3d1be86
    k-quants : fix build on armv7 (android only) (#2920) Jhen-Jie Hong 2023-09-02 20:23:45 +08:00
  • 571083f508
    server : avoid antiprompt in probabilities of final response (#2849) Jhen-Jie Hong 2023-09-02 08:31:46 +08:00
  • f04d002844
    cuda : vsubss4 for older versions of ROCm/clang (#2942) Engininja2 2023-09-01 15:33:19 -06:00
  • 69fdbb9abc
    readme : quick start command fix (#2908) ZHAOKAI WANG 2023-09-01 22:06:44 +08:00
  • 5d6f19f16b
    Allow quantize to only copy tensors, some other improvements (#2931) Kerfuffle 2023-09-01 08:02:48 -06:00
  • 0d58936686
    llama2c : rename function Georgi Gerganov 2023-09-01 17:00:40 +03:00
  • 6c9c23429b
    make : use unaligned vector moves on MinGW (#2945) Cebtenzzre 2023-09-01 09:53:14 -04:00
  • ee8654bcd0
    minor : add const qualifiers (#2853) m3ndax 2023-09-01 15:47:27 +02:00
  • 49bb9cbe0f
    docs : add java-llama.cpp to README.md (#2935) Konstantin Herud 2023-09-01 15:36:14 +02:00
  • ef15649972
    build : fix most gcc and clang warnings (#2861) Cebtenzzre 2023-09-01 09:34:50 -04:00
  • d8d6977f48
    examples : add C grammar (#2357) Ben Siraphob 2023-09-01 09:32:14 -04:00
  • 5aec2cfaac
    ggml : add RISC-V vector intrinsics support (#2929) Tameem 2023-09-01 18:27:40 +05:00
  • 13268c5331
    metal : slight speed-up for add and mul kernels (#2917) Georgi Gerganov 2023-09-01 13:42:41 +03:00
  • 4dcd47d71d
    logs : fix mingw-like builds (fixes #2898) (#2911) staviq 2023-09-01 11:07:06 +02:00
  • 18705a30ef
    llama2c : fix segfault and alloc-dealloc-mismatch (#2913) Cebtenzzre 2023-09-01 05:03:49 -04:00
  • e8d9158925
    metal: somewhat faster f16 x f32 matrix multiply kernel (#2951) Kawrakow 2023-09-01 11:15:57 +03:00
  • bce1fef328
    convert : fix another python 3.8 issue (#2949) Cebtenzzre 2023-08-31 22:13:51 -04:00
  • 528134dd02
    remove convert-llama-7b-pth-to-gguf.py and convert-llama-hf-to-gguf.py (#2906) slaren 2023-09-01 01:32:09 +02:00
  • aeefac4ff7
    scripts: Use local gguf package when running from repo (#2927) Kerfuffle 2023-08-31 16:49:24 -06:00
  • e8422de39e
    @vxiiduu's fix for PrefetchVirtualMemory (#2930) DannyDaemonic 2023-08-31 04:21:45 -07:00
  • 92d0b751a7
    convert : fix python 3.8 support, modernize type annotations (#2916) Cebtenzzre 2023-08-31 01:02:23 -04:00
  • 8afe228000
    CUDA: mul_mat_q=true llama_context_params default (#2912) Johannes Gäßler 2023-08-30 21:46:19 +02:00
  • 71d6975559
    [Docker] fix tools.sh argument passing. (#2884) Henri Vasserman 2023-08-30 19:14:53 +03:00
  • b532a69b2f
    convert.py : use dir name to name the llama Georgi Gerganov 2023-08-30 13:29:40 +03:00
  • c90d135eb4
    examples : fix underscore in beam-search + .gitignore (close #2900) Georgi Gerganov 2023-08-30 12:52:46 +03:00
  • 0d1c706181
    gguf : add workflow for Pypi publishing (#2896) M. Yusuf Sarıgöz 2023-08-30 12:47:40 +03:00
  • 9509294420
    make : add test and update CI (#2897) alonfaraj 2023-08-30 12:42:51 +03:00
  • 35092fb547
    docs : add node-llama-cpp to README.md (#2885) Gilad S 2023-08-30 11:40:12 +03:00
  • dc07dc492e
    convert : various script cleanups/fixes + merges and special token handling (#2842) Kerfuffle 2023-08-30 02:25:50 -06:00
  • ad9ddcff6e
    llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) chaihahaha 2023-08-30 14:50:55 +08:00
  • 8341a25957
    main : log file (#2748) staviq 2023-08-30 08:29:32 +02:00
  • 849408957c
    tests : add a C compliance test (#2848) Cebtenzzre 2023-08-30 02:20:26 -04:00
  • 06abf8eeba
    ggml : add view_src and view_offs to ggml_tensor for views (#2874) slaren 2023-08-29 23:24:42 +02:00
  • c03a243abf
    remove outdated references to -eps and -gqa from README (#2881) slaren 2023-08-29 23:17:34 +02:00
  • fa3582f509
    Tell users attempting to run perplexity with too few tokens to use more (#2882) Kawrakow 2023-08-29 23:55:45 +03:00
  • e37e69dcc3
    10X faster BPE tokenizer (#2876) Kawrakow 2023-08-29 23:55:03 +03:00
  • 53885d7256
    py : fix "usage" messages (#2873) maddes8cht 2023-08-29 15:51:02 +02:00
  • bcce96ba4d
    convert.py : fix baichuan7B support (#2870) jameswu2014 2023-08-29 17:48:41 +08:00
  • 74e0caeb82
    readme : add react-native binding (#2869) Jhen-Jie Hong 2023-08-29 17:30:10 +08:00
  • d4b5e16c32
    make : fix clang tests build, add missing examples (#2859) Cebtenzzre 2023-08-29 04:42:41 -04:00
  • 3a007648f2
    metal : add option to disable debug logs (close #2764) Georgi Gerganov 2023-08-29 11:33:46 +03:00
  • 611363ac79
    scripts : add pipefail Georgi Gerganov 2023-08-29 10:50:30 +03:00
  • 95b6e5212f
    added struct to llama_dump_timing_info_yaml's llama_context (#2857) Marcus Dunn 2023-08-28 23:33:27 -07:00
  • 44c117f41e
    train : mem usage and other improvements (#2439) xaedes 2023-08-28 21:51:47 +02:00
  • 43033b7bb4
    llama-bench : set locale to utf8 (#2832) slaren 2023-08-28 19:19:18 +02:00
  • 6b73ef1201
    YAML result logging + preset script (#2657) Johannes Gäßler 2023-08-28 17:59:39 +02:00
  • 75fafcbccc
    make : fix tests build (#2855) alonfaraj 2023-08-28 18:38:35 +03:00
  • be475f60af
    llama.cpp : fix wrong vsnprintf call in MS compiler (#2856) grahameth 2023-08-28 17:38:12 +02:00
  • 3af6b86301
    ggml : tiny ggml_vec_dot_q4_K_q8_K AVX2 improvement (#2819) Ronny Brendel 2023-08-28 14:51:08 +02:00
  • 35feac6560
    ggml : sync (mem align to header + conv_transpose_2d fixes + ggml_alloc) (#2852) Georgi Gerganov 2023-08-28 14:24:53 +03:00
  • 92b1bbd2ec
    CUDA: fix RoPE asserts, block sizes (#2833) Johannes Gäßler 2023-08-28 13:23:55 +02:00
  • dd0dc366da
    llama.h : add missing struct keyword for C compat in callback type (#2847) igarnier 2023-08-28 10:19:59 +02:00
  • f55538c3cc
    metal : fix memory leak (#2762) Georgi Gerganov 2023-08-28 10:59:08 +03:00
  • ebcee207b6
    quantize : make output filename optional again (#2823) Cebtenzzre 2023-08-28 02:32:25 -04:00
  • 3e8ff47af6
    devops : added systemd units and set versioning to use date. (#2835) JohnnyB 2023-08-28 07:31:24 +01:00
  • 103cfafc77
    gguf : fix strings to not be null-terminated (#2839) Georgi Gerganov 2023-08-27 21:50:22 +03:00
  • c10704d01e
    llama : fix MPI threads (close #2827) Georgi Gerganov 2023-08-27 18:55:41 +03:00
  • 230d46c723
    examples : update llama2.c converter to read vocab and write models in GGUF format (#2751) Olivier Chafik 2023-08-27 15:13:31 +01:00