Commit Graph

  • 4301535326 sync : ggml Georgi Gerganov 2024-09-20 19:06:59 +03:00
  • 424c5d00a9 ggml/examples: add backend support for numerical optimization (ggml/949) Johannes Gäßler 2024-09-20 19:04:44 +03:00
  • a6809c6a2e examples : add null threadpool args where needed (ggml/0) Georgi Gerganov 2024-09-08 11:10:43 +03:00
  • 5cb12f6839 CUDA: fix sum.cu compilation for CUDA < 11.7 (#9562) Johannes Gäßler 2024-09-20 18:35:35 +02:00
  • d39e26741f examples : flush log upon ctrl+c (#9559) Georgi Gerganov 2024-09-20 11:46:56 +03:00
  • 722ec1eb51 perplexity : do not escape input data by default (#9548) Sigbjørn Skjæret 2024-09-20 08:38:10 +02:00
  • 6026da52d6 server : clean-up completed tasks from waiting list (#9531) Georgi Gerganov 2024-09-19 12:44:53 +03:00
  • eca0fab44e imatrix : disable prompt escape by default (#9543) Sigbjørn Skjæret 2024-09-19 09:58:14 +02:00
  • 64c6af3195 ggml : fix n_threads_cur initialization with one thread (#9538) slaren 2024-09-18 19:13:08 +02:00
  • 0d2f22e45c scripts : verify py deps at the start of compare (#9520) Georgi Gerganov 2024-09-18 18:34:32 +03:00
  • 6443ddd985 llama : use reserve/emplace_back in sampler_sample (#9534) Daniel Bevenius 2024-09-18 13:42:36 +02:00
  • 8a308354f6 server : match OAI structured output response (#9527) Vinesh Janarthanan 2024-09-18 01:50:34 -05:00
  • f799155ab8 server : fix OpenSSL build (remove obsolete LOG_INFO) (#9529) Eric Zhang 2024-09-18 14:28:20 +08:00
  • faf67b3de4 [SYCL]set context default value to avoid memory issue, update guide (#9476) Neo Zhang Jianyu 2024-09-18 08:30:31 +08:00
  • 7be099fa81 llama-bench: correct argument parsing error message (#9524) Michael Podvitskiy 2024-09-17 22:41:38 +02:00
  • 8b836ae731 arg : add env variable for parallel (#9513) Bert Wagner 2024-09-17 09:35:38 -04:00
  • 8344ef58f8 llama : fix n_vocab init for 'no_vocab' case (#9511) Michael Podvitskiy 2024-09-17 12:18:22 +02:00
  • 0226613853 threadpool : skip polling for unused threads (#9461) Max Krasnyansky 2024-09-17 01:19:46 -07:00
  • 503147a9f9 unicode : add <algorithm> (#9508) Yuri Khrustalev 2024-09-17 02:51:15 -04:00
  • 0d2ec43833 llama : support IBM Granite architecture (#9412) Gabe Goodhart 2024-09-17 00:44:58 -06:00
  • 37f3a3810e llama : add llama_n_head() (#9512) Michael Podvitskiy 2024-09-17 08:23:30 +02:00
  • 23e0d70bac ggml : move common CPU backend impl to new header (#9509) slaren 2024-09-16 16:22:07 +02:00
  • acb2c32c33 llama : rename n_embed to n_embd in rwkv6_time_mix (#9504) Daniel Bevenius 2024-09-16 13:07:13 +02:00
  • a6a3a5c531 ggml : link MATH_LIBRARY not by its full path (#9339) Michael Podvitskiy 2024-09-16 13:06:50 +02:00
  • d54c21df7e convert : identify missing model files (#9397) compilade 2024-09-16 03:30:22 -04:00
  • 19514d632e cmake : do not hide GGML options + rename option (#9465) Georgi Gerganov 2024-09-16 10:27:50 +03:00
  • 5c3d0f1824 ggml : IQ4_NL sgemm + Q4_0 AVX optimization (#9422) Eve 2024-09-16 06:48:24 +00:00
  • 0aadac10c7 llama : support OLMoE (#9462) Shane A 2024-09-15 23:47:37 -07:00
  • 95ca85168b llama : support MiniCPM3 (#9322) CarryFun 2024-09-16 14:45:20 +08:00
  • 441b72b91f main : option to disable context shift (#9484) Vinesh Janarthanan 2024-09-16 01:20:01 -05:00
  • c4965a64f7 metal : handle zero-sized allocs (#9466) Georgi Gerganov 2024-09-16 09:05:56 +03:00
  • 90a2fff0e7 flake.lock: Update (#9488) Georgi Gerganov 2024-09-16 05:14:23 +03:00
  • 6262d13e0b common : reimplement logging (#9418) Georgi Gerganov 2024-09-15 20:46:12 +03:00
  • e6deac31f7 gguf-split : add basic checks (#9499) slaren 2024-09-15 19:02:27 +02:00
  • 6988da94a2 cmake : correct order of sycl flags (#9497) Michael Podvitskiy 2024-09-15 18:55:52 +02:00
  • 3c7989fd29 py : add "LLaMAForCausalLM" conversion support (#9485) Csaba Kecskemeti 2024-09-15 00:48:25 -07:00
  • d6b37c881f readme : update tools list (#9475) OSecret 2024-09-15 10:36:53 +03:00
  • 7596487beb cmake : try to fix sycl+intel build (#9487) Michael Podvitskiy 2024-09-15 09:06:38 +02:00
  • 822b6322de ggml : ggml_type_name return "NONE" for invalid values (#9458) Yuri Khrustalev 2024-09-14 05:54:37 -04:00
  • dcdcee3a74 server: add data: [DONE] to /chat/completions stream response (#9459) VoidIsVoid 2024-09-14 17:36:44 +08:00
  • 1f4111e540 cmake : use list(APPEND ...) instead of set() + dedup linker (#9463) Georgi Gerganov 2024-09-14 10:55:05 +03:00
  • befaf1197f llama : make cell_id const in inp_s_mask block (#9470) Daniel Bevenius 2024-09-14 09:50:12 +02:00
  • feff4aa846 server : add loading html page while model is loading (#9468) Xuan Son Nguyen 2024-09-13 14:23:11 +02:00
  • 0abc6a2c25 llama : llama_perf + option to disable timings during decode (#9355) Georgi Gerganov 2024-09-13 09:53:38 +03:00
  • bd35cb0ae3 feat: remove a sampler from a chain (#9445) Gilad S. 2024-09-13 04:54:49 +03:00
  • 78203641fe server : Add option to return token pieces in /tokenize endpoint (#9108) Mathijs Henquet 2024-09-12 22:30:11 +02:00
  • e6b7801bd1 cann: Add host buffer type for Ascend NPU (#9406) Dou Xinpeng 2024-09-12 19:46:43 +08:00
  • e665744317 llava : fix the script error in MobileVLM README (#9054) fengerhu1 2024-09-12 19:34:22 +08:00
  • d4c3c10fad lora : raise error if lm_head is ignored (#9103) Xuan Son Nguyen 2024-09-12 13:33:57 +02:00
  • 2a825116b6 cmake : fix for builds without GGML_CDEF_PUBLIC (#9338) Michael Podvitskiy 2024-09-12 13:30:01 +02:00
  • 4dc4f5f14a ci : update HIP SDK to 24.Q3 (ROCm 6.1) (#9329) Huang Qi 2024-09-12 19:28:43 +08:00
  • c837981bba py : add Phi-1.5/Phi-2 tokenizer (#9361) daminho 2024-09-12 20:28:20 +09:00
  • 3c26a1644d ci : bump actions/checkout to v4 (#9377) Trivikram Kamat 2024-09-12 04:27:45 -07:00
  • ff76e18516 cmake : fixed the order of linking libraries for llama-quantize (#9450) Michael Podvitskiy 2024-09-12 13:27:14 +02:00
  • 39f852f440 py : add special tokens in hf_converter for RWKV v6 (#9428) Molly Sophia 2024-09-12 19:25:16 +08:00
  • 2b00fa7997 riscv : modify Makefile and add a RISCV_VECT to print log info (#9442) Ahmad Tameem 2024-09-12 16:24:31 +05:00
  • d6a04f872d ggml : hide ggml_object, ggml_cgraph, ggml_hash_set (#9408) Georgi Gerganov 2024-09-12 14:23:49 +03:00
  • c9c8575a1a enhance run script to be easy to change the parameters (#9448) Neo Zhang Jianyu 2024-09-12 17:44:17 +08:00
  • df4b7945ae cann: Fix error when running a non-exist op (#9424) Xinpeng Dou 2024-09-12 09:02:35 +08:00
  • 449ccfb6f5 Add Jais to list of supported models (#9439) Faisal Zaghloul 2024-09-11 20:29:53 -04:00
  • 1b28061400 llama : skip token bounds check when evaluating embeddings (#9437) slaren 2024-09-11 17:52:13 +02:00
  • 8db003a19d py : support converting local models (#7547) Pavel Zloi 2024-09-11 15:29:51 +03:00
  • 0996c5597f llava : correct args for minicpmv-cli (#9429) Xuan Son Nguyen 2024-09-11 12:59:13 +02:00
  • 5bb2c5dbd2 files : remove accidentally added lora_test submodule (#9430) Xuan Son Nguyen 2024-09-11 12:02:09 +02:00
  • 67155ab7f5 feat: Implements retrying logic for downloading models using --model-url flag (#9255) Farbod Bijary 2024-09-11 12:52:37 +03:30
  • 5af118efda CUDA: fix --split-mode row race condition (#9413) Johannes Gäßler 2024-09-11 10:22:40 +02:00
  • d2b496bff4 batched-bench : remove unused code (#9305) Georgi Gerganov 2024-09-11 10:03:54 +03:00
  • b34e023480 musa: remove Clang builtins mapping (#9421) R0CKSTAR 2024-09-11 09:46:55 +08:00
  • 51b6038636 sycl : update support conditions (#9394) Alberto Cabrera Pérez 2024-09-11 01:53:42 +01:00
  • cb9c933eb2 flake.lock: Update (#9360) Georgi Gerganov 2024-09-11 01:46:59 +03:00
  • 6cd4e03444 arg : bring back missing ifdef (#9411) Xuan Son Nguyen 2024-09-10 22:41:29 +02:00
  • 8d300bd35f enable --special arg for llama-server (#9419) matteo 2024-09-10 22:40:59 +02:00
  • 49006c67b4 llama : move random seed generation to the samplers (#9398) slaren 2024-09-10 18:04:25 +02:00
  • 00ba2ff781 metal : fix compile warning with GGML_METAL_NDEBUG (#0) Georgi Gerganov 2024-09-10 10:17:03 +03:00
  • 83008b7cfe llama : update llm_build_copy_mask_state comment [no ci] (#9385) Daniel Bevenius 2024-09-10 09:03:21 +02:00
  • 0b4ac75772 RWKV v6: Add time_mix_decay_w1/w2 in quant exclusion list (#9387) Molly Sophia 2024-09-10 15:02:30 +08:00
  • fb3f249815 make : do not run llama-gen-docs when building (#9399) slaren 2024-09-10 08:23:33 +02:00
  • bfe76d4a17 common : move arg parser code to arg.cpp (#9388) Xuan Son Nguyen 2024-09-09 23:36:09 +02:00
  • 293bebe077 rpc : fix segfault with nkvo (#9389) Radoslav Gerganov 2024-09-09 18:40:10 +03:00
  • 5fac4d5764 ggml : vector length agnostic SVE support (#9290) Prashant Vithule 2024-09-09 21:07:18 +05:30
  • 5fb5e24811 llama : minor sampling refactor (2) (#9386) slaren 2024-09-09 17:10:46 +02:00
  • 38ca6f644b readme : update hot topics Georgi Gerganov 2024-09-09 15:51:37 +03:00
  • 8e6e2fbe14 CUDA: fix variable name conflict for Windows build (#9382) Johannes Gäßler 2024-09-09 14:22:53 +02:00
  • 5ed087573e readme : add LLMUnity to UI projects (#9381) Antonis Makropoulos 2024-09-09 14:21:38 +03:00
  • 54f376d0b9 rpc : update README [no ci] (#9320) Radoslav Gerganov 2024-09-09 11:04:39 +03:00
  • b2e89a3274 Arm AArch64: Documentation updates (#9321) Dan Johansson 2024-09-09 09:02:45 +02:00
  • daa9623ab0 Overlap cmdbuffer creation and cmdbuffer execution in Vulkan backend by submitting smaller cmdbuffers early. (#9118) Markus Tavenrath 2024-09-08 21:43:48 +02:00
  • e079bffb66 cuda : fix FA Q src index (1 -> 0) (#9374) Georgi Gerganov 2024-09-08 22:01:02 +03:00
  • 3f7ccfd649 common : bring back missing args, add env var duplication check (#9375) Xuan Son Nguyen 2024-09-08 18:08:55 +02:00
  • a249843d89 common : restore --n-gpu-layers (#9371) slaren 2024-09-08 16:44:42 +02:00
  • 19f4a7b296 llama : refactor samplers internal implementation (#9370) slaren 2024-09-08 15:52:07 +02:00
  • 2a358fb0c4 [SYCL] add check malloc result on device (#9346) Neo Zhang Jianyu 2024-09-08 19:05:29 +08:00
  • eae597182c llama : sanitize tokens in the upper bound (#9359) slaren 2024-09-08 12:41:51 +02:00
  • 00b02bb249 imatrix : fix arg parser for imatrix (#9366) Xuan Son Nguyen 2024-09-08 12:12:17 +02:00
  • a876861455 metal : update support condition for im2col + fix warning (#0) Georgi Gerganov 2024-09-08 09:57:57 +03:00
  • 385decbd63 sync : ggml Georgi Gerganov 2024-09-08 09:38:56 +03:00
  • 60a3107ccd scripts : option to increase git patch context Georgi Gerganov 2024-09-08 09:38:42 +03:00
  • 406c1a32a1 vulkan: add dryrun support to sin and cos ops (ggml/947) Salvatore Mesoraca 2024-09-06 14:34:25 +02:00
  • 9cb9260861 vulkan: correctly report support for OP_CONT (ggml/946) Salvatore Mesoraca 2024-09-06 14:34:07 +02:00
  • 202084d31d tests: add gradient tests for all backends (ggml/932) Johannes Gäßler 2024-09-03 17:21:46 +02:00