Commit Graph

  • 7e72aa74fd
    py: add_array() will not add to kv store if value is an empty array (#8774) Brian 2024-07-31 00:57:03 +10:00
  • 7c27a19b2e
    added android implementation of ggml_print_backtrace_symbols (#8751) l3utterfly 2024-07-30 23:40:18 +09:00
  • 140074bb86
    flake.lock: Update (#8729) Georgi Gerganov 2024-07-30 15:58:57 +03:00
  • 6e2b6000e5
    cann: update cmake (#8765) wangshuai09 2024-07-30 18:37:35 +08:00
  • c887d8b017
    [SYCL] Add TIMESTEP_EMBEDDING OP (#8707) zhentaoyu 2024-07-30 14:56:51 +08:00
  • 75af08c475
    ggml: bugfix: fix the inactive elements is agnostic for risc-v vector (#8748) CarterLi999 2024-07-30 00:38:34 +08:00
  • 439b3fc75a
    cuda : organize vendor-specific headers into vendors directory (#8746) R0CKSTAR 2024-07-29 20:56:12 +08:00
  • 0832de7236
    [SYCL] add conv support (#8688) Meng, Hengyu 2024-07-29 10:50:27 +08:00
  • 6eeaeba126
    cmake: use 1 more thread for non-ggml in CI (#8740) Johannes Gäßler 2024-07-28 22:32:44 +02:00
  • 4730faca61
    chore : Fix vulkan related compiler warnings, add help text, improve CLI options (#8477) Austin 2024-07-28 03:52:42 -04:00
  • 4c676c85e5
    llama : refactor session file management (#8699) compilade 2024-07-28 00:42:05 -04:00
  • e54c35e4fb
    feat: Support Moore Threads GPU (#8383) R0CKSTAR 2024-07-28 07:41:25 +08:00
  • 5e2727fe03
    scripts : sync vulkan-shaders (#0) Georgi Gerganov 2024-07-27 18:08:31 +03:00
  • 56f20aa25d
    scripts : sync ggml-aarch64 sources Georgi Gerganov 2024-07-27 17:19:35 +03:00
  • 345c8c0c87
    ggml : add missing semicolon (#0) Georgi Gerganov 2024-07-27 15:57:09 +03:00
  • ae7985cd7b
    sync : ggml Georgi Gerganov 2024-07-27 15:53:48 +03:00
  • a05ca93697
    ggml : loop tiling optimizations for scalar path (ggml/898) Mahesh Madhav 2024-07-25 00:54:08 -07:00
  • 9f77d899b7
    ggml: add support for float16 input tensors in pooling operations (ggml/895) Ivan Filipov 2024-07-22 14:32:02 +03:00
  • 203b7f1531
    vulkan : initialize vk_buffer_struct members to VK_NULL_HANDLE (ggml/893) Tony Wasserka 2024-07-20 20:49:44 +02:00
  • d2b851bfa1
    cmake : only enable GGML_NATIVE and x86 flags if not crosscompiling (ggml/885) Borislav Stanimirov 2024-07-12 17:24:20 +03:00
  • c12b6e8ee7
    ggml : remove unnecessary UNUSED macro call (ggml/880) Daniel Bevenius 2024-07-08 12:03:42 +02:00
  • b5e95468b1
    llama : add support for llama 3.1 rope scaling factors (#8676) Jeffrey Morgan 2024-07-27 05:03:45 -07:00
  • 92090eca21
    llama : add function for model-based max number of graph nodes (#8622) Georgi Gerganov 2024-07-27 14:59:29 +03:00
  • 9d03d085dd
    common : add --no-warmup option for main/llama-cli (#8712) Daniel Bevenius 2024-07-27 12:45:02 +02:00
  • bfb4c74981
    cann: Fix Multi-NPU execution error (#8710) wangshuai09 2024-07-27 16:36:44 +08:00
  • 2b1f616b20
    ggml : reduce hash table reset cost (#8698) slaren 2024-07-27 04:41:55 +02:00
  • 01245f5b16
    llama : fix order of parameters (#8706) Judd 2024-07-26 16:38:12 +08:00
  • 01aec4a631
    server : add Speech Recognition & Synthesis to UI (#8679) Yaiko 2024-07-25 18:10:16 -04:00
  • 41cd47caab
    examples : export-lora : fix issue with quantized base models (#8687) Xuan Son Nguyen 2024-07-25 23:49:39 +02:00
  • 49ce0ab6d4
    ggml: handle ggml_init failure to fix NULL pointer deref (#8692) DavidKorczynski 2024-07-25 22:23:05 +01:00
  • 4226a8d10e
    llama : fix build + fix fabs compile warnings (#8683) Georgi Gerganov 2024-07-25 19:57:31 +03:00
  • bf5a81df37
    ggml : fix build on Windows with Snapdragon X (#8531) Andreas (Andi) Kunar 2024-07-25 18:01:00 +02:00
  • 88954f7fbd
    tests : fix printfs (#8068) Georgi Gerganov 2024-07-25 18:57:44 +03:00
  • ed67bcb24f
    [SYCL] fix multi-gpu issue on sycl (#8554) Chen Xi 2024-07-25 11:45:18 +00:00
  • eddcb5238b
    ggml : add and use ggml_cpu_has_llamafile() (#8664) Georgi Gerganov 2024-07-25 12:37:42 +03:00
  • be6d7c0791
    examples : remove finetune and train-text-from-scratch (#8669) Xuan Son Nguyen 2024-07-25 10:39:04 +02:00
  • 4b0eff3df5
    docs : Quantum -> Quantized (#8666) Ujjawal Panchal 2024-07-25 13:43:27 +05:30
  • 8a4bad50a8
    llama: use sliding window for phi3 (#8627) Fan Shupei 2024-07-25 15:21:09 +08:00
  • 68504f0970
    readme : update games list (#8673) MorganRO8 2024-07-24 12:48:00 -04:00
  • f19bf99c01
    Build Llama SYCL Intel with static libs (#8668) Joe Todd 2024-07-24 14:36:00 +01:00
  • 3a7ac5300a
    readme : update UI list [no ci] (#8505) Thorsten Sommer 2024-07-24 14:52:30 +02:00
  • 96952e7181
    llama : fix llama_chat_format_single for mistral (#8657) Xuan Son Nguyen 2024-07-24 13:48:46 +02:00
  • 79167d9e49
    Re-add erroneously removed -fsycl from GGML_EXTRA_LIBS (#8667) Joe Todd 2024-07-24 11:55:26 +01:00
  • b115105f05
    add llama_lora_adapter_clear (#8653) Xuan Son Nguyen 2024-07-24 11:25:19 +02:00
  • de280085e7
    examples : Fix llama-export-lora example (#8607) Xuan Son Nguyen 2024-07-23 23:48:37 +02:00
  • b841d07408
    server : fix URL.parse in the UI (#8646) Vali Malinoiu 2024-07-23 17:37:42 +03:00
  • 64cf50a0ed
    sycl : Add support for non-release DPC++ & oneMKL (#8644) Joe Todd 2024-07-23 14:58:37 +01:00
  • 938943cdbf
    llama : move vocab, grammar and sampling into separate files (#8508) Georgi Gerganov 2024-07-23 13:10:17 +03:00
  • 751fcfc6c3
    Vulkan IQ4_NL Support (#8613) 0cc4m 2024-07-23 10:56:49 +02:00
  • 46e47417aa
    Allow all RDNA2 archs to use sdot4 intrinsic (#8629) Jeroen Mostert 2024-07-23 10:50:40 +02:00
  • e7e6487ba0
    contrib : clarify PR squashing + module names (#8630) Georgi Gerganov 2024-07-23 11:28:38 +03:00
  • 063d99ad11
    [SYCL] fix scratch size of softmax (#8642) luoyu-intel 2024-07-23 07:43:28 +00:00
  • 081fe431aa
    llama : fix codeshell support (#8599) Keke Han 2024-07-23 00:43:43 +08:00
  • d94c6e0ccb
    llama : add support for SmolLm pre-tokenizer (#8609) Jason Stillerman 2024-07-22 10:43:01 -04:00
  • 566daa5a5b
    *.py: Stylistic adjustments for python (#8233) Jiří Podivín 2024-07-22 15:44:53 +02:00
  • 6f11a83e4e
    llama : allow overrides for tokenizer flags (#8614) Georgi Gerganov 2024-07-22 13:33:22 +03:00
  • e093dd2382
    tests : re-enable tokenizer tests (#8611) Georgi Gerganov 2024-07-22 13:32:49 +03:00
  • 50e05353e8
    llama : add Mistral Nemo inference support (#8604) Douglas Hanley 2024-07-22 03:06:17 -05:00
  • 628154492a
    server : update doc to clarify n_keep when there is bos token (#8619) Jan Boon 2024-07-22 16:02:09 +08:00
  • 04bab6b7da
    ggml: fix compile error for RISC-V (#8623) Mark Zhuang 2024-07-22 15:56:45 +08:00
  • b7c11d36e6
    examples: fix android example cannot be generated continuously (#8621) devojony 2024-07-22 14:54:42 +08:00
  • 45f2c19cc5
    flake.lock: Update (#8610) Georgi Gerganov 2024-07-21 16:45:10 +03:00
  • 22f281aa16
    examples : Rewrite pydantic_models_to_grammar_examples.py (#8493) M-A 2024-07-20 22:09:17 -04:00
  • 328884f421
    gguf-py : fix some metadata name extraction edge cases (#8591) compilade 2024-07-20 21:58:49 -04:00
  • c69c63039c
    convert_hf : fix Gemma v1 conversion (#8597) compilade 2024-07-20 21:53:01 -04:00
  • 69c487f4ed
    CUDA: MMQ code deduplication + iquant support (#8495) Johannes Gäßler 2024-07-20 22:25:26 +02:00
  • 07283b1a90
    gguf : handle null name during init (#8587) Georgi Gerganov 2024-07-20 17:15:42 +03:00
  • 940362224d
    llama : add support for Tekken pre-tokenizer (#8579) Michael Coppola 2024-07-20 09:43:51 -04:00
  • 69b9945b44
    llama.swiftui: fix end of generation bug (#8268) Huifeng Ou 2024-07-20 09:09:37 -04:00
  • c3776cacab
    gguf_dump.py: fix markddown kv array print (#8588) Brian 2024-07-20 17:35:25 +10:00
  • 87e397d00b
    ggml : fix quant dot product with odd number of blocks (#8549) slaren 2024-07-19 17:17:27 +02:00
  • 57b1d4f9eb
    convert-*.py: remove add_name from ChatGLMModel class (#8590) Brian 2024-07-20 00:04:38 +10:00
  • d197545530
    llama : bump max layers from 256 to 512 (#8530) Georgi Gerganov 2024-07-19 16:50:47 +03:00
  • be0cfb4175
    readme : fix server badge Georgi Gerganov 2024-07-19 14:34:55 +03:00
  • b57eb9ca4f
    ggml : add friendlier error message to fopen errors (#8575) Clint Herron 2024-07-19 07:05:45 -04:00
  • f299aa98ec
    fix: typo of chatglm4 chat tmpl (#8586) Frank Mai 2024-07-19 17:44:41 +08:00
  • 3d0e4367d9
    convert-*.py: add general.name kv override (#8571) Brian 2024-07-19 17:51:51 +10:00
  • a15ef8f8a0
    CUDA: fix partial offloading for ne0 % 256 != 0 (#8572) Johannes Gäßler 2024-07-18 23:48:47 +02:00
  • 705b7ecf60
    cmake : install all ggml public headers (#8480) 65a 2024-07-18 07:47:12 -07:00
  • 0d2c7321e9
    server: use relative routes for static files in new UI (#8552) Eric Zhang 2024-07-18 18:43:49 +08:00
  • 672a6f1018
    convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor (#7499) Brian 2024-07-18 20:40:15 +10:00
  • 3807c3de04
    server : respect --special cli arg (#8553) RunningLeon 2024-07-18 16:06:22 +08:00
  • e02b597be3
    lookup: fibonacci hashing, fix crashes (#8548) Johannes Gäßler 2024-07-17 23:35:44 +02:00
  • b3283448ce
    build : Fix docker build warnings (#8535) (#8537) Al Mochkin 2024-07-17 20:21:55 +02:00
  • 30f80ca0bc
    CONTRIBUTING.md : remove mention of noci (#8541) Brian 2024-07-18 00:57:06 +10:00
  • 1bdd8ae19f
    [CANN] Add Ascend NPU backend (#6035) hipudding 2024-07-17 19:23:50 +08:00
  • da3913d8f9
    batched: fix n_predict parameter (#8527) Masaya, Kato 2024-07-17 16:34:28 +09:00
  • d65a8361fe
    llama : disable context-shift for DeepSeek v2 (#8501) Georgi Gerganov 2024-07-17 10:32:59 +03:00
  • 5e116e8dd5
    make/cmake: add missing force MMQ/cuBLAS for HIP (#8515) Johannes Gäßler 2024-07-16 21:20:59 +02:00
  • 1666f92dcd
    gguf-hash : update clib.json to point to original xxhash repo (#8491) Brian 2024-07-16 17:14:16 +10:00
  • 37b12f92ab
    export-lora : handle help argument (#8497) Steve Bonds 2024-07-16 00:04:45 -07:00
  • 0efec57787
    llama : valign + remove unused ftype (#8502) Georgi Gerganov 2024-07-16 10:00:30 +03:00
  • 7acfd4e8d5
    convert_hf : faster lazy safetensors (#8482) compilade 2024-07-15 23:13:10 -04:00
  • 97bdd26eee
    Refactor lora adapter support (#8332) Xuan Son Nguyen 2024-07-15 20:50:47 +02:00
  • 4db8f60fe7
    fix ci (#8494) Xuan Son Nguyen 2024-07-15 19:23:10 +02:00
  • 8fac431b06
    ggml : suppress unknown pragma 'GCC' on windows (#8460) Daniel Bevenius 2024-07-15 14:48:17 +02:00
  • f17f39ff9c
    server: update README.md with llama-server --help output [no ci] (#8472) M-A 2024-07-15 08:04:56 -04:00
  • 9104bc20ed
    common : add --no-cont-batching arg (#6358) Georgi Gerganov 2024-07-15 14:54:58 +03:00
  • fc690b018e
    docs: fix links in development docs [no ci] (#8481) NikolaiLyssogor 2024-07-15 04:46:39 -07:00
  • 16bdfa42ac
    [SYCL] add concat through dim 1/2 (#8483) Meng, Hengyu 2024-07-15 19:32:15 +08:00