Commit Graph

  • e1675d133c
    llama : avoid fprintf in favor of LLAMA_LOG (#3538) Georgi Gerganov 2023-10-17 22:34:26 +03:00
  • 8402566a7c
    readme : update hot-topics & models, detail windows release in usage (#3615) BarfingLemurs 2023-10-17 14:13:21 -04:00
  • 40e5ce054f
    CLBlast: Fix temporary buffer size for f16 conversion (wsize) shibe2 2023-10-11 21:30:06 +04:00
  • a5e8c1d8c7
    train-text-from-scratch : fix assert failure in ggml-alloc (#3618) slaren 2023-10-17 19:00:58 +02:00
  • e74c705e15
    editorconfig : remove trailing spaces Georgi Gerganov 2023-10-17 19:52:53 +03:00
  • 3ad1e3f1a1
    server : documentation of JSON return value of /completion endpoint (#3632) coezbek 2023-10-17 18:51:02 +02:00
  • 1142013da4
    save-load-state : fix example + add ci test (#3655) Georgi Gerganov 2023-10-17 19:12:46 +03:00
  • 5fe268a4d9
    readme : add Aquila2 links (#3610) ldwang 2023-10-17 23:52:33 +08:00
  • 1a159553f9
    tokenizer : special token handling (#3538) staviq 2023-10-17 17:11:01 +02:00
  • 281ef73c25
    k-quants : fix quantization ranges (#3646) Georgi Gerganov 2023-10-17 09:19:28 +03:00
  • 940efa95fe
    llava : fix tokenization to not add bos between image embeddings and user prompt (#3645) Georgi Gerganov 2023-10-16 23:58:00 +03:00
  • 11bff29045
    MPT : support GQA for replit-code-v1.5 (#3627) cebtenzzre 2023-10-15 02:32:06 -04:00
  • 11dc1091f6
    Honor -ngl option for Cuda offloading in llava (#3621) M. Yusuf Sarıgöz 2023-10-14 13:52:44 +03:00
  • 2a4bcbacea
    llama : remove n_threads from llama_decode_internal (#3614) Daniel Bevenius 2023-10-13 12:33:16 +02:00
  • 424b6381c4
    ggml : add context enumeration functions (#3605) slaren 2023-10-13 12:23:10 +02:00
  • 1e0e873c37
    CLBlast: Fix matrix-vector multiplication (#3544) shibe2 2023-10-12 23:59:47 +04:00
  • 370359e5ba
    examples: support LLaVA v1.5 (multimodal model) (#3436) M. Yusuf Sarıgöz 2023-10-12 18:23:18 +03:00
  • 9e24cc6e2e
    docs : fix typo GOMP_CPU_AFFINITY (#3597) uint256_t 2023-10-12 22:36:16 +09:00
  • d28e572c02
    cmake : fix add_compile_options on macOS Georgi Gerganov 2023-10-12 14:31:05 +03:00
  • f3040beaab
    typo : it is --n-gpu-layers not --gpu-layers (#3592) Ian Scrivener 2023-10-12 22:10:50 +11:00
  • 1a8c8795d6
    ci : check if there is enough VRAM (#3596) Georgi Gerganov 2023-10-12 13:44:56 +03:00
  • b016596d90
    server : add completion mode (no chat) (#3582) Aarni Koskela 2023-10-12 15:51:53 +09:00
  • 6b3ae4da92
    prompts : add mnemonics.txt Georgi Gerganov 2023-10-12 09:35:19 +03:00
  • 57dd55e2c7
    server : fix kv cache management (#3588) Georgi Gerganov 2023-10-12 09:29:04 +03:00
  • b8fe4b5cc9
    main : fix session loading bug (#3400) Georgi Gerganov 2023-10-11 23:55:08 +03:00
  • a8bdd65525
    server : add parameter -tb N, --threads-batch N (#3584) Michael Coppola 2023-10-11 15:42:22 -04:00
  • 70c29da118
    common : fix mirostat state when using multiple sequences (#3543) Kerfuffle 2023-10-11 13:35:46 -06:00
  • 8c70a5ff25
    batched : add bench tool (#3545) Georgi Gerganov 2023-10-11 21:25:33 +03:00
  • 24ba3d829e
    examples : add batched.swift + improve CI for swift (#3562) Zane Shannon 2023-10-11 04:14:05 -07:00
  • 9f6ede19f3
    Add MPT model to supported models in README.md (#3574) Galunid 2023-10-11 01:02:49 +02:00
  • 233fc1c69f
    Minor improvements in GPT2 tokenizer (#3567) goerch 2023-10-10 18:59:52 +02:00
  • c5b49360d0
    readme : add bloom (#3570) Xingchen Song(宋星辰) 2023-10-11 00:28:50 +08:00
  • 02d2875def
    llm : add bloom models (#3553) Xingchen Song(宋星辰) 2023-10-10 22:48:21 +08:00
  • 0aa6595ae0
    swift : improvements and fixes (#3564) Jhen-Jie Hong 2023-10-10 06:31:13 -05:00
  • f5f9121de1
    llm : add MPT support (#3417) Jan Ploski 2023-10-10 09:50:23 +02:00
  • 11ea5c7d96
    infill. : fix tokenization (#3508) vvhg1 2023-10-10 09:31:21 +02:00
  • 95bd60a0a6
    ggml-alloc : fix assert in debug builds (#3555) slaren 2023-10-09 14:44:58 +02:00
  • fcca0a7004
    refact : fix convert script + zero out KV cache to avoid nans (#3523) Georgi Gerganov 2023-10-09 14:32:17 +03:00
  • dcc09d2596
    metal : do not use mul_mm kernels when ne00 < 64 (#3542) Georgi Gerganov 2023-10-09 14:28:27 +03:00
  • db3abcc114
    sync : ggml (ggml-backend) (#3548) Georgi Gerganov 2023-10-08 20:19:14 +03:00
  • eee42c670e
    ci : add Zig CI/CD and fix build (#2996) Matheus C. França 2023-10-08 10:59:20 -03:00
  • 8e6716a102
    api_like_OAI.py : compat with Microsoft Guidance (#2746) Ryder Wishart 2023-10-08 03:55:58 -07:00
  • 9c38d181d4
    api_like_OAI.py : simplify function (#2796) arcrank 2023-10-08 06:52:57 -04:00
  • a1202a31ed
    k-quants : fix comments about block sizing (#3499) Johannes Rudolph 2023-10-08 12:21:19 +02:00
  • 94e502dfb7
    ci : enable on obj-c changes + fix metal build (#3540) Georgi Gerganov 2023-10-08 11:24:50 +03:00
  • 7d8b24932f
    zig : fix build by introducing train.cpp (#3539) Luo Tian 2023-10-08 16:24:01 +08:00
  • b0ec5218c3
    metal : support MTLGPUFamily < Apple7, formatting, style (#3524) Georgi Gerganov 2023-10-08 10:01:53 +03:00
  • 63d3b06a43
    llama : fix missing break in Persimmon arch case statements (#3535) Kerfuffle 2023-10-07 23:22:17 -06:00
  • a16e89cec8
    Fix trying to strip newline from empty prompt and cfg prompt file content (#3534) Kerfuffle 2023-10-07 15:31:41 -06:00
  • 4d03833211
    gguf.py : fix CI for publishing GGUF package (#3532) M. Yusuf Sarıgöz 2023-10-07 22:14:10 +03:00
  • c47066d833
    py : change version of numpy requirement to 1.24.4 (#3515) Tom C 2023-10-07 02:56:15 -07:00
  • f1782c68de
    quantize : fail fast on write errors (#3521) cebtenzzre 2023-10-07 04:41:52 -04:00
  • c26765a0a1
    metal : support default.metallib load & reuse code for swift package (#3522) Jhen-Jie Hong 2023-10-07 03:40:27 -05:00
  • 0e797c2fc5
    llm : support Adept Persimmon 8B (#3410) Phillip Kravtsov 2023-10-07 00:12:43 -07:00
  • 3a716b4dae
    Fix for #3454 (#3455) goerch 2023-10-07 06:57:01 +02:00
  • 1faaae8c2b
    readme : update models, cuda + ppl instructions (#3510) BarfingLemurs 2023-10-06 15:13:36 -04:00
  • cb13d73a72
    server : docs fix default values and add n_probs (#3506) Mihai 2023-10-06 21:39:33 +03:00
  • 9ca79d5cbb
    kv cache slot search improvements (#3493) Kerfuffle 2023-10-06 10:10:13 -06:00
  • 0c731ca403
    prompts : fix editorconfig checks after #3416 Georgi Gerganov 2023-10-06 16:35:55 +03:00
  • a8777ad84e
    parallel : add option to load external prompt file (#3416) pudepiedj 2023-10-06 14:16:38 +01:00
  • 97af49fa39
    server : reuse llama_sample_token common util (#3494) Jhen-Jie Hong 2023-10-06 07:44:24 -05:00
  • 16820a5a0d
    llama : correct hparams comparison (#3446) l3utterfly 2023-10-06 18:47:59 +08:00
  • 04b2f4386e
    ci : fix xcodebuild destinations (#3491) Jhen-Jie Hong 2023-10-06 05:36:43 -05:00
  • 48edda30ee
    convert : update Falcon script for new HF config (#3448) cebtenzzre 2023-10-05 15:00:34 -04:00
  • 45eba9369f
    build : use std::make_tuple() for compatibility with older GCC versions (#3488) Kenvix ⭐ 2023-10-06 01:16:39 +08:00
  • acec9eaaa9
    common : process escape sequences in reverse prompts (#3461) staviq 2023-10-05 18:17:29 +02:00
  • e2583cbc29
    CLBlast: Fix handling of on-device tensor data shibe2 2023-10-05 15:57:03 +04:00
  • e8b8d32e86
    server : fix incorrect num_tokens_predicted (#3480) Jhen-Jie Hong 2023-10-05 09:02:55 -05:00
  • 8f3a642ec1
    swift : disable ACCELERATE_NEW_LAPACK (#3481) Jhen-Jie Hong 2023-10-05 09:00:07 -05:00
  • 0745384449
    ci : add swift build via xcodebuild (#3482) Jhen-Jie Hong 2023-10-05 08:56:21 -05:00
  • 019ba1dcd0
    convert : fix Baichuan2 models by using vocab size in config.json (#3299) Kerfuffle 2023-10-04 08:20:28 -06:00
  • beabc8cfb0
    readme : add project status link Georgi Gerganov 2023-10-04 16:50:44 +03:00
  • 0d152b37fe
    ggml : fix build after #3329 Georgi Gerganov 2023-10-04 16:25:41 +03:00
  • f8c90cdbaa
    llm : add Refact model (#3329) ds5t5 2023-10-04 06:23:39 -07:00
  • f93af02488
    sync : ggml (conv 1d + 2d updates, UB fixes) (#3468) Georgi Gerganov 2023-10-04 15:29:58 +03:00
  • f72f8f22c9
    finetune : readme fix typo (#3465) Merrick Christensen 2023-10-04 00:33:13 -06:00
  • 79f34abddb
    ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (#3453) Tameem 2023-10-03 23:38:19 +05:00
  • 8186242b6d
    main : consistent prefix/suffix coloring (#3425) h-h-h-h 2023-10-03 20:16:15 +02:00
  • ac2219fef3
    llama : fix session saving/loading (#3400) Georgi Gerganov 2023-10-03 21:04:01 +03:00
  • 48be797ffb
    llama : expose model's rope_freq_scale in the API (#3418) Alex Klinkhamer 2023-10-03 10:09:28 -07:00
  • f56e1baec3
    metal : alibi for arbitrary number of heads (#3426) Jiahao Li 2023-10-04 00:55:21 +08:00
  • 017efe899d
    cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273) Eve 2023-10-03 16:53:15 +00:00
  • ff5a3f0c09
    Work on the BPE tokenizer (#3252) goerch 2023-10-03 09:16:26 +02:00
  • 1c84003c08
    convert : fix vocab size when not defined in hparams (#3421) cebtenzzre 2023-10-02 18:07:24 -04:00
  • e78f0b0d05
    cmake : increase minimum version for add_link_options (#3444) cebtenzzre 2023-10-02 15:38:43 -04:00
  • 665018c749
    CLBlast: Add broadcast support for matrix multiplication (#3402) shibe2 2023-10-02 23:26:15 +04:00
  • 29a404a951
    gguf : add BERT, MPT, and GPT-J arch info (#3408) cebtenzzre 2023-10-02 15:20:28 -04:00
  • 0fe321031a
    gguf : general usability improvements (#3409) cebtenzzre 2023-10-02 14:58:46 -04:00
  • 9476b01226
    cmake : make CUDA flags more similar to the Makefile (#3420) cebtenzzre 2023-10-02 09:16:50 -04:00
  • a03ce38455
    finetune : fix #3404 (#3437) xaedes 2023-10-02 15:15:45 +02:00
  • a847676984
    metal : set log callback before initializing (#3427) Adrian 2023-10-02 03:49:59 -07:00
  • 095231dfd3
    cmake : fix transient definitions in find pkg (#3411) bandoti 2023-10-02 06:51:49 -03:00
  • ea55295a74
    docker : ignore Git files (#3314) Kevin Ji 2023-10-02 04:53:53 -04:00
  • c97f01c362
    infill : add new example + extend server API (#3296) vvhg1 2023-10-02 09:42:02 +02:00
  • f5ef5cfb18
    ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412) slaren 2023-09-30 18:12:57 +02:00
  • 40e07a60f9
    llama.cpp : add documentation about rope_freq_base and scale values (#3401) slaren 2023-09-29 18:42:32 +02:00
  • bc34dd4f5b
    train : fix KQ_pos allocation (#3392) Georgi Gerganov 2023-09-29 19:05:18 +03:00
  • 2777a84be4
    llama : quantize up to 31% faster on Linux and Windows with mmap (#3206) Cebtenzzre 2023-09-29 09:48:45 -04:00
  • 0a4a4a0982
    readme : update hot topics + model links (#3399) BarfingLemurs 2023-09-29 08:50:35 -04:00
  • 569550df20
    readme : add link to grammars app (#3388) Andrew Duffy 2023-09-29 07:15:57 -04:00