Commit Graph

  • 611363ac79 scripts : add pipefail Georgi Gerganov 2023-08-29 10:50:30 +03:00
  • 95b6e5212f added struct to llama_dump_timing_info_yaml's llama_context (#2857) b1109 Marcus Dunn 2023-08-28 23:33:27 -07:00
  • 44c117f41e train : mem usage and other improvements (#2439) b1108 xaedes 2023-08-28 21:51:47 +02:00
  • 43033b7bb4 llama-bench : set locale to utf8 (#2832) b1107 slaren 2023-08-28 19:19:18 +02:00
  • 6b73ef1201 YAML result logging + preset script (#2657) b1106 Johannes Gäßler 2023-08-28 17:59:39 +02:00
  • 75fafcbccc make : fix tests build (#2855) b1105 alonfaraj 2023-08-28 18:38:35 +03:00
  • be475f60af llama.cpp : fix wrong vsnprintf call in MS compiler (#2856) b1104 grahameth 2023-08-28 17:38:12 +02:00
  • 3af6b86301 ggml : tiny ggml_vec_dot_q4_K_q8_K AVX2 improvement (#2819) b1103 Ronny Brendel 2023-08-28 14:51:08 +02:00
  • cec628e7fd temporarily disable broken 512 build ci_cublas_linux-b1104-cec628e Green Sky 2023-08-26 01:54:14 +02:00
  • 0e1730a90e ci: add linux binaries to release build Green Sky 2023-05-05 00:01:30 +02:00
  • 35feac6560 ggml : sync (mem align to header + conv_transpose_2d fixes + ggml_alloc) (#2852) b1102 Georgi Gerganov 2023-08-28 14:24:53 +03:00
  • 92b1bbd2ec CUDA: fix RoPE asserts, block sizes (#2833) b1101 Johannes Gäßler 2023-08-28 13:23:55 +02:00
  • dd0dc366da llama.h : add missing struct keyword for C compat in callback type (#2847) b1100 igarnier 2023-08-28 10:19:59 +02:00
  • f55538c3cc metal : fix memory leak (#2762) b1099 Georgi Gerganov 2023-08-28 10:59:08 +03:00
  • ebcee207b6 quantize : make output filename optional again (#2823) b1098 Cebtenzzre 2023-08-28 02:32:25 -04:00
  • 3e8ff47af6 devops : added systemd units and set versioning to use date. (#2835) JohnnyB 2023-08-28 07:31:24 +01:00
  • 103cfafc77 gguf : fix strings to not be null-terminated (#2839) b1096 Georgi Gerganov 2023-08-27 21:50:22 +03:00
  • c10704d01e llama : fix MPI threads (close #2827) b1095 Georgi Gerganov 2023-08-27 18:55:41 +03:00
  • 230d46c723 examples : update llama2.c converter to read vocab and write models in GGUF format (#2751) b1094 Olivier Chafik 2023-08-27 15:13:31 +01:00
  • 463173a6c0 llama : speedup tokenization (#2831) b1093 Kawrakow 2023-08-27 16:50:33 +03:00
  • eaa13a48ff falcon : fix CUDA inference by making K and Q contiguous (#2830) b1092 Georgi Gerganov 2023-08-27 16:40:48 +03:00
  • da7455d046 readme : fix headings Georgi Gerganov 2023-08-27 15:52:34 +03:00
  • 25423e9185 scripts : helper convert script Georgi Gerganov 2023-08-27 15:24:40 +03:00
  • a6d1189fdd k_quants tuning for Falcon-7b (#2816) b1089 Kawrakow 2023-08-27 15:19:59 +03:00
  • c48c5bb0b0 readme : update hot topics Georgi Gerganov 2023-08-27 14:44:35 +03:00
  • d0cee0d36d gguf : add 64-bit support (GGUF v2) (#2821) b1087 Georgi Gerganov 2023-08-27 14:19:54 +03:00
  • edd4c14817 llama : more tokenizer fixes (#2810) b1086 Georgi Gerganov 2023-08-27 14:19:19 +03:00
  • 1591e2e590 ggml : detect SSSE3 (#2825) b1085 Przemysław Pawełczyk 2023-08-27 10:10:25 +02:00
  • 789c8c945a ci : add LoRA test to CI (#2650) slaren 2023-08-27 09:03:27 +02:00
  • c1ac54b77a server : add /detokenize endpoint (#2802) b1083 Bruce MacDonald 2023-08-26 16:11:45 -07:00
  • 33a5517d87 llama.cpp : print gguf version gguf-64bit klosax 2023-08-26 23:56:48 +02:00
  • b61b170005 gguf : fix typo Georgi Gerganov 2023-08-26 23:14:19 +03:00
  • 730d9c681e convert.py : advanced option (#2753) Kerfuffle 2023-08-26 14:13:36 -06:00
  • 09b6da741e gguf.py : string len uint64_t and n_dims uint32_t klosax 2023-08-26 21:53:56 +02:00
  • 6d369a1558 gguf : update all counts to 64-bit Georgi Gerganov 2023-08-26 22:41:55 +03:00
  • bc3eaf262e gguf.py : string lengths uint32_t klosax 2023-08-26 21:29:36 +02:00
  • be726c57ee gguf.py : uint64_t on all lengths, sizes and counts, enums still uint32_t klosax 2023-08-26 21:23:12 +02:00
  • ba335ff5b2 gguf.py : bump GGUF version Georgi Gerganov 2023-08-26 22:13:05 +03:00
  • 3656b3ce81 gguf : v1 backwards comp Georgi Gerganov 2023-08-26 22:11:42 +03:00
  • 4f0547e4a3 gguf : add support for 64-bit (no backwards comp yet) Georgi Gerganov 2023-08-26 22:05:14 +03:00
  • 5f1fffd2d4 gguf : bump version to 2 Georgi Gerganov 2023-08-26 21:52:27 +03:00
  • c7d92e6dfe llama : use Unicode Escape Sequence to replace encoded characters (#2814) b1081 Tim Miller 2023-08-27 03:27:07 +09:00
  • 61d1a2895e flake.nix : add rocm support and cleanup (#2808) Tungsten842 2023-08-26 20:19:44 +02:00
  • 741ca7dd1c llama : move #includes out of _GNU_SOURCE conditional (#2817) b1079 Cebtenzzre 2023-08-26 14:17:51 -04:00
  • 72f895c923 main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (#1528) b1078 Dr. Tom Murphy VII Ph.D 2023-08-26 14:12:56 -04:00
  • 50526f37eb llama : use std::abs in llama_sample_tail_free (#2800) b1077 Cebtenzzre 2023-08-26 12:53:52 -04:00
  • 04f4b1eb10 k-quants : remove unnecessary tensor shape restrictions (#2811) b1076 Georgi Gerganov 2023-08-26 17:37:35 +03:00
  • 7592375403 Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (#2807) b1075 Kawrakow 2023-08-26 17:27:49 +03:00
  • 771551a793 Fix HellaSwag (#2805) b1074 Kawrakow 2023-08-26 16:48:53 +03:00
  • f305bad11e flake : build llama.cpp on Intel with nix (#2795) Volodymyr Vitvitskyi 2023-08-26 14:25:39 +01:00
  • a2ca4e9de9 Handle null rope scaling value (#2793) Nigel Bosch 2023-08-26 07:11:17 -05:00
  • 2ba83c8685 Fix spm whitespaces (#2806) b1071 klosax 2023-08-26 13:45:53 +02:00
  • bae5c5f679 examples : skip unnecessary external lib in server README.md how-to (#2804) lon 2023-08-26 10:07:43 +02:00
  • d34472c124 Fix HellaSwag ik/fix_hellaswag Iwan Kawrakow 2023-08-26 10:55:39 +03:00
  • 5562e3e6fa temporarily disable broken 512 build ci_cublas_linux-b1071-5562e3e Green Sky 2023-08-26 01:54:14 +02:00
  • 20f7f4c8de ci: add linux binaries to release build Green Sky 2023-05-05 00:01:30 +02:00
  • 232caf3c15 llama : fix struct decl (#2790) b1069 Marcus Dunn 2023-08-25 09:17:15 -07:00
  • d046dcee08 Faster perplexity computation (#2786) b1068 Kawrakow 2023-08-25 19:05:02 +03:00
  • c82742ac9c llama : add llama_beam_search() (#2267) b1067 Matt Pulver 2023-08-25 11:18:48 -04:00
  • 28b2c996ca convert.py : Get rope scale from HuggingFace models (#2772) Nigel Bosch 2023-08-25 09:41:52 -05:00
  • 154725c543 llama-bench : add model sizes (#2771) b1065 slaren 2023-08-25 15:16:19 +02:00
  • 12e2e33a97 convert.py : export rope freq_base when converting CodeLlama from an HF model (#2773) slaren 2023-08-25 14:08:53 +02:00
  • 29674ab4e8 server : display token probabilities in the UI (#2489) b1063 Jhen-Jie Hong 2023-08-25 18:32:45 +08:00
  • 5439a0ab57 ci : pip install gguf in editable mode (#2782) Georgi Gerganov 2023-08-25 13:03:25 +03:00
  • 8194cd8772 gguf : export objects to user code (#2780) M. Yusuf Sarıgöz 2023-08-25 12:43:41 +03:00
  • 6bbc598a63 ROCm Port (#1087) b1060 Henri Vasserman 2023-08-25 12:09:42 +03:00
  • 3f460a2b72 cuda : add RoPE kernel for mode == 2 (NeoX) (#2760) b1059 Georgi Gerganov 2023-08-25 11:55:59 +03:00
  • 87e3733f24 gguf : make gguf pip-installable M. Yusuf Sarıgöz 2023-08-25 09:26:05 +03:00
  • 0248ca811e gguf : add notes for tests gguf-pip M. Yusuf Sarıgöz 2023-08-25 09:08:05 +03:00
  • 2897926d90 gguf : update readme with build notes M. Yusuf Sarıgöz 2023-08-25 09:06:33 +03:00
  • 8798aea247 gguf : update readme with build notes M. Yusuf Sarıgöz 2023-08-25 09:02:36 +03:00
  • b91ad7f461 ggml-alloc : enlarge size of parse_seq (#2776) b1057 Shouzheng Liu 2023-08-25 01:58:00 -04:00
  • 3e98cbe76e Merge branch 'master' into gguf-pip M. Yusuf Sarıgöz 2023-08-25 08:50:40 +03:00
  • 87338093d6 requirements : add gguf M. Yusuf Sarıgöz 2023-08-25 08:47:19 +03:00
  • b8a777e77b Merge branch 'master' into gguf-pip M. Yusuf Sarıgöz 2023-08-25 08:38:11 +03:00
  • 2e5f70a25f Added enum to llama_token_get_type return type (#2774) b1056 Marcus Dunn 2023-08-24 14:49:30 -07:00
  • d0f77b1353 convert.py : try to determine n_ctx automatically for CodeLlama (#2770) slaren 2023-08-24 21:10:39 +02:00
  • 0d3094f0c7 gguf : add rope_freq_base parameter for CodeLlama (#2769) b1054 slaren 2023-08-24 20:04:05 +02:00
  • 01f2224682 falcon : write file type Georgi Gerganov 2023-08-24 19:58:30 +03:00
  • 38b16dfca6 metal : bug-fix when enable ggml-alloc (#2757) b1052 Shouzheng Liu 2023-08-24 12:27:25 -04:00
  • 8f8c28e89c convert : auto-determine model name based on dir + scripts update Georgi Gerganov 2023-08-24 19:26:19 +03:00
  • 7694adda8d Fix for main example getting stuck when -n -2 and --interactive (#2767) b1050 Kerfuffle 2023-08-24 10:11:13 -06:00
  • fea95c682d fix convert.py for codellama, add llama 34B to the list of recognized models (#2768) b1049 slaren 2023-08-24 17:44:11 +02:00
  • ef955fbd23 Tag release with build number (#2732) b1048 DannyDaemonic 2023-08-24 06:58:02 -07:00
  • d67777c202 metal : add Q8_0 support (#2763) Georgi Gerganov 2023-08-24 16:19:57 +03:00
  • c3e53b421a llama : escape all U+2581 in a string (#2750) b1047 Georgi Gerganov 2023-08-24 12:26:01 +03:00
  • 0288361b65 gguf : fix line endings M. Yusuf Sarıgöz 2023-08-24 09:26:13 +03:00
  • 344f6e373b gguf: prepare as Pip package M. Yusuf Sarıgöz 2023-08-24 09:09:52 +03:00
  • 5dd870574e gguf: prepare as Pip package M. Yusuf Sarıgöz 2023-08-24 09:08:19 +03:00
  • 050046fa45 gitignore : add dist and rm pyproject.toml M. Yusuf Sarıgöz 2023-08-24 09:07:42 +03:00
  • 6e91a1b070 llama : fix grammar sometimes generating null char (#2756) b1046 Evan Jones 2023-08-24 00:07:13 -04:00
  • 44d5462b5c readme : fix link Georgi Gerganov 2023-08-23 23:44:19 +03:00
  • c7868b0753 minor : fix trailing whitespace Georgi Gerganov 2023-08-23 23:43:00 +03:00
  • 79da24b58c readme : update hot topics Georgi Gerganov 2023-08-23 23:41:16 +03:00
  • cf658adc83 llm : add Falcon support (#2717) master-cf658ad Georgi Gerganov 2023-08-23 23:08:04 +03:00
  • 977629a34e Merge branch 'master' into fix-eos fix-eos Georgi Gerganov 2023-08-23 22:40:19 +03:00
  • a192860cfe minor : fix trailing whitespace master-a192860 Georgi Gerganov 2023-08-23 22:37:39 +03:00
  • 95385241a9 examples : restore the functionality to import llama2.c models (#2685) master-9538524 Olivier Chafik 2023-08-23 20:33:05 +01:00
  • 335acd2ffd fix convert-lora-to-ggml.py (#2738) slaren 2023-08-23 16:46:54 +02:00
  • 5290c38e6e main : insert bos if no tokens (#2727) master-5290c38 klosax 2023-08-23 16:46:03 +02:00