Commit Graph

  • cc34dbda96 gitignore : fix for windows (#2729) akawrykow 2023-08-23 07:31:34 -07:00
  • 7c2227a197 chmod : make scripts executable (#2675) Cebtenzzre 2023-08-23 10:29:09 -04:00
  • f19dca04ea devops : RPM Specs (#2723) JohnnyB 2023-08-23 15:28:22 +01:00
  • 8207214b6a Fix values shown in the quantize tool help (#2735) master-8207214 Kawrakow 2023-08-23 12:57:12 +03:00
  • 62959e740e Strided perplexity (#2714) master-62959e7 Kawrakow 2023-08-23 12:56:42 +03:00
  • 7f7ddd5002 Fix ggml to gguf conversion on Windows (#2733) IgnacioFDM 2023-08-23 06:31:09 -03:00
  • b8ad1b66b2 server : allow json array in prompt or content for direct token input (#2306) master-b8ad1b6 Xiao-Yong Jin 2023-08-23 02:12:12 -05:00
  • f5fe98d11b docs : add grammar docs (#2701) Evan Jones 2023-08-22 21:01:57 -04:00
  • 777f42ba18 Improve handling of special tokens in GGML to GGUF converter (#2725) master-777f42b Kerfuffle 2023-08-22 17:39:39 -06:00
  • 46ef5b5fcf llama : fix whitespace escaping in tokenizer (#2724) master-46ef5b5 goerch 2023-08-22 23:10:42 +02:00
  • c63bb1d16a CUDA: use mul_mat_q kernels by default (#2683) master-c63bb1d Johannes Gäßler 2023-08-22 22:47:05 +02:00
  • 3b6cfe7c92 convert.py : clarifying error message (#2718) Alex Petenchea 2023-08-22 21:58:16 +03:00
  • 800c9635b4 Fix CUDA softmax by subtracting max value before exp (#2665) master-800c963 Jiahao Li 2023-08-23 02:27:06 +08:00
  • deb7dfca4b gguf : add ftype meta info to the model (#2710) master-deb7dfc Georgi Gerganov 2023-08-22 20:05:59 +03:00
  • bac66994cf Quantization imrovements for k_quants (#2707) master-bac6699 Kawrakow 2023-08-22 19:14:09 +03:00
  • 519c981f8b embedding : evaluate prompt in batches (#2713) master-519c981 slaren 2023-08-22 16:03:12 +02:00
  • 1123f7fbdf ggml-cuda : use graph allocator (#2684) master-1123f7f slaren 2023-08-22 15:25:19 +02:00
  • ef3f333d37 ggml : sync latest (SAM + SD operators, CUDA alibi) (#2709) master-ef3f333 Georgi Gerganov 2023-08-22 14:22:08 +03:00
  • 8e4364f2af llama-bench : minor fixes (#2695) master-8e4364f slaren 2023-08-22 09:56:03 +02:00
  • 1e3bc523d8 ggml : support CUDA's half type for aarch64(#1455) (#2670) master-1e3bc52 Kylin 2023-08-22 15:14:23 +08:00
  • 14b1d7e6f7 metal : add missing barriers for mul-mat (#2699) Shouzheng Liu 2023-08-22 02:18:40 -04:00
  • 226255b44e server : fallback to default if client param is null (#2688) master-226255b Jhen-Jie Hong 2023-08-22 08:32:00 +08:00
  • 930523c8e1 Fix convert-llama-ggmlv3-to-gguf.py vocab conversion (#2698) Kerfuffle 2023-08-21 18:01:34 -06:00
  • c8dba409e6 py : remove obsolete script Georgi Gerganov 2023-08-21 23:40:22 +03:00
  • 6381d4e110 gguf : new file format with flexible meta data (beta) (#2398) master-6381d4e Georgi Gerganov 2023-08-21 23:07:43 +03:00
  • 66a66a05a8 readme : add notice about new file format gguf Georgi Gerganov 2023-08-21 22:11:00 +03:00
  • 811f653f95 py : cosmetics Georgi Gerganov 2023-08-21 20:40:08 +03:00
  • 49c25cce19 tests : use new tokenizer type API (#2692) goerch 2023-08-21 19:11:14 +02:00
  • d3f5fbef6c main : flush stdout Georgi Gerganov 2023-08-21 19:52:51 +03:00
  • 0b53b8b08d llama : add API for token type Georgi Gerganov 2023-08-21 19:35:31 +03:00
  • 8d177eddeb llama : improve token type support (#2668) goerch 2023-08-21 17:56:02 +02:00
  • e06cbcee73 gguf : add Python script to convert GGMLv3 LLaMA models to GGUF (#2682) Kerfuffle 2023-08-21 08:45:52 -06:00
  • 6490ff7198 py : fix whitespace Georgi Gerganov 2023-08-21 16:42:27 +03:00
  • e3da126f2a main : inject reverse prompt after EOS + update examples/chat.sh Georgi Gerganov 2023-08-21 16:41:27 +03:00
  • 1e7a0092dd Merge branch 'master' into gguf Georgi Gerganov 2023-08-21 16:27:51 +03:00
  • 8af1991e2a main : restore old EOS behavior in interactive mode Georgi Gerganov 2023-08-21 15:40:51 +03:00
  • 7a7d1ba68a convert-llama-hf-to-gguf.py : rope scale fix klosax 2023-08-21 14:12:02 +02:00
  • 9070e330ab convert-llama-7b-pth-to-gguf.py : rope scale fix klosax 2023-08-21 14:11:22 +02:00
  • c082b9fa0b llama.cpp : use rope scale kv klosax 2023-08-21 13:30:03 +02:00
  • dc1f051013 convert-llama-7b-pth-to-gguf.py : rope scale and added tokens klosax 2023-08-21 13:27:53 +02:00
  • 5f6ff387ca convert-llama-hf-to-gguf.py : rope scale and added tokens klosax 2023-08-21 13:25:14 +02:00
  • 6a69a693cb gguf.py : fix rope scale kv klosax 2023-08-21 13:23:10 +02:00
  • dadbed99e6 metal : fix synchronization in new matrix multiplication kernel (#2686) Shouzheng Liu 2023-08-21 06:59:29 -04:00
  • cb1c0727bd HellaSwag: split token evaluation into batches if needed (#2681) master-cb1c072 Kawrakow 2023-08-21 11:11:31 +03:00
  • c818c405e0 convert-llama-hf-to-gguf.py : fix attn_q permute klosax 2023-08-21 04:42:09 +02:00
  • 58bde5c5c1 Delete convert-permute-debug.py klosax 2023-08-21 04:35:06 +02:00
  • 287db51015 Delete convert-permute-debug-master.py klosax 2023-08-21 04:34:39 +02:00
  • d5c8fcfd8a convert.py : 70b model working (change attn_q permute) klosax 2023-08-21 04:33:33 +02:00
  • 7de7cb4bd8 convert-permute-debug.py : change permute type of attn_q klosax 2023-08-21 04:06:59 +02:00
  • 4f92488dd6 convert-permute-debug-master.py : permute debug for master klosax 2023-08-21 03:44:16 +02:00
  • 5a02b9625a convert-permute-debug.py : permute debug print klosax 2023-08-21 03:24:29 +02:00
  • 9e232f0234 ggml : move all type info to ggml_type_traits (#2663) master-9e232f0 slaren 2023-08-20 22:17:53 +02:00
  • f838faa874 convert-llama-7b-pth-to-gguf.py : special tokens klosax 2023-08-20 16:56:48 +02:00
  • 76b46627e2 convert-llama-hf-to-gguf.py : special tokens klosax 2023-08-20 16:54:42 +02:00
  • 5e9ff54a67 More efficient Hellaswag implementation (#2677) master-5e9ff54 Kawrakow 2023-08-20 16:44:46 +03:00
  • 28b8c265eb cmpnct_gpt2bpe.hpp : cleanup gguf-28b8c26 klosax 2023-08-19 18:26:51 +02:00
  • c0a1269b7f Update examples/server/README.md klosax 2023-08-19 15:27:37 +02:00
  • 6a2e520095 cmpnct_gpt2bpe.hpp : remove non-general stuff klosax 2023-08-19 13:19:02 +02:00
  • 8945d47f52 gptneox-main.cpp : fixes klosax 2023-08-19 12:09:24 +02:00
  • 781bf2481f falcon-main.cpp : fixes klosax 2023-08-19 12:08:17 +02:00
  • dadf098b5a cmpnct_gpt2bpe.hpp : fixes klosax 2023-08-19 12:06:22 +02:00
  • b3a7a2b486 convert-falcon-hf-to-gguf.py : add tensor data layout klosax 2023-08-19 12:05:11 +02:00
  • 2c8055b65b convert-falcon-hf-to-gguf.py : update ref klosax 2023-08-19 01:08:39 +02:00
  • 1d80eea574 falcon-main.cpp : fix for falcon 40b klosax 2023-08-19 01:03:37 +02:00
  • bd5a57901b gguf.py : fix for falcon 40b klosax 2023-08-19 01:01:52 +02:00
  • 281d6d1105 convert-llama-hf-to-gguf.py : remove extra kv klosax 2023-08-19 00:32:56 +02:00
  • 593b04fdcd convert-llama-7b-pth-to-gguf.py : remove extra kv klosax 2023-08-19 00:32:27 +02:00
  • c0e4ca630b convert-gptneox-hf-to-gguf.py : remove extra kv klosax 2023-08-19 00:31:56 +02:00
  • 16ab9ba3b3 convert-falcon-hf-to-gguf.py : remove extra kv klosax 2023-08-19 00:31:28 +02:00
  • d5e976c12b falcon-main.cpp : falcon inference example klosax 2023-08-19 00:02:18 +02:00
  • 1f0bccb279 server : better default prompt (#2646) Georgi Gerganov 2023-08-19 00:45:36 +03:00
  • f63564adfa server : update xxd usage for older versions compatibility (#2649) Jhen-Jie Hong 2023-08-19 05:41:32 +08:00
  • 2d8b76a110 Add link to clojure bindings to Readme. (#2659) Adrian 2023-08-18 12:39:22 -07:00
  • fb7c883cd3 convert-falcon-hf-to-gguf.py : falcon HF --> gguf conversion, not tested klosax 2023-08-18 20:14:01 +02:00
  • 25b8a8922d llama : introduce enum llama_vocab_type + remove hardcoded string constants Georgi Gerganov 2023-08-18 18:46:38 +03:00
  • 7af633aec3 readme : incoming BREAKING CHANGE Georgi Gerganov 2023-08-18 17:48:31 +03:00
  • a4ad2bf35c llama : fix MPI build Georgi Gerganov 2023-08-18 17:34:27 +03:00
  • 5d2656d670 llama : avoid hardcoded special tokens Georgi Gerganov 2023-08-18 17:29:20 +03:00
  • 035d511457 llama : minor API updates Georgi Gerganov 2023-08-18 17:06:34 +03:00
  • 2d6c2c757c llama : remove C++ API + reorganize common source in /common dir Georgi Gerganov 2023-08-18 16:22:48 +03:00
  • 38016ed9ec Merge branch 'master' into gguf Georgi Gerganov 2023-08-18 15:21:48 +03:00
  • 660ca9bbca llama : re-order functions Georgi Gerganov 2023-08-18 14:56:36 +03:00
  • 097e121e2f llama : add benchmark example (#2626) master-097e121 slaren 2023-08-18 12:44:58 +02:00
  • eaf98c2649 readme : add link to Rust bindings (#2656) mdrokz 2023-08-18 15:47:58 +05:30
  • e9b12c332e perplexity : more meaningful ETA number - 2 decimal points master-e9b12c3 Georgi Gerganov 2023-08-18 12:48:55 +03:00
  • dea5be61d7 editorconfig : fix whitespaces Georgi Gerganov 2023-08-18 12:42:38 +03:00
  • e35f8c744e tests : update vocab file with new magic Georgi Gerganov 2023-08-18 12:39:09 +03:00
  • 856afff746 Merge branch 'master' into gguf Georgi Gerganov 2023-08-18 12:38:05 +03:00
  • aa3efe87c8 llama : print number of tensors per type + print arch + style Georgi Gerganov 2023-08-18 10:36:45 +03:00
  • b275de745d llama.cpp : get special token kv and linefeed token id klosax 2023-08-18 03:34:30 +02:00
  • 604b8bdfa6 Fix unicode in grammars (fixes #2501) (#2553) master-604b8bd Evan Jones 2023-08-17 19:54:44 -04:00
  • 10151bee2e server : support for saving templates in browser LocalStorage (#2486) master-10151be staviq 2023-08-17 23:34:01 +00:00
  • 306070c896 llama.cpp : print kv general.name klosax 2023-08-18 01:06:27 +02:00
  • 0992a7b8b1 README: fix LLAMA_CUDA_MMV_Y documentation (#2647) Johannes Gäßler 2023-08-17 23:57:59 +02:00
  • d9e6890a51 test-tokenizer-0.cpp : fix warning klosax 2023-08-17 23:34:21 +02:00
  • 147a99bd3a gguf.py : reverse GGUF_MAGIC klosax 2023-08-17 23:24:04 +02:00
  • c20ae49b59 ggml.h : reverse GGUF_MAGIC klosax 2023-08-17 23:23:17 +02:00
  • 6ddeefad9b [Zig] Fixing Zig build and improvements (#2554) Henri Vasserman 2023-08-17 23:11:18 +03:00
  • 3c1b7217a9 convert-llama-7b-pth-to-gguf.py : fixes klosax 2023-08-17 21:44:34 +02:00
  • 9e2d4dd48e convert-llama-hf-to-gguf.py : fixes klosax 2023-08-17 21:43:48 +02:00