Commit Graph

  • 58e6c9f36f Add support for file load progress reporting callbacks (#434) master-58e6c9f Jed Fox 2023-03-25 01:26:28 -04:00
  • 36d07532ef Add missing struct annotation (#483) master-36d0753 Doomsdayrs 2023-03-25 01:21:24 -04:00
  • 6f1ee4b640 Fix crash for 65B model with pre-allocated memory (#485) master-6f1ee4b Chris Kuehl 2023-03-24 23:38:14 -05:00
  • 8520fc310e Disable BLAS altogether - the bug is not just for quantized mat mul master-8520fc3 Georgi Gerganov 2023-03-24 23:47:06 +02:00
  • b3f460e941 Disable BLAS branch in mul_mat - seems there is a bug master-b3f460e Georgi Gerganov 2023-03-24 23:39:17 +02:00
  • 04c6f5ed6f Immediately start processing the prompt before user input has been provided (#476) master-7a9b6c3 master-04c6f5e Georgi Gerganov 2023-03-24 23:17:58 +02:00
  • 7a9b6c3a8b Reduce memory usage and allocate enough memory for largest context (#473) Georgi Gerganov 2023-03-24 23:17:37 +02:00
  • 4aeee216fd Regroup q4_1 dot addition for better numerics. q4_1_more_accel Matvey Soloviev 2023-03-23 04:56:21 +01:00
  • 580991bbed Squeeze out about 5% more performance in Q4_1 inference Matvey Soloviev 2023-03-21 22:55:35 +01:00
  • 31572d9665 Temporary bump the memory buffer size - hopefully fix issues from 483bab2e master-31572d9 Georgi Gerganov 2023-03-24 18:23:56 +02:00
  • f4f5362edb Update README.md (#444) master-863f65e Gary Mulder 2023-03-24 15:23:09 +00:00
  • 863f65e2e3 fix instruct mode (#445) rabidcopy 2023-03-24 10:22:39 -05:00
  • afd220d9c6 Properly free llama_context on failure master-afd220d master-563cdc3 master-481044d Georgi Gerganov 2023-03-24 17:21:01 +02:00
  • 481044d50c additional optimizations for POWER9 (#454) Cameron Kaiser 2023-03-24 08:19:26 -07:00
  • 563cdc391d Support calling mlock() on loaded model data on Linux and macOS (#453) comex 2023-03-24 08:19:05 -07:00
  • 8d4a855c24 Add embedding mode with arg flag. Currently working (#282) master-8d4a855 Luciano 2023-03-24 08:05:13 -07:00
  • b6b268d441 Add link to Roadmap discussion Georgi Gerganov 2023-03-24 09:13:35 +02:00
  • 3cd8dde0d1 Revert "Fix memory allocation issues and seg faults" master-3cd8dde Georgi Gerganov 2023-03-24 06:22:28 +02:00
  • 4870e455b3 Fix memory allocation issues and seg faults master-4870e45 Georgi Gerganov 2023-03-24 00:11:53 +02:00
  • 483bab2e3d Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439) master-483bab2 Georgi Gerganov 2023-03-23 23:22:01 +02:00
  • 404e1da38e Fix quantize script not finding models in parent directory (#428) Jed Fox 2023-03-23 16:42:52 -04:00
  • 4cc053b6d5 Remove obsolete command from Docker script Georgi Gerganov 2023-03-23 22:39:44 +02:00
  • 0ba5a3a9a5 Obsolete Georgi Gerganov 2023-03-23 22:32:02 +02:00
  • 2e17dfd80a Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333) master-2e17dfd rabidcopy 2023-03-23 15:22:47 -05:00
  • 20a1a4e09c Fix GPTQ converter (#423) master-ad072fc Timmy Knight 2023-03-23 10:18:13 -10:00
  • ad072fc5ad Generate library with CMake (#430) nusu-github 2023-03-24 05:16:48 +09:00
  • ea10d3ded2 Command line args bounds checking (#424) master-ea10d3d anzz1 2023-03-23 19:54:28 +02:00
  • a18c19259a Fix Nix build Ben Siraphob 2023-03-22 00:37:02 -05:00
  • a50e39c6fe Revert "Delete SHA256SUMS for now" (#429) Stephan Walter 2023-03-23 14:15:48 +00:00
  • a140219e81 Fix Makefile echo escape codes (by removing them). (#418) master-a140219 Kerfuffle 2023-03-23 05:41:32 -06:00
  • 8a3e5ef801 Move model section from issue template to README.md (#421) Gary Mulder 2023-03-23 11:30:40 +00:00
  • 8eea5ae0e5 Delete SHA256SUMS for now (#416) anzz1 2023-03-23 12:26:19 +02:00
  • 93208cfb92 Adjust repetition penalty .. Georgi Gerganov 2023-03-23 10:46:58 +02:00
  • 03ace14cfd Add link to recent podcast about whisper.cpp and llama.cpp Georgi Gerganov 2023-03-23 09:48:51 +02:00
  • 66ea164e1d Kahan summation on Q4_1 q4_1_more_accel_kahan Matvey Soloviev 2023-03-23 04:28:51 +01:00
  • e4412b45e3 CI: CMake: Separate build and test steps (#376) master-e4412b4 anzz1 2023-03-23 04:20:34 +02:00
  • 711224708d Break up loop for numeric stability q4_1_more_accel_loopsplit Matvey Soloviev 2023-03-23 03:14:44 +01:00
  • f7dc43bc0d Fix instruct mode broken by PR #354 (#409) master-f7dc43b tjohnman 2023-03-23 01:30:23 +01:00
  • 69071d3b6b Squeeze out about 5% more performance in Q4_1 inference Matvey Soloviev 2023-03-21 22:55:35 +01:00
  • ee8a788786 Update issue template so people will use it (#404) Gary Mulder 2023-03-22 19:06:18 +00:00
  • 3a0dcb3920 Implement server mode. tcp_server Thiago Padilha 2023-03-22 10:41:26 -03:00
  • bf44faa0ee Remove direct access to std streams from "run" Thiago Padilha 2023-03-22 09:55:45 -03:00
  • b7f1fa6d8c Move llama_context setup + perplexity back to main.cpp Thiago Padilha 2023-03-22 09:39:25 -03:00
  • d7d53b84db Add main.cpp back and invoke "run" from it Thiago Padilha 2023-03-22 09:16:33 -03:00
  • 90175ee13f Move main.cpp to run.cpp Thiago Padilha 2023-03-22 09:05:50 -03:00
  • 69c92298a9 Deduplicate q4 quantization functions (#383) master-69c9229 Stephan Walter 2023-03-22 17:29:06 +00:00
  • 97940520e8 fix: add POSIX functionality for Linux compilation (#51) master-9794052 master-305ba6f Valentyn Bezshapkin 2023-03-22 18:20:25 +01:00
  • 305ba6f0e6 Don't force immediate interactive without -i (#354) tjohnman 2023-03-22 18:16:35 +01:00
  • 4122dffff9 cmake: make llama an actual library (#392) master-4122dff Erik Scholz 2023-03-22 17:37:10 +01:00
  • 56e659a0b2 fix perplexity after c-api refactor (#390) master-56e659a Erik Scholz 2023-03-22 17:09:38 +01:00
  • 40ea807a97 Add details on perplexity to README.md (#395) Gary Linscott 2023-03-22 08:53:54 -07:00
  • d5850c53ca Add missing header for memcpy (#386) master-d5850c5 Yusuf Kağan Hanoğlu 2023-03-22 11:55:45 +03:00
  • ae44e23ee3 When seed <= 0 - use the clock to generate one master-ae44e23 master-928480e Georgi Gerganov 2023-03-22 07:47:15 +02:00
  • 928480ef5b Init llama_context_params properly from CLI (#370) Georgi Gerganov 2023-03-22 07:45:00 +02:00
  • 56817b1f88 Remove temporary notice and update hot topics master-f5a77a6 Georgi Gerganov 2023-03-22 07:34:02 +02:00
  • f5a77a629b Introduce C-style API (#370) Georgi Gerganov 2023-03-22 07:32:36 +02:00
  • da0e9fe90c Add SHA256SUMS file and instructions to README how to obtain and verify the downloads Gary Mulder 2023-03-20 20:14:06 +00:00
  • e6c9e0986c Fix bin dir for win ci master-e6c9e09 anzz1 2023-03-21 23:49:24 +02:00
  • 01a297b099 specify build type for ctest on windows (#371) master-01a297b Erik Scholz 2023-03-21 22:34:25 +01:00
  • 3366853e41 Add notice about pending change Georgi Gerganov 2023-03-21 22:57:35 +02:00
  • 3f9c6135e4 fix typo in chatLLaMa (#368) Mathieu Nayrolles 2023-03-21 16:52:27 -04:00
  • 0f61352708 Update issue templates Georgi Gerganov 2023-03-21 19:47:27 +02:00
  • 353ec251a4 We could use std::unordered_map over std::map (#305) Fabio R. Sluzala 2023-03-21 14:21:50 -03:00
  • 89d5d90f3b Fix color codes emitting mid-UTF8 code. (#312) Matvey Soloviev 2023-03-21 18:11:01 +01:00
  • 16ffc013c6 Importer for GPTQ quantized LLaMA models (#301) comex 2023-03-21 09:42:25 -07:00
  • 486ae645fd Compute perplexity over prompt (#270) Gary Linscott 2023-03-21 09:27:42 -07:00
  • 3ab3e6582f Add chatLLaMa script (#198) Jean-Christophe Hoelt 2023-03-21 18:23:15 +02:00
  • f157088cb7 makefile: Fix CPU feature detection on Haiku (#218) Alex von Gluck IV 2023-03-21 11:21:06 -05:00
  • c86ba036e6 Enable ANSI colors on Windows 10+ (#311) anzz1 2023-03-21 18:14:46 +02:00
  • 1daf4dd712 Minor style changes Georgi Gerganov 2023-03-21 18:10:32 +02:00
  • dc6a845b85 Add chat.sh script Georgi Gerganov 2023-03-21 18:09:37 +02:00
  • 6a612959e1 Check for reverse prompt by characters instead of tokens (#292) (#330) tjohnman 2023-03-21 17:05:06 +01:00
  • d5f56a5e5a Check for reverse prompt by characters instead of tokens (#292) (#330) tjohnman 2023-03-21 17:04:43 +01:00
  • 3bfa3b43b7 Fix convert script, warnings alpaca instructions, default params Georgi Gerganov 2023-03-21 17:59:16 +02:00
  • 715d292ee0 Add OpenBSD support (#314) Kevin Lo 2023-03-21 09:50:09 -06:00
  • c98ae02668 fix typo in comment (#318) Mack Straight 2023-03-21 08:49:43 -07:00
  • c3b2306b18 Makefile: slightly cleanup for Mac Intel; echo instead of run ./main -h (#335) Qingyou Meng 2023-03-21 23:44:11 +08:00
  • 975d2cebf9 cmdline option for custom amount of model parts (--n_parts N) (#348) anzz1 2023-03-21 17:42:43 +02:00
  • e0ffc861fa Update IPFS links to quantized alpaca with new tokenizer format (#352) Kevin Kwok 2023-03-21 08:34:49 -07:00
  • 8f644a0a85 Change default repeat_penalty to 1.0 Georgi Gerganov 2023-03-21 17:32:14 +02:00
  • eb34620aec Add tokenizer test + revert to C++11 (#355) Georgi Gerganov 2023-03-21 17:29:41 +02:00
  • 2e664f1ff4 Add initial AVX512 support for dot product on Linux (#320) master-2e664f1 Casey Primozic 2023-03-21 07:35:42 -07:00
  • 8cf9f34edd Adding missing features of CMakeLists.txt & Refactoring (#131) master-8cf9f34 nusu-github 2023-03-21 09:37:16 +09:00
  • bd4b46d6ba Nix flake: set meta.mainProgram to llama Ben Siraphob 2023-03-20 16:44:30 -05:00
  • 6b6d5b5024 Fixed tokenizer.model not found error when model dir is symlink (#325) Qingyou Meng 2023-03-21 03:33:10 +08:00
  • a791a68b61 move file magic/version to header, print expected version (#319) master-a791a68 Mack Straight 2023-03-20 12:26:01 -07:00
  • 0f1b21cb90 Docker - Fix publish docker image in GitHub Registry (#235) master-0f1b21c Bernat Vadell 2023-03-20 18:05:20 +01:00
  • 074bea2eb1 sentencepiece bpe compatible tokenizer (#252) master-074bea2 Mack Straight 2023-03-20 03:17:23 -07:00
  • 5cb63e2493 Add tqdm to Python requirements (#293) Stephan Walter 2023-03-20 08:24:11 +00:00
  • da5303c1ea bugfix: default should not be interactive (#304) master-da5303c cocktailpeanut 2023-03-19 17:44:20 -04:00
  • 4545539d71 Rename script Georgi Gerganov 2023-03-19 21:58:51 +02:00
  • edeba28366 Add temporary helper script for Alpaca chat Georgi Gerganov 2023-03-19 21:57:28 +02:00
  • 5c19c70ba6 fix coloring of last n_batch of prompt, and refactor line input (#221) master-5c19c70 Rickey Bowers Jr 2023-03-19 13:44:30 -06:00
  • 24568371ae Support for multiple reverse prompts. (#299) master-2456837 tjohnman 2023-03-19 20:33:06 +01:00
  • 7392f1cd2c Improved quantize script (#222) master-ad5fd5b Suaj Carrot 2023-03-19 12:38:44 -06:00
  • ad5fd5b60c Make prompt randomization optional. (#300) tjohnman 2023-03-19 19:36:19 +01:00
  • 368d0c8a9e Respect the maximum number of tokens in interactive. (#298) master-368d0c8 tjohnman 2023-03-19 19:31:17 +01:00
  • 50fae10d03 Add --ignore-eos parameter (#181) master-50fae10 slaren 2023-03-19 19:22:48 +01:00
  • 084e2f0ec0 interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (#283) master-084e2f0 Qingyou Meng 2023-03-20 02:10:00 +08:00
  • 0b366e7357 Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294) master-0b366e7 Erik Scholz 2023-03-19 18:57:00 +01:00