Commit Graph

  • a68a1e7ed0 metal : log more info on error (#6987) b2772 Kevin Gibbons 2024-04-30 02:34:50 -07:00
  • 9c67c2773d ggml : add Flash Attention (#5021) b2771 Georgi Gerganov 2024-04-30 12:16:08 +03:00
  • c240ae234c ci : fix arg order gg/flash-attn Georgi Gerganov 2024-04-30 11:43:36 +03:00
  • 952d03dbea convert : use utf8 encoding (#7000) Georgi Gerganov 2024-04-30 11:05:25 +03:00
  • e180fcd3d5 metal : fix max nsg Georgi Gerganov 2024-04-30 11:04:32 +03:00
  • 8843a98c2b Improve usability of --model-url & related flags (#6930) b2769 Olivier Chafik 2024-04-30 00:52:50 +01:00
  • b8c1476e44 Extending grammar integration tests (#6644) b2768 Clint Herron 2024-04-29 14:40:14 -04:00
  • 5539e6fdd1 main : fix typo in comment in main.cpp (#6985) b2767 Daniel Bevenius 2024-04-29 19:56:59 +02:00
  • b6fafd1747 llama : remove useless return value for some llama_cache_* functions Francis Couture-Harpin 2024-04-29 12:59:43 -04:00
  • b8a7a5a90f build(cmake): simplify instructions (cmake -B build && cmake --build build ...) (#6964) b2766 Olivier Chafik 2024-04-29 17:02:45 +01:00
  • ca0275ceb7 Merge branch 'master' into gg/flash-attn Georgi Gerganov 2024-04-29 18:37:04 +03:00
  • d2c898f746 ci : tmp disable gguf-split (#6983) Georgi Gerganov 2024-04-29 18:36:39 +03:00
  • 5ddad95e5c ci : tmp disable gguf-split gg/tmp-ci Georgi Gerganov 2024-04-29 18:29:38 +03:00
  • 544f1f10ad ggml : fix __MSC_VER -> _MSC_VER (#6977) b2764 Georgi Gerganov 2024-04-29 17:55:02 +03:00
  • ffe666572f llava-cli : multiple images (#6969) b2763 cpumaxx 2024-04-29 07:34:24 -07:00
  • c460ff1a1c Merge branch 'master' into compilade/refactor-kv-cache Francis Couture-Harpin 2024-04-29 10:31:39 -04:00
  • a09db95eab llama : rename many llama_kv_cache_* functions Francis Couture-Harpin 2024-04-29 10:24:45 -04:00
  • a1616e9f72 Merge branch 'master' into gg/flash-attn Georgi Gerganov 2024-04-29 17:19:25 +03:00
  • 24affa7db3 readme : update hot topics Georgi Gerganov 2024-04-29 17:06:19 +03:00
  • f4ab2a4147 llama : fix BPE pre-tokenization (#6920) b2761 Georgi Gerganov 2024-04-29 16:58:41 +03:00
  • 3f167476b1 sampling : use std::random_device{}() for default random seed (#6962) b2760 David Renshaw 2024-04-29 09:35:45 -04:00
  • 3055a41805 convert : fix conversion of some BERT embedding models (#6937) Christian Zhou-Zheng 2024-04-29 09:34:41 -04:00
  • 577277ffd2 make : change GNU make default CXX from g++ to c++ (#6966) Przemysław Pawełczyk 2024-04-29 15:08:20 +02:00
  • ca7f29f568 ci : add building in MSYS2 environments (Windows) (#6967) b2757 Przemysław Pawełczyk 2024-04-29 14:59:47 +02:00
  • c4f708a93f llama : fix typo LAMMAFILE -> LLAMAFILE (#6974) b2756 Johannes Gäßler 2024-04-29 14:36:22 +02:00
  • 80cb3127df tests : disable test-tokenizer-1-bpe due to slowness gg/bpe-preprocess Georgi Gerganov 2024-04-29 15:24:39 +03:00
  • 3202676f5d llama : more prominent warning for old BPE models Georgi Gerganov 2024-04-29 15:24:27 +03:00
  • 6d6ce93959 tests : use faster bpe test Georgi Gerganov 2024-04-29 14:47:25 +03:00
  • 9a7d430ff2 tests : disable obsolete Georgi Gerganov 2024-04-29 14:12:34 +03:00
  • 120cf37d54 models : add phi-3, mpt, gpt-2, starcoder Georgi Gerganov 2024-04-29 13:40:30 +03:00
  • c21ab1833e scripts : ignore new update script in check-requirements.sh Georgi Gerganov 2024-04-29 11:24:05 +03:00
  • af05268cdd unicode : cleanup Georgi Gerganov 2024-04-29 11:20:42 +03:00
  • c68d2596ea tests : add more vocabs and tests Georgi Gerganov 2024-04-29 11:07:25 +03:00
  • 43708d22c3 tests : refactor vocab tests Georgi Gerganov 2024-04-29 10:46:43 +03:00
  • ef4cca9e87 cmake : refactor test targets Georgi Gerganov 2024-04-29 09:53:14 +03:00
  • e00b4a8f81 Fix more int overflow during quant (PPL/CUDA). (#6563) b2755 DAN™ 2024-04-28 18:38:44 -04:00
  • 7b1210f6a8 lint : fix Georgi Gerganov 2024-04-28 22:51:13 +03:00
  • 78081502e9 convert : exercise contractions Georgi Gerganov 2024-04-28 22:18:20 +03:00
  • 0f9058ceec convert : add comments Georgi Gerganov 2024-04-28 22:10:04 +03:00
  • 02fd977fe1 convert : remove unused functions Georgi Gerganov 2024-04-28 22:03:21 +03:00
  • e8dd4a1494 lint : fix Georgi Gerganov 2024-04-28 22:02:10 +03:00
  • 491f2339bb lint : fix Georgi Gerganov 2024-04-28 21:42:58 +03:00
  • 1545550ec2 unicode : normalize signatures Georgi Gerganov 2024-04-28 21:40:36 +03:00
  • 1c888eb4da convert : add falcon Georgi Gerganov 2024-04-28 21:26:40 +03:00
  • 4e3e6d8ecc lint : update Georgi Gerganov 2024-04-28 21:16:50 +03:00
  • 7642973616 convert : add convert-hf-to-gguf-update.py Georgi Gerganov 2024-04-28 20:29:32 +03:00
  • ee6d1b3fb4 unicode : simplify Georgi Gerganov 2024-04-28 18:36:57 +03:00
  • 7bb36ccf91 gguf : enforce that tensor names are unique (#6905) b2754 Xuan Son Nguyen 2024-04-28 17:36:18 +02:00
  • e972e6cbf8 unicode : clean-up Georgi Gerganov 2024-04-28 18:01:59 +03:00
  • ce023f6f2f add device version in device list (#6959) b2753 Neo Zhang 2024-04-28 22:40:31 +08:00
  • d63cc9068b Merge branch 'master' into gg/bpe-preprocess Georgi Gerganov 2024-04-28 15:34:36 +03:00
  • b97add52a4 unicode : category support via std::regex Georgi Gerganov 2024-04-28 13:42:00 +03:00
  • 6e472f58e4 flake.lock: Update github-actions[bot] 2024-04-28 00:18:27 +00:00
  • 4dba7e8114 Replace "alternative" boolean operator in conditional compilation directive (#6949) b2751 mgroeber9110 2024-04-27 21:02:06 +02:00
  • b7368332e2 ci: server: tests python env on github container ubuntu latest / fix n_predict (#6935) b2750 Pierrick Hymbert 2024-04-27 17:50:48 +02:00
  • 581c4a0239 unicode : try fix windows Georgi Gerganov 2024-04-27 18:36:00 +03:00
  • 91eaa414bf unicode : support \p{N}, \p{L} and \p{P} natively Georgi Gerganov 2024-04-27 17:48:38 +03:00
  • ce5485aee0 unicode : always use std::wregex Georgi Gerganov 2024-04-27 17:11:34 +03:00
  • 2affd0b221 unicode : set bomb Georgi Gerganov 2024-04-27 11:56:02 +03:00
  • a22645c2a7 unicode : set bomb Georgi Gerganov 2024-04-27 11:48:24 +03:00
  • 4434c9d6c2 minor Georgi Gerganov 2024-04-27 11:33:16 +03:00
  • ad929833cb llama : adapt punctuation regex + add llama 3 regex Georgi Gerganov 2024-04-27 11:06:08 +03:00
  • 96965f67e6 models : add llama v3 vocab file Georgi Gerganov 2024-04-27 11:05:12 +03:00
  • c160818ec0 wip Georgi Gerganov 2024-04-27 00:28:36 +03:00
  • a774d7084e make : add test-tokenizer-0-llama-v3 Georgi Gerganov 2024-04-26 21:25:36 +03:00
  • 8791e94e3c lint : fix Georgi Gerganov 2024-04-26 21:12:05 +03:00
  • 928e0b7013 Reset schedule earlier to allow overlap with ggml graph computation on device (#6933) b2749 agray3 2024-04-26 19:08:30 +01:00
  • 0c4d489e29 quantize: add imatrix and dataset metadata in GGUF (#6658) b2748 Pierrick Hymbert 2024-04-26 20:06:33 +02:00
  • 1b9b79dd14 convert : fix pre-tokenizer type writing Georgi Gerganov 2024-04-26 20:55:14 +03:00
  • 43e12ce8e5 llama : use new pre-tokenizer type Georgi Gerganov 2024-04-26 20:08:28 +03:00
  • 017e6999b5 add basic tensor data validation function (#6884) b2747 slaren 2024-04-26 18:39:58 +02:00
  • 9b4d63ae53 convert : add "tokenizer.ggml.pre" GGUF KV (wip) Georgi Gerganov 2024-04-26 19:21:55 +03:00
  • e3f6dc7409 Merge branch 'master' into gg/bpe-preprocess Georgi Gerganov 2024-04-26 18:08:40 +03:00
  • e2764cd7ca gguf : fix mismatch between alloc and free functions (#6929) b2746 slaren 2024-04-26 17:07:42 +02:00
  • 4b1c3c98b4 llamafile : use 64-bit integers in sgemm (#6928) Justine Tunney 2024-04-26 10:05:33 -04:00
  • e9891769ff unicode : first try custom implementations Georgi Gerganov 2024-04-26 15:09:07 +03:00
  • e8c206be61 unicode : shot in the dark to fix tests on Windows Georgi Gerganov 2024-04-26 14:57:12 +03:00
  • 4907e41aa7 llama : towards llama3 tokenization support (wip) Georgi Gerganov 2024-04-26 14:55:03 +03:00
  • ed42711b90 gguf-py : reader prints warnings on duplicate keys Georgi Gerganov 2024-04-26 14:32:22 +03:00
  • e1b2bf783e tests : add sample usage Georgi Gerganov 2024-04-26 13:43:54 +03:00
  • aeafb43ed7 tests : remove and rename tokenizer test scripts Georgi Gerganov 2024-04-26 13:39:03 +03:00
  • d999cf65c5 unicode : remove redundant headers Georgi Gerganov 2024-04-26 13:29:48 +03:00
  • bbe3c6e761 ci: server: fix python installation (#6925) Pierrick Hymbert 2024-04-26 12:27:25 +02:00
  • 7a44e44342 tests : add tokenizer tests for numbers Georgi Gerganov 2024-04-26 13:21:28 +03:00
  • 7f5ff558ee server: stop generation at n_ctx_train if n_predict is not set (#6638) Pierrick Hymbert 2024-04-26 12:15:30 +02:00
  • c56e19db4b lint : fix whitespaces Georgi Gerganov 2024-04-26 12:58:07 +03:00
  • 06d3e693db unicode : fix? unicode_wstring_to_utf8 Georgi Gerganov 2024-04-26 12:55:11 +03:00
  • 9e4e077ec5 ci: server: fix python installation (#6922) Pierrick Hymbert 2024-04-26 11:11:51 +02:00
  • 36d983262e Fixed issue with gpt2 regex custom preprocessor Kazim Abrar Mahi 2024-04-17 07:40:40 +06:00
  • 753580360b Fixed issues Kazim Abrar Mahi 2024-04-16 05:53:29 +06:00
  • feeaf4f39c Added needed functionality, testing remains Kazim Abrar Mahi 2024-04-16 04:56:35 +06:00
  • 7e308ed212 Adding unicode regex function Kazim Abrar Mahi 2024-04-16 01:52:33 +06:00
  • a5710a4101 Adding unicode regex mappings Kazim Abrar Mahi 2024-04-15 23:48:04 +06:00
  • 4c3e882a85 Refactored code Kazim Abrar Mahi 2024-04-13 19:33:06 +06:00
  • c8e7d9521d Updated/merged the deepseek coder pr Jaggzh 2024-02-12 04:18:06 -08:00
  • 4056dc5b1e added and refactored unicode_regex_split and related functions Kazim Abrar Mahi 2024-04-01 00:48:49 +06:00
  • 1c924e4b35 Resolved issues Kazim Abrar Mahi 2024-03-23 14:38:06 +06:00
  • 54f93eb50b Moved header files Kazim Abrar Mahi 2024-03-23 01:16:04 +06:00
  • d2cfc2225f Moved regex patterns to unicode.cpp and updated unicode.h Kazim Abrar Mahi 2024-03-23 01:13:08 +06:00
  • 6fbab2dbc8 merged the changes from deepseeker models to main branch Jaggzh 2024-02-12 04:04:34 -08:00