Commit Graph

  • c887d8b017 [SYCL] Add TIMESTEP_EMBEDDING OP (#8707) b3489 zhentaoyu 2024-07-30 14:56:51 +08:00
  • 75af08c475 ggml: bugfix: fix the inactive elements is agnostic for risc-v vector (#8748) b3488 CarterLi999 2024-07-30 00:38:34 +08:00
  • eab4a88210 Using dp4a ptx intrinsics for an improved Mul8MAT perf [By Alcpz] codeplay/sycl-main OuadiElfarouki 2024-07-29 16:52:29 +01:00
  • 439b3fc75a cuda : organize vendor-specific headers into vendors directory (#8746) b3487 R0CKSTAR 2024-07-29 20:56:12 +08:00
  • 0832de7236 [SYCL] add conv support (#8688) b3486 Meng, Hengyu 2024-07-29 10:50:27 +08:00
  • 79a278e922 Merge branch 'master' into compilade/bitnet-ternary Francis Couture-Harpin 2024-07-28 21:27:33 -04:00
  • dd3e62a703 ggml : add some informative comments in q1_3 vec_dot Francis Couture-Harpin 2024-07-28 21:17:16 -04:00
  • 6eeaeba126 cmake: use 1 more thread for non-ggml in CI (#8740) b3485 Johannes Gäßler 2024-07-28 22:32:44 +02:00
  • 4730faca61 chore : Fix vulkan related compiler warnings, add help text, improve CLI options (#8477) b3484 Austin 2024-07-28 03:52:42 -04:00
  • 704a303323 llama : fix Mamba session save and restore Francis Couture-Harpin 2024-07-28 01:59:10 -04:00
  • 0dea4263aa Merge branch 'master' into compilade/batch-splits Francis Couture-Harpin 2024-07-28 01:20:13 -04:00
  • 4c676c85e5 llama : refactor session file management (#8699) b3483 compilade 2024-07-28 00:42:05 -04:00
  • e54c35e4fb feat: Support Moore Threads GPU (#8383) b3482 R0CKSTAR 2024-07-28 07:41:25 +08:00
  • 9cddd9aeec llama : cast seq_id in comparison with unsigned n_seq_max compilade/refactor-session-files Francis Couture-Harpin 2024-07-27 15:50:23 -04:00
  • ffd5117def llama : more graceful error handling of invalid session files Francis Couture-Harpin 2024-07-27 14:31:57 -04:00
  • 5e2727fe03 scripts : sync vulkan-shaders (#0) b3481 Georgi Gerganov 2024-07-27 18:08:31 +03:00
  • 56f20aa25d scripts : sync ggml-aarch64 sources Georgi Gerganov 2024-07-27 17:19:35 +03:00
  • 345c8c0c87 ggml : add missing semicolon (#0) b3479 Georgi Gerganov 2024-07-27 15:57:09 +03:00
  • ae7985cd7b sync : ggml Georgi Gerganov 2024-07-27 15:53:48 +03:00
  • a05ca93697 ggml : loop tiling optimizations for scalar path (ggml/898) Mahesh Madhav 2024-07-25 00:54:08 -07:00
  • 9f77d899b7 ggml: add support for float16 input tensors in pooling operations (ggml/895) Ivan Filipov 2024-07-22 14:32:02 +03:00
  • 203b7f1531 vulkan : initialize vk_buffer_struct members to VK_NULL_HANDLE (ggml/893) Tony Wasserka 2024-07-20 20:49:44 +02:00
  • d2b851bfa1 cmake : only enable GGML_NATIVE and x86 flags if not crosscompiling (ggml/885) Borislav Stanimirov 2024-07-12 17:24:20 +03:00
  • c12b6e8ee7 ggml : remove unnecessary UNUSED macro call (ggml/880) Daniel Bevenius 2024-07-08 12:03:42 +02:00
  • b5e95468b1 llama : add support for llama 3.1 rope scaling factors (#8676) b3472 Jeffrey Morgan 2024-07-27 05:03:45 -07:00
  • 92090eca21 llama : add function for model-based max number of graph nodes (#8622) b3471 Georgi Gerganov 2024-07-27 14:59:29 +03:00
  • 9d03d085dd common : add --no-warmup option for main/llama-cli (#8712) b3470 Daniel Bevenius 2024-07-27 12:45:02 +02:00
  • bfb4c74981 cann: Fix Multi-NPU execution error (#8710) b3469 wangshuai09 2024-07-27 16:36:44 +08:00
  • 83e6a17ddf llama : fix session file loading Francis Couture-Harpin 2024-07-26 22:56:57 -04:00
  • 2b1f616b20 ggml : reduce hash table reset cost (#8698) b3468 slaren 2024-07-27 04:41:55 +02:00
  • c8b424fae5 llama : remove _context suffix for llama_data_context Francis Couture-Harpin 2024-07-26 19:06:37 -04:00
  • 65f7455cea Modify 2 notes caitianchi 2024-07-26 21:49:23 +08:00
  • f3d400dac0 remove uhd_image_embed caitianchi 2024-07-26 21:15:03 +08:00
  • 01245f5b16 llama : fix order of parameters (#8706) b3467 Judd 2024-07-26 16:38:12 +08:00
  • cddc899b85 llama : various integer type cast and format string fixes Francis Couture-Harpin 2024-07-25 22:58:20 -04:00
  • 9e22064a0d llama : fix uint64_t format type Francis Couture-Harpin 2024-07-25 22:49:14 -04:00
  • 8e39037b86 llama : refactor session file management Francis Couture-Harpin 2024-07-25 18:33:54 -04:00
  • 01aec4a631 server : add Speech Recognition & Synthesis to UI (#8679) b3466 Yaiko 2024-07-25 18:10:16 -04:00
  • 41cd47caab examples : export-lora : fix issue with quantized base models (#8687) b3465 Xuan Son Nguyen 2024-07-25 23:49:39 +02:00
  • 49ce0ab6d4 ggml: handle ggml_init failure to fix NULL pointer deref (#8692) b3464 DavidKorczynski 2024-07-25 22:23:05 +01:00
  • 4226a8d10e llama : fix build + fix fabs compile warnings (#8683) b3463 Georgi Gerganov 2024-07-25 19:57:31 +03:00
  • bf5a81df37 ggml : fix build on Windows with Snapdragon X (#8531) b3462 Andreas (Andi) Kunar 2024-07-25 18:01:00 +02:00
  • 88954f7fbd tests : fix printfs (#8068) b3461 Georgi Gerganov 2024-07-25 18:57:44 +03:00
  • 9aeb0e1f75 sycl add conv support sycl-conv-op Meng, Hengyu 2024-07-25 12:14:30 +00:00
  • ed67bcb24f [SYCL] fix multi-gpu issue on sycl (#8554) b3460 Chen Xi 2024-07-25 11:45:18 +00:00
  • eddcb5238b ggml : add and use ggml_cpu_has_llamafile() (#8664) b3459 Georgi Gerganov 2024-07-25 12:37:42 +03:00
  • be6d7c0791 examples : remove finetune and train-text-from-scratch (#8669) b3458 Xuan Son Nguyen 2024-07-25 10:39:04 +02:00
  • 4b0eff3df5 docs : Quantum -> Quantized (#8666) Ujjawal Panchal 2024-07-25 13:43:27 +05:30
  • 72b962925b delete minicpmv-wrapper in pr caitianchi 2024-07-25 16:01:26 +08:00
  • 107e1edb20 fix uhd code for review comment caitianchi 2024-07-25 15:22:11 +08:00
  • 8a4bad50a8 llama: use sliding window for phi3 (#8627) b3456 Fan Shupei 2024-07-25 15:21:09 +08:00
  • 68504f0970 readme : update games list (#8673) MorganRO8 2024-07-24 12:48:00 -04:00
  • f19bf99c01 Build Llama SYCL Intel with static libs (#8668) Joe Todd 2024-07-24 14:36:00 +01:00
  • 3a7ac5300a readme : update UI list [no ci] (#8505) Thorsten Sommer 2024-07-24 14:52:30 +02:00
  • 96952e7181 llama : fix llama_chat_format_single for mistral (#8657) b3452 Xuan Son Nguyen 2024-07-24 13:48:46 +02:00
  • 79167d9e49 Re-add erroneously removed -fsycl from GGML_EXTRA_LIBS (#8667) b3451 Joe Todd 2024-07-24 11:55:26 +01:00
  • b115105f05 add llama_lora_adapter_clear (#8653) b3450 Xuan Son Nguyen 2024-07-24 11:25:19 +02:00
  • 5934580905 ggml : add and use ggml_cpu_has_llamafile() gg/system-info-llamafile Georgi Gerganov 2024-07-24 11:31:41 +03:00
  • de280085e7 examples : Fix llama-export-lora example (#8607) b3449 Xuan Son Nguyen 2024-07-23 23:48:37 +02:00
  • 9c0a61f8c3 Merge branch 'master' into compilade/batch-splits Francis Couture-Harpin 2024-07-23 13:37:09 -04:00
  • b841d07408 server : fix URL.parse in the UI (#8646) Vali Malinoiu 2024-07-23 17:37:42 +03:00
  • 64cf50a0ed sycl : Add support for non-release DPC++ & oneMKL (#8644) b3447 Joe Todd 2024-07-23 14:58:37 +01:00
  • 938943cdbf llama : move vocab, grammar and sampling into separate files (#8508) Georgi Gerganov 2024-07-23 13:10:17 +03:00
  • 751fcfc6c3 Vulkan IQ4_NL Support (#8613) b3445 0cc4m 2024-07-23 10:56:49 +02:00
  • 46e47417aa Allow all RDNA2 archs to use sdot4 intrinsic (#8629) Jeroen Mostert 2024-07-23 10:50:40 +02:00
  • e7e6487ba0 contrib : clarify PR squashing + module names (#8630) Georgi Gerganov 2024-07-23 11:28:38 +03:00
  • 063d99ad11 [SYCL] fix scratch size of softmax (#8642) b3442 luoyu-intel 2024-07-23 07:43:28 +00:00
  • 6fd0937e9f remove the extern "C", MINICPMV_API caitianchi 2024-07-23 15:25:32 +08:00
  • fcde997126 remove load_image_size into clip_ctx caitianchi 2024-07-23 15:24:43 +08:00
  • 3642be9937 fix KEY_HAS_MINICPMV_PROJ caitianchi 2024-07-23 14:55:55 +08:00
  • fe28a7b9d8 llama : clean-up gg/llama-reorganize Georgi Gerganov 2024-07-23 08:38:50 +03:00
  • dad4abe1bc add warn caitianchi 2024-07-23 11:57:42 +08:00
  • 62fa15bcd2 fix cmakefile caitianchi 2024-07-23 11:52:34 +08:00
  • dae3cae841 llama : suffix the internal APIs with "_impl" Georgi Gerganov 2024-07-22 19:59:00 +03:00
  • 39fbaf9f50 llama : redirect external API to internal APIs Georgi Gerganov 2024-07-19 16:56:20 +03:00
  • 66ac80f5b9 make : update llama.cpp deps [no ci] Georgi Gerganov 2024-07-19 16:25:53 +03:00
  • 8fef5b1897 llama : move tokenizers into llama-vocab Georgi Gerganov 2024-07-19 15:44:30 +03:00
  • e7dffa6bc7 llama : deprecate llama_sample_grammar Georgi Gerganov 2024-07-19 14:43:56 +03:00
  • 689d377916 cont Georgi Gerganov 2024-07-19 14:21:33 +03:00
  • b4b242e6bd cont : pre-fetch rules Georgi Gerganov 2024-07-16 23:12:29 +03:00
  • 5a71d1aefd cont Georgi Gerganov 2024-07-16 23:01:20 +03:00
  • 675f305f31 llama : move grammar code into llama-grammar Georgi Gerganov 2024-07-16 16:09:08 +03:00
  • 0ddc8e361c llama : move sampling code into llama-sampling Georgi Gerganov 2024-07-19 18:15:36 +03:00
  • 081fe431aa llama : fix codeshell support (#8599) b3441 Keke Han 2024-07-23 00:43:43 +08:00
  • d94c6e0ccb llama : add support for SmolLm pre-tokenizer (#8609) b3440 Jason Stillerman 2024-07-22 10:43:01 -04:00
  • 4c755832fe remove in line 33 directory in the /cmakelists.txt (not in example, in the main dir caitianchi 2024-07-22 21:44:56 +08:00
  • 566daa5a5b *.py: Stylistic adjustments for python (#8233) Jiří Podivín 2024-07-22 15:44:53 +02:00
  • be8b5b2f8d fix code review caitianchi 2024-07-22 21:34:21 +08:00
  • 6f11a83e4e llama : allow overrides for tokenizer flags (#8614) b3438 Georgi Gerganov 2024-07-22 13:33:22 +03:00
  • e093dd2382 tests : re-enable tokenizer tests (#8611) b3437 Georgi Gerganov 2024-07-22 13:32:49 +03:00
  • 50e05353e8 llama : add Mistral Nemo inference support (#8604) b3436 Douglas Hanley 2024-07-22 03:06:17 -05:00
  • 628154492a server : update doc to clarify n_keep when there is bos token (#8619) Jan Boon 2024-07-22 16:02:09 +08:00
  • 04bab6b7da ggml: fix compile error for RISC-V (#8623) b3434 Mark Zhuang 2024-07-22 15:56:45 +08:00
  • b7c11d36e6 examples: fix android example cannot be generated continuously (#8621) b3433 devojony 2024-07-22 14:54:42 +08:00
  • 45f2c19cc5 flake.lock: Update (#8610) Georgi Gerganov 2024-07-21 16:45:10 +03:00
  • 57349e1db3 llama : allow overrides for tokenizer flags gg/allow-kv-overrides Georgi Gerganov 2024-07-21 14:42:15 +03:00
  • 22f281aa16 examples : Rewrite pydantic_models_to_grammar_examples.py (#8493) M-A 2024-07-20 22:09:17 -04:00
  • 328884f421 gguf-py : fix some metadata name extraction edge cases (#8591) compilade 2024-07-20 21:58:49 -04:00
  • c69c63039c convert_hf : fix Gemma v1 conversion (#8597) compilade 2024-07-20 21:53:01 -04:00
  • 1932a1b871 gguf-py : do not use title case for naming convention compilade/fix-metadata-name-extraction Francis Couture-Harpin 2024-07-20 16:47:43 -04:00