Commit Graph

  • 6f53d8a6b4 docker: add missing vulkan library to base layer and update to 24.04 (#11422) Nuno 2025-01-26 18:22:43 +01:00
  • 19f65187cb cmake: add ggml find package (#11369) b4560 bandoti 2025-01-26 12:07:48 -04:00
  • 1d8ee06000 rpc: fix register position (#11424) b4559 Frank Mai 2025-01-26 23:20:34 +08:00
  • 2cc9b8c32c readme : update hot topics Georgi Gerganov 2025-01-26 14:30:15 +02:00
  • f35726c2fb build: apply MSVC /bigobj option to c/cpp files only (#11423) b4557 Jeff Bolz 2025-01-25 20:10:03 -06:00
  • 4a75d19376 vulkan: compile shaders on-demand (#11406) Jeff Bolz 2025-01-25 15:29:57 -06:00
  • 26771a1491 Hip: disable VMM on hip as it seams that it dosent work in some configurations (#11420) uvos 2025-01-25 21:01:12 +01:00
  • ca6baf76c1 build: add /bigobj to MSVC build (#11407) Jeff Bolz 2025-01-25 11:26:37 -06:00
  • 6e264a905b docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for (#11419) Diego Devesa 2025-01-25 17:22:41 +01:00
  • 49b0e3cec4 server : fix cleaning up stream task (#11418) b4552 Xuan Son Nguyen 2025-01-25 16:36:44 +01:00
  • 20a758155b docker : fix CPU ARM build (#11403) Diego Devesa 2025-01-25 15:22:29 +01:00
  • 00c24acb2a ci : fix line breaks on windows builds (#11409) b4550 Georgi Gerganov 2025-01-25 13:36:48 +02:00
  • 466ea66f33 CANN: Add Ascend CANN build ci (#10217) b4549 jiahao su 2025-01-25 07:26:01 +08:00
  • 5f0db9522f hip : Add hipGraph and VMM support to ROCM (#11362) b4548 uvos 2025-01-25 00:02:23 +01:00
  • de9d2c6f09 test [pack] sl/pr-releases slaren 2025-01-24 22:07:27 +01:00
  • df0edbb0be test slaren 2025-01-24 22:03:31 +01:00
  • 202b1e7105 ci : allow creating artifacts on PRs on demand slaren 2025-01-24 21:36:11 +01:00
  • c5d9effb49 CUDA: fix FP16 cuBLAS GEMM (#11396) b4547 Johannes Gäßler 2025-01-24 21:02:43 +01:00
  • 9fbadaef4f rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (#11356) b4546 uvos 2025-01-24 17:50:49 +01:00
  • 9755129c27 release : pack /lib in the packages (#11392) b4545 Georgi Gerganov 2025-01-24 18:41:30 +02:00
  • 969b264657 Revert "TMP : push artifacts" gg/build-pack-lib-include Georgi Gerganov 2025-01-24 17:58:09 +02:00
  • 5740ec7a66 ci : change ubuntu package to 22.04 Georgi Gerganov 2025-01-24 17:05:44 +02:00
  • 872fd18420 ci : fix typo Georgi Gerganov 2025-01-24 16:46:03 +02:00
  • 39d0621872 ci : macos set build rpath to "@loader_path" Georgi Gerganov 2025-01-24 16:31:12 +02:00
  • dae44bf21a ci : change back to ubuntu latest Georgi Gerganov 2025-01-24 16:28:32 +02:00
  • 537b09e70f TMP : push artifacts Georgi Gerganov 2025-01-24 14:54:24 +02:00
  • 8b2ed1e432 ci : remove obsolete MacOS build Georgi Gerganov 2025-01-24 16:01:52 +02:00
  • f9f65f0162 ci : try to fix macos build rpaths Georgi Gerganov 2025-01-24 16:01:32 +02:00
  • 56e26a7f30 ci : change ubuntu build from latest to 20.04 Georgi Gerganov 2025-01-24 15:58:48 +02:00
  • 194358e3b7 ci : restore the original HIP commands Georgi Gerganov 2025-01-24 15:41:52 +02:00
  • a07c2c8a52 docs : Update readme to build targets for local docker build (#11368) Jafar Uruç 2025-01-24 13:30:13 +00:00
  • 50455ded31 ci : fix HIP cmake compiler options to be on first line Georgi Gerganov 2025-01-24 15:23:22 +02:00
  • 564353c9a3 Revert "TMP : push artifacts" Georgi Gerganov 2025-01-24 15:22:36 +02:00
  • 4decf2c4df TMP : push artifacts Georgi Gerganov 2025-01-24 14:54:24 +02:00
  • 3a35bfe1f7 cmake : put libs in /bin Georgi Gerganov 2025-01-24 14:40:48 +02:00
  • 8137b4bb2b CPU/CUDA: fix (GQA) mul mat back, add CUDA support (#11380) b4543 Johannes Gäßler 2025-01-24 12:38:31 +01:00
  • ff4cb6ef4c release : pack /lib and /include in the packages gg/build-linux-static Georgi Gerganov 2025-01-24 13:28:37 +02:00
  • 1af6945eb0 cmake : avoid -march=native when reproducible build is wanted (#11366) b4542 Bernhard M. Wiedemann 2025-01-24 12:21:35 +01:00
  • 01f37edf1a Update llama-run README.md (#11386) Eric Curtin 2025-01-24 09:39:24 +00:00
  • c07e87f38b server : (webui) put DeepSeek R1 CoT in a collapsible <details> element (#11364) stduhpf 2025-01-24 09:02:38 +01:00
  • 564804b79b tests: fix some mul_mat test gaps (#11375) b4539 Jeff Bolz 2025-01-23 14:51:24 -06:00
  • 05f63cc9ee Update documentation (#11373) b4538 Eric Curtin 2025-01-23 20:04:31 +00:00
  • f7fb43cd0b Add -ngl (#11372) b4537 Eric Curtin 2025-01-23 16:16:18 +00:00
  • 5845661640 server : add more clean up when cancel_tasks is called (#11340) b4536 Xuan Son Nguyen 2025-01-23 13:56:05 +01:00
  • f211d1dc10 Treat hf.co/ prefix the same as hf:// (#11350) b4535 Eric Curtin 2025-01-23 10:38:20 +00:00
  • 955a6c2d91 Vulkan-run-test: fix mmq_wg_denoms (#11343) b4534 amd-dwang 2025-01-23 15:14:28 +08:00
  • 1971adf55e vulkan: sort shaders for more deterministic binary (#11315) b4533 Jeff Bolz 2025-01-23 01:07:50 -06:00
  • 5245729e33 vulkan: fix diag_mask_inf (#11323) b4532 Jeff Bolz 2025-01-23 01:01:17 -06:00
  • 6152129d05 main : update README documentation for batch size (#11353) Diego Devesa 2025-01-22 19:22:20 +01:00
  • 16d3df7ab0 readme : add plugin links (#11355) Georgi Gerganov 2025-01-22 19:44:26 +02:00
  • 12c2bdf2de server : fix draft context not being released (#11354) b4529 Diego Devesa 2025-01-22 17:44:40 +01:00
  • c64d2becb1 minja: sync at https://github.com/google/minja/commit/0f5f7f2b3770eb682fbc11763266d45204173686 (#11352) b4528 Olivier Chafik 2025-01-22 16:16:27 +00:00
  • 96f4053934 Adding logprobs to /v1/completions (#11344) b4527 Jiří Podivín 2025-01-22 12:51:32 +01:00
  • a94f3b2727 common: utils to split / join / repeat strings (from json converter) (#11342) b4526 Olivier Chafik 2025-01-22 09:51:44 +00:00
  • 3e3357fd77 llava : support Minicpm-omni (#11289) b4525 tc-mb 2025-01-22 15:35:48 +08:00
  • 6171c9d258 Add Jinja template support (#11016) b4524 Olivier Chafik 2025-01-21 13:18:51 +00:00
  • e28245f35f export-lora : fix tok_embd tensor (#11330) b4523 Xuan Son Nguyen 2025-01-21 14:07:12 +01:00
  • 6da5bec81c rpc : better caching of the base buffer pointer (#11331) b4522 Radoslav Gerganov 2025-01-21 15:06:41 +02:00
  • 2e2f8f093c linenoise.cpp refactoring (#11301) b4521 Eric Curtin 2025-01-21 09:32:35 +00:00
  • 2139667ec4 metal : fix out-of-bounds write (#11314) b4520 Georgi Gerganov 2025-01-21 08:48:13 +02:00
  • 80d0d6b4b7 common : add -hfd option for the draft model (#11318) b4519 Georgi Gerganov 2025-01-20 22:29:43 +02:00
  • aea8ddd516 vulkan: fix coopmat2 validation failures (#11284) b4518 Jeff Bolz 2025-01-20 10:38:32 -06:00
  • c9e7cbb08b safer jinja llama_chat_templates struct xsn/tmp_jinja_safer Xuan Son Nguyen 2025-01-20 16:58:29 +01:00
  • 9f7add1cde examples : fix add_special conditions (#11311) Georgi Gerganov 2025-01-20 16:36:08 +02:00
  • 90d987b105 mmap: add include for cerrno (#11296) b4516 Christopher Nielsen 2025-01-20 09:02:43 -05:00
  • a4251edd6f cmake: fix shell command quoting in build-info script (#11309) Michael Podvitskiy 2025-01-20 15:02:15 +01:00
  • ec7f3ac9ab llama : add support for Deepseek-R1-Qwen distill model (#11310) b4514 Xuan Son Nguyen 2025-01-20 14:35:07 +01:00
  • ef6dada60c cont : fix whitespaces (#11305) b4513 Georgi Gerganov 2025-01-20 09:29:32 +02:00
  • ae3c1db2f9 llama : re-add LLM_ARCH_PHIMOE (#11305) b4512 Kyle Bruene 2025-01-20 01:21:01 -06:00
  • 92bc493917 tests : increase timeout when sanitizers are enabled (#11300) Georgi Gerganov 2025-01-19 20:22:30 +02:00
  • b9daaffe02 simple-chat : fix BOS being added to each message (#11278) b4510 Georgi Gerganov 2025-01-19 18:12:09 +02:00
  • 90a0349349 recommended way to check if the version is 0.3, as requested by ngxson cedo/add-outetts-v0.3 LostRuins Concedo 2025-01-19 21:43:59 +08:00
  • 99487b57d4 SYCL: Introducing memory host pool (#11251) b4509 Nicolò Scipione 2025-01-19 14:33:34 +01:00
  • cc50356470 minja: fix vigogne (https://github.com/google/minja/pull/22) ochafik 2025-01-18 17:55:04 +00:00
  • e3c475cd12 Disable jinja test that has a cryptic windows failure ochafik 2025-01-18 14:55:27 +00:00
  • a1649cc13f Adding linenoise.cpp to llama-run (#11252) b4508 Eric Curtin 2025-01-18 14:42:31 +00:00
  • 4dd34ff831 cmake : add sanitizer flags for llama.cpp (#11279) Georgi Gerganov 2025-01-18 16:18:15 +02:00
  • f30f099228 server : implement cancellable request (#11285) b4506 Xuan Son Nguyen 2025-01-18 14:12:05 +01:00
  • 0e74c9dabe Add missing optional include to server.cpp ochafik 2025-01-18 11:58:00 +00:00
  • fc60802b6e Rm unused optional include ochafik 2025-01-18 11:35:54 +00:00
  • f26c874179 scripts : restore hf.sh (#11288) Georgi Gerganov 2025-01-18 13:18:32 +02:00
  • b5486956ff added rudimentary support for outetts v0.3 500m and 1b models Concedo 2025-01-18 18:48:49 +08:00
  • 5074e6fecd Fix copy elision warning ochafik 2025-01-18 10:48:03 +00:00
  • 33322e823e Flush stdout in chat template before potential crash ochafik 2025-01-18 10:38:21 +00:00
  • e63520f37a Forward decl minja::chat_template to avoid eager json dep ochafik 2025-01-18 10:37:56 +00:00
  • 6390a998bf tts : add guide tokens support (#11186) b4504 LostRuins Concedo 2025-01-18 18:20:57 +08:00
  • ba421dd04e gguf-test: tensor data comparison jg/llama-sanitize Johannes Gäßler 2025-01-18 09:49:47 +01:00
  • 44e18ef939 vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281) b4503 Jeff Bolz 2025-01-18 02:26:50 -06:00
  • ee1e10e21e Normalize newlines in test-chat-templates for windows tests ochafik 2025-01-18 02:52:40 +00:00
  • d5fa351a24 Revert LLAMA_CHATML_TEMPLATE refactor ochafik 2025-01-18 01:04:12 +00:00
  • 81c0d437a5 Attempt to fix linkage of LLAMA_CHATML_TEMPLATE ochafik 2025-01-18 00:56:19 +00:00
  • 40db78963b Merge remote-tracking branch 'origin/master' into jinja ochafik 2025-01-18 00:44:37 +00:00
  • b75d0622e4 Refactor common_chat_* functions to accept minja template + use_jinja option ochafik 2025-01-18 00:43:38 +00:00
  • 7000623c00 tests : fix gguf context use in same_tensor_data Georgi Gerganov 2025-01-17 16:26:12 +02:00
  • e872097c35 cmake : apply only sanitizer flags at top level Georgi Gerganov 2025-01-17 15:48:39 +02:00
  • 9d1b20ad1a cmake : move llama.cpp compile flags to top level lists Georgi Gerganov 2025-01-17 15:40:03 +02:00
  • 9a03bc811f cmake : move sanitizer flags to llama_add_compile_flags Georgi Gerganov 2025-01-17 15:33:36 +02:00
  • ce293d837c tests : fix compile warnings Georgi Gerganov 2025-01-17 15:22:36 +02:00
  • 72dc7bff4d cmake : add sanitizer flags for llama.cpp Georgi Gerganov 2025-01-17 15:18:24 +02:00
  • 3edfa7d375 llama.android: add field formatChat to control whether to parse special tokens when send message (#11270) b4502 codezjx 2025-01-17 20:57:56 +08:00