Commit Graph

  • 6e75fe940d disable mtmd video on i/tv/visionos cisc/mtmd-disable-i-tv-vision-video Sigbjørn Skjæret 2026-06-25 21:43:55 +02:00
  • 9d5d882d8c model : Add label for LFM2.5-230M (#25008) master Tarek Dakhran 2026-06-25 18:58:52 +02:00
  • 81313a35ae type check for get_arr_int xsn/mtmd_fix_2 Xuan Son Nguyen 2026-06-25 18:54:57 +02:00
  • a4b1c14c1a refactor a bit Xuan Son Nguyen 2026-06-25 18:49:30 +02:00
  • 732d5b6fd8 fix Xuan Son Nguyen 2026-06-25 17:36:53 +02:00
  • 1ec44d178d CUDA: Various fixes to cpy.cu (#25000) Oliver Simons 2026-06-25 17:29:23 +02:00
  • 8eef8c1b21 mtmd: add more validations Xuan Son Nguyen 2026-06-25 17:29:23 +02:00
  • c7cddefcbd misc: fix labeler (#25012) Xuan-Son Nguyen 2026-06-25 17:23:37 +02:00
  • e9d1b76d0a server: use status code 403 for disabled features (#24970) Xuan-Son Nguyen 2026-06-25 16:36:40 +02:00
  • 2e4cbade70 Merge branch 'master' into xsn/mtmd_ds_ocr_tiles xsn/mtmd_ds_ocr_tiles Xuan Son Nguyen 2026-06-25 16:28:50 +02:00
  • 099bf06952 misc: update lables (#24920) Xuan-Son Nguyen 2026-06-25 16:26:56 +02:00
  • 68ed5149fb bring back examples, add mtmd xsn/update_labels Xuan Son Nguyen 2026-06-25 15:23:03 +02:00
  • 60bc8866b1 common: refactor model handling (#24980) Xuan-Son Nguyen 2026-06-25 15:17:51 +02:00
  • bf05250df9 use unsigned ints 0cc4m/vulkan-submission-threshold-flops Ruben Ortlam 2026-06-25 15:02:50 +02:00
  • e8ecce53b8 docs : Eagle3 qwen3 draft model support (#24977) Kashif Rasul 2026-06-25 14:58:00 +02:00
  • cb2a4259aa use flops instead of matmul src0 tensor size for submission threshold Ruben Ortlam 2026-06-24 15:24:56 +02:00
  • 492adff8fb vulkan: extract flops calculation into function Ruben Ortlam 2026-06-24 14:30:38 +02:00
  • 683b04cc4a app : add the llama download subcommand (#24982) Adrien Gallouët 2026-06-25 13:36:36 +02:00
  • f728adab68 ggml : address integer overflows in binary ops CUDA implementation (#24706) fairydreaming 2026-06-25 10:06:44 +02:00
  • 3e61ea0e2f ui: fix always-show-sidebar-on-desktop setting after navigation refactor (#24979) Pascal 2026-06-25 09:45:55 +02:00
  • fdbd6abee2 tests : synchronize contexts at end of test-thread-safety (#24935) Christopher Albert 2026-06-25 08:22:51 +02:00
  • e12a0128ab build: include libmtmd in Apple XCFramework (#21935) Abraham Gonzalez 2026-06-25 01:37:30 -04:00
  • b3ce5cedf4 quant : fix quantizing moe with mtp (#24986) b9789 Sigbjørn Skjæret 2026-06-25 07:36:49 +02:00
  • e9fb3b3fc0 sycl : support --split-mode tensor (#24152) b9788 David Spruill 2026-06-25 01:35:21 -04:00
  • 9c10954865 sycl : fix the failed UT cases of conv_3d (#24900) b9787 Neo Zhang 2026-06-25 13:27:58 +08:00
  • fdb2c11c70 opencl: support non-contig rows in norm (#24965) b9786 lhez 2026-06-24 19:21:25 -07:00
  • 09cedfd699 chat: harden caps check (#24973) b9785 Piotr Wilkin (ilintar) 2026-06-25 02:49:22 +02:00
  • 8be759e6f7 hexagon: MUL_MAT and MUL_MAT_ID rework : 32x32 tiled weight repack, kernel-params, cached graphs (#24954) b9784 Max Krasnyansky 2026-06-24 12:14:25 -07:00
  • 894bb27af3 mtmd: model: unlimited-ocr: converter + parity test (#24969) Saba Fallah 2026-06-24 18:20:22 +02:00
  • fb401045cc common: remove unused json-partial (#24968) b9782 Xuan-Son Nguyen 2026-06-24 18:12:16 +02:00
  • 51eae8cfca vulkan: allow reducing the graph submission batches to avoid timeouts (#24872) b9781 Wagner Bruna 2026-06-24 11:29:24 -03:00
  • 3199d5357c chat: harden caps check caps-harden Piotr Wilkin 2026-06-24 15:16:13 +02:00
  • a14f8d2ed5 fix test case xsn/server_403_disabled_endpoints Xuan Son Nguyen 2026-06-24 13:38:25 +02:00
  • d9a0c0fe9b cont Xuan Son Nguyen 2026-06-24 13:29:28 +02:00
  • 796b1ada8d server: use status code 403 for disabled features Xuan Son Nguyen 2026-06-24 12:56:51 +02:00
  • ef687feb42 common: remove unused json-partial xsn/rm_unused_json_partial Xuan Son Nguyen 2026-06-24 12:49:42 +02:00
  • 1191758c5d vulkan: fail the build when a shader fails to compile (#24450) b9780 liminfei-amd 2026-06-24 17:42:03 +08:00
  • 00139b660b ui: loading bar below the model picker (#24931) Pascal 2026-06-24 10:50:44 +02:00
  • ef9c13d4c2 ui: New Logo + Navigation cleanup & Mobile UI/UX improvements (#24897) Aleksander Grygier 2026-06-24 10:21:33 +02:00
  • 88636e178f model : Add LFM2.5-ColBERT-350M and LFM2.5-Embedding-350M (#24913) b9777 Tarek Dakhran 2026-06-24 08:49:46 +02:00
  • ac4105d68b vulkan: Apply bias before softmax in FA, to avoid overflow (#24909) b9776 Jeff Bolz 2026-06-23 22:34:00 -05:00
  • a432e6f863 use destructor instead xsn/cli_http_based Xuan Son Nguyen 2026-06-23 22:57:20 +02:00
  • 5d67f69f59 remove outdated comment Xuan Son Nguyen 2026-06-23 22:49:40 +02:00
  • beef5cf077 Apply suggestions from code review Xuan-Son Nguyen 2026-06-23 22:48:04 +02:00
  • be4a6a63eb server : check draft context creation error (#24922) b9775 kononnable 2026-06-23 16:56:50 +02:00
  • 72a9269172 vulkan: support all backend tests for SQR/SQRT/SIN/COS/CLAMP/LEAKY_RELU/NORM (#24582) b9774 Jeff Bolz 2026-06-23 09:48:24 -05:00
  • b093e46873 case: router with only one model Xuan Son Nguyen 2026-06-23 16:47:30 +02:00
  • 1401fc3ca7 cli support router mode Xuan Son Nguyen 2026-06-23 16:39:59 +02:00
  • 85c58bbcd0 remote server ok Xuan Son Nguyen 2026-06-23 16:19:28 +02:00
  • 19296c1735 working Xuan Son Nguyen 2026-06-23 16:09:09 +02:00
  • 92e854ab83 vulkan: Support GET_ROWS_BACK (#24883) b9773 Jeff Bolz 2026-06-23 08:39:37 -05:00
  • c5606364b2 vulkan: support CONV_3D (#24612) Jeff Bolz 2026-06-23 08:39:20 -05:00
  • 0eb874d374 vulkan: make mul_mm ALIGNED a spec constant (#24689) b9771 Jeff Bolz 2026-06-23 07:26:17 -05:00
  • 90c111bf98 Merge branch 'master' into xsn/cli_http_based Xuan Son Nguyen 2026-06-23 13:29:22 +02:00
  • 75ad0b23ed server: fix remote preset handling, add test (#24938) b9770 Xuan-Son Nguyen 2026-06-23 13:28:34 +02:00
  • f7421eabe8 wip Xuan Son Nguyen 2026-06-23 13:28:14 +02:00
  • 59797670dc cli: move to HTTP-based implementation Xuan Son Nguyen 2026-06-23 13:14:28 +02:00
  • c926ad0985 vulkan: link ggml-cpu when GGML_VULKAN_CHECK_RESULTS / RUN_TESTS are enabled (#24444) b9769 Wyatt Caldwell 2026-06-23 03:55:46 -07:00
  • a3900a6694 model: Granite Speech Plus (#24818) b9768 Gabe Goodhart 2026-06-23 04:03:31 -06:00
  • 7c908502ea ggml-webgpu: improve MTP inference by using mat-vec path for small batches (#24811) b9767 Masashi Yoshimura 2026-06-23 17:13:55 +09:00
  • 035cd8f9a6 codeowners: add yomaytk to ggml-webgpu (#24930) Masashi Yoshimura 2026-06-23 15:19:34 +09:00
  • 73618f27a8 server: improve user message detection and create checkpoints at every user message (#24176) b9765 Aldehir Rojas 2026-06-23 00:27:28 -05:00
  • 23ee8797e1 opencl: q8_0 gemv precision improvement (#24923) Shawn Gu 2026-06-22 22:25:21 -07:00
  • a19f3ea631 misc: update lables Xuan Son Nguyen 2026-06-23 00:22:12 +02:00
  • dec5ca5577 server : Add id to tool call responses api (#24882) b9763 Matt Thompson 2026-06-22 14:03:12 -07:00
  • 9c0ac887f3 ui: Prioritize favorite models in model selection (#24766) Mahdiou Diallo 2026-06-22 21:00:21 +02:00
  • 095058ca19 add arg --threads-sampling xsn/server_multithread_sampling Xuan Son Nguyen 2026-06-22 20:03:49 +02:00
  • c62fdd5fd0 working Xuan Son Nguyen 2026-06-22 19:38:25 +02:00
  • 41ed530be2 wip Xuan Son Nguyen 2026-06-22 19:30:11 +02:00
  • fe03cce8db server: run sampling in a threadpool Xuan Son Nguyen 2026-06-22 19:05:39 +02:00
  • 721354fbdf server: (router) move model downloading to dedicated process (#24834) b9761 Xuan-Son Nguyen 2026-06-22 18:24:04 +02:00
  • 6ee0f65793 server: refactor/generalize input file schema (#24299) b9760 Xuan-Son Nguyen 2026-06-22 16:42:47 +02:00
  • 1b82e9ae51 fix windows xsn/server_input_file_schema Xuan Son Nguyen 2026-06-22 16:20:56 +02:00
  • 61653c7989 Merge branch 'master' into xsn/server_input_file_schema Xuan Son Nguyen 2026-06-22 16:19:59 +02:00
  • 099b579acb ui: model status and load progress via /models/sse feed (#24878) Pascal 2026-06-22 15:55:30 +02:00
  • 037397792a vulkan: split ggml-vulkan.cpp file 0cc4m/vulkan-cpp-split Ruben Ortlam 2026-06-22 15:50:01 +02:00
  • bec3083830 metal : per-op source split + parallel compile (#24021) dev-metal YiChen Lv 2026-06-20 18:36:32 +08:00
  • f8cc15f163 [SYCL] support bf16 on bin_bcast OP and unary OPs (#24838) b9758 Neo Zhang 2026-06-22 19:09:02 +08:00
  • 37957e8531 sampling : remove unconditional softmax+sort in top-n-sigma sampler (#22645) b9757 Tim Neumann 2026-06-22 13:08:32 +02:00
  • d0f9d2e5ac server: fix edit_file crash on append at end of file (line_start -1) (#24893) b9756 Pascal 2026-06-22 10:55:28 +02:00
  • 0ef6f06d55 docs/android.md: Add dependency libandroid-spawn for building in termux (#21812) b9755 aafsmarak 2026-06-22 09:18:31 +05:30
  • 52b3df0023 common/peg : implement ac parser for stricter grammar generation (#24869) b9754 Aldehir Rojas 2026-06-21 16:20:58 -05:00
  • 7c082bc417 server: fix report progress for loading spec models, add "stages" list (#24870) b9753 Xuan-Son Nguyen 2026-06-21 17:36:52 +02:00
  • bddfd2b113 server: refactor batch construction (#24843) b9752 Xuan-Son Nguyen 2026-06-21 14:16:11 +02:00
  • 0d135df48c mtmd: fix mtmd_get_memory_usage (#24867) b9751 Xuan-Son Nguyen 2026-06-21 14:12:15 +02:00
  • bf533823cd jinja : implement call statement (#24847) b9750 Sigbjørn Skjæret 2026-06-21 14:04:52 +02:00
  • 2f89acc2bc mtmd: add load progress callback (#24865) Xuan-Son Nguyen 2026-06-21 13:40:52 +02:00
  • 7ac864bf97 disable DEBUG_TIMINGS xsn/server_refactor_batch Xuan Son Nguyen 2026-06-21 13:38:09 +02:00
  • d37414510b address comments Xuan Son Nguyen 2026-06-21 13:15:58 +02:00
  • bfa3219177 server: add "verbose" field to schema (#24864) b9748 Xuan-Son Nguyen 2026-06-21 13:03:14 +02:00
  • d6d899580d server: real-time model load progress tracking via /models/sse (#24828) b9747 Xuan-Son Nguyen 2026-06-21 11:58:14 +02:00
  • f1ef61fb1b server: add "verbose" field to schema xsn/server_verbose_field Xuan Son Nguyen 2026-06-21 11:16:06 +02:00
  • 8a118ee86c minor : clean-up whitespaces (#24862) Georgi Gerganov 2026-06-21 11:37:12 +03:00
  • d789527482 spec : Support Step3.5/3.7 flash mtp3 (#24340) b9745 YiChen Lv 2026-06-21 16:33:18 +08:00
  • 063d9c156e common/peg : refactor until gbnf grammar generation (#24839) b9744 Aldehir Rojas 2026-06-20 21:15:06 -05:00
  • c57607016a common/json-schema-to-grammar : align spacing rules with parsers (#24835) b9743 Aldehir Rojas 2026-06-20 17:43:04 -05:00
  • 4a80943174 fix(hexagon): use padded stride for ssm-conv weights (#24470) b9742 Guanhuai Zhang 2026-06-21 05:58:49 +08:00
  • 447b0c3646 poc: threadpool sampling xsn/tmp_smpl_parallel Xuan Son Nguyen 2026-06-20 22:08:42 +02:00
  • a527509d0f debug: force llama_synchronize for accurate timings Xuan Son Nguyen 2026-06-20 20:22:31 +02:00
  • 7486a39756 (debug) add timings Xuan Son Nguyen 2026-06-20 20:12:05 +02:00