Improve usability of --model-url & related flags by ochafik · Pull Request #6930 · ggml-org/llama.cpp

ochafik · 2024-04-26T14:26:35Z

--model is now inferred as models/$filename, with the filename taken from --model-url / -mu or --hf-file / -hff if set (it still defaults to models/7B/ggml-model-f16.gguf otherwise). Downloading different URLs will no longer overwrite previous downloads.
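
As a rough illustration of the inference rule described above (not the actual C++ in common/common.cpp), here is a minimal Python sketch. The function name `infer_model_path`, its parameters, the precedence of --hf-file over --model-url, and the stripping of any URL query string are all assumptions made for the example; only the `models/` prefix and the legacy default path come from this PR.

```python
import posixpath
from urllib.parse import urlparse

# Legacy default path named in this PR.
DEFAULT_MODEL_PATH = "models/7B/ggml-model-f16.gguf"

def infer_model_path(model=None, model_url=None, hf_file=None):
    """Hypothetical helper mirroring the described rule; not llama.cpp's real code."""
    if model:
        # An explicit --model always wins.
        return model
    if hf_file:
        # --hf-file / -hff: keep only the file name (precedence is an assumption).
        return "models/" + posixpath.basename(hf_file)
    if model_url:
        # --model-url / -mu: use the last path segment of the URL.
        return "models/" + posixpath.basename(urlparse(model_url).path)
    return DEFAULT_MODEL_PATH
```

With this rule, two different URLs map to two different local files unless they happen to share a file name.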

URL model downloads now write a .json companion metadata file (instead of the previous separate .etag & .lastModified files). It also contains the URL itself, which is useful to remember the exact origin of models & prevents accidental overwrites of files.

Note: This is a breaking change for already downloaded models, as .etag and .lastModified files are now obsolete. If you're used to typing the following:

./main -mu <some-url> -m <some-model-file>

Then you can avoid a re-download by migrating your .etag & .lastModified files to a .json file using a simple Python snippet:

Show `migrate_etag.py` & its usage

import json
import os
import sys

# For each model path given on the command line, fold the legacy
# <model>.etag / <model>.lastModified files into a single <model>.json.
for model in sys.argv[1:]:
    if os.path.exists(f'{model}.etag') and not os.path.exists(f'{model}.json'):
        with open(f'{model}.etag', 'r') as f: etag = f.read()
        with open(f'{model}.lastModified', 'r') as f: last_modified = f.read()
        with open(f'{model}.json', 'w') as f:
            json.dump(dict(etag=etag, lastModified=last_modified), f, indent=2)
        print(f'Created {model}.json')

python migrate_etag.py models/7B/ggml-model-f16.gguf
cat models/7B/ggml-model-f16.gguf.json
# {
#   "etag": "\"40d7e29dab8ea579f8b8087bc9370c8a-359\"",
#   "lastModified": "Fri, 19 Apr 2024 02:34:23 GMT"
# }
rm models/7B/ggml-model-f16.gguf.{etag,lastModified}

Smaller changes:
- Log about etag / modified time changes that cause re-downloads
- Enable the defaulting of --hf-file to --model on server (as was done on main)
- Mitigate risk of buffer overflows in headers handling

make clean && make -j LLAMA_CURL=1 main server

./main -p Test -n 100 -mu https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf
# ...

./main -p Test -n 100 -hfr NousResearch/Meta-Llama-3-8B-Instruct-GGUF -hff Meta-Llama-3-8B-Instruct-Q4_K_M.gguf
# ...

ls models/
# Meta-Llama-3-8B-Instruct-Q4_K_M.gguf
# Meta-Llama-3-8B-Instruct-Q4_K_M.gguf.json
# Phi-3-mini-4k-instruct-q4.gguf
# Phi-3-mini-4k-instruct-q4.gguf.json
# ...

cat models/Phi-3-mini-4k-instruct-q4.gguf.json
# {
#     "url": "https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf",
#     "etag": "\"b83ce18f1e735d825aa3402db6dae311-145\"",
#     "lastModified": "Thu, 25 Apr 2024 21:26:15 GMT"
# }
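
To make the role of the companion file more concrete, here is a hedged Python sketch of how such metadata could be consulted before a download. The helper name `plan_download`, its return values, and the choice to refuse a file recorded under a different URL are assumptions for illustration; the real logic lives in the C++ curl code and may differ.

```python
import json
import os

def plan_download(model_path, url, remote_etag=None, remote_last_modified=None):
    """Hypothetical decision helper; returns "download", "reuse" or "refuse"."""
    if not os.path.exists(model_path):
        return "download"
    meta = {}
    meta_path = model_path + ".json"  # companion file, e.g. foo.gguf.json
    if os.path.exists(meta_path):
        with open(meta_path) as f:
            meta = json.load(f)
    # Assumption: a file recorded as coming from another URL is never overwritten.
    if meta.get("url") and meta["url"] != url:
        return "refuse"
    # Re-download when the server reports a validator that differs from the stored one.
    if remote_etag and meta.get("etag") != remote_etag:
        return "download"
    if remote_last_modified and meta.get("lastModified") != remote_last_modified:
        return "download"
    return "reuse"
```

Here `remote_etag` and `remote_last_modified` stand for the values of the ETag and Last-Modified response headers.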

TODO:

Provide simple bash / python snippet to migrate existing .etag / .lastModified files to JSON (or be backwards compatible)

phymbert · 2024-04-26T14:35:10Z

Great! Please mind that it will force people to re-download files from the remote URL, which is kind of a breaking change.

ochafik · 2024-04-26T15:10:41Z

> Great! Please mind that it will force people to re-download files from the remote URL, which is kind of a breaking change.

@phymbert ah, I forgot, indeed! I initially planned on being backwards compatible (I felt it had low long-term usefulness, but I'm happy to add this code back, it's just a few lines), but I thought it would be easier to provide a code snippet for people to create the JSON file out of the .etag & .lastModified files. Added this as a TODO before I undraft this PR.

(also technically even without a migration snippet people can just use -m models/... to use their already downloaded model(s), but agree it's an unpleasant surprise)

phymbert · 2024-04-26T18:23:02Z

-                        n_items - strlen(last_modified_prefix) - 2); // Remove CRLF
+            std::string header(buffer, n_items);
+            std::smatch match;
+            if (std::regex_match(header, match, std::regex("([^:]+): (.*)\r\n", std::regex_constants::multiline))) {


Will the regex be compiled at each header? Do we really need a regex to parse HTTP headers?

> Do we really need a regex to parse HTTP headers?

Agree std::regex may seem overkill but it's simpler & safer than C string manipulations.

In the previous code for instance, just realized there's at least one buffer overflow bug (cc/ @ggerganov FYI): this strncpy will write beyond the stack-allocated etag's LLAMA_CURL_MAX_HEADER_LENGTH (=256) bytes and into stack-allocated etag_path if the ETag header value length is > 256 bytes, possibly giving HuggingFace (or anyone else you download from) write access to the system (a fix would be to turn the last arg of strncpy to MIN(sizeof(etag) - 1, n_items - strlen(etag_prefix) - 2), but it would make the code a bit harder to read & maintain).

> Will the regex be compiled at each header?

Good point! I had opted to be slightly wasteful in CPU cycles here, as the lifecycle management of regexes is trickier in the C callback context (e.g. can't allocate outside & pass through lambda capture), and the easiest alternatives (static alloc inside the callback or globally) were a bit wasteful in memory, as these regexes aren't useful afterwards. Let's go for potentially shorter startup time? (now using local static allocs).

You could always pass a pointer to the compiled regex in the userdata, however this is not performance sensitive code at all, and it's preferable to keep the code simple and easier to maintain.
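
The trade-off discussed in this thread (compile the header pattern once, then reuse it for every header line) can be sketched in Python as follows. This is only an analogy to the C++ callback, not a port of it: `HEADER_RE` and `parse_header` are hypothetical names, and lowercasing the header name is a choice made for this example because HTTP header names are case-insensitive.

```python
import re

# Compiled once at import time, loosely analogous to a local static std::regex.
# Same pattern as in the diff above: "name: value" terminated by CRLF.
HEADER_RE = re.compile(r"([^:]+): (.*)\r\n")

def parse_header(line):
    """Return (lowercased name, value) for a header line, or None if it is not one."""
    match = HEADER_RE.fullmatch(line)
    if match is None:
        return None  # e.g. the status line or the blank line ending the headers
    return match.group(1).lower(), match.group(2)
```

Because the value is returned as a string of whatever length was received, there is no fixed-size destination buffer to overrun, which is the safety argument made above for the regex approach.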

github-actions · 2024-04-26T19:12:17Z

📈 llama.cpp server for bench-server-baseline on Standard_NC4as_T4_v3 for phi-2-q4_0: 440 iterations 🚀

Expand details for performance related PR only

Concurrent users: 8, duration: 10m
HTTP request : avg=10678.97ms p(95)=29221.78ms fails=, finish reason: stop=382 truncated=58
Prompt processing (pp): avg=113.15tk/s p(95)=512.71tk/s
Token generation (tg): avg=24.41tk/s p(95)=37.58tk/s
ggml-org/models/phi-2/ggml-model-q4_0.gguf parallel=8 ctx-size=16384 ngl=33 batch-size=2048 ubatch-size=256 pp=1024 pp+tg=2048 branch=model-args commit=5598a6a87d45159baf7b842b99bf14812f2233ec

Charts (llama.cpp bench-server-baseline on Standard_NC4as_T4_v3, duration=10m, 440 iterations; one line chart per metric over timestamps 1714423821 --> 1714424447):
- llamacpp:prompt_tokens_seconds
- llamacpp:predicted_tokens_seconds
- llamacpp:kv_cache_usage_ratio
- llamacpp:requests_processing

phymbert · 2024-04-30T07:02:57Z

    Given a server listening on localhost:8080
    And   a model url https://huggingface.co/ggml-org/models/resolve/main/bert-bge-small/ggml-model-f16.gguf
-    And   a model file ggml-model-f16.gguf
+    And   a model file bert-bge-small.gguf


Could you please explain why this change is now necessary?

Sorry I missed this! I think there was another test case that was implicitly downloading another URL to ggml-model-f16.gguf, causing a collision

* args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf)
* args: main & server now call gpt_params_handle_model_default
* args: define DEFAULT_MODEL_PATH + update cli docs
* curl: check url of previous download (.json metadata w/ url, etag & lastModified)
* args: fix update to quantize-stats.cpp
* curl: support legacy .etag / .lastModified companion files
* curl: rm legacy .etag file support
* curl: reuse regex across headers callback calls
* curl: unique_ptr to manage lifecycle of curl & outfile
* curl: nit: no need for multiline regex flag
* curl: update failed test (model file collision) + gitignore *.gguf.json

Olivier Chafik added 5 commits April 26, 2024 00:40

args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf)

40a961d

args: main & server now call gpt_params_handle_model_default

9c0db4d

args: define DEFAULT_MODEL_PATH + update cli docs

e55dfde

curl: check url of previous download (.json metadata w/ url, etag & lastModified)

0664e9b

args: fix update to quantize-stats.cpp

5ce50f6

curl: support legacy .etag / .lastModified companion files

eeb3d58

phymbert reviewed Apr 26, 2024

Comment thread common/common.cpp Outdated

phymbert reviewed Apr 26, 2024

Comment thread common/common.cpp

phymbert reviewed Apr 26, 2024

Comment thread common/common.cpp

phymbert reviewed Apr 26, 2024

Comment thread common/common.cpp Outdated

ochafik added 4 commits April 27, 2024 15:25

curl: rm legacy .etag file support

5c4aea1

curl: reuse regex across headers callback calls

4c4dc25

curl: unique_ptr to manage lifecycle of curl & outfile

abffd1b

Merge remote-tracking branch 'origin/master' into model-args

f70e4d6

ochafik marked this pull request as ready for review April 27, 2024 15:51

curl: nit: no need for multiline regex flag

f97fa9b

phymbert approved these changes Apr 29, 2024

Olivier Chafik added 2 commits April 29, 2024 19:25

Merge remote-tracking branch 'origin/master' into model-args

84b966d

curl: update failed test (model file collision) + gitignore *.gguf.json

5598a6a

ochafik merged commit 8843a98 into ggml-org:master Apr 29, 2024

phymbert reviewed Apr 30, 2024

ochafik mentioned this pull request Jun 8, 2024

url: save -mu downloads to new cache location #7826

Merged

ochafik merged 13 commits into ggml-org:master from ochafik:model-args
