modelscope/docker/install.sh

#!/bin/bash
torch_version=${1:-2.4.0}
torchvision_version=${2:-0.19.0}
torchaudio_version=${3:-2.4.0}
vllm_version=${4:-0.6.0}
lmdeploy_version=${5:-0.6.1}
autogptq_version=${6:-0.7.1}
flashattn_version=${7:-2.7.1.post4}
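# Example usage (a sketch; arguments are positional and fall back to the defaults above):
#   bash install.sh 2.4.0 0.19.0 2.4.0 0.6.0 0.6.1 0.7.1 2.7.1.post4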
# Remove any preinstalled torch stack so the pinned versions below install cleanly
pip uninstall -y torch torchvision torchaudio
pip install --no-cache-dir torch==$torch_version torchvision==$torchvision_version torchaudio==$torchaudio_version
pip install --no-cache-dir -U autoawq lmdeploy==$lmdeploy_version
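# Re-pin the requested torch stack in case autoawq/lmdeploy pulled in different versions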
pip install --no-cache-dir torch==$torch_version torchvision==$torchvision_version torchaudio==$torchaudio_version
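# Additional runtime dependencies (tokenization, quantization, training, audio/video decoding)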
pip install --no-cache-dir tiktoken transformers_stream_generator bitsandbytes deepspeed torchmetrics decord optimum openai-whisper
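# Alternative flash-attn installs (prebuilt wheel or build from source), kept below for reference: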
# pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu123torch2.4cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
# find on: https://github.com/Dao-AILab/flash-attention/releases
# cd /tmp && git clone https://github.com/Dao-AILab/flash-attention.git && cd flash-attention && python setup.py install && cd / && rm -fr /tmp/flash-attention && pip cache purge;
pip install --no-cache-dir flash_attn==$flashattn_version
pip install --no-cache-dir triton auto-gptq==$autogptq_version -U && pip cache purge
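# Install vllm only when the requested version is at least 0.6.0:
# sort -V prints the smaller of the two versions first, so the check succeeds when $vllm_version >= 0.6.0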
if [[ "$(printf '%s\n' "0.6.0" "$vllm_version" | sort -V | head -n1)" = "0.6.0" ]]; then
    # vllm_version is >= 0.6.0
    pip install --no-cache-dir vllm==$vllm_version && pip cache purge
fi
# pip uninstall -y torch-scatter && TORCH_CUDA_ARCH_LIST="6.0;6.1;6.2;7.0;7.5;8.0;8.6;8.9;9.0" pip install --no-cache-dir -U torch-scatter