intel/intel-extension-for-pytorch

Verified Publisher

By Intel Corporation

Updated 7 days ago

Intel® Extension for PyTorch*

Intel® Extension for PyTorch* extends PyTorch* with up-to-date feature optimizations for an extra performance boost on Intel hardware.

On Intel CPUs, the optimizations take advantage of the following instruction sets:

  • Intel® Advanced Matrix Extensions (Intel® AMX)
  • Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
  • Vector Neural Network Instructions (VNNI)
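
To check which of these instruction sets a given Linux host exposes, one option is to read the CPU flags from `/proc/cpuinfo`. The sketch below is illustrative and not part of the extension: the helper name `detect_isa` is hypothetical, and the flag names (`amx_tile`, `avx512f`, `avx512_vnni`, `avx_vnni`) are those reported by the Linux kernel.

```python
from pathlib import Path

# Linux /proc/cpuinfo flag names for the instruction sets listed above.
ISA_FLAGS = {
    "AMX": {"amx_tile"},
    "AVX-512": {"avx512f"},
    "VNNI": {"avx512_vnni", "avx_vnni"},
}


def detect_isa(cpuinfo_text: str) -> dict:
    """Return {ISA name: True/False} based on the first 'flags' line found."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            break
    return {name: bool(wanted & flags) for name, wanted in ISA_FLAGS.items()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # only present on Linux
        for name, present in detect_isa(cpuinfo.read_text()).items():
            print(f"{name}: {'yes' if present else 'no'}")
```

A host that lacks one of these flags can still run the images; it simply will not benefit from the corresponding optimized code paths.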

On Intel GPUs, Intel® Extension for PyTorch* provides easy GPU acceleration through the PyTorch* xpu device. The following Intel GPUs are supported:

Images available here start with the Ubuntu* 22.04 base image, with Intel® Extension for PyTorch* built for different use cases, plus some additional software. The Python Dockerfile at https://github.com/intel/ai-containers is used to generate the images below.

Note: There are two Docker Hub repositories (intel/intel-extension-for-pytorch and intel/intel-optimized-pytorch) that are routinely updated with the latest images. However, some legacy images have not been published to both repositories.

XPU images

The images below include support for both CPU and GPU optimizations:


docker run -it --rm \
    --device /dev/dri \
    -v /dev/dri/by-path:/dev/dri/by-path \
    --ipc=host \
    intel/intel-extension-for-pytorch:2.5.10-xpu
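
Once inside the container, a quick way to confirm that the GPU is visible is to query the `xpu` device from Python. This is a minimal sketch, assuming the image provides `torch` and `intel_extension_for_pytorch`; the helper name `xpu_summary` is hypothetical, and the function falls back to a message when the packages are missing.

```python
def xpu_summary() -> str:
    """Describe the XPU devices PyTorch can see, or explain why none are found."""
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401  (enables the xpu device)
    except ImportError as err:
        return f"PyTorch/IPEX not importable: {err}"

    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return "No XPU device available (was --device /dev/dri passed?)"

    names = [xpu.get_device_name(i) for i in range(xpu.device_count())]
    return "XPU devices: " + ", ".join(names)


print(xpu_summary())
```

If no device is reported, the `--device /dev/dri` flag in the run command is the first thing to check, since the container cannot see the GPU without it.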

The images below additionally include Jupyter Notebook server:

| Tag(s) | PyTorch | IPEX | Driver | Jupyter Port | Dockerfile |
| --- | --- | --- | --- | --- | --- |
| 2.5.10-xpu-pip-jupyter | v2.5.1 | v2.5.10+xpu | 1057 | 8888 | v0.4.0-Beta |
| 2.3.110-xpu-pip-jupyter | v2.3.1 | v2.3.110+xpu | 950 | 8888 | v0.4.0-Beta |
| 2.1.40-xpu-pip-jupyter | v2.1.0 | v2.1.40+xpu | 914 | 8888 | v0.4.0-Beta |
| 2.1.20-xpu-pip-jupyter | v2.1.0 | v2.1.20+xpu | 803 | 8888 | v0.3.4 |
| 2.1.10-xpu-pip-jupyter | v2.1.0 | v2.1.10+xpu | 736 | 8888 | v0.2.3 |

Run the XPU Jupyter Container
docker run -it --rm \
    -p 8888:8888 \
    --device /dev/dri \
    -v /dev/dri/by-path:/dev/dri/by-path \
    intel/intel-extension-for-pytorch:2.5.10-xpu-pip-jupyter

After running the command above, copy the URL (something like http://127.0.0.1:$PORT/?token=***) into your browser to access the notebook server.

CPU only images

The images below are built only with CPU optimizations (GPU acceleration support was deliberately excluded):

Run the CPU Container
docker run -it --rm intel/intel-extension-for-pytorch:latest
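
Inside the CPU container, the extension is typically applied to a model with `ipex.optimize`. The following is a minimal sketch, assuming `torch` and `intel_extension_for_pytorch` are importable (as they should be in these images). The toy model and the helper name `optimized_inference` are placeholders for illustration, and the function returns `None` when the packages are not installed.

```python
def optimized_inference():
    """Run one bfloat16 forward pass through an IPEX-optimized toy model."""
    try:
        import torch
        import intel_extension_for_pytorch as ipex
    except ImportError:
        return None  # not running inside an IPEX-enabled environment

    # Placeholder model; substitute your own torch.nn.Module.
    model = torch.nn.Sequential(
        torch.nn.Linear(64, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 8),
    ).eval()

    # Apply the extension's CPU optimizations for bfloat16 inference.
    model = ipex.optimize(model, dtype=torch.bfloat16)

    x = torch.randn(4, 64)
    with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
        y = model(x)
    return tuple(y.shape)


print(optimized_inference())
```

The bfloat16 path benefits most on CPUs with AMX or AVX-512 support; on other hardware the same code should still run, only without that speedup.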

The images below additionally include Jupyter Notebook server:

docker run -it --rm \
    -p 8888:8888 \
    -v $PWD/workspace:/workspace \
    -w /workspace \
    intel/intel-extension-for-pytorch:2.6.0-pip-jupyter

After running the command above, copy the URL (something like http://127.0.0.1:$PORT/?token=***) into your browser to access the notebook server.


The images below additionally include Intel® oneAPI Collective Communications Library (oneCCL) and Neural Compressor (INC):

Note
Passwordless SSH connection is also enabled in the image, but the container does not contain any SSH ID keys. The user needs to mount those keys at `/root/.ssh/id_rsa` and `/etc/ssh/authorized_keys`.
Tip
Before mounting any keys, set the permissions of those files with `chmod 600 authorized_keys; chmod 600 id_rsa` so that only the file owner can read and write them.
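
Mode `600` means read and write for the owner only, and OpenSSH refuses private keys with looser permissions. As a quick pre-flight check before mounting, the permission bits can be verified from Python. This is a sketch using only the standard library; the helper names `has_mode_600` and `check_keys` are hypothetical.

```python
import os
import stat


def has_mode_600(path: str) -> bool:
    """True if the file's permission bits are exactly rw------- (0o600)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600


def check_keys(paths):
    """Return the subset of paths whose permissions are not 0o600."""
    return [p for p in paths if not has_mode_600(p)]
```

For example, `check_keys(["id_rsa", "authorized_keys"])` returns an empty list once both files are ready to be mounted.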

Setup and Run IPEX Multi-Node Container

Important
Maintenance, bug fixes, and releases of the Intel® Extension for PyTorch* Multi-Node Container for Xeon Processors have ceased. The last supported version is `2.4.0`. For future releases, please use the Intel® Extension for PyTorch* Multi-Node Container for XPU.

Some additional assembly is required to utilize this container with OpenSSH. To perform any kind of DDP (Distributed Data Parallel) execution, containers are assigned the roles of launcher and worker respectively:

SSH Server (Worker)

  1. Authorized Keys : /etc/ssh/authorized_keys

SSH Client (Launcher)

  1. Private User Key : /root/.ssh/id_rsa

To add these files correctly please follow the steps described below.

  1. Setup ID Keys

    You can use the commands provided below to generate the identity keys for OpenSSH.

    ssh-keygen -q -N "" -t rsa -b 4096 -f ./id_rsa
    touch authorized_keys
    cat id_rsa.pub >> authorized_keys
    
  2. Configure the permissions and ownership for all of the files you have created so far

    chmod 600 id_rsa config authorized_keys
    chown root:root id_rsa.pub id_rsa config authorized_keys
    
  3. Create a hostfile for torchrun or ipexrun. (Optional)

    Host host1
        HostName <Hostname of host1>
        IdentitiesOnly yes
        IdentityFile /root/.ssh/id_rsa
        Port <SSH Port>
    Host host2
        HostName <Hostname of host2>
        IdentitiesOnly yes
        IdentityFile /root/.ssh/id_rsa
        Port <SSH Port>
    ...
    
  4. Configure Intel® oneAPI Collective Communications Library in your Python script

    import os
    
    import torch.distributed as dist
    import oneccl_bindings_for_pytorch  # importing registers the "ccl" backend
    
    # WORLD_SIZE and RANK are expected to be set by the launcher
    dist.init_process_group(
        backend="ccl",
        init_method="tcp://127.0.0.1:3022",
        world_size=int(os.environ.get("WORLD_SIZE")),
        rank=int(os.environ.get("RANK")),
    )
    
  5. Now start the workers and execute DDP on the launcher

    1. Worker run command:

      docker run -it --rm \
          --net=host \
          -v $PWD/authorized_keys:/etc/ssh/authorized_keys \
          -v $PWD/tests:/workspace/tests \
          -w /workspace \
          intel/intel-extension-for-pytorch:2.4.0-pip-multinode \
          bash -c '/usr/sbin/sshd -D'
      
    2. Launcher run command:

      docker run -it --rm \
          --net=host \
          -v $PWD/id_rsa:/root/.ssh/id_rsa \
          -v $PWD/tests:/workspace/tests \
          -v $PWD/hostfile:/workspace/hostfile \
          -w /workspace \
          intel/intel-extension-for-pytorch:2.4.0-pip-multinode \
          bash -c 'ipexrun cpu --nnodes 2 --nprocs-per-node 1 --master-addr 127.0.0.1 --master-port 3022 /workspace/tests/ipex-resnet50.py --ipex --device cpu --backend ccl'
      
Note
Intel® MPI can be configured based on your machine settings. If the above commands do not work for you, see the documentation for how to configure it for your network.
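
With more than a couple of nodes, writing one `Host` block per machine by hand gets repetitive. The sketch below renders those blocks from a list of hosts. The helper name `ssh_config`, the default port of 22, and the example hostnames are assumptions for illustration; the default identity path assumes the key is mounted at `/root/.ssh/id_rsa`, as in the launcher command above.

```python
def ssh_config(hosts, port=22, identity_file="/root/.ssh/id_rsa"):
    """Render one OpenSSH `Host` block per (alias, hostname) pair."""
    blocks = []
    for alias, hostname in hosts:
        blocks.append(
            f"Host {alias}\n"
            f"    HostName {hostname}\n"
            f"    IdentitiesOnly yes\n"
            f"    IdentityFile {identity_file}\n"
            f"    Port {port}\n"
        )
    return "".join(blocks)


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the real worker addresses.
    print(ssh_config([("host1", "node-a.example.com"),
                      ("host2", "node-b.example.com")]))
```

Redirecting the printed output to a file gives the host entries in the same layout as step 3.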

Enable DeepSpeed* optimizations

To enable DeepSpeed* optimizations with Intel® oneAPI Collective Communications Library, add the following to your Python script:

import deepspeed

# Rather than dist.init_process_group(), use deepspeed.init_distributed()
deepspeed.init_distributed(backend="ccl")

Additionally, if you have a DeepSpeed* configuration, you can use the command below as your launcher to run your script with that configuration:

docker run -it --rm \
    --net=host \
    -v $PWD/id_rsa:/root/.ssh/id_rsa \
    -v $PWD/tests:/workspace/tests \
    -v $PWD/hostfile:/workspace/hostfile \
    -v $PWD/ds_config.json:/workspace/ds_config.json \
    -w /workspace \
    intel/intel-extension-for-pytorch:2.4.0-pip-multinode \
    bash -c 'deepspeed --launcher IMPI \
        --master_addr 127.0.0.1 --master_port 3022 \
        --deepspeed_config ds_config.json --hostfile /workspace/hostfile \
        /workspace/tests/ipex-resnet50.py --ipex --device cpu --backend ccl --deepspeed'
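
The command above expects a `ds_config.json` in the current directory. If you do not have one yet, the sketch below generates a small starting point. The helper name `make_ds_config` is hypothetical and every value is illustrative, not a recommended setting; the keys used are standard DeepSpeed configuration fields.

```python
import json


def make_ds_config(world_size: int, micro_batch: int = 8, grad_accum: int = 1) -> dict:
    """Build a small DeepSpeed config dict; all values here are illustrative."""
    return {
        # DeepSpeed expects:
        # train_batch_size == micro_batch * grad_accum * world_size
        "train_batch_size": micro_batch * grad_accum * world_size,
        "train_micro_batch_size_per_gpu": micro_batch,
        "gradient_accumulation_steps": grad_accum,
        "bf16": {"enabled": True},
        "zero_optimization": {"stage": 1},
    }


if __name__ == "__main__":
    # Two processes in total, e.g. two nodes with one process each.
    print(json.dumps(make_ds_config(world_size=2), indent=2))
```

Saving the printed JSON as `ds_config.json` next to the `tests` directory makes it available to the `-v $PWD/ds_config.json:/workspace/ds_config.json` mount.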

The image below is an extension of the IPEX Multi-Node Container designed to run Hugging Face Generative AI scripts. The container has the typical installations needed to run and fine-tune PyTorch generative text models from Hugging Face. It can be used to run multi-node jobs using the same instructions as the IPEX Multi-Node Container.

| Tag(s) | PyTorch | IPEX | oneCCL | HF Transformers | Dockerfile |
| --- | --- | --- | --- | --- | --- |
| 2.4.0-pip-multinode-hf-4.44.0-genai | v2.4.0 | v2.4.0+cpu | v2.4.0 | v4.44.0 | v0.4.0-Beta |

Below is an example that shows a single-node job with the existing finetune.py script.

# Change into home directory first and run the command
docker run -it \
    -v $PWD/workflows/charts/huggingface-llm/scripts:/workspace/scripts \
    -w /workspace/scripts \
    intel/intel-extension-for-pytorch:2.4.0-pip-multinode-hf-4.44.0-genai \
    bash -c 'python finetune.py <script-args>'

The images below are TorchServe* with CPU Optimizations:

For more details, follow the procedure in the TorchServe instructions.

The images below are TorchServe* with XPU Optimizations:

| Tag(s) | PyTorch | IPEX | Dockerfile |
| --- | --- | --- | --- |
| 2.3.110-serving-xpu | v2.3.1 | v2.3.110+xpu | v0.4.0-Beta |

CPU only images with Intel® Distribution for Python*

The images below are built only with CPU optimizations (GPU acceleration support was deliberately excluded) and include Intel® Distribution for Python*:

The images below additionally include Jupyter Notebook server:

The images below additionally include Intel® oneAPI Collective Communications Library (oneCCL) and Neural Compressor (INC):

XPU images with Intel® Distribution for Python*

The images below are built with both CPU and GPU optimizations and include Intel® Distribution for Python*:

The images below additionally include Jupyter Notebook server:

| Tag(s) | PyTorch | IPEX | Driver | Jupyter Port | Dockerfile |
| --- | --- | --- | --- | --- | --- |
| 2.5.10-xpu-idp-jupyter | v2.5.1 | v2.5.10+xpu | 1057 | 8888 | v0.4.0-Beta |
| 2.3.110-xpu-idp-jupyter | v2.3.1 | v2.3.110+xpu | 950 | 8888 | v0.4.0-Beta |
| 2.1.40-xpu-idp-jupyter | v2.1.0 | v2.1.40+xpu | 914 | 8888 | v0.4.0-Beta |
| 2.1.20-xpu-idp-jupyter | v2.1.0 | v2.1.20+xpu | 803 | 8888 | v0.3.4 |
| 2.1.10-xpu-idp-jupyter | v2.1.0 | v2.1.10+xpu | 736 | 8888 | v0.2.3 |
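
The tags on this page appear to follow a common pattern: an IPEX version, followed by hyphen-separated markers such as `xpu`, `pip` or `idp`, and `jupyter`. When scripting image selection, that pattern can be split into fields. The sketch below is based only on the tags shown here and is not a documented naming contract; the helper name `parse_tag` is hypothetical.

```python
def parse_tag(tag: str) -> dict:
    """Split an image tag such as '2.5.10-xpu-idp-jupyter' into its parts."""
    version, *parts = tag.split("-")
    return {
        "ipex_version": version,
        # Tags without an 'xpu' marker are treated as CPU-only images.
        "device": "xpu" if "xpu" in parts else "cpu",
        # 'pip' or 'idp' (Intel Distribution for Python); None if unmarked.
        "python": next((p for p in parts if p in ("pip", "idp")), None),
        "jupyter": "jupyter" in parts,
    }


print(parse_tag("2.5.10-xpu-idp-jupyter"))
```

Tags that do not start with a version, such as `latest`, are outside what this sketch handles.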

Build from Source

To build the images from source, clone the AI Containers repository, follow the main README.md file to set up your environment, and run the following commands:

cd pytorch
docker compose build ipex-base
docker compose run ipex-base

You can find the list of services below for each container in the group:

| Service Name | Description |
| --- | --- |
| ipex-base | Base image with Intel® Extension for PyTorch* |
| jupyter | Adds Jupyter Notebook server |
| multinode | Adds Intel® oneAPI Collective Communications Library and INC |
| xpu | Adds Intel GPU Support |
| xpu-jupyter | Adds Jupyter Notebook server to GPU image |
| serving | TorchServe* |

MLPerf Optimized Workloads

The following images are available for MLPerf-optimized workloads. Instructions are available at 'Get Started with Intel MLPerf'.

| Tag(s) | Base OS | MLPerf Round | Target Platform |
| --- | --- | --- | --- |
| mlperf-inference-4.1-resnet50 | rockylinux:8.7 | Inference v4.1 | Intel(R) Xeon(R) Platinum 8592+ |
| mlperf-inference-4.1-retinanet | ubuntu:22.04 | Inference v4.1 | Intel(R) Xeon(R) Platinum 8592+ |
| mlperf-inference-4.1-gptj | ubuntu:22.04 | Inference v4.1 | Intel(R) Xeon(R) Platinum 8592+ |
| mlperf-inference-4.1-bert | ubuntu:22.04 | Inference v4.1 | Intel(R) Xeon(R) Platinum 8592+ |
| mlperf-inference-4.1-dlrmv2 | rockylinux:8.7 | Inference v4.1 | Intel(R) Xeon(R) Platinum 8592+ |
| mlperf-inference-4.1-3dunet | ubuntu:22.04 | Inference v4.1 | Intel(R) Xeon(R) Platinum 8592+ |

License

View the License for the Intel® Extension for PyTorch*.

The images below also contain other software which may be under other licenses (such as PyTorch*, Jupyter*, Bash, etc. from the base).

It is the image user's responsibility to ensure that any use of the images below complies with any relevant licenses for all software contained within.

* Other names and brands may be claimed as the property of others.

Docker Pull Command

docker pull intel/intel-extension-for-pytorch