Running oneAPI C++ with SYCL code on Intel Arc and Iris Xe GPUs
oneAPI, featuring C++ with SYCL, enables the same accelerator code to run on a variety of GPU and CPU architectures. As promised in my last post about heterogeneous computing, this time we are going to use the new Intel® Arc™ GPU and oneAPI to see how SYCL plays with Intel Arc.
The Intel Arc GPU we will be using comes in the form of an HP Envy 16″ laptop. It is powered by an Intel i7-12700H with 32GB of memory and has an Intel Arc A370M running driver version 18.104.22.1683. As usual, I purchased this hardware myself (that is, the laptop was not provided by Intel), and everything in this post is something you can do yourself.
Random note, I also purchased the Asus Flipbook with an Intel Arc GPU, 16GB of memory and a beautiful OLED screen! I highly recommend that people check that out as well, as the screen is just amazing.
Since this is a brand new laptop, today we are going to test setting up a purely Windows development environment instead of the WSL2+Ubuntu environment we used previously. We need to install some basic tools: Git, Microsoft Visual Studio 2022 Community Edition and the Intel oneAPI Base Toolkit. I covered the installation of these before, so I won’t rehash it here. After installing these tools, we are ready to start building and running some code on the Arc laptop.
Building the code
For our test today, we will be using the mandelbrot code from the oneAPI-samples repository. We launch an x64 Native Tools Command Prompt for VS2022 and simply run:
> git clone https://github.com/oneapi-src/oneAPI-samples
> cd oneAPI-samples\DirectProgramming\DPC++\CombinationalLogic\mandelbrot
This puts us in the right place to compile and run our sample. Before we do that though, we are going to tweak some of the parameters in the src/mandel.hpp file to make the image produced by the sample a little larger and the sample more computationally intensive. We do this by changing the row_size and col_size variables from 512 to 2048 and setting the max_iterations variable to 10000.
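To get a feel for why these parameter changes make the sample heavier, here is a minimal plain C++ sketch of the escape-time loop that a Mandelbrot renderer runs per pixel. It is not the code from src/mandel.hpp: the function names escape_iterations and worst_case_iterations are made up for illustration, and only row_size, col_size and max_iterations come from the sample.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>

// Hypothetical helper, not taken from src/mandel.hpp: counts how many
// iterations of z = z*z + c run before |z| exceeds 2 or the cap is hit.
int escape_iterations(std::complex<double> c, int max_iterations) {
    assert(max_iterations >= 0);
    std::complex<double> z{0.0, 0.0};
    int count = 0;
    // std::norm returns |z|^2, so comparing against 4 avoids a square root.
    while (count < max_iterations && std::norm(z) <= 4.0) {
        z = z * z + c;
        ++count;
    }
    return count;
}

// Hypothetical helper: upper bound on inner-loop iterations for a whole
// image, since a pixel that never escapes costs max_iterations steps.
std::int64_t worst_case_iterations(int row_size, int col_size,
                                   int max_iterations) {
    return static_cast<std::int64_t>(row_size) * col_size * max_iterations;
}
```

Going from 512 to 2048 in both dimensions gives 16 times as many pixels, and with max_iterations set to 10000 the worst case is roughly 4.2 × 10^10 inner-loop iterations, which is plenty of independent work to hand to a GPU.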
After these small changes, we initialize our oneAPI environment and compile the sample code:
> "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
> MSBuild mandelbrot.sln /t:Rebuild /p:Configuration="Release"
The resulting binary is written to x64/Release/mandelbrot.exe. Now on to the fun part…
Running on Intel Arc (and other hardware)
Of course, the first thing to do is just run the binary and see what happens:
Woo hoo! As we can see, the binary selected and ran on our Intel Arc A370M GPU and produced the image one would expect:
While this is super cool for those excited by Intel Arc as a product, what is also interesting is to see if it will run properly on any other hardware on this laptop. By running the sycl-ls command, which is included in the oneAPI Base Toolkit, we can see all the available hardware on this system that can run SYCL code:
To run the binary on different devices, we simply set the SYCL_DEVICE_FILTER environment variable using this command:
> set SYCL_DEVICE_FILTER=opencl:cpu:1
where the environment variable assignment matches the value inside the first pair of square brackets in a line of the sycl-ls output. The cool thing is there are three interesting targets in the sycl-ls output: the Alder Lake i7-12700H, the Intel® Iris® Xe integrated GPU and the Intel Arc discrete GPU.
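For anyone curious how a value like opencl:cpu:1 breaks down, the three colon-separated fields are the backend, the device type and the device number. The sketch below splits such a string in plain C++ (C++17). It is only an illustration: DeviceFilter and parse_device_filter are hypothetical names, the SYCL runtime does its own parsing, and this version only handles the full three-field form used in this post.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical struct for illustration only.
struct DeviceFilter {
    std::string backend;      // e.g. "opencl"
    std::string device_type;  // e.g. "cpu" or "gpu"
    int device_num;           // the index shown by sycl-ls
};

// Splits "backend:device_type:device_num" and returns std::nullopt for
// anything that does not have exactly three non-empty fields with a
// short, digits-only device number.
std::optional<DeviceFilter> parse_device_filter(const std::string& text) {
    std::vector<std::string> parts;
    std::stringstream stream(text);
    std::string field;
    while (std::getline(stream, field, ':')) {
        parts.push_back(field);
    }
    if (parts.size() != 3) return std::nullopt;
    for (const auto& part : parts) {
        if (part.empty()) return std::nullopt;
    }
    const std::string& num = parts[2];
    const bool digits_only = std::all_of(num.begin(), num.end(), [](unsigned char ch) {
        return std::isdigit(ch) != 0;
    });
    if (!digits_only || num.size() > 9) return std::nullopt;
    return DeviceFilter{parts[0], parts[1], std::stoi(num)};
}
```

Calling parse_device_filter("opencl:cpu:1") yields backend "opencl", device type "cpu" and device number 1, while a string with a missing or non-numeric field yields an empty optional.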
Just to show how easy it is to test running against multiple hardware targets, we can quickly use the environment variable to run the sample code on all three of these targets:
Can we break it?
Just for fun, let’s see what would happen if we disabled the GPUs on the laptop. Will our SYCL binary be smart enough to use the CPU?
We will be using Remote Desktop to connect to the laptop, which should allow us to disable the GPUs while still using the remote connection to interact with the laptop. After connecting with Remote Desktop, we go to the Device Manager and disable both GPUs.
The remote desktop connection is still up, so we just go back to our command prompt, unset the SYCL_DEVICE_FILTER environment variable and run the sample:
The Sanity Check — SYCL on Another Architecture
Of course, only running on Intel hardware does not really show us the full value of SYCL. Here is the same SYCL sample running on my Alienware R13 with an i9-12900KF and an NVIDIA GeForce RTX 3080 Ti card:
This sample was compiled using the Codeplay oneAPI for CUDA compiler just like in my previous post.
I should point out that the HP Envy is running the sample natively in Windows, while the Alienware system with the NVIDIA GPU is running it in the WSL2 environment using a different compiler. It does not make sense to compare the timings of the runs because that would be an apples to oranges comparison. The real takeaway is that the same C++ with SYCL code is running on a variety of hardware, and even on a variety of OSes, with no code changes.
What should we take away from this? First, Intel Arc-based systems are out there, and we can pick them up and play games, create content and, yes, even do some GPGPU computing on them!
Just as importantly, for those of us who need to do accelerator-based programming, oneAPI featuring C++ with SYCL can help reduce the amount of churn in our codebases and improve maintainability by allowing us to write software that runs on an ever-growing variety of accelerator hardware. Life is hard enough as a developer, so injecting a little simplicity into the code we need to write and maintain is a worthy thing to strive for…
Want to Connect?
If you want to see what random tech news I’m reading, you can follow me on Twitter.
Tony is a Software Architect and Technical Evangelist at Intel. He has worked on several software developer tools and most recently led the software engineering team that built the data center platform which enabled Habana’s scalable MLPerf solution.
Intel, the Intel logo and other Intel marks are trademarks of Intel Corporation or its subsidiaries. SYCL is a trademark of the Khronos® Group. Other names and brands may be claimed as the property of others.