Message boards : Number crunching : Making Python GPU tasks to succeed - User side

Profile ServicEnginIC
Joined: 24 Sep 10
Posts: 566
Credit: 5,945,502,024
RAC: 10,648,503
Message 58312 - Posted: 22 Jan 2022 | 18:03:07 UTC

Making Python GPU tasks to succeed - User side

Currently, abouh, the developer of the Python apps for GPU hosts, is trying to get the app to succeed on as many hosts as possible.
The official thread is: Experimental Python tasks (beta) - task description

Here, I'm sharing what I've done on the "User side" to get these tasks to succeed consistently on all my requirements-complying Ubuntu Linux hosts.
Basically, I'm gathering in one post, along with my own experience, advice scattered across several threads, coming from developer abouh and from several other kind users such as Richard Haselgrove, Keith Myers, Ian&Steve C., mmonnin...

The first thing you need to know is that this app is currently available for Linux hosts only.
You can follow the release of any new versions on the Gpugrid Apps page, in the section Python apps for GPU hosts.
If you are running any operating system other than a Linux distribution, you can stop reading. Or you can take the opportunity to explore the Linux world by installing it on one of your hosts before continuing ;-)

Also, the app is still under development and debugging. Therefore, to receive tasks, you need to have selected both Run test applications and the Python Runtime (GPU, beta) app itself in your Gpugrid preferences page, for your host's venue.

Next, the system hardware requirements have to be kept in mind.
I've found the following to be limiting:

-1) System RAM must be at least 16 GB for each ABOU Python GPU task in execution.
If your system has only one GPU, then 16 GB of system RAM should be enough.
If the system has two GPUs installed and two ABOU Python GPU tasks are running concurrently, then the minimum system RAM should be 32 GB... And so on.
If you are thinking about upgrading your system RAM to meet these requirements (as I had to do for my hosts), I published a how-to in the post Upgrading system RAM, It's easy

-2) A graphics card based on an Nvidia GPU with at least 4 GB of dedicated RAM is needed.
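
As a rough way to compare a host against the two figures above, something like the following could be used. It is only a sketch: it assumes a Linux host exposing /proc/meminfo and, for the GPU part, an installed Nvidia driver providing nvidia-smi. The helper names and the rounding-up rule are illustrative assumptions, not part of the app.

```shell
#!/bin/sh
# Sketch only: helper names and the rounding rule are illustrative assumptions.

# Convert a MemTotal value in kB (as shown in /proc/meminfo) to whole GiB,
# rounding up, because the kernel reports slightly less than the installed RAM.
installed_ram_gib() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# RAM needed for N concurrent Python GPU tasks, at 16 GB per task.
required_ram_gib() {
    echo $(( $1 * 16 ))
}

if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "System RAM: about $(installed_ram_gib "$total_kb") GiB"
fi

# List each GPU with its total dedicated memory, if the Nvidia driver is present.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
fi
```

For a host with two GPUs, required_ram_gib 2 gives the 32 GB figure mentioned above.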

----------------

And now, in sequence, I'll describe each of the actions that I've taken so far.

-1) Editing boinc-client.service

I executed the command:

sudo systemctl edit boinc-client.service

And I edited the file so that the [Service] section looks as follows:

[Service]
PrivateTmp=true

Don't forget to save the changes before exiting.

Before doing this, all my tasks were failing with the following error:

INTERNAL ERROR: cannot create temporary directory!

This is a very frequent error that I'm seeing in currently failing tasks from other users.
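
For hosts where an interactive editor is inconvenient, the same override could probably be written as a drop-in file instead. The following is a minimal sketch under that assumption: write_override is a hypothetical helper, and the directory named in the comments is where "systemctl edit" is expected to store its drop-in for this unit.

```shell
#!/bin/sh
# Sketch only: write_override is a hypothetical helper, not a BOINC or systemd tool.

# Write a drop-in file containing the [Service] override into the given directory.
write_override() {
    mkdir -p "$1" &&
    printf '[Service]\nPrivateTmp=true\n' > "$1/override.conf"
}

# Applying it for real has to be done as root, for example:
#   write_override /etc/systemd/system/boinc-client.service.d
#   systemctl daemon-reload
#   systemctl restart boinc-client.service
#
# To check the value the unit actually uses afterwards:
#   systemctl show boinc-client.service -p PrivateTmp
```

Trying the helper on a scratch directory first shows exactly what would be written, without touching the system configuration.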


-2) Updating older Python versions to the required one.
Although the app is supposed to install its own required Python environment, I found that older Python versions were likely conflicting and causing my tasks to fail.

I executed the following commands:

sudo apt install python-is-python3
sudo apt install python3-pip


This worked for all my hosts. If it doesn't work for yours, user mmonnin published an alternative way in his Message #57840
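
To confirm that the plain "python" command now resolves to Python 3 after installing these packages, a check along these lines may help. It is a sketch only; is_python3 is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: is_python3 is a hypothetical helper.

# Succeed if a "python --version" style string reports major version 3.
is_python3() {
    case "$1" in
        "Python 3."*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v python >/dev/null 2>&1; then
    # Python 2 printed its version to stderr, so capture both streams.
    version=$(python --version 2>&1)
    if is_python3 "$version"; then
        echo "OK: python is $version"
    else
        echo "Warning: python is $version"
    fi
else
    echo "No 'python' command found"
fi

# pip should be available for Python 3 as well.
if command -v pip3 >/dev/null 2>&1; then
    pip3 --version
fi
```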


-3) Installing cmake

I executed the following command:

sudo apt install cmake
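
Once steps 2 and 3 are done, a quick look at whether the installed tools can actually be found on the PATH might save a failed task. This is a sketch under the assumption that the packages above provide the commands python, pip3 and cmake; missing_tools is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: missing_tools is a hypothetical helper.

# Print the names of the given commands that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools python pip3 cmake)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "python, pip3 and cmake are all available"
fi
```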


-4) Finishing steps:

To uninstall packages that were no longer needed, I executed:

sudo apt autoremove


I also opened BOINC Manager, selected the GPUGRID project, and reset it.

And finally, I rebooted.
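
For hosts without a desktop, the finishing steps could presumably be done from a terminal as well, using boinccmd (BOINC's command-line client) for the project reset. The sketch below is untested on the hosts described here. The helpers run and finish_steps are hypothetical, and the project URL is an assumption: use the URL your BOINC client shows for GPUGRID.

```shell
#!/bin/sh
# Sketch only: run and finish_steps are hypothetical helpers, and the project
# URL is an assumption; use the URL your BOINC client shows for GPUGRID.
PROJECT_URL="${PROJECT_URL:-https://www.gpugrid.net/}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

finish_steps() {
    run sudo apt autoremove
    # boinccmd may need to be run from the BOINC data directory (or be given
    # the GUI RPC password) before it is allowed to talk to the client.
    run boinccmd --project "$PROJECT_URL" reset
    run sudo reboot
}
```

Calling DRY_RUN=1 finish_steps only prints the three commands, which is a safe way to review them before running finish_steps for real.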

----------------

That's all for now.
Many of my currently successful tasks had previously failed for one or more other users. I hope that all of the above helps some of them start succeeding...

Please feel free to give your feedback on whether this has helped or not, or to share additional helpful steps.

Profile ServicEnginIC
Message 58313 - Posted: 23 Jan 2022 | 20:07:42 UTC

Today I've brought two more of my hosts to running Python GPU tasks, using the same method described previously.
To make them compliant, I upgraded system RAM from the previous 8 GB to 16 GB on both hosts.

However, they are likely borderline in hardware requirements:

- Test Host #540272 is based on a GTX 1060 GPU with 3 GB GDDR, below my previously validated minimum size of 4 GB.

- Host #482132 is based on a GTX 1050 Ti GPU with 4 GB GDDR, but its CPU is a quite old Intel Core 2 Quad Q9550S @ 2.83 GHz, socket 775, DDR3 RAM.

For the moment, both hosts have been processing their first tasks for more than two hours. Time will tell whether they're successful or not... ⏳️

kksplace
Joined: 4 Mar 18
Posts: 53
Credit: 1,401,826,749
RAC: 3,554,086
Message 58314 - Posted: 24 Jan 2022 | 1:03:01 UTC - in response to Message 58312.

First of all, thank you for consolidating this information. It is very helpful to have this all in one place instead of following through several threads and piecing it together.

Next, just for my case, I did not have to do all the steps you did to get my host to work. I am working from Linux Mint, which I keep up to date with the latest version. Everything worked on my system from the start for these WUs. Just another data point...

And I agree with the 16 GB RAM requirement. It might also be useful to note here that these WUs also use multiple CPU cores/threads. As noted in the other discussion, they seem to use whatever is available.

Profile ServicEnginIC
Message 58315 - Posted: 24 Jan 2022 | 22:59:42 UTC - in response to Message 58313.

- Test Host #540272 is based on a GTX 1060 GPU with 3 GB GDDR, below my previously validated minimum size of 4 GB.

- Host #482132 is based on a GTX 1050 Ti GPU with 4 GB GDDR, but its CPU is a quite old Intel Core 2 Quad Q9550S @ 2.83 GHz, socket 775, DDR3 RAM.

Finally, both hosts finished their respective tasks successfully.

Test Host #540272 took about 13.5 hours to process its first e9a17-ABOU_rnd_ppod_baseline_cnn_nophi_2-0-1-RND3764_0 task.
Several nvidia-smi screenshots showed that the GDDR in use was higher than 2 GB but lower than 3 GB.
Therefore, for the current test tasks, a minimum of 3 GB of dedicated GPU RAM should be enough.
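
Instead of taking screenshots, the peak GPU memory use could also be sampled from a terminal. The following is a minimal sketch assuming the Nvidia driver's nvidia-smi tool is installed; peak_mib is a hypothetical helper, and the one-minute interval is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: peak_mib is a hypothetical helper.

# Read one number per line (MiB) from standard input and print the largest.
peak_mib() {
    awk 'NR == 1 || $1 + 0 > max { max = $1 + 0 } END { if (NR > 0) print max }'
}

# Example: sample the used GPU memory once a minute for ten minutes and
# report the peak. On a multi-GPU host nvidia-smi prints one line per GPU,
# so the result is the peak across all of them.
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#       nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits
#       sleep 60
#   done | peak_mib
```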

On the other hand, Host #482132 took about 21 hours to successfully finish its e9a22-ABOU_rnd_ppod_baseline_cnn_nophi_2-0-1-RND2295_5 task.
It is the slowest of my current hosts, but fast enough to get the full bonus for returning its task before 24 hours had passed.
Also worth noting: the mentioned task had previously failed on 5 other hosts before finally succeeding on mine... Slow but sure ;-)
