1) Message boards : Graphics cards (GPUs) : Compute error - 195 (0xc3) EXIT_CHILD_FAILED (Message 55256)
Posted 14 days ago by Profile Retvari Zoltan
A few WUs errored out almost immediately, stating that memory leaks were detected.
That's a false alarm. Even successful tasks contain that message, for example: http://www.gpugrid.net/result.php?resultid=28081659
The specs of my system are ... modest overclock on the GPU. Could this error also be connected to my OC?
Yes.
Should I revert back to the stock settings?
I would start with that. Different workunits tolerate different amounts of overclocking.
If this card does not have its own PCIe power connector, then it draws all of its power from the motherboard; in that case overclocking is not recommended.
Check the following too:
What is its operating temperature?
Are its fans rotating ok?
Is its heatsink clean?
Is the thermal interface material ok between the GPU chip and the heatsink?
Are the 12V pins (yellow cables) on the 24-pin MB power connector ok?
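To keep an eye on these readings without opening the case, the driver can be queried directly. Below is a minimal sketch (Python calling nvidia-smi); the polling interval and the query fields are just examples, and it assumes the NVIDIA driver and nvidia-smi are installed:

# Sketch: poll GPU temperature, fan speed, power draw and SM clock via nvidia-smi.
# Assumes nvidia-smi is installed and on PATH; adjust the interval as needed.
import subprocess
import time

QUERY = "temperature.gpu,fan.speed,power.draw,clocks.sm"

while True:
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + QUERY, "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # One line per GPU, e.g. "71, 62 %, 148.32 W, 1875 MHz"
    for idx, line in enumerate(out.splitlines()):
        print(f"GPU {idx}: {line}")
    time.sleep(10)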
2) Message boards : Graphics cards (GPUs) : Ampere 10496 & 8704 & 5888 fp32 cores! (Message 55238)
Posted 17 days ago by Profile Retvari Zoltan
I expect that the number of CUDA cores that can be used for crunching will be only half of what is stated in the title of this thread.
Similar to the GF116 architecture, where only 2/3 of the cores could be used for crunching (due to the 4/6 dispatch unit to CUDA core ratio):


I think the relative performance in computing compared to the RTX 2080Ti will be the following:
card          cores   usable cores   performance
RTX 2080 Ti    4352       4352         100.0%
RTX 3090      10496       5248         120.6%
RTX 3080       8704       4352         100.0%
RTX 3070       5888       2944          67.6%
Perhaps a bit (say 10%) more, taking other factors into consideration.
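For reference, a quick back-of-the-envelope sketch of how the table above works out, assuming only half of the marketed Ampere FP32 cores are usable for crunching and taking the RTX 2080 Ti as the 100% baseline:

# Sketch: relative crunching performance, assuming only half of the marketed
# Ampere FP32 cores can be used (the Turing baseline counts in full).
BASELINE_NAME, BASELINE_CORES = "RTX 2080 Ti", 4352

ampere_cards = {
    "RTX 3090": 10496,
    "RTX 3080": 8704,
    "RTX 3070": 5888,
}

print(f"{BASELINE_NAME:<12} {BASELINE_CORES:>6}  usable: {BASELINE_CORES:>5}  100.0%")
for name, marketed in ampere_cards.items():
    usable = marketed // 2                 # assumption: half the cores usable
    rel = 100.0 * usable / BASELINE_CORES
    print(f"{name:<12} {marketed:>6}  usable: {usable:>5}  {rel:5.1f}%")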
3) Message boards : Number crunching : The hardware enthusiast's corner (Message 55182)
Posted 27 days ago by Profile Retvari Zoltan
The TGC has solidified on another GPU (an RTX 2080 Ti this time) in one of my hosts.
It was completely gone from the silicon itself; the markings of the chip left their mirrored print on the heatsink.
I suspect that I applied too little TGC on these GPUs, for fear of spilling it onto the PCB around the GPU chip (which is full of SMD capacitors).
It's much easier to apply the TGC the second time, as it makes the solidified part liquid again, or at least the fresh TGC spreads on it very well without cleaning the surface. If the TGC reacts with the copper of the heatsink (as I suspect), leaving the "used" TGC on its surface may prevent further reaction between the two materials. I'll see, and report back.
4) Message boards : Number crunching : No tasks to send (Message 55181)
Posted 28 days ago by Profile Retvari Zoltan
What else is everyone working on in the meantime?

https://foldingathome.org/
5) Message boards : Number crunching : The hardware enthusiast's corner (Message 55052)
Posted 98 days ago by Profile Retvari Zoltan
This is very strange. I didn't experience such a change in the liquidity of the Conductonaut...

I guess the tested heatsink's core is not made of pure copper, but of some kind of alloy that is not compatible with Conductonaut.
You are probably right.
My Gigabyte AORUS GTX 1080 Ti showed the same symptoms (its GPU temperature rose to 90°C). First I cleaned its fins, but there was no change in GPU temperature, so I reduced its power target to 150W until I could remove the card again for disassembly. After I did, I noticed that the TGC had solidified and was completely gone from the silicon of the GPU chip. So I re-applied some TGC on both surfaces and reassembled the card. Now it's running fine again (71°C). I regularly check the temperatures of my GPUs, so I'm sure that this change in the physical state of the TGC was quite sudden.
However, I have an RTX 2080 Ti with a copper heatsink, and it's running fine. Other cards with nickel(?) heatsinks and TGC are also running fine. I keep an eye on them; if another one shows higher temperatures, I'll disassemble that card too.
6) Message boards : Server and website : Server only allows one connection at a time from an IP? 30s cooldown is too short. (Message 54972)
Posted 116 days ago by Profile Retvari Zoltan
It didn't help to abort the stalled downloads, or even the whole task - it was STILL complaining about those downloads!!
That's a different problem. These tasks were created before the http->https transition, so they still want to download through http, but that won't succeed. You have to abort the downloads, then restart the BOINC manager, or manually edit the client_state.xml file (see the "Warning: bad tasks re-appearing in the download queue" thread for details).
7) Message boards : Server and website : Warning: bad tasks re-appearing in the download queue (Message 54971)
Posted 116 days ago by Profile Retvari Zoltan
Wait until GPUGrid is idle (all previous tasks reported)
Stop BOINC
Edit client_state.xml
Change https:// to http:// for the affected download files only
Save file
Restart BOINC

but be very careful when editing that file - use text mode only.
I'm not sure that the successor task generated by a "repaired" task would come through https, so we may have to do this 10 times.
(I've aborted the transfers of these tasks, then restarted the BOINC manager, as it thinks that "some downloads are stalled" - it is referring to the ones I've aborted.)
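For anyone who prefers not to hand-edit the file, here is a rough sketch of the same edit done with a small script. The file path and the marker string used to pick out the affected downloads are assumptions - adjust them to your own installation, stop BOINC first, and keep a backup:

# Sketch: change https:// to http:// only for the affected download URLs
# in client_state.xml. Stop the BOINC client before running this.
from pathlib import Path

STATE_FILE = Path(r"C:\ProgramData\BOINC\client_state.xml")  # assumed default location
AFFECTED = "TONI_MDAD"   # assumed marker identifying the stuck workunits

text = STATE_FILE.read_text(encoding="utf-8")
STATE_FILE.with_suffix(".xml.bak").write_text(text, encoding="utf-8")  # backup

fixed_lines = []
for line in text.splitlines(keepends=True):
    # Assumption: the download <url> lines contain the workunit name.
    if "<url>https://" in line and AFFECTED in line:
        line = line.replace("https://", "http://")
    fixed_lines.append(line)

STATE_FILE.write_text("".join(fixed_lines), encoding="utf-8")
print("done - restart BOINC")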
8) Message boards : Server and website : Server only allows one connection at a time from an IP? 30s cooldown is too short. (Message 54928)
Posted 118 days ago by Profile Retvari Zoltan
Are you positive it's not the upload bandwidth being saturated?
I'm sure it's not. I have this problem on my symmetrical 1 Gbps fiber optic internet connection. Earlier I had ADSL (over an old copper phone line) with 50 Mbps download / 15 Mbps upload bandwidth, and it had the same problem.
9) Message boards : Server and website : Server only allows one connection at a time from an IP? 30s cooldown is too short. (Message 54912)
Posted 119 days ago by Profile Retvari Zoltan
Should this be set to 0 or 1???
<report_results_immediately>1</report_results_immediately>
It should be set to 1 for GPUGrid, but this property is already set by the GPUGrid project in the tasks themselves (so this option has no additional effect on GPUGrid tasks, since it's set by the project).
Look for <report_immediately/> in the client_state.xml and you'll find similar records:
<result>
    <name>2c1dB00_379_1-TONI_MDADex2sc-0-50-RND9291_0</name>
    <final_cpu_time>0.000000</final_cpu_time>
    <final_elapsed_time>0.000000</final_elapsed_time>
    <exit_status>0</exit_status>
    <state>2</state>
    <platform>windows_x86_64</platform>
    <version_num>210</version_num>
    <plan_class>cuda101</plan_class>
    <report_immediately/>
    <wu_name>2c1dB00_379_1-TONI_MDADex2sc-0-50-RND9291</wu_name>
    <report_deadline>1590700376.000000</report_deadline>
    <received_time>1590268377.505087</received_time>
    ...
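If you'd rather not scroll through the file by hand, here is a small sketch that scans client_state.xml and shows which queued results carry that flag (the file path is an assumption for a default Windows install):

# Sketch: scan client_state.xml and print which queued results carry the
# <report_immediately/> flag set by the project.
STATE_FILE = r"C:\ProgramData\BOINC\client_state.xml"   # assumed default location

with open(STATE_FILE, encoding="utf-8") as f:
    in_result, name, flagged = False, None, False
    for raw in f:
        line = raw.strip()
        if line == "<result>":
            in_result, name, flagged = True, None, False
        elif in_result and line.startswith("<name>"):
            name = line[len("<name>"):-len("</name>")]
        elif in_result and line == "<report_immediately/>":
            flagged = True
        elif in_result and line == "</result>":
            print(f"{name}: report_immediately={'yes' if flagged else 'no'}")
            in_result = False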
10) Message boards : News : More tasks: MDAD* (Message 54837)
Posted 121 days ago by Profile Retvari Zoltan
There are some bad workunits in the new batch.
EXCEPTIONAL CONDITION: src\mdio\bincoord.c, line 193: "nelems != 1"
https://www.gpugrid.net/workunit.php?wuid=20162190
https://www.gpugrid.net/workunit.php?wuid=20162534
https://www.gpugrid.net/workunit.php?wuid=20009439
https://www.gpugrid.net/workunit.php?wuid=20009346
https://www.gpugrid.net/workunit.php?wuid=20009664
and
ERROR: src\mdsim\trajectory.cpp line 135: Simulation box has to be rectangular!
https://www.gpugrid.net/workunit.php?wuid=20009564

