Cosmology continues to stop after 72% ??

Ian Baker ID: 6761996 Posts: 10
14 Oct 2020 11:20 AM

"Hypervisor failed to enter an online state in a timely fashion" ??

Should I remove this task, since it may be wasting CPU time that could be used on other projects?

I've had about 30 tasks pause (or crash?) at this point.

Tristan Olive ID: 22 Posts: 283
14 Oct 2020 03:35 PM

It should be ok to let it go. Research projects sometimes need to make some adjustments to their jobs if they are seeing errors. If this particular one does not manage to finish successfully, it will expire and report what the trouble was. In the meantime, it should not tie up your CPU much, but should roll over to a job that is not having problems.

If that isn't true, and this particular job is showing as running for a long time even though it isn't doing anything, you can remove it to expedite the process of it being reported back as problematic. The general Charity Engine approach is that everything is automated and you don't need to intervene, but you can certainly do so if you think that is for the best.

Ian Baker ID: 6761996 Posts: 10
14 Oct 2020 04:46 PM

Thanks Tristan

The reason I intervened was that the jobs stopped and also stopped any others from running (CPU at 0%). I had a long list of tasks 'waiting to run' with no work being done, and 20+ jobs at about 70% complete with errors. I reset the project, which deleted them all, and then the other projects carried on and got to work. Only Rosetta and WCG are stable. LHC and Cosmology both seem to do the same thing. For the time being I have set Cosmology to not receive any more tasks, as they just stop the machine from doing any work.

Maybe my setup does not like the tasks? I'll see in a week whether they can complete on this machine. I think it may be my end, because they are running OK on my smaller, slower machine; however, LHC won't work on either.

Tristan Olive ID: 22 Posts: 283
15 Oct 2020 03:30 AM

I'd wonder if it's an issue with VirtualBox on your system. Both Cosmology and LHC make use of that. You could try removing VirtualBox, if you don't otherwise use it, and that should stop those projects from sending VirtualBox jobs that don't work.
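
Before removing anything, it may help to confirm whether VirtualBox responds at all on the affected machine. The following is only a rough sketch, assuming Python is available and that VirtualBox's VBoxManage command-line tool is on the PATH (on some systems it is installed but not on the PATH, in which case this reports it as missing). The function name virtualbox_version is made up for illustration and is not part of BOINC or Charity Engine.

```python
import shutil
import subprocess
from typing import Optional


def virtualbox_version(executable: str = "VBoxManage") -> Optional[str]:
    """Return the version string printed by VBoxManage, or None if unavailable.

    Hypothetical helper for illustration only. None means the tool was not
    found on the PATH, exited with an error, or did not answer in time.
    """
    path = shutil.which(executable)
    if path is None:
        # Not on the PATH; VirtualBox may still be installed elsewhere.
        return None
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing binary, a non-zero exit code, and a timeout.
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = virtualbox_version()
    if version is None:
        print("VBoxManage not found or not responding")
    else:
        print(f"VirtualBox version: {version}")
```

If this reports a version but the VirtualBox jobs still stall, the problem is more likely in how the virtual machines start than in VirtualBox being absent.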