Submitting a job to an SGE 8.1.9 cluster from an application called Galaxy logged the following error:
galaxy.jobs.runners.drmaa WARNING 2019-02-13 13:20:43,111 (427) drmaa.Session.runJob() failed, will retry: code 17: MUNGE authentication failed: Invalid credential format
I have verified that UIDs and GIDs match between the host and the cluster, and that the permissions on the munge directories and files match the install docs. I also verified that munge.key is identical on the host and across the cluster.
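For anyone wanting to repeat those checks, here is a rough sketch of how the key and account comparison could be scripted. It is not the exact procedure used above. The helper names (`file_sha256`, `account_ids`, `report`) are made up for illustration, and the key path and account name are passed on the command line because they depend on the local install (`/etc/munge/munge.key` and `munge` are common defaults, but that is an assumption here). Reading the key normally requires root or the munge account.

```python
from __future__ import print_function

import hashlib
import pwd
import socket
import sys


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so the whole file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def account_ids(username):
    """Return (uid, gid) for `username` as resolved on this host."""
    entry = pwd.getpwnam(username)
    return entry.pw_uid, entry.pw_gid


def report(key_path, username):
    """Build one line describing this host's view of the key and account."""
    uid, gid = account_ids(username)
    return "%s key_sha256=%s user=%s uid=%d gid=%d" % (
        socket.gethostname(), file_sha256(key_path), username, uid, gid)


if __name__ == "__main__":
    # Hypothetical usage: python check_munge.py /etc/munge/munge.key munge
    if len(sys.argv) == 3:
        print(report(sys.argv[1], sys.argv[2]))
    else:
        print("usage: %s KEY_PATH USERNAME" % sys.argv[0])
```

Running it on the submit host and on each cluster node and comparing the printed lines shows whether the digest, UID, and GID agree everywhere. It only compares files and accounts; it does not exercise munged itself.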
I used the following Python script outside of Galaxy to submit a job to the cluster, and it succeeded:
import drmaa
from multiprocessing.pool import ThreadPool
import tempfile
import os
import stat

session = drmaa.Session()
session.initialize()

def main():
    smt = "ls . > test.out"
    script_file = tempfile.NamedTemporaryFile(mode="w", dir=os.getcwd(), delete=False)
    script_file.write(smt)
    script_file.close()
    print "Job is in file %s" % script_file.name
    os.chmod(script_file.name, stat.S_IRWXG | stat.S_IRWXU)
    jt = session.createJobTemplate()
    print "jt created"
    jt.jobEnvironment = {'BASH_ENV': '~/.bashrc'}
    print "environment set"
    jt.remoteCommand = os.path.join(os.getcwd(), script_file.name)
    print "remote command set"
    jobid = session.runJob(jt)
    print "Job submitted with id: %s, waiting ..." % jobid
    retval = session.wait(jobid, drmaa.Session.TIMEOUT_WAIT_FOREVER)

if __name__ == '__main__':
    main()
When I run the same job submission from a thread (using multiprocessing's ThreadPool), I get an error. The script and its output are:
import drmaa
from multiprocessing.pool import ThreadPool
import tempfile
import os
import stat

pool = ThreadPool(1)
session = drmaa.Session()
session.initialize()

def pTask(n):
    smt = "ls . > test.out"
    script_file = tempfile.NamedTemporaryFile(mode="w", dir=os.getcwd(), delete=False)
    script_file.write(smt)
    script_file.close()
    print "Job is in file %s" % script_file.name
    os.chmod(script_file.name, stat.S_IRWXG | stat.S_IRWXU)
    jt = session.createJobTemplate()
    print "jt created"
    jt.jobEnvironment = {'BASH_ENV': '~/.bashrc'}
    print "environment set"
    jt.remoteCommand = os.path.join(os.getcwd(), script_file.name)
    print "remote command set"
    jobid = session.runJob(jt)
    print "Job submitted with id: %s, waiting ..." % jobid
    retval = session.wait(jobid, drmaa.Session.TIMEOUT_WAIT_FOREVER)

pool.map(pTask, (1,))
Result is:
Job is in file /home/svc-clingalprod/tmpu3A6Rk
jt created
environment set
error: getting configuration: MUNGE authentication failed: Invalid credential format
remote command set
Traceback (most recent call last):
  File "remote_mthread.py", line 29, in <module>
    pool.map(pTask, (1,))
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 250, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
drmaa.errors.DeniedByDrmException: code 17: MUNGE authentication failed: Invalid credential format
Where do I go from here to isolate the cause of the "Invalid credential format" error?