Our team uses Python to execute Hive queries. However, a heavy query always blocks the other, lightweight queries, which then have to wait more than an hour.
Is it possible to set the priority or vcpu resources for an individual connection?
Is setting the "yarn.nodemanager.resource.cpu-vcores" or "mapred.job.priority" in the configuration a solution?
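from pyhive import hive  # assuming the PyHive client, which provides hive.connect

# Session-level settings passed when the connection is opened
configuration = {
    "mapred.job.priority": 'LOW',
    "yarn.nodemanager.resource.cpu-vcores": 2
}
# configuration={}
con = hive.connect(ip, port=10000, auth=auth, kerberos_service_name='hive', database=db_name, configuration=configuration)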
If yes, how can I fix the "It is not in list of params that are allowed to be modified at runtime" error?
Thanks
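For reference, here is a minimal sketch of how the per-connection settings could be assembled, under several assumptions. "yarn.nodemanager.resource.cpu-vcores" is a NodeManager-level setting rather than a per-job one, so the sketch swaps in the job-level "mapreduce.map.cpu.vcores", and it uses "mapreduce.job.priority", the current name of the deprecated "mapred.job.priority". The helper build_session_config and the queue name "batch" are hypothetical and not part of any library. If the server uses SQL standard authorization, the "not in list of params" error is typically governed by the server-side whitelist (hive.security.authorization.sqlstd.confwhitelist, which an administrator can extend through hive.security.authorization.sqlstd.confwhitelist.append), so client code alone may not be enough.

```python
# Hypothetical helper: builds job-level settings for one Hive session.
# Nothing here is part of PyHive; all names are illustrative.

# Priority names accepted by MapReduce job priority settings.
VALID_PRIORITIES = {"VERY_HIGH", "HIGH", "NORMAL", "LOW", "VERY_LOW"}


def build_session_config(priority="NORMAL", queue=None, map_vcores=None):
    """Return a dict of session settings with string keys and string values."""
    if priority not in VALID_PRIORITIES:
        raise ValueError("unknown priority: %r" % (priority,))

    config = {"mapreduce.job.priority": priority}

    if queue is not None:
        # Routing heavy jobs to a separate YARN queue is one way to keep
        # them from starving light queries (assumes such a queue exists).
        config["mapreduce.job.queuename"] = queue

    if map_vcores is not None:
        # Values are kept as strings, on the assumption that the session
        # configuration is sent to the server as a string-to-string map.
        config["mapreduce.map.cpu.vcores"] = str(map_vcores)

    return config


# Possible usage (not executed here, needs a reachable HiveServer2):
# con = hive.connect(ip, port=10000, auth=auth, kerberos_service_name='hive',
#                    database=db_name,
#                    configuration=build_session_config("LOW", queue="batch"))
```

Whether the server accepts each of these keys still depends on its whitelist, so any key it rejects would have to be allowed by an administrator first.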
from Set Hive priory for individual query/ connection