Removing Database Level Locks in HIVE
Recently we started noticing “CREATE TABLE” taking incredibly long amounts of time to execute and after which they’d fail. A more detailed look in to the issue revealed that we had upgraded HIVE and the new version, which now allowed ACID, would lock the database by default even if ACID support was turned off. So basically, during a SELECT or INSERT was running in HIVE, HIVE created a zookeeper SHARED lock on the entire database that contained those tables.
I did some digging through the code and found this:
https://github.com/apache/hive/blob/68bc618bf0b1fd3839c3c52c2103b58719b3cb81/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#L166 Notice the lock on the whole database.
To solve this problem, this link recommends to turn off locking altogether:
http://mail-archives.apache.org/mod_mbox/hive-user/201408.mbox/%3C0eba01cfc035$3501e4f0$9f05aed0$@com%3E That was not an option for us as we were doing a full replace of the table and we want to make sure no one is reading from the table when we are replacing its contents.
Another solution, although not elegant is to unlock the zookeeper lock on the schema manually every so often. Here’s a script if you decide to go that route. If you have a better way of solving this issue please tweet to me @yashranadive
require 'zookeeper'
require 'trollop'
# Removes zookeeper SHARED locks from a database
# usage ruby remove_zookeeper_locks.rb <name_of_hive_database>
# e.g. ruby remove_zookeeper_locks.rb default
opts = Trollop::options do
opt :hive_schema, "Hive Schema to unlock", :type => :string, :required => true # string --hive-schema <s>
opt :zookeeper_server, "Zookeeper server", :type => :string, :required => false # string --zookeeper-server <s>
end
logger = Logger.new(STDOUT)
zk = Zookeeper.new(opts[:zookeeper_server] || "default_zookeeper_server:2181")
hive_schema = opts[:hive_schema]
path = "/hive_zookeeper_namespace_hive/#{hive_schema}"
nodes = zk.get_children(:path => path)[:children] || []
logger.info( "ZK Nodes for #{path}: #{nodes}" )
locks = nodes.select { |i| i[/LOCK-SHARED/] }
if (locks.nil? || locks.empty?)
logger.info( "No SHARED Locks found on #{hive_schema}" )
else
locks.map do |lock_name|
zk.delete(:path => "/hive_zookeeper_namespace_hive/#{hive_schema}/#{lock_name}")
logger.info( "Removed lock /hive_zookeeper_namespace_hive/#{hive_schema}/#{lock_name}" )
end
end