Problem: some files have a size of 0 and are still open for write, and programs that read them fail with an error.
| [root@cdh85-29 ~]# hadoop fs -du -h hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak
| 0 1.1 G hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594769101120.log.gz
| 0 1.1 G hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594856701472.log.gz
| 0 1.1 G hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594941000485.log.gz
|
The Cloudera forum has a report of a similar error:
| Cannot obtain block length for LocatedBlock
|
https://community.cloudera.com/t5/Support-Questions/Cannot-obtain-block-length-for-LocatedBlock/td-p/117517
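However, the approach from that thread did not solve my problem. Running hdfs debug recoverLease -path on the files did not close them either. Under the erasure coding policy, apparently because of some bug, these files cannot be closed.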
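For reference, the lease-recovery attempt mentioned above can be run over a whole list of paths. The following is only a sketch: `recover_leases` and the `HDFS_CMD` override are hypothetical names introduced here, and the retry count of 3 is an arbitrary choice. On the cluster described in this post the attempt did not close the erasure-coded files.

```shell
# Hypothetical helper: try lease recovery on every path read from stdin.
# HDFS_CMD is an assumption added for illustration; setting it to
# "echo hdfs" prints the commands instead of running them.
recover_leases() {
  while IFS= read -r path; do
    # Skip blank lines.
    [ -n "$path" ] || continue
    # Redirect stdin so the command cannot consume the path list.
    ${HDFS_CMD:-hdfs} debug recoverLease -path "$path" -retries 3 </dev/null
  done
}

# Example (prints the commands only):
#   HDFS_CMD="echo hdfs"
#   printf '%s\n' /flume/some/file.log.gz | recover_leases
```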
My solution:
Find the files on HDFS that were not closed properly so they can be removed:
| hadoop fsck /flume/ -files -openforwrite | grep "OPENFORWRITE" >tmp.txt
|
The contents of tmp.txt look like this:
| Connecting to namenode via http://cdh85-39:9870/fsck?ugi=root&files=1&openforwrite=1&path=%2Fflume
| /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE/pk_day=2020-11-23/pk_hour=16/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1606118400268.snappy.tmp 89401 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE: OK
| /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594769101120.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK
| /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594856701472.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK
| /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594941000485.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK
| /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594956902326.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK
|
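The listing above mixes two kinds of entries: the zero-length erasure-coded files that cause the read errors, and one 89401-byte replicated .snappy.tmp file in the pk_hour=16 partition, which may simply be a file Flume is still writing (that is an assumption, the post does not say). A minimal sketch of a stricter filter, assuming the fsck line format shown above and paths without whitespace; `zero_length_open_files` is a hypothetical name:

```shell
# Read fsck "-files -openforwrite" lines on stdin and print the path of
# every entry that is marked OPENFORWRITE and reported as 0 bytes.
# Assumes the format "<path> <size> bytes, ..." shown in tmp.txt and
# paths that contain no whitespace.
zero_length_open_files() {
  awk '/OPENFORWRITE/ && $2 == "0" && $3 == "bytes," { print $1 }'
}

# Example:
#   zero_length_open_files < tmp.txt
```

Filtering on the size keeps files that are open but already contain data out of the list, so they are not touched by later steps.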
Extract the file paths (the first column): cat tmp.txt | awk -F ' ' '{print $1}'
Move the damaged files:
| cat tmp.txt | awk -F ' ' '{print $1}' | xargs -t -I '{}' sudo -u hdfs hdfs dfs -mv {} /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/
|
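Before running a bulk move like the one above, it can help to print the commands first. The following is a sketch rather than the method used in this post: `move_open_files` and the `DRY_RUN` variable are hypothetical names, and skipping files that already sit in the destination directory is an added assumption.

```shell
# Hypothetical helper: move each path read from stdin into the directory
# given as the first argument. With DRY_RUN=1 (the default in this sketch)
# the commands are only printed, not executed.
move_open_files() {
  dest=$1
  while IFS= read -r path; do
    # Skip blank lines.
    [ -n "$path" ] || continue
    # Skip files that are already directly inside the destination.
    [ "${path%/*}" = "${dest%/}" ] && continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "sudo -u hdfs hdfs dfs -mv $path $dest"
    else
      sudo -u hdfs hdfs dfs -mv "$path" "$dest" </dev/null
    fi
  done
}

# Example (dry run):
#   awk '{print $1}' tmp.txt | move_open_files /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/
```

Setting DRY_RUN=0 performs the moves once the printed list looks right.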