
Fixing HDFS files that report a size of 0

Problem: some files have a size of 0 and are still open for write; programs that read these files fail with an error.

[root@cdh85-29 ~]# hadoop fs -du -h  hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak
0 1.1 G hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594769101120.log.gz
0 1.1 G hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594856701472.log.gz
0 1.1 G hdfs://ns1/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594941000485.log.gz

A similar error has been reported on the Cloudera forum:

Cannot obtain block length for LocatedBlock

https://community.cloudera.com/t5/Support-Questions/Cannot-obtain-block-length-for-LocatedBlock/td-p/117517

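However, that method did not solve my problem. Running hdfs debug recoverLease -path did not close the files either. These files are stored under an erasure coding policy, and possibly because of some bug there, they could not be closed.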
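For reference, the lease recovery attempt mentioned above can be sketched as a small wrapper. This is only an illustration: the function name recover_lease and the DRY_RUN switch are hypothetical, and the -retries value is an arbitrary choice. As noted, this did not close the erasure-coded files in my case, so treat it as a first thing to try rather than a fix.

```shell
# Sketch: attempt lease recovery on a single HDFS path.
# recover_lease and DRY_RUN are hypothetical names used for illustration.
# With DRY_RUN=1 (the default) the command is only printed, not executed.
recover_lease() {
  # $1: HDFS path, $2: number of retries (defaults to 3)
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "hdfs debug recoverLease -path $1 -retries ${2:-3}"
  else
    hdfs debug recoverLease -path "$1" -retries "${2:-3}"
  fi
}
```

Calling recover_lease with a path and a retry count prints the command it would run; setting DRY_RUN=0 runs it for real.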

My solution:

List the HDFS files that were not closed properly, then remove them:

hadoop fsck /flume/ -files -openforwrite | grep "OPENFORWRITE"  >tmp.txt

tmp.txt contains:

Connecting to namenode via http://cdh85-39:9870/fsck?ugi=root&files=1&openforwrite=1&path=%2Fflume
/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE/pk_day=2020-11-23/pk_hour=16/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1606118400268.snappy.tmp 89401 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE: OK
/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594769101120.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK
/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594856701472.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK
/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594941000485.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK
/flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE-1.1594956902326.log.gz 0 bytes, erasure-coded: policy=RS-6-3-1024k, 1 block(s), OPENFORWRITE: OK

cat tmp.txt | awk -F ' ' '{print $1}'
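The listing above also contains a "Connecting to namenode" line and a .snappy.tmp file of 89401 bytes that looks like it is still being written. A slightly stricter filter might therefore be useful. The sketch below assumes that only zero-length OPENFORWRITE entries are the stuck ones and that paths contain no spaces; the function name zero_byte_open_paths is hypothetical.

```shell
# Sketch: read fsck output on stdin and print only the paths of files that
# are open for write AND report "0 bytes".
# Assumptions: paths contain no whitespace; field 2 is the byte count and
# field 3 is the literal word "bytes," as in the listing above.
zero_byte_open_paths() {
  awk '/^\// && /OPENFORWRITE/ && $2 == 0 && $3 == "bytes," {print $1}'
}
```

Usage would look like `zero_byte_open_paths < tmp.txt`, which skips the header line and any non-empty file that is still in progress.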

Move the damaged files:

cat tmp.txt | awk -F ' ' '{print $1}' | xargs -t -I '{}' sudo -u hdfs hdfs dfs -mv {} /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/
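Before moving anything, it may be worth previewing the commands. The following is a minimal sketch, not the exact procedure used above: move_open_files and DRY_RUN are hypothetical names, paths are assumed to contain no spaces, and entries that already live under the destination directory are skipped so that hdfs dfs -mv is not asked to move a file onto itself.

```shell
# Sketch: read HDFS paths on stdin and move each one into a destination
# directory. move_open_files and DRY_RUN are hypothetical names.
# With DRY_RUN=1 (the default) the commands are only printed.
move_open_files() {
  dest="${1%/}"                      # destination directory, trailing slash stripped
  while IFS= read -r p; do
    case "$p" in
      "$dest"/*) continue ;;         # already inside the destination, skip it
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "sudo -u hdfs hdfs dfs -mv $p $dest/"
    else
      # </dev/null keeps the command from consuming the loop's stdin
      sudo -u hdfs hdfs dfs -mv "$p" "$dest/" </dev/null
    fi
  done
}
```

For example, piping the first column of tmp.txt into `move_open_files /flume/BankCardAuthReqDTO/CREDIT-PRODUCT-RESULT-LOG-MEMBER-RESPONSE_bak/` prints one mv command per file that is not already in that directory; rerun with DRY_RUN=0 once the list looks right.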