业务背景
约定五天前的HDFS数据为过期版本数据,写一个脚本自动删除过期版本数据
<code class="hljs lasso">$ hadoop fs -ls /user/pms/workspace/ouyangyewei/dataFound 9 itemsdrwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-01drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-02drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-03drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-04drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-05drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-06drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-07drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-08drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-09</code>
脚本实现
<code class="hljs lasso"><code class="language-awk hljs ruby"># ---------------------------------------------------------## 删除历史版本(五天前的为过期版本数据)## ---------------------------------------------------------old_version=$(hadoop fs -ls /user/pms/workspace/ouyangyewei/data | awk 'BEGIN{ five_days_ago=strftime("%F", systime()-5*24*3600) }{ split($8,arr,"/"); if(arr[7]<five_days_ago){printf "%s\n",="" $8}="" }')="" arr="(${old_version//" })="" for="" version="" in="" ${arr[@]}="" do="" hadoop="" fs="" -rmr="" $version="" done=""</code></code> 执行以后
<code class="hljs lasso"><code class="hljs lasso"><code class="language-awk hljs ruby"><code class="hljs lasso">$ hadoop fs -ls /user/pms/workspace/ouyangyewei/dataFound 4 itemsdrwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-06drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-07drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-08drwxr-xr-x - pms pms 0 2015-08-11 17:03 /user/pms/workspace/ouyangyewei/data/2015-08-09</code></code></code></code>