How to find the largest files

Recently, my company decided to move our website's media files (user-uploaded files) to a cloud storage service. While preparing for that, we realized that our media folder was too big, about 48 GiB. We need to delete unnecessary files to cut the storage cost, and part of that job is hunting down the big files.

I looked around for a command to list the largest files on Linux. Most tutorials on the Internet tell you to combine several commands, usually du | sort | head or find | sort | head. None of them satisfied me, because:

  • The du combination summarizes whole directories, so it does not reveal which individual files are big:

    $ du -hsx -- * | sort -rh | head -10
    29G     institution
    4,2G    picture
    2,0G    ckeditor-uploads
    1,9G    uploads
    1,2G    file
    1,1G    pl-report
    698M    school
    586M    articles
    143M    accommodation
    120M    institutions
    
  • The find combination gives you file sizes in raw bytes, which are not human-friendly, and the command is long and cryptic:

    $ find . -type f -printf '%s %p\n'| sort -nr | head -10
    127837320 ./file/2016/03/23/GuidebookIssue10_Compiled_Version_JPEG.pdf
    102294295 ./institution/video/2019/08/07/Shannon_College_of_Hotel_Management.mp4
    99698410 ./uploads/2019/08/14/easyuni_whatwedo_joshuachew.mp4
    99601102 ./institution/video/2019/08/07/This_Is_RCSI.mp4
    97910940 ./institution/video/2019/08/15/Baylor_Virtual_Tour__Academic_Excellence.mp4
    97436072 ./institution/video/2019/07/31/Campus_Tour_ThinkDkIT.mp4
    97350748 ./institution/video/2019/08/02/IT_Sligo_Open_Day.mp4
    97251171 ./institution/video/2019/07/31/A_Day_in_the_Life_at_GMIT.mp4
    95924392 ./institution/video/2019/07/08/Paris_Study_Tour_2_by_Sunway_Le_Cordon_Bleu.mp4
    92570745 ./institution/video/2019/08/02/International_Students_at_Independent_College_Dublin.mp4
    
  • Some people make the file size human-friendly by piping the find result to xargs ls -lh. But this adds yet another stage to the pipeline, and it is inefficient when xargs is run with -n 1 or -I, because each find result line then launches a separate ls process.
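For reference, the xargs variant from the last bullet might look like the sketch below. The function name largest_files is hypothetical (it is not from any tutorial), and the sketch assumes GNU find, xargs and ls. It passes all paths to a single ls call and uses -S so that ls keeps the biggest-first order instead of re-sorting by name.

```shell
# Hypothetical helper: list the N largest regular files under a directory
# with human-readable sizes. Assumes GNU find, xargs and ls.
# Usage: largest_files [DIR] [COUNT]   (defaults: current directory, 10)
largest_files() {
    find "${1:-.}" -type f -printf '%s %p\n' |  # size in bytes, then path
        sort -nr |                               # biggest first
        head -n "${2:-10}" |                     # keep the top N lines
        cut -d' ' -f2- |                         # drop the size column, keep the path
        xargs -r -d '\n' ls -lhS                 # one path per input line; -S sorts by size
}
```

Calling largest_files . 10 in the media folder should list the same ten files as the find command above, but with sizes such as 122M instead of raw bytes. It breaks on file names that contain newlines, and it is even longer than the command it replaces, which is rather the point.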

As a new fan of Rust, I looked around the Rust ecosystem and finally found the tool that fits me best: dust. The command is simple, short, and easy to remember:

$ dust -rF

Where:

  • -F means showing files only.

  • -r means reversing the output order, so the biggest comes first.
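To get a top-10 list comparable to the head -10 examples earlier, the same flags can be combined with -n, which limits the number of output lines. The wrapper below is only a sketch: the function name biggest_with_dust is made up for illustration, and it assumes dust is already installed (for example with cargo install du-dust, the name under which the crate is published).

```shell
# Hypothetical wrapper around dust; assumes the dust binary is on PATH.
# Usage: biggest_with_dust [DIR] [COUNT]   (defaults: current directory, 10)
biggest_with_dust() {
    # -r: reverse the output so the biggest entry comes first
    # -F: show files only
    # -n: limit the number of lines of output
    dust -rF -n "${2:-10}" "${1:-.}"
}
```

Running biggest_with_dust . 10 from inside the media folder would then show the ten largest files.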

[Screenshot: dust output]

Its display is also modern: it uses graphical characters to draw the tree, and it uses colors.

I hope this post will show up in search results and help someone with a similar need.