Simple brute force duplicate file identification

Here is a way to identify files that have duplicates.

find dir -type f -print0 | xargs -0 md5sum > filelist.txt
sort filelist.txt > filesort.txt
uniq -w 33 -D filesort.txt

# more legible
uniq -w 33 --all-repeated=separate filesort.txt # also --all-rep=sep works

This will show which files have duplicates. I saved the results in a file instead of piping everything so one can go back to filesort.txt and identify the other files which have the same md5.

Make sure you actually compare the files. Two different files can, in principle, have the same md5sum. They will usually differ in size, but a collision between two files of the same size is also possible. For fewer false positives, use sha256sum (slower) and widen the uniq comparison to cover its 64-character digest (-w 64).
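A minimal sketch of the sha256 variant together with a byte-for-byte check, assuming GNU find, xargs, sha256sum and uniq. The function names find_dupes_sha256 and same_content are made up for illustration and are not part of any tool.

```shell
# Hypothetical helper: print groups of files under a directory that share
# a SHA-256 digest, one blank line between groups (GNU tools assumed).
find_dupes_sha256() {
    find "$1" -type f -print0 | xargs -0 -r sha256sum | sort |
        uniq -w 64 --all-repeated=separate
}

# Hypothetical helper: succeed only if the two files are byte-for-byte identical.
same_content() {
    cmp -s -- "$1" "$2"
}
```

Something like `find_dupes_sha256 dir` lists the candidates, and `same_content file1 file2 && echo identical` confirms a pair before anything gets deleted.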

Crash dump analysis

Here is how to analyze a kernel crash dump on CentOS.

First, install a vmlinux with debugging symbols. To find out which kernel version you need, search the dump for it:

strings /var/crash/127.0.0.1-2013-06-22-18\:15\:01/vmcore | less

Look for the kernel version. In this case:

OSRELEASE=2.6.32-220.el6.x86_64

Go to

http://debuginfo.centos.org/6/x86_64/ and download
kernel-debuginfo-2.6.32-220.el6.x86_64.rpm  and
kernel-debuginfo-common-x86_64-2.6.32-220.el6.x86_64.rpm
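The lookup can be scripted. This is only a sketch: vmcore_release and debuginfo_rpms are hypothetical helper names, the package naming pattern is inferred from the two file names above, and `strings` is assumed to be installed.

```shell
# Hypothetical helper: print the OSRELEASE value recorded in a vmcore.
vmcore_release() {
    strings "$1" | grep -m 1 '^OSRELEASE=' | cut -d= -f2
}

# Hypothetical helper: print the two debuginfo package file names for a
# release string such as 2.6.32-220.el6.x86_64.
debuginfo_rpms() {
    release=$1
    arch=${release##*.}    # text after the last dot, e.g. x86_64
    printf 'kernel-debuginfo-common-%s-%s.rpm\n' "$arch" "$release"
    printf 'kernel-debuginfo-%s.rpm\n' "$release"
}
```

For example, `debuginfo_rpms "$(vmcore_release ./vmcore)"` prints the names to look for on the debuginfo site.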

Install on target machine with

rpm -ivh kernel-debuginfo-common-x86_64-2.6.32-220.el6.x86_64.rpm
rpm -ivh kernel-debuginfo-2.6.32-220.el6.x86_64.rpm

Finally, the crash dump analysis:

  crash /usr/lib/debug/lib/modules/2.6.32-220.el6.x86_64/vmlinux ./vmcore


Use the “bt” command to pull up a backtrace. It will tell you what program was running and what happened to cause the crash.
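If you would rather keep the output in a file than read it at the prompt, crash can run commands from a file given with -i. A rough sketch, where crash_report is a made-up wrapper name and the command list (bt, log, quit) is just one reasonable choice:

```shell
# Hypothetical wrapper: run a fixed list of crash commands non-interactively
# and save everything crash prints to a report file.
crash_report() {
    vmlinux=$1
    vmcore=$2
    report=$3
    cmds=$(mktemp)
    # bt = backtrace, log = kernel message buffer, quit = leave crash
    printf '%s\n' bt log quit > "$cmds"
    crash -i "$cmds" "$vmlinux" "$vmcore" > "$report"
    status=$?
    rm -f "$cmds"
    return "$status"
}
```

Usage would look like `crash_report /usr/lib/debug/lib/modules/2.6.32-220.el6.x86_64/vmlinux ./vmcore report.txt`.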

robocopy hints

rem robocopy options
rem /dst   :: compensate for one-hour daylight saving time differences
rem /R:20  :: retry 20 times (the default wait between tries is 30 seconds; /W:n changes it)
rem /e     :: copy subdirectories, including empty ones
rem /XO    :: exclude older files
rem /mov   :: move files (delete from source after copying); used for removing from ds1
rem /purge :: remove dest files no longer in source


rem debugging
rem /L        :: list only; test the command without copying anything
rem /LOG:file :: log to file
rem /TEE      :: file and console output
rem /V        :: verbose
rem /np       :: don't show percentage copied

robocopy g:\source h:\dest /log:g:\source-copy.log /tee /dst /e /r:20 /xo /np

Dircolors

dircolors is something I would like to hate. It is so nice, though. I found some handy workarounds for when you are straining to see some of the colors, perhaps because you opted for a light background.

There are a few ways to go about this depending on your flavor of system.

First, you need a dircolors file designed for light backgrounds. It can often be found at /etc/DIR_COLORS.lightbgcolor.

One way to get this to work is to set the LS_COLORS environment variable from that file:

eval `dircolors /etc/DIR_COLORS.lightbgcolor`
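For a shell startup file, the same eval can be guarded so that it does nothing on machines where the file or the dircolors command is missing. A small sketch, where use_light_dircolors is a hypothetical function name and the default path is the one mentioned above:

```shell
# Hypothetical helper: load a light-background color scheme into LS_COLORS,
# but only if the file is readable and dircolors is installed.
use_light_dircolors() {
    colors_file=${1:-/etc/DIR_COLORS.lightbgcolor}
    if [ -r "$colors_file" ] && command -v dircolors >/dev/null 2>&1; then
        # dircolors -b prints Bourne-shell code that sets and exports LS_COLORS
        eval "$(dircolors -b "$colors_file")"
    fi
}
```

Calling `use_light_dircolors` from ~/.bashrc would then apply the scheme to every new shell.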

Another way to get the behavior, on some flavors of *nix, is to link this file to a dot file in your home directory.

ln -s /etc/DIR_COLORS.lightbgcolor ~/.dir_colors

Resetting file permissions

One of the worst things about NTFS, as far as I am concerned, is that it forces permissions on removable media, when those permissions can so easily be overridden. Here is how to reset the permissions on removable media from Windows 7.

  • start up cmd as administrator
  • takeown /f d:\path /r
  • icacls d:\path /reset /T

d:\path is the drive and path to the file or directory you want to reset.

Those attending fart class always try for a passing grade. #joke

ssh speed up

Here is a new one. When ssh connections are slow to establish, put

GSSAPIAuthentication no

into /etc/ssh/ssh_config.
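Without root access, the same option can go into a per-user config file instead. The following is a sketch under assumptions: disable_gssapi is a made-up function name, the default target is ~/.ssh/config (not the system-wide file above), and the ~/.ssh directory is assumed to exist already.

```shell
# Hypothetical helper: append "GSSAPIAuthentication no" for all hosts to an
# ssh client config file, unless the option is already set somewhere in it.
disable_gssapi() {
    config=${1:-$HOME/.ssh/config}
    if ! grep -qi '^[[:space:]]*GSSAPIAuthentication[[:space:]=]' "$config" 2>/dev/null; then
        printf 'Host *\n    GSSAPIAuthentication no\n' >> "$config"
    fi
}
```

To try the setting for a single connection first, `ssh -o GSSAPIAuthentication=no user@host` has the same effect without touching any file.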