Find corrupted images
So I deleted all my pictures and then restored them, which left me with thousands of corrupted images. To sort them out, I wrote a script in MATLAB using the Image Processing Toolbox.
Using MATLAB, I first tried to determine what a corrupted image looks like. When opening one with the Image Processing Toolbox, I noticed these warnings:
g = imread(s);
Warning: JPEG library error (8 bit), "Corrupt JPEG data: premature end of data segment".
Warning: JPEG library error (8 bit), "Invalid JPEG file structure: two SOI markers".
Also, a histogram of such a file looked like this:
So the only challenge is to find the spike, or more simply, to detect an unusually high percentage of pixels with the value 128, the middle of the 8-bit range. Simple enough.
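cd('/media/95543211-fd8f-4fc9-9b24-3a787113e4c2/+JPEG');
jpegs = dir('.');
num_files = 100;                              % process at most this many files per run
file_count = length(jpegs);
last_idx = min(num_files + 2, file_count);    % do not index past the listing
G = zeros(1, num_files);                      % fraction of 128-valued pixels per file
% too_much_gray/ and noerr/ must already exist
for i = 3:last_idx                            % entries 1 and 2 are assumed to be '.' and '..'
    name = jpegs(i).name;
    disp(['working: ' name]);
    try
        I = rgb2gray(imread(name));
        [w, l] = size(I);
        gray_percent = sum(sum(I == 128)) / (w * l);
        G(i-2) = gray_percent;
        if gray_percent > 0.07
            disp(['moving . . . ' name]);
            movefile(name, ['too_much_gray/' name]);
        else
            disp(['good: ' name]);
            movefile(name, ['noerr/' name]);
        end
    catch
        % imread, rgb2gray, or movefile failed (unreadable file, non-RGB image, directory)
        disp(['bad: ' name]);
    end
end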
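To make the threshold concrete, here is a minimal sketch of the same check on a toy pixel grid, written in Ruby rather than MATLAB. The helper name `gray_fraction` and the toy values are made up for illustration; only the value 128 and the 0.07 cutoff come from the script above.

```ruby
# Fraction of pixels equal to `value` in a 2-D array of 8-bit gray levels.
# gray_fraction is a hypothetical helper name for this sketch.
def gray_fraction(pixels, value = 128)
  flat = pixels.flatten
  return 0.0 if flat.empty?
  flat.count(value).to_f / flat.size
end

# Toy 4x5 "image": the top two rows hold ordinary data, the bottom two
# rows are all 128, mimicking the spike seen in the histogram.
toy = [
  [ 12, 200,  37,  90, 128],
  [ 45,  61, 250,   3,  77],
  [128, 128, 128, 128, 128],
  [128, 128, 128, 128, 128]
]

fraction = gray_fraction(toy)   # 11 of the 20 pixels are 128
puts fraction
puts fraction > 0.07 ? 'too much gray' : 'good'
```

A healthy photo will usually have some pixels at exactly 128 too, which is why the check compares against a cutoff instead of testing for any 128 at all.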
Then a bash script, using ImageMagick's identify, to see which images might be corrupt, and a Ruby script to move the results (yeah, really inefficient, I know), so files that might crash MATLAB are at least removed. First the bash script:
#!/bin/bash
for f in *
do
    # echo "Processing $f file..."
    # take action on each file. $f store current file name
    if ! identify "$f" &> /dev/null; then
        echo "$f"
    fi
done
And the Ruby script (yes, this is silly):
#!/home/bonhoeffer/.rvm/rubies/ruby-1.9.3-p286/bin/ruby
filez = <<EOF
__003999
__026328
__029322
__032335
__035823
__035842
__036090
__038688
__039670
__048554
__048561
__048634
19991215_22_43_43_033877
19991215_22_43_43_034820
19991215_22_43_43_049844
19991215_22_43_56_038011
19991215_22_44_16_010202
20070729_14_42_57_048540
EOF
puts filez.split(' ').size
filez.split(' ').each do |f|
  puts "mv #{f} matlab_bad/#{f}"
  `mv #{f} matlab_bad/#{f}`
end
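A less silly variant would skip the pasted list and the shell-out to mv. The sketch below is one way to do that, not what I actually ran: it takes any list of names (for example the lines printed by the bash script) and moves them with Ruby's standard FileUtils. The method name `move_listed` is hypothetical; `matlab_bad` is the same target directory as above.

```ruby
require 'fileutils'

# Move each listed file into dest_dir and return the names that were moved.
# move_listed is a hypothetical name for this sketch.
def move_listed(names, dest_dir)
  FileUtils.mkdir_p(dest_dir)            # create the target directory if needed
  moved = []
  names.each do |raw|
    name = raw.strip                     # drop the trailing newline, if any
    next if name.empty?
    next unless File.file?(name)         # skip names that no longer exist
    FileUtils.mv(name, File.join(dest_dir, name))
    moved << name
  end
  moved
end

# Possible use, reading one file name per line from standard input:
#   moved = move_listed($stdin.each_line, 'matlab_bad')
#   puts "moved #{moved.size} files"
```

Using FileUtils.mv instead of backticks also means names with spaces or shell metacharacters are passed through untouched, since no shell is involved.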