Find corrupted images
So I deleted all my pictures, and restoring them left me with thousands of corrupted images. To sort them out, I wrote a MATLAB script using the Image Processing Toolbox.
Using MATLAB, I first tried to determine what a corrupted image looks like. When opening one with the Image Processing Toolbox, I noticed these warnings:
g = imread(s);
Warning: JPEG library error (8 bit), "Corrupt JPEG data: premature end of data segment".
Warning: JPEG library error (8 bit), "Invalid JPEG file structure: two SOI markers".
Also, a histogram of such a file looked like this:
So the only challenge is to find the spike, or more simply, an unusually high percentage of pixels with the value 128, the mid-gray level of an 8-bit image. Simple enough.
cd('/media/95543211-fd8f-4fc9-9b24-3a787113e4c2/+JPEG');
jpegs = dir('.');
jpegs = jpegs(~[jpegs.isdir]);            % skip '.', '..' and the output folders
num_files = min(100, length(jpegs));      % process at most 100 files per run
G = zeros(1, num_files);                  % gray fraction of each file
for i = 1:num_files
    name = jpegs(i).name;
    disp(['working: ' name]);
    try
        I = imread(name);
        if size(I, 3) == 3
            I = rgb2gray(I);              % rgb2gray errors on grayscale input
        end
        gray_percent = sum(I(:) == 128) / numel(I);
        G(i) = gray_percent;
        if gray_percent > 0.07
            disp(['moving . . . ' name]);
            movefile(name, ['too_much_gray/' name]);
        else
            disp(['good: ' name]);
            movefile(name, ['noerr/' name]);
        end
    catch
        disp(['bad: ' name]);             % imread or movefile failed; file stays put
    end
end
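For readers without MATLAB, the heuristic itself can be sketched in plain Ruby. This is only an illustration: it works on an array of gray levels instead of decoding a JPEG, and the names gray_fraction and too_much_gray? are made up for this sketch. The 0.07 cut-off is the one used in the MATLAB script above.

```ruby
# Fraction of pixels whose gray level equals `level` (128 by default).
def gray_fraction(pixels, level = 128)
  return 0.0 if pixels.empty?
  pixels.count(level).to_f / pixels.size
end

# Same 7% cut-off as the MATLAB script above.
def too_much_gray?(pixels, threshold = 0.07)
  gray_fraction(pixels) > threshold
end

# Synthetic data: a flat histogram, and the same image with its
# second half replaced by mid-gray, as in a truncated JPEG.
healthy   = (0..255).to_a * 4
truncated = healthy.first(512) + [128] * 512

puts too_much_gray?(healthy)    # prints false
puts too_much_gray?(truncated)  # prints true
```

A legitimately gray photo could trip the same check, so the threshold is a judgment call and not a guarantee.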
Next, a shell script lists the files that ImageMagick's identify cannot read, and a Ruby script moves those files out of the way (yeah, really inefficient, I know). That way, files that might crash MATLAB are at least removed.
#!/bin/bash
# Print the name of every file that identify cannot parse.
for f in *
do
    # $f holds the current file name
    if ! identify "$f" &> /dev/null; then
        echo "$f"
    fi
done
And the Ruby script (yes, this is silly):
#!/home/bonhoeffer/.rvm/rubies/ruby-1.9.3-p286/bin/ruby
filez = <<EOF
__003999
__026328
__029322
__032335
__035823
__035842
__036090
__038688
__039670
__048554
__048561
__048634
19991215_22_43_43_033877
19991215_22_43_43_034820
19991215_22_43_43_049844
19991215_22_43_56_038011
19991215_22_44_16_010202
20070729_14_42_57_048540
EOF
puts filez.split(' ').size
filez.split(' ').each do |f|
  puts "mv #{f} matlab_bad/#{f}"
  `mv #{f} matlab_bad/#{f}`
end
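The two steps above (list the files identify rejects, then move them) could probably be folded into one Ruby script, which avoids pasting file names by hand. The following is a rough sketch, not what I actually ran: identify_ok? and quarantine_unreadable are hypothetical helper names, it assumes ImageMagick's identify is on the PATH, and Dir.children needs Ruby 2.5 or newer, which is later than the 1.9.3 in the shebang above.

```ruby
require "fileutils"

# True when ImageMagick's `identify` exits successfully for the file.
# `system` returns nil when the command cannot be run at all; in that
# case we stop instead of treating every file as corrupt.
def identify_ok?(path)
  ok = system("identify", path, out: File::NULL, err: File::NULL)
  raise "identify not found on PATH" if ok.nil?
  ok
end

# Moves every regular file in `dir` that fails `checker` into
# `dir/dest_name` and returns the names that were moved.
def quarantine_unreadable(dir, dest_name = "matlab_bad", checker = method(:identify_ok?))
  dest = File.join(dir, dest_name)
  FileUtils.mkdir_p(dest)
  moved = []
  Dir.children(dir).sort.each do |name|
    path = File.join(dir, name)
    next unless File.file?(path)   # skip sub-directories, including dest
    next if checker.call(path)     # readable files stay where they are
    FileUtils.mv(path, File.join(dest, name))
    moved << name
  end
  moved
end
```

Calling quarantine_unreadable(".") from the photo directory would then do both steps in one pass. The checker argument is there so the function can be tried out with a stand-in lambda before pointing it at real files.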
