Concepts
Most training images are high-quality photos taken with digital cameras and usually come with labels. The domain of interest in the real world, which often has no labels, can differ in some combination of factors, including scene, intra-category variation, object location, and pose. Domain adaptation algorithms try to minimize the performance degradation caused by this domain shift.
Experiment
This is an attempt to reproduce Section 4.2 of the paper "DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition".
Office Dataset
Download link. The dataset includes raw images, SURF features, SURF features with object ids, and DeCAF features, covering 31 categories (pen, scissors, bike, ...). Tree view of my Office dataset folder (up to level 2):
t-SNE clustering results using Surf features and Decaf features
- t-SNE home page at TU Delft, Matlab implementation download link, usage example (MNIST dataset):
Tree view of my t-SNE package folder, which should be added to Matlab's search path:
>> train_X = loadMNISTImages('DataTest/mnist/train-images-idx3-ubyte');
>> train_X = train_X';
>> train_labels = loadMNISTLabels('DataTest/mnist/train-labels-idx1-ubyte');
>> ind = randperm(size(train_X,1));
>> train_X_show = train_X(ind(1:5000), :);
>> train_labels_show = train_labels(ind(1:5000));
>> no_dims = 2;
>> init_dims = 30;
>> perplexity = 30;
>> mappedX = tsne(train_X_show, train_labels_show, no_dims, init_dims, perplexity);
Preprocessing data using PCA...
Computed P-values 500 of 5000 datapoints...
Computed P-values 1000 of 5000 datapoints...
Computed P-values 1500 of 5000 datapoints...
Computed P-values 2000 of 5000 datapoints...
. . .
Iteration 990: error is 1.3156
Iteration 1000: error is 1.315
>> gscatter(mappedX(:,1), mappedX(:,2), train_labels_show);
The loadMNISTImages() and loadMNISTLabels() functions can be found in the UFLDL tutorial. The maximum number of iterations can be changed in the tsne_p.m file. Result:
- create thumbnails (30x30)
script for creating thumbnails (resize.sh):
#!/bin/bash
# create directories: mirror the tree under ./thumbnails/30x30 (the amazon domain is skipped)
find ./images/ -type d -not -path "./images/amazon*" -print0 |
while IFS= read -r -d '' Folder
do
    NewFolder30x30=${Folder/images/thumbnails\/30x30}
    echo "mkdir: $NewFolder30x30"
    mkdir -p "$NewFolder30x30"
done
# convert images: "!" forces exactly 30x30 and ignores the aspect ratio
find ./images/ -name "*.jpg" -not -path "./images/amazon/*" -print0 |
while IFS= read -r -d '' Img
do
    NewImg30x30=${Img/images/thumbnails\/30x30}
    echo "Resizing image: $NewImg30x30"
    convert "$Img" -resize '30x30!' "$NewImg30x30"
done
- load features and run t-SNE
For SURF features (surf.m):
clear, clc;
% list every 800-bin SURF histogram file, skipping the amazon domain
system('find "../SurfFeatures/" -name "*800*.mat" -not -path "../SurfFeatures/amazon/*" > fea_WRAPUP.txt');
% load feature vectors
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s');
fclose(fid);
feaVecs = zeros(length(feaFiles{1}), 800);
for i = 1:length(feaFiles{1})
    S = load(feaFiles{1}{i});      % load into a struct so the variable cannot shadow histogram()
    feaVecs(i,:) = S.histogram';
end
% construct labels: splitting each path on '/' puts the domain in field 3,
% the category in field 5 and the file name in field 6
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s%s%s%s%s%s', 'delimiter', '/');
fclose(fid);
labels = zeros(length(feaFiles{1}), 1);
uniqueDomain = unique(feaFiles{3});
uniqueCate = unique(feaFiles{5});
for i = 1:length(feaFiles{1})
    % one label per (category, domain) pair; assumes two domains (dslr, webcam)
    labels(i) = 2 * find(strcmp(uniqueCate, feaFiles{5}(i))) + ...
        find(strcmp(uniqueDomain, feaFiles{3}(i))) - 3;
end
% run tsne clustering
no_dims = 2;
init_dims = 30;
perplexity = 30;
mappedX = tsne(feaVecs, labels, no_dims, init_dims, perplexity);
%gscatter(mappedX(:,1), mappedX(:,2), labels);
% write to file for displaying images overlaid on each other
x_min = min(mappedX(:, 1));
x_max = max(mappedX(:, 1));
y_min = min(mappedX(:, 2));
y_max = max(mappedX(:, 2));
% white canvas: 10 pixels per embedding unit plus a 60-pixel margin
BIGIMG = ones(int16(x_max - x_min) * 10 + 60, ...
    int16(y_max - y_min) * 10 + 60, 3) * 255;
wid = fopen('tsne.res', 'w');
for i = 1:length(feaFiles{1})
    imgSrc = strcat('../thumbnails/30x30/',feaFiles{3}{i},'/images/',...
        feaFiles{5}{i}, '/frame_', feaFiles{6}{i}(11:14),'.jpg');
    fprintf(wid, '%f %f %s\n', mappedX(i, 1), mappedX(i, 2), imgSrc);
    img = imread(imgSrc);
    % int16() rounds before scaling, so thumbnails snap to a 10-pixel grid
    x_start = int16(mappedX(i, 1) - x_min) * 10 + 15;
    y_start = int16(mappedX(i, 2) - y_min) * 10 + 15;
    BIGIMG(x_start:(x_start + 29), y_start:(y_start+29), :) = img;
end
fclose(wid);
BIGIMG = BIGIMG/255;   % rescale to [0, 1] for imshow and imwrite
imshow(BIGIMG);
imwrite(BIGIMG, 'surftsne.jpg', 'jpg');
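The label formula in surf.m packs the category index and the domain index into a single integer, so that every (category, domain) pair gets its own label in the t-SNE plot. Here is a minimal sketch of that arithmetic, assuming exactly two domains remain once amazon is filtered out. The category names are a hypothetical three-item subset, not the full 31-category list, and the decoding step at the end is an addition for illustration only.

```matlab
% Hypothetical toy inputs; the real script reads these from the file paths.
domains    = {'dslr'; 'webcam'; 'dslr'; 'webcam'; 'webcam'};
categories = {'bike'; 'bike'; 'pen'; 'pen'; 'scissors'};

uniqueDomain = unique(domains);      % sorted: {'dslr'; 'webcam'}
uniqueCate   = unique(categories);   % sorted: {'bike'; 'pen'; 'scissors'}

labels = zeros(numel(domains), 1);
for i = 1:numel(domains)
    cateIdx   = find(strcmp(uniqueCate, categories{i}));    % 1-based category index
    domainIdx = find(strcmp(uniqueDomain, domains{i}));     % 1 = dslr, 2 = webcam
    % With two domains, 2*cateIdx + domainIdx - 3 numbers the pairs from 0.
    labels(i) = 2 * cateIdx + domainIdx - 3;
end

% Decode each label back into its category and domain (illustration only).
decodedCate   = uniqueCate(floor(labels / 2) + 1);
decodedDomain = uniqueDomain(mod(labels, 2) + 1);

disp(labels');   % labels are 0 1 2 3 5
```

Even labels therefore belong to dslr and odd labels to webcam. With a third domain the factor 2 and the offset 3 would have to change.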
For DeCAF features (decaf.m):
clear, clc;
% list every DeCAF feature file, skipping the amazon domain
system('find "../DecafFeatures/" -name "*.mat" -not -path "../DecafFeatures/amazon/*" > fea_WRAPUP.txt');
% load feature vectors
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s');
fclose(fid);
feaVecs = zeros(length(feaFiles{1}), 4096);
for i = 1:length(feaFiles{1})
    S = load(feaFiles{1}{i});      % each file stores the 4096-d fc7 activation
    feaVecs(i,:) = S.fc7';
end
% construct labels: splitting each path on '/' puts the domain in field 3,
% the category in field 5 and the file name in field 6
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s%s%s%s%s%s', 'delimiter', '/');
fclose(fid);
labels = zeros(length(feaFiles{1}), 1);
uniqueDomain = unique(feaFiles{3});
uniqueCate = unique(feaFiles{5});
for i = 1:length(feaFiles{1})
    % one label per (category, domain) pair; assumes two domains (dslr, webcam)
    labels(i) = 2 * find(strcmp(uniqueCate, feaFiles{5}(i))) + ...
        find(strcmp(uniqueDomain, feaFiles{3}(i))) - 3;
end
% run tsne clustering
no_dims = 2;
init_dims = 30;
perplexity = 30;
mappedX = tsne(feaVecs, labels, no_dims, init_dims, perplexity);
%gscatter(mappedX(:,1), mappedX(:,2), labels);
% write to file for displaying images overlaid on each other
x_min = min(mappedX(:, 1));
x_max = max(mappedX(:, 1));
y_min = min(mappedX(:, 2));
y_max = max(mappedX(:, 2));
% white canvas: 10 pixels per embedding unit plus a 60-pixel margin
BIGIMG = ones(int16(x_max - x_min) * 10 + 60, ...
    int16(y_max - y_min) * 10 + 60, 3) * 255;
wid = fopen('tsne.res', 'w');
for i = 1:length(feaFiles{1})
    imgSrc = strcat('../thumbnails/30x30/',feaFiles{3}{i},'/images/',...
        feaFiles{5}{i}, '/', feaFiles{6}{i}(1:10),'.jpg');
    fprintf(wid, '%f %f %s\n', mappedX(i, 1), mappedX(i, 2), imgSrc);
    img = imread(imgSrc);
    % int16() rounds before scaling, so thumbnails snap to a 10-pixel grid
    x_start = int16(mappedX(i, 1) - x_min) * 10 + 15;
    y_start = int16(mappedX(i, 2) - y_min) * 10 + 15;
    BIGIMG(x_start:(x_start + 29), y_start:(y_start+29), :) = img;
end
fclose(wid);
BIGIMG = BIGIMG/255;   % rescale to [0, 1] for imshow and imwrite
imshow(BIGIMG);
imwrite(BIGIMG, 'decaftsne.jpg', 'jpg');
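Both scripts paste each 30x30 thumbnail onto a white canvas at a position derived from its t-SNE coordinates. The sketch below isolates that mapping. The three points are made up and are not real t-SNE output, and a gray square stands in for each thumbnail so that no image files are needed. Two details are easy to miss: int16() rounds the offset before it is multiplied by 10, and the first t-SNE dimension indexes canvas rows rather than columns.

```matlab
% Hypothetical 2-D embedding of three images (stand-in for mappedX).
mappedX = [-2.4 1.0; 0.0 -3.0; 3.6 2.2];

x_min = min(mappedX(:, 1));  x_max = max(mappedX(:, 1));
y_min = min(mappedX(:, 2));  y_max = max(mappedX(:, 2));

% Canvas size: 10 pixels per embedding unit plus a 60-pixel margin.
nRows  = double(int16(x_max - x_min)) * 10 + 60;
nCols  = double(int16(y_max - y_min)) * 10 + 60;
canvas = ones(nRows, nCols, 3);             % white background in [0, 1]

for i = 1:size(mappedX, 1)
    % Top-left corner of thumbnail i: round first, then scale.
    r = double(int16(mappedX(i, 1) - x_min)) * 10 + 15;
    c = double(int16(mappedX(i, 2) - y_min)) * 10 + 15;
    % Paint a gray 30x30 square in place of a real thumbnail.
    canvas(r:(r + 29), c:(c + 29), :) = 0.5;
    fprintf('point %d -> rows %d:%d, cols %d:%d\n', i, r, r + 29, c, c + 29);
end
```

Because of the rounding, two images whose coordinates differ by less than one embedding unit can land on exactly the same spot, in which case the later one hides the earlier one.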
Result using surf features:
Result using decaf features:
- domain shift classification (objrecog.m)
For the amazon domain, take 20 samples from each category; for the dslr and webcam domains, take 8 samples from each category. Using the libsvm tool, first train a model on the source domain (optionally combined with training samples from the target webcam domain), then test the model on the remaining samples in the webcam domain. The preprocessing step loads the data and splits it 10 times.
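The sampling protocol above can be sketched as follows. This is an illustration under stated assumptions, not the contents of objrecog.m: the toy label vector, the variable names, and the use of random sampling without a fixed seed are all hypothetical. For each of the 10 splits it draws a fixed number of training samples per category and keeps the rest for testing.

```matlab
% Hypothetical toy labels: 3 categories with 12 webcam images each.
labels   = repmat((1:3)', 12, 1);
nPerCate = 8;       % 8 per category for dslr/webcam, 20 for amazon
nSplits  = 10;

cates    = unique(labels);
trainIdx = cell(nSplits, 1);
testIdx  = cell(nSplits, 1);
for s = 1:nSplits
    isTrain = false(numel(labels), 1);
    for k = 1:numel(cates)
        members = find(labels == cates(k));                     % all samples of category k
        picked  = members(randperm(numel(members), nPerCate));  % nPerCate of them at random
        isTrain(picked) = true;
    end
    trainIdx{s} = find(isTrain);
    testIdx{s}  = find(~isTrain);                               % the remaining samples
end

% With libsvm's Matlab interface on the path, split s could then be used like
% this (srcLabels, srcFeatures and features are hypothetical variables, and the
% linear kernel '-t 0' is an assumption):
%   model = svmtrain(srcLabels, srcFeatures, '-t 0');
%   [pred, acc, ~] = svmpredict(labels(testIdx{s}), features(testIdx{s}, :), model);
```

Averaging the accuracy over the 10 splits gives a more stable number than a single split, since each split tests on a different set of remaining images.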
Result: