Keep Sharp

Domain adaptation with caffe features (Experiment)

Concepts

While most training images are high-quality photos taken with digital cameras, often with labels, the domain of interest in the real world can differ in some combination of factors, including scene, intra-category variation, object location and pose, and often comes without labels. Domain adaptation algorithms try to minimize the performance degradation caused by this domain shift.

Experiment

This experiment tries to reproduce part 4.2 of the paper: DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition

Office Dataset

Download link: the download includes raw images, SurfFeatures, SurfFeatures with object IDs, and DeCAF features, covering 31 categories (pen, scissors, bike, ...). My Office dataset folder tree view (up to level 2):

t-SNE clustering results using Surf features and Decaf features

Tree view of my t-SNE package folder. Add this folder to MATLAB's search path.

>> train_X = loadMNISTImages('DataTest/mnist/train-images-idx3-ubyte');
>> train_X = train_X';
>> train_labels = loadMNISTLabels('DataTest/mnist/train-labels-idx1-ubyte');
>> ind = randperm(size(train_X,1));
>> train_X_show = train_X(ind(1:5000), :); 
>> train_labels_show = train_labels(ind(1:5000));
>> no_dims = 2;
>> init_dims = 30; 
>> perplexity = 30; 
>> mappedX = tsne(train_X_show, train_labels_show, no_dims, init_dims, perplexity);
Preprocessing data using PCA...
Computed P-values 500 of 5000 datapoints...
Computed P-values 1000 of 5000 datapoints...
Computed P-values 1500 of 5000 datapoints...
Computed P-values 2000 of 5000 datapoints...

. . .

Iteration 990: error is 1.3156
Iteration 1000: error is 1.315
>> gscatter(mappedX(:,1), mappedX(:,2), train_labels_show);

The loadMNISTImages() and loadMNISTLabels() functions can be found in the UFLDL tutorial. The maximum number of iterations can be changed in the tsne_p.m file. Result:

  • create thumbnails (30x30)

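Script for creating thumbnails (resize.sh):

#!/bin/bash

# create the thumbnail directory tree (the amazon domain is skipped)
find ./images/ -type d -not -path "./images/amazon*" -print0 |
while IFS= read -r -d '' Folder
do
    # replace the first "images" in the path with "thumbnails/30x30"
    NewFolder30x30=${Folder/images/thumbnails\/30x30}
    echo "mkdir: $NewFolder30x30"
    mkdir -p "$NewFolder30x30"
done

# convert images (quoting keeps paths with spaces intact)
find ./images/ -name "*.jpg" -not -path "./images/amazon/*" -print0 |
while IFS= read -r -d '' Img
do
    NewImg30x30=${Img/images/thumbnails\/30x30}
    echo "Resizing image: $NewImg30x30"
    # the "!" forces exactly 30x30, ignoring the aspect ratio
    convert "$Img" -resize '30x30!' "$NewImg30x30"
done

  • load features and run t-SNE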

For Surf features(surf.m):

clear, clc;
system('find "../SurfFeatures/" -name "*800*.mat" -not -path "../SurfFeatures/amazon/*" > fea_WRAPUP.txt');

% load feature vectors
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s');

feaVecs = zeros(length(feaFiles{1}), 800);
for i = 1:length(feaFiles{1})
    % load into a struct so the variable does not shadow MATLAB's histogram()
    s = load(feaFiles{1}{i});
    feaVecs(i,:) = s.histogram';
end

fclose(fid);


% construct labels
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s%s%s%s%s%s', 'delimiter', '/');
labels = zeros(length(feaFiles{1}), 1);

uniqueDomain = unique(feaFiles{3});
uniqueCate = unique(feaFiles{5});

for i = 1:length(feaFiles{1})
    labels(i) = 2 * find(strcmp(uniqueCate, feaFiles{5}(i)))+ ...
                find(strcmp(uniqueDomain, feaFiles{3}(i))) - 3;
end

% run tsne clustering
no_dims = 2;
init_dims = 30;
perplexity = 30;
mappedX = tsne(feaVecs, labels, no_dims, init_dims, perplexity);
%gscatter(mappedX(:,1), mappedX(:,2), labels);

% write to file for displaying images overlayed with each other
x_min = min(mappedX(:, 1));
x_max = max(mappedX(:, 1));
y_min = min(mappedX(:, 2));
y_max = max(mappedX(:, 2));
BIGIMG = ones(int16(x_max - x_min) * 10 + 60, ...
    int16(y_max - y_min) * 10 + 60, 3) * 256;

wid = fopen('tsne.res', 'w');
for i = 1:length(feaFiles{1})
    imgSrc = strcat('../thumbnails/30x30/',feaFiles{3}{i},'/images/',...
                    feaFiles{5}{i}, '/frame_', feaFiles{6}{i}(11:14),'.jpg');
    fprintf(wid, '%f %f %s\n', mappedX(i, 1), mappedX(i, 2), imgSrc);
    img = imread(imgSrc);
    x_start = int16(mappedX(i, 1) - x_min) * 10 + 15;
    y_start = int16(mappedX(i, 2) - y_min) * 10 + 15;
    BIGIMG(x_start:(x_start + 29), y_start:(y_start+29), :) = img;
end

fclose(fid);
fclose(wid);

BIGIMG = BIGIMG/256;
imshow(BIGIMG);
imwrite(BIGIMG, 'surftsne.jpg', 'jpg');
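
The label formula in surf.m packs the category and the domain into one integer. As a small worked sketch (the category and domain names below are illustrative placeholders, and it assumes only two domains remain once amazon is excluded), each (category, domain) pair gets a distinct label starting at 0:

```matlab
% Illustrative placeholder lists; in surf.m these come from unique() on the path parts.
uniqueCate   = {'back_pack'; 'bike'; 'calculator'};
uniqueDomain = {'dslr'; 'webcam'};

labels = zeros(numel(uniqueCate), numel(uniqueDomain));
for c = 1:numel(uniqueCate)
    for d = 1:numel(uniqueDomain)
        % same formula as in surf.m: 2 * categoryIndex + domainIndex - 3
        labels(c, d) = 2 * c + d - 3;
    end
end
disp(labels);   % rows are categories, columns are domains
```

With two domains, even labels belong to the first domain and odd labels to the second, so floor(label / 2) + 1 recovers the category index and mod(label, 2) + 1 the domain index.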

For DeCaf features(decaf.m):

clear, clc;
system('find "../DecafFeatures/" -name "*.mat" -not -path "../DecafFeatures/amazon/*" > fea_WRAPUP.txt');

% load feature vectors
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s');

feaVecs = zeros(length(feaFiles{1}), 4096);
for i = 1:length(feaFiles{1})
    load(feaFiles{1}{i});
    feaVecs(i,:) = fc7';
end

fclose(fid);


% construct labels
fid = fopen('fea_WRAPUP.txt');
feaFiles = textscan(fid, '%s%s%s%s%s%s', 'delimiter', '/');
labels = zeros(length(feaFiles{1}), 1);

uniqueDomain = unique(feaFiles{3});
uniqueCate = unique(feaFiles{5});

for i = 1:length(feaFiles{1})
    labels(i) = 2 * find(strcmp(uniqueCate, feaFiles{5}(i)))+ ... 
                find(strcmp(uniqueDomain, feaFiles{3}(i))) - 3;   
end

% run tsne clustering
no_dims = 2;
init_dims = 30; 
perplexity = 30; 
mappedX = tsne(feaVecs, labels, no_dims, init_dims, perplexity);
%gscatter(mappedX(:,1), mappedX(:,2), labels);

% write to file for displaying images overlayed with each other
x_min = min(mappedX(:, 1));
x_max = max(mappedX(:, 1));
y_min = min(mappedX(:, 2));
y_max = max(mappedX(:, 2));
BIGIMG = ones(int16(x_max - x_min) * 10 + 60, ... 
    int16(y_max - y_min) * 10 + 60, 3) * 256;

wid = fopen('tsne.res', 'w');
for i = 1:length(feaFiles{1})
    imgSrc = strcat('../thumbnails/30x30/',feaFiles{3}{i},'/images/',...
                    feaFiles{5}{i}, '/', feaFiles{6}{i}(1:10),'.jpg');
    fprintf(wid, '%f %f %s\n', mappedX(i, 1), mappedX(i, 2), imgSrc);
    img = imread(imgSrc);
    x_start = int16(mappedX(i, 1) - x_min) * 10 + 15; 
    y_start = int16(mappedX(i, 2) - y_min) * 10 + 15; 
    BIGIMG(x_start:(x_start + 29), y_start:(y_start+29), :) = img;
end

fclose(fid);
fclose(wid);

BIGIMG = BIGIMG/256;
imshow(BIGIMG);
imwrite(BIGIMG, 'decaftsne.jpg', 'jpg');
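
Both scripts paste the thumbnails onto one large canvas at positions derived from the t-SNE coordinates. The sketch below isolates that mapping on synthetic data; the coordinates and the random thumbnails are placeholders, and the names scale, margin and canvas are introduced here only for illustration:

```matlab
% Synthetic 2-D embedding and random 30x30 RGB thumbnails (placeholders for real data).
mappedX = [-3.2 1.4; 0.0 0.0; 4.7 -2.6];
thumbs  = uint8(randi([0 255], 30, 30, 3, size(mappedX, 1)));

scale  = 10;   % pixels per t-SNE unit, as in the scripts
margin = 15;   % offset from the canvas border, as in the scripts

x_min = min(mappedX(:, 1));  x_max = max(mappedX(:, 1));
y_min = min(mappedX(:, 2));  y_max = max(mappedX(:, 2));

% White canvas; the extra 60 pixels = 2 * margin + 30 leave room for the last thumbnail.
canvas = ones(round(x_max - x_min) * scale + 60, ...
              round(y_max - y_min) * scale + 60, 3);

for i = 1:size(mappedX, 1)
    r = round(mappedX(i, 1) - x_min) * scale + margin;   % first coordinate -> row
    c = round(mappedX(i, 2) - y_min) * scale + margin;   % second coordinate -> column
    canvas(r:r+29, c:c+29, :) = double(thumbs(:, :, :, i)) / 255;
end
```

Because the first t-SNE coordinate is used as the row index, the resulting mosaic is transposed relative to a gscatter plot of the same points. Thumbnails that land on nearby grid positions overwrite each other, which is the overlay effect mentioned in the script comments.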

Result using surf features:

Result using decaf features:

  • domain shift classification (objrecog.m)

For the amazon domain, take 20 samples from each category; for the dslr and webcam domains, take 8 samples from each category. Using the libsvm tool, first train a model on the source domain (optionally combined with samples from the target webcam domain), then test the model on the remaining samples in the webcam domain. The preprocessing step loads the data and splits it 10 times.
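
The sampling protocol described above can be sketched roughly as follows. This is a minimal sketch on synthetic data, not the contents of objrecog.m: the feature matrices, the number of labeled target samples (nTgtTrain) and the linear-kernel option '-t 0' are assumptions made for illustration. The svmtrain and svmpredict calls follow the libsvm MATLAB interface and only run if the libsvm MEX files are on the path.

```matlab
% Synthetic stand-ins for the real features: 3 categories, 16-D vectors.
nCate = 3;  dim = 16;
nSrcPerCate = 30;  nTgtPerCate = 12;             % samples available per category
srcX = randn(nCate * nSrcPerCate, dim);          % source domain (e.g. amazon)
srcY = kron((1:nCate)', ones(nSrcPerCate, 1));
tgtX = randn(nCate * nTgtPerCate, dim) + 0.5;    % target domain (webcam), shifted
tgtY = kron((1:nCate)', ones(nTgtPerCate, 1));

nSrcTrain = 20;   % per category for amazon (8 when the source is dslr or webcam)
nTgtTrain = 3;    % labeled target samples per category; assumed value
nSplits   = 10;

acc = nan(nSplits, 1);
for s = 1:nSplits
    srcTrain = [];  tgtTrain = [];
    for c = 1:nCate
        idx = find(srcY == c);
        idx = idx(randperm(numel(idx)));         % shuffle within the category
        srcTrain = [srcTrain; idx(1:nSrcTrain)];
        idx = find(tgtY == c);
        idx = idx(randperm(numel(idx)));
        tgtTrain = [tgtTrain; idx(1:nTgtTrain)];
    end
    tgtTest = setdiff((1:numel(tgtY))', tgtTrain);   % remaining target samples

    trainX = [srcX(srcTrain, :); tgtX(tgtTrain, :)];
    trainY = [srcY(srcTrain);    tgtY(tgtTrain)];

    if exist('svmpredict', 'file') == 3          % libsvm MEX files found
        model  = svmtrain(trainY, trainX, '-t 0 -q');
        pred   = svmpredict(tgtY(tgtTest), tgtX(tgtTest, :), model, '-q');
        acc(s) = mean(pred == tgtY(tgtTest));
    end
end
```

Setting nTgtTrain to 0 gives the source-only case. Averaging acc over the 10 splits then yields one number per source/target pair; on this random data the value is meaningless and only the bookkeeping is of interest.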

Result:

project code
