I have already done the summary report. Could you help me to do the MATLAB programming and the 1-page report as well?


Computer Vision @ University of Sussex – Spring 2018 Coursework Assignment
Deadline: 11th May 2018 at 4PM
This assignment brief was first released on 27th March 2018

The assignment grade, which is worth 50% of the total grade, is separated into 2 components: a 2-page report and source code. The 2-page report should comprise:
1. a 1-page summary of the research paper "Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly" (https://arxiv.org/pdf/1707.00600.pdf) [1];
2. a 1-page summary of your implementation of a zero-shot recognition system based on the Direct Attribute Prediction (DAP) concept (Lecture 13).

A DAP system models the probability of a certain object (e.g. a polar bear) being present in the image using the probabilities of being present for each of the attributes that the animal is known to have. For example, if we detect the attributes "white", "furry", "bulbous", "not lean", "not brown" etc. in the image, i.e. attributes that a polar bear is known to have, we can be fairly confident that there is a polar bear in the image. Hence, we can recognize a polar bear without ever having seen one, if (1) we know what attributes a polar bear has, and (2) we have classifiers trained for these attributes, using images from other object classes (i.e. other animals).

Follow the steps below to implement a DAP-based zero-shot recognition system.

First, copy the Animals with Attributes dataset (originally appearing at https://cvml.ist.ac.at/AwA2/) from http://users.sussex.ac.uk/~nq28/Subset_of_Animals_with_Attributes2.zip (4.9GB). The dataset includes 50 animal categories/classes and 85 attributes. The dataset provides a 50x85 predicate-matrix-binary.txt, which you should read into Matlab using M = load('predicate-matrix-binary.txt'); An entry (i, j) = 1 in the matrix says that the i-th class has the j-th attribute (e.g. a bear is white), and an entry (i, j) = 0 says that the i-th class does not have the j-th attribute (e.g. a bear is not white).

The image set is provided in JPEG format. You should extract SIFT (Scale-Invariant Feature Transform) features/interest points from those images (Lecture 9). You should use Matlab's built-in functions from the Computer Vision System Toolbox to do this feature detection and extraction step (https://uk.mathworks.com/help/vision/feature-detection-and-extraction.html). For an example of how to run a SIFT/SURF feature detector and descriptor extractor, type openExample('vision/ExtractSURFFeaturesFromAnImageExample') in the Matlab command window. SURF (Speeded-Up Robust Features) is simply a speeded-up version of SIFT. In Matlab, the default feature size is 64; to make it 128, you can set the 'SURFSize' argument to 128 (https://uk.mathworks.com/help/vision/ref/extractfeatures.html).

Your zero-shot recognition system should split the object classes (not images) into a training and a test set. In this scenario, the training classes are animals that your system will see, i.e. ones whose images the system has access to. In contrast, the test set contains classes (animals) for which your system will never see example images. The 40 training classes are given in trainclasses.txt and the 10 test classes are given in testclasses.txt. (Use [c1, c2] = textread('classes.txt', '%u %s'); to read in the class names. You can use the same function, but with a different second argument, to read in testclasses.txt.)
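To make these first steps concrete, here is a minimal Matlab sketch of the loading and feature-extraction calls described above (the JPEG path is hypothetical, and the file layout is assumed to match the unzipped dataset):

% Load the 50x85 class/attribute predicate matrix and the class name lists.
M = load('predicate-matrix-binary.txt');
[~, classes] = textread('classes.txt', '%u %s');
trainclasses = textread('trainclasses.txt', '%s');   % 40 seen classes
testclasses  = textread('testclasses.txt', '%s');    % 10 unseen classes

% Detect and extract 128-dimensional SURF descriptors from one example image
% (the image path below is an assumption for illustration only).
I = imread('JPEGImages/antelope/antelope_10001.jpg');
if size(I, 3) == 3, I = rgb2gray(I); end
points = detectSURFFeatures(I);
[descriptors, validPoints] = extractFeatures(I, points, 'SURFSize', 128);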
At test time, we will assume that a query image can only be classified as belonging to one of the 10 unseen classes, so chance performance (randomly guessing the label) will be 10%.

You will use all, or a random sample, of the images from the training classes (or rather, their feature descriptors) to train a classifier for each of the 85 attributes. The predicate matrix mentioned above tells you which animals have which attributes. So if a bear is brown, you should assign the "brown=1" tag to all of its images. Similarly, if a dalmatian is not brown, you should assign the tag "brown=0" to all of its images. You will use the images tagged with "brown=1" as the positive data and the images tagged with "brown=0" as the negative data for the "brown" classifier. Use the Matlab fitcsvm function to train the classifiers. Save the model output by each attribute classifier as the j-th entry in a models cell array (initialized as models = cell(85, 1);).

Note that if you sample data in such a way that you have either no positive or no negative data for some attribute classifier, you'll get a classifier that only knows about one class, which is a problem. However, for every attribute, there are some classes that do and some that don't have the attribute. So you just have to make sure you sample data from all classes when training your attribute classifiers.

You now have one classifier for each attribute. You next want to apply each attribute classifier j to each image l belonging to any of the test classes, and save the probability that the j-th attribute is present in the l-th image. To do so, you have to apply one extra operation to your classifier. For each attribute classifier j, run the function fitSVMPosterior on it, i.e. call model = models{j}; model = fitSVMPosterior(model); models{j} = model; (or, if you want, run this function on each classifier before saving it into the cell array). Then, to get the probability that the l-th image contains the attribute j, call [label, scores] = predict(model, x); where x is the feature descriptor for your image. scores will then be a 1x2 vector; check model.ClassNames to know which probability belongs to which class. Ensure that the probabilities sum to 1 by calling assert(sum(scores) == 1), or, if x contains the descriptors for multiple images, assert(all(sum(scores, 2) == 1)). Save these probabilities so you can easily access them in the next step.
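A minimal sketch of how this training and probability step could be wired together (the function names follow the submission list later in this brief; the inputs X and attr_labels are assumptions: an NxK bag-of-visual-words matrix with one row per training image, and Nx85 image-level attribute tags copied from the predicate matrix; each function would live in its own .m file):

% Sketch only: train one SVM per attribute and convert scores to posteriors.
function [models] = train_attribute_models(X, attr_labels)
models = cell(85, 1);
for j = 1:85
    svm = fitcsvm(X, attr_labels(:, j));    % labels are 1/0
    models{j} = fitSVMPosterior(svm);       % fit sigmoid so predict returns probabilities
end
end

% Sketch only: probs_attr(j, l) = P(attribute j present | test image l).
function [probs_attr] = compute_attribute_probs(models, Xtest)
Ntest = size(Xtest, 1);
probs_attr = zeros(85, Ntest);
for j = 1:85
    [~, scores] = predict(models{j}, Xtest);    % columns ordered as models{j}.ClassNames
    pos = (models{j}.ClassNames == 1);          % column for the "attribute = 1" class
    probs_attr(j, :) = scores(:, pos)';
end
end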
You will now actually predict which animal is present in each test image. To classify a query test image, you will assign it to the test class (out of 10) whose attribute "signature" it matches best. How can we compute the probability that an image belongs to some animal category? Let's use a toy example where we only have 2 animal classes and 5 attributes. We know (from a predicate matrix like the one discussed above) that the first class has the first, second, and fifth attributes, but does not have the third and fourth. Then the probability that the query image (with descriptor x) belongs to this class is P(class = 1|x) = P(attribute 1 = 1|x) × P(attribute 2 = 1|x) × P(attribute 3 = 0|x) × P(attribute 4 = 0|x) × P(attribute 5 = 1|x). The "|x" notation means "given x", i.e. we compute the probability using the image descriptor x. Let's say the second class is known to have attributes 3 and 5, and no others. Then the probability that the query image belongs to this class is P(class = 2|x) = P(attribute 1 = 0|x) × P(attribute 2 = 0|x) × P(attribute 3 = 1|x) × P(attribute 4 = 0|x) × P(attribute 5 = 1|x).

You will assign the image with descriptor x to the class i that gives the maximal P(class = i|x). For example, if P(class = 1|x) = 0.80 and P(class = 2|x) = 0.20, then you will assign x to class 1. You can call [~, ind] = max(probs); on a vector of probabilities such that probs(i) is P(class = i|x); then ind will give you the "winning" class to which x should be assigned.

How do you compute P(attribute i = 1|x)? This is a probability value you've computed already. It is just the second entry of the scores output from running predict on the descriptor x (assuming you trained with labels of 1 and 0). If you need P(attribute i = 0|x), that's just the first entry of scores (or, more simply, 1 minus the second entry).

You will classify each test image from the 10 unseen (test) classes and compute the average accuracy, as sketched below.
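A minimal sketch of these two computations, assuming M_test is the 10x85 predicate sub-matrix for the test classes and ground_truth_class holds indices into those 10 classes:

% Sketch only: probs_class(i, l) = product over attributes j of
% P(attribute j = M_test(i, j) | image l), as in the toy example above.
function [probs_class] = compute_class_probs(probs_attr, M_test)
[Nclass, Nattr] = size(M_test);            % 10x85
Ntest = size(probs_attr, 2);
probs_class = zeros(Nclass, Ntest);
for i = 1:Nclass
    for l = 1:Ntest
        p = probs_attr(:, l);              % P(attribute = 1 | image l), 85x1
        match = M_test(i, :)' .* p + (1 - M_test(i, :)') .* (1 - p);
        probs_class(i, l) = prod(match);
    end
end
end

% Sketch only: fraction of test images whose most probable class is correct.
function [acc] = compute_accuracy(probs_class, ground_truth_class)
[~, predicted] = max(probs_class, [], 1);  % 1xNtest winning class indices
acc = mean(predicted == ground_truth_class);
end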
What to include in your submission:
1. A function [bow_feature] = extract_bag_of_visual_words_feature(...); that outputs a Kx1 array (where K is the free parameter K in the K-Means clustering algorithm) for each image. When using a SIFT/SURF feature extraction method, each image will have a variable number of SIFT/SURF feature descriptors, usually of size 128 each. You can aggregate all feature descriptors from multiple images and perform K-Means clustering over all of them (first explained on 7th March 2018, will be re-explained on 10th April 2018). An image can now be represented by the histogram over the K cluster centers. (A sketch is given after the report guide below.)
2. A function [models] = train_attribute_models(...); that outputs an 85x1 cell array of attribute classifier models. You are free to pass in whatever arguments you need, and are welcome to add any additional outputs after models.
3. A function [probs_attr] = compute_attribute_probs(...); that outputs an 85xNtest matrix of probabilities, where Ntest is the number of test images you choose to use, and probs_attr(j, l) is the probability that the j-th attribute is present in the l-th image. Again, use any inputs you like, and any additional outputs after the first one.
4. A function [probs_class] = compute_class_probs(...); that outputs a 10xNtest matrix of probabilities, where probs_class(i, l) is the probability that the i-th class is present in the l-th image.
5. A function [acc] = compute_accuracy(probs_class, ground_truth_class); where ground_truth_class is a 1xNtest vector such that ground_truth_class(l) is the true (i.e. given in the dataset) class for the l-th image. acc is a single real number denoting the overall accuracy of your system, averaged over the Ntest test images. Also include the overall accuracy score in your 2-page report.

2-page report guide

Background (25 points)
Here you should write a 1-page summary of the research paper "Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly" (https://arxiv.org/pdf/1707.00600.pdf) [1].

Outline of methods employed (25 points)
This does not have to be in depth, and I do not expect you to regurgitate the contents of the lecture notes. You should state clearly what methods you have used, what parameters you have used with those methods, and what the purpose of these methods was. If you have developed any approaches of your own, or you have adapted or improved on a built-in approach, then you should discuss that here.

Results achieved and analysis (25 points)
In this section, you ...
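Finally, a minimal sketch of the bag-of-visual-words function from submission item 1, assuming the codebook has already been built by running K-Means over descriptors pooled from a sample of training images (kmeans and pdist2 are Statistics and Machine Learning Toolbox routines; the signature is illustrative, since the brief leaves the arguments free):

% Sketch only: histogram one image's SURF descriptors over a K-word codebook.
% descriptors: Mx128 SURF descriptors of one image; centers: Kx128 codebook,
% obtained beforehand, e.g. [~, centers] = kmeans(all_descriptors, K);
function [bow_feature] = extract_bag_of_visual_words_feature(descriptors, centers)
K = size(centers, 1);
D = pdist2(double(descriptors), centers);          % MxK distances to cluster centers
[~, nearest] = min(D, [], 2);                      % index of nearest visual word
bow_feature = histcounts(nearest, 0.5:1:(K + 0.5))';   % Kx1 histogram of word counts
bow_feature = bow_feature / max(sum(bow_feature), 1);  % L1-normalize (guard empty image)
end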
Answer:
MATLAB/classes.txt
1    antelope
2    grizzly+bear
3    killer+whale
4    beaver
5    dalmatian
6    persian+cat
7    horse
8    german+shepherd
9    blue+whale
10    siamese+cat
11    skunk
12    mole
13    tiger
14    hippopotamus
15    leopard
16    moose
17    spider+monkey
18    humpback+whale
19    elephant
20    gorilla
21    ox
22    fox
23    sheep
24    seal
25    chimpanzee
26    hamster
27    squirrel
28    rhinoceros
29    rabbit
30    bat
31    giraffe
32    wolf
33    chihuahua
34    rat
35    weasel
36    otter
37    buffalo
38    zebra
39    giant+panda
40    deer
41    bobcat
42    pig
43    lion
44    mouse
45    polar+bear
46    collie
47    walrus
48    raccoon
49    cow
50    dolphin
MATLAB/DAP/attributes.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Train one binary attribute classifier using all possible features.
Needs "shogun toolbox with python interface" for SVM training
"""
import os,sys
sys.path.append('/agbs/share/datasets/Animals_with_Attributes/code/')
from numpy import *
from platt import *
import cPickle, bz2
def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def bzUnpickle(filename):
    return cPickle.load(bz2.BZ2File(filename))
# adapt these paths and filenames to match local installation
feature_pattern = '/agbs/share/datasets/Animals_with_Attributes/code/feat/%s-%s.pic.bz2'
labels_pattern = '/agbs/share/datasets/Animals_with_Attributes/code/feat/%s-labels.pic.bz2'
all_features = ['cq','lss','phog','sift','surf','rgsift']
attribute_matrix = 2*loadtxt('/agbs/share/datasets/Animals_with_Attributes/predicate-matrix-binary.txt',dtype=float)-1
classnames = loadstr('/agbs/share/datasets/Animals_with_Attributes/classes.txt',nameonly)
attributenames = loadstr('/agbs/share/datasets/Animals_with_Attributes/predicates.txt',nameonly)
def create_data(all_classes,attribute_id):
    featurehist={}
    for feature in all_features:
        featurehist[feature]=[]

    labels=[]
    for classname in all_classes:
        class_id = classnames.index(classname)
        class_size = 0
        for feature in all_features:
            featurefilename = feature_pattern % (classname,feature)
            print '# ',featurefilename
            histfile = bzUnpickle(featurefilename)
            featurehist[feature].extend( histfile )

        labelfilename = labels_pattern % classname
        print '# ',labelfilename
        print '#'
        labels.extend( bzUnpickle(labelfilename)[:,attribute_id] )

    for feature in all_features:
        featurehist[feature]=array(featurehist[feature]).T # shogun likes its data matrices shaped FEATURES x SAMPLES

    labels = array(labels)
    return featurehist,labels
def train_attribute(attribute_id, C, split=0):
    from sg import sg
    attribute_id = int(attribute_id)
    print "# attribute ",attributenames[attribute_id]
    C = float(C)
    print "# C ", C

    if split == 0:
        train_classes=loadstr('/agbs/share/datasets/Animals_with_Attributes/trainclasses.txt')
        test_classes=loadstr('/agbs/share/datasets/Animals_with_Attributes/testclasses.txt')
    else:
        classnames = loadstr('/agbs/share/datasets/Animals_with_Attributes/classnames.txt')
        startid= (split-1)*10
        stopid = split*10
        test_classes = classnames[startid:stopid]
        train_classes = classnames[0:startid]+classnames[stopid:]

    Xtrn,Ltrn = create_data(train_classes,attribute_id)
    Xtst,Ltst = create_data(test_classes,attribute_id)

    if min(Ltrn) == max(Ltrn): # only 1 class
        Lprior = mean(Ltrn)
        prediction = sign(Lprior)*ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.)*ones(len(Ltst))
        return prediction,probabilities,Ltst

    sg('loglevel', 'WARN')
    widths={}
    for feature in all_features:
        traindata = array(Xtrn[feature][:,::50],float) # used to be 5*offset
        sg('set_distance', 'CHISQUARE', 'REAL')
        sg('clean_features', 'TRAIN')
        sg('set_features', 'TRAIN', traindata)
        sg('init_distance', 'TRAIN')
        DM=sg('get_distance_matrix')
        widths[feature] = median(DM.flatten())
        del DM

    sg('new_svm', 'LIBSVM')
    sg('use_mkl', False) # we use fixed weights here
    sg('clean_features', 'TRAIN')
    sg('clean_features', 'TEST')

    Lplatt_trn = concatenate([Ltrn[i::10] for i in range(9)]) # 90% for training
    Lplatt_tst = Ltrn[9::10] # remaining 10% for platt scaling
    for feature in all_features:
        Xplatt_trn = concatenate([Xtrn[feature][:,i::10] for i in range(9)], axis=1)
        sg('add_features', 'TRAIN', Xplatt_trn)
        Xplatt_tst = Xtrn[feature][:,9::10]
        sg('add_features', 'TEST', Xplatt_tst)
        del Xplatt_trn,Xplatt_tst,Xtrn[feature]

    sg('set_labels', 'TRAIN', Lplatt_trn)
    sg('set_kernel', 'COMBINED', 5000)
    for featureset in all_features:
        sg('add_kernel', 1., 'CHI2', 'REAL', 10, widths[featureset]/5. )

    sg('svm_max_train_time', 600*60.) # one hour should be plenty
    sg('c', C)
    sg('init_kernel', 'TRAIN')
    try:
        sg('train_classifier')
    except (RuntimeWarning,RuntimeError): # can't train, e.g. all samples have the same labels
        Lprior = mean(Ltrn)
        prediction = sign(Lprior) * ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.) * ones(len(Ltst))
        savetxt('./DAP/cvfold%d_C%g_%02d.txt' % (split, C, attribute_id), prediction)
        savetxt('./DAP/cvfold%d_C%g_%02d.prob' % (split, C, attribute_id), probabilities)
        savetxt('./DAP/cvfold%d_C%g_%02d.labels' % (split, C, attribute_id), Ltst)
        return prediction,probabilities,Ltst

    [bias, alphas]=sg('get_svm')
    #print bias,alphas

    sg('init_kernel', 'TEST')
    try:
        prediction=sg('classify')
        platt_params = SigmoidTrain(prediction, Lplatt_tst)
        probabilities = SigmoidPredict(prediction, platt_params)

        savetxt('./DAP/cvfold%d_C%g_%02d-val.txt' % (split, C, attribute_id), prediction)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.prob' % (split, C, attribute_id), probabilities)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.labels' % (split, C, attribute_id), Lplatt_tst)
        savetxt('./DAP/cvfold%d_C%g_%02d-val.platt' % (split, C, attribute_id), platt_params)
        #print '#train-perf ',attribute_id,C,mean((prediction*Lplatt_tst)>0),mean(Lplatt_tst>0)
        #print '#platt-perf ',attribute_id,C,mean((sign(probabilities-0.5)*Lplatt_tst)>0),mean(Lplatt_tst>0)
    except RuntimeError:
        Lprior = mean(Ltrn)
        prediction = sign(Lprior)*ones(len(Ltst))
        probabilities = 0.1+0.8*0.5*(Lprior+1.)*ones(len(Ltst))
        print >> sys.stderr, "#Error during testing. Using constant platt scaling"
        platt_params=[1.,0.]

    # ----------------------------- now apply to test classes ------------------
    sg('clean_features', 'TEST')
    for feature in all_features:
        sg('add_features', 'TEST', Xtst[feature])
        del Xtst[feature]

    sg('init_kernel', 'TEST')
    prediction=sg('classify')
    probabilities = SigmoidPredict(prediction, platt_params)

    savetxt('./DAP/cvfold%d_C%g_%02d.txt' % (split, C, attribute_id), prediction)
    savetxt('./DAP/cvfold%d_C%g_%02d.prob' % (split, C, attribute_id), probabilities)
    savetxt('./DAP/cvfold%d_C%g_%02d.labels' % (split, C, attribute_id), Ltst)

    #print '#test-perf ',attribute_id,C,mean((prediction*Ltst)>0),mean(Ltst>0)
    #print '#platt-perf ',attribute_id,C,mean((sign(probabilities-0.5)*Ltst)>0),mean(Ltst>0)
    return prediction,probabilities,Ltst
if __name__ == '__main__':
    import sys
    try:
        attribute_id = int(sys.argv[1])
    except IndexError:
        print "Must specify attribute ID!"
        raise SystemExit
    try:
        split = int(sys.argv[2])
    except IndexError:
        split = 0
    try:
        C = float(sys.argv[3])
    except IndexError:
        C = 10.

    pred,prob,Ltst = train_attribute(attribute_id,C,split)
    print "Done.", attribute_id, C, split
MATLAB/DAP/attributes.sh
#!/bin/bash
# Animals with Attributes Dataset
# Train all attribute classifiers for fixed split and regularizer
SPLIT=0
C=10
for A in `seq 1 85` ; do
    ./new-attributes.py $A $SPLIT $C
done
MATLAB/DAP/build_matfiles.m
clear all, close all
% dataset
pnam = '/agbs/share/datasets/Animals_with_Attributes';
% output
outpath = '.';
% There are 6 feature representations:
% - cq: (global) color histogram (1x1 + 2x2 + 4x4 spatial pyramid, 128 bins each, each histogram L1-normalized)
% - lss[1]: local self similarity (2000 entry codebook, raw bag-of-visual-word counts)
% - phog[2]: histogram of oriented gradients (1x1 + 2x2 + 4x4 spatial pyramid, 12 bins each, each histogram L1-normalized or all zero)
% - rgsift[3]: rgSIFT descriptors (2000 entry codebook, bag-of-visual-word counts, L1-normalized)
% - sift[4]: SIFT descriptors (2000 entry codebook, raw bag-of-visual-word counts)
% - surf[5]: SURF descriptors (2000 entry codebook, raw bag-of-visual-word counts)
feat = {'cq','lss','phog','rgsift','sift','surf'};
nfeat = [2688,2000,252,2000,2000,2000];
% [1] E. Shechtman, and M. Irani: "Matching Local Self-Similarities
% across Images and Videos", CVPR 2007.
%
% [2] A. Bosch, A. Zisserman, and X. Munoz: "Representing shape with
% a spatial pyramid kernel", CIVR 2007.
%
% [3] Koen E. A. van de Sande, Theo Gevers and Cees G. M. Snoek:
% "Evaluation of Color Descriptors for Object and Scene
% Recognition", CVPR 2008.
%
% [4] D. G. Lowe, "Distinctive Image Features from Scale-Invariant
% Keypoints", IJCV 2004.
%
% [5] H. Bay, T. Tuytelaars, and L. Van Gool: "SURF: Speeded Up
% Robust Features", ECCV 2006.
%% set some constants
% class names of all classes
[tmp,classes] = textread([pnam,'/classes.txt'],'%d %s'); clear tmp
% class names of training/test classes
trainclasses = textread([pnam,'/trainclasses.txt'],'%s');
testclasses = textread([pnam,'/testclasses.txt' ],'%s');
% classes(trainclasses_id) == trainclasses
trainclasses_id = -ones(length(trainclasses),1);
for i=1:length(trainclasses)
    for j=1:length(classes)
        if strcmp(trainclasses{i},classes{j})
            trainclasses_id(i) = j;
        end
    end
end
% classes(testclasses_id) == testclasses
testclasses_id = -ones(length(testclasses),1);
for i=1:length(testclasses)
    for j=1:length(classes)
        if strcmp(testclasses{i},classes{j})
            testclasses_id(i) = j;
        end
    end
end
% predicate names of all 85 predicates
[tmp,predicates] = textread([pnam,'/predicates.txt'],'%d %s');
% pca matrix: probability class-attribute pca(i,j) = P(a_j=1|c=i)
% contains RELATIVE CONNECTION STRENGTH linearly scaled to 0..100
pca = textread([pnam,'/predicate-matrix-continuous.txt']);
% class antelope has 4 missing values (black,white,blue,brown) => copy from lion
pca(1,1:4) = pca(43,1:4);
% derive binary matrix from continuous
pca_bin = pca > mean(pca(:));
% pca_bin = textread([pnam,'/predicate-matrix-binary.txt']);
save([outpath,'/constants.mat'],'pnam','feat','nfeat','classes',...
'trainclasses','testclasses','trainclasses_id','testclasses_id', ...
'predicates','pca','pca_bin')
%% save Matlab files one per feature type
nperclass = zeros(length(classes),1);
for idc = 1:50
    for idf = [1:2,4:6]
        fnam = [pnam,'/Features/',feat{idf},'-hist/',classes{idc}];
        no = numel(dir(fnam))-2;
        nperclass(idc) = no;
        Xc = sparse(nfeat(idf),no);
        for ido = 1:no
            Xc(:,ido) = textread(sprintf('%s/%s_%04d.txt',fnam,classes{idc},ido),'%f');
        end
        fprintf('%s\t%04d: %s\n',feat{idf},ido,classes{idc})
        save(sprintf('%s/feat/x_%s_c%02d.mat',outpath,feat{idf},idc),'Xc')
    end
end
save([outpath,'/nperclass.mat'],'nperclass')
MATLAB/DAP/collect_results.m
datapath = '.';
load([datapath,'/constants.mat'])
for cvsplit = 0:5 % 1:5
    for log3_C = -13:-9 % -13:-9
        fnam = sprintf('%s/cv/liblinear_cvfold%d_l3C%d.mat',datapath,cvsplit,log3_C);
        if exist(fnam,'file')
            data = load(fnam);

            % recompute predictions
            % calculate p( attribute = j | image ) from p( train class = j | image )
            pfa_te = data.pfc_te * ( pca ./ repmat(sum(pca,2),1,85) );
            % calculate p( test class = j | image ) from p( attribute = j | image )
            s_pcate = sum(pca(data.cte,:));
            is_pcate = zeros(size(s_pcate));
            is_pcate(s_pcate~=0) = 1./s_pcate(s_pcate~=0);
            pfc_pr = pfa_te * (pca(data.cte,:).*repmat(is_pcate,10,1))';
            % class assignment
            mx = repmat( max(pfc_pr,[],2), [1,size(pfc_pr,2)] ) == pfc_pr;
            id = 1:size(mx,2); ypr = zeros(size(mx,1),1);
            for i=1:length(ypr)
                if sum(mx(i,:))==0, mx(i,1)=1; end % default is first test class
                ypr(i) = data.cte( id( mx(i,:) ) );
            end
            acc_pr = 100*sum(ypr==data.yte)/numel(ypr);
            fprintf('split %d, C=%1.2e: Acc = %1.3f%% (%d/%d)\n',...
                cvsplit,3^log3_C,acc_pr,sum(ypr==data.yte),numel(ypr))
        else
            fprintf('%s missing\n',fnam)
        end
    end
end
MATLAB/DAP/constants.mat
pnam:[1x44 char array]
feat:[1x6 cell array]
nfeat:[1x6 double array]
classes:[50x1 cell array]
predicates:[85x1 cell array]
prca:[50x85 double array]
prca_bin:[50x85 uint8 (logical) array]
MATLAB/DAP/DAP_eval.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Perform multiclass prediction from binary attributes and evaluate it.
"""
import os,sys
sys.path.append('/agbs/cluster/chl/libs/python2.5/site-packages/')
from numpy import *
def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def loaddict(filename,converter=str):
    D={}
    for line in file(filename).readlines():
        line = line.split()
        D[line[0]] = converter(line[1].strip())

    return D
# adapt these paths and filenames to match local installation
classnames = loadstr('../classes.txt',nameonly)
numexamples = loaddict('numexamples.txt',int)
def evaluate(split,C):
    global test_classnames
    attributepattern = './DAP/cvfold%d_C%g_%%02d.prob' % (split,C)

    if split == 0:
        test_classnames=loadstr('/agbs/share/datasets/Animals_with_Attributes/testclasses.txt')
        train_classnames=loadstr('/agbs/share/datasets/Animals_with_Attributes/trainclasses.txt')
    else:
        startid= (split-1)*10
        stopid = split*10
        test_classnames = classnames[startid:stopid]
        train_classnames = classnames[0:startid]+classnames[stopid:]

    test_classes = [ classnames.index(c) for c in test_classnames]
    train_classes = [ classnames.index(c) for c in train_classnames]

    M = loadtxt('/agbs/share/datasets/Animals_with_Attributes/predicate-matrix-binary.txt',dtype=float)

    L=[]
    for c in test_classes:
        L.extend( [c]*numexamples[classnames[c]] )
    L=array(L) # (n,)

    P = []
    for i in range(85):
        P.append(loadtxt(attributepattern % i,float))
    P = array(P).T # (n,85)

    prior = mean(M[train_classes],axis=0)
    prior[prior==0.]=0.5
    prior[prior==1.]=0.5 # disallow degenerated priors

    M = M[test_classes] # (10,85)
    prob=[]
    for p in P:
        prob.append( prod(M*p + (1-M)*(1-p),axis=1)/prod(M*prior+(1-M)*(1-prior), axis=1) )

    MCpred = argmax( prob, axis=1 )

    d = len(test_classes)
    confusion=zeros([d,d])
    for pl,nl in zip(MCpred,L):
        try:
            gt = test_classes.index(nl)
            confusion[gt,pl] += 1.
        except:
            pass

    for row in confusion:
        row /= sum(row)

    return confusion,asarray(prob),L
def plot_confusion(confusion):
    from pylab import figure,imshow,clim,xticks,yticks,axis,setp,gray,colorbar,savefig,gca
    fig=figure(figsize=(10,9))
    imshow(confusion,interpolation='nearest',origin='upper')
    clim(0,1)
    xticks(arange(0,10),[c.replace('+',' ') for c in test_classnames],rotation='vertical',fontsize=24)
    yticks(arange(0,10),[c.replace('+',' ') for c in test_classnames],fontsize=24)
    axis([-.5,9.5,9.5,-.5])
    setp(gca().xaxis.get_major_ticks(), pad=18)
    setp(gca().yaxis.get_major_ticks(), pad=12)
    fig.subplots_adjust(left=0.30)
    fig.subplots_adjust(top=0.98)
    fig.subplots_adjust(right=0.98)
    fig.subplots_adjust(bottom=0.22)
    gray()
    colorbar(shrink=0.79)
    savefig('AwA-ROC-confusion-DAP.pdf')
    return
def plot_roc(P,GT):
    from pylab import figure,xticks,yticks,axis,setp,gray,colorbar,savefig,gca,clf,plot,legend,xlabel,ylabel
    from roc import roc
    AUC=[]
    CURVE=[]
    for i,c in enumerate(test_classnames):
        class_id = classnames.index(c)
        tp,fp,auc=roc(None,GT==class_id, P[:,i] ) # larger is better
        print "AUC: %s %5.3f" % (c,auc)
        AUC.append(auc)
        CURVE.append(array([fp,tp]))

    order = argsort(AUC)[::-1]
    styles=['-','-','-','-','-','-','-','--','--','--']
    figure(figsize=(9,5))
    for i in order:
        c = test_classnames[i]
        plot(CURVE[i][0],CURVE[i][1],label='%s (AUC: %3.2f)' % (c,AUC[i]),linewidth=3,linestyle=styles[i])

    legend(loc='lower right')
    xticks([0.0,0.2,0.4,0.6,0.8,1.0], [r'$0$', r'$0.2$',r'$0.4$',r'$0.6$',r'$0.8$',r'$1.0$'],fontsize=18)
    yticks([0.0,0.2,0.4,0.6,0.8,1.0], [r'$0$', r'$0.2$',r'$0.4$',r'$0.6$',r'$0.8$',r'$1.0$'],fontsize=18)
    xlabel('false positive rate',fontsize=18)
    ylabel('true positive rate',fontsize=18)
    savefig('AwA-ROC-DAP.pdf')
def main():
    try:
        split = int(sys.argv[1])
    except IndexError:
        split = 0
    try:
        C = float(sys.argv[2])
    except IndexError:
        C = 10.

    confusion,prob,L = evaluate(split,C)
    print "Mean class accuracy %g" % mean(diag(confusion)*100)
    plot_confusion(confusion)
    plot_roc(prob,L)

if __name__ == '__main__':
    main()
MATLAB/DAP/liblinear_cv5.m
function liblinear_cv5(cvsplit,log3_C)
% path to liblinear
addpath /agbs/cluster/hn/mpi_animal_challenge/lib/liblinear-1.33/matlab
% path to Matlab feature representation
datapath = '/kyb/agbs/chl/mysrc/Animals-with-Attributes/code';
% build training-testing split
if cvsplit==0
    % get original split
    tmp = load([datapath,'/constants.mat'],'trainclasses_id','testclasses_id');
    cte = tmp.testclasses_id';
    ctr = tmp.trainclasses_id';
    clear tmp
else
    % build training-testing split
    cte = (cvsplit-1)*10+(1:10); % test classes
    ctr = setdiff(1:50,cte); % training classes
end
load([datapath,'/constants.mat'])
%% load training data (40 classes)
fprintf('Load training set\n')
Xtr = []; ytr = [];
for idc = ctr % 40 classes
    Xc = [];
    for idf = 1:6 % 6 features
        data = load(sprintf('%s/feat/x_%s_c%02d.mat',datapath,feat{idf},idc),'Xc');
        Xc = [Xc; data.Xc];
    end
    Xtr = [Xtr,Xc];
    ytr = [ytr; idc*ones(size(Xc,2),1)];
    fprintf(' %s(%d)\n',classes{idc},size(Xc,2))
end, Xtr = Xtr';
% train model
fprintf('Learning\n')
% logistic regression
C = 3^log3_C;
argstr = sprintf('-s 0 -c %f',C);
model = train(ytr, Xtr, argstr);
%% make prediction on training data
tic
[l,acc_tr,p] = predict(ytr, Xtr, model, '-b 1');
T = toc;
fprintf('training took %1.2f s\n',T)
pfc_tr = zeros(length(l),50); pfc_tr(:,model.Label) = p; % full 50 matrix
%% load test data (10 classes)
fprintf('Load test set\n')
Xte = []; yte = [];
for idc = cte % 10 classes
    Xc = [];
    for idf = 1:6 % 6 features
        data = load(sprintf('%s/feat/x_%s_c%02d.mat',datapath,feat{idf},idc),'Xc');
        Xc = [Xc; data.Xc];
    end
    Xte = [Xte,Xc];
    yte = [yte; idc*ones(size(Xc,2),1)];
    fprintf(' %s(%d)\n',classes{idc},size(Xc,2))
end, Xte = Xte';
%% predict train classes on test data
[l,acc_te,p] = predict(yte, Xte, model, '-b 1');
pfc_te = zeros(length(l),50); pfc_te(:,model.Label) = p; % full 50 matrix
%% predict test classes on test data
% calculate p( attribute = j | image ) from p( train class = j | image )
pfa_te = pfc_te * ( prca ./ repmat(sum(prca,2),1,85) );
% calculate p( test class = j | image ) from p( attribute = j | image )
pfc_pr = pfa_te * (prca(cte,:)./repmat(sum(prca(cte,:)),10,1))';
% class assignment
mx = repmat( max(pfc_pr,[],2), [1,size(pfc_pr,2)] ) == pfc_pr;
id = 1:size(mx,2); ypr = zeros(size(mx,1),1);
for i=1:length(ypr)
    if sum(mx(i,:))==0, mx(i,1)=1; end % default is first test class
    ypr(i) = cte( id( mx(i,:) ) );
end
acc_pr = 100*sum(ypr==yte)/numel(ypr);
fprintf('Accuracy = %1.4f%% (%d/%d)\n',acc_pr,sum(ypr==yte),numel(ypr))
% save results
fnam = sprintf('%s/cv/liblinear_cvfold%d_l3C%d.mat',datapath,cvsplit,log3_C);
save(fnam,'cvsplit','log3_C','argstr','C','acc_tr','acc_pr',...
'ctr','cte','pfc_tr','pfc_te','pfc_pr','ytr','yte','ypr')
MATLAB/DAP/new-attributes.py
#!/usr/bin/env python
"""
Animals with Attributes Dataset
Train one binary attribute classifier using all possible features.
Needs "shogun toolbox with python interface" for SVM training
"""
import os,sys
sys.path.append('./')
from numpy import *
from platt import *
import cPickle, bz2
def nameonly(x):
    return x.split('\t')[1]

def loadstr(filename,converter=str):
    return [converter(c.strip()) for c in file(filename).readlines()]

def bzUnpickle(filename):
    return cPickle.load(bz2.BZ2File(filename))
# adapt these paths and filenames to match local installation
feature_pattern = './feat/%s-%s.pic.bz2'
labels_pattern = './feat/%s-labels.pic.bz2'
all_features = ['cq']
attribute_matrix = 2*loadtxt('../predicate-matrix-binary.txt',dtype=float)-1
classnames = loadstr('../classes.txt',nameonly)
attributenames = loadstr('../predicates.txt',nameonly)
def create_data(all_classes,attribute_id):
    featurehist={}
    for feature in all_features:
        featurehist[feature]=[]

    labels=[]
    for classname in all_classes:
        class_id = classnames.index(classname)
        class_size = 0
        for feature in all_features:
            featurefilename = feature_pattern % (classname,feature)
            print '# ',featurefilename
            histfile = bzUnpickle(featurefilename)
            featurehist[feature].extend( histfile )

        labelfilename = labels_pattern % classname
        print '# ',labelfilename
        print '#'
        labels.extend( bzUnpickle(labelfilename)[:,attribute_id] )

    for feature in all_features:
        featurehist[feature]=array(featurehist[feature]).T # shogun likes its data matrices shaped FEATURES x SAMPLES

    labels = array(labels)
    return featurehist,labels
def train_attribute(attribute_id, C, split=0):
    from shogun import Classifier,Features,Kernel,Distance
    attribute_id = int(attribute_id)
    print "# attribute ",attributenames[attribute_id]
    C = float(C)
    print "# C ", C

    if split == 0:
        ...