Questions 1-4 are from Zygmunt:
----------------------------------------------------------------------------------------
1. I've downloaded the CIFAR 9.32 tar file and re-run the test command, this time I get:
Importing _ConvNet C++ module
CUDA Error: invalid device ordinal
R: The problem is related to the GPU index. The model thinks it should
run on GPU 3 of the machine (because you downloaded my trained model),
but your machine has only one GPU card. You could change the GPU index
stored with the model.
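If the index lives in the pickled checkpoint, one way is to patch it offline. A minimal sketch, assuming the checkpoint is a plain pickle whose 'op' entry has an options dict with a 'gpu' option carrying a .value list — the Option/OptionsParser classes below are stand-ins for illustration, not the real gpumodel classes, so inspect your actual checkpoint first:

```python
import pickle

class Option(object):
    """Stand-in for the pickled option object (real layout may differ)."""
    def __init__(self, value):
        self.value = value

class OptionsParser(object):
    """Stand-in for the pickled option parser (real layout may differ)."""
    def __init__(self):
        self.options = {'gpu': Option([3])}   # model was saved on GPU 3

def set_gpu_index(checkpoint_path, gpu_index):
    """Rewrite the stored GPU index so the model runs on another card."""
    with open(checkpoint_path, 'rb') as f:
        d = pickle.load(f)
    d['op'].options['gpu'].value = [gpu_index]
    with open(checkpoint_path, 'wb') as f:
        pickle.dump(d, f)
```

On a machine with a single card, set_gpu_index(path, 0) would retarget the model to GPU 0.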
----------------------------------------------------------------------------------------
2. I get 0.1092 error on batch 6 from the run12 model. Is this consistent with your results?
R: Yes, it is consistent with my logs.
----------------------------------------------------------------------------------------
3. I've been trying to run predictions and get the following error. Do you have an
idea how to fix it?
python shownet.py -f model_fc128-dcf-50/model_fc128-dcf-50_run01/ --write-mv-result=tmpfile
pdir: model_fc128-dcf-50
self.save_path: model_fc128-dcf-50
Traceback (most recent call last):
File "shownet.py", line 503, in <module>
model = ShowConvNet(op, load_dic)
File "shownet.py", line 48, in __init__
ConvNet.__init__(self, op, load_dic)
File "/home/komet/software/cuda-convnet/dropconnect/dropnn/convnet.py", line 44, in __init__
IGPUModel.__init__(self, "ConvNet", op, load_dic, filename_options, dp_params=dp_params)
File "/home/komet/software/cuda-convnet/dropconnect/dropnn/gpumodel.py", line 86, in __init__
self.init_data_providers()
File "shownet.py", line 66, in init_data_providers
ConvNet.init_data_providers(self)
File "/home/komet/software/cuda-convnet/dropconnect/dropnn/gpumodel.py", line 115, in init_data_providers
self.img_size, self.img_channels, # options i've add to cifar data provider
AttributeError: ShowConvNet instance has no attribute 'img_size'
Earlier I got another error:
Traceback (most recent call last):
File "shownet.py", line 503, in <module>
model = ShowConvNet(op, load_dic)
File "shownet.py", line 48, in __init__
ConvNet.__init__(self, op, load_dic)
File "/home/komet/software/cuda-convnet/dropconnect/dropnn/convnet.py", line 43, in __init__
dp_params['img_flip'] = op.get_value('img_flip')
File "/home/komet/software/cuda-convnet/dropconnect/dropnn/options.py", line 126, in get_value
return self.options[name].value
KeyError: 'img_flip'
As a temporary fix I commented out
dp_params['img_flip'] = op.get_value('img_flip')
in ConvNet.__init__() and it went away.
R: This is a version-mismatch problem: the code that generated those models is
not the same version as the released code. The following hack should help:
1. In init_data_providers(self) in gpumodel.py, make the following changes:
self.test_data_provider = DataProvider.get_instance(
self.data_path,
change---> #self.img_size, self.img_channels, # options i've add to cifar data provider
32, 3,
self.test_batch_range,
type=self.dp_type, dp_params=self.dp_params, test=True)
self.train_data_provider = DataProvider.get_instance(
self.data_path,
change---> #self.img_size, self.img_channels, # options i've add to cifar data provider
32, 3,
self.train_batch_range,
self.model_state["epoch"], self.model_state["batchnum"],
type=self.dp_type, dp_params=self.dp_params, test=False)
reason:
self.img_size/self.img_channels are not present in this CIFAR model, so just hard-code
them as 32 and 3. Their values do not matter, because they are ignored in step 2.
2. in data.py:
def get_instance(cls, data_dir,
img_size, num_colors, # options i've add to cifar data provider
batch_range=None, init_epoch=1, init_batchnum=None, type="default", dp_params={}, test=False):
# why the fuck can't i reference DataProvider in the original definition?
#cls.dp_classes['default'] = DataProvider
type = type or DataProvider.get_batch_meta(data_dir)['dp_type'] # allow data to decide data provider
if type.startswith("dummy-"):
name = "-".join(type.split('-')[:-1]) + "-n"
if name not in dp_types:
raise DataProviderException("No such data provider: %s" % type)
_class = dp_classes[name]
dims = int(type.split('-')[-1])
return _class(dims)
elif type in dp_types:
if img_size == 0:
_class = dp_classes[type]
return _class(data_dir, batch_range, init_epoch, init_batchnum, dp_params, test)
else :
_class = dp_classes[type]
return _class(data_dir,
change---> #img_size, num_colors,
batch_range, init_epoch, init_batchnum, dp_params, test)
raise DataProviderException("No such data provider: %s" % type)
reason: here a CroppedCIFARDataRandomProvider object is created, and it does not
require these two params. They are needed only for GeneralDataProvider.
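If you would rather not edit data.py at all, another option is to keep the call sites in gpumodel.py but fall back to defaults only when the attributes are missing from an old model. A minimal sketch of the idea — image_dims is a hypothetical helper of mine, and the attribute names follow the traceback above:

```python
def image_dims(model, default_size=32, default_channels=3):
    """Return (img_size, img_channels), defaulting to CIFAR-10's 32x32,
    3-channel layout when an old model lacks the newer attributes."""
    size = getattr(model, 'img_size', default_size)
    channels = getattr(model, 'img_channels', default_channels)
    return size, channels

class OldModel(object):
    """Stand-in for a ShowConvNet pickled before the new fields existed."""
    pass

class NewModel(object):
    """Stand-in for a model that does carry the new fields."""
    img_size = 24
    img_channels = 3
```

init_data_providers() would then unpack image_dims(self) instead of reading self.img_size and self.img_channels directly, so both old and new checkpoints load.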
----------------------------------------------------------------------------------------
4. I ran into another problem:
$ python shownet.py -f model_fc128-dcf-50/model_fc128-dcf-50_run01/ --write-mv-result=tmpfile
pdir: model_fc128-dcf-50
self.save_path: model_fc128-dcf-50
----------------
Error:
Path '/misc/vlgscratch1/FergusGroup/wan/cifar-10-py-colmajor/batches.meta' does not exist.
I was able to ad-hoc fix it by creating this path and symlinking to my cifar data directory.
Do you know if it's possible to alter the data path of the trained model?
R: Yes. I think a good way to do this is:
1) Load the model in an interactive Python session, e.g. a Python shell.
2) Alter the data-provider-related fields of that object.
Once you have done this by hand, you can write a simple Python script that alters
all the fields for you.
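Once you know which fields the checkpoint actually stores, step 2 can be turned into a throwaway script. A sketch, assuming the checkpoint is a pickle whose 'op' entry has an options dict with a 'data_path' option — verify the real field names interactively first; Option/OptionsParser below are stand-ins:

```python
import pickle

class Option(object):
    """Stand-in for the pickled option object (real layout may differ)."""
    def __init__(self, value):
        self.value = value

class OptionsParser(object):
    """Stand-in for the pickled option parser (real layout may differ)."""
    def __init__(self, data_path):
        self.options = {'data_path': Option(data_path)}

def retarget_data_path(checkpoint_path, new_data_path):
    """Point a saved model at a local copy of the CIFAR batches."""
    with open(checkpoint_path, 'rb') as f:
        d = pickle.load(f)
    d['op'].options['data_path'].value = new_data_path
    with open(checkpoint_path, 'wb') as f:
        pickle.dump(d, f)
```

This avoids the symlink workaround: the model itself is rewritten to look in your local data directory.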
----------------------------------------------------------------------------------------
5. "First, on section 3.2 (Inference), the last paragraph says
``This is a weighted sum of Bernoulli variables, which can be approximated by a Gaussian distribution.''
What is the justification for this sentence? I think, maybe, here we assume the
dimensionality of v is large, thus by the Lyapunov CLT it can be approximated by the
multivariate Gaussian stated in the paper. Is this close to what you have in mind?"
R: I am not sure we want to apply the CLT here, because each output is a sum of
weighted Bernoulli variables and the number of neurons in the layer might not be large.
What I have in mind is simply a moment-matching approximation: E_M[a((M*W)v)] \approx E_u[a(u)].
We use a similar technique in statistics: replace a distribution that is hard to compute
with an approximate distribution that is easy to compute, while ensuring their first and
second moments match.
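Concretely, for one layer output u = (M*W)v with i.i.d. entries M_ij ~ Bernoulli(p), matching the first two moments gives E[u_i] = p * sum_j W_ij v_j and Var[u_i] = p(1-p) * sum_j W_ij^2 v_j^2. A small NumPy sanity check of these formulas (the dimensions, seed, and p below are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5                               # DropConnect keep probability
W = rng.normal(size=(4, 8))           # weights: 4 output neurons, 8 inputs
v = rng.normal(size=8)                # layer input

# Moment-matched Gaussian parameters for u = (M * W) @ v
mean = p * (W @ v)
var = p * (1 - p) * ((W ** 2) @ (v ** 2))

# Monte Carlo estimate of the true moments of the weighted Bernoulli sum
n = 100000
M = rng.random((n, 4, 8)) < p         # n i.i.d. Bernoulli(p) masks
u = np.einsum('nij,j->ni', M * W, v)  # n samples of the layer output
```

The sample mean and variance of u agree with the closed-form moments up to Monte Carlo error, which is all the approximation guarantees: the Gaussian matches the first two moments, not the full distribution.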
----------------------------------------------------------------------------------------
6. In Appendix 8.2, proof of Theorem 1 (DropConnect Complexity), I am not sure how to get
to equation (9) from the previous step.
R: The proof involves some symbol "overloading"; sorry for the inconvenience.
In equation (7) the quantity inside the R-complexity term is a scalar, not a vector,
because when you apply Lemma 5, F is a class of functions with 1-d output (G takes a
vector of 1-d-output function classes). It corresponds to the R-complexity of each
output neuron, not of the vector of neurons. Thus the term inside the R-complexity
in equation (8) is also a scalar.
In equation (9), W is a column vector (a slice of the network's weight matrix, mapping
all inputs to a single output neuron); that is why there is a transpose there.
In this case, D_M is a diagonal matrix and g(x_i) is a vector.
The trick in Lemma 5 that converts a vector of 1-d-output functions to a scalar also
happens going from (10) to the inequality below it; that is why there is a \sqrt(n)
in front of the R-complexity of G (the feature extractor).
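To make the shapes in equation (9) explicit, the scalar-ness can be written out as follows (d here denotes the layer's input dimension; the symbol d is mine, not the paper's):

```latex
% W \in \mathbb{R}^{d}: one column of the layer weights
%   (all inputs mapped to a single output neuron),
% D_M = \operatorname{diag}(m), \; m \in \{0,1\}^{d}: the DropConnect mask,
% g(x_i) \in \mathbb{R}^{d}: the extracted feature vector.
W^{\top} D_M \, g(x_i) \;=\; \sum_{k=1}^{d} W_k \, m_k \, g_k(x_i) \;\in\; \mathbb{R}
```

So the term inside the R-complexity is indeed a single number per example, as Lemma 5 requires.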