TypeError while running create_embeddings.py with a GPU
Hi,
I tried running create_embeddings.py on my machine, which has an NVIDIA GeForce GT 750M, but I got this error:
Using gpu device 0: GeForce GT 750M
2014-05-20 13:33:41 INFO trainer.py: 96 Setting validation period type to examples
Traceback (most recent call last):
  File "create_embeddings.py", line 80, in <module>
    main(args)
  File "create_embeddings.py", line 16, in main
    trainer.run(args)
  File "build\bdist.win-amd64\egg\word2embeddings\nn\trainer.py", line 466, in run
  File "build\bdist.win-amd64\egg\word2embeddings\nn\trainer.py", line 363, in train
  File "build\bdist.win-amd64\egg\word2embeddings\nn\networks.py", line 117, in build
  File "build\bdist.win-amd64\egg\word2embeddings\nn\networks.py", line 139, in updates
  File "build\bdist.win-amd64\egg\word2embeddings\nn\layers.py", line 165, in updates
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 529, in grad
    grad_dict, wrt, cost_name)
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 1207, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 1167, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 1028, in access_term_cache
    input_grads = node.op.grad(inputs, new_output_grads)
  File "C:\Anaconda\Lib\site-packages\Theano\theano\sandbox\cuda\basic_ops.py", line 80, in grad
    return [gpu_from_host(gz)]
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gof\op.py", line 411, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "C:\Anaconda\Lib\site-packages\Theano\theano\sandbox\cuda\basic_ops.py", line 126, in make_node
    x.type))
TypeError: Expected a Theano variable with type TensorType. Got SparseVariable{csc,float32} with type Sparse[float32, csc]
I have the latest GitHub version of Theano, running on a Windows 8 machine with Python 2.7 (Anaconda).
Do you know what the source of the issue is? Thank you.
Comments (17)
-
-
--- a/word2embeddings/nn/layers.py
+++ b/word2embeddings/nn/layers.py
@@ -258,7 +258,10 @@ class EmbeddingLayer(HiddenLayer):
         self.inputs = inputs
         input = self.inputs[0]
         concatenated_input = input.flatten()
-        indexed_rows = theano.sparse_grad(self.weights[concatenated_input])
+        if theano.config.device == 'cpu':
+            indexed_rows = theano.sparse_grad(self.weights[concatenated_input])
+        else:
+            indexed_rows = theano.grad(self.weights[concatenated_input])
         concatenated_rows = indexed_rows.flatten()
         num_examples = input.shape[0]
         width = concatenated_rows.size//num_examples
-
repo owner @vvkulkarninitk
I would rather check for GPU: the default value is CPU, so if the flags have nothing in them we should assume CPU unless GPU is stated explicitly.
Can you please send me a patch :)
-
@BAmine: Thanks for bringing this to our attention. The thing is, the sparse_grad operation is not currently supported on the GPU. I have checked in a potential fix. Can you please pull the latest code and let us know if it solves your problem.
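For context, here is a toy NumPy sketch (not the project's code) of why the sparse-gradient hint matters for an embedding lookup: the gradient of W[idx] with respect to W is zero everywhere except the indexed rows, so on a large vocabulary a dense gradient wastes memory and time, while a sparse one touches only the rows that were used.

```python
import numpy as np

# Toy illustration (not project code): the gradient of an embedding
# lookup W[idx] w.r.t. W is nonzero only at the indexed rows, which is
# exactly what a sparse gradient exploits on large vocabularies.
def embedding_grad(weights_shape, idx, upstream):
    grad = np.zeros(weights_shape)
    # Unbuffered add: repeated indices accumulate instead of overwriting.
    np.add.at(grad, idx, upstream)
    return grad

# Look up rows 1, 3, 1 of a (5, 3) embedding matrix; row 1 is used twice,
# so its gradient accumulates twice, and rows 0, 2, 4 stay zero.
g = embedding_grad((5, 3), np.array([1, 3, 1]), np.ones((3, 3)))
```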
-
@aboSamoor: Indeed, that might be better. Note that theano.config handles the case of no explicit declaration anyway: it automatically sets the property to 'cpu' if no definition exists in the .theanorc file, so it's guaranteed to be one of 'cpu' or 'gpu'.
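A minimal sketch of the branch condition being debated (a hypothetical helper, not the project's code): since the device flag defaults to 'cpu' and GPU devices are spelled 'gpu', 'gpu0', 'gpu1', ..., testing explicitly for the GPU prefix is the safer check suggested above.

```python
# Hypothetical helper sketching the check discussed above: sparse_grad
# is CPU-only, so enable it unless the device is explicitly a GPU.
# Device strings follow Theano's convention: 'cpu', 'gpu', 'gpu0', ...
def use_sparse_grad(device):
    return not device.startswith('gpu')

use_sparse_grad('cpu')   # True
use_sparse_grad('gpu0')  # False
```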
-
reporter Thanks for the quick reply. It's weird, but I still get the exact same error I was getting before (I checked that my device was set to gpu, cleared the Theano cache, and checked for GPU instead of CPU, but nothing worked).
-
repo owner @BAmine
Can you please post your new traceback (better if you used bitbucket code formatting)?
-
reporter Here you go:
Traceback (most recent call last):
  File "create_embeddings.py", line 80, in <module>
    main(args)
  File "create_embeddings.py", line 16, in main
    trainer.run(args)
  File "build\bdist.win-amd64\egg\word2embeddings\nn\trainer.py", line 466, in run
  File "build\bdist.win-amd64\egg\word2embeddings\nn\trainer.py", line 363, in train
  File "build\bdist.win-amd64\egg\word2embeddings\nn\networks.py", line 117, in build
  File "build\bdist.win-amd64\egg\word2embeddings\nn\networks.py", line 139, in updates
  File "build\bdist.win-amd64\egg\word2embeddings\nn\layers.py", line 165, in updates
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 529, in grad
    grad_dict, wrt, cost_name)
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 1207, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 1167, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gradient.py", line 1028, in access_term_cache
    input_grads = node.op.grad(inputs, new_output_grads)
  File "C:\Anaconda\Lib\site-packages\Theano\theano\sandbox\cuda\basic_ops.py", line 80, in grad
    return [gpu_from_host(gz)]
  File "C:\Anaconda\Lib\site-packages\Theano\theano\gof\op.py", line 411, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "C:\Anaconda\Lib\site-packages\Theano\theano\sandbox\cuda\basic_ops.py", line 126, in make_node
    x.type))
TypeError: Expected a Theano variable with type TensorType. Got SparseVariable{csc,float32} with type Sparse[float32, csc]
-
repo owner @BAmine
- Does your code run under CPU?
- Are you sure you reinstalled the package? This should not happen!
In the worst case, edit the file word2embeddings/nn/layers.py manually in your cloned repo: replace
indexed_rows = theano.sparse_grad(self.weights[concatenated_input])
with
indexed_rows = theano.grad(self.weights[concatenated_input])
and reinstall the package using setup.py.
-
reporter OK, I'll try this, but shouldn't theano.grad() take 2 arguments, not 1? Thank you!
-
repo owner @BAmine
You are right; I just saw that vivek added grad instead of just removing the sparse_grad hint.
He is working on the fix.
-
Please sync to https://bitbucket.org/aboSamoor/word2embeddings/commits/2e0c1fb5e8b91eeaf56fdd4578b8a8c80845a16d. This should fix it.
-
reporter Thanks a lot, guys, I don't get the error anymore. Is it normal that the code has been frozen for about 20 minutes now on this?:
The output file is available at trainer.png
The output file is available at validator.png
2014-05-21 17:49:14 INFO trainer.py: 366 Training will start with these settings
2014-05-21 17:49:14 INFO trainer.py: 367 floatX: float32
2014-05-21 17:49:14 INFO trainer.py: 368 intX: int64
2014-05-21 17:49:14 INFO trainer.py: 369 allow_gc: False
2014-05-21 17:49:14 INFO trainer.py: 370 device: gpu
2014-05-21 17:49:14 INFO trainer.py: 371 batch_size: 16
-
It is very likely training, and training takes quite some time, depending on the size of your corpus. We do print some messages, but they are at the DEBUG log level. Can you try training on the sample corpus and voc file to see if things go fine? If things work on the small corpus, you can then try a larger one.
-
reporter Thanks a lot, I thought there would be some debugging output. I've tried it on the small samples and it works. Thank you again.
-
Great! If you want to enable debug output, you need to set it via the '-l' flag: use --log DEBUG to set the logger to DEBUG level. By default it is set to INFO.
You can also dump a model every T seconds by passing --dump-period <T>. By default it is set to 1800 seconds, which is half an hour, so if you would like intermediate results you can tune that parameter. Hope this helps.
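The two flags above could be wired up with argparse roughly as follows (a sketch only: flag names and defaults are taken from this thread, i.e. INFO logging and a 1800-second dump period; the project's actual option parsing may differ):

```python
import argparse

# Sketch only: flag names and defaults taken from the thread above;
# the project's real option parsing may differ.
parser = argparse.ArgumentParser()
parser.add_argument('-l', '--log', default='INFO',
                    help='logging level, e.g. DEBUG (default: INFO)')
parser.add_argument('--dump-period', type=int, default=1800,
                    help='seconds between intermediate model dumps')

# Example: enable debug output and dump a model every 10 minutes.
args = parser.parse_args(['--log', 'DEBUG', '--dump-period', '600'])
```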
-
- changed status to closed
I will look at this.