Commit Graph

46 Commits

SHA1 Message Date
9bae70b225 Accelerated by another 5% using FP16/32 Batch-norm for Tensor Cores. 2018-04-17 02:51:11 +03:00
537d135feb Improve training performance - batch-norm using cuDNN. 2018-03-20 02:16:51 +03:00
880cf187d8 Fixed multi-GPU training for Tensor Cores 2018-03-09 19:44:46 +03:00
cad4d1618f Added support for Tensor Cores CC >= 7.0 (V100). For FP16/32 (mixed precision) define CUDNN_HALF should be used. 2018-02-25 16:29:44 +03:00
cd2bdec090 Updated to CUDA 9.1. And fixed no_gpu dependecies. 2018-02-23 15:05:31 +03:00
6332ea99ab one more fix 2018-02-23 00:13:08 +03:00
b2b5756d86 Added __float2half_rn() and __half2float() 2018-02-22 23:52:43 +03:00
dda993f3dd Use half_float16 instead of float32 if defined both CUDNN and CUDNN_HALF. Use Tensor Cores. 2018-02-22 22:54:40 +03:00
9920410ba9 minor fix 2017-07-14 12:11:45 +03:00
d7a30ada7e Fixed behavior if missing library cudnn.lib 2017-01-16 12:51:42 +03:00
3b9afd4cd2 Fixed behavior if missing library cudnn.lib 2017-01-16 00:44:41 +03:00
75fe603722 :vegan: :charizard: 2016-11-24 22:56:23 -08:00
c7a700dc22 new font strategy 2016-11-05 14:09:21 -07:00
352ae7e65b ADAM 2016-10-26 08:35:44 -07:00
73f7aacf35 better multigpu 2016-09-20 11:34:49 -07:00
5c067dc447 good chance I didn't break anything 2016-09-12 13:55:20 -07:00
8f1b4e0962 updates and things 2016-09-01 16:48:41 -07:00
afb8b4f98b CVPR prep 2016-06-22 21:46:32 -07:00
08c7cf9c88 no mean on input binarization 2016-06-19 14:28:15 -07:00
8322a58cf6 hate warnings 2016-06-14 11:30:28 -07:00
729ce43e6e stuff 2016-06-09 17:20:31 -07:00
ec3d050a76 hope i didn't break anything 2016-06-02 15:25:24 -07:00
13209df7bb art, cudnn 2016-05-13 11:59:43 -07:00
c7b10ceadb so much need to commit 2016-05-06 16:25:16 -07:00
cff59ba135 go updates 2016-03-16 04:30:48 -07:00
d1965bdb96 Go 2016-03-13 23:18:42 -07:00
16d06ec0db stuff 2016-02-29 13:54:12 -08:00
913d355ec1 lots of stuff 2016-01-28 12:30:38 -08:00
892923514f fixed darknet, stuff 2015-12-08 15:12:10 -08:00
c2738835f0 Faster batch normalization 2015-12-07 17:18:04 -08:00
0f7f2899b6 Fix for cuda 7.5 2015-11-15 19:51:26 -08:00
8fd18add6e CVPR Experiments 2015-11-03 19:23:42 -08:00
d00f0a1ccd Changes to make routing work better 2015-07-21 16:09:33 -07:00
6553b3f0e3 no comment 2015-03-29 19:31:47 -07:00
d7d7da2653 Fixed im2col mistake >< face#palm 2015-03-26 19:13:59 -07:00
e92f7d301c smaller gridsize in bias 2015-03-24 18:27:12 -07:00
7100de0b59 going to break stuff 2015-03-22 21:28:45 -07:00
664c5dd2f2 Subdivisions for batches 2015-03-22 09:56:40 -07:00
9d418102f4 using caffe's im2col, it's so much better! 2015-03-21 14:17:39 -07:00
4af116e996 gonna change im2col 2015-03-21 12:25:14 -07:00
dcb000b553 refactoring and added DARK ZONE 2015-03-11 22:20:15 -07:00
0f645836f1 Detection is back, baby! 2015-02-10 19:41:03 -08:00
979d02126b Generalizing conv layer so deconv is easier 2015-02-09 13:27:58 -08:00
bfffadc755 Stable place to commit 2015-02-04 12:41:20 -08:00
153705226d Bias updates bug fix 2015-01-27 13:31:06 -08:00
809f924db2 CUDA so fast 2015-01-22 16:38:24 -08:00