114 Commits

Author SHA1 Message Date
f92b20580a Some fixes for AVX support on CPU 2018-08-14 01:51:31 +03:00
b1dddf02cc Fixed AVX compiled bug 2018-08-13 02:43:45 +03:00
1f2155b886 Experiments 2018-08-11 02:49:55 +03:00
a9fef1bd66 Bug fixes. Tested im2col_cpu_custom_transpose - bad way. 2018-08-11 00:26:53 +03:00
3e856ec04e Optimized: transpose 2018-08-10 01:27:20 +03:00
d6162af210 Optimized on CPU: gemm_bin, im2col, activation, transpose 2018-08-09 02:31:36 +03:00
a284a7da8d Try to use avx_hs() - slow and requires alignment 4096 bits < (l.size*l.size*l.c)
May be faster only from 8192 bits and more.
2018-08-08 19:08:58 +03:00
0a326e7afe XNOR-net on CPU AVX2 2018-08-08 02:45:47 +03:00
cfc5fedbb6 Just used spaces for indents instead of Tabs 2018-07-10 23:29:15 +03:00
ec68838342 Fixed memory leaks for Yolo: train, test 2018-05-23 18:27:18 +03:00
c1bb8c129d Fixed xnor for random=1 2018-05-19 16:52:05 +03:00
8b5344ee2d Added BFLOPs output for network configurations 2018-05-14 13:34:40 +03:00
028696bf15 Output improvements for detector results:
When printing detector results, output was in random order, which made the results hard to interpret. Now:
1. Text output includes the coordinates of rects (left, right, top, bottom, in pixels) along with label and score
2. Text output is sorted by rect left edges to simplify finding the corresponding rects on the image
3. If several class probs are > thresh for some detection, the most probable is written first and coordinates are not repeated for the others
4. Rects are drawn on the image in order of their best class prob, so the most probable rects are always on top and not overlaid by less probable ones
5. The most probable label for a rect is always written first
Also:
6. The message about low GPU memory includes the required amount
2018-05-03 16:33:46 +03:00
9bae70b225 Accelerated by another 5% using FP16/32 Batch-norm for Tensor Cores. 2018-04-17 02:51:11 +03:00
c52fa47428 Loss graph is stored automatically at the end of training (iterations == max_batches) 2018-04-16 13:09:10 +03:00
eb9c88ef73 Fixed bug in Tensor Cores V100 (1. Desc in Batch norm, 2. Manually selected algo).
Also fixed time measure on Linux for multi-threading.
2018-04-15 01:51:21 +03:00
537d135feb Improve training performance - batch-norm using cuDNN. 2018-03-20 02:16:51 +03:00
880cf187d8 Fixed multi-GPU training for Tensor Cores 2018-03-09 19:44:46 +03:00
cad4d1618f Added support for Tensor Cores CC >= 7.0 (V100). For FP16/32 (mixed precision) define CUDNN_HALF should be used. 2018-02-25 16:29:44 +03:00
cd2bdec090 Updated to CUDA 9.1. And fixed no_gpu dependencies. 2018-02-23 15:05:31 +03:00
f558d5c39c Fix 2018-02-22 23:16:36 +03:00
dda993f3dd Use half_float16 instead of float32 if both CUDNN and CUDNN_HALF are defined. Use Tensor Cores. 2018-02-22 22:54:40 +03:00
033e934ce8 If there is excessive GPU-RAM consumption by CUDNN then do not use Workspace 2018-02-21 19:14:01 +03:00
4b0be8c701 Optimized resizing of network for random=1 2018-02-21 15:06:11 +03:00
bc810016a1 cuDNN 6.0 supported. Also speed of console example improved. 2017-08-03 01:36:22 +03:00
d7a30ada7e Fixed behavior if missing library cudnn.lib 2017-01-16 12:51:42 +03:00
3b9afd4cd2 Fixed behavior if missing library cudnn.lib 2017-01-16 00:44:41 +03:00
62235e9aa3 cpu batch norm works 2016-11-18 21:51:36 -08:00
fc9b867dd9 🔥 🔥 :dragonite: 2016-11-16 00:15:46 -08:00
0d6b107ed2 hey 2016-11-15 22:53:58 -08:00
c7a700dc22 new font strategy 2016-11-05 14:09:21 -07:00
352ae7e65b ADAM 2016-10-26 08:35:44 -07:00
481b57a96a So I have this new programming paradigm....... 2016-09-24 23:12:54 -07:00
73f7aacf35 better multigpu 2016-09-20 11:34:49 -07:00
5c067dc447 good chance I didn't break anything 2016-09-12 13:55:20 -07:00
8f1b4e0962 updates and things 2016-09-01 16:48:41 -07:00
845ab75796 some more stuff 2016-08-05 15:27:07 -07:00
9361292c42 updates 2016-07-19 14:50:01 -07:00
08c7cf9c88 no mean on input binarization 2016-06-19 14:28:15 -07:00
8322a58cf6 hate warnings 2016-06-14 11:30:28 -07:00
7520949d84 idk just in case 2016-06-08 11:07:31 -07:00
8a767f1066 stuff for carlo 2016-06-06 15:48:52 -07:00
4625a16ffd tactics 2016-06-06 13:22:45 -07:00
ec3d050a76 hope i didn't break anything 2016-06-02 15:25:24 -07:00
881d6ee9b6 fixed 2016-05-13 13:46:31 -07:00
13209df7bb art, cudnn 2016-05-13 11:59:43 -07:00
c7b10ceadb so much need to commit 2016-05-06 16:25:16 -07:00
cff59ba135 go updates 2016-03-16 04:30:48 -07:00
913d355ec1 lots of stuff 2016-01-28 12:30:38 -08:00
1578ec70d7 idk 2016-01-18 15:40:14 -08:00