153 Commits

SHA1 Message Date
038289eb7d Some conv-lstm, sgdr and other fixes 2019-05-14 14:35:22 +03:00
4f72fcc015 Added grouped convolutional (depth-wise convolutional) 2019-05-10 16:46:48 +03:00
31dc6c8680 Added LSTM sequence detector, and blur data augmentation (for OpenCV only) 2019-05-07 16:28:31 +03:00
b3254ed523 Fixed many warnings 2019-03-18 19:25:48 +03:00
b6e15f1656 ZED 3D Camera support added to ./uselib (yolo_console_cpp.exe) example 2019-03-18 02:48:52 +03:00
75f2a3e7cf Added object Detection & Tracking using conv-rnn layer on frames from video 2019-03-02 03:32:24 +03:00
9b09abe122 Fixed convolutional-layer when it is used as base for crnn-layer 2019-02-28 20:47:22 +03:00
00de023601 fully separate C-API from CPP-API 2019-02-19 15:57:18 +01:00
b3579380dc improve compatibility with c++ compilers, prepare for CMake 2019-02-15 17:27:12 +01:00
28106c0fd8 Optimized memory allocation for XNOR on CPU 2019-02-12 22:16:11 +03:00
9e07605bc5 get_connected_workspace_size() and get_convolutional_workspace_size() 2019-02-08 00:51:20 +03:00
12b6e93893 CHECK_CUDA is used everywhere 2019-02-05 16:18:36 +03:00
f09a9c3315 XNOR uses Tensor Cores on Turing GPU CC>=7.3 (not Volta) 2019-02-02 00:24:34 +03:00
c7309c1fdb Fixed CRNN (RNN based on Convolution) layer 2019-02-01 01:30:02 +03:00
640bdbc063 LSTM, RNN, GRU - use connected_layer that uses cuDNN. Fixed CRNN for conv-layer with cuDNN. 2019-01-28 23:50:51 +03:00
090d934c0f Minor speedup on CPU 2019-01-26 19:12:46 +03:00
2d3220cef5 Look at wmma::bmma_sync(), bmmaBitOpXOR, bmmaAccumulateOpPOPC 2019-01-23 00:35:44 +03:00
46be08db37 Minor fix 2019-01-22 16:23:44 +03:00
3a51f4af74 Experimental repack 2019-01-18 19:52:11 +03:00
5343aa4235 CUDA minor performance improvement 2019-01-16 18:08:11 +03:00
4c05166215 Temporary experimental XNOR on GPU (repack channels) 2019-01-16 02:43:44 +03:00
c75fbb5f2e Minor fix 2019-01-06 15:45:10 +03:00
48d461f9bd Temporary experimental XNOR improvements on CPU 2019-01-04 23:19:45 +03:00
64e478db07 Fix training approach (convolutional layer) 2018-12-27 00:31:28 +03:00
cb998db949 Some fix for CUDNN_HALF 2018-12-11 21:16:18 +03:00
7c2f302321 Fixed nan issue for training with CUDNN_HALF=1 by using Tensor Cores 2018-12-07 22:40:10 +03:00
25f65f6878 Added fast_binarize_weights_gpu() 2018-11-05 22:38:35 +03:00
31ac46ba22 Fixed bug for 32-bit compilation without GPU. 2018-10-23 21:59:57 +03:00
d487bdf471 transpose 32x32 on GPU 2018-10-19 22:55:25 +03:00
9e2c894a32 Transpose on CPU fix 2018-10-19 16:31:55 +03:00
7dd97537fb XNOR-net tiny-yolo_xnor.cfg ~2x faster than cuDNN on CUDA (nVidia GPU Maxwell) 2018-09-22 02:01:14 +03:00
c0e01fd63c Test for XNOR-conv on CUDA 2018-09-08 02:46:05 +03:00
b141f85cab Compile fix 2018-09-07 15:07:46 +03:00
007878393f Temporary Slow implementation of XNOR on CUDA (shared_memory) 2018-09-06 23:21:26 +03:00
c4a9e3422e Temporary implementation of XNOR on CUDA 2018-08-31 02:47:58 +03:00
9753b72aeb temp fix, don't use it 2018-08-30 17:24:41 +03:00
18d5e4f39c Fixed yolov3-tiny_xnor.cfg 2018-08-24 18:29:40 +03:00
31b6b0bad3 XNOR-net 4x acceleration on CPU for yolov2-tiny - 22 FPS (CPU Core i7 6700K) 2018-08-23 02:44:21 +03:00
f606b5456e XNOR-net 21 FPS on CPU yolov2-tiny.cfg 2018-08-22 17:52:48 +03:00
f92b20580a Some fixes for AVX support on CPU 2018-08-14 01:51:31 +03:00
b1dddf02cc Fixed AVX compiled bug 2018-08-13 02:43:45 +03:00
1f2155b886 Experiments 2018-08-11 02:49:55 +03:00
a9fef1bd66 Bug fixes. Tested im2col_cpu_custom_transpose - bad way. 2018-08-11 00:26:53 +03:00
3e856ec04e Optimized: transpose 2018-08-10 01:27:20 +03:00
d6162af210 Optimized on CPU: gemm_bin, im2col, activation, transpose 2018-08-09 02:31:36 +03:00
a284a7da8d Try to use avx_hs() - slow and requires alignment 4096 bits < (l.size*l.size*l.c). May be faster only from 8192 bits and more. 2018-08-08 19:08:58 +03:00
0a326e7afe XNOR-net on CPU AVX2 2018-08-08 02:45:47 +03:00
cfc5fedbb6 Just used spaces for indents instead of Tabs 2018-07-10 23:29:15 +03:00
ec68838342 Fixed memory leaks for Yolo: train, test 2018-05-23 18:27:18 +03:00
c1bb8c129d Fixed xnor for random=1 2018-05-19 16:52:05 +03:00