Commit Graph

1111 Commits

SHA1 Message Date
b3579380dc improve compatibility with c++ compilers, prepare for CMake 2019-02-15 17:27:12 +01:00
3d9c8530a0 Use Tensor Cores only when (channels % 8 == 0) and (filters % 8 == 0) 2019-02-12 23:13:25 +03:00
28106c0fd8 Optimized memory allocation for XNOR on CPU 2019-02-12 22:16:11 +03:00
449fcfed75 Fix for GCC on ARM 32/64-bit 2019-02-12 22:15:35 +03:00
00e992a600 Compile fix 2019-02-12 02:12:46 +03:00
5448e07445 Try to fuse conv_xnor+shortcut -> conv_xnor 2019-02-12 02:05:15 +03:00
9e138adf09 more accurate time measurements in Demo 2019-02-12 02:01:10 +03:00
7dff7365cb Minor demo fix 2019-02-11 23:16:30 +03:00
f154d2070a Fixed RNN (LSTM, RNN, CRNN, GRU) for CUDNN_HALF=1 2019-02-08 00:51:41 +03:00
9e07605bc5 get_connected_workspace_size() and get_convolutional_workspace_size() 2019-02-08 00:51:20 +03:00
6832290eee Fixed set_batch_network(), when workspace larger for smaller batch 2019-02-08 00:49:51 +03:00
58de6b2d3d Minor fix for CHECK_CUDA() 2019-02-08 00:48:27 +03:00
98103552fb Minor fix 2019-02-07 18:19:34 +03:00
6c28da5def Draw top5 accuracy on the Loss-chart for training Classifier 2019-02-07 18:05:58 +03:00
fc663f6efe Another minor fix 2019-02-07 15:02:36 +03:00
9bb7455a0e Minor fix 2019-02-07 14:47:43 +03:00
c999f53e9d Merge pull request #2359 from aughey/master
Change for py code
2019-02-07 14:43:01 +03:00
7587d47c46 Partial fixed 2019-02-06 18:54:38 -06:00
0543278a5b Partial fixed 2019-02-07 00:15:31 +03:00
022ce74fe9 Rewriting darknet_video.py to reuse darknet.py as a lib 2019-02-06 14:59:23 -06:00
64b217aa86 Update Readme.md 2019-02-06 14:51:02 +03:00
c50b0e0c8a Minor Python and C API improvement 2019-02-06 14:38:12 +03:00
b76f1c0006 Merge pull request #2352 from aughey/master
Changes to better support python bindings
2019-02-06 14:23:26 +03:00
285088adc4 Fixed checking CC for enabling Tensor Cores 2019-02-06 01:55:42 +03:00
e1bbeb8367 CUDNN_HALF and CC 7.5 by default in darknet.sln 2019-02-05 11:59:58 -06:00
fa1415e3c2 CUDNN_HALF and CC 7.5 by default in darknet.sln 2019-02-05 20:43:07 +03:00
c00d3c92db Making a fast API compatible way to copy image data.
This can improve on the python array_to_image function
where we already have an allocated image struct and simply
need to copy the data into the correct format/shape without
reallocating.

Aside: I think array_to_image does two frees because the pointer
data_as returns doesn't own the memory that is ultimately freed
in free_image.
2019-02-05 11:40:21 -06:00
7e9416aa80 Making a pointer version of network_predict for python.
The python binding requires the network struct to be passed
as a pointer to it rather than a struct copy.
2019-02-05 11:35:45 -06:00
8726d7b0db Optimizing network_predict_image to resize only if necessary.
This speeds up the processing considerably if the user is
kind enough to resize the image prior to doing detection.
2019-02-05 11:30:38 -06:00
edfdf2c20e Fixed bug in Tensor Cores training 2019-02-05 19:33:10 +03:00
12b6e93893 CHECK_CUDA is used everywhere 2019-02-05 16:18:36 +03:00
ce2e0eff00 DEBUG=1 fixed 2019-02-05 00:36:17 +03:00
d767e8ca38 Minor fixes 2019-02-04 23:29:06 +03:00
5446d19576 Checks Compute Capability and forcibly disables Tensor Cores for CC < 7.0 2019-02-04 23:28:40 +03:00
f7cb538b32 Compile fix 2019-02-03 00:37:00 +03:00
584f840b40 CUDA_CHECK definition for debug 2019-02-03 00:19:04 +03:00
61156239e0 Minor performance improvement 2019-02-03 00:18:30 +03:00
dc7e7f035d improve XNOR Tensor Cores GEMM - N 2x unrolled - minor performance improvement 2019-02-02 17:57:30 +03:00
41814fc4b3 Minor fixes 2019-02-02 15:16:57 +03:00
ff0733ed40 Speedup repack_input_kernel_bin() 2019-02-02 15:16:25 +03:00
2d747cab2b Minor fixes 2019-02-02 03:16:30 +03:00
f91d5a5e09 Fixed __shfl() and __ballot() warnings 2019-02-02 03:16:05 +03:00
e1ec8a8b07 Update Readme.md 2019-02-02 00:58:09 +03:00
f09a9c3315 XNOR uses Tensor Cores on Turing GPU CC>=7.3 (not Volta) 2019-02-02 00:24:34 +03:00
e17bd9ba8f Minor fix 2019-02-01 01:32:26 +03:00
a607784626 Added crnn.train.cfg just for test 2019-02-01 01:32:03 +03:00
c7309c1fdb Fixed CRNN (RNN based on Convolution) layer 2019-02-01 01:30:02 +03:00
bd91d0a908 Add try-catch to the http_stream.cpp 2019-01-31 14:33:05 +03:00
c71354ab2e Added cudaGetLastError() for cudaHostAlloc() to reset last cuda error 2019-01-31 14:22:07 +03:00
381f90ebb8 Fixed CUDA error checking 2019-01-29 13:46:30 +03:00