c50b0e0c8a
Minor Python and C API improvement
2019-02-06 14:38:12 +03:00
b76f1c0006
Merge pull request #2352 from aughey/master
Changes to better support Python bindings
2019-02-06 14:23:26 +03:00
285088adc4
Fixed checking CC for enabling Tensor Cores
2019-02-06 01:55:42 +03:00
e1bbeb8367
CUDNN_HALF and CC 7.5 by default in darknet.sln
2019-02-05 11:59:58 -06:00
fa1415e3c2
CUDNN_HALF and CC 7.5 by default in darknet.sln
2019-02-05 20:43:07 +03:00
c00d3c92db
Making a fast, API-compatible way to copy image data.
This can improve on the Python array_to_image function
where we already have an allocated image struct and simply
need to copy the data into the correct format/shape without
reallocating.
Aside: I think array_to_image does two frees because the pointer
data_as returns doesn't own the memory that is ultimately freed
in free_image.
2019-02-05 11:40:21 -06:00
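The copy-without-reallocating idea in the commit above might look roughly like the following. This is a minimal sketch, not the code from the commit: the `image_sketch` struct and the function name `copy_hwc_bytes_to_image` are hypothetical, and the interleaved 8-bit HWC input layout, the planar CHW output layout, and the scaling to [0, 1] are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an image struct: planar (CHW) floats.
 * The buffer is owned by the caller and allocated once up front. */
typedef struct {
    int w, h, c;
    float *data;   /* w * h * c floats */
} image_sketch;

/* Copy interleaved 8-bit HWC pixels into an already allocated CHW float
 * image, scaling each value to [0, 1]. Nothing is allocated or freed here,
 * so the same destination can be reused for every frame. */
static void copy_hwc_bytes_to_image(const unsigned char *src, image_sketch dst)
{
    assert(src != NULL && dst.data != NULL);
    for (int k = 0; k < dst.c; ++k) {
        for (int y = 0; y < dst.h; ++y) {
            for (int x = 0; x < dst.w; ++x) {
                size_t src_i = ((size_t)y * dst.w + x) * dst.c + k; /* HWC */
                size_t dst_i = ((size_t)k * dst.h + y) * dst.w + x; /* CHW */
                dst.data[dst_i] = src[src_i] / 255.0f;
            }
        }
    }
}
```

Because the destination buffer stays owned by the caller, this shape of function also avoids the ownership question raised in the aside: the source pointer is only read, never freed.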
7e9416aa80
Making a pointer version of network_predict for Python.
The Python binding requires the network struct to be passed
by pointer rather than as a struct copy.
2019-02-05 11:35:45 -06:00
8726d7b0db
Optimizing network_predict_image to resize only if necessary.
This speeds up the processing considerably if the user is so
kind as to resize the image prior to doing detection.
2019-02-05 11:30:38 -06:00
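The "resize only if necessary" check described above could be sketched as follows. All names here (`size_sketch`, `resize_stub`, `prepare_input`, `resize_calls`) are hypothetical and exist only for this illustration; the stub merely counts calls in place of a real resize routine.

```c
#include <assert.h>

/* Hypothetical image descriptor: only the dimensions matter here. */
typedef struct { int w, h; } size_sketch;

/* Counts how often the (stubbed) resize path is taken. */
static int resize_calls = 0;

/* Stand-in for an expensive resize; a real one would resample pixels. */
static size_sketch resize_stub(size_sketch in, int w, int h)
{
    (void)in;
    resize_calls++;
    return (size_sketch){ w, h };
}

/* Return the input unchanged when it already matches the network input
 * size; otherwise fall back to the resize path. */
static size_sketch prepare_input(size_sketch img, int net_w, int net_h)
{
    assert(net_w > 0 && net_h > 0);
    if (img.w == net_w && img.h == net_h)
        return img;                      /* fast path: no resize */
    return resize_stub(img, net_w, net_h);
}
```

A caller that resizes frames once, ahead of time, then always takes the fast path.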
edfdf2c20e
Fixed bug in Tensor Cores training
2019-02-05 19:33:10 +03:00
12b6e93893
CHECK_CUDA is used everywhere
2019-02-05 16:18:36 +03:00
ce2e0eff00
DEBUG=1 fixed
2019-02-05 00:36:17 +03:00
d767e8ca38
Minor fixes
2019-02-04 23:29:06 +03:00
5446d19576
Checks Compute Capability and forcibly disables Tensor Cores for CC < 7.0
2019-02-04 23:28:40 +03:00
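The compute-capability gate described above can be illustrated with a small pure C predicate. This is a sketch under assumptions: the function name `tensor_cores_allowed` and the `requested` flag are hypothetical, and in a real CUDA program the major and minor values would come from the `major` and `minor` fields of `cudaDeviceProp`.

```c
#include <assert.h>

/* Decide whether Tensor Cores may be used.
 * cc_major / cc_minor: the device's compute capability, e.g. 7 and 5.
 * requested: non-zero if the build or config asked for Tensor Cores.
 * Returns 1 only when they were requested and the device is CC >= 7.0. */
static int tensor_cores_allowed(int cc_major, int cc_minor, int requested)
{
    assert(cc_major >= 0 && cc_minor >= 0 && cc_minor < 10);
    int cc = cc_major * 10 + cc_minor;   /* 7.5 becomes 75 */
    return requested && cc >= 70;
}
```

Folding the check into one predicate keeps the "forcibly disable" decision in a single place instead of scattering version comparisons across layers.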
f7cb538b32
Compile fix
2019-02-03 00:37:00 +03:00
584f840b40
CUDA_CHECK definition for debug
2019-02-03 00:19:04 +03:00
61156239e0
Minor performance improvement
2019-02-03 00:18:30 +03:00
dc7e7f035d
improve XNOR Tensor Cores GEMM - N 2x unrolled - minor performance improvement
2019-02-02 17:57:30 +03:00
41814fc4b3
Minor fixes
2019-02-02 15:16:57 +03:00
ff0733ed40
Speedup repack_input_kernel_bin()
2019-02-02 15:16:25 +03:00
2d747cab2b
Minor fixes
2019-02-02 03:16:30 +03:00
f91d5a5e09
Fixed __shfl() and __ballot() warnings
2019-02-02 03:16:05 +03:00
e1ec8a8b07
Update Readme.md
2019-02-02 00:58:09 +03:00
f09a9c3315
XNOR uses Tensor Cores on Turing GPU CC>=7.3 (not Volta)
2019-02-02 00:24:34 +03:00
e17bd9ba8f
Minor fix
2019-02-01 01:32:26 +03:00
a607784626
Added crnn.train.cfg just for test
2019-02-01 01:32:03 +03:00
c7309c1fdb
Fixed CRNN (RNN based on Convolution) layer
2019-02-01 01:30:02 +03:00
bd91d0a908
Add try-catch to http_stream.cpp
2019-01-31 14:33:05 +03:00
c71354ab2e
Added cudaGetLastError() for cudaHostAlloc() to reset last cuda error
2019-01-31 14:22:07 +03:00
381f90ebb8
Fixed CUDA error checking
2019-01-29 13:46:30 +03:00
2790464de1
Another compile fix
2019-01-29 00:11:32 +03:00
ae8a8e6016
Compile fix
2019-01-29 00:05:08 +03:00
640bdbc063
LSTM, RNN, GRU - use connected_layer that uses cuDNN. Fixed CRNN for conv-layer with cuDNN.
2019-01-28 23:50:51 +03:00
0e1f3eaf35
Fixed DLL/SO
2019-01-28 20:32:30 +03:00
3692c174c5
Compile fix
2019-01-28 20:25:14 +03:00
110b5240a4
Fixed LSTM-layer
2019-01-28 20:22:14 +03:00
85b99872cb
Use non-default stream for all CUDA-functions
2019-01-28 20:19:26 +03:00
00b87281f3
Fixed RNN (RNN, GRU, LSTM) with cuDNN (batch-norm)
2019-01-27 03:42:44 +03:00
9576cd4d89
Fixed memory allocation
2019-01-26 23:25:09 +03:00
090d934c0f
Minor speedup on CPU
2019-01-26 19:12:46 +03:00
630f441e08
Minor CPU speedup - i7 6500K: 1000ms (AVX=1) instead of 1500ms (old AVX=1) and 2000ms (AVX=0)
2019-01-26 02:54:41 +03:00
1b15e2f8df
Compile fix on Windows
2019-01-24 20:30:15 +03:00
da044776d1
Merge pull request #2282 from davidssmith/master
add LSTM layer
2019-01-24 20:19:57 +03:00
a7366a5a0a
Compile fix for CC < 7.3
2019-01-24 20:19:01 +03:00
96773df469
add lstm_layer.o to Makefile
2019-01-24 09:38:45 -06:00
5e778cd91e
add LSTM layer
2019-01-23 22:02:09 -06:00
29aa716bd9
Update Readme.md
2019-01-23 18:04:31 +03:00
2d3220cef5
Look at wmma::bmma_sync(), bmmaBitOpXOR, bmmaAccumulateOpPOPC
2019-01-23 00:35:44 +03:00
b47db904ee
Merge pull request #2272 from Sauraus/master
gcc on OSX required explicit return value for empty (char *) in detec…
2019-01-22 21:45:38 +03:00
8960fbfb3f
gcc on OSX required explicit return value for empty (char *) in detection_to_json
2019-01-22 10:17:42 -08:00
2cd37ec73e
Another minor fix
2019-01-22 17:31:07 +03:00