Document
- The simple C version (author: Eric)
- Overlap Data Transfers in CUDA
CNN accelerated by CUDA.
Results on popular datasets, compared with the state of the art
- MNIST: 99.76% test accuracy, 99.82% after voting (best 99.79%)
- CIFAR-10: 81.42% test accuracy (best 90%)
- CIFAR-100: 51.13% test accuracy (best 65%)
- Uses DropConnect to train the network.
- Supports checkpoints: the program saves the best test result and the network weights to the file "Result/checkPoint.txt". If the program exits unexpectedly, you can resume from this checkpoint.
- Transforms the MNIST data set with scaling, rotation, and distortion, according to "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis".
- The log is saved to the file "Result/log.txt".
- In the convolutional layers, you can choose to combine feature maps, according to "Notes on Convolutional Neural Networks".
- Supports locally connected layers. The demo configuration file for Cifar10 is very small but still reaches 79.9%.
- To make the program run faster, set "TEST_EPOCH" to a larger value.
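The checkpoint and "TEST_EPOCH" items above can be illustrated with a minimal host-side sketch in plain C++. It is not the project's actual code: the `Checkpoint` struct, the text file layout, and the functions `saveCheckpoint`, `loadCheckpoint`, and `runTraining` are all hypothetical, and the sketch assumes that "TEST_EPOCH" is the number of training epochs between two test passes. The `train` and `test` callables stand in for the real CUDA training and evaluation code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical checkpoint contents. The real layout of
// "Result/checkPoint.txt" may differ.
struct Checkpoint {
    int epoch = 0;               // epoch at which the best result was seen
    double bestAccuracy = 0.0;   // best test accuracy so far
    std::vector<float> weights;  // flattened network weights
};

// Write the checkpoint as plain text. Returns false on I/O failure.
bool saveCheckpoint(const std::string& path, const Checkpoint& cp) {
    std::ofstream out(path);
    if (!out) return false;
    out.precision(9);  // enough digits to round-trip a float
    out << cp.epoch << ' ' << cp.bestAccuracy << ' '
        << cp.weights.size() << '\n';
    for (float w : cp.weights) out << w << ' ';
    out << '\n';
    return static_cast<bool>(out);
}

// Read a checkpoint back. Leaves cp untouched if the file is
// missing or malformed.
bool loadCheckpoint(const std::string& path, Checkpoint& cp) {
    std::ifstream in(path);
    if (!in) return false;
    Checkpoint tmp;
    std::size_t count = 0;
    if (!(in >> tmp.epoch >> tmp.bestAccuracy >> count)) return false;
    tmp.weights.resize(count);
    for (float& w : tmp.weights) {
        if (!(in >> w)) return false;
    }
    cp = tmp;
    return true;
}

// Train for totalEpochs, testing only every testEpoch epochs and
// saving a checkpoint whenever the test accuracy improves.
// If a checkpoint already exists, training resumes from it.
template <class TrainFn, class TestFn>
Checkpoint runTraining(const std::string& path, int totalEpochs,
                       int testEpoch, std::vector<float> weights,
                       TrainFn train, TestFn test) {
    assert(testEpoch > 0);
    Checkpoint best;
    best.weights = weights;
    int startEpoch = 0;
    if (loadCheckpoint(path, best)) {
        weights = best.weights;   // continue from the saved weights
        startEpoch = best.epoch;
    }
    for (int epoch = startEpoch + 1; epoch <= totalEpochs; ++epoch) {
        train(weights);
        if (epoch % testEpoch != 0) continue;  // skip the costly test pass
        const double accuracy = test(weights);
        if (accuracy > best.bestAccuracy) {
            best.epoch = epoch;
            best.bestAccuracy = accuracy;
            best.weights = weights;
            saveCheckpoint(path, best);
        }
    }
    return best;
}
```

In this sketch a larger `testEpoch` means fewer test passes and therefore a shorter run, at the cost of coarser checkpoints: a restart resumes from the epoch of the best saved result, so any epochs trained after it are repeated.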
Depends on OpenCV and CUDA
You can compile the code on Windows or Linux.
###SDK include path(-I)
- linux: /usr/local/cuda/samples/common/inc/ (for the include file "helper_cuda.h"); /usr/local/include/opencv/ (depends on your installation)
- windows: X:/Program Files (x86)/NVIDIA Corporation/CUDA Samples/v6.5/common/inc (for the include file "helper_cuda.h"); X:/Program Files/opencv/vs2010/install/include (depends on your installation)
###Library search path(-L)
- linux: /usr/local/lib/
- windows: X:/Program Files/opencv/vs2010/install/x86/cv10/lib (depends on your installation)
###libraries(-l)
- opencv_core
- opencv_highgui
- opencv_imgproc
- opencv_imgcodecs (needed for OpenCV 3.0)
- cublas
- curand
- cudadevrt
###GPU compute
- capability 2.0
###Windows
- Install VS2010.
- Download and install OpenCV 2.4 or a higher version.
- Download and install CUDA 5.0 or a higher version.
- When you create a new project in VS2010, select the NVIDIA CUDA project template to create a CUDA project.
- View-> Property Pages-> Configuration Properties-> CUDA C/C++ -> Device-> Code Generation-> compute_20,sm_20
- View-> Property Pages-> Configuration Properties-> CUDA C/C++ -> Common-> Generate Relocatable Device Code-> Yes(-rdc=true)
- View-> Property Pages-> Configuration Properties-> Linker-> Input-> Additional Dependencies-> libraries(-l)
- View-> Property Pages-> Configuration Properties-> VC++ Directories-> General-> Library search path(-L)
- View-> Property Pages-> Configuration Properties-> VC++ Directories-> General-> Include Directories(-I)
###Linux
- Install OpenCV and CUDA.
- Start Nsight, which is installed with CUDA.
- Create an 'empty cuda' project and import the cloned code.
- Project->Properties->Build->Settings->CUDA->Device linker mode: separate compilation
- Project->Properties->Build->Settings->CUDA->Generate PTX code 2.0
- Project->Properties->Build->Settings->CUDA->Generate GPU code 2.0
- Project->Properties->Build->Settings->Tool Settings->NVCC Compiler->Includes: add /usr/local/cuda/samples/common/inc/ and the OpenCV SDK include path
- Project->Properties->Build->Settings->Tool Settings->NVCC Linker->Libraries: libraries(-l)
- Project->Properties->Build->Settings->Tool Settings->NVCC Linker->Libraries search path(-L): /usr/local/lib/
Config
- Author :zhxfl
- Mail :[email protected]
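- Affiliation :Multicore Systems Laboratory, Suzhou Institute for Advanced Study, University of Science and Technology of China
- Suggestions are welcome!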