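NVIDIA Jetson Nano Developer Kit B01. ASIN: B084DSDDLT. Sold by Macnica (authorized NVIDIA distributor in Japan). Note: this is the B01 revision shipped in 2020; the B01 revision can use the Intel 8260NGW.
SUCCUL AC adapter, 5V 4A, OEM product from a major manufacturer, center-positive, switching type, 20W maximum output, plug outer diameter 5.5mm (inner diameter 2.1mm), PSE certified. ASIN: B015RKFAA2. Note: using a 4A power supply also requires the jumper pin listed next.
Bullet jumper pins, set of 5 colors, with handle, JP01. ASIN: B005KVKKZY. Note: required when using a 4A power supply.
Acrylic case with cooling fan (3.0-5.8V) for the NVIDIA Jetson Nano Developer Kit. ASIN: B07VLXHZLF. Note: if you use a 4A power supply, plan for cooling as well.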
| NVIDIA Jetson Nano 10W mode | NVIDIA Jetson Nano 10W mode | Raspberry Pi 3 Model B+ | Performance comparison |
| GPU and cuDNN on | GPU and cuDNN off | CPU only | Jetson Nano speedup over Raspberry Pi: GPU on (GPU off) |
| 0.7 s | 36 s | 292 s | 417x (8x) |
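The speedup column is the Raspberry Pi time divided by the Jetson Nano time. A minimal sketch of that arithmetic follows; the `speedup` helper is a hypothetical name introduced here for illustration, not something from darknet.

```shell
# speedup SLOW FAST: print how many times faster FAST is than SLOW,
# rounded to the nearest integer.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f\n", slow / fast }'
}

speedup 292 0.7   # Raspberry Pi vs. Jetson Nano with GPU and cuDNN on
speedup 292 36    # Raspberry Pi vs. Jetson Nano with GPU and cuDNN off
```

With the rounded times from the table, this prints 417 and 8.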
$ uname -a
Linux user-desktop 4.9.140-tegra #1 SMP PREEMPT Wed Mar 13 00:32:22 PDT 2019 aarch64 aarch64 aarch64 GNU/Linux
$ free -h
total used free shared buff/cache available
Mem: 3.9G 514M 3.0G 26M 371M 3.2G
Swap: 0B 0B 0B
$ cat /proc/cpuinfo
processor : 0
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 1
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 2
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 3
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
$ sudo apt-get update
$ sudo apt-get -y install cmake
$ cd
$ git clone https://github.com/pjreddie/darknet.git --depth 1
$ cd darknet
$ git show
commit 61c9d02ec461e30d55762ec7669d6a1d3c356fb2 (grafted, HEAD -> master, origin/master, origin/HEAD)
Author: Joseph Redmon <pjreddie@gmail.com>
Date: Fri Sep 14 08:03:20 2018 -0700
$ sudo apt-get -y install nano
$ nano Makefile
# Enable GPU and cuDNN
$ sed -i -e "s/GPU=0/GPU=1/g" Makefile
$ sed -i -e "s/CUDNN=0/CUDNN=1/g" Makefile
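Because the same two switches are flipped back later to build the CPU-only binary, the sed calls can be wrapped in a small helper. This is a sketch only: `set_makefile_flag` is a hypothetical name, it assumes GNU sed (for `-i`), and it anchors the match to the start of the line, which is slightly stricter than the commands above.

```shell
# set_makefile_flag NAME VALUE [FILE]
# Rewrite a line of the form NAME=0 or NAME=1 to NAME=VALUE.
# FILE defaults to ./Makefile. Assumes GNU sed.
set_makefile_flag() {
    sed -i -e "s/^$1=[01]/$1=$2/" "${3:-Makefile}"
}

# Only touch the Makefile if we are actually inside the darknet directory.
if [ -f Makefile ]; then
    set_makefile_flag GPU 1
    set_makefile_flag CUDNN 1
    grep -E '^(GPU|CUDNN)=' Makefile   # show the resulting settings
fi
```

Passing 0 instead of 1 switches the flags back off.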
# nvcc: not found
$ make
/bin/sh: 1: nvcc: not found
Makefile:92: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 127
$ /usr/local/cuda/bin/nvcc
nvcc fatal : No input files specified; use option --help for more information
$ /usr/local/cuda/bin/nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sun_Sep_30_21:09:22_CDT_2018
Cuda compilation tools, release 10.0, V10.0.166
# Add nvcc to PATH
$ export PATH=${PATH}:/usr/local/cuda/bin
$ make
gcc -Iinclude/ -Isrc/ -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DGPU -DCUDNN obj/captcha.o obj/lsd.o obj/super.o obj/art.o obj/tag.o obj/cifar.o obj/go.o obj/rnn.o obj/segmenter.o obj/regressor.o obj/classifier.o obj/coco.o obj/yolo.o obj/detector.o obj/nightmare.o obj/instance-segmenter.o obj/darknet.o libdarknet.a -o darknet -lm -pthread -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand -lcudnn -lstdc++ libdarknet.a
$ ls -l darknet
-rwxrwxr-x 1 user user 1792208 4月 19 22:51 darknet
$ ./darknet
usage: ./darknet <function>
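The `export PATH=...` line used above only affects the current shell and appends the directory again every time it is run. A minimal sketch of an idempotent version follows; `add_to_path` is a hypothetical helper name, and `/usr/local/cuda/bin` is the location where nvcc was found in the transcript above.

```shell
# add_to_path DIR: append DIR to PATH unless it is already listed.
add_to_path() {
    case ":${PATH}:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="${PATH}:$1" ;;
    esac
    export PATH
}

add_to_path /usr/local/cuda/bin

# Report on stderr if nvcc still cannot be found after the change.
command -v nvcc >/dev/null 2>&1 || echo "nvcc not found; check the CUDA install path" >&2
```

Putting the function and the call into a shell startup file such as `~/.bashrc` would make the setting persist across logins.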
$ wget https://pjreddie.com/media/files/yolov2.weights
--2019-04-19 22:56:42-- https://pjreddie.com/media/files/yolov2.weights
Resolving pjreddie.com (pjreddie.com)... 128.208.4.108
Connecting to pjreddie.com (pjreddie.com)|128.208.4.108|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 203934260 (194M) [application/octet-stream]
Saving to: ‘yolov2.weights’
yolov2.weights 100%[===================>] 194.49M 5.57MB/s in 34s
2019-04-19 22:57:18 (5.65 MB/s) - ‘yolov2.weights’ saved [203934260/203934260]
$ ./darknet detect cfg/yolov2.cfg yolov2.weights data/person.jpg
layer filters size input output
0 conv 32 3 x 3 / 1 608 x 608 x 3 -> 608 x 608 x 32 0.639 BFLOPs
1 max 2 x 2 / 2 608 x 608 x 32 -> 304 x 304 x 32
2 conv 64 3 x 3 / 1 304 x 304 x 32 -> 304 x 304 x 64 3.407 BFLOPs
3 max 2 x 2 / 2 304 x 304 x 64 -> 152 x 152 x 64
4 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128 3.407 BFLOPs
5 conv 64 1 x 1 / 1 152 x 152 x 128 -> 152 x 152 x 64 0.379 BFLOPs
6 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128 3.407 BFLOPs
7 max 2 x 2 / 2 152 x 152 x 128 -> 76 x 76 x 128
8 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
9 conv 128 1 x 1 / 1 76 x 76 x 256 -> 76 x 76 x 128 0.379 BFLOPs
10 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
11 max 2 x 2 / 2 76 x 76 x 256 -> 38 x 38 x 256
12 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
13 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256 0.379 BFLOPs
14 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
15 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256 0.379 BFLOPs
16 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
17 max 2 x 2 / 2 38 x 38 x 512 -> 19 x 19 x 512
18 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
19 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512 0.379 BFLOPs
20 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
21 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512 0.379 BFLOPs
22 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
23 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024 6.814 BFLOPs
24 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024 6.814 BFLOPs
25 route 16
26 conv 64 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 64 0.095 BFLOPs
27 reorg / 2 38 x 38 x 64 -> 19 x 19 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 19 x 19 x1280 -> 19 x 19 x1024 8.517 BFLOPs
30 conv 425 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 425 0.314 BFLOPs
31 detection
mask_scale: Using default '1.000000'
Loading weights from yolov2.weights...Done!
data/person.jpg: Predicted in 0.730307 seconds.
horse: 82%
dog: 86%
person: 86%
NVIDIA Jetson Nano (10W mode): 0.7 seconds
Raspberry Pi 3: 161 seconds
data/person.jpg: Predicted in 161.068979 seconds.
horse: 91%
dog: 85%
person: 85%
$ wget http://pjreddie.com/media/files/vgg-conv.weights
URL transformed to HTTPS due to an HSTS policy
--2019-04-19 22:59:37-- https://pjreddie.com/media/files/vgg-conv.weights
Resolving pjreddie.com (pjreddie.com)... 128.208.4.108
Connecting to pjreddie.com (pjreddie.com)|128.208.4.108|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 58858768 (56M) [application/octet-stream]
Saving to: ‘vgg-conv.weights’
vgg-conv.weights 100%[===================>] 56.13M 5.21MB/s in 14s
2019-04-19 22:59:52 (3.91 MB/s) - ‘vgg-conv.weights’ saved [58858768/58858768]
$ ./darknet nightmare cfg/vgg-conv.cfg vgg-conv.weights data/scream.jpg 10
policy: Using default 'constant'
max_batches: Using default '0'
layer filters size input output
0 conv 64 3 x 3 / 1 224 x 224 x 3 -> 224 x 224 x 64 0.173 BFLOPs
1 conv 64 3 x 3 / 1 224 x 224 x 64 -> 224 x 224 x 64 3.699 BFLOPs
2 max 2 x 2 / 2 224 x 224 x 64 -> 112 x 112 x 64
3 conv 128 3 x 3 / 1 112 x 112 x 64 -> 112 x 112 x 128 1.850 BFLOPs
4 conv 128 3 x 3 / 1 112 x 112 x 128 -> 112 x 112 x 128 3.699 BFLOPs
5 max 2 x 2 / 2 112 x 112 x 128 -> 56 x 56 x 128
6 conv 256 3 x 3 / 1 56 x 56 x 128 -> 56 x 56 x 256 1.850 BFLOPs
7 conv 256 3 x 3 / 1 56 x 56 x 256 -> 56 x 56 x 256 3.699 BFLOPs
8 conv 256 3 x 3 / 1 56 x 56 x 256 -> 56 x 56 x 256 3.699 BFLOPs
9 max 2 x 2 / 2 56 x 56 x 256 -> 28 x 28 x 256
10 conv 512 3 x 3 / 1 28 x 28 x 256 -> 28 x 28 x 512 1.850 BFLOPs
11 conv 512 3 x 3 / 1 28 x 28 x 512 -> 28 x 28 x 512 3.699 BFLOPs
12 conv 512 3 x 3 / 1 28 x 28 x 512 -> 28 x 28 x 512 3.699 BFLOPs
13 max 2 x 2 / 2 28 x 28 x 512 -> 14 x 14 x 512
14 conv 512 3 x 3 / 1 14 x 14 x 512 -> 14 x 14 x 512 0.925 BFLOPs
15 conv 512 3 x 3 / 1 14 x 14 x 512 -> 14 x 14 x 512 0.925 BFLOPs
16 conv 512 3 x 3 / 1 14 x 14 x 512 -> 14 x 14 x 512 0.925 BFLOPs
17 max 2 x 2 / 2 14 x 14 x 512 -> 7 x 7 x 512
Loading weights from vgg-conv.weights...Done!
Iteration: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, done
0 scream_vgg-conv_10_000000
# Disable GPU and cuDNN
$ sed -i -e "s/GPU=1/GPU=0/g" Makefile
$ sed -i -e "s/CUDNN=1/CUDNN=0/g" Makefile
gcc -Iinclude/ -Isrc/ -Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast obj/captcha.o obj/lsd.o obj/super.o obj/art.o obj/tag.o obj/cifar.o obj/go.o obj/rnn.o obj/segmenter.o obj/regressor.o obj/classifier.o obj/coco.o obj/yolo.o obj/detector.o obj/nightmare.o obj/instance-segmenter.o obj/darknet.o libdarknet.a -o darknet -lm -pthread libdarknet.a
real 0m46.887s
user 0m57.984s
sys 0m3.388s
$ ls -l darknet
-rwxrwxr-x 1 user user 573688 4月 20 14:38 darknet
$ ./darknet detect cfg/yolov2.cfg yolov2.weights data/person.jpg
layer filters size input output
0 conv 32 3 x 3 / 1 608 x 608 x 3 -> 608 x 608 x 32 0.639 BFLOPs
1 max 2 x 2 / 2 608 x 608 x 32 -> 304 x 304 x 32
2 conv 64 3 x 3 / 1 304 x 304 x 32 -> 304 x 304 x 64 3.407 BFLOPs
3 max 2 x 2 / 2 304 x 304 x 64 -> 152 x 152 x 64
4 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128 3.407 BFLOPs
5 conv 64 1 x 1 / 1 152 x 152 x 128 -> 152 x 152 x 64 0.379 BFLOPs
6 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128 3.407 BFLOPs
7 max 2 x 2 / 2 152 x 152 x 128 -> 76 x 76 x 128
8 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
9 conv 128 1 x 1 / 1 76 x 76 x 256 -> 76 x 76 x 128 0.379 BFLOPs
10 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
11 max 2 x 2 / 2 76 x 76 x 256 -> 38 x 38 x 256
12 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
13 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256 0.379 BFLOPs
14 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
15 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256 0.379 BFLOPs
16 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
17 max 2 x 2 / 2 38 x 38 x 512 -> 19 x 19 x 512
18 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
19 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512 0.379 BFLOPs
20 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
21 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512 0.379 BFLOPs
22 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
23 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024 6.814 BFLOPs
24 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024 6.814 BFLOPs
25 route 16
26 conv 64 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 64 0.095 BFLOPs
27 reorg / 2 38 x 38 x 64 -> 19 x 19 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 19 x 19 x1280 -> 19 x 19 x1024 8.517 BFLOPs
30 conv 425 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 425 0.314 BFLOPs
31 detection
mask_scale: Using default '1.000000'
Loading weights from yolov2.weights...Done!
data/person.jpg: Predicted in 36.355337 seconds.
horse: 82%
dog: 86%
person: 86%
# The NVIDIA Jetson Nano ran in 10W mode
# With GPU and cuDNN disabled: 36 seconds
# With GPU and cuDNN enabled: 0.7 seconds
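When repeating these runs, the inference time can be pulled out of darknet's output instead of being read off by eye. The sketch below assumes the `Predicted in ... seconds.` line format shown in the transcripts above; `predicted_seconds` is a hypothetical helper name.

```shell
# Read darknet output on stdin and print only the number of seconds
# from each "Predicted in N seconds." line.
predicted_seconds() {
    sed -n 's/.*Predicted in \([0-9.]*\) seconds.*/\1/p'
}

# Example with a line copied from the CPU-only run above.
echo "data/person.jpg: Predicted in 36.355337 seconds." | predicted_seconds
```

On a real run the function would sit at the end of a pipeline, for example `./darknet detect cfg/yolov2.cfg yolov2.weights data/person.jpg | predicted_seconds`, assuming the line is written to standard output.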
pi@raspberrypi:~ $ git clone https://github.com/pjreddie/darknet.git --depth 1
pi@raspberrypi:~ $ cd darknet
pi@raspberrypi:~/darknet $ git show
commit 61c9d02ec461e30d55762ec7669d6a1d3c356fb2
pi@raspberrypi:~/darknet $ ls -l darknet
-rwxr-xr-x 1 pi pi 476192 Apr 20 06:51 darknet
pi@raspberrypi:~/darknet $ wget https://pjreddie.com/media/files/yolov2.weights
yolov2.weights 100%[===================>] 194.49M 1.52MB/s in 85s
2019-04-20 06:53:29 (2.28 MB/s) - ‘yolov2.weights’ saved [203934260/203934260]
pi@raspberrypi:~/darknet $ ./darknet detect cfg/yolov2.cfg yolov2.weights data/person.jpg
layer filters size input output
0 conv 32 3 x 3 / 1 608 x 608 x 3 -> 608 x 608 x 32 0.639 BFLOPs
1 max 2 x 2 / 2 608 x 608 x 32 -> 304 x 304 x 32
2 conv 64 3 x 3 / 1 304 x 304 x 32 -> 304 x 304 x 64 3.407 BFLOPs
3 max 2 x 2 / 2 304 x 304 x 64 -> 152 x 152 x 64
4 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128 3.407 BFLOPs
5 conv 64 1 x 1 / 1 152 x 152 x 128 -> 152 x 152 x 64 0.379 BFLOPs
6 conv 128 3 x 3 / 1 152 x 152 x 64 -> 152 x 152 x 128 3.407 BFLOPs
7 max 2 x 2 / 2 152 x 152 x 128 -> 76 x 76 x 128
8 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
9 conv 128 1 x 1 / 1 76 x 76 x 256 -> 76 x 76 x 128 0.379 BFLOPs
10 conv 256 3 x 3 / 1 76 x 76 x 128 -> 76 x 76 x 256 3.407 BFLOPs
11 max 2 x 2 / 2 76 x 76 x 256 -> 38 x 38 x 256
12 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
13 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256 0.379 BFLOPs
14 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
15 conv 256 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 256 0.379 BFLOPs
16 conv 512 3 x 3 / 1 38 x 38 x 256 -> 38 x 38 x 512 3.407 BFLOPs
17 max 2 x 2 / 2 38 x 38 x 512 -> 19 x 19 x 512
18 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
19 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512 0.379 BFLOPs
20 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
21 conv 512 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 512 0.379 BFLOPs
22 conv 1024 3 x 3 / 1 19 x 19 x 512 -> 19 x 19 x1024 3.407 BFLOPs
23 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024 6.814 BFLOPs
24 conv 1024 3 x 3 / 1 19 x 19 x1024 -> 19 x 19 x1024 6.814 BFLOPs
25 route 16
26 conv 64 1 x 1 / 1 38 x 38 x 512 -> 38 x 38 x 64 0.095 BFLOPs
27 reorg / 2 38 x 38 x 64 -> 19 x 19 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 19 x 19 x1280 -> 19 x 19 x1024 8.517 BFLOPs
30 conv 425 1 x 1 / 1 19 x 19 x1024 -> 19 x 19 x 425 0.314 BFLOPs
31 detection
mask_scale: Using default '1.000000'
Loading weights from yolov2.weights...Done!
data/person.jpg: Predicted in 292.377208 seconds.
horse: 82%
dog: 86%
person: 86%
| Command | NVIDIA Jetson Nano 10W mode | Raspberry Pi 3 Model B+ | Performance comparison |
| $ ./cpuminer --benchmark (scrypt algorithm) | Total: 6.64 kH/s | Total: 1.21 kH/s | 5.48x |
| $ ./cpuminer --benchmark -a sha256d (sha256d algorithm) | Total: 3389 kH/s | Total: 1142 kH/s | 2.96x |
$ cd
$ git clone https://github.com/tpruvot/cpuminer-multi
$ cd cpuminer-multi/
$ ./build.sh
configure: error: OpenSSL crypto library required
make: *** No targets specified and no makefile found. Stop.
$ sudo apt-get -y install automake autoconf pkg-config libcurl4-openssl-dev libjansson-dev libssl-dev libgmp-dev make g++
$ ./build.sh
/home/user/cpuminer-multi/algo/cryptonight.c:327: undefined reference to `fast_aesb_pseudo_round_mut'
/tmp/ccjffwmY.ltrans10.ltrans.o:/home/user/cpuminer-multi/algo/cryptonight.c:329: more undefined references to `fast_aesb_pseudo_round_mut' follow
collect2: error: ld returned 1 exit status
Makefile:941: recipe for target 'cpuminer' failed
make[2]: *** [cpuminer] Error 1
make[2]: Leaving directory '/home/user/cpuminer-multi'
Makefile:2766: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/user/cpuminer-multi'
Makefile:585: recipe for target 'all' failed
make: *** [all] Error 2
strip: 'cpuminer': No such file
$ ./build-linux-arm.sh
make[2]: Leaving directory '/home/user/cpuminer-multi'
make[1]: Leaving directory '/home/user/cpuminer-multi'
=> done.
$ ls -l cpuminer
-rwxrwxr-x 1 user user 1412352 4月 19 23:31 cpuminer
Stripping...
=> done.
$ ./cpuminer -help
** cpuminer-multi 1.3.6 by tpruvot@github **
$ ./cpuminer --benchmark
** cpuminer-multi 1.3.6 by tpruvot@github **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd (tpruvot)
[2019-04-19 23:42:53] 4 miner threads started, using 'scrypt' algorithm.
[2019-04-19 23:42:54] CPU #0: 1.66 kH/s
[2019-04-19 23:42:54] CPU #1: 1.66 kH/s
[2019-04-19 23:42:54] CPU #2: 1.66 kH/s
[2019-04-19 23:42:54] CPU #3: 1.66 kH/s
[2019-04-19 23:42:54] Total: 6.64 kH/s
[2019-04-19 23:42:58] Total: 6.65 kH/s
[2019-04-19 23:43:03] CPU #0: 1.66 kH/s
[2019-04-19 23:43:03] CPU #1: 1.66 kH/s
[2019-04-19 23:43:03] CPU #2: 1.66 kH/s
[2019-04-19 23:43:03] CPU #3: 1.66 kH/s
[2019-04-19 23:43:03] Total: 6.65 kH/s
[2019-04-19 23:43:08] Total: 6.65 kH/s
[2019-04-19 23:43:13] CPU #2: 1.66 kH/s
[2019-04-19 23:43:13] CPU #1: 1.66 kH/s
[2019-04-19 23:43:13] CPU #3: 1.66 kH/s
[2019-04-19 23:43:13] Total: 6.65 kH/s
# The NVIDIA Jetson Nano ran in 10W mode
# Raspberry Pi 3:
[2017-10-14 08:28:00] thread 2: 4096 hashes, 0.95 khash/s
[2017-10-14 08:28:00] thread 0: 4096 hashes, 0.95 khash/s
[2017-10-14 08:28:00] thread 1: 4096 hashes, 0.93 khash/s
[2017-10-14 08:28:05] thread 0: 4733 hashes, 0.96 khash/s
[2017-10-14 08:28:05] thread 2: 4753 hashes, 0.95 khash/s
[2017-10-14 08:28:05] Total: 2.84 khash/s
$ ./cpuminer --benchmark -a sha256d
** cpuminer-multi 1.3.6 by tpruvot@github **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd (tpruvot)
[2019-04-19 23:45:16] 4 miner threads started, using 'sha256d' algorithm.
[2019-04-19 23:45:19] CPU #0: 847.45 kH/s
[2019-04-19 23:45:19] CPU #1: 846.30 kH/s
[2019-04-19 23:45:19] CPU #3: 847.65 kH/s
[2019-04-19 23:45:19] CPU #2: 847.41 kH/s
[2019-04-19 23:45:21] Total: 3389 kH/s
[2019-04-19 23:45:26] CPU #0: 846.96 kH/s
[2019-04-19 23:45:26] CPU #1: 847.05 kH/s
[2019-04-19 23:45:26] CPU #3: 847.71 kH/s
[2019-04-19 23:45:26] Total: 3389 kH/s
[2019-04-19 23:45:26] CPU #2: 847.47 kH/s
[2019-04-19 23:45:31] Total: 3389 kH/s
[2019-04-19 23:45:36] CPU #0: 846.90 kH/s
[2019-04-19 23:45:36] CPU #1: 847.10 kH/s
[2019-04-19 23:45:36] CPU #3: 847.72 kH/s
[2019-04-19 23:45:36] Total: 3389 kH/s
# The NVIDIA Jetson Nano ran in 10W mode
# Raspberry Pi 3:
[2017-10-14 08:30:18] thread 0: 3385205 hashes, 672.18 khash/s
[2017-10-14 08:30:23] thread 1: 3399853 hashes, 679.74 khash/s
[2017-10-14 08:30:23] thread 2: 3402732 hashes, 680.62 khash/s
[2017-10-14 08:30:23] Total: 2033 khash/s
pi@raspberrypi:~ $ cd
pi@raspberrypi:~ $ sudo apt-get -y install automake autoconf pkg-config libcurl4-openssl-dev libjansson-dev libssl-dev libgmp-dev make g++
pi@raspberrypi:~ $ git clone https://github.com/tpruvot/cpuminer-multi
pi@raspberrypi:~ $ cd cpuminer-multi/
pi@raspberrypi:~/cpuminer-multi $ git show
commit 39fff9e5b91690ad21a89e1ffd2a5d7cdc444a1a
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Sat Mar 16 17:16:41 2019 +0100
pi@raspberrypi:~/cpuminer-multi $ ./build-linux-arm.sh
make[2]: Leaving directory '/home/pi/cpuminer-multi'
make[1]: Leaving directory '/home/pi/cpuminer-multi'
=> done.
$ ls -l cpuminer
-rwxr-xr-x 1 pi pi 1892376 Apr 20 07:44 cpuminer
Stripping...
=> done.
pi@raspberrypi:~/cpuminer-multi $ ./cpuminer -help
** cpuminer-multi 1.3.6 by tpruvot@github **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd (tpruvot)
Usage: cpuminer-multi [OPTIONS]
Options:
-a, --algo=ALGO specify the algorithm to use
allium Garlicoin double lyra2
axiom Shabal-256 MemoHash
bitcore Timetravel with 10 algos
blake Blake-256 14-rounds (SFR)
blakecoin Blake-256 single sha256 merkle
blake2b Blake2-B (512)
blake2s Blake2-S (256)
bmw BMW 256
c11/flax C11
cryptolight Cryptonight-light
cryptonight Monero
decred Blake-256 14-rounds 180 bytes
dmd-gr Diamond-Groestl
drop Dropcoin
fresh Fresh
geek GeekCash
groestl GroestlCoin
heavy Heavy
jha JHA
keccak Keccak (Old and deprecated)
keccakc Keccak (CreativeCoin)
luffa Luffa
lyra2re Lyra2RE
lyra2rev2 Lyra2REv2
lyra2v3 Lyra2REv3 (Vertcoin)
myr-gr Myriad-Groestl
neoscrypt NeoScrypt(128, 2, 1)
nist5 Nist5
pluck Pluck:128 (Supcoin)
pentablake Pentablake
phi LUX initial algo
phi2 LUX newer algo
quark Quark
qubit Qubit
rainforest RainForest (256)
scrypt scrypt(1024, 1, 1) (default)
scrypt:N scrypt(N, 1, 1)
scrypt-jane:N (with N factor from 4 to 30)
shavite3 Shavite3
sha256d SHA-256d
sia Blake2-B
sib X11 + gost (SibCoin)
skein Skein+Sha (Skeincoin)
skein2 Double Skein (Woodcoin)
sonoa A series of 97 hashes from x17
s3 S3
timetravel Timetravel (Machinecoin)
vanilla Blake-256 8-rounds
x11evo Permuted x11
x11 X11
x12 X12
x13 X13
x14 X14
x15 X15
x16r X16R (Raven)
x16s X16S (Pigeon)
x17 X17
x20r X20R
xevan Xevan (BitSend)
yescrypt Yescrypt
zr5 ZR5
-o, --url=URL URL of mining server
-O, --userpass=U:P username:password pair for mining server
-u, --user=USERNAME username for mining server
-p, --pass=PASSWORD password for mining server
--cert=FILE certificate for mining server using SSL
-x, --proxy=[PROTOCOL://]HOST[:PORT] connect through a proxy
-t, --threads=N number of miner threads (default: number of processors)
-r, --retries=N number of times to retry if a network call fails
(default: retry indefinitely)
-R, --retry-pause=N time to pause between retries, in seconds (default: 30)
--time-limit=N maximum time [s] to mine before exiting the program.
-T, --timeout=N timeout for long poll and stratum (default: 300 seconds)
-s, --scantime=N upper bound on time spent scanning current work when
long polling is unavailable, in seconds (default: 5)
--randomize Randomize scan range start to reduce duplicates
-f, --diff-factor Divide req. difficulty by this factor (std is 1.0)
-m, --diff-multiplier Multiply difficulty by this factor (std is 1.0)
-n, --nfactor neoscrypt N-Factor
--coinbase-addr=ADDR payout address for solo mining
--coinbase-sig=TEXT data to insert in the coinbase when possible
--max-log-rate limit per-core hashrate logs (default: 5s)
--no-longpoll disable long polling support
--no-getwork disable getwork support
--no-gbt disable getblocktemplate support
--no-stratum disable X-Stratum support
--no-extranonce disable Stratum extranonce support
--no-redirect ignore requests to change the URL of the mining server
-q, --quiet disable per-thread hashmeter output
--no-color disable colored output
-D, --debug enable debug output
-P, --protocol-dump verbose dump of protocol-level activities
--hide-diff Hide submitted block and net difficulty
-S, --syslog use system log for output messages
-B, --background run the miner in the background
--benchmark run in offline benchmark mode
--cputest debug hashes from cpu algorithms
--cpu-affinity set process affinity to cpu core(s), mask 0x3 for cores 0 and 1
--cpu-priority set process priority (default: 0 idle, 2 normal to 5 highest)
-b, --api-bind IP/Port for the miner API (default: 127.0.0.1:4048)
--api-remote Allow remote control
--max-temp=N Only mine if cpu temp is less than specified value (linux)
--max-rate=N[KMG] Only mine if net hashrate is less than specified value
--max-diff=N Only mine if net difficulty is less than specified value
-c, --config=FILE load a JSON-format configuration file
-V, --version display version information and exit
-h, --help display this help text and exit
pi@raspberrypi:~/cpuminer-multi $ ./cpuminer --benchmark
** cpuminer-multi 1.3.6 by tpruvot@github **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd (tpruvot)
[2019-04-20 07:46:26] 4 miner threads started, using 'scrypt' algorithm.
[2019-04-20 07:46:28] CPU #1: 0.31 kH/s
[2019-04-20 07:46:28] CPU #2: 0.31 kH/s
[2019-04-20 07:46:28] CPU #3: 0.31 kH/s
[2019-04-20 07:46:28] CPU #0: 0.30 kH/s
[2019-04-20 07:46:31] Total: 1.23 kH/s
[2019-04-20 07:46:36] CPU #3: 0.31 kH/s
[2019-04-20 07:46:36] Total: 1.22 kH/s
[2019-04-20 07:46:36] CPU #2: 0.30 kH/s
[2019-04-20 07:46:36] CPU #1: 0.30 kH/s
[2019-04-20 07:46:36] CPU #0: 0.29 kH/s
[2019-04-20 07:46:41] Total: 1.23 kH/s
[2019-04-20 07:46:46] CPU #1: 0.30 kH/s
[2019-04-20 07:46:46] CPU #2: 0.30 kH/s
[2019-04-20 07:46:46] CPU #3: 0.31 kH/s
[2019-04-20 07:46:46] Total: 1.21 kH/s
pi@raspberrypi:~/cpuminer-multi $ ./cpuminer --benchmark -a sha256d
** cpuminer-multi 1.3.6 by tpruvot@github **
BTC donation address: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd (tpruvot)
[2019-04-20 07:47:08] 4 miner threads started, using 'sha256d' algorithm.
[2019-04-20 07:47:16] CPU #2: 285.96 kH/s
[2019-04-20 07:47:16] CPU #3: 285.28 kH/s
[2019-04-20 07:47:16] CPU #1: 283.61 kH/s
[2019-04-20 07:47:16] CPU #0: 275.14 kH/s
[2019-04-20 07:47:21] Total: 1124 kH/s
[2019-04-20 07:47:25] CPU #2: 290.39 kH/s
[2019-04-20 07:47:25] CPU #3: 290.31 kH/s
[2019-04-20 07:47:25] Total: 1133 kH/s
[2019-04-20 07:47:26] CPU #1: 287.98 kH/s
[2019-04-20 07:47:26] CPU #0: 279.75 kH/s
[2019-04-20 07:47:27] Total: 1118 kH/s
[2019-04-20 07:47:30] Total: 1137 kH/s
[2019-04-20 07:47:31] CPU #2: 275.05 kH/s
[2019-04-20 07:47:31] CPU #3: 275.09 kH/s
[2019-04-20 07:47:31] Total: 1102 kH/s
[2019-04-20 07:47:36] CPU #1: 280.75 kH/s
[2019-04-20 07:47:36] CPU #0: 279.76 kH/s
[2019-04-20 07:47:36] Total: 1142 kH/s
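The Raspberry Pi figures in the comparison table (1.21 kH/s and 1142 kH/s) match the last `Total:` line of each Raspberry Pi run above. A minimal sketch for picking that line out of a saved log follows; `last_total` is a hypothetical helper name, and the sample lines are copied from the sha256d run.

```shell
# Read a cpuminer log on stdin and print the value and unit
# from the last line that contains "Total:".
last_total() {
    awk '/Total:/ { value = $(NF-1); unit = $NF }
         END { if (value != "") print value, unit }'
}

printf '%s\n' \
    "[2019-04-20 07:47:31] Total: 1102 kH/s" \
    "[2019-04-20 07:47:36] CPU #0: 279.76 kH/s" \
    "[2019-04-20 07:47:36] Total: 1142 kH/s" | last_total
```

For the sample lines this prints `1142 kH/s`. The same function also works on the older `khash/s` log format, since it only looks at the last two fields of the line.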
| Test | Jetson Nano | Raspberry Pi 3 Model B+ |
| Test #1 | 44.09 sec | 59.82 sec |
| Test #2 | 259.91 sec | 480.32 sec |
| Test #3 | 975.74 sec | Failed |
| Test #4 | 1636.24 sec | 4407.35 sec |
| Test #5 | 6.40 sec | 15.74 sec |
sudo apt-get -y install ninja-build
sudo pip install --upgrade git+https://github.com/Maratyszcza/PeachPy
sudo: pip: command not found
sudo apt-get -y install python-pip
user@user-desktop:~$ python -V
# Python 2.7.15rc1
sudo pip install --upgrade git+https://github.com/Maratyszcza/PeachPy
sudo pip install --upgrade git+https://github.com/Maratyszcza/confu
cd
git clone https://github.com/Maratyszcza/NNPACK.git
cd NNPACK
confu setup
python configure.py
ninja
[87/198] LINK bin/convolution-inference-bench
FAILED: /home/user/NNPACK/bin/convolution-inference-bench
/home/user/NNPACK/deps/googlebenchmark/src/benchmark_register.cc:227: undefined reference to `benchmark::BenchmarkName::str[abi:cxx11]() const'
/home/user/NNPACK/lib/libgooglebenchmark.a(json_reporter.cc.o): In function `benchmark::JSONReporter::PrintRunData(benchmark::BenchmarkReporter::Run const&)':
/home/user/NNPACK/deps/googlebenchmark/src/json_reporter.cc:192: undefined reference to `benchmark::BenchmarkName::str[abi:cxx11]() const'
/home/user/NNPACK/lib/libgooglebenchmark.a(reporter.cc.o):/home/user/NNPACK/deps/googlebenchmark/src/reporter.cc:86: more undefined references to `benchmark::BenchmarkName::str[abi:cxx11]() const' follow
collect2: error: ld returned 1 exit status
[90/198] CXX bench/hxgemm.cc
ninja: build stopped: subcommand failed.
# Building via python configure.py fails because of a minor change on the Google Benchmark side. Workaround:
Maratyszcza commented Apr 28, 2019:
confu recipe for Google Benchmark needs an update for recent Google Benchmark changes (mostly, add new files). One option is to fix the recipe here. Another option is to build NNPACK with CMake instead of Ninja.
Build NNPACK with CMake instead of Ninja:
sudo apt-get -y install ninja-build
cd
git clone https://github.com/Maratyszcza/NNPACK.git
cd NNPACK
mkdir build
cd build
cmake -G Ninja ..
ninja
# NNPACK benchmark (functional test)
ninja test
user@user-desktop:~/NNPACK/build$ time ninja test
[0/1] Running tests...
Test project /home/user/NNPACK/build
Start 1: convolution-inference-smoketest
1/34 Test #1: convolution-inference-smoketest ......... Passed 44.09 sec
Start 2: convolution-inference-alexnet
2/34 Test #2: convolution-inference-alexnet ........... Passed 259.91 sec
Start 3: convolution-inference-overfeat
3/34 Test #3: convolution-inference-overfeat .......... Passed 975.74 sec
Start 4: convolution-inference-vgg
4/34 Test #4: convolution-inference-vgg ............... Passed 1636.24 sec
Start 5: convolution-output-smoketest
5/34 Test #5: convolution-output-smoketest ............ Passed 6.40 sec
Start 6: convolution-output-alexnet
6/34 Test #6: convolution-output-alexnet .............. Passed 3258.54 sec
Start 7: convolution-output-overfeat
7/34 Test #7: convolution-output-overfeat ............. Passed 15647.77 sec
Start 8: convolution-output-vgg
8/34 Test #8: convolution-output-vgg .................. Passed 24916.11 sec
Start 9: convolution-input-gradient-smoketest
9/34 Test #9: convolution-input-gradient-smoketest .... Passed 2.85 sec
Start 10: convolution-input-gradient-alexnet
10/34 Test #10: convolution-input-gradient-alexnet ...... Passed 1644.86 sec
Start 11: convolution-input-gradient-overfeat
11/34 Test #11: convolution-input-gradient-overfeat .....***Failed 10810.61 sec
[0/1] Running tests...
Test project /home/pi/NNPACK/build
Start 1: convolution-inference-smoketest
1/34 Test #1: convolution-inference-smoketest ......... Passed 59.82 sec
Start 2: convolution-inference-alexnet
2/34 Test #2: convolution-inference-alexnet ........... Passed 480.32 sec
Start 3: convolution-inference-overfeat
3/34 Test #3: convolution-inference-overfeat ..........***Failed 2103.42 sec
Start 4: convolution-inference-vgg
4/34 Test #4: convolution-inference-vgg ............... Passed 4407.35 sec
Start 5: convolution-output-smoketest
5/34 Test #5: convolution-output-smoketest ............ Passed 15.74 sec
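The per-test table above was assembled from these `ninja test` logs. A rough sketch of extracting test number, name, status and time from such a log follows; `ctest_times` is a hypothetical helper name, and it assumes the line layout shown in the two transcripts above.

```shell
# Read ctest output on stdin and print "#N name status seconds"
# for every result line; "Start N: ..." lines are skipped.
ctest_times() {
    awk '/Test +#[0-9]+:/ {
        # Failed results are marked with "***Failed" glued to the dot leader.
        status = (index($0, "***Failed") > 0) ? "Failed" : "Passed"
        seconds = $(NF-1)
        number = $3
        sub(/:$/, "", number)        # "#3:" -> "#3"
        print number, $4, status, seconds
    }'
}

printf '%s\n' \
    "      Start  3: convolution-inference-overfeat" \
    " 3/34 Test  #3: convolution-inference-overfeat ..........***Failed 2103.42 sec" \
    " 4/34 Test  #4: convolution-inference-vgg ...............   Passed 4407.35 sec" | ctest_times
```

For the sample lines this prints one row per test, with `Failed 2103.42` for test #3 and `Passed 4407.35` for test #4.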