2019/05/08

How to build NNPACK on the NVIDIA Jetson Nano, with a benchmark comparison against the Raspberry Pi 3B+

(A trial build of NNPACK on the NVIDIA Jetson Nano: build only, plus a speed comparison with the Raspberry Pi)

Tags: [Raspberry Pi], [Hobby Electronics], [Deep Learning]





● How to build NNPACK on the NVIDIA Jetson Nano

Maratyszcza/NNPACK
 Acceleration package for neural networks on multi-core CPUs
 NNPACK is an acceleration package for neural network computations. NNPACK aims to provide high-performance implementations of convnet layers for multi-core CPUs.

 After building, I run NNPACK's ninja test suite to keep the NVIDIA Jetson Nano under full load for a long time, purely for my own satisfaction.


● Version of the Ubuntu OS (kernel) on the NVIDIA Jetson Nano used here

user@user-desktop:~$ uname -a
Linux user-desktop 4.9.140-tegra #1 SMP PREEMPT Wed Mar 13 00:32:22 PDT 2019 aarch64 aarch64 aarch64 GNU/Linux
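
The `uname -a` output above reports `aarch64`, that is, a 64-bit ARM userland. As a minimal sketch for checking the same facts on another board before building, the script below prints the machine architecture and the number of online CPU cores. The helper names `describe_arch` and `cpu_count` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` value to a short description.
describe_arch() {
    case "$1" in
        aarch64) echo "64-bit ARM" ;;
        armv7l)  echo "32-bit ARM" ;;
        x86_64)  echo "x86-64" ;;
        *)       echo "unrecognized: $1" ;;
    esac
}

# Hypothetical helper: number of online CPU cores, or 1 if nproc is unavailable.
cpu_count() {
    nproc 2>/dev/null || echo 1
}

machine=$(uname -m)
echo "arch : $machine ($(describe_arch "$machine"))"
echo "cores: $(cpu_count)"
```

The core count is the value one would typically pass to `ninja -j` when building.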


● Building NNPACK on the NVIDIA Jetson Nano from the Git source code

# The usual first step: refresh the package lists with sudo apt-get update
sudo apt-get update

# Development builds

# Install ninja build system
sudo apt-get -y install ninja-build

sudo apt-get -y install git cmake

# Install pip; without it the pip commands below fail with "sudo: pip: command not found"
sudo apt-get -y install python-pip

# Install PeachPy assembler and confu configuration system
sudo pip install --upgrade git+https://github.com/Maratyszcza/PeachPy
sudo pip install --upgrade git+https://github.com/Maratyszcza/confu

# Then clone NNPACK, install dependencies, configure, and build
cd
git clone https://github.com/Maratyszcza/NNPACK.git
cd NNPACK
confu setup
python configure.py

# Build with 4 parallel jobs (one per core) to shorten the build time
time ninja -j4
# real    1m23.922s
# user    2m13.404s
# sys     0m8.956s

# ninja smoketest (finishes in about 90 seconds)
time ninja smoketest
# [==========] 160 tests from 9 test cases ran. (3310 ms total)
# [  PASSED  ] 160 tests.
# real    1m31.677s
# user    3m37.924s
# sys     0m9.012s

# ninja test (takes an extremely long time)
time ninja test
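
Because `ninja test` runs for a very long time, it can help to save the output to a file and summarize it afterwards rather than scrolling through the terminal. The following is a minimal sketch that assumes the gtest line formats seen in this post (`[       OK ]` for a passing test, and `[  FAILED  ] ... (N ms)` for a failing one). `summarize_gtest_log` is a hypothetical helper name, not part of NNPACK, ninja, or gtest.

```shell
#!/bin/sh
# Hypothetical helper: count passing and failing tests in a saved gtest log.
# Usage: summarize_gtest_log <logfile>
summarize_gtest_log() {
    log="$1"
    # Each passing test prints exactly one "[       OK ]" line.
    ok=$(grep -c -F '[       OK ]' "$log")
    # Failing tests are assumed to print "[  FAILED  ] Name (N ms)";
    # the end-of-run summary repeats the name without a timing, so only
    # lines ending in " ms)" are counted to avoid double counting.
    failed=$(grep -F '[  FAILED  ]' "$log" | grep -c ' ms)$')
    echo "ok=$ok failed=$failed"
}
```

A possible way to use it: run `ninja test 2>&1 | tee test.log`, then call `summarize_gtest_log test.log` once the run has finished.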


● NNPACK ninja smoketest

user@user-desktop:~/NNPACK$ time ninja smoketest
[31/57] RUN fourier-test
[==========] Running 16 tests from 16 test cases.
[----------] Global test environment set-up.
[----------] 1 test from FFT8_WITHIN_ROWS
[ RUN      ] FFT8_WITHIN_ROWS.match_reference
[       OK ] FFT8_WITHIN_ROWS.match_reference (1 ms)
[----------] 1 test from FFT8_WITHIN_ROWS (1 ms total)

[----------] 1 test from FFT16_WITHIN_ROWS
[ RUN      ] FFT16_WITHIN_ROWS.match_reference
[       OK ] FFT16_WITHIN_ROWS.match_reference (2 ms)
[----------] 1 test from FFT16_WITHIN_ROWS (2 ms total)

[----------] 1 test from IFFT8_WITHIN_ROWS
[ RUN      ] IFFT8_WITHIN_ROWS.match_reference
[       OK ] IFFT8_WITHIN_ROWS.match_reference (1 ms)
[----------] 1 test from IFFT8_WITHIN_ROWS (1 ms total)

[----------] 1 test from IFFT16_WITHIN_ROWS
[ RUN      ] IFFT16_WITHIN_ROWS.match_reference
[       OK ] IFFT16_WITHIN_ROWS.match_reference (1 ms)
[----------] 1 test from IFFT16_WITHIN_ROWS (3 ms total)

[----------] 1 test from FFT8_DUAL_REAL_WITHIN_ROWS
[ RUN      ] FFT8_DUAL_REAL_WITHIN_ROWS.match_reference
[       OK ] FFT8_DUAL_REAL_WITHIN_ROWS.match_reference (1 ms)
[----------] 1 test from FFT8_DUAL_REAL_WITHIN_ROWS (1 ms total)

[----------] 1 test from FFT16_DUAL_REAL_WITHIN_ROWS
[ RUN      ] FFT16_DUAL_REAL_WITHIN_ROWS.match_reference
[       OK ] FFT16_DUAL_REAL_WITHIN_ROWS.match_reference (2 ms)
[----------] 1 test from FFT16_DUAL_REAL_WITHIN_ROWS (2 ms total)

[----------] 1 test from IFFT8_DUAL_REAL_WITHIN_ROWS
[ RUN      ] IFFT8_DUAL_REAL_WITHIN_ROWS.match_reference
[       OK ] IFFT8_DUAL_REAL_WITHIN_ROWS.match_reference (1 ms)
[----------] 1 test from IFFT8_DUAL_REAL_WITHIN_ROWS (1 ms total)

[----------] 1 test from IFFT16_DUAL_REAL_WITHIN_ROWS
[ RUN      ] IFFT16_DUAL_REAL_WITHIN_ROWS.match_reference
[       OK ] IFFT16_DUAL_REAL_WITHIN_ROWS.match_reference (2 ms)
[----------] 1 test from IFFT16_DUAL_REAL_WITHIN_ROWS (2 ms total)

[----------] 1 test from FFT4_ACROSS_ROWS
[ RUN      ] FFT4_ACROSS_ROWS.match_reference
[       OK ] FFT4_ACROSS_ROWS.match_reference (1 ms)
[----------] 1 test from FFT4_ACROSS_ROWS (2 ms total)

[----------] 1 test from FFT8_ACROSS_ROWS
[ RUN      ] FFT8_ACROSS_ROWS.match_reference
[       OK ] FFT8_ACROSS_ROWS.match_reference (4 ms)
[----------] 1 test from FFT8_ACROSS_ROWS (4 ms total)

[----------] 1 test from IFFT4_ACROSS_ROWS
[ RUN      ] IFFT4_ACROSS_ROWS.match_reference
[       OK ] IFFT4_ACROSS_ROWS.match_reference (1 ms)
[----------] 1 test from IFFT4_ACROSS_ROWS (1 ms total)

[----------] 1 test from IFFT8_ACROSS_ROWS
[ RUN      ] IFFT8_ACROSS_ROWS.match_reference
[       OK ] IFFT8_ACROSS_ROWS.match_reference (4 ms)
[----------] 1 test from IFFT8_ACROSS_ROWS (4 ms total)

[----------] 1 test from FFT8_REAL_ACROSS_ROWS
[ RUN      ] FFT8_REAL_ACROSS_ROWS.match_reference
[       OK ] FFT8_REAL_ACROSS_ROWS.match_reference (2 ms)
[----------] 1 test from FFT8_REAL_ACROSS_ROWS (2 ms total)

[----------] 1 test from FFT16_REAL_ACROSS_ROWS
[ RUN      ] FFT16_REAL_ACROSS_ROWS.match_reference
[       OK ] FFT16_REAL_ACROSS_ROWS.match_reference (4 ms)
[----------] 1 test from FFT16_REAL_ACROSS_ROWS (6 ms total)

[----------] 1 test from IFFT8_REAL_ACROSS_ROWS
[ RUN      ] IFFT8_REAL_ACROSS_ROWS.match_reference
[       OK ] IFFT8_REAL_ACROSS_ROWS.match_reference (2 ms)
[----------] 1 test from IFFT8_REAL_ACROSS_ROWS (2 ms total)

[----------] 1 test from IFFT16_REAL_ACROSS_ROWS
[ RUN      ] IFFT16_REAL_ACROSS_ROWS.match_reference
[       OK ] IFFT16_REAL_ACROSS_ROWS.match_reference (3 ms)
[----------] 1 test from IFFT16_REAL_ACROSS_ROWS (5 ms total)

[----------] Global test environment tear-down
[==========] 16 tests from 16 test cases ran. (49 ms total)
[  PASSED  ] 16 tests.
[36/57] RUN convolution-input-gradient-smoketest
[==========] Running 26 tests from 3 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from FT8x8
[ RUN      ] FT8x8.single_tile
[       OK ] FT8x8.single_tile (1 ms)
[ RUN      ] FT8x8.input_subtile
[       OK ] FT8x8.input_subtile (0 ms)
[ RUN      ] FT8x8.multi_tile
[       OK ] FT8x8.multi_tile (2 ms)
[ RUN      ] FT8x8.implicit_padding
[       OK ] FT8x8.implicit_padding (65 ms)
[ RUN      ] FT8x8.small_batch
[       OK ] FT8x8.small_batch (7 ms)
[ RUN      ] FT8x8.few_input_channels
[       OK ] FT8x8.few_input_channels (7 ms)
[ RUN      ] FT8x8.few_output_channels
[       OK ] FT8x8.few_output_channels (7 ms)
[ RUN      ] FT8x8.non_square_kernel
[       OK ] FT8x8.non_square_kernel (1 ms)
[ RUN      ] FT8x8.non_square_image
[       OK ] FT8x8.non_square_image (2 ms)
[----------] 9 tests from FT8x8 (93 ms total)

[----------] 9 tests from FT16x16
[ RUN      ] FT16x16.single_tile
[       OK ] FT16x16.single_tile (3 ms)
[ RUN      ] FT16x16.input_subtile
[       OK ] FT16x16.input_subtile (2 ms)
[ RUN      ] FT16x16.multi_tile
[       OK ] FT16x16.multi_tile (10 ms)
[ RUN      ] FT16x16.implicit_padding
[       OK ] FT16x16.implicit_padding (227 ms)
[ RUN      ] FT16x16.small_batch
[       OK ] FT16x16.small_batch (33 ms)
[ RUN      ] FT16x16.few_input_channels
[       OK ] FT16x16.few_input_channels (30 ms)
[ RUN      ] FT16x16.few_output_channels
[       OK ] FT16x16.few_output_channels (31 ms)
[ RUN      ] FT16x16.non_square_kernel
[       OK ] FT16x16.non_square_kernel (5 ms)
[ RUN      ] FT16x16.non_square_image
[       OK ] FT16x16.non_square_image (6 ms)
[----------] 9 tests from FT16x16 (350 ms total)

[----------] 8 tests from WT8x8
[ RUN      ] WT8x8.single_tile
[       OK ] WT8x8.single_tile (2 ms)
[ RUN      ] WT8x8.input_subtile
[       OK ] WT8x8.input_subtile (1 ms)
[ RUN      ] WT8x8.multi_tile
[       OK ] WT8x8.multi_tile (4 ms)
[ RUN      ] WT8x8.implicit_padding
[       OK ] WT8x8.implicit_padding (22 ms)
[ RUN      ] WT8x8.small_batch
[       OK ] WT8x8.small_batch (15 ms)
[ RUN      ] WT8x8.few_input_channels
[       OK ] WT8x8.few_input_channels (13 ms)
[ RUN      ] WT8x8.few_output_channels
[       OK ] WT8x8.few_output_channels (16 ms)
[ RUN      ] WT8x8.non_square_image
[       OK ] WT8x8.non_square_image (1 ms)
[----------] 8 tests from WT8x8 (78 ms total)

[----------] Global test environment tear-down
[==========] 26 tests from 3 test cases ran. (522 ms total)
[  PASSED  ] 26 tests.
[41/57] RUN winograd-test
[==========] Running 6 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from F6k3
[ RUN      ] F6k3.input
[       OK ] F6k3.input (14 ms)
[ RUN      ] F6k3.kernel
[       OK ] F6k3.kernel (1 ms)
[ RUN      ] F6k3.output
[       OK ] F6k3.output (2 ms)
[----------] 3 tests from F6k3 (19 ms total)

[----------] 3 tests from F6x6_3x3
[ RUN      ] F6x6_3x3.input
[       OK ] F6x6_3x3.input (28 ms)
[ RUN      ] F6x6_3x3.kernel
[       OK ] F6x6_3x3.kernel (13 ms)
[ RUN      ] F6x6_3x3.output
[       OK ] F6x6_3x3.output (18 ms)
[----------] 3 tests from F6x6_3x3 (73 ms total)

[----------] Global test environment tear-down
[==========] 6 tests from 2 test cases ran. (94 ms total)
[  PASSED  ] 6 tests.
[42/57] RUN sgemm-test
Running main() from gtest_main.cc
[==========] Running 6 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from FAST6x8_NEON
[ RUN      ] FAST6x8_NEON.kc1
[       OK ] FAST6x8_NEON.kc1 (4 ms)
[ RUN      ] FAST6x8_NEON.kc2
[       OK ] FAST6x8_NEON.kc2 (5 ms)
[ RUN      ] FAST6x8_NEON.kc10
[       OK ] FAST6x8_NEON.kc10 (13 ms)
[----------] 3 tests from FAST6x8_NEON (39 ms total)

[----------] 3 tests from FULL6x8_NEON
[ RUN      ] FULL6x8_NEON.kc1
[       OK ] FULL6x8_NEON.kc1 (49 ms)
[ RUN      ] FULL6x8_NEON.kc2
[       OK ] FULL6x8_NEON.kc2 (68 ms)
[ RUN      ] FULL6x8_NEON.kc10
[       OK ] FULL6x8_NEON.kc10 (202 ms)
[----------] 3 tests from FULL6x8_NEON (320 ms total)

[----------] Global test environment tear-down
[==========] 6 tests from 2 test cases ran. (361 ms total)
[  PASSED  ] 6 tests.
[44/57] RUN sxgemm-test
Running main() from gtest_main.cc
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from FAST_S4GEMM_3x3
[ RUN      ] FAST_S4GEMM_3x3.neon
[       OK ] FAST_S4GEMM_3x3.neon (74 ms)
[----------] 1 test from FAST_S4GEMM_3x3 (74 ms total)

[----------] 1 test from FULL_S4GEMM_3x3
[ RUN      ] FULL_S4GEMM_3x3.neon
[       OK ] FULL_S4GEMM_3x3.neon (345 ms)
[----------] 1 test from FULL_S4GEMM_3x3 (345 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (420 ms total)
[  PASSED  ] 2 tests.
[48/57] RUN hxgemm-test
Running main() from gtest_main.cc
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from FAST_H4GEMM_3x3
[ RUN      ] FAST_H4GEMM_3x3.neonhp
[       OK ] FAST_H4GEMM_3x3.neonhp (162 ms)
[----------] 1 test from FAST_H4GEMM_3x3 (162 ms total)

[----------] 1 test from FULL_H4GEMM_3x3
[ RUN      ] FULL_H4GEMM_3x3.neon
[       OK ] FULL_H4GEMM_3x3.neon (823 ms)
[----------] 1 test from FULL_H4GEMM_3x3 (823 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (985 ms total)
[  PASSED  ] 2 tests.
[49/57] RUN convolution-output-smoketest
[==========] Running 52 tests from 3 test cases.
[----------] Global test environment set-up.
[----------] 18 tests from FT8x8
[ RUN      ] FT8x8.single_tile
[       OK ] FT8x8.single_tile (1 ms)
[ RUN      ] FT8x8.single_tile_with_relu
[       OK ] FT8x8.single_tile_with_relu (1 ms)
[ RUN      ] FT8x8.input_subtile
[       OK ] FT8x8.input_subtile (0 ms)
[ RUN      ] FT8x8.input_subtile_with_relu
[       OK ] FT8x8.input_subtile_with_relu (1 ms)
[ RUN      ] FT8x8.multi_tile
[       OK ] FT8x8.multi_tile (2 ms)
[ RUN      ] FT8x8.multi_tile_with_relu
[       OK ] FT8x8.multi_tile_with_relu (2 ms)
[ RUN      ] FT8x8.implicit_padding
[       OK ] FT8x8.implicit_padding (71 ms)
[ RUN      ] FT8x8.implicit_padding_with_relu
[       OK ] FT8x8.implicit_padding_with_relu (73 ms)
[ RUN      ] FT8x8.small_batch
[       OK ] FT8x8.small_batch (8 ms)
[ RUN      ] FT8x8.small_batch_with_relu
[       OK ] FT8x8.small_batch_with_relu (8 ms)
[ RUN      ] FT8x8.few_input_channels
[       OK ] FT8x8.few_input_channels (8 ms)
[ RUN      ] FT8x8.few_input_channels_with_relu
[       OK ] FT8x8.few_input_channels_with_relu (8 ms)
[ RUN      ] FT8x8.few_output_channels
[       OK ] FT8x8.few_output_channels (7 ms)
[ RUN      ] FT8x8.few_output_channels_with_relu
[       OK ] FT8x8.few_output_channels_with_relu (8 ms)
[ RUN      ] FT8x8.non_square_kernel
[       OK ] FT8x8.non_square_kernel (0 ms)
[ RUN      ] FT8x8.non_square_kernel_with_relu
[       OK ] FT8x8.non_square_kernel_with_relu (0 ms)
[ RUN      ] FT8x8.non_square_image
[       OK ] FT8x8.non_square_image (2 ms)
[ RUN      ] FT8x8.non_square_image_with_relu
[       OK ] FT8x8.non_square_image_with_relu (1 ms)
[----------] 18 tests from FT8x8 (202 ms total)

[----------] 18 tests from FT16x16
[ RUN      ] FT16x16.single_tile
[       OK ] FT16x16.single_tile (3 ms)
[ RUN      ] FT16x16.single_tile_with_relu
[       OK ] FT16x16.single_tile_with_relu (4 ms)
[ RUN      ] FT16x16.input_subtile
[       OK ] FT16x16.input_subtile (1 ms)
[ RUN      ] FT16x16.input_subtile_with_relu
[       OK ] FT16x16.input_subtile_with_relu (1 ms)
[ RUN      ] FT16x16.multi_tile
[       OK ] FT16x16.multi_tile (10 ms)
[ RUN      ] FT16x16.multi_tile_with_relu
[       OK ] FT16x16.multi_tile_with_relu (10 ms)
[ RUN      ] FT16x16.implicit_padding
[       OK ] FT16x16.implicit_padding (215 ms)
[ RUN      ] FT16x16.implicit_padding_with_relu
[       OK ] FT16x16.implicit_padding_with_relu (219 ms)
[ RUN      ] FT16x16.small_batch
[       OK ] FT16x16.small_batch (36 ms)
[ RUN      ] FT16x16.small_batch_with_relu
[       OK ] FT16x16.small_batch_with_relu (37 ms)
[ RUN      ] FT16x16.few_input_channels
[       OK ] FT16x16.few_input_channels (34 ms)
[ RUN      ] FT16x16.few_input_channels_with_relu
[       OK ] FT16x16.few_input_channels_with_relu (34 ms)
[ RUN      ] FT16x16.few_output_channels
[       OK ] FT16x16.few_output_channels (32 ms)
[ RUN      ] FT16x16.few_output_channels_with_relu
[       OK ] FT16x16.few_output_channels_with_relu (33 ms)
[ RUN      ] FT16x16.non_square_kernel
[       OK ] FT16x16.non_square_kernel (3 ms)
[ RUN      ] FT16x16.non_square_kernel_with_relu
[       OK ] FT16x16.non_square_kernel_with_relu (3 ms)
[ RUN      ] FT16x16.non_square_image
[       OK ] FT16x16.non_square_image (5 ms)
[ RUN      ] FT16x16.non_square_image_with_relu
[       OK ] FT16x16.non_square_image_with_relu (6 ms)
[----------] 18 tests from FT16x16 (687 ms total)

[----------] 16 tests from WT8x8
[ RUN      ] WT8x8.single_tile
[       OK ] WT8x8.single_tile (1 ms)
[ RUN      ] WT8x8.single_tile_with_relu
[       OK ] WT8x8.single_tile_with_relu (0 ms)
[ RUN      ] WT8x8.input_subtile
[       OK ] WT8x8.input_subtile (1 ms)
[ RUN      ] WT8x8.input_subtile_with_relu
[       OK ] WT8x8.input_subtile_with_relu (0 ms)
[ RUN      ] WT8x8.multi_tile
[       OK ] WT8x8.multi_tile (3 ms)
[ RUN      ] WT8x8.multi_tile_with_relu
[       OK ] WT8x8.multi_tile_with_relu (2 ms)
[ RUN      ] WT8x8.implicit_padding
[       OK ] WT8x8.implicit_padding (20 ms)
[ RUN      ] WT8x8.implicit_padding_with_relu
[       OK ] WT8x8.implicit_padding_with_relu (20 ms)
[ RUN      ] WT8x8.small_batch
[       OK ] WT8x8.small_batch (8 ms)
[ RUN      ] WT8x8.small_batch_with_relu
[       OK ] WT8x8.small_batch_with_relu (8 ms)
[ RUN      ] WT8x8.few_input_channels
[       OK ] WT8x8.few_input_channels (7 ms)
[ RUN      ] WT8x8.few_input_channels_with_relu
[       OK ] WT8x8.few_input_channels_with_relu (8 ms)
[ RUN      ] WT8x8.few_output_channels
[       OK ] WT8x8.few_output_channels (7 ms)
[ RUN      ] WT8x8.few_output_channels_with_relu
[       OK ] WT8x8.few_output_channels_with_relu (6 ms)
[ RUN      ] WT8x8.non_square_image
[       OK ] WT8x8.non_square_image (1 ms)
[ RUN      ] WT8x8.non_square_image_with_relu
[       OK ] WT8x8.non_square_image_with_relu (2 ms)
[----------] 16 tests from WT8x8 (95 ms total)

[----------] Global test environment tear-down
[==========] 52 tests from 3 test cases ran. (984 ms total)
[  PASSED  ] 52 tests.
[50/57] RUN convolution-kernel-gradient-smoketest
[==========] Running 18 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from FT8x8
[ RUN      ] FT8x8.single_tile
[       OK ] FT8x8.single_tile (0 ms)
[ RUN      ] FT8x8.input_subtile
[       OK ] FT8x8.input_subtile (1 ms)
[ RUN      ] FT8x8.multi_tile
[       OK ] FT8x8.multi_tile (1 ms)
[ RUN      ] FT8x8.implicit_padding
[       OK ] FT8x8.implicit_padding (65 ms)
[ RUN      ] FT8x8.small_batch
[       OK ] FT8x8.small_batch (7 ms)
[ RUN      ] FT8x8.few_input_channels
[       OK ] FT8x8.few_input_channels (7 ms)
[ RUN      ] FT8x8.few_output_channels
[       OK ] FT8x8.few_output_channels (6 ms)
[ RUN      ] FT8x8.non_square_kernel
[       OK ] FT8x8.non_square_kernel (0 ms)
[ RUN      ] FT8x8.non_square_image
[       OK ] FT8x8.non_square_image (1 ms)
[----------] 9 tests from FT8x8 (90 ms total)

[----------] 9 tests from FT16x16
[ RUN      ] FT16x16.single_tile
[       OK ] FT16x16.single_tile (3 ms)
[ RUN      ] FT16x16.input_subtile
[       OK ] FT16x16.input_subtile (1 ms)
[ RUN      ] FT16x16.multi_tile
[       OK ] FT16x16.multi_tile (8 ms)
[ RUN      ] FT16x16.implicit_padding
[       OK ] FT16x16.implicit_padding (194 ms)
[ RUN      ] FT16x16.small_batch
[       OK ] FT16x16.small_batch (31 ms)
[ RUN      ] FT16x16.few_input_channels
[       OK ] FT16x16.few_input_channels (28 ms)
[ RUN      ] FT16x16.few_output_channels
[       OK ] FT16x16.few_output_channels (27 ms)
[ RUN      ] FT16x16.non_square_kernel
[       OK ] FT16x16.non_square_kernel (3 ms)
[ RUN      ] FT16x16.non_square_image
[       OK ] FT16x16.non_square_image (5 ms)
[----------] 9 tests from FT16x16 (300 ms total)

[----------] Global test environment tear-down
[==========] 18 tests from 2 test cases ran. (391 ms total)
[  PASSED  ] 18 tests.

  YOU HAVE 8 DISABLED TESTS

[51/57] RUN fully-connected-output-smoketest
[==========] Running 11 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 11 tests from MRxNR_4x24
[ RUN      ] MRxNR_4x24.single_input_channel
[       OK ] MRxNR_4x24.single_input_channel (4 ms)
[ RUN      ] MRxNR_4x24.few_input_channels
[       OK ] MRxNR_4x24.few_input_channels (5 ms)
[ RUN      ] MRxNR_4x24.many_input_channels
[       OK ] MRxNR_4x24.many_input_channels (86 ms)
[ RUN      ] MRxNR_4x24.batch_subblock
[       OK ] MRxNR_4x24.batch_subblock (11 ms)
[ RUN      ] MRxNR_4x24.small_batch_size
[       OK ] MRxNR_4x24.small_batch_size (4 ms)
[ RUN      ] MRxNR_4x24.batch_remainder_subblock
[       OK ] MRxNR_4x24.batch_remainder_subblock (18 ms)
[ RUN      ] MRxNR_4x24.large_batch_size
[       OK ] MRxNR_4x24.large_batch_size (58 ms)
[ RUN      ] MRxNR_4x24.output_channels_subblock
[       OK ] MRxNR_4x24.output_channels_subblock (78 ms)
[ RUN      ] MRxNR_4x24.few_output_channels
[       OK ] MRxNR_4x24.few_output_channels (8 ms)
[ RUN      ] MRxNR_4x24.output_channels_remainder_subblock
[       OK ] MRxNR_4x24.output_channels_remainder_subblock (205 ms)
[ RUN      ] MRxNR_4x24.many_output_channels
[       OK ] MRxNR_4x24.many_output_channels (118 ms)
[----------] 11 tests from MRxNR_4x24 (597 ms total)

[----------] Global test environment tear-down
[==========] 11 tests from 1 test case ran. (597 ms total)
[  PASSED  ] 11 tests.
[52/57] RUN max-pooling-output-smoketest
0
[==========] Running 32 tests from 4 test cases.
[----------] Global test environment set-up.
[----------] 8 tests from MAX_POOLING_2x2
[ RUN      ] MAX_POOLING_2x2.single_pool
[       OK ] MAX_POOLING_2x2.single_pool (0 ms)
[ RUN      ] MAX_POOLING_2x2.few_horizontal_pools
[       OK ] MAX_POOLING_2x2.few_horizontal_pools (5 ms)
[ RUN      ] MAX_POOLING_2x2.few_vertical_pools
[       OK ] MAX_POOLING_2x2.few_vertical_pools (5 ms)
[ RUN      ] MAX_POOLING_2x2.large_image
[       OK ] MAX_POOLING_2x2.large_image (54 ms)
[ RUN      ] MAX_POOLING_2x2.indivisible_size
[       OK ] MAX_POOLING_2x2.indivisible_size (0 ms)
[ RUN      ] MAX_POOLING_2x2.implicit_padding
[       OK ] MAX_POOLING_2x2.implicit_padding (26 ms)
[ RUN      ] MAX_POOLING_2x2.small_batch
[       OK ] MAX_POOLING_2x2.small_batch (6 ms)
[ RUN      ] MAX_POOLING_2x2.few_channels
[       OK ] MAX_POOLING_2x2.few_channels (6 ms)
[----------] 8 tests from MAX_POOLING_2x2 (102 ms total)

[----------] 8 tests from MAX_POOLING_3x3_STRIDE_2x2
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.single_pool
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.single_pool (0 ms)
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.few_horizontal_pools
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.few_horizontal_pools (5 ms)
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.few_vertical_pools
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.few_vertical_pools (6 ms)
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.large_image
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.large_image (59 ms)
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.indivisible_size
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.indivisible_size (0 ms)
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.implicit_padding
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.implicit_padding (151 ms)
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.small_batch
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.small_batch (8 ms)
[ RUN      ] MAX_POOLING_3x3_STRIDE_2x2.few_channels
[       OK ] MAX_POOLING_3x3_STRIDE_2x2.few_channels (8 ms)
[----------] 8 tests from MAX_POOLING_3x3_STRIDE_2x2 (237 ms total)

[----------] 8 tests from MAX_POOLING_1x2
[ RUN      ] MAX_POOLING_1x2.single_pool
[       OK ] MAX_POOLING_1x2.single_pool (0 ms)
[ RUN      ] MAX_POOLING_1x2.few_horizontal_pools
[       OK ] MAX_POOLING_1x2.few_horizontal_pools (3 ms)
[ RUN      ] MAX_POOLING_1x2.few_vertical_pools
[       OK ] MAX_POOLING_1x2.few_vertical_pools (9 ms)
[ RUN      ] MAX_POOLING_1x2.large_image
[       OK ] MAX_POOLING_1x2.large_image (55 ms)
[ RUN      ] MAX_POOLING_1x2.indivisible_size
[       OK ] MAX_POOLING_1x2.indivisible_size (1 ms)
[ RUN      ] MAX_POOLING_1x2.implicit_padding
[       OK ] MAX_POOLING_1x2.implicit_padding (6 ms)
[ RUN      ] MAX_POOLING_1x2.small_batch
[       OK ] MAX_POOLING_1x2.small_batch (8 ms)
[ RUN      ] MAX_POOLING_1x2.few_channels
[       OK ] MAX_POOLING_1x2.few_channels (7 ms)
[----------] 8 tests from MAX_POOLING_1x2 (89 ms total)

[----------] 8 tests from MAX_POOLING_2x1
[ RUN      ] MAX_POOLING_2x1.single_pool
[       OK ] MAX_POOLING_2x1.single_pool (0 ms)
[ RUN      ] MAX_POOLING_2x1.few_horizontal_pools
[       OK ] MAX_POOLING_2x1.few_horizontal_pools (13 ms)
[ RUN      ] MAX_POOLING_2x1.few_vertical_pools
[       OK ] MAX_POOLING_2x1.few_vertical_pools (3 ms)
[ RUN      ] MAX_POOLING_2x1.large_image
[       OK ] MAX_POOLING_2x1.large_image (50 ms)
[ RUN      ] MAX_POOLING_2x1.indivisible_size
[       OK ] MAX_POOLING_2x1.indivisible_size (0 ms)
[ RUN      ] MAX_POOLING_2x1.implicit_padding
[       OK ] MAX_POOLING_2x1.implicit_padding (7 ms)
[ RUN      ] MAX_POOLING_2x1.small_batch
[       OK ] MAX_POOLING_2x1.small_batch (7 ms)
[ RUN      ] MAX_POOLING_2x1.few_channels
[       OK ] MAX_POOLING_2x1.few_channels (6 ms)
[----------] 8 tests from MAX_POOLING_2x1 (86 ms total)

[----------] Global test environment tear-down
[==========] 32 tests from 4 test cases ran. (514 ms total)
[  PASSED  ] 32 tests.
[53/57] RUN softmax-output-smoketest
[==========] Running 4 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from OUT_OF_PLACE
[ RUN      ] OUT_OF_PLACE.few_channels
[       OK ] OUT_OF_PLACE.few_channels (8 ms)
[ RUN      ] OUT_OF_PLACE.small_batch
[       OK ] OUT_OF_PLACE.small_batch (20 ms)
[----------] 2 tests from OUT_OF_PLACE (28 ms total)

[----------] 2 tests from IN_PLACE
[ RUN      ] IN_PLACE.few_channels
[       OK ] IN_PLACE.few_channels (8 ms)
[ RUN      ] IN_PLACE.small_batch
[       OK ] IN_PLACE.small_batch (19 ms)
[----------] 2 tests from IN_PLACE (27 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 2 test cases ran. (57 ms total)
[  PASSED  ] 4 tests.
[56/57] RUN convolution-inference-smoketest
[==========] Running 160 tests from 9 test cases.
[----------] Global test environment set-up.
[----------] 16 tests from FT8x8
[ RUN      ] FT8x8.single_tile
[       OK ] FT8x8.single_tile (1 ms)
[ RUN      ] FT8x8.single_tile_with_relu
[       OK ] FT8x8.single_tile_with_relu (1 ms)
[ RUN      ] FT8x8.input_subtile
[       OK ] FT8x8.input_subtile (0 ms)
[ RUN      ] FT8x8.input_subtile_with_relu
[       OK ] FT8x8.input_subtile_with_relu (1 ms)
[ RUN      ] FT8x8.multi_tile
[       OK ] FT8x8.multi_tile (2 ms)
[ RUN      ] FT8x8.multi_tile_with_relu
[       OK ] FT8x8.multi_tile_with_relu (2 ms)
[ RUN      ] FT8x8.implicit_padding
[       OK ] FT8x8.implicit_padding (68 ms)
[ RUN      ] FT8x8.implicit_padding_with_relu
[       OK ] FT8x8.implicit_padding_with_relu (68 ms)
[ RUN      ] FT8x8.few_input_channels
[       OK ] FT8x8.few_input_channels (8 ms)
[ RUN      ] FT8x8.few_input_channels_with_relu
[       OK ] FT8x8.few_input_channels_with_relu (7 ms)
[ RUN      ] FT8x8.few_output_channels
[       OK ] FT8x8.few_output_channels (8 ms)
[ RUN      ] FT8x8.few_output_channels_with_relu
[       OK ] FT8x8.few_output_channels_with_relu (7 ms)
[ RUN      ] FT8x8.non_square_kernel
[       OK ] FT8x8.non_square_kernel (1 ms)
[ RUN      ] FT8x8.non_square_kernel_with_relu
[       OK ] FT8x8.non_square_kernel_with_relu (1 ms)
[ RUN      ] FT8x8.non_square_image
[       OK ] FT8x8.non_square_image (1 ms)
[ RUN      ] FT8x8.non_square_image_with_relu
[       OK ] FT8x8.non_square_image_with_relu (2 ms)
[----------] 16 tests from FT8x8 (178 ms total)

[----------] 16 tests from FT16x16
[ RUN      ] FT16x16.single_tile
[       OK ] FT16x16.single_tile (3 ms)
[ RUN      ] FT16x16.single_tile_with_relu
[       OK ] FT16x16.single_tile_with_relu (3 ms)
[ RUN      ] FT16x16.input_subtile
[       OK ] FT16x16.input_subtile (1 ms)
[ RUN      ] FT16x16.input_subtile_with_relu
[       OK ] FT16x16.input_subtile_with_relu (2 ms)
[ RUN      ] FT16x16.multi_tile
[       OK ] FT16x16.multi_tile (9 ms)
[ RUN      ] FT16x16.multi_tile_with_relu
[       OK ] FT16x16.multi_tile_with_relu (10 ms)
[ RUN      ] FT16x16.implicit_padding
[       OK ] FT16x16.implicit_padding (206 ms)
[ RUN      ] FT16x16.implicit_padding_with_relu
[       OK ] FT16x16.implicit_padding_with_relu (208 ms)
[ RUN      ] FT16x16.few_input_channels
[       OK ] FT16x16.few_input_channels (33 ms)
[ RUN      ] FT16x16.few_input_channels_with_relu
[       OK ] FT16x16.few_input_channels_with_relu (33 ms)
[ RUN      ] FT16x16.few_output_channels
[       OK ] FT16x16.few_output_channels (32 ms)
[ RUN      ] FT16x16.few_output_channels_with_relu
[       OK ] FT16x16.few_output_channels_with_relu (33 ms)
[ RUN      ] FT16x16.non_square_kernel
[       OK ] FT16x16.non_square_kernel (3 ms)
[ RUN      ] FT16x16.non_square_kernel_with_relu
[       OK ] FT16x16.non_square_kernel_with_relu (2 ms)
[ RUN      ] FT16x16.non_square_image
[       OK ] FT16x16.non_square_image (6 ms)
[ RUN      ] FT16x16.non_square_image_with_relu
[       OK ] FT16x16.non_square_image_with_relu (5 ms)
[----------] 16 tests from FT16x16 (589 ms total)

[----------] 28 tests from WT8x8
[ RUN      ] WT8x8.single_tile
[       OK ] WT8x8.single_tile (1 ms)
[ RUN      ] WT8x8.single_tile_with_relu
[       OK ] WT8x8.single_tile_with_relu (1 ms)
[ RUN      ] WT8x8.single_tile_with_subsample2x2
[       OK ] WT8x8.single_tile_with_subsample2x2 (1 ms)
[ RUN      ] WT8x8.single_tile_with_subsample2x2_relu
[       OK ] WT8x8.single_tile_with_subsample2x2_relu (0 ms)
[ RUN      ] WT8x8.input_subtile
[       OK ] WT8x8.input_subtile (1 ms)
[ RUN      ] WT8x8.input_subtile_with_relu
[       OK ] WT8x8.input_subtile_with_relu (0 ms)
[ RUN      ] WT8x8.input_subtile_with_subsample2x2
[       OK ] WT8x8.input_subtile_with_subsample2x2 (1 ms)
[ RUN      ] WT8x8.input_subtile_with_subsample2x2_relu
[       OK ] WT8x8.input_subtile_with_subsample2x2_relu (0 ms)
[ RUN      ] WT8x8.multi_tile
[       OK ] WT8x8.multi_tile (2 ms)
[ RUN      ] WT8x8.multi_tile_with_relu
[       OK ] WT8x8.multi_tile_with_relu (2 ms)
[ RUN      ] WT8x8.multi_tile_with_subsample2x2
[       OK ] WT8x8.multi_tile_with_subsample2x2 (2 ms)
[ RUN      ] WT8x8.multi_tile_with_subsample2x2_relu
[       OK ] WT8x8.multi_tile_with_subsample2x2_relu (1 ms)
[ RUN      ] WT8x8.implicit_padding
[       OK ] WT8x8.implicit_padding (16 ms)
[ RUN      ] WT8x8.implicit_padding_with_relu
[       OK ] WT8x8.implicit_padding_with_relu (16 ms)
[ RUN      ] WT8x8.implicit_padding_with_subsample2x2
[       OK ] WT8x8.implicit_padding_with_subsample2x2 (13 ms)
[ RUN      ] WT8x8.implicit_padding_with_subsample2x2_relu
[       OK ] WT8x8.implicit_padding_with_subsample2x2_relu (12 ms)
[ RUN      ] WT8x8.few_input_channels
[       OK ] WT8x8.few_input_channels (7 ms)
[ RUN      ] WT8x8.few_input_channels_with_relu
[       OK ] WT8x8.few_input_channels_with_relu (8 ms)
[ RUN      ] WT8x8.few_input_channels_with_subsample2x2
[       OK ] WT8x8.few_input_channels_with_subsample2x2 (5 ms)
[ RUN      ] WT8x8.few_input_channels_with_subsample2x2_relu
[       OK ] WT8x8.few_input_channels_with_subsample2x2_relu (5 ms)
[ RUN      ] WT8x8.few_output_channels
[       OK ] WT8x8.few_output_channels (6 ms)
[ RUN      ] WT8x8.few_output_channels_with_relu
[       OK ] WT8x8.few_output_channels_with_relu (7 ms)
[ RUN      ] WT8x8.few_output_channels_with_subsample2x2
[       OK ] WT8x8.few_output_channels_with_subsample2x2 (4 ms)
[ RUN      ] WT8x8.few_output_channels_with_subsample2x2_relu
[       OK ] WT8x8.few_output_channels_with_subsample2x2_relu (5 ms)
[ RUN      ] WT8x8.non_square_image
[       OK ] WT8x8.non_square_image (1 ms)
[ RUN      ] WT8x8.non_square_image_with_relu
[       OK ] WT8x8.non_square_image_with_relu (1 ms)
[ RUN      ] WT8x8.non_square_image_with_subsample2x2
[       OK ] WT8x8.non_square_image_with_subsample2x2 (1 ms)
[ RUN      ] WT8x8.non_square_image_with_subsample2x2_relu
[       OK ] WT8x8.non_square_image_with_subsample2x2_relu (1 ms)
[----------] 28 tests from WT8x8 (121 ms total)

[----------] 14 tests from WT8x8_FP16
[ RUN      ] WT8x8_FP16.single_tile
[       OK ] WT8x8_FP16.single_tile (1 ms)
[ RUN      ] WT8x8_FP16.single_tile_with_relu
[       OK ] WT8x8_FP16.single_tile_with_relu (0 ms)
[ RUN      ] WT8x8_FP16.input_subtile
[       OK ] WT8x8_FP16.input_subtile (0 ms)
[ RUN      ] WT8x8_FP16.input_subtile_with_relu
[       OK ] WT8x8_FP16.input_subtile_with_relu (0 ms)
[ RUN      ] WT8x8_FP16.multi_tile
[       OK ] WT8x8_FP16.multi_tile (2 ms)
[ RUN      ] WT8x8_FP16.multi_tile_with_relu
[       OK ] WT8x8_FP16.multi_tile_with_relu (2 ms)
[ RUN      ] WT8x8_FP16.implicit_padding
[       OK ] WT8x8_FP16.implicit_padding (14 ms)
[ RUN      ] WT8x8_FP16.implicit_padding_with_relu
[       OK ] WT8x8_FP16.implicit_padding_with_relu (14 ms)
[ RUN      ] WT8x8_FP16.few_input_channels
[       OK ] WT8x8_FP16.few_input_channels (7 ms)
[ RUN      ] WT8x8_FP16.few_input_channels_with_relu
[       OK ] WT8x8_FP16.few_input_channels_with_relu (7 ms)
[ RUN      ] WT8x8_FP16.few_output_channels
[       OK ] WT8x8_FP16.few_output_channels (6 ms)
[ RUN      ] WT8x8_FP16.few_output_channels_with_relu
[       OK ] WT8x8_FP16.few_output_channels_with_relu (7 ms)
[ RUN      ] WT8x8_FP16.non_square_image
[       OK ] WT8x8_FP16.non_square_image (1 ms)
[ RUN      ] WT8x8_FP16.non_square_image_with_relu
[       OK ] WT8x8_FP16.non_square_image_with_relu (1 ms)
[----------] 14 tests from WT8x8_FP16 (64 ms total)

[----------] 16 tests from FT8x8_PRECOMPUTE
[ RUN      ] FT8x8_PRECOMPUTE.single_tile
[       OK ] FT8x8_PRECOMPUTE.single_tile (1 ms)
[ RUN      ] FT8x8_PRECOMPUTE.single_tile_with_relu
[       OK ] FT8x8_PRECOMPUTE.single_tile_with_relu (1 ms)
[ RUN      ] FT8x8_PRECOMPUTE.input_subtile
[       OK ] FT8x8_PRECOMPUTE.input_subtile (1 ms)
[ RUN      ] FT8x8_PRECOMPUTE.input_subtile_with_relu
[       OK ] FT8x8_PRECOMPUTE.input_subtile_with_relu (0 ms)
[ RUN      ] FT8x8_PRECOMPUTE.multi_tile
[       OK ] FT8x8_PRECOMPUTE.multi_tile (2 ms)
[ RUN      ] FT8x8_PRECOMPUTE.multi_tile_with_relu
[       OK ] FT8x8_PRECOMPUTE.multi_tile_with_relu (2 ms)
[ RUN      ] FT8x8_PRECOMPUTE.implicit_padding
[       OK ] FT8x8_PRECOMPUTE.implicit_padding (71 ms)
[ RUN      ] FT8x8_PRECOMPUTE.implicit_padding_with_relu
[       OK ] FT8x8_PRECOMPUTE.implicit_padding_with_relu (71 ms)
[ RUN      ] FT8x8_PRECOMPUTE.few_input_channels
[       OK ] FT8x8_PRECOMPUTE.few_input_channels (8 ms)
[ RUN      ] FT8x8_PRECOMPUTE.few_input_channels_with_relu
[       OK ] FT8x8_PRECOMPUTE.few_input_channels_with_relu (9 ms)
[ RUN      ] FT8x8_PRECOMPUTE.few_output_channels
[       OK ] FT8x8_PRECOMPUTE.few_output_channels (8 ms)
[ RUN      ] FT8x8_PRECOMPUTE.few_output_channels_with_relu
[       OK ] FT8x8_PRECOMPUTE.few_output_channels_with_relu (8 ms)
[ RUN      ] FT8x8_PRECOMPUTE.non_square_kernel
[       OK ] FT8x8_PRECOMPUTE.non_square_kernel (1 ms)
[ RUN      ] FT8x8_PRECOMPUTE.non_square_kernel_with_relu
[       OK ] FT8x8_PRECOMPUTE.non_square_kernel_with_relu (0 ms)
[ RUN      ] FT8x8_PRECOMPUTE.non_square_image
[       OK ] FT8x8_PRECOMPUTE.non_square_image (2 ms)
[ RUN      ] FT8x8_PRECOMPUTE.non_square_image_with_relu
[       OK ] FT8x8_PRECOMPUTE.non_square_image_with_relu (2 ms)
[----------] 16 tests from FT8x8_PRECOMPUTE (188 ms total)

[----------] 16 tests from FT16x16_PRECOMPUTE
[ RUN      ] FT16x16_PRECOMPUTE.single_tile
[       OK ] FT16x16_PRECOMPUTE.single_tile (3 ms)
[ RUN      ] FT16x16_PRECOMPUTE.single_tile_with_relu
[       OK ] FT16x16_PRECOMPUTE.single_tile_with_relu (3 ms)
[ RUN      ] FT16x16_PRECOMPUTE.input_subtile
[       OK ] FT16x16_PRECOMPUTE.input_subtile (2 ms)
[ RUN      ] FT16x16_PRECOMPUTE.input_subtile_with_relu
[       OK ] FT16x16_PRECOMPUTE.input_subtile_with_relu (1 ms)
[ RUN      ] FT16x16_PRECOMPUTE.multi_tile
[       OK ] FT16x16_PRECOMPUTE.multi_tile (9 ms)
[ RUN      ] FT16x16_PRECOMPUTE.multi_tile_with_relu
[       OK ] FT16x16_PRECOMPUTE.multi_tile_with_relu (10 ms)
[ RUN      ] FT16x16_PRECOMPUTE.implicit_padding
[       OK ] FT16x16_PRECOMPUTE.implicit_padding (208 ms)
[ RUN      ] FT16x16_PRECOMPUTE.implicit_padding_with_relu
[       OK ] FT16x16_PRECOMPUTE.implicit_padding_with_relu (212 ms)
[ RUN      ] FT16x16_PRECOMPUTE.few_input_channels
[       OK ] FT16x16_PRECOMPUTE.few_input_channels (34 ms)
[ RUN      ] FT16x16_PRECOMPUTE.few_input_channels_with_relu
[       OK ] FT16x16_PRECOMPUTE.few_input_channels_with_relu (33 ms)
[ RUN      ] FT16x16_PRECOMPUTE.few_output_channels
[       OK ] FT16x16_PRECOMPUTE.few_output_channels (32 ms)
[ RUN      ] FT16x16_PRECOMPUTE.few_output_channels_with_relu
[       OK ] FT16x16_PRECOMPUTE.few_output_channels_with_relu (32 ms)
[ RUN      ] FT16x16_PRECOMPUTE.non_square_kernel
[       OK ] FT16x16_PRECOMPUTE.non_square_kernel (3 ms)
[ RUN      ] FT16x16_PRECOMPUTE.non_square_kernel_with_relu
[       OK ] FT16x16_PRECOMPUTE.non_square_kernel_with_relu (3 ms)
[ RUN      ] FT16x16_PRECOMPUTE.non_square_image
[       OK ] FT16x16_PRECOMPUTE.non_square_image (5 ms)
[ RUN      ] FT16x16_PRECOMPUTE.non_square_image_with_relu
[       OK ] FT16x16_PRECOMPUTE.non_square_image_with_relu (6 ms)
[----------] 16 tests from FT16x16_PRECOMPUTE (596 ms total)

[----------] 28 tests from WT8x8_PRECOMPUTE
[ RUN      ] WT8x8_PRECOMPUTE.single_tile
[       OK ] WT8x8_PRECOMPUTE.single_tile (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.single_tile_with_relu
[       OK ] WT8x8_PRECOMPUTE.single_tile_with_relu (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.single_tile_with_subsample2x2
[       OK ] WT8x8_PRECOMPUTE.single_tile_with_subsample2x2 (0 ms)
[ RUN      ] WT8x8_PRECOMPUTE.single_tile_with_subsample2x2_relu
[       OK ] WT8x8_PRECOMPUTE.single_tile_with_subsample2x2_relu (0 ms)
[ RUN      ] WT8x8_PRECOMPUTE.input_subtile
[       OK ] WT8x8_PRECOMPUTE.input_subtile (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.input_subtile_with_relu
[       OK ] WT8x8_PRECOMPUTE.input_subtile_with_relu (0 ms)
[ RUN      ] WT8x8_PRECOMPUTE.input_subtile_with_subsample2x2
[       OK ] WT8x8_PRECOMPUTE.input_subtile_with_subsample2x2 (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.input_subtile_with_subsample2x2_relu
[       OK ] WT8x8_PRECOMPUTE.input_subtile_with_subsample2x2_relu (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.multi_tile
[       OK ] WT8x8_PRECOMPUTE.multi_tile (2 ms)
[ RUN      ] WT8x8_PRECOMPUTE.multi_tile_with_relu
[       OK ] WT8x8_PRECOMPUTE.multi_tile_with_relu (2 ms)
[ RUN      ] WT8x8_PRECOMPUTE.multi_tile_with_subsample2x2
[       OK ] WT8x8_PRECOMPUTE.multi_tile_with_subsample2x2 (2 ms)
[ RUN      ] WT8x8_PRECOMPUTE.multi_tile_with_subsample2x2_relu
[       OK ] WT8x8_PRECOMPUTE.multi_tile_with_subsample2x2_relu (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.implicit_padding
[       OK ] WT8x8_PRECOMPUTE.implicit_padding (17 ms)
[ RUN      ] WT8x8_PRECOMPUTE.implicit_padding_with_relu
[       OK ] WT8x8_PRECOMPUTE.implicit_padding_with_relu (18 ms)
[ RUN      ] WT8x8_PRECOMPUTE.implicit_padding_with_subsample2x2
[       OK ] WT8x8_PRECOMPUTE.implicit_padding_with_subsample2x2 (13 ms)
[ RUN      ] WT8x8_PRECOMPUTE.implicit_padding_with_subsample2x2_relu
[       OK ] WT8x8_PRECOMPUTE.implicit_padding_with_subsample2x2_relu (13 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_input_channels
[       OK ] WT8x8_PRECOMPUTE.few_input_channels (8 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_input_channels_with_relu
[       OK ] WT8x8_PRECOMPUTE.few_input_channels_with_relu (8 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_input_channels_with_subsample2x2
[       OK ] WT8x8_PRECOMPUTE.few_input_channels_with_subsample2x2 (5 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_input_channels_with_subsample2x2_relu
[       OK ] WT8x8_PRECOMPUTE.few_input_channels_with_subsample2x2_relu (5 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_output_channels
[       OK ] WT8x8_PRECOMPUTE.few_output_channels (7 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_output_channels_with_relu
[       OK ] WT8x8_PRECOMPUTE.few_output_channels_with_relu (8 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_output_channels_with_subsample2x2
[       OK ] WT8x8_PRECOMPUTE.few_output_channels_with_subsample2x2 (4 ms)
[ RUN      ] WT8x8_PRECOMPUTE.few_output_channels_with_subsample2x2_relu
[       OK ] WT8x8_PRECOMPUTE.few_output_channels_with_subsample2x2_relu (5 ms)
[ RUN      ] WT8x8_PRECOMPUTE.non_square_image
[       OK ] WT8x8_PRECOMPUTE.non_square_image (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.non_square_image_with_relu
[       OK ] WT8x8_PRECOMPUTE.non_square_image_with_relu (2 ms)
[ RUN      ] WT8x8_PRECOMPUTE.non_square_image_with_subsample2x2
[       OK ] WT8x8_PRECOMPUTE.non_square_image_with_subsample2x2 (1 ms)
[ RUN      ] WT8x8_PRECOMPUTE.non_square_image_with_subsample2x2_relu
[       OK ] WT8x8_PRECOMPUTE.non_square_image_with_subsample2x2_relu (1 ms)
[----------] 28 tests from WT8x8_PRECOMPUTE (129 ms total)

[----------] 14 tests from WT8x8_FP16_PRECOMPUTE
[ RUN      ] WT8x8_FP16_PRECOMPUTE.single_tile
[       OK ] WT8x8_FP16_PRECOMPUTE.single_tile (0 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.single_tile_with_relu
[       OK ] WT8x8_FP16_PRECOMPUTE.single_tile_with_relu (1 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.input_subtile
[       OK ] WT8x8_FP16_PRECOMPUTE.input_subtile (1 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.input_subtile_with_relu
[       OK ] WT8x8_FP16_PRECOMPUTE.input_subtile_with_relu (0 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.multi_tile
[       OK ] WT8x8_FP16_PRECOMPUTE.multi_tile (1 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.multi_tile_with_relu
[       OK ] WT8x8_FP16_PRECOMPUTE.multi_tile_with_relu (2 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.implicit_padding
[       OK ] WT8x8_FP16_PRECOMPUTE.implicit_padding (16 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.implicit_padding_with_relu
[       OK ] WT8x8_FP16_PRECOMPUTE.implicit_padding_with_relu (16 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.few_input_channels
[       OK ] WT8x8_FP16_PRECOMPUTE.few_input_channels (7 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.few_input_channels_with_relu
[       OK ] WT8x8_FP16_PRECOMPUTE.few_input_channels_with_relu (7 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.few_output_channels
[       OK ] WT8x8_FP16_PRECOMPUTE.few_output_channels (7 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.few_output_channels_with_relu
[       OK ] WT8x8_FP16_PRECOMPUTE.few_output_channels_with_relu (7 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.non_square_image
[       OK ] WT8x8_FP16_PRECOMPUTE.non_square_image (1 ms)
[ RUN      ] WT8x8_FP16_PRECOMPUTE.non_square_image_with_relu
[       OK ] WT8x8_FP16_PRECOMPUTE.non_square_image_with_relu (1 ms)
[----------] 14 tests from WT8x8_FP16_PRECOMPUTE (69 ms total)

[----------] 12 tests from DIRECT_1x1
[ RUN      ] DIRECT_1x1.channel_tile
[       OK ] DIRECT_1x1.channel_tile (2 ms)
[ RUN      ] DIRECT_1x1.channel_tile_with_relu
[       OK ] DIRECT_1x1.channel_tile_with_relu (2 ms)
[ RUN      ] DIRECT_1x1.channel_subtile
[       OK ] DIRECT_1x1.channel_subtile (14 ms)
[ RUN      ] DIRECT_1x1.channel_subtile_with_relu
[       OK ] DIRECT_1x1.channel_subtile_with_relu (15 ms)
[ RUN      ] DIRECT_1x1.input_multi_tile
[       OK ] DIRECT_1x1.input_multi_tile (5 ms)
[ RUN      ] DIRECT_1x1.input_multi_tile_with_relu
[       OK ] DIRECT_1x1.input_multi_tile_with_relu (4 ms)
[ RUN      ] DIRECT_1x1.output_multi_tile
[       OK ] DIRECT_1x1.output_multi_tile (9 ms)
[ RUN      ] DIRECT_1x1.output_multi_tile_with_relu
[       OK ] DIRECT_1x1.output_multi_tile_with_relu (10 ms)
[ RUN      ] DIRECT_1x1.input_output_multi_tile
[       OK ] DIRECT_1x1.input_output_multi_tile (17 ms)
[ RUN      ] DIRECT_1x1.input_output_multi_tile_with_relu
[       OK ] DIRECT_1x1.input_output_multi_tile_with_relu (17 ms)
[ RUN      ] DIRECT_1x1.odd_image_size
[       OK ] DIRECT_1x1.odd_image_size (625 ms)
[ RUN      ] DIRECT_1x1.odd_image_size_with_relu
[       OK ] DIRECT_1x1.odd_image_size_with_relu (655 ms)
[----------] 12 tests from DIRECT_1x1 (1375 ms total)

[----------] Global test environment tear-down
[==========] 160 tests from 9 test cases ran. (3310 ms total)
[  PASSED  ] 160 tests.

real    1m31.677s
user    3m37.924s
sys     0m9.012s
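
 To see which suites dominate the smoketest time, the per-suite summary lines in the log above can be collected with a short script. This is only a sketch: `suite_totals` and the regular expression are illustrative names of my own, not part of NNPACK or Google Test, and the sample lines are copied from the log above.

```python
import re

# Matches Google Test per-suite summary lines such as:
# [----------] 28 tests from WT8x8 (121 ms total)
SUITE_RE = re.compile(r"^\[-+\]\s+(\d+) tests? from (\S+) \((\d+) ms total\)$")


def suite_totals(log_text):
    """Return {suite_name: (test_count, milliseconds)} for each summary line."""
    totals = {}
    for line in log_text.splitlines():
        match = SUITE_RE.match(line.strip())
        if match:
            count, name, millis = match.groups()
            totals[name] = (int(count), int(millis))
    return totals


# Three summary lines copied from the smoketest log (assumed format).
SAMPLE = """\
[----------] 28 tests from WT8x8 (121 ms total)
[----------] 14 tests from WT8x8_FP16 (64 ms total)
[----------] 12 tests from DIRECT_1x1 (1375 ms total)
"""

if __name__ == "__main__":
    # Print suites from slowest to fastest.
    ordered = sorted(
        suite_totals(SAMPLE).items(), key=lambda item: item[1][1], reverse=True
    )
    for name, (count, millis) in ordered:
        print(f"{name:12s} {count:3d} tests {millis:6d} ms")
```

 Suite header lines without a timing (for example "[----------] 14 tests from WT8x8_FP16") are skipped on purpose, so each suite is counted once.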


● NNPACK ninja test is unstable and fails partway through

 It fails the same way on a Raspberry Pi 3B.

● Why does NNPACK ninja test report FAILED with a numerical error?

 On the Raspberry Pi, the errors occur even with a reliable power supply rated for the load.
[ FAILED ] FT8x8.conv1 (979227 ms)
[ FAILED ] FT16x16.conv2 (463511 ms)
[ FAILED ] WT8x8.conv1 (989511 ms)
[ FAILED ] FT16x16.conv2 (491525 ms)
[ FAILED ] FT16x16.conv2 (383976 ms)

 The errors also occur on the NVIDIA Jetson Nano Developer Kit.
[ FAILED ] FT16x16.conv2 (55760 ms)
[ FAILED ] FT16x16.conv1 (14418 ms)
[ FAILED ] FT16x16.conv2 (55256 ms)

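 This appears to be a bug in NNPACK itself (or a characteristic of running it on ARM CPUs). In the Jetson Nano logs below, every failure is the same assertion: the median of the maximum errors exceeds the 1e-05 limit.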

NVIDIA Jetson Nano

user@user-desktop:~/NNPACK$ time ninja test
[46/96] RUN fourier-test
[==========] Running 16 tests from 16 test cases.
[----------] Global test environment set-up.
[----------] 1 test from FFT8_WITHIN_ROWS
[ RUN      ] FFT8_WITHIN_ROWS.match_reference
[       OK ] FFT8_WITHIN_ROWS.match_reference (1 ms)
[----------] 1 test from FFT8_WITHIN_ROWS (1 ms total)

...
[63/96] RUN convolution-input-gradient-overfeat-fast-test
[==========] Running 11 tests from 3 test cases.
[----------] Global test environment set-up.
[----------] 4 tests from FT8x8
[ RUN      ] FT8x8.conv2
[       OK ] FT8x8.conv2 (56182 ms)
[ RUN      ] FT8x8.conv3
[       OK ] FT8x8.conv3 (44394 ms)
[ RUN      ] FT8x8.conv4
[       OK ] FT8x8.conv4 (410675 ms)
[ RUN      ] FT8x8.conv5
[       OK ] FT8x8.conv5 (900643 ms)
[----------] 4 tests from FT8x8 (1411895 ms total)

[----------] 4 tests from FT16x16
[ RUN      ] FT16x16.conv2
/home/user/NNPACK/test/testers/convolution.h:316: Failure
Expected: (median(maxErrors)) < (errorLimit()), actual: 1.11727e-05 vs 1e-05
[  FAILED  ] FT16x16.conv2 (55760 ms)
[ RUN      ] FT16x16.conv3
[       OK ] FT16x16.conv3 (44780 ms)
[ RUN      ] FT16x16.conv4
[       OK ] FT16x16.conv4 (394191 ms)
[ RUN      ] FT16x16.conv5
[       OK ] FT16x16.conv5 (904767 ms)
[----------] 4 tests from FT16x16 (1399498 ms total)
...
[----------] Global test environment tear-down
[==========] 11 tests from 3 test cases ran. (4149927 ms total)
[  PASSED  ] 10 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] FT16x16.conv2

 1 FAILED TEST
FAILED: convolution-input-gradient-overfeat-fast-test
/home/user/NNPACK/bin/convolution-input-gradient-overfeat-fast-test --gtest_color=yes
ninja: build stopped: subcommand failed.

real    284m34.982s
user    1117m48.968s
sys     0m49.336s
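
 The `time` output above also shows how well the run kept the four cores busy: dividing the CPU time (user + sys) by the wall-clock time gives the average number of busy cores. A minimal sketch, assuming the bash `time` format shown above; `parse_duration` and `busy_cores` are hypothetical helper names.

```python
import re

# bash `time` prints durations like "284m34.982s"
DURATION_RE = re.compile(r"^(\d+)m(\d+(?:\.\d+)?)s$")


def parse_duration(text):
    """Convert a bash `time` duration such as '284m34.982s' to seconds."""
    match = DURATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def busy_cores(real, user, sys):
    """Average number of busy cores: (user + sys) CPU time over wall time."""
    return (parse_duration(user) + parse_duration(sys)) / parse_duration(real)


if __name__ == "__main__":
    # Values copied from the first `ninja test` run above.
    print(f"{busy_cores('284m34.982s', '1117m48.968s', '0m49.336s'):.2f}")
```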

● NNPACK ninja test
 Run 2

user@user-desktop:~/NNPACK$ time ninja test
[0/40] RUN fourier-test
[==========] Running 16 tests from 16 test cases.
[----------] Global test environment set-up.
[----------] 1 test from FFT8_WITHIN_ROWS
[ RUN      ] FFT8_WITHIN_ROWS.match_reference
[       OK ] FFT8_WITHIN_ROWS.match_reference (0 ms)
[----------] 1 test from FFT8_WITHIN_ROWS (0 ms total)
...
[3/40] RUN convolution-output-vgg-a-test
[==========] Running 42 tests from 3 test cases.
[----------] Global test environment set-up.
[----------] 14 tests from FT8x8
...
[----------] 14 tests from FT16x16
[ RUN      ] FT16x16.conv1
/home/user/NNPACK/test/testers/convolution.h:263: Failure
Expected: (median(maxErrors)) < (errorLimit()), actual: 7.4158e-05 vs 1e-05
[  FAILED  ] FT16x16.conv1 (14418 ms)
[ RUN      ] FT16x16.conv1_with_relu
[       OK ] FT16x16.conv1_with_relu (14702 ms)
[ RUN      ] FT16x16.conv2
[       OK ] FT16x16.conv2 (120044 ms)
[ RUN      ] FT16x16.conv2_with_relu
[       OK ] FT16x16.conv2_with_relu (120216 ms)
[ RUN      ] FT16x16.conv3
[       OK ] FT16x16.conv3 (115229 ms)
[ RUN      ] FT16x16.conv3_with_relu
[       OK ] FT16x16.conv3_with_relu (115319 ms)
[ RUN      ] FT16x16.conv4
[       OK ] FT16x16.conv4 (238250 ms)
[ RUN      ] FT16x16.conv4_with_relu
[       OK ] FT16x16.conv4_with_relu (238488 ms)
[ RUN      ] FT16x16.conv5
[       OK ] FT16x16.conv5 (111644 ms)
[ RUN      ] FT16x16.conv5_with_relu
[       OK ] FT16x16.conv5_with_relu (111813 ms)
[ RUN      ] FT16x16.conv6
[       OK ] FT16x16.conv6 (237932 ms)
[ RUN      ] FT16x16.conv6_with_relu
[       OK ] FT16x16.conv6_with_relu (238227 ms)
[ RUN      ] FT16x16.conv8
[       OK ] FT16x16.conv8 (51440 ms)
[ RUN      ] FT16x16.conv8_with_relu
[       OK ] FT16x16.conv8_with_relu (51413 ms)
[----------] 14 tests from FT16x16 (1779136 ms total)
...
[----------] Global test environment tear-down
[==========] 42 tests from 3 test cases ran. (5343155 ms total)
[  PASSED  ] 41 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] FT16x16.conv1

 1 FAILED TEST
FAILED: convolution-output-vgg-a-test
/home/user/NNPACK/bin/convolution-output-vgg-a-test --gtest_color=yes
ninja: build stopped: subcommand failed.

real    99m15.786s
user    389m15.972s
sys     0m17.180s

● NNPACK ninja test
 Run 3

user@user-desktop:~/NNPACK$ time ninja test
[0/40] RUN fourier-test
[==========] Running 16 tests from 16 test cases.
[----------] Global test environment set-up.
[----------] 1 test from FFT8_WITHIN_ROWS
[ RUN      ] FFT8_WITHIN_ROWS.match_reference
[       OK ] FFT8_WITHIN_ROWS.match_reference (1 ms)
[----------] 1 test from FFT8_WITHIN_ROWS (1 ms total)
...
[7/40] RUN convolution-input-gradient-overfeat-fast-test
[==========] Running 11 tests from 3 test cases.
[----------] Global test environment set-up.
[----------] 4 tests from FT8x8
[ RUN      ] FT8x8.conv2
[       OK ] FT8x8.conv2 (56683 ms)
[ RUN      ] FT8x8.conv3
[       OK ] FT8x8.conv3 (43826 ms)
[ RUN      ] FT8x8.conv4
[       OK ] FT8x8.conv4 (406204 ms)
[ RUN      ] FT8x8.conv5
[       OK ] FT8x8.conv5 (904378 ms)
[----------] 4 tests from FT8x8 (1411091 ms total)

[----------] 4 tests from FT16x16
[ RUN      ] FT16x16.conv2
/home/user/NNPACK/test/testers/convolution.h:316: Failure
Expected: (median(maxErrors)) < (errorLimit()), actual: 1.13845e-05 vs 1e-05
[  FAILED  ] FT16x16.conv2 (55256 ms)
[ RUN      ] FT16x16.conv3
[       OK ] FT16x16.conv3 (44449 ms)
[ RUN      ] FT16x16.conv4
[       OK ] FT16x16.conv4 (407022 ms)
[ RUN      ] FT16x16.conv5
[       OK ] FT16x16.conv5 (905012 ms)
[----------] 4 tests from FT16x16 (1411739 ms total)
...
[----------] Global test environment tear-down
[==========] 11 tests from 3 test cases ran. (4190813 ms total)
[  PASSED  ] 10 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] FT16x16.conv2

 1 FAILED TEST
FAILED: convolution-input-gradient-overfeat-fast-test
/home/user/NNPACK/bin/convolution-input-gradient-overfeat-fast-test --gtest_color=yes
ninja: build stopped: subcommand failed.

real    286m29.349s
user    1123m17.960s
sys     0m33.652s
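
 All three Jetson Nano failures come from the same check, `median(maxErrors) < errorLimit()`. To judge how far each run overshoots the limit, the `actual: X vs Y` lines can be pulled out of the logs and compared. This is a sketch that assumes the failure message keeps the format shown above; `overshoot_ratios` is a hypothetical name.

```python
import re

# Matches the failure detail lines seen in the logs above, e.g.:
# Expected: (median(maxErrors)) < (errorLimit()), actual: 1.11727e-05 vs 1e-05
ACTUAL_RE = re.compile(r"actual:\s*([0-9.eE+-]+)\s+vs\s+([0-9.eE+-]+)")


def overshoot_ratios(log_text):
    """Return measured_error / error_limit for every failure line in the log."""
    ratios = []
    for match in ACTUAL_RE.finditer(log_text):
        measured, limit = (float(value) for value in match.groups())
        ratios.append(measured / limit)
    return ratios


# The three failure lines from runs 1 to 3 above.
SAMPLE = """\
Expected: (median(maxErrors)) < (errorLimit()), actual: 1.11727e-05 vs 1e-05
Expected: (median(maxErrors)) < (errorLimit()), actual: 7.4158e-05 vs 1e-05
Expected: (median(maxErrors)) < (errorLimit()), actual: 1.13845e-05 vs 1e-05
"""

if __name__ == "__main__":
    for ratio in overshoot_ratios(SAMPLE):
        print(f"{ratio:.2f}x the error limit")
```

 In these logs the two FT16x16.conv2 failures exceed the limit by roughly 12 to 14 percent, while the FT16x16.conv1 failure is about 7.4 times the limit.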


● How to build NNPACK on the NVIDIA Jetson Nano

 This time, build NNPACK the "recommended way to build" (through CMake) instead of following the "Development builds" procedure.

# As usual, refresh the package lists with sudo apt-get update
sudo apt-get update

# Building: for most users, the recommended way to build NNPACK is through CMake

# Install ninja build system
sudo apt-get -y install ninja-build

sudo apt-get -y install git cmake

# sudo: pip: command not found
sudo apt-get -y install python-pip

# ImportError: No module named setuptools
sudo apt-get -y install python-setuptools

sudo pip install --upgrade git+https://github.com/Maratyszcza/PeachPy
sudo pip install --upgrade git+https://github.com/Maratyszcza/confu

cd
git clone https://github.com/Maratyszcza/NNPACK.git
cd NNPACK
confu setup

mkdir build
cd build
cmake -G Ninja ..
ninja

time ninja test
user@user-desktop:~/NNPACK/build$ time ninja test
[0/1] Running tests...
Test project /home/user/NNPACK/build
      Start  1: convolution-inference-smoketest
 1/34 Test  #1: convolution-inference-smoketest .........   Passed   28.18 sec
      Start  2: convolution-inference-alexnet
 2/34 Test  #2: convolution-inference-alexnet ...........   Passed  191.48 sec
      Start  3: convolution-inference-overfeat
 3/34 Test  #3: convolution-inference-overfeat ..........   Passed  979.89 sec
      Start  4: convolution-inference-vgg
 4/34 Test  #4: convolution-inference-vgg ...............   Passed  1635.33 sec
      Start  5: convolution-output-smoketest
 5/34 Test  #5: convolution-output-smoketest ............   Passed    6.52 sec
      Start  6: convolution-output-alexnet
 6/34 Test  #6: convolution-output-alexnet ..............   Passed  3275.18 sec
      Start  7: convolution-output-overfeat
 7/34 Test  #7: convolution-output-overfeat .............   Passed  15770.11 sec
      Start  8: convolution-output-vgg
 8/34 Test  #8: convolution-output-vgg ..................   Passed  24912.73 sec
      Start  9: convolution-input-gradient-smoketest
 9/34 Test  #9: convolution-input-gradient-smoketest ....   Passed    2.87 sec
      Start 10: convolution-input-gradient-alexnet
10/34 Test #10: convolution-input-gradient-alexnet ......   Passed  1648.51 sec
      Start 11: convolution-input-gradient-overfeat
11/34 Test #11: convolution-input-gradient-overfeat .....***Failed  11023.48 sec
      Start 12: convolution-input-gradient-vgg
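
 Because this run takes many hours, it helps to turn the CTest result lines into records that can be sorted or summed later. The sketch below assumes the line format shown in the output above; `parse_ctest` and its regular expression are illustrative, not a CTest API.

```python
import re

# Matches CTest result lines such as:
#  1/34 Test  #1: convolution-inference-smoketest .........   Passed   28.18 sec
# 11/34 Test #11: convolution-input-gradient-overfeat .....***Failed  11023.48 sec
CTEST_RE = re.compile(
    r"^\s*\d+/\d+ Test\s+#(\d+): (\S+) \.+\s*\**(\w+)\s+([\d.]+) sec$"
)


def parse_ctest(log_text):
    """Return a list of (test_number, name, status, seconds) tuples."""
    results = []
    for line in log_text.splitlines():
        match = CTEST_RE.match(line.rstrip())
        if match:
            number, name, status, seconds = match.groups()
            results.append((int(number), name, status, float(seconds)))
    return results


# Three result lines copied from the output above.
SAMPLE = """\
 1/34 Test  #1: convolution-inference-smoketest .........   Passed   28.18 sec
 2/34 Test  #2: convolution-inference-alexnet ...........   Passed  191.48 sec
11/34 Test #11: convolution-input-gradient-overfeat .....***Failed  11023.48 sec
"""

if __name__ == "__main__":
    for number, name, status, seconds in parse_ctest(SAMPLE):
        print(f"#{number:<3d} {status:7s} {seconds / 60:8.1f} min  {name}")
```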


● NNPACK benchmark comparison: NVIDIA Jetson Nano vs. Raspberry Pi 3B+

 The Orange Pi PC 2 is also included in the comparison.

Test (unit: seconds)                            NVIDIA Jetson Nano  Raspberry Pi 3B+  Orange Pi PC 2
Test  #1: convolution-inference-smoketest       28.18               69.54             202.68
Test  #2: convolution-inference-alexnet         191.48              462.03            715.02
Test  #3: convolution-inference-overfeat        979.89              Failed            Exception: Other
Test  #4: convolution-inference-vgg             1635.33             4488.53           5817.81
Test  #5: convolution-output-smoketest          6.52                15.86             48.16
Test  #6: convolution-output-alexnet            3275.18             Not measured      Not measured
Test  #7: convolution-output-overfeat           15770.11            Not measured      Not measured
Test  #8: convolution-output-vgg                24912.73            Not measured      Not measured
Test  #9: convolution-input-gradient-smoketest  2.87                Not measured      Not measured
Test #10: convolution-input-gradient-alexnet    1648.51             Not measured      Not measured
Test #11: convolution-input-gradient-overfeat   Failed              Not measured      Not measured
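
 To put the table into relative terms, each board's time can be divided by the Jetson Nano's time for the same test. A minimal sketch using only the first five rows of the table above; `TIMES` and `slowdown_vs_jetson` are hypothetical names, and `None` stands for a run that failed or was not measured.

```python
# Seconds per test as (Jetson Nano, Raspberry Pi 3B+, Orange Pi PC 2),
# copied from the table above. None marks a failed or unmeasured run.
TIMES = {
    "convolution-inference-smoketest": (28.18, 69.54, 202.68),
    "convolution-inference-alexnet": (191.48, 462.03, 715.02),
    "convolution-inference-overfeat": (979.89, None, None),
    "convolution-inference-vgg": (1635.33, 4488.53, 5817.81),
    "convolution-output-smoketest": (6.52, 15.86, 48.16),
}


def slowdown_vs_jetson(times):
    """Return {test: (rpi_ratio, opi_ratio)}; a ratio is None if a time is missing."""
    ratios = {}
    for test, (jetson, rpi, opi) in times.items():
        ratios[test] = tuple(
            None if other is None else other / jetson for other in (rpi, opi)
        )
    return ratios


if __name__ == "__main__":
    for test, (rpi, opi) in slowdown_vs_jetson(TIMES).items():
        rpi_text = "n/a" if rpi is None else f"{rpi:.2f}x"
        opi_text = "n/a" if opi is None else f"{opi:.2f}x"
        print(f"{test:34s} RPi 3B+ {rpi_text:>6s}  Orange Pi PC 2 {opi_text:>6s}")
```

 On the four tests that completed on all three boards, the Raspberry Pi 3B+ takes roughly 2.4 to 2.7 times as long as the Jetson Nano, and the Orange Pi PC 2 roughly 3.6 to 7.4 times as long.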

Orange Pi PC 2

user@orangepipc2:~$ uname -a
Linux orangepipc2 4.19.20-sunxi64 #5.75 SMP Fri Feb 8 10:29:25 CET 2019 aarch64 GNU/Linux


● NNPACK build error

 Solution: build NNPACK through CMake (cmake -G Ninja) instead of the confu-generated Ninja build (python configure.py).

Build Error: symbol(s) not found #166

Maratyszcza commented Apr 28, 2019
confu recipe for Google Benchmark needs an update for recent Google Benchmark changes (mostly, add new files).
One option is to fix the recipe here.
https://github.com/Maratyszcza/confu/blob/master/confu/recipes/googlebenchmark.py

Another option is to build NNPACK with CMake instead of Ninja.

...
‘write’, declared with attribute warn_unused_result [-Wunused-result]
    write(STDOUT_FILENO, out_buffer, prefix_chars + format_chars + CLOG_SUFFIX_LENGTH);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/user/NNPACK/deps/clog/src/clog.c: In function ‘clog_vlog_debug’:
/home/user/NNPACK/deps/clog/src/clog.c:416:4: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result]
    write(STDOUT_FILENO, out_buffer, prefix_chars + format_chars + CLOG_SUFFIX_LENGTH);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[68/198] CXX deps/googlebenchmark/src/benchmark.cc
/home/user/NNPACK/deps/googlebenchmark/src/benchmark.cc: In function ‘std::unique_ptr benchmark::internal::{anonymous}::CreateReporter(const string&, benchmark::ConsoleReporter::OutputOptions)’:
/home/user/NNPACK/deps/googlebenchmark/src/benchmark.cc:300:24: warning: ‘CSVReporter’ is deprecated: The CSV Reporter will be removed in a future release [-Wdeprecated-declarations]
     return PtrType(new CSVReporter);
                        ^~~~~~~~~~~
In file included from /home/user/NNPACK/deps/googlebenchmark/src/benchmark.cc:15:0:
/home/user/NNPACK/deps/googlebenchmark/include/benchmark/benchmark.h:1520:61: note: declared here
     "The CSV Reporter will be removed in a future release") CSVReporter
                                                             ^~~~~~~~~~~
[87/198] LINK bin/convolution-inference-bench
FAILED: /home/user/NNPACK/bin/convolution-inference-bench
g++ -pthread  -o /home/user/NNPACK/bin/convolution-inference-bench /home/user/NNPACK/build/bench/convolution-inference.cc.o -lrt /home/user/NNPACK/lib/libnnpack.a /home/user/NNPACK/lib/libpthreadpool.a /home/user/NNPACK/lib/libcpuinfo.a /home/user/NNPACK/lib/libclog.a /home/user/NNPACK/lib/libgooglebenchmark.a

collect2: error: ld returned 1 exit status
[90/198] CXX bench/hxgemm.cc
ninja: build stopped: subcommand failed.


● Errors during NNPACK build setup

# ImportError: No module named setuptools
sudo apt-get -y install python-setuptools
# Setting up python-setuptools (33.1.1-1) ...

# ImportError: No module named six
sudo pip install six
# Successfully installed six-1.12.0
python -c "import six; print (six.__version__)"
# 1.12.0

# error: invalid command 'bdist_wheel'
# Failed building wheel for PyYAML
# Failed building wheel for ninja-syntax
sudo pip install wheel
# Successfully installed wheel-0.33.3

sudo pip install --upgrade git+https://github.com/Maratyszcza/PeachPy
# Successfully installed PeachPy-0.2.0 enum34-1.1.6

sudo pip install --upgrade git+https://github.com/Maratyszcza/confu
# Successfully installed confu-0.0.1

cd
git clone https://github.com/Maratyszcza/NNPACK.git
cd NNPACK
confu setup

mkdir build
cd build
cmake -G Ninja ..
ninja

time ninja test
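
 The three setup errors above (missing setuptools, six, and wheel) can be checked in one pass before running pip. A minimal sketch, assuming a Python 3 interpreter (the commands above use the system `python`, so adjust as needed); `missing_modules` is a hypothetical helper and the module list simply mirrors the errors in this section.

```python
import importlib.util

# Modules whose absence caused the errors listed above.
REQUIRED = ("setuptools", "six", "wheel")


def missing_modules(names=REQUIRED):
    """Return the names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all prerequisites importable")
```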




Copyright (c) 2019 FREE WING,Y.Sakamoto

http://www.neko.ne.jp/~freewing/raspberry_pi/nvidia_jetson_nano_build_nnpack/