Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

EC-ECC: Accelerating Elliptic Curve Cryptography for Edge Computing on Embedded GPU TX2

EC-ECC: Accelerating Elliptic Curve Cryptography for Edge Computing on Embedded GPU TX2 Driven by artificial intelligence and computer vision industries, Graphics Processing Units (GPUs) are now rapidly achieving extraordinary computing power. In particular, the NVIDIA Tegra K1/X1/X2 embedded GPU platforms, which are also treated as edge computing devices, are now widely used in embedded environments such as mobile phones, game consoles, and vehicle-mounted systems to support high-dimension display, auto-pilot, and so on. Meanwhile, with the rise of the Internet of Things (IoT), the demand for cryptographic operations for secure communications and authentications between edge computing nodes and IoT devices is also expanding. In this contribution, instead of the conventional implementations based on FPGA, ASIC, and ARM CPUs, we provide an alternative solution for cryptographic implementation on embedded GPU devices. Targeting the new cipher suite added in TLS 1.3, we implement Edwards25519/448 and Curve25519/448 on an edge computing platform, embedded GPU NVIDIA Tegra X2, where various performance optimizations are customized for the target platform, including a novel parallel method for the register-limited embedded GPUs. With about 15 W of power consumption, it can provide 210k/31k ops/s of Curve25519/448 scalar multiplication, 834k/123k ops/s of fixed-point Edwards25519/448 scalar multiplication, and 150k/22k ops/s of unknown-point one, which are respectively the primitives and main workloads of key agreement, signature generation, and verification of the TLS 1.3 protocol. Our implementations achieve 8 to 26 times speedup of OpenSSL running in the very powerful ARM CPU of the same platform and outperform the state-of-the-art implementations in FPGA by a wide margin with better power efficiency. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png ACM Transactions on Embedded Computing Systems (TECS) Association for Computing Machinery

EC-ECC: Accelerating Elliptic Curve Cryptography for Edge Computing on Embedded GPU TX2

Loading next page...
 
/lp/association-for-computing-machinery/ec-ecc-accelerating-elliptic-curve-cryptography-for-edge-computing-on-y21tstuwFc

References

References for this paper are not available at this time. We will be adding them shortly, thank you for your patience.

Publisher
Association for Computing Machinery
Copyright
Copyright © 2022 Association for Computing Machinery.
ISSN
1539-9087
eISSN
1558-3465
DOI
10.1145/3492734
Publisher site
See Article on Publisher Site

Abstract

Driven by artificial intelligence and computer vision industries, Graphics Processing Units (GPUs) are now rapidly achieving extraordinary computing power. In particular, the NVIDIA Tegra K1/X1/X2 embedded GPU platforms, which are also treated as edge computing devices, are now widely used in embedded environments such as mobile phones, game consoles, and vehicle-mounted systems to support high-dimension display, auto-pilot, and so on. Meanwhile, with the rise of the Internet of Things (IoT), the demand for cryptographic operations for secure communications and authentications between edge computing nodes and IoT devices is also expanding. In this contribution, instead of the conventional implementations based on FPGA, ASIC, and ARM CPUs, we provide an alternative solution for cryptographic implementation on embedded GPU devices. Targeting the new cipher suite added in TLS 1.3, we implement Edwards25519/448 and Curve25519/448 on an edge computing platform, embedded GPU NVIDIA Tegra X2, where various performance optimizations are customized for the target platform, including a novel parallel method for the register-limited embedded GPUs. With about 15 W of power consumption, it can provide 210k/31k ops/s of Curve25519/448 scalar multiplication, 834k/123k ops/s of fixed-point Edwards25519/448 scalar multiplication, and 150k/22k ops/s of unknown-point one, which are respectively the primitives and main workloads of key agreement, signature generation, and verification of the TLS 1.3 protocol. Our implementations achieve 8 to 26 times speedup of OpenSSL running in the very powerful ARM CPU of the same platform and outperform the state-of-the-art implementations in FPGA by a wide margin with better power efficiency.

Journal

ACM Transactions on Embedded Computing Systems (TECS)Association for Computing Machinery

Published: Feb 8, 2022

Keywords: ECC

References