Archived

This content is available here for research, reference, and/or recordkeeping.

Author ORCID Identifier

https://orcid.org/0000-0002-8059-3824

Date Available

4-24-2026

Year of Publication

2026

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Engineering

Department/School/Program

Electrical and Computer Engineering

Faculty

Ishan Thakkar

Faculty

Daniel Lau

Abstract

In recent years, artificial intelligence has achieved remarkable success across domains such as computer vision, natural language processing, and scientific computing. This progress has been driven largely by advances in deep learning, particularly deep neural networks (DNNs), including convolutional neural networks (CNNs) and transformer-based models. While these models deliver unprecedented accuracy, surpassing human performance on some tasks, their computational complexity continues to grow rapidly due to multibillion- and trillion-parameter designs. As model sizes and deployment scales expand, the demand for energy-efficient and high-throughput hardware accelerators has intensified. Conventional electronic platforms based on CPUs, GPUs, ASICs, and FPGAs are increasingly constrained by the slowdown of CMOS scaling, limiting advances in latency, energy efficiency, and throughput.

Photonic integrated circuits (PICs) offer a promising alternative for accelerating tensor computations by exploiting the inherent advantages of light, including high bandwidth, low latency, and wavelength-division multiplexing. Optical signal propagation avoids resistive losses and impedance-related limitations of electronic interconnects, enabling highly parallel matrix-vector operations. However, despite their potential, PIC-based accelerators face significant challenges, including limited optical power budgets, insertion losses, crosstalk noise, constrained precision scalability, thermal sensitivity, and fabrication variability. Existing photonic accelerator architectures also suffer from high power overheads due to digital-to-analog and analog-to-digital converters, limited support for dynamic tensor operations required by transformer workloads, and strong trade-offs between achievable precision and system scalability.

This dissertation addresses these challenges through a comprehensive architectural, fabrication-aware, and experimental investigation of scalable photonic accelerator systems. First, it develops microring-based photonic general matrix multiplication (GEMM) accelerators that enable byte-size integer arithmetic through precision-scalable bit-slicing techniques, improving compatibility with modern mixed-precision neural networks. Second, it presents a device-circuit-signaling co-design methodology that systematically addresses the three principal constraints on photonic tensor core scalability (inter-modulation crosstalk, two-photon absorption losses, and precision-limited optical dynamic range) through novel circuit organizations, silicon nitride device platforms, and stochastic binary signaling. Third, it provides a unified comparative scalability analysis of structurally valid microring-based photonic tensor core configurations under consistent power budget models. Fourth, it introduces stochastic photonic computing approaches for transformer and large language model inference, mapping both static and dynamic attention computations onto optical hardware.
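
The bit-slicing idea named in the first contribution can be illustrated with a minimal software sketch. It is not the dissertation's architecture: the function names (`split_slices`, `matmul`, `bit_sliced_matmul`), the 4-bit slice width, and the restriction to unsigned 8-bit operands are all assumptions chosen for illustration. Each operand matrix is split into low-precision slices, every pair of slices is multiplied in a separate pass (standing in for one low-precision analog pass), and the partial products are recombined by shift-and-add.

```python
def split_slices(value, slice_bits, num_slices):
    """Split an unsigned integer into little-endian slices of slice_bits each."""
    mask = (1 << slice_bits) - 1
    return [(value >> (i * slice_bits)) & mask for i in range(num_slices)]


def matmul(a, b):
    """Plain integer matrix product; stands in for one low-precision pass."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def slice_matrix(m, slice_bits, num_slices):
    """Return num_slices matrices; matrix i holds the i-th slice of every element."""
    return [
        [[split_slices(v, slice_bits, num_slices)[i] for v in row] for row in m]
        for i in range(num_slices)
    ]


def bit_sliced_matmul(a, b, slice_bits=4, num_slices=2):
    """Compute a @ b for unsigned operands below 2**(slice_bits * num_slices).

    Hypothetical illustration: only slice_bits-wide operands enter each pass.
    """
    rows, cols = len(a), len(b[0])
    result = [[0] * cols for _ in range(rows)]
    a_slices = slice_matrix(a, slice_bits, num_slices)
    b_slices = slice_matrix(b, slice_bits, num_slices)
    for i, a_i in enumerate(a_slices):
        for j, b_j in enumerate(b_slices):
            partial = matmul(a_i, b_j)
            # Slice i of a and slice j of b carry a combined weight of 2**((i + j) * slice_bits).
            shift = (i + j) * slice_bits
            for r in range(rows):
                for c in range(cols):
                    result[r][c] += partial[r][c] << shift
    return result


if __name__ == "__main__":
    a = [[200, 17], [3, 255]]
    b = [[9, 128], [255, 64]]
    print(bit_sliced_matmul(a, b) == matmul(a, b))
```

With two 4-bit slices per operand, one 8-bit product costs four low-precision passes. This is the general precision-versus-throughput trade-off that bit-slicing schemes have to manage; the specific slicing scheme used in the dissertation may differ.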

Beyond architectural modeling and evaluation, this work extends to fabrication-aware photonic system design. It incorporates system-level PIC architecture development, layout implementation in GDSII, design rule checking, and tape-out considerations informed by foundry-level fabrication workflows using the Advanced Micro Foundry process design kit, particularly for digitally and stochastically driven dot-product engines. These efforts bridge theoretical accelerator design and practical silicon photonics realization. In addition, this dissertation establishes an experimental foundation for spintronic reservoir computing through ferromagnetic resonance characterization of magnetic thin films. An automated broadband vector network analyzer ferromagnetic resonance measurement system is developed, and permalloy thin-film measurements validate the experimental methodology for characterizing candidate materials for spin-wave-based unconventional computing.

By unifying architectural modeling, scalable precision design, fabrication-aware photonic implementation through a foundry tape-out of microring modulators and weight banks, and experimental ferromagnetic resonance characterization, this research advances the design of energy-efficient, scalable, and flexible tensor processing architectures based on photonic integrated circuits. The outcomes contribute toward practical, high-performance artificial intelligence hardware platforms supporting next-generation deep learning workloads.

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2026.115

Archival?

Archival
