Generative Deep Learning Model for a Multi-level Nano-Optic Broadband Power Splitter

Tang, Yingheng; Kojima, Keisuke; Koike-Akino, Toshiaki; Wang, Ye; Wu, Pengxiang; Tahersima, Mohammad; Jha, Devesh; Parsons, Kieran; Qi, Minghao

TR2020-025 March 11, 2020

Abstract

A novel conditional variational autoencoder (CVAE) model with adversarial censoring is presented to generate power splitters with arbitrary splitting ratios over a 550 nm bandwidth (1250 nm to 1800 nm).

Optical Fiber Communication Conference and Exposition (OFC)
Generative Deep Learning Model for a Multi-level Nano-Optic Broadband Power Splitter

Yingheng Tang1,2, Keisuke Kojima1,*, Toshiaki Koike-Akino1, Ye Wang1, Pengxiang Wu1,
Mohammad Tahersima1, Devesh Jha1, Kieran Parsons1, and Minghao Qi2

1Mitsubishi Electric Research Labs. (MERL), 201 Broadway, Cambridge, MA 02139, USA.
2School of Electrical and Computer Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, IN 47907, USA
*kojima@merl.com

Abstract: A novel conditional variational autoencoder (CVAE) model with adversarial censoring is presented to generate power splitters with arbitrary splitting ratios over a 550 nm bandwidth (1250 nm to 1800 nm). © 2020 The Author(s)

OCIS codes: (130.3120) Integrated optics devices; (230.1150) All-optical devices

1. Introduction

Using machine learning to automate photonic design has attracted increasing attention. An optimization process that integrates neural networks (NN) can accelerate optimization by reducing the number of numerical simulations required [1]. Tahersima et al. [2] used a deep neural network (DNN) in the inverse direction, i.e., with target performance data (such as transmission spectra) as input and the device design as output. However, the DNN structure we used (i.e., ResNet) was a one-to-one deterministic mapping, which generates only one device for each performance target. Another limitation of our previous demonstrations is that the nanostructured device consists of binary pixels (i.e., an etch hole is either present or absent). To overcome this limitation, we propose a multi-level pixel structure (i.e., multiple etch-hole dimensions), which poses a more complex optimization problem and requires more sophisticated optimization algorithms. In metamaterials research, a few groups have proposed using generative networks to generate patterns from random numbers: Liu et al. applied generative adversarial networks (GAN) [3], and Ma et al. employed a variational autoencoder (VAE) [4]. Inspired by these works, we propose using a conditional variational autoencoder (CVAE) [5] for our power splitter design. The VAE can model the distribution of splitters with different splitting ratios, and thereby allows novel patterns that follow the same distribution to be generated through sampling. When combined with conditions, as in a CVAE, it can produce patterns that satisfy given conditions such as target performance specifications. In addition, to further improve performance, an adversarial block is introduced to regularize the CVAE so that a physically meaningful device representation is learned independently of the conditions.
In our application, we use different hole sizes to express the appearance of etched holes, which serve as the conditions of the CVAE. In this way, the generated patterns guide light better and the generated devices are more stable.

Our device footprint is 2.25 × 2.25 µm² with a 20 × 20 array of etched holes. This is the first demonstration of applying a CVAE model to assist silicon photonics device design. We confirm that the optimized device has an overall transmission close to 90% across the full bandwidth from 1250 nm to 1800 nm, which includes the O-band through the C-band. To the best of the authors' knowledge, this is the smallest broadband power splitter with an arbitrary splitting ratio.

2. Power splitter structure

Our device is a multi-mode interference (MMI) based power splitter with a footprint of 2.25 µm × 2.25 µm, air cladding, a waveguide width of 500 nm, and a height of 220 nm. We add a 20 × 20 hole vector (HV) to express the nanostructured hole configuration. The hole spacing is 130 nm, and the maximum and minimum hole diameters are 72 nm and 40 nm, respectively. The HV training data initially consist only of binary numbers and are obtained through the direct binary search method. Note that the VAE models the probability distribution over different HVs, so each generated HV value is the probability of a Bernoulli distribution. To best reflect this, different hole sizes are used to represent the probability that an etched hole appears at a given location. Figure 1 shows a sample footprint of the power splitter.
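As an illustration of how a generated probability could be turned into a hole size, the following minimal sketch maps each HV value to a diameter between the 40 nm and 72 nm limits quoted above. The linear interpolation and the cutoff `P_CUTOFF` (below which no hole is etched) are assumptions made for this example; the text does not specify the actual mapping.

```python
import random

D_MIN_NM = 40.0  # smallest hole diameter quoted in the text
D_MAX_NM = 72.0  # largest hole diameter quoted in the text
P_CUTOFF = 0.1   # assumed: below this probability no hole is etched


def probability_to_diameter(p, cutoff=P_CUTOFF):
    """Map a hole-presence probability in [0, 1] to a diameter in nm (0.0 = no hole)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p < cutoff:
        return 0.0
    # Assumed linear interpolation between the minimum and maximum diameters.
    return D_MIN_NM + (D_MAX_NM - D_MIN_NM) * (p - cutoff) / (1.0 - cutoff)


def hv_to_diameters(hv):
    """Apply the mapping to every entry of a 2D hole vector (list of rows)."""
    return [[probability_to_diameter(p) for p in row] for row in hv]


# Example: a random 20 x 20 HV standing in for a decoder output.
rng = random.Random(0)
hv = [[rng.random() for _ in range(20)] for _ in range(20)]
diameters = hv_to_diameters(hv)
```

With this mapping a binary training pattern (values 0 and 1) yields only "no hole" or the 72 nm hole, while intermediate probabilities yield intermediate diameters.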

3. CVAE structure with adversarial supervision

We first use a variational encoder [6] with convolutional layers [7] to encode the original 20 × 20 HV into a set of latent variables. A new HV is then generated as follows: a latent vector $z$ (of length 60 in our application) is sampled from the prior distribution $P_{θ}(z)$, and the data $x$ is then generated from the conditional distribution $P_{θ}(x|z)$: $z \sim P_{θ}(z), x \sim P_{θ}(x|z)$. As shown in Fig. 1, the original HV passes through two convolutional layers and is reduced to two sets of intermediate parameters, the mean ($μ$) and the standard deviation ($σ$), which represent the underlying probability distribution. To make backpropagation feasible, the reparameterization trick is applied, as shown in the following equation:

$$z^i = μ^i + σ^i \times ε$$

Here $ε$ is drawn from a standard normal distribution and $i$ is the batch number. The reparameterized latent variable $z$ is then concatenated with the additionally encoded condition parameter $s$ and decoded back to the HV. When the input pattern is encoded by the encoder, some performance data may become entangled in the latent variable as well, which may degrade the performance of devices built from the generated patterns. To improve the generator, an adversarial block is added to isolate the latent variable $z$ from the condition $s$ (the performance data) so that condition-free device physics is learned better [8]. The loss function is as follows:

$$\begin{aligned}
Loss = {} & -[y_n \cdot \log x_n + (1 - y_n) \cdot \log(1 - x_n)] \\
& - \frac{1}{2} \sum_{j=1}^{l} \left(1 + \log(σ^2_{z_j}) - μ^2_{z_j} - σ^2_{z_j}\right) \\
& - α \cdot MSE_{loss}(s, \bar{s})
\end{aligned}$$

The loss function has two parts. The first is the VAE loss, which consists of the binary cross-entropy loss and the Kullback–Leibler divergence. The second is the mean-square error (MSE) loss of the adversarial block. Because the generative decoder is already fed the condition information directly, the condition information contained in the latent variable $z$ should be minimized; hence the MSE loss between $s$ and $\bar{s}$ (the condition data generated by the adversarial block) needs to be maximized.
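The reparameterization step and the three loss terms can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the KL term uses the standard closed form for a diagonal Gaussian against a standard normal prior, the cross-entropy is summed over pixels, and `s_hat` stands for the adversarial block's estimate $\bar{s}$. How the adversarial block itself is trained is not covered here.

```python
import math
import random


def reparameterize(mu, sigma, rng):
    """Sample z_j = mu_j + sigma_j * eps_j with eps_j ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def bce(y, x, tiny=1e-12):
    """Binary cross-entropy summed over pixels (y: target HV, x: decoder output)."""
    return -sum(
        t * math.log(p + tiny) + (1 - t) * math.log(1 - p + tiny)
        for t, p in zip(y, x)
    )


def kl_divergence(mu, sigma):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over the latent dimensions."""
    return -0.5 * sum(
        1 + math.log(s * s) - m * m - s * s for m, s in zip(mu, sigma)
    )


def mse(s, s_hat):
    """Mean-square error between the true condition and the adversary's estimate."""
    return sum((a - b) ** 2 for a, b in zip(s, s_hat)) / len(s)


def cvae_loss(y, x, mu, sigma, s, s_hat, alpha):
    """VAE loss minus the weighted adversarial MSE, which is to be maximized."""
    return bce(y, x) + kl_divergence(mu, sigma) - alpha * mse(s, s_hat)


# Tiny example with a 2-pixel "HV", a 3-dimensional latent and a 2-value condition.
rng = random.Random(0)
z = reparameterize([0.0, 0.5, -0.5], [1.0, 0.8, 1.2], rng)
loss = cvae_loss(
    y=[1, 0], x=[0.9, 0.1],
    mu=[0.0, 0.5, -0.5], sigma=[1.0, 0.8, 1.2],
    s=[0.5, 0.5], s_hat=[0.4, 0.7], alpha=1.0,
)
```

Because the MSE term enters with a negative sign, a latent variable from which the adversary predicts $s$ poorly lowers the loss, which is the censoring behavior described above.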

4. Results

After training, we tested the generator by producing different devices for given target spectra. Although the CVAE models the Bernoulli-distributed HV, we can interpret the probability of hole presence in each generated HV as the size of the etched hole at that location, so that novel multi-level nanophotonic devices can be generated even from binary-level training data. To verify the effectiveness of the generator, we consider devices that realize different splitting ratios: 5:5, 6:4, 7:3, and 8:2. Figure 4 shows the results generated by the model and the finite-difference time-domain (FDTD) verification of the generated devices. The reflection is below -20 dB and the transmission exceeds 87% across the 1250 nm to 1800 nm bandwidth.
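To put the quoted figures on a common scale, the short sketch below converts between decibels and power fractions. The helper names are hypothetical and the snippet only restates the numbers given above; it does not reproduce any simulation.

```python
import math


def db_to_fraction(db):
    """Convert a power ratio in dB to a linear power fraction."""
    return 10.0 ** (db / 10.0)


def fraction_to_db(fraction):
    """Convert a linear power fraction to dB."""
    return 10.0 * math.log10(fraction)


# A reflection of -20 dB corresponds to 1% of the input power.
reflection_fraction = db_to_fraction(-20.0)

# A total transmission of 87% corresponds to roughly 0.6 dB of insertion loss.
insertion_loss_db = -fraction_to_db(0.87)
```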

5. Summary

A novel CVAE model with adversarial censoring is proposed and applied to generate power splitters with arbitrary splitting ratios over the bandwidth from 1250 nm to 1800 nm. FDTD simulations show an overall transmission close to 90% for the newly generated multi-level nanophotonic devices, even though only binary-level nanophotonic datasets were used for training.

6. References