Vanishing Gradient in Deep Networks - Quant Trader Interview Question
Difficulty: Medium
Category: Linear Algebra & Machine Learning
Topics: vanishing-gradient, deep-learning, neural-networks, sigmoid, backpropagation
Problem Description
You are training a very deep neural network (100+ layers) for a complex financial time-series forecasting task. You observe that the early layers of your network learn very slowly compared to the later layers. The activation function used in each layer is a sigmoid function. Explain why this phenomenon, known as the vanishing gradient problem, occurs in the early layers.
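To build intuition for the expected answer, here is a minimal numerical sketch (not part of the original question) showing why gradients shrink in early layers. By the chain rule, the gradient reaching layer k during backpropagation is scaled by a product of per-layer factors, each of which includes the sigmoid's derivative. That derivative is at most 0.25, so even in the best case the product decays geometrically with depth:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deriv(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # maximum value 0.25, attained at x = 0

# During backprop, each layer contributes a factor of sigmoid'(z)
# (times a weight term) to the gradient of earlier layers.
# Even at the sigmoid's best case, sigmoid'(0) = 0.25, the product
# over a 100-layer network is astronomically small:
depth = 100
best_case_factor = sigmoid_deriv(0.0)       # 0.25
gradient_scale = best_case_factor ** depth  # 0.25 ** 100 = 2 ** -200

print(best_case_factor)  # 0.25
print(gradient_scale)    # ~6.2e-61: early-layer gradients effectively vanish
```

With gradients scaled by roughly 10^-61, weight updates in the earliest layers are negligible, which is exactly why those layers appear to learn far more slowly than the later ones. (For saturated activations, where |z| is large, sigmoid'(z) is far below 0.25 and the decay is even faster.)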