The GRU uses two gates, a reset gate and an update gate, to mitigate the vanishing gradient problem. These gates can be trained to retain information over long time spans without it fading, and to discard information that is not needed for the prediction [25]. The overall GRU structure is shown in Fig. 3.
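To make the role of the two gates concrete, the following is a minimal NumPy sketch of a single GRU step. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`gru_cell`, `init_params`), the parameter names, and the weight initialization are hypothetical, and the interpolation convention h = (1 - z) * h_prev + z * h_tilde is one of two common forms in the literature (some formulations swap the roles of z and 1 - z), so the variant depicted in Fig. 3 may differ.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps values to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (illustrative initialization only)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = 0.1 * rng.standard_normal((hidden_size, input_size))
        params["U" + gate] = 0.1 * rng.standard_normal((hidden_size, hidden_size))
        params["b" + gate] = np.zeros(hidden_size)
    return params


def gru_cell(x, h_prev, p):
    """One GRU time step (assumed convention: z close to 0 keeps the old state)."""
    # Update gate: how much of the state is overwritten by the candidate.
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])
    # Reset gate: how much of the previous state feeds into the candidate.
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])
    # Candidate state computed from the input and the reset-scaled previous state.
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    # New state: element-wise interpolation between old state and candidate.
    return (1.0 - z) * h_prev + z * h_tilde


if __name__ == "__main__":
    p = init_params(input_size=3, hidden_size=4)
    h = np.zeros(4)
    for x in np.random.default_rng(1).standard_normal((5, 3)):
        h = gru_cell(x, h, p)
    print(h.shape)
```

Under this convention, an update gate close to zero copies the previous state forward almost unchanged, which is how the cell can carry information across many time steps, while a reset gate close to zero lets the candidate state ignore the previous state.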