An attentional RNN model for online visual tracking that combines region attention and scale attention.
Region attention: to capture how much each object part contributes to tracking the target, the object's bounding box is divided into a grid of cells, and each cell is weighted according to its difference from the surroundings.
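The grid weighting described above could be sketched as follows. This is only an illustration under stated assumptions, not the model's actual attention network: the image is taken to be a 2-D grayscale array, "difference from the surroundings" is approximated by the absolute gap between a cell's mean intensity and the mean of a ring around the box, and the differences are turned into weights with a softmax. The function name `region_attention_weights` and the parameters `grid` and `margin` are hypothetical.

```python
import numpy as np

def region_attention_weights(image, box, grid=(3, 3), margin=0.5):
    """Weight each grid cell of `box` by its contrast with a surrounding ring.

    image : 2-D array (grayscale), an assumption made for this sketch
    box   : (x, y, w, h) in pixels, assumed to lie inside the image
    grid  : (rows, cols) of the cell layout (hypothetical parameter)
    margin: ring width as a fraction of the box size (hypothetical parameter)
    """
    x, y, w, h = box
    rows, cols = grid

    # Surrounding region: the box enlarged by `margin`, clipped to the image.
    mx, my = int(w * margin), int(h * margin)
    x0, y0 = max(x - mx, 0), max(y - my, 0)
    x1 = min(x + w + mx, image.shape[1])
    y1 = min(y + h + my, image.shape[0])
    outer = image[y0:y1, x0:x1].astype(float)
    inner = image[y:y + h, x:x + w].astype(float)

    # Mean intensity of the ring (enlarged region minus the box itself).
    ring_area = max(outer.size - inner.size, 1)
    ring_mean = (outer.sum() - inner.sum()) / ring_area

    # Absolute difference between each cell's mean and the ring mean.
    ys = np.linspace(0, h, rows + 1).astype(int)
    xs = np.linspace(0, w, cols + 1).astype(int)
    diff = np.empty(grid)
    for i in range(rows):
        for j in range(cols):
            cell = inner[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            diff[i, j] = abs(cell.mean() - ring_mean)

    # Softmax so the weights are positive and sum to 1.
    e = np.exp(diff - diff.max())
    return e / e.sum()
```

Cells that stand out from the background receive larger weights, while cells that resemble the surroundings are down-weighted. In a learned model the hand-written contrast measure would presumably be replaced by features and parameters trained with the rest of the network.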
Scale attention: to adapt to changes in the target and the scene, the model selects the most accurate object location from features computed at multiple scales.
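A minimal sketch of selecting a location across scales is given below. It assumes that each scale yields one feature vector and one candidate box, and that a scoring vector `w` (hypothetical, standing in for learned attention parameters) maps each feature vector to a scalar score. The source does not specify the scoring function, so the linear score and softmax here are illustrative choices only.

```python
import numpy as np

def scale_attention(scale_features, candidate_boxes, w):
    """Pick the candidate box whose scale receives the highest attention.

    scale_features : (S, D) array, one feature vector per scale
    candidate_boxes: (S, 4) array, one (x, y, w, h) box per scale
    w              : (D,) scoring vector (hypothetical learned parameter)
    """
    scores = scale_features @ w            # one scalar score per scale
    e = np.exp(scores - scores.max())      # numerically stable softmax
    attn = e / e.sum()
    best = int(attn.argmax())              # hard selection of one scale
    return candidate_boxes[best], attn
```

Returning the attention vector alongside the chosen box makes it possible to use a soft combination of the candidates instead of a hard selection, should a differentiable variant be preferred during training.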
Joint training: the recurrent network and the attention network are trained together.