The network is a new graph neural network structure combining an LSTM aggregator ... and Multi-modal factorized bilinear pooling is used to fuse the multimodal features. Finally, the corresponding BDI-II ...
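
The fusion step named above can be illustrated with a minimal NumPy sketch of factorized bilinear pooling in its commonly described form: project each modality, multiply element-wise, sum-pool over windows of size k, then apply power and L2 normalization. The function name `mfb_fuse`, the argument names, and the feature dimensions are hypothetical and chosen only for illustration; the projection matrices would normally be learned, and the exact configuration used in this work is not stated here.

```python
import numpy as np


def mfb_fuse(x, y, U, V, k):
    """Fuse two feature vectors with factorized bilinear pooling (sketch).

    x: (m,) features of one modality (hypothetical input)
    y: (n,) features of another modality (hypothetical input)
    U: (m, k * o) projection for x (learned in practice, assumed given here)
    V: (n, k * o) projection for y (learned in practice, assumed given here)
    k: number of factors summed per output unit
    Returns a fused vector of shape (o,).
    """
    # Project both modalities into a shared (k * o)-dimensional space
    # and combine them with an element-wise product.
    joint = (x @ U) * (y @ V)

    # Sum pooling over non-overlapping windows of size k -> shape (o,).
    z = joint.reshape(-1, k).sum(axis=1)

    # Power normalization (signed square root).
    z = np.sign(z) * np.sqrt(np.abs(z))

    # L2 normalization, guarding against a zero vector.
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z
```

Each output unit is a sum of k rank-one bilinear interactions, which keeps the parameter count at (m + n) * k * o rather than the m * n * o of a full bilinear map.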