Mohsen Fayyaz

PhD Candidate, University of Bonn, Computer Vision Group of Prof. Dr. Juergen Gall

About Me

Experienced doctoral researcher at the University of Bonn with a demonstrated history of working in the research industry. Skilled in computer vision and deep learning. Strong research professional with a Master’s degree focused on artificial intelligence.


University of Bonn

Doctoral Researcher

Computer vision doctoral researcher at the University of Bonn, faculty of Computer Science III, under the supervision of Prof. Dr. J. Gall.


Computer Vision Researcher

Designing and developing deep neural network architectures for computer vision tasks.

Iran University of Science and Technology

Research Assistant

Machine learning and deep learning researcher in the IUST HPC lab under the supervision of Prof. Dr. M. Fathy.


University of Bonn

November 2017 - Now

PhD Candidate, Computer Science

Supervisor: Prof. Dr. J. Gall


September 2014 - September 2016

Master of Science in Computer Science - Artificial Intelligence and Robotics

Supervisors: Prof. Mahmood Fathy, Dr. Mojtaba Hosseini
Advisor: Dr. Mohammad Sabokrou
Thesis: Activity Recognition in Video based on Convolutional Neural Networks
GPA: 19.13/20.00
Ranked first, with the highest GPA among all Computer Engineering (AI) students since 2014

Semnan University

September 2010 - September 2014

Bachelor of Science in Computer Software Engineering

Supervisor: Dr. K. Kiani
Thesis: Designing and Implementing a Cloud-based Accounting System
Ranked first, with the highest GPA among all of the university's computer engineering students since 2010


AVID: Adversarial Visual Irregularity Detection

2018 - ACCV
M. Sabokrou*, M. Pourreza*, M. Fayyaz*, R. Entezari, M. Fathy, J. Gall, E. Adeli
Real-time detection of irregularities in visual data is invaluable in many prospective applications, including surveillance and patient monitoring systems. With the surge of deep learning methods in recent years, researchers have tried a wide spectrum of methods for different applications. However, for irregularity or anomaly detection in videos, training an end-to-end model is still an open challenge, since irregularity is often not well-defined and there are not enough irregular samples to use during training ...

Spatio-Temporal Channel Correlation Networks for Action Classification

July 2018 - ECCV
A. Diba*, M. Fayyaz*, V. Sharma, M. Arzani, R. Yousefzadeh, J. Gall, L. Van Gool
The work in this paper is driven by the question of whether spatio-temporal correlations are enough for 3D convolutional neural networks (CNNs). Most traditional 3D networks use local spatio-temporal features. We introduce a new block that models correlations between the channels of a 3D CNN with respect to temporal and spatial features. This new block can be added as a residual unit to different parts of 3D CNNs. We name our novel block 'Spatio-Temporal Channel Correlation' (STC)…

Temporal 3D ConvNets by Temporal Transition Layer

May 2018 - CVPR Workshop on Brave New Ideas in Video Understanding 2018
A. Diba*, M. Fayyaz*, V. Sharma, A. Karami, M. Arzani, R. Yousefzadeh, L. Van Gool
The work in this paper is driven by the question of how to exploit the temporal cues available in videos for their accurate classification, and for human action recognition in particular. Thus far, the vision community has focused on spatio-temporal approaches with fixed temporal convolution kernel depths. We introduce a new temporal layer that models variable temporal convolution kernel depths. We embed this new temporal layer in our proposed 3D CNN. We extend the DenseNet architecture, which is normally 2D, with 3D filters and pooling kernels. We name our proposed video convolutional network 'Temporal 3D ConvNet' (T3D) and its new temporal layer 'Temporal Transition Layer' (TTL)...

Deep-anomaly: Fully convolutional neural network for fast anomaly detection in crowded scenes

March 2018 - Computer Vision and Image Understanding
M. Sabokrou*, M. Fayyaz*, M. Fathy, Z. Moayed, R. Klette
The detection of abnormal behaviour in crowded scenes has to deal with many challenges. This paper presents an efficient method for the detection and localization of anomalies in videos. Using fully convolutional neural networks (FCNs) and temporal data, a pre-trained supervised FCN is transferred into an unsupervised FCN, ensuring the detection of (global) anomalies in scenes. High performance in terms of speed and accuracy is achieved by investigating cascaded detection as a way of reducing computational complexity...

Deep-cascade: Cascading 3D Deep Neural Networks for Fast Anomaly Detection and Localization in Crowded Scenes

Feb 2017 - IEEE Transactions on Image Processing
M. Sabokrou, M. Fayyaz, M. Fathy, R. Klette
This paper proposes a fast and reliable method for anomaly detection and localization in video data showing crowded scenes. Time-efficient anomaly localization is an ongoing challenge and the subject of this paper. We propose a cubic patch-based method, characterized by a cascade of classifiers, which makes use of an advanced feature learning approach. Our cascade of classifiers has two main stages...

STFCN - Spatio-Temporal Fully Convolutional Neural Network for Semantic Segmentation of Street Scenes

Sep 2016 - ACCV Workshop
M. Fayyaz, M. Sabokrou, M. Hajizadeh, M. Fathy, F. Huang, R. Klette
This paper presents a novel method to involve both spatial and temporal features for semantic video segmentation. Current work on convolutional neural networks (CNNs) has shown that CNNs provide advanced spatial features supporting a very good performance of solutions for both image and video analysis, especially for the semantic segmentation task.

A novel approach for Finger Vein verification based on self-taught learning

Nov 2015 - 9th Iranian Conference on Machine Vision and Image Processing (MVIP)
In this paper, we propose a method for user Finger Vein Authentication (FVA) as a biometric system. The use of discriminative features to classify these finger veins is one of the main points that distinguishes related works from one another; we therefore propose to learn a set of representative features based on auto-encoders.


  • PyTorch
  • Torch
  • TensorFlow
  • Caffe
  • Python
  • C
  • C#
  • C++
