A Study of The Convolutional Neural Networks Applications

  • Ahmed S. Shamsaldin Department of Computer Science and Engineering, School of Science and Engineering, University of Kurdistan Hewler, Erbil, Kurdistan Region - F.R. Iraq http://orcid.org/0000-0002-4148-0333
  • Polla Fattah Software and Informatics Engineering Department, College of Engineering, Salahaddin University-Erbil, Erbil, Kurdistan Region - F.R. Iraq http://orcid.org/0000-0001-8027-3540
  • Tarik A. Rashid Department of Computer Science and Engineering, School of Science and Engineering, University of Kurdistan Hewler, Erbil, Kurdistan Region - F.R. Iraq http://orcid.org/0000-0002-8661-258X
  • Nawzad K. Al-Salihi Department of Computer Science and Engineering, School of Science and Engineering, University of Kurdistan Hewler, Erbil, Kurdistan Region - F.R. Iraq http://orcid.org/0000-0002-4180-799X
Keywords: Convolutional Neural Networks, Natural Language, Computer Vision, Deep Learning

Abstract

At present, deep learning is widely used across a broad range of fields. The convolutional neural network (CNN) has become one of the most prominent deep learning models because it often yields highly accurate results on real-world problems. This work briefly describes the applications of CNNs in two areas: first, computer vision, namely scene labeling, face recognition, action recognition, and image classification; second, natural language processing, namely speech recognition and text classification.

 


Author Biographies

Ahmed S. Shamsaldin, Department of Computer Science and Engineering, School of Science and Engineering, University of Kurdistan Hewler, Erbil, Kurdistan Region - F.R. Iraq

Mr. Ahmed Saadaldin Shamsaldin is currently a teaching assistant and a researcher at the University of Kurdistan Hewler. He graduated from the University of Kurdistan Hewler in 2019 with an M.Sc. in Computer Systems Engineering. He also obtained a B.Sc. in Computer Engineering from the University of Kurdistan Hewler in 2016. His areas of interest are embedded systems, industrial computer applications, IoT, C programming, and microcontrollers.

Polla Fattah, Software and Informatics Engineering Department, College of Engineering, Salahaddin University-Erbil, Erbil, Kurdistan Region - F.R. Iraq

Dr. Polla Fattah is currently a Senior Lecturer at the Software and Informatics Engineering Department, College of Engineering, Salahaddin University-Erbil. He is an experienced lecturer with a demonstrated history of working in higher education. He is skilled in data analysis, data mining, machine learning, artificial intelligence, and general software development, including web applications, using languages such as JavaScript, Python, R, PHP, and C++.

Tarik A. Rashid, Department of Computer Science and Engineering, School of Science and Engineering, University of Kurdistan Hewler, Erbil, Kurdistan Region - F.R. Iraq

Dr. Tarik Ahmed Rashid received his Ph.D. in Computer Science and Informatics from the College of Engineering, Mathematical and Physical Sciences, University College Dublin (UCD) (2001-2006). He held a postdoctoral fellowship at the Computer Science and Informatics School, College of Engineering, Mathematical and Physical Sciences, University College Dublin (UCD) from 2006 to 2007. He joined the University of Kurdistan Hewlêr (UKH) in 2017.

Nawzad K. Al-Salihi, Department of Computer Science and Engineering, School of Science and Engineering, University of Kurdistan Hewler, Erbil, Kurdistan Region - F.R. Iraq

Dr. Nawzad Kameran Al-Salihi (Saleyi) is an Assistant Professor and Chair of the Department of Computer Science and Engineering (CSE) at the University of Kurdistan Hewler (UKH). Dr. Al-Salihi has an extensive academic background, which includes a Ph.D. in Electronic and Computer Engineering from Brunel University, London, UK (2010), with the thesis title "Precise Positioning in Real-Time using GPS-RTK Signal for Visually Impaired People Navigation System"; an M.Sc. in Advanced Manufacturing Systems from Brunel University, London, UK; and a B.Sc. in Automation Engineering from the University of Skovde, Sweden. Dr. Al-Salihi has wide professional experience in satellite navigation systems (GPS, GLONASS, and GALILEO), wireless mobile communications, and advanced manufacturing systems. He has published several academic journal articles and presented at several international conferences.

Published
2019-12-27
How to Cite
Shamsaldin, A., Fattah, P., Rashid, T., & Al-Salihi, N. (2019, December 27). A Study of The Convolutional Neural Networks Applications. UKH Journal of Science and Engineering, 3(2), 31-40. https://doi.org/10.25079/ukhjse.v3n2y2019.pp31-40
Section
Research Articles