Analysis of the Effect of Activation Function, Learning Rate, and Momentum on the Mean Square Error (MSE) of Restricted Boltzmann Machine (RBM) Neural Networks
Authors
Susilawati Susilawati, Muhathir Muhathir
DOI: 10.31289/jite.v2i2.2162
Published: 2019-01-27
Issue: Vol. 2 No. 2 (2019): EDISI JANUARI
Keywords: Neural network, Mean Square Error (MSE), Restricted Boltzmann machines (RBM)
Abstract
Restricted Boltzmann machines (RBM) are an unsupervised neural network learning algorithm consisting of only two layers: a visible layer and a hidden layer. RBM performance is strongly influenced by its parameters, such as the activation function used to activate the neurons in the network, and the learning rate and momentum used to speed up the learning process. Choosing an appropriate activation function strongly affects the resulting Mean Square Error (MSE) of an RBM network. The activation function used in RBM networks is the sigmoid function, which has several variants, such as the binary sigmoid and the hyperbolic tangent sigmoid (tanh). Using the MNIST dataset for training and testing, the results show that the classification success rate with the binary sigmoid activation function is determined by a small MSE value. With the hyperbolic tangent activation function, in contrast, the MSE rises as the number of epochs increases. The binary sigmoid activation function with learning rate 0.05 and momentum 0.7 achieves a high handwriting recognition rate of 93.42%, followed by learning rate 0.01 with momentum 0.9 at 91.92%, learning rate 0.05 with momentum 0.5 at 91.31%, learning rate 0.01 with momentum 0.7 at 90.56%, and finally learning rate 0.01 with momentum 0.5 at 87.49%.
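The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the training algorithm, so one-step contrastive divergence (CD-1) is assumed here, biases are omitted for brevity, and the four-pixel toy patterns stand in for MNIST. The function names (`binary_sigmoid`, `tanh_sigmoid`, `mse`, `momentum_step`, `train_rbm`) are hypothetical; only the default learning rate 0.05 and momentum 0.7 come from the abstract.

```python
import math
import random


def binary_sigmoid(x):
    """Binary (logistic) sigmoid; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_sigmoid(x):
    """Hyperbolic tangent sigmoid; output lies in (-1, 1)."""
    return math.tanh(x)


def mse(target, output):
    """Mean square error between two equal-length vectors."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


def momentum_step(weight, velocity, gradient, learning_rate, momentum):
    """One momentum update: the velocity keeps a fraction of the previous step."""
    velocity = momentum * velocity + learning_rate * gradient
    return weight + velocity, velocity


def train_rbm(data, n_hidden, learning_rate=0.05, momentum=0.7, epochs=50, seed=0):
    """Assumed setup: CD-1 on a bias-free RBM with binary sigmoid units."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
    vel = [[0.0] * n_hidden for _ in range(n_visible)]
    history = []  # mean reconstruction MSE per epoch
    for _ in range(epochs):
        total = 0.0
        for v0 in data:
            # Positive phase: hidden probabilities given the data vector.
            h0 = [binary_sigmoid(sum(v0[i] * w[i][j] for i in range(n_visible)))
                  for j in range(n_hidden)]
            h0_sample = [1.0 if rng.random() < p else 0.0 for p in h0]
            # Negative phase: reconstruct the visible layer, then the hidden layer.
            v1 = [binary_sigmoid(sum(h0_sample[j] * w[i][j] for j in range(n_hidden)))
                  for i in range(n_visible)]
            h1 = [binary_sigmoid(sum(v1[i] * w[i][j] for i in range(n_visible)))
                  for j in range(n_hidden)]
            # CD-1 gradient estimate, applied through the momentum update.
            for i in range(n_visible):
                for j in range(n_hidden):
                    grad = v0[i] * h0[j] - v1[i] * h1[j]
                    w[i][j], vel[i][j] = momentum_step(
                        w[i][j], vel[i][j], grad, learning_rate, momentum)
            total += mse(v0, v1)
        history.append(total / len(data))
    return w, history


# Toy stand-in for MNIST: two binary patterns of four "pixels" each.
toy_data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
weights, history = train_rbm(toy_data, n_hidden=2)
```

Here `history` holds the mean reconstruction MSE for each epoch, which is the kind of curve the abstract describes as falling or rising with the number of epochs. `tanh_sigmoid` is defined only to show the different output range; the abstract does not say how tanh units were sampled, so the sketch does not train with it.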
License
This work is licensed under a Creative Commons Attribution 4.0 International License.