Abstract [eng] |
This master's thesis deals with identifying the authorship of language texts. The deep learning networks chosen for identifying the authorship of English language texts are the Multilayer Perceptron (MLP), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks and autoencoders. The study compares the selected methods with other machine learning methods: the support vector machine (SVM), the k-nearest neighbours algorithm (KNN) and the Bayesian probabilistic classifier (Bayes). The data used are Lithuanian language texts: 147 parliamentary speeches with a total number of more than 110,000. The n-gram model was chosen for the metrics. The highest accuracy obtained in the study was 74%. Conclusions and recommendations are presented on the basis of the results. The thesis consists of an introduction, text authorship identification, an analysis of artificial intelligence methods for text authorship identification, the results of the experimental study, conclusions, recommendations and a reference list. The thesis comprises 46 pages of text without appendixes, 12 figures, 5 tables and 37 bibliographical entries. Appendixes are included separately. |