Teksto autorystės modeliavimas ir identifikavimas

Daumantas Tinteris

Title	Teksto autorystės modeliavimas ir identifikavimas
Translation of Title	Text authorship modeling and identification.
Authors	Tinteris, Daumantas
Pages	52
Keywords [eng]	Authorship identification ; artificial intelligence methods ; n-grams ; analytical research review ; Lithuanian texts.
Abstract [eng]	This master's thesis addresses authorship identification of natural-language texts. The deep learning networks chosen for authorship identification of English language texts are the Multilayer Perceptron (MLP), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks and autoencoders. The study compares the selected methods with other machine learning methods: the support vector machine (SVM), the k-nearest neighbours algorithm (kNN) and the Bayesian probabilistic classifier (Bayes). The data used are Lithuanian language texts: 147 parliamentary speeches with a total number of more than 110,000. The n-gram model was chosen for the metrics. The highest accuracy obtained in the study was 74%. Conclusions and recommendations are presented based on the results. The thesis consists of an introduction, text authorship identification, analysis of artificial intelligence methods for text authorship identification, results of the experimental study, conclusions, recommendations and a reference list. The thesis comprises 46 pages of text excluding appendices, 12 figures, 5 tables and 37 bibliographic entries. Appendices are included separately.
Dissertation Institution	Vilniaus Gedimino technikos universitetas.
Type	Master thesis
Language	Lithuanian
Publication date	2022
