Machine Learning and Data Mining


Research field: Databases



Conference: International Scientific and Practical Conference on Information Technologies and Intelligent Decision Making Systems


Abstract

The article discusses the main tasks of machine learning. It considers the functional structure of a computer algorithm for solving machine learning problems, along with a data mining model. A solution to the simplest machine learning problem is proposed, using linear regression as the classification method. Existing decision-tree learning algorithms are analyzed, and on the basis of that analysis a decision tree built with the C4.5 algorithm is selected for implementation. The article constructs a decision tree from specific training data. A simple and easily understood way to build trees is to generate all possible trees, count the misclassified examples for each, and select the tree with the fewest errors; this yields a learning algorithm for the decision tree that is optimal with respect to errors on the training data. The top-down algorithm implemented in the article selects, at each step, the attribute with the largest information gain. Entropy is used as the measure of the amount of information in the training data set D. During implementation, the algorithm is analyzed to identify opportunities for applying it to a more complex task.
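The attribute-selection step described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard Shannon entropy (in bits) and plain information gain as the "increase in information"; the full C4.5 algorithm is usually described as normalizing this quantity into a gain ratio, which is not shown here. The function names and the toy data set are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy of the data set minus the weighted entropy after splitting on `attribute`."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        # Group the class labels by the value this row takes for the attribute.
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the largest information gain (one top-down step)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy training set D: the class depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

chosen = best_attribute(rows, labels, ["outlook", "windy"])
```

In this toy set, splitting on "outlook" separates the two classes completely, while splitting on "windy" leaves each partition evenly mixed, so the greedy step would choose "outlook" as the root attribute.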


Authors

Dmitry A. Kurasov
University of Tyumen, 6 Volodarskogo Street, Tyumen 625003, Russia

Anton S. Kutuzov
Chelyabinsk State University, 129 Kashirin Brothers Street, Chelyabinsk 454001, Russia

Dmitry S. Zvonarev
University of Tyumen, 6 Volodarskogo Street, Tyumen 625003, Russia

Paper information

Publication year: 2024
Citations: 0
Country of publication: Russia
Site: Springer
