Research field: Databases
Conference: International Scientific and Practical Conference on Information Technologies and Intelligent Decision Making Systems
The article discusses the main tasks of machine learning. It considers the functional structure of a computer algorithm for solving machine learning problems, together with a data mining model. A solution to the simplest machine learning problem is proposed, using linear regression as the classification method. Existing learning algorithms based on decision trees are analyzed, and on the basis of that analysis a decision tree built with the C4.5 algorithm is selected for implementation. The article then builds a decision tree from specific training data. A simple and easily understood way to build a tree is to enumerate all possible trees, count the misclassified training examples for each, and select the tree with the fewest errors. This yields a learning algorithm for the decision tree that is optimal with respect to errors on the training data. The top-down algorithm implemented in the article instead selects, at each step, the attribute with the largest information gain. Entropy is used as the measure of the amount of information in the training data set D. During implementation, the algorithm is analyzed to identify opportunities for applying it to a more complex task.
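The attribute-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation: the function names (`entropy`, `information_gain`, `best_attribute`) and the toy data are hypothetical and chosen only for illustration. The sketch computes the Shannon entropy of the class labels in a data set D and picks the attribute whose split reduces that entropy the most. Note that C4.5 as usually described normalizes this gain by the split information (the gain ratio); the sketch shows only the plain information gain the abstract refers to.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting the data on `attr`."""
    total = len(labels)
    # Group the class labels by the value each row takes for `attr`.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Attribute with the largest information gain (one top-down step)."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy training set, not taken from the article.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

chosen = best_attribute(rows, labels, ["outlook", "windy"])
```

In this toy set, `outlook` separates the two classes completely while `windy` carries no information about the label, so a top-down builder would split on `outlook` first and then recurse on each resulting subset.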
| Publication year | 2024 |
|---|---|
| Citations | 0 |
| Country of publication | Russia |
| Site | Springer |