This paper describes an a posteriori method of decision tree learning that is used after the tree has been applied to a real domain, such as medical diagnosis. Without collecting a new set of diagnosis examples, the presented algorithm reconstructs a decision tree that preserves the diagnostic error rate, using only the original tree and the frequency of each diagnosis, counted as the number of times the corresponding terminal node is reached when the tree is applied to the real domain. The new tree has a shorter path length to reach a diagnosis and is logically equivalent to the original tree, because it is built from a generated set of pseudoexamples whose unobserved attribute values are uniformly distributed over their value ranges. To reduce the computational cost, a method that avoids generating the pseudoexample set is also presented. Context dependencies between attributes are handled by introducing attribute concatenation. Experiments show that the average path length is reduced by 6 to 10 percent after reconstructing a randomly generated decision tree with nonoptimized diagnosis frequencies.
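The two quantities the abstract relies on, the frequency-weighted average path length and the pseudoexample set, can be illustrated with a minimal sketch. The tree encoding, the attribute names, the counts, and the functions `leaves`, `average_path_length`, and `pseudoexamples` are all hypothetical choices made for illustration; they are not taken from the paper, and the sketch does not perform the reconstruction step itself.

```python
from itertools import product

# Hypothetical encoding (an assumption, not the paper's):
#   leaf: ("leaf", diagnosis, count of times the leaf was reached)
#   node: ("node", attribute, {value: subtree})
DOMAINS = {"fever": ["yes", "no"], "cough": ["yes", "no"]}

TREE = ("node", "cough", {
    "yes": ("node", "fever", {
        "yes": ("leaf", "flu", 60),
        "no": ("leaf", "cold", 30),
    }),
    "no": ("leaf", "healthy", 10),
})


def leaves(tree, path=None):
    """Yield (observed attribute values, diagnosis, count) for every leaf."""
    path = path or {}
    if tree[0] == "leaf":
        yield dict(path), tree[1], tree[2]
        return
    _, attribute, children = tree
    for value, subtree in children.items():
        yield from leaves(subtree, {**path, attribute: value})


def average_path_length(tree):
    """Mean number of attribute tests per diagnosis, weighted by leaf counts."""
    total = sum(count for _, _, count in leaves(tree))
    return sum(len(path) * count for path, _, count in leaves(tree)) / total


def pseudoexamples(tree, domains):
    """Expand each leaf into weighted examples over all attributes.

    Attributes not tested on the path to a leaf take every value in their
    domain, and the leaf count is split evenly across those combinations
    (the uniform-distribution assumption mentioned in the abstract).
    """
    examples = []
    for path, label, count in leaves(tree):
        free = [a for a in domains if a not in path]
        combos = list(product(*(domains[a] for a in free)))
        for combo in combos:
            example = {**path, **dict(zip(free, combo))}
            examples.append((example, label, count / len(combos)))
    return examples


if __name__ == "__main__":
    print(average_path_length(TREE))
    for example in pseudoexamples(TREE, DOMAINS):
        print(example)
```

In this toy tree the "healthy" leaf does not test `fever`, so its count of 10 is split into two pseudoexamples of weight 5 each. A learner run on such weighted examples would see the same diagnosis for every completion of an unobserved attribute, which is one way to read the logical-equivalence claim.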
ASJC Scopus subject areas
- Theoretical Computer Science
- Information Systems
- Hardware and Architecture
- Computational Theory and Mathematics