In modern microprocessors, the access time of register file becomes a critical part in total delay. Instruction level or thread level parallelism improves Instructions Per. Cycle (IPC) by executing multiple instructions in one cycle. Such multiple instructions need to read or write data from/to register files simultaneously. To satisfy that, register file with sufficient ports should be designed. However, the area and access time of register file with large ports will increase sharply. Duplicated Register File (DupRF) architecture can reduce access time by distributing read ports. In this paper, we propose a new kind of DupRF architecture for embedded Simultaneous Multithreading (SMT) microprocessor and estimate the effect with respect to the area and access time. Especially, we measure the product of area and access time as computation cost. For a SMT microprocessor with 6 threads, 64-bit data-width and 6 function units, a 3-duplicate register file architecture can reduce access time by 12.61% with a slight increase of computation cost by 3.35% compared with the central register file architecture.