Approximate multiplier design is an effective technique for improving hardware performance at the cost of some accuracy. Most existing approximate multipliers are ASIC-based and dedicated to one particular application. In contrast, FPGAs have become an attractive choice for many applications because of their high performance, reconfigurability, and fast development cycle. This paper presents a novel methodology for designing approximate multipliers on FPGA-based fabrics. Area and latency are significantly reduced by cutting the carry propagation path in the multiplier. Moreover, we explore the architectural design space of higher-order multipliers by using the proposed small-size approximate multipliers as elementary modules. To meet different accuracy requirements, eight configurations of the approximate 8 × 8 multiplier are discussed. In terms of mean relative error distance (MRED), the accuracy loss of the proposed 8 × 8 multiplier is as low as 0.17%. Compared with the exact multiplier, the proposed design can reduce area by 43.66% and power by 20.36%, and the critical path latency by up to 27.66%. The proposed multiplier design achieves a better accuracy-hardware tradeoff than other designs with comparable accuracy.
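To make the two ideas above concrete (building an 8 × 8 multiplier from 4 × 4 modules, and scoring it by MRED), the following is a minimal behavioral sketch in Python. It is not the design proposed in this paper: `toy_cut_carry_mul4` is a hypothetical stand-in that simply drops the carry leaving the two lowest partial-product columns, and the choice to approximate only the low-by-low sub-product is an illustrative assumption rather than one of the eight configurations. MRED is taken here as the mean of |approximate − exact| / exact over all input pairs with a nonzero exact product, which is one common convention.

```python
from typing import Callable

Mul = Callable[[int, int], int]


def exact_mul4(a: int, b: int) -> int:
    """Exact 4 x 4 unsigned multiplier (reference module)."""
    return a * b


def toy_cut_carry_mul4(a: int, b: int, k: int = 2) -> int:
    """Hypothetical 4 x 4 module, NOT the paper's design.

    The partial-product rows are summed separately below and above
    column k, and the carry out of column k-1 is discarded, which
    models a carry chain that has been cut at bit k.
    """
    mask = (1 << k) - 1
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(4)]
    low = sum(r & mask for r in rows) & mask   # carry out of low part dropped
    high = sum(r >> k for r in rows) << k      # upper columns added exactly
    return high + low


def compose_mul8(a: int, b: int,
                 ll: Mul = exact_mul4, lh: Mul = exact_mul4,
                 hl: Mul = exact_mul4, hh: Mul = exact_mul4) -> int:
    """8 x 8 multiplier assembled from four 4 x 4 modules."""
    a_hi, a_lo = a >> 4, a & 0xF
    b_hi, b_lo = b >> 4, b & 0xF
    return ((hh(a_hi, b_hi) << 8)
            + ((hl(a_hi, b_lo) + lh(a_lo, b_hi)) << 4)
            + ll(a_lo, b_lo))


def mred(mul8: Mul) -> float:
    """Mean relative error distance over all nonzero 8-bit operand pairs."""
    total, count = 0.0, 0
    for a in range(1, 256):
        for b in range(1, 256):
            exact = a * b
            total += abs(mul8(a, b) - exact) / exact
            count += 1
    return total / count


if __name__ == "__main__":
    # Illustrative configuration: only the low x low sub-product is approximate.
    def config(a: int, b: int) -> int:
        return compose_mul8(a, b, ll=toy_cut_carry_mul4)

    print(f"MRED of the toy configuration: {mred(config):.4%}")
```

Swapping different modules into the `ll`, `lh`, `hl`, and `hh` slots gives a simple way to compare configurations by MRED before committing to a hardware implementation; area, power, and latency still have to be measured on the target FPGA.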