This paper proposes a partially-parallel LDPC decoder based on a high-efficiency message-passing algorithm. Our proposed partially-parallel LDPC decoder performs the column operations for bit nodes in conjunction with the row operations for check nodes. Bit functional unit with pipeline architecture in our LDPC decoder allows us to perform column operations for every bit node connected to each of check nodes which are updated by the row operations in parallel. Our proposed LDPC decoder improves the timing when the column operations are performed, accordingly it improves the message-passing efficiency within the limited number of iterations for decoding. We implemented the proposed partially-parallel LDPC decoder on an FPGA, and simulated its decoding performance. Practical simulation shows that our proposed LDPC decoder reduces the number of iterations for decoding, and it improves the bit error performance with a small hardware overhead.