We present the highly parallel FPGA implementation of 2D clustering used in the input system of the Fast TracKer (FTK) processor for the ATLAS experiment at the Large Hadron Collider (LHC) at CERN. After the 2013-2014 shutdown, the LHC is planned to operate at increased luminosity, which will make efficient online selection of rare events more difficult because of the larger number of overlapping collisions. FTK is a highly parallelized hardware system that improves the online selection by performing real-time track finding using information from the silicon inner detector. The FTK system requires fast and robust clustering, on FPGA devices, of the hits retrieved from the silicon detector. We describe the development of the original input boards and the implemented clustering algorithm. For the complicated 2D clustering, a moving-window technique is used to minimize the use of FPGA resources. The combination of custom-developed boards and the implementation of the clustering algorithm provides sufficient processing power to meet the specifications for the ATLAS silicon inner detector up to the maximum LHC luminosity planned until 2022. The developed algorithm can be easily adapted to other image-processing applications that require real-time 2D clustering.
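To make the notion of 2D clustering concrete, the following is a minimal software sketch that groups pixel hits into clusters of adjacent pixels and computes a centroid for each. It is not the FPGA moving-window design described here: the flood-fill approach, the 8-connectivity rule, the unweighted centroid, and the names `cluster_hits` and `centroid` are all illustrative assumptions.

```python
from collections import deque


def cluster_hits(hits):
    """Group (col, row) pixel hits into clusters of adjacent pixels.

    Assumption: two hits belong to the same cluster when they touch
    along an edge or a corner (8-connectivity).
    """
    remaining = set(hits)
    clusters = []
    while remaining:
        # Start a new cluster from the smallest unassigned hit so the
        # output order is deterministic.
        seed = min(remaining)
        remaining.remove(seed)
        queue = deque([seed])
        cluster = [seed]
        while queue:
            col, row = queue.popleft()
            # Visit the 3x3 neighbourhood around the current hit.
            for dc in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    neighbour = (col + dc, row + dr)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        queue.append(neighbour)
                        cluster.append(neighbour)
        clusters.append(sorted(cluster))
    return clusters


def centroid(cluster):
    """Return the unweighted mean (col, row) position of a cluster."""
    n = len(cluster)
    return (sum(c for c, _ in cluster) / n, sum(r for _, r in cluster) / n)


if __name__ == "__main__":
    hits = [(0, 0), (1, 1), (5, 5), (5, 6), (10, 0)]
    for cluster in cluster_hits(hits):
        print(cluster, centroid(cluster))
```

This software version keeps every unassigned hit in memory and may visit them in any order. A hardware implementation cannot afford that, which is why a technique that restricts the search to a small window is attractive when FPGA resources are limited.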