We propose an accurate and robust coarse-to-fine text detection scheme that incorporates user intention and captures the intrinsic characteristics of natural scene text. In the coarse detection stage, a double edge detector is designed to estimate stroke symmetry and stroke width, which help segment the foreground. The initial user-intention region is then extended, based on the estimated foreground, to generate a coarse bounding box. In the refinement stage, candidate connected components (CCs) obtained from Niblack decomposition are grouped by location to form text lines after noise removal and layer selection. Experimental results demonstrate the effectiveness of the proposed method, which yields higher performance than state-of-the-art methods.
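As a rough illustration of the thresholding step that the refinement stage builds on, the sketch below applies plain Niblack local thresholding, T = m + k·s, where m and s are the mean and standard deviation of the intensities in a window around each pixel. It is a minimal sketch and not the method described here: the function name `niblack_threshold`, the window size, the value of k, and the assumption of dark text on a lighter background are all illustrative choices, and the layer decomposition, noise removal, and grouping steps are not reproduced.

```python
import math


def niblack_threshold(image, window=3, k=-0.2):
    """Binarize a grayscale image (a list of equal-length rows of numbers).

    Hypothetical helper for illustration. A pixel is marked 1 (candidate
    foreground) when it is strictly darker than the local Niblack threshold
    T = mean + k * std, computed over a window clipped at the image borders.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    half = window // 2
    mask = [[0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            # Collect the intensities in the (border-clipped) local window.
            values = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            threshold = mean + k * math.sqrt(variance)
            # Strict comparison keeps perfectly uniform regions as background.
            if image[y][x] < threshold:
                mask[y][x] = 1
    return mask


# Toy input (assumed data): a light background with one dark pixel in the centre.
toy = [[200] * 5 for _ in range(5)]
toy[2][2] = 50
toy_mask = niblack_threshold(toy)
```

In a full pipeline, the connected components of such a mask would serve as candidates for later filtering and grouping; a negative k is the usual choice when text is darker than its surroundings.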