Uniform resource locators (URLs), used to reference web pages, play a vital role in cyber fraud because of their complicated structure: phishers, in other words attackers, exploit it with deceptive evasion techniques to mislead users. Information extracted from URLs can therefore reveal significant and meaningful patterns essential for phishing detection. Accurate URL-based phishing detection requires a word segmentation technique that splits URLs correctly. However, in contrast to traditional word segmentation techniques used in natural language processing (NLP), URL segmentation demands meticulous attention, since tokenization, the process of splitting raw text into meaningful units, is not as straightforward to apply to URLs as to natural language. In this work, we concentrate on URL segmentation and propose a novel tokenization method, named URL-Tokenizer, that combines the BERT tokenizer with the WordSegment tokenizer while adopting character-level and word-level segmentation simultaneously. Our experimental evaluations on phishing URL detection show that the proposed method achieves a high accuracy of 95.7% on a balanced dataset and 97.7% on an imbalanced dataset, whereas the baseline models achieve 85.4% and 85.1%, respectively.
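To make the idea of word-level plus character-level URL segmentation concrete, the following is a minimal, self-contained sketch. It is not the authors' URL-Tokenizer: the tiny `VOCAB` set and the greedy longest-match segmenter are hypothetical stand-ins for WordSegment's corpus-based unigram segmentation, and the character view stands in for the subword/character handling a BERT tokenizer would provide.

```python
import re

# Tiny illustrative vocabulary (hypothetical stand-in for WordSegment's
# large unigram-frequency corpus).
VOCAB = {"secure", "login", "pay", "pal", "paypal", "account", "update", "com", "www"}

def split_delimiters(url: str) -> list[str]:
    """Split a URL on its structural delimiters (:, /, ., -, _, ?, =, &)."""
    return [t for t in re.split(r"[\/.\-_?=&:]+", url.lower()) if t]

def segment_word(token: str) -> list[str]:
    """Greedy longest-match dictionary segmentation of a glued token.

    A simplified stand-in for WordSegment; it falls back to emitting the
    remaining token unchanged when no dictionary word matches."""
    out, i = [], 0
    while i < len(token):
        for j in range(len(token), i, -1):
            if token[i:j] in VOCAB:
                out.append(token[i:j])
                i = j
                break
        else:  # no dictionary match at position i: keep the rest as-is
            out.append(token[i:])
            break
    return out

def url_tokenize(url: str) -> dict:
    """Produce simultaneous word-level and character-level views of a URL."""
    words = [w for t in split_delimiters(url) for w in segment_word(t)]
    chars = list(url.lower())
    return {"word": words, "char": chars}

tokens = url_tokenize("http://secure-paypallogin.com/account?update=1")
print(tokens["word"])
# The glued host token "paypallogin" is split into "paypal" + "login",
# exposing the brand-impersonation pattern a phishing detector looks for.
```

The point of the sketch is the last comment: delimiter splitting alone leaves "paypallogin" opaque, while dictionary segmentation recovers the suspicious brand word, which is exactly the kind of pattern URL-level phishing detectors feed to a classifier.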