TY - JOUR
T1 - Data on the solution and processing time reached when constructing a phylogenetic tree using a quantum-inspired computer
AU - Onodera, Wataru
AU - Hara, Nobuyuki
AU - Aoki, Shiho
AU - Asahi, Toru
AU - Sawamura, Naoya
N1 - Funding Information:
This work was supported by Fujitsu Laboratories, Ltd. using a Quantum-Inspired Computing Digital Annealer.
Publisher Copyright:
© 2023 The Author(s)
PY - 2023/4
Y1 - 2023/4
N2 - Phylogenetic trees provide insight into the evolutionary trajectories of species and molecules. However, because (2n-5)!! phylogenetic trees can be constructed from a dataset containing n sequences, determining the optimal tree by brute force is impractical owing to combinatorial explosion. Therefore, we developed a method for constructing a phylogenetic tree using a Fujitsu Digital Annealer, a quantum-inspired computer that solves combinatorial optimization problems at high speed. Specifically, phylogenetic trees are generated by repeatedly partitioning a set of sequences into two parts (i.e., the graph-cut problem). Here, the optimality of the solution (normalized cut value) obtained by the proposed method was compared with that of existing methods using simulated and real data. The simulation dataset contained 32–3200 sequences, and the average branch length, set according to a normal distribution or the Yule model, ranged from 0.125 to 0.750, covering a wide range of sequence diversity. In addition, the statistical information of the dataset is described in terms of two indices: transitivity and average p-distance. As phylogenetic tree construction methods are expected to continue to improve, we believe that this dataset can be used as a reference for comparing results and confirming their validity. Further interpretation of these analyses is given in W. Onodera, N. Hara, S. Aoki, T. Asahi, N. Sawamura, Phylogenetic tree reconstruction via graph cut presented using a quantum-inspired computer, Mol. Phylogenet. Evol. 178 (2023) 107636.
AB - Phylogenetic trees provide insight into the evolutionary trajectories of species and molecules. However, because (2n-5)!! phylogenetic trees can be constructed from a dataset containing n sequences, determining the optimal tree by brute force is impractical owing to combinatorial explosion. Therefore, we developed a method for constructing a phylogenetic tree using a Fujitsu Digital Annealer, a quantum-inspired computer that solves combinatorial optimization problems at high speed. Specifically, phylogenetic trees are generated by repeatedly partitioning a set of sequences into two parts (i.e., the graph-cut problem). Here, the optimality of the solution (normalized cut value) obtained by the proposed method was compared with that of existing methods using simulated and real data. The simulation dataset contained 32–3200 sequences, and the average branch length, set according to a normal distribution or the Yule model, ranged from 0.125 to 0.750, covering a wide range of sequence diversity. In addition, the statistical information of the dataset is described in terms of two indices: transitivity and average p-distance. As phylogenetic tree construction methods are expected to continue to improve, we believe that this dataset can be used as a reference for comparing results and confirming their validity. Further interpretation of these analyses is given in W. Onodera, N. Hara, S. Aoki, T. Asahi, N. Sawamura, Phylogenetic tree reconstruction via graph cut presented using a quantum-inspired computer, Mol. Phylogenet. Evol. 178 (2023) 107636.
KW - Distance-matrix method
KW - Graph cut
KW - Phylogenetic reconstruction
KW - Quantum-inspired computing
UR - http://www.scopus.com/inward/record.url?scp=85148537233&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148537233&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2023.108970
DO - 10.1016/j.dib.2023.108970
M3 - Article
AN - SCOPUS:85148537233
SN - 2352-3409
VL - 47
JO - Data in Brief
JF - Data in Brief
M1 - 108970
ER -