Logic relationship analysis is a data mining method that comprehensively detects item triplets that satisfy logic relationships from a binary matrix dataset, such as an ortholog table in comparative genomics. Thanks to recent technological advancements, many binary matrix datasets are now being produced in genomics, transcriptomics, epigenomics, metagenomics, and many other fields for comparative purposes. However, regardless of presumed interpretability and importance of logic relationships, existing data mining methods are not based on the framework of statistical hypothesis testing. That means, the type-1 and type-2 error rates are neither controlled nor estimated. Here, we developed Logicome Profiler, which exhaustively detects statistically significant triplet logic relationships from a binary matrix dataset (Logicome means ome of logics). To test all item triplets in a dataset while avoiding false positives, Logicome Profiler adjusts a significance level by the Bonferroni or Benjamini-Yekutieli method for the multiple testing correction. Its application to an ocean metagenomic dataset showed that Logicome Profiler can effectively detect statistically significant triplet logic relationships among environmental microbes and genes, which include those among urea transporter, urease, and photosynthesis-related genes. Beyond omics data analysis, Logicome Profiler is applicable to various binary matrix datasets in general for finding significant triplet logic relationships. The source code is available at https://github.com/fukunagatsu/LogicomeProfiler.
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology(all)
- Agricultural and Biological Sciences(all)