Research Article | Peer-Reviewed

Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique

Received: 24 September 2023    Accepted: 12 October 2023    Published: 30 October 2023
Abstract

Biclustering is a data mining technique used to analyze gene expression data. It identifies subgroups of genes that behave similarly under subgroups of conditions and may behave independently under the remaining conditions. These groups of co-expressed genes (called biclusters) can serve specific biological aims, such as finding the characteristics of a specific disease. A large number of biclustering algorithms have been developed, and they generally output a large number of overlapping biclusters. Visualizing these biclusters remains a non-trivial task. In this paper, we present a new approach that displays biclustering results from gene expression data on a single screen. It is based on a two-dimensional matrix in which each bicluster is represented as a column and each overlap between a set of biclusters is represented as a row. We illustrate the usefulness of our method with biclustering results from real and synthetic datasets, and we compare it with other techniques that focus on the problem of bicluster overlaps. The method is implemented in a web-based interactive visualization tool called VisBicluster, available at http://vis.usal.es/~visusal/visbicluster.
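
The column-per-bicluster, row-per-overlap layout described in the abstract can be illustrated with a small sketch. This is not the VisBicluster implementation. It assumes that an overlap is a group of two or more biclusters sharing at least one gene, and that each gene is assigned to the exact set of biclusters containing it. The function name `overlap_matrix`, the row fields, and the toy bicluster and gene names are all hypothetical.

```python
from collections import defaultdict


def overlap_matrix(biclusters):
    """Build a binary rows-by-columns matrix of bicluster overlaps.

    `biclusters` maps a bicluster name to its set of gene identifiers.
    Each column is one bicluster; each row is one distinct group of two
    or more biclusters that share at least one gene (an assumed
    definition of "overlap", chosen for illustration).
    """
    columns = sorted(biclusters)

    # Record, for every gene, the exact set of biclusters that contain it.
    membership = defaultdict(set)
    for name, genes in biclusters.items():
        for gene in genes:
            membership[gene].add(name)

    # Group genes by membership signature; keep only genuine overlaps.
    shared = defaultdict(set)
    for gene, names in membership.items():
        if len(names) >= 2:
            shared[frozenset(names)].add(gene)

    # Order rows by overlap degree, then by name, for a stable layout.
    signatures = sorted(shared, key=lambda s: (len(s), sorted(s)))
    rows = [
        {
            "biclusters": sorted(sig),
            "genes": sorted(shared[sig]),
            "cells": [1 if col in sig else 0 for col in columns],
        }
        for sig in signatures
    ]
    return columns, rows


if __name__ == "__main__":
    # Toy input with made-up bicluster and gene names.
    toy = {
        "B1": {"g1", "g2", "g3"},
        "B2": {"g2", "g3", "g4"},
        "B3": {"g3", "g5"},
    }
    columns, rows = overlap_matrix(toy)
    print(columns)
    for row in rows:
        print(row["cells"], row["genes"])
```

In this sketch a filled cell marks a bicluster that takes part in the overlap of that row, so reading across a row shows which biclusters intersect and reading down a column shows every overlap a bicluster is involved in. Conditions could be handled the same way by passing condition sets instead of gene sets.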

Published in Computational Biology and Bioinformatics (Volume 11, Issue 2)
DOI 10.11648/j.cbb.20231102.11
Page(s) 19-32
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Biclustering Visualization, Two-Dimensional Matrix, Filtering, Overlaps, InfoVis

Cite This Article
  • APA Style

    Haithem Aouabed, Mourad Elloumi. (2023). Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique. Computational Biology and Bioinformatics, 11(2), 19-32. https://doi.org/10.11648/j.cbb.20231102.11


    ACS Style

    Haithem Aouabed; Mourad Elloumi. Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique. Comput. Biol. Bioinform. 2023, 11(2), 19-32. doi: 10.11648/j.cbb.20231102.11


    AMA Style

    Haithem Aouabed, Mourad Elloumi. Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique. Comput Biol Bioinform. 2023;11(2):19-32. doi: 10.11648/j.cbb.20231102.11


  • BibTeX

    @article{10.11648/j.cbb.20231102.11,
      author = {Haithem Aouabed and Mourad Elloumi},
      title = {Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique},
      journal = {Computational Biology and Bioinformatics},
      volume = {11},
      number = {2},
      pages = {19-32},
      doi = {10.11648/j.cbb.20231102.11},
      url = {https://doi.org/10.11648/j.cbb.20231102.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.cbb.20231102.11},
     year = {2023}
    }
    


  • RIS

    TY  - JOUR
    T1  - Visualizing Biclusters of Gene Expression Data and Their Overlaps Based on a Two-Dimensional Matrix Technique
    AU  - Haithem Aouabed
    AU  - Mourad Elloumi
    Y1  - 2023/10/30
    PY  - 2023
    N1  - https://doi.org/10.11648/j.cbb.20231102.11
    DO  - 10.11648/j.cbb.20231102.11
    T2  - Computational Biology and Bioinformatics
    JF  - Computational Biology and Bioinformatics
    JO  - Computational Biology and Bioinformatics
    SP  - 19
    EP  - 32
    PB  - Science Publishing Group
    SN  - 2330-8281
    UR  - https://doi.org/10.11648/j.cbb.20231102.11
    VL  - 11
    IS  - 2
    ER  - 


Author Information
  • Computer Science Department, Faculty of Economic Sciences and Management, University of Sfax, Sfax, Tunisia

  • Computer Science Department, Faculty of Computing and Information Technology, University of Bisha, Bisha, Saudi Arabia
