Novosibirsk State University Journal of Information Technologies
Scientific Journal

ISSN 2410-0420 (Online), ISSN 1818-7900 (Print)

Analysis of regulatory regions of genes by Expert Discovery relation system, integrated into UGENE toolkit
Yu. Yu. Vaskin, I. V. Khomicheva, E. V. Ignatyeva, E. E. Vityayev

Novosibirsk State University
Institute of Cytology and Genetics of the SB RAS
Institute of Mathematics of the SB RAS

UDC code: 004.4

Abstract
The automatic extraction of the hierarchical structure of eukaryotic gene regulatory regions lies at the intersection of biology, mathematics, and information technology. Solving it requires both an understanding of the sophisticated mechanisms of eukaryotic gene regulation and data mining techniques suited to analysis over many features. The paper describes the ExpertDiscovery relational data mining system for biological data, integrated into the UGENE toolkit. The system takes into account prior knowledge that the biologist has about the analyzed data, performs the analysis at each hierarchical level, and searches for a solution by moving from simple hypotheses to more complex ones. Integration into UGENE provides a convenient environment for conducting complex research and automating the biologist's work.

Key Words
annotation, gene regulatory regions, recognition, relational data mining, integrated system, hierarchical analysis, complex signal

How to cite:
Vaskin Yu. Yu., Khomicheva I. V., Ignatyeva E. V., Vityayev E. E. Analysis of regulatory regions of genes by Expert Discovery relation system, integrated into UGENE toolkit // Vestnik NSU Series: Information Technologies. - 2012. - Volume 10, Issue No 1. - P. 73-86. - ISSN 1818-7900. (in Russian).

Full Text in Russian

Available in PDF

Publication information
Main title: Vestnik NSU Series: Information Technologies, Volume 10, Issue No 1 (2012).
Parallel title: Novosibirsk State University Journal of Information Technologies Volume 10, Issue No 1 (2012).

Key title: Vestnik Novosibirskogo gosudarstvennogo universiteta. Seriâ: Informacionnye tehnologii
Abbreviated key title: Vestn. Novosib. Gos. Univ., Ser.: Inf. Tehnol.
Variant title: Vestnik NGU. Seriâ: Informacionnye tehnologii

Year of Publication: 2012
ISSN: 1818-7900 (Print), ISSN 2410-0420 (Online)
Publisher: Novosibirsk State University Press

inftech@vestnik.nsu.ru
© 2006-2017, Novosibirsk State University.