Novosibirsk State University Journal of Information Technologies
Scientific Journal

ISSN 2410-0420 (Online), ISSN 1818-7900 (Print)

All Issues >> Contents: Volume 08, Issue No 1 (2010)

Program system ExpertDiscovery for DNA regulatory regions analysis
Irina Vadimovna Khomicheva, Evgeny Evgenyevich Vityayev, Elena Vasilyevna Ignatyeva, Elena Anatolyevna Ananko, Timur Igorevich Shipilov

Institute of Cytology and Genetics SB RAS
Institute of Mathematics SB RAS

UDC code: 681.3:004.8

Abstract
The emergence of advanced experimental technologies in fields of modern biology such as genomics, transcriptomics, proteomics, cell biology, nanobioengineering, etc. has led to exponential growth in experimental data that need to be analyzed and mined. New methods of intelligent data analysis must integrate primary raw experimental data that are poorly consistent and poorly structured, contain gaps, and, taken separately, cannot fully reconstruct a biological system or process. We developed ExpertDiscovery, an integrated data mining method that discovers complex regularities in the organization of eukaryotic DNA regulatory regions. As the elementary signals from which complex signals are built, the system takes various DNA characteristics obtained, for instance, by other data mining tools. Using the regularities discovered at the different levels of research, the system makes it possible to construct a hierarchical model of the regulatory regions of a specific group of genes.

Key Words
accuracy comparison, recognition, regulatory regions of genes, hierarchical analysis, integrated system, relational data mining, complex signal

How to cite:
Khomicheva I. V., Vityayev E. E., Ignatyeva E. V., Ananko E. A., Shipilov T. I. Program system ExpertDiscovery for DNA regulatory regions analysis // Vestnik NSU Series: Information Technologies. - 2010. - Volume 08, Issue No 1. - P. 12-26. - ISSN 1818-7900. (in Russian).

Full Text in Russian

Available in PDF

Publication information
Main title: Vestnik NSU Series: Information Technologies, Volume 08, Issue No 1 (2010).
Parallel title: Novosibirsk State University Journal of Information Technologies Volume 08, Issue No 1 (2010).

Key title: Vestnik Novosibirskogo gosudarstvennogo universiteta. Seriâ: Informacionnye tehnologii
Abbreviated key title: Vestn. Novosib. Gos. Univ., Ser.: Inf. Tehnol.
Variant title: Vestnik NGU. Seriâ: Informacionnye tehnologii

Year of Publication: 2010
ISSN: 1818-7900 (Print), ISSN 2410-0420 (Online)
Publisher: Novosibirsk State University Press

inftech@vestnik.nsu.ru
© 2006-2017, Novosibirsk State University.