Novosibirsk State University Journal of Information Technologies
Scientific Journal

ISSN 2410-0420 (Online), ISSN 1818-7900 (Print)

Principles of identification of objects in structured documents
Anna Anatolyevna Knyazeva

Tomsk Branch of the Institute of Computational Technologies SB RAS
UDC code: 004.622’417

Abstract
The paper addresses the problem of identifying real-world objects that are mentioned in structured documents. The proposed approach takes several identification features into account and assigns each feature a weight according to its significance. The paper then considers an application of the model to identifying the persons who act as authors of publications, using data from an electronic library catalog.
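
As a rough illustration of what weighting features by significance can look like, the sketch below compares two author mentions field by field and sums the weights of the fields that agree. It is not the model from the paper: the field names, the weights, the exact-match comparison, and the decision threshold are all assumptions made for this example, and the sample records are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorMention:
    """One mention of an author in a catalog record (hypothetical fields)."""
    surname: str
    initials: str
    affiliation: str


# Hypothetical weights: a larger weight marks a feature assumed to be more significant.
WEIGHTS = {"surname": 5, "initials": 3, "affiliation": 2}


def _agree(a: str, b: str) -> float:
    """Return 1.0 if two non-empty values match after trimming and lowercasing, else 0.0."""
    a, b = a.strip().lower(), b.strip().lower()
    return 1.0 if a and a == b else 0.0


def match_score(x: AuthorMention, y: AuthorMention, weights=WEIGHTS) -> float:
    """Weighted share of agreeing features, in the range [0, 1]."""
    total = sum(weights.values())
    agreed = sum(w * _agree(getattr(x, f), getattr(y, f)) for f, w in weights.items())
    return agreed / total


def same_person(x: AuthorMention, y: AuthorMention, threshold: float = 0.75) -> bool:
    """Treat two mentions as one person when the score reaches an assumed threshold."""
    return match_score(x, y) >= threshold


# Invented example data, for illustration only.
first = AuthorMention("Ivanova", "I. I.", "Institute A")
second = AuthorMention("ivanova", "I. I.", "University B")
print(match_score(first, second), same_person(first, second))
```

Here a disagreement on the low-weight affiliation field does not prevent a match, while a disagreement on the surname would. A practical system would likely replace the exact-match test with approximate string comparison and estimate the weights from data.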

Key Words
record linkage, structured documents, databases, identification of objects

How to cite:
Knyazeva A. A. Principles of identification of objects in structured documents // Vestnik NSU Series: Information Technologies. - 2013. - Volume 11, Issue No 1. - P. 58-67. - ISSN 1818-7900. (in Russian).

Full Text in Russian

Available in PDF

Publication information
Main title: Vestnik NSU Series: Information Technologies, Volume 11, Issue No 1 (2013).
Parallel title: Novosibirsk State University Journal of Information Technologies Volume 11, Issue No 1 (2013).

Key title: Vestnik Novosibirskogo gosudarstvennogo universiteta. Seriâ: Informacionnye tehnologii
Abbreviated key title: Vestn. Novosib. Gos. Univ., Ser.: Inf. Tehnol.
Variant title: Vestnik NGU. Seriâ: Informacionnye tehnologii

Year of Publication: 2013
ISSN: 1818-7900 (Print), ISSN 2410-0420 (Online)
Publisher: Novosibirsk State University Press
