Publications
Scientific publications
Alexeev S.S., Morozov V.V., Simakov K.V.
Machine learning in information extraction having etalon database (original Russian title: Методы машинного обучения в задачах извлечения информации из текстов по эталону)
// Digital Libraries: Advanced Methods and Technologies, Digital Collections: Proceedings of the XI All-Russian Research Conference RCDL'2009. Petrozavodsk: KRC RAS, 2009. Pp. 237-246
We describe a special case of the information extraction task in which a complete database of the objects to be extracted already exists. The database contains only canonical representations of the objects, so the task is to recognize them from their non-canonical descriptions in texts. To disambiguate the recognition results, we study, test, and compare a range of machine learning methods, and we report the results of that comparison.
Machine learning in information extraction having etalon database (397 KB)
Last modified: October 16, 2009