Data Collection and Management Workshop

Date & Time
December 17-19, 2018
Location
Palacký University, Křižkovského 10, block C, auditorium 3.05 & Univerzitní 3, room 225
About the workshop
This workshop deals with data collection and management practices within the Sinophone Borderlands project. The first aim is to present the methodologies to other project members, so that parallel data can be collected at multiple field sites, leading to interdisciplinary research within the project. The second aim is to take stock of current and best data management practices and to finalize the data management policy of the Sinophone Borderlands project.
Abstract
Poetry and song are inextricably interwoven in most indigenous Australian traditions. The poetic masterpieces found across the continent are little known outside their immediate communities, tied up as they are with the intricacies of the languages they are sung in. As a result, Australia has little awareness of the many hundreds of Shakespeares, Keatses, and Bob Dylans whose poetic masterpieces are composed in First Nations languages. The same goes for the continent’s rich and varied indigenous musical traditions. In this talk I will seek to give a glimpse into the richness of the poetic language found across a number of north Australian communities I have worked in, focussing on allusive subtlety, inner feeling, multilingual characterisation, the deployment of vocabulary and grammar for expressive nuance, and the role of song in maintaining language knowledge through the powerful emotional charge it generates.
Speakers
Nicholas (Nick) Evans, ARC Laureate Fellow and Distinguished Professor of Linguistics at the Australian National University, directs the Australian Research Council Centre of Excellence for the Dynamics of Language (CoEDL). He has carried out wide-ranging fieldwork on indigenous languages of Australia and Papua New Guinea. The driving interest of his work is the interplay between documenting and describing the far-reaching diversity contained in the world’s endangered languages and the many humanistic and scientific questions they can help us answer.
In addition to book-length grammars and dictionaries of several Aboriginal languages (Kayardild, Bininj Gun-wok, Dalabon) and edited collections on numerous linguistic topics, he has published over 180 scientific papers. His crossover book Dying Words: Endangered Languages and What They Have to Tell Us, which sets out a broad program for engaging with the world’s dwindling linguistic diversity, has been translated into French, Japanese, Korean and German, with a Chinese translation soon to appear.
He has also worked as a linguist, interpreter and anthropologist in two Native Title claims in northern Australia, and as a promoter of Aboriginal art.
Nick is a member of the Australian Academy of the Humanities and the Australian Social Sciences Academy, a corresponding member of the British Academy, and a recipient of the inaugural Anneliese Maier Forschungspreis from the Alexander von Humboldt Foundation / German Ministry of Science and Education and of the Ken Hale Award from the Linguistic Society of America.
Dr. Johannes Dellert (University of Tübingen, Germany) is a computational linguist who currently works as a lecturer and researcher at the Department of Linguistics in Tübingen. His current research focuses on the development of novel approaches to the interpretation of non-standard language (as part of Prof. Detmar Meurers’ ICALL research group), and on developing new interactive tools for historical linguistics as a continuation of his PhD work in Prof. Gerhard Jäger’s group. As part of the latter, Dr. Dellert is also the main contributor and coordinator for the NorthEuraLex database.
Dr. Robert Forkel (Max Planck Institute for the Science of Human History, Jena, Germany) leads the data management group of the Department for Linguistic and Cultural Evolution at MPI SHH.
Luís Morgado Da Costa (Nanyang Technological University, Singapore) is a cognitive scientist with a wide range of interests, currently focusing his work on computational linguistics. He is a PhD student at the Interdisciplinary Graduate School, Nanyang Technological University (NTU), in Singapore. Before that, he was a research associate in the Computational Linguistics Lab, Division of Linguistics and Multilingual Studies, also at NTU, working on several projects, ranging from Natural Language Parsing and Generation, Computational Lexicography and Computer Assisted Language Learning to general Mandarin Chinese and Japanese Linguistics.
The main focus of his current research is to model diverse aspects of linguistic knowledge in ways that can be applied to different tasks (e.g. Machine Translation, Word Sense Disambiguation, Computer Assisted Language Learning, etc.). He works mainly with English and Mandarin Chinese, but has also done work with other languages such as Japanese, Kristang, Portuguese, Indonesian, Coptic and Abui.
He is a member of DELPH-IN, sharing the communal commitment to develop open source NLP tools and resources for deep linguistic processing of natural languages, a member of the Global Wordnet Association, and a member of NTU’s Digital Humanities Cluster.
Dr. David Moeljadi (Nanyang Technological University, Singapore and soon Sinophone Project, Olomouc) graduated from Nanyang Technological University in Singapore in 2018 and works as a data scientist at Traveloka Services Pte. Ltd. in Singapore, a leading Indonesian online travel company in Southeast Asia. His research covers computational linguistics, grammar engineering, treebanking, corpora, dictionaries and lexicography. He built an open-source computational grammar for Indonesian called INDRA (Indonesian Resource Grammar), which can parse and generate sentences, and a treebank called JATI. He develops Wordnet Bahasa, a large-scale open-source semantic dictionary of the Malay languages (Malaysian and Indonesian). In collaboration with the Language Development and Cultivation Agency, under the Ministry of Education and Culture of the Republic of Indonesia, he created and develops a database for the authoritative Indonesian dictionary KBBI (Kamus Besar Bahasa Indonesia) and a database for loanwords in Indonesian. He collaborates with The CJK Dictionary Institute, Inc. in Japan and works as the chief translator of The Kanji Learner’s Dictionary: Indonesian Edition. Together with researchers from Tokyo University of Foreign Studies, he worked on a project to build an open-source morphological dictionary and analyser for Malay/Indonesian (MALINDO Morph), and on a project to build an Asian language parallel corpus (TALPCo). David is a member of the Indonesian Association for Lexicography and a member of the Deep Linguistic Processing with HPSG (DELPH-IN) research consortium.
PROGRAM
Day 1 – December 17, 2018
14:30, Křižkovského 10, block C, auditorium 3.05
Public lecture by Professor Nicholas Evans (ANU)
Lecture title: Waving to the other side: the language of poetry in indigenous Australian song
Day 2 – December 18, 2018: Data collection scope, techniques and manual
Univerzitní 3, room 225
Time | Speaker | Title |
Session 1: Data collection (language, culture, artifacts) | ||
9.30 | Coffee time | |
9.45 | František Kratochvíl | Linguistic data collection techniques and beyond (visuals) |
10.30 | Volker Gast | Parallel texts in research |
11.15 | Olaf Günther and Tereza Hejzlarová | Anthropological data collection and artifacts |
Lunch | ||
Session 2: Data collection (culture, knowledge systems) | ||
13.30 | Martin Soukup | Anthropological data collection techniques |
14.15 | Ondřej Kučera, Dan Faltýnek, Kateřina Šamajová, Renata Čižmárová | Ethnobotanical data collection |
15.00 | Alfred Gerstl | Narratives of political influence |
Coffee & Discussion: Sinophone Borderlands data collection manual contents | ||
Evening: Dinner |
Day 3 – December 19, 2018: Data management and mining
Univerzitní 3, room 225
Time | Speaker | Title |
Session 3: Data management (language databases) | ||
9.30 | Coffee time | |
9.45 | Johannes Dellert (Tübingen) | NorthEuraLex |
10.30 | Robert Forkel (MPI) | CLLD, D-PLACE and Archeological Data |
11.15 | Luís Morgado Da Costa (NTU) | Digital humanities tools for dummies |
Lunch | ||
Session 4: Data management (culture, history) | ||
13.30 | Olaf Günther & Tereza Hejzlarová | Anthropological and cultural data management |
14.15 | Martin Soukup | Anthropological data management |
15.00 | Nicholas Evans | Ethical issues: property rights and repatriations – Australian perspective |
Final Discussion, Coffee and Conclusion |