Collaborative annotation for reliable natural language processing : technical and sociological aspects / Karën Fort.

By: Fort, Karën [author.]
Language: English
Series: Focus series (London, England)
Publisher: London : Wiley-ISTE, 2016
Edition: 1st
Description: 1 online resource
Content type: text
Media type: computer
Carrier type: online resource
ISBN: 9781848219045; 1119307651; 9781119306696; 9781119307655
Subject(s): Natural language processing (Computer science)
Genre/Form: Electronic books.
DDC classification: 006.3/5
LOC classification: QA76.9.N38
Online resources: Full text is available at Wiley Online Library: https://onlinelibrary.wiley.com/doi/book/10.1002/9781119306696
Summary: This book presents a unique opportunity for constructing a consistent image of collaborative manual annotation for Natural Language Processing (NLP). NLP has witnessed two major evolutions in the past 25 years: firstly, the extraordinary success of machine learning, which is now, for better or for worse, overwhelmingly dominant in the field, and secondly, the multiplication of evaluation campaigns or shared tasks. Both involve manually annotated corpora, for the training and evaluation of the systems. These corpora have progressively become the hidden pillars of our domain, providing food for our hungry machine learning algorithms and reference for evaluation. Annotation is now the place where linguistics hides in NLP. However, manual annotation has largely been ignored for some time, and it has taken a while even for annotation guidelines to be recognized as essential.
Item type: EBOOK
Current location: COLLEGE LIBRARY
Home library: COLLEGE LIBRARY
Call number: 006.35 F7756 2016
Status: Available
Barcode: CL-52302
Total holds: 0

Includes bibliographical references and index.

Table of Contents
Preface ix
List of Acronyms xi

Introduction xiii

Chapter 1. Annotating Collaboratively 1

1.1. The annotation process (re)visited 1

1.1.1. Building consensus 1

1.1.2. Existing methodologies 3

1.1.3. Preparatory work 7

1.1.4. Pre-campaign 13

1.1.5. Annotation 17

1.1.6. Finalization 21

1.2. Annotation complexity 24

1.2.1. Example overview 25

1.2.2. What to annotate? 28

1.2.3. How to annotate? 30

1.2.4. The weight of the context 36

1.2.5. Visualization 38

1.2.6. Elementary annotation tasks 40

1.3. Annotation tools 43

1.3.1. To be or not to be an annotation tool 43

1.3.2. Much more than prototypes 46

1.3.3. Addressing the new annotation challenges 49

1.3.4. The impossible dream tool 54

1.4. Evaluating the annotation quality 55

1.4.1. What is annotation quality? 55

1.4.2. Understanding the basics 56

1.4.3. Beyond kappas 63

1.4.4. Giving meaning to the metrics 67

1.5. Conclusion 75

Chapter 2. Crowdsourcing Annotation 77

2.1. What is crowdsourcing and why should we be interested in it? 77

2.1.1. A moving target 77

2.1.2. A massive success 80

2.2. Deconstructing the myths 81

2.2.1. Crowdsourcing is a recent phenomenon 81

2.2.2. Crowdsourcing involves a crowd (of non-experts) 83

2.2.3. “Crowdsourcing involves (a crowd of) non-experts” 87

2.3. Playing with a purpose 93

2.3.1. Using the players’ innate capabilities and world knowledge 94

2.3.2. Using the players’ school knowledge 96

2.3.3. Using the players’ learning capacities 97

2.4. Acknowledging crowdsourcing specifics 101

2.4.1. Motivating the participants 101

2.4.2. Producing quality data 107

2.5. Ethical issues 109

2.5.1. Game ethics 109

2.5.2. What’s wrong with Amazon Mechanical Turk? 111

2.5.3. A charter to rule them all 113

Conclusion 115

Appendix 117

Glossary 141

Bibliography 143

Index 163

About the Author
Karën Fort is an Associate Professor at University Paris-Sorbonne (Paris 4), working in the STIH (meaning, text, computer science, history) team. Her current research interests include collaborative manual annotation, crowdsourcing and ethics.
