Ben Burtenshaw

Antwerp, Flemish Region, Belgium
34K followers 500+ connections

About

Right now, I'm working on educational and learning material at Hugging Face which teaches…

Experience

  • Hugging Face
  • -

    Paris, Île-de-France, France

  • -

    Paris, Île-de-France, France

  • -

    Antwerp, Flemish Region, Belgium

  • -

    Groningen, Netherlands

  • -

    Ghent, Flemish Region, Belgium

  • -

    Brussels Area, Belgium

  • -

    Antwerp Area, Belgium

  • -

    London, England, United Kingdom

Publications

  • The future of open human feedback

    Nature Machine Intelligence

    Human feedback on conversations with language models is central to how these systems learn about the world, improve their capabilities and are steered towards desirable and safe behaviours. However, this feedback is mostly collected by frontier artificial intelligence labs and kept behind closed doors. Here we bring together interdisciplinary experts to assess the opportunities and challenges to realizing an open ecosystem of human feedback for artificial intelligence. We first look for successful practices in the peer-production, open-source and citizen-science communities. We then characterize the main challenges for open human feedback. For each, we survey current approaches and offer recommendations. We end by envisioning the components needed to underpin a sustainable and open human feedback ecosystem. In the centre of this ecosystem are mutually beneficial feedback loops, between users and specialized models, incentivizing a diverse stakeholder community of model trainers and feedback providers to support a general open feedback pool.

  • Classifying toxicity in adolescent conversations : applications in paediatrics

    University of Antwerp

    This PhD thesis investigates and analyses the effectiveness of text classification models for detecting toxic language in paediatric settings. The literature highlights toxic language, and its effects like bullying and mental health problems, as fundamental societal challenges. Moreover, the World Health Organisation asserts that tackling bullying for adolescents should not be limited to educational settings and that it is the responsibility of healthcare institutions to address these issues. Social media platforms have implemented text classification systems that protect against toxic language within their products, and paediatric wards should have comparable safeguards when using language-based technology. The thesis aims to expose methods from Natural Language Processing that are suitable for application in paediatrics, and to highlight aspects of state-of-the-art methodology that demand consideration and attention. The thesis is structured in three parts: an introduction, a series of case studies, and a strategic analysis. The introduction is targeted at non-expert readers and intends to support the technical case study chapters by clarifying systems and practices from the field of Natural Language Processing. The case studies part contains a series of self-contained experiments in text classification and toxic language detection. The final part returns to the systems from the case studies and analyses them against the context of paediatric application.

  • A Dutch Dataset for Cross-lingual Multilabel Toxicity Detection

    Proceedings of the 14th Workshop on Building and Using Comparable Corpora (BUCC 2021)

    Multi-label toxicity detection is highly prominent, with many research groups, companies, and individuals engaging with it through shared tasks and dedicated venues. This paper describes a cross-lingual approach to annotating multi-label text classification on a newly developed Dutch language dataset, using a model trained on English data. We present an ensemble model of one Transformer model and an LSTM using multilingual embeddings. The combination of multilingual embeddings and the Transformer model improves performance in a cross-lingual setting.

  • Spans are Spans, stacking a binary word level approach to toxic span detection

    Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

    This paper describes the system developed by the Antwerp Centre for Digital humanities and literary Criticism [UAntwerp] for toxic span detection. We used a stacked generalisation ensemble of five component models, with two distinct interpretations of the task. Two models attempted to predict binary word toxicity based on n-gram sequences, whilst three categorical span-based models were trained to predict toxic token labels based on complete sequence tokens. The five models’ predictions were ensembled within an LSTM model. As well as describing the system, we perform error analysis to explore model performance in relation to textual features.

  • Offence in dialogues: A corpus-based study

    Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

    In recent years, an increasing number of analyses of offensive language have been published; however, they deal mainly with the automatic detection and classification of isolated instances. In this paper we aim to understand the impact of offensive messages in online conversations diachronically, and in particular the change in offensiveness of dialogue turns. In turn, we aim to measure the progression of offence level as well as its direction, for example, whether a conversation is escalating or declining in offence. We present our method of extracting linear dialogues from tree-structured conversations in social media data and make our code publicly available. Furthermore, we discuss methods to analyse this dataset through changes in discourse offensiveness. Our paper includes two main contributions: first, using a neural network to measure the level of offensiveness in conversations; and second, the analysis of conversations around offensive comments using decoupling functions.

  • Sarcasm detection using an ensemble approach

    Proceedings of the Second Workshop on Figurative Language Processing

    We present an ensemble approach for the detection of sarcasm in Reddit and Twitter responses in the context of The Second Workshop on Figurative Language Processing held in conjunction with ACL 2020. The ensemble is trained on the predicted sarcasm probabilities of four component models and on additional features, such as the sentiment of the comment, its length, and source (Reddit or Twitter) in order to learn which of the component models is the most reliable for which input. The component models consist of an LSTM with hashtag and emoji representations; a CNN-LSTM with casing, stop word, punctuation, and sentiment representations; an MLP based on Infersent embeddings; and an SVM trained on stylometric and emotion-based features. All component models use the two conversational turns preceding the response as context, except for the SVM, which only uses features extracted from the response. The ensemble itself consists of an adaboost classifier with the decision tree algorithm as base estimator and yields F1-scores of 67% and 74% on the Reddit and Twitter test data, respectively.

  • Synthetic literature: Writing science fiction in a co-creative process

    Proceedings of the Workshop on Computational Creativity in Natural Language Generation

    This paper describes a co-creative text generation system applied within a science fiction setting to be used by an established novelist. The project was initiated as part of The Dutch Book Week, and the generated text will be published within a volume of science fiction stories. We explore the ramifications of applying Natural Language Generation within a co-creative process, and examine where the co-creative setting challenges both writer and machine. We employ a character-level language model to generate text based on a large corpus of Dutch novels that exposes a number of tunable parameters to the user. The system is used through a custom graphical user interface that helps the writer to elicit, modify and incorporate suggestions by the text generation system. Besides a literary work, the output of the present project also includes user-generated meta-data that is expected to contribute to the quantitative evaluation of the text-generation system and the co-creative process involved.


Languages

  • Dutch

    Full professional proficiency

  • French

    Limited working proficiency

  • English

    Native or bilingual proficiency
