Shared Tasks
We endorse the following shared tasks and strongly encourage participants to take part and to submit system description papers to the workshop.
Important Dates:
- Training/dev set release: August 27, 2024
- Test set release and evaluation phase starts: October 29, 2024
- Evaluation phase closes: November 2, 2024
- Leaderboard to be public: November 5, 2024
- System description paper submission: November 15, 2024
- Acceptance Notification: December 7, 2024
- Camera-Ready Deadline: December 13, 2024
Task 1: Binary Multilingual Machine-Generated Text Detection (Human vs. Machine)
In the COLING Workshop on MGT Detection Task 1, we adopt a straightforward binary problem formulation: determining whether a given text is generated by a machine or authored by a human. This task continues and extends SemEval Shared Task 8 (Subtask A). We aim to refresh the training and test data with generations from novel LLMs and to include new languages.
There are two subtasks:
- Subtask A: English-only MGT detection.
- Subtask B: Multilingual MGT detection.
Details can be found in the [GitHub repository]
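As a rough illustration of the binary formulation, the sketch below scores predicted labels against gold labels. The label strings, the majority-class baseline, and the accuracy metric are assumptions for illustration only; the official data format and scorer are defined in the task repository.

```python
# Minimal sketch of the binary MGT task: every text receives one of two
# labels, and systems are scored against gold labels.
# NOTE: the label names ("human"/"machine") and the accuracy metric are
# assumptions for illustration -- consult the official repository.

def majority_baseline(texts, majority_label="human"):
    """Label every text with one class -- a trivial reference baseline."""
    return [majority_label for _ in texts]

def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    assert len(gold) == len(pred)
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

if __name__ == "__main__":
    gold = ["human", "machine", "human", "machine", "human"]
    pred = majority_baseline(["t1", "t2", "t3", "t4", "t5"])
    print(f"baseline accuracy: {accuracy(gold, pred):.2f}")  # 3/5 = 0.60
```

Any real submission would replace `majority_baseline` with a trained detector; the point here is only the input/output shape of the binary task.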
System Submission:
Please use the link below to submit your system.
Task 2: AI vs. Human – Academic Essay Authenticity Challenge
The objective is to identify machine-generated essays in order to safeguard academic integrity and prevent the misuse of large language models in educational settings. The input to the system is a collection of essays, including texts authored by both native and non-native speakers as well as essays generated by various large language models.
The task is defined as follows: “Given an essay, identify whether it is generated by a machine or authored by a human.” This is a binary classification task and is offered in English and Arabic.
Details can be found in the [GitLab repository]
How to obtain data:
Please fill out the data-sharing consent form below. We will send the data as soon as possible.
Data sharing consent form
System Submission:
Please use the link below to submit your system.
Task 3: Cross-domain Machine-Generated Text Detection
In the COLING Workshop on MGT Detection Task 3, we focus on the cross-domain robustness of detectors by testing submissions on the RAID benchmark. We adopt the same binary problem formulation as Task 1; however, the texts come from 8 different domains, 11 generative models, and 4 decoding strategies.
There are two subtasks:
- Subtask A: Non-Adversarial Cross-Domain MGT detection.
- Subtask B: Adversarial Cross-Domain MGT detection.
Details can be found in the [GitLab repository]
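The cross-domain emphasis above means a detector should be scored per domain, not only in aggregate, so that a system that works in only one domain is exposed. A minimal sketch of such a breakdown follows; the record format `(domain, gold, pred)` and the accuracy metric are assumptions for illustration, not the official RAID scorer.

```python
from collections import defaultdict

def per_domain_accuracy(records):
    """Compute accuracy separately for each domain.

    records: iterable of (domain, gold_label, predicted_label) triples.
    NOTE: this record layout is assumed for illustration; see the task
    repository for the actual evaluation format.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, gold, pred in records:
        total[domain] += 1
        correct[domain] += int(gold == pred)
    return {d: correct[d] / total[d] for d in total}

if __name__ == "__main__":
    recs = [
        ("news", "machine", "machine"),
        ("news", "human", "machine"),
        ("poetry", "human", "human"),
    ]
    print(per_domain_accuracy(recs))  # {'news': 0.5, 'poetry': 1.0}
```

Reporting the minimum or mean of the per-domain scores is one simple way to summarize robustness across domains.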