I was keen to join this as the title seemed relevant to our ACP – Academic Coursework Presentation – which has a Q&A component. TAFSIG is the Testing, Assessment and Feedback Special Interest Group within BALEAP (glad she clarified that because I didn’t know!). They have a YouTube channel with webinar recordings (I shall have a look at that!)
4 speakers: Craig Davis, Nicola Harding (Manchester), Lori-ann Miln, Phillippa Bunch (Southampton).
Context and collaborative development:
The presessionals are similar in some respects: large student and tutor cohorts (100+ tutors). Both assess writing through an essay (challenges brought by GenAI and suspected malpractice, but no real way of authenticating engagement with the writing components).
Manchester: Reading to Writing (80% of the writing score, 100% of reading); L2S (seminar: 50% of the speaking score, 100% of listening). Previously, a seminar with a brief presentation followed by discussion was 100% of the speaking score.
Southampton: Assessments are more separated out. One key difference from Manchester is that students can choose their own topic for their researched essay, which means a lot of different research areas. Reading is demonstrated through writing. Speaking is a presentation followed by a Q&A. Listening is a lecture followed by a discussion where skills are assessed. There are separate reading/writing tutors and speaking/listening tutors.
Manchester and Southampton began collaborating a few years ago, identifying similar challenges and sharing ideas around assessment; after several meetings they realised they were all moving in the same direction as GenAI use became more prominent and they tried to deal with it in their assessment design. They were also prompted by feedback from lecturers across the university saying that a rehearsed presentation was not effectively assessing students’ ability to produce spontaneous speech.
Manchester Case Study
They created the question-and-answer assessment:
- Question 1: Prepared question, same for all students (2 mins)
- Question 2: product-focused question selected from a question bank (3 mins)
- Question 3: process-focused question selected from a question bank (3 mins).
All students were given the same question: To what extent should AI be integrated into HE?
Example of Product focus: Tell us about a specific source you used in your essay – how and why did you use it?
Example of Process focus: What was the most significant/helpful piece of feedback you received and how did you respond to it? (make sure the student answers both parts of the question)
Follow-up prompts: Why do you think that? / Is there anything else that you found helpful/unhelpful?
Scaffolded process:
- Stage 1: Assessment overview (brief introduction during student orientation talks)
- Stage 2: Tutor-led session focusing on the assessment (every assessment is supported by a synchronous taught session, where students look at the criteria and apply them to example answers).
- Stage 3: individual tutorials x 2, redesigned to follow a Q&A format (an existing course activity, with a mini Q&A built in to enable practice).
- Stage 4: individual study (checklist of preparations and a reflection leading into the final stage)
- Stage 5: final questions and preparations (a final group tutorial centred on the Q&A, plus an assessment coordinator drop-in where students could ask any questions – not many did).
How were tutors supported?
- The assessment was piloted with volunteer students from PS April 2025 and standardisation packs were created using videos from the pilot.
- An answer guide was created to support marking (the University of Manchester is quite prescriptive around this part of the assessment – there are pros and cons, but one of the pros is how easy it makes producing such a guide!).
- Streamlined criteria for live marking were made as useful as possible for the tutors, as there are lots of tutors each marking up to 18 students, plus double-marking.
- A marking template was provided, with space to record which questions were asked and space for notes about the answers.
Students were assessed on speaking by focusing on fluency, pronunciation and language. They were also assessed on their responses to the questions by focusing on relevance, specificity and development.
The speaking dimension (fluency, pronunciation, language) contributed only to the speaking score, at 25%, with the responses to questions 1–3 each contributing 25%. For the writing score, the Q&A counted for 20%, with the responses to questions 1–3 contributing 33% each – so increasing the focus on process.
They were quite prescriptive in terms of what was provided to tutors, as it was the first time running it at large scale. Everything was live double-marked, with all assessments scheduled over 3 days.
How did students perform?
Generally fairly consistent. Outliers:
- Speaking: 57 students scored 20%+ higher on the L2S than on the Q&A; 12 students scored 50% or below in the Q&A but above 70% in the L2S.
- Writing: 20 students scored 25%+ lower on the Q&A.
Overall, the assessment was well received by teachers and students. However, timetabling was a big challenge as everything was live double-marked. Question banks supported tutors and ensured consistency, while prompts supported students in speaking for the full 3 minutes (mostly). However, more question-specific follow-up prompts and more guidance on managing the discussion elements may be needed.
Parts 1 and 2 had a lot of overlap, as well as some scripting and reading from the essay, which made them more difficult to authenticate and mark and meant they did not say a lot about student engagement. Question 3 was more revealing about how students engaged with the process, and therefore more useful.
Next time: they want to switch the focus to process rather than the current product – product – process pattern, keeping the same components (1 rehearsed, 2 not). This will require revisiting the questions and prompts. They may also change the weightings to make the Q&A 40% of the writing score. In terms of identifying outliers earlier, the idea is a minimum component threshold for referral: if a student falls below 40% for any component they will be flagged. They also want a cap on the gap between the writing and speaking scores.
Southampton Case Study
Southampton did a formative presentation (4 minutes) and Q&A (2 minutes), then later a summative presentation (6 minutes) and Q&A (4 minutes). The Q&A consisted of 2 questions: 1) demonstrate understanding of the content and 2) demonstrate reflection on the research process.
They don’t have much quantitative data at the moment but plenty of qualitative data. The Q&A was worth 20% of the assessment criteria, with content, structure, communication, and precision and accuracy also each worth 20%. Students needed to be able to talk about how they used their sources and how they used feedback.
In terms of tutor support: they provided structured tutor training embedded in the induction programmes, a question bank to support consistent assessment delivery, and standardisation sessions. They thought this would suffice. But after moderation and observation of the formative assessment, they saw a lot of variability and inconsistency in tutors’ ability to run the Q&A, in terms of formulating suitable open-ended questions and scaffolding student responses in real time. There were also struggles around sustaining interaction beyond surface-level clarification, in terms of not allowing enough time and space for students to develop detailed responses. Some interactions were very brief, while others were more developed and encouraged critical thinking, thereby resulting in a better score. So, based on the formatives, in preparation for the summatives, they produced enhanced guidance and a question bank with initial questions and possible follow-up questions, which helped with the issues identified when it came to the summatives. It should be noted that the questions had to be able to fit everyone’s essay topic even though the topics would all be different.
Things they found interesting: students developed more confidence in oral academic English, there was stronger evidence of research engagement, and there were fewer formal academic misconduct cases, but the quality of student engagement depended heavily on the tutor’s questioning technique. A positive outcome: students reported feeling more prepared to communicate in an academic environment.
Looking ahead, they want to perhaps shrink the prepared presentation and extend the Q&A, with increased emphasis on process, linked to the student folder. They also want to enhance tutor development with targeted training.
Key considerations and shared findings
- Both assessment designs responded to AI by increasing emphasis on more spontaneous and more authentic Q&A
- Different approaches but common challenges particularly around questioning, interaction and consistency.
- Overall positive student outcomes.
- Ongoing debate on balancing structure (for fairness and reliability) with flexibility (for authenticity and responsiveness).
Discussion/Questions
The first question was about student grading: how much is focused on subject knowledge, language, etc.? Manchester doesn’t score much for content; it focuses more on language and on relevant responses to the questions (which is sort of content-based).
The second question was about the degree of mitigation of AI use. The focus on process has helped, moving away from end-loaded assessment and building it in throughout the course, as well as building in meaningful dialogue from day 1. It also makes feedback more of a process, as it is revisited. At each formative stage there is an opportunity for discussion with the students, and this can be very constructive. Also, knowing from day 1 that they will need to do this encourages engagement with the process. With online courses, live transcription is a challenge: live Q&As are much better quality than online ones, as students have to be more natural and spontaneous and use oral strategies when they aren’t sure what to say. Any suggestions to help with online delivery are welcome!
There was also a question about how prepared students were for the level of criticality required by the speaking assessment: the reading/writing team do a lot of critical reflection as part of the course.
On the question bank: Southampton ask six questions each time, formatively, in reflective tutorials, so students get used to the style of questioning. A list of questions is not given in advance, but students do get two practice runs, so they have the chance to practise responding to that style of questioning.
How important is the prepared part of the presentation – would it be better to get to the Q&A more quickly? Answer: good question; they have been tempted to do away with the presentation part, but they keep coming back to the point that on most degree programmes students have to give a presentation, whether or not they use AI to do that, so the presentation skills are still useful to take forward and it therefore still has value. The focus on process is already very present in the writing, and this may be brought more into the speaking as well. In terms of keeping the presentation, the prepared part gives students more confidence coming into the more spontaneous part. There is still some thought about changing the weighting so that the prepared part carries less weight.
My thoughts
Phew! An interesting hour, well spent! Southampton’s current approach seems more similar to ours – except we have a 7–8 minute presentation and 2–3 minutes of questions. We have a mock and a final, which I guess equates to the formative–summative split. The latest cycle did identify issues like those mentioned above around consistency in questioning, in terms of the difficulty of the questions asked and the depth/length of interaction. We had already discussed the need to standardise this more, so there are some good ideas in this session to draw on!
I feel that the Q&A component of our criteria could also use fine-tuning, drawing on some of the ideas shared today. At the moment the Q&A is only worth half a criterion, and that is not something we can change without a Studygroup-wide discussion and process, so it is definitely not a very-near-future thing. But we can certainly improve how we run and mark the Q&A part, and then, if at some point we want to shift it towards representing more of the overall presentation score (which itself is 50% of students’ speaking score), we will be in a better position to do so.
For now, we have an entirely different kettle of fish spilling all over the place development-wise, however! 😉