Quality of assessment

Partly by design – but mostly due to the correct alignment of the stars – I’m one of the few lucky people who do kendo as part of their job. Depending on the time of year it can pretty much be non-stop. Believe me, it’s neither as easy nor as exciting as it sounds and, of course, there are times when all of it gets too much (both physically and mentally/emotionally)… but in general I’d say that because of this strong kendo element within my job I mostly enjoy my working life.

Sometimes, the non-kendo things are a real pain though, and one such thing rolled around last week: an annual “training” seminar. This year a professor was invited from a prestigious private university in Tokyo to lecture on the topic of “classroom assessments”… brilliant.

The content of the lecture wasn’t actually that bad; it was just mostly irrelevant to my day-to-day work. Cue my brain to – as it generally does in situations like this – switch into kendo mode (I think this is the default setting). One topic in particular during the lecture caught my attention: “quality of assessment.”

Gradings (i.e. assessments) are something that we all go through, and I’m betting that all of us have experienced failure as well as success. This seems to be the normal way of the world and it’s probably healthy that we face a mix of each. Anyway, one thing that I’ve noted repeatedly over the past few years is that – despite my increased knowledge about and experience in kendo – I seem to have difficulty accurately predicting whether someone will pass or fail. Either this is because I simply am not yet experienced enough (or smart enough) to understand the intricacies of the grading procedure, or it’s because of some sort of strong element of subjectivity (even randomness?) within the procedure itself.

Last week at the seminar a couple of thoughts struck me (although I am of course considering kendo in Japan here, I’m pretty sure the same questions can be applied to any national organisation):

– The ZNKR is quite consistent in the percentage of people who pass grades. How is this done?

– At gradings emphasis is always on the examinee, not the examiner. Are examiners trained and are their choices judged? Are “bad” examiners removed or re-trained?

Hmmmmm, I see the potential for some worms and a can.

Anyway, here are some points regarding the “quality of assessment” from last week’s lecture (in bold), with a few brainstormed questions from yours truly. Please feel free to consider, argue, or add in your own ideas in the comments.

Points to consider when looking at the quality of an assessment

1. Validity

The degree to which an assessment taps into what one intends to measure.

Do gradings really reflect what kendo practitioners actually do during their keiko and in shiai, or do they have to show something else (an idealised version of what they are supposed to do)?

Does the required content of gradings actually progress through levels, or does it remain somewhat the same between them?

Is there any bias? This could be age or gender bias, or perhaps questions about impartiality (especially pertinent in smaller organisations, or in arts where examinees are not anonymous).

Are participants being judged on what they can do or are they being compared to their opponents? If the latter is true, is it fair to match people who have wildly different ages or to mix genders?

etc.

2. Reliability

The degree to which assessment results are consistent no matter when and where a student takes an assessment or who scores the student’s response.

Is judging consistent across all examiners?

Is judging consistent across grading locations?

Are the content and task difficulty consistent across all parts of the grading process (shiai, kata, written)?

etc.
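
For the statistically minded, consistency between two examiners can be put into a rough number. The sketch below uses Cohen’s kappa, a standard agreement statistic that compares how often two raters agree with how often they would agree by chance alone. To be clear, this is purely my own illustration: the pass/fail lists are invented, the function name is hypothetical, and I have no idea whether any federation measures anything like this.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters judging the same examinees.

    Returns 1.0 for perfect agreement and roughly 0.0 when the raters
    agree no more often than chance would predict.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must judge the same non-empty group")

    n = len(ratings_a)

    # Share of examinees on whom the two examiners gave the same verdict.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, based on how often each examiner
    # hands out each verdict overall.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both examiners gave one identical verdict to everybody.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented verdicts from two hypothetical examiners for ten examinees.
examiner_a = ["pass", "pass", "fail", "fail", "pass",
              "fail", "pass", "fail", "fail", "fail"]
examiner_b = ["pass", "fail", "fail", "fail", "pass",
              "fail", "pass", "pass", "fail", "fail"]

kappa = cohens_kappa(examiner_a, examiner_b)
print(f"kappa = {kappa:.2f}")
```

With these made-up verdicts the two examiners agree on 8 of the 10 examinees, but because both fail people 60% of the time a fair chunk of that agreement would happen by chance anyway, so kappa comes out at about 0.58 rather than 0.8.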

3. Practicality

The degree to which an assessment can be administered and maintained with available resources.

Does the organisation have enough people with the required experience (and training) to host a grading?

etc.

(True story: I remember being asked to read, then pass or fail the grading questions for 4dan in London years and years ago… I was 3dan at the time)

4. Impact

The degree to which an assessment has positive and/or negative effects on test takers, teachers, students, and society.

Are participants simply “failed” or are they given useful feedback to promote future improvement?

Do the overall results provide useful information for kendo teachers to aid in the development of kendo for the future?

Are examiners fully aware of the ramifications for the future of kendo should people of sub-par ability be promoted?

etc.

I guess what I am sort of addressing here is the very obvious difficulty in ensuring that the grading process is done accurately/fairly. The current system seems to be highly subjective and seems to have – at least here in Japan (where grading times are extremely short and examinees are somewhat anonymous) – an element of randomness within it. After much thought on the matter, I’ve come to the tentative conclusion that the grading system is probably the weakest area (most open to problems) in modern kendo.

Anyway, these are just some thoughts that I’ve had for a while but which re-surfaced and became re-packaged based on the content of the lecture I listened to last week. If you have any ideas/thoughts/opinions on the matter please feel free to discuss in the comments, either here or on Facebook. Cheers.

By George

George is the founder and chief editor of kenshi247.net.
For more information check out the About page.

5 replies on “Quality of assessment”

I had a discussion with the staff at the Kagawa Federation a few weeks back because I wanted to have a list of criteria for each grade (starting with the kyus since I am working with a few beginners at the moment and I’d like to have them pass their kyus as soon as they can). I even had a similar discussion with an HS teacher here who’s been doing that for about 3 decades… The answer astonished me: neither the people at the federation nor the teacher were able to give me objective criteria for kyus or dans. The agreed answer was: we evaluate the examinees by comparing them to the examined batch… I understand that examiners’ experience does create a framework in which they can evaluate a majority of people with some accuracy, but it certainly does not look like there is a *system* to avoid obvious inconsistencies. And thank you again for the interesting post!

Thoughtful post, thank you. The font is migraine-inducing, though.

Cheers!

If you click reader view on your browser you should be able to see it in a plainer font.

Hi George, I think you’ve made some really good points about the subjectivity of Kendo. Kendo is definitely one of the most subjective arts – even the definition of yuuko-datotsu is made quite loosely, perhaps intentionally – open to interpretation, and flexible to suit different grades and ages.

In Australia, we have a national-level document that specifies the recommended criteria for grading (whilst recognising that state delegations take ownership of local gradings). However I do agree that it is still largely subjective because how do you judge “yep that’s a good enough men-uchi for a (grade)”?

I think one way to promote consistency is through exposure. The only frame of reference we have is the standard of Kendo we see every day (or week). In Japan, it’s quite easy to see hundreds of practitioners of various grades regularly, so over time I think one can get a decent idea of “how good is a typical (grade)”. Perhaps this is why Japan’s grading pass rate seems to be consistent – despite the randomness – through sheer statistics alone. If you toss a coin 10 times, you are unlikely to get an exact 50/50 split of heads and tails (you might even get heads 10 times) – but do it 10,000 times and the split will come out very close to 50/50.

Some people say that at higher dan grades “you have to score at least (x times) to pass”. Sure that would look good, but hypothetically speaking if two extremely good players fight, and are unable to score on each other, does that mean both fail? I’d like to think the grading panel would look at something deeper and judge people independently. Likewise if someone is partnered up with someone who messed up their kata – it’s unfair to fail them because it’s something outside their control. So I’d like to think that how you respond to the situation is part of the grading too.

I have been on the grading panel myself (for kyu grades) and I definitely understand the difficulty of resisting the urge to argue with the other panel members about whether someone “deserves” x grade, or to double-grade, etc. (since grading should be anonymous and objective, with no discussion between members). However if we are unable to discuss, how can we align ourselves to the same standard? If we can make a mistake as a shinpan, surely we can make a mistake on the grading panel, so surely it must be important to receive guidance there as well. As we know, FIK is spending a great deal of effort on referee seminars – however grading standardisation seems to be something that gets completely left to each local region.

Fortunately one saving grace is that… ultimately, grade in Kendo doesn’t mean much. It’s not an academic score that impacts your day-to-day life. So perhaps that’s why it’s not a big deal – anyone who fails a grading can simply retry at a later date.

Thanks for the thought-provoking comment Bernard.

I’d like to believe your idea in the last paragraph, and I do agree to a certain extent, but it certainly affects some people’s lives. The most obvious example is police officers who are aiming to become kendo teachers or already are. Even if you aren’t a policeman/woman, if kendo is a large part of your existence, it can indeed impact real-life stuff, unfortunately. I admit that this is only for a small number of people though.

Anyway, the percentage pass consistency here in Japan PROVES that the system is unfair. Remember that it’s grading fees (application and registration) that are largely responsible for the ZNKR’s existence (including salaries, trips abroad, what have you). Like in 2016, I posit that we will see (in fact, we just saw so in the Kyoto results) a slightly higher pass rate for yudansha grades… because it is a WKC year.

Personally I’m for scrapping grades after 5th dan, but the ZNKR would lose its largest (its only?) significant income stream.

This is a discussion to be had over beer! I can furnish you with real life examples of this unfairness then… !
