
Author Topic: How does inter-subject scaling work?  (Read 2378 times)



How does inter-subject scaling work?
« on: October 25, 2019, 04:11:40 pm »
Please note that all of the information in this thread is based on my personal interpretation of publicly available documents. There may be inaccuracies. Please feel free to ask questions! To make a post, you will need to be logged in to your ATAR Notes account. If you don't have one, you can register here.

> Part 1: What is the ATAR?
> Part 2: How is the ATAR calculated?

Hey team! Just continuing this little series on the technical side of QCE and the ATAR. :) In previous threads, we've looked at:

> what the ATAR is conceptually; and
> broadly, how it's calculated.

We know so far that the ATAR is a percentile rank, reported on a scale of <30 to 99.95 in increments of 0.05. We know that the ATAR is calculated from a student's top five subjects (with some restrictions). We also know that those individual subject scores are reported out of 100, based on both internal assessments and (for General subjects, at least) an external assessment.

Where we left it in the last thread, subject scores were reported out of 100, but we asked the question: is that all? Do we just leave them as raw marks based on the internal and external assessments? Let's explore this a little further.

Let's say there are two QCE students: Alex and Billy. Alex is studying the fictional QCE subject of Darts. Billy is studying the fictional QCE subject of Sudoku. Both Alex and Billy achieve a subject score of 70 in their subjects. Are these scores equally impressive?

On the face of it, we don't have much choice other than to say yes. With no further context, all we have is the numbers on paper. 70 is equal to 70, so it seems logical enough to say that Alex and Billy did equally well. But through QCE, we're in the fortunate position of actually having access to more information, and we should use that information accordingly.

For example, let's say that, hypothetically, Alex's score of 70 in QCE Darts was the highest score of anybody studying QCE Darts across the entire state, and Billy's score of 70 in QCE Sudoku was the lowest score of anybody studying QCE Sudoku across the entire state. All of a sudden, our perceptions of their scores change - Alex's seems a lot more impressive, and Billy's seems somewhat less impressive.

We should take this new information into account. To do that, we can use something called inter-subject scaling.

To clarify, you don't need to know any of this stuff to do well in QCE or to get a great ATAR. You could be entirely oblivious about how the system works and smash your studies; equally, you could know the system intricately but struggle in terms of your scores. But knowing how things like inter-subject scaling work may clear up a few misconceptions, which is why I'll go into it now.

The whole point of inter-subject scaling is to make QCE a level playing field. Scaling is not designed to "reward" or "punish" you for taking certain subjects, and it doesn't do this in practice, either. The idea of scaling is to counter variables (such as in the example of Alex and Billy above) to give a more accurate indication of QCE performance. If we didn't have something like inter-subject scaling, there would be an incentive for students to simply choose the subjects they thought would be easiest to score highly in. This isn't what we want; instead, with scaling, you can choose subjects with confidence based on what you're interested in and passionate about, what you're good at, or what you need as a university pre-requisite.

Scaling takes your "raw" score for each subject, and transforms it into a "scaled" score. What's important to note here is that your level of achievement does not change - all that changes is the scale it's being reported on. To explain this in a different way, let's say we were trying to compare the weights of two people:

> Person 1: 85kg
> Person 2: 150lb

Can we say that Person 2 is heavier just because 150 is a larger number than 85? No, absolutely not - the weights are being reported on different scales (kilograms for Person 1, pounds for Person 2), so to accurately compare them, we need to convert them onto a consistent scale. That's basically what inter-subject scaling is doing at QCE level.
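To put the analogy in concrete terms, here's a tiny sketch of the weight comparison. The kg-per-lb factor is the standard definition of the international pound; everything else is just the example above:

```python
# Convert both weights onto one scale before comparing them.
KG_PER_LB = 0.45359237  # standard definition of the international pound

person_1_kg = 85.0
person_2_kg = 150 * KG_PER_LB  # 150 lb expressed in kilograms

print(round(person_2_kg, 2))      # ~68.04 kg
print(person_1_kg > person_2_kg)  # True - Person 1 is actually heavier
```

Once both numbers are on the same scale, the comparison is trivial - which is exactly the point of scaling.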

We can't make a judgement call on which QCE subjects are inherently harder or easier, because this is super subjective. For example, I might find Maths Methods super easy but Legal Studies super difficult. For you, the opposite might be true. What we can do, though, is consider how competitive the cohort (everybody studying that subject in that year) is for each subject. To do that, we initially need to convert raw subject scores into percentiles.

Basically, we need to consider where a specific result in a specific subject places you in relation to the rest of the state. For example, maybe a raw result of 91 for English might place you at the 92nd percentile of the state - that is, your result was better than 92% of the state. Or maybe your 82 raw in Methods placed you at the 85th percentile of the state. Or maybe your 75 raw in Legal Studies placed you at the 72nd percentile of the state. These percentiles act as a first round of scaled results. So it might look something like this (please note that the numbers here are entirely made up - we don't know how different subjects will scale until the end of 2020, but more on this later):

> English: raw 91 scales to 92
> Maths Methods: raw 82 scales to 85
> Legal Studies: raw 75 scales to 72
> Physics: raw 70 scales to 72
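If you like, you can think of this first conversion as a "what percentage of the cohort did I beat?" calculation. Here's a rough sketch with an invented toy cohort - purely illustrative, not QTAC's actual procedure:

```python
# Sketch of converting a raw subject score into a state-wide percentile.
# The cohort scores are invented for demonstration.

def percentile(cohort_scores, my_score):
    """Percentage of the cohort that scored strictly below my_score."""
    below = sum(1 for s in cohort_scores if s < my_score)
    return 100 * below / len(cohort_scores)

# Toy cohort of ten English results
english = [55, 60, 64, 68, 72, 77, 81, 85, 88, 91]
print(percentile(english, 91))  # 90.0 - the top result beats 90% of this cohort
```

With a real state-sized cohort, these percentiles become much finer-grained, but the idea is the same.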

What you might notice is that in the fictional example above, a raw result of 75 in Legal Studies and a raw result of 70 in Physics both initially scale to 72. So we can say that, at this stage, a 75 in Legal Studies is approximately as impressive as a 70 in Physics.

Since these percentile ranks supersede the initial raw results, we have our first round of scaled results. Great! But do we just leave it at that?

So far, we have accounted for where students' results ranked them in each subject. But what if it were actually easier to rank highly in some subjects than in others?

Let's consider another analogy. Imagine both you and your friend are competing in a 100-metre running race. You both run the race in the exact same time - say, 12 seconds flat. In your race, though, you were competing against Usain Bolt, Tyson Gay, and a bunch of other professional sprinters. Your friend was more fortunate, coming up against a bunch of toddlers. Due merely to the competition in each race, your 12-second race places you last, whilst your friend's 12-second race places them first.

It wouldn't seem fair to punish you because your race had very strong competition, or reward your friend because their race had very weak competition. Remember: your actual level of achievement was exactly the same. The same is true for QCE subjects. It wouldn't seem fair to punish students who take subjects with a lot of competition, or reward students who take subjects with weaker competition. So to counter this phenomenon, we find something called the polyrank.

The polyrank provides an overall indication of each student's QCE performance. It's derived as the average of a student's top five scaled subject results. Using the same subjects and results as earlier, we can see that this student's polyrank comes out at 74.40.
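As a rough sketch, the polyrank described above is just an average of a student's best five scaled results. All of the numbers here are invented (including the fifth and sixth results, which weren't given above):

```python
# Minimal sketch of the polyrank: the average of a student's top five
# scaled subject results. All numbers are invented for demonstration.

def polyrank(scaled_results):
    top_five = sorted(scaled_results, reverse=True)[:5]
    return sum(top_five) / len(top_five)

# Six scaled results; only the best five count
student = [92.0, 85.0, 72.0, 72.0, 51.0, 48.0]
print(polyrank(student))  # 74.4
```

Note that the sixth result (48.0) is simply dropped - only the top five contribute.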

Great - but what does that actually mean?

Well, we can use the polyrank to help us determine how competitive each subject was (not how difficult the content was). To do this, we find the average polyrank of every single student in the state that achieved a specific result in a specific subject. We might look at the pool of students in Queensland who achieved a 91 in English, for example, and find the average polyrank of that sub-set of students. Doing this for each result for each subject will leave us with a new round of scores, which becomes our most updated round of scaled results.

(Another timely reminder that all of these numbers are completely fictional and used only for demonstrative purposes.)
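The grouping-and-averaging step described above can be sketched like this - again, every number is invented, and this is only my interpretation of the process:

```python
# Sketch: for each (subject, result) group, the new scaled score is the
# average polyrank of every student in that group. All data is invented.
from collections import defaultdict
from statistics import mean

# (subject, result, student_polyrank) triples for a toy state
records = [
    ("English", 91, 74.4), ("English", 91, 78.0), ("English", 91, 69.6),
    ("Physics", 70, 80.0), ("Physics", 70, 76.0),
]

groups = defaultdict(list)
for subject, result, pr in records:
    groups[(subject, result)].append(pr)

new_scaled = {key: mean(prs) for key, prs in groups.items()}
print(round(new_scaled[("English", 91)], 2))  # average polyrank of the 91-in-English group
print(round(new_scaled[("Physics", 70)], 2))  # average polyrank of the 70-in-Physics group
```

The key idea is that a result's new scaled value reflects how strong the students holding that result were overall, not just within that one subject.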

Great - so now we have another round of scaled results. That's surely it, right? Nah - there's more.

The thing is, now that we have a new round of scaled scores, we can re-calculate each student's polyrank based on these updated results. We repeat the process, finding the (updated) average polyrank of students who achieved a specific result in a specific subject, leading to another round of scaled scores.

Then, because we have a new round of scaled scores, we can find new polyranks.

Then, because we have new polyranks, we can find a new round of scaled scores.

Then, because we have a new round of scaled scores, we can find new polyranks.

Then, because we have new polyranks, we can find a new round of scaled scores.

And so on. This process repeats until the results "flatten out" - that is, until another round would barely change anything - and we settle on a final round of scaled subject scores. These are quite specific, going to the second decimal place. Once we have our final scaled results, we can find the aggregate, which we spoke about in a previous thread.
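If you're curious, the back-and-forth described above can be sketched as a small fixed-point loop: compute polyranks from the current scaled scores, re-derive scaled scores as average polyranks, and stop once nothing moves. This is only a toy interpretation of the publicly described idea, not QTAC's actual algorithm, and every number is invented:

```python
# Toy fixed-point sketch of the iterative scaling process. Assumes every
# (subject, result) key is held by at least one student.
from statistics import mean

def scale(students, initial_scaled, tol=0.001, max_rounds=1000):
    """students: list of per-student lists of (subject, result) pairs.
    initial_scaled: {(subject, result): first-round scaled score}."""
    scaled = dict(initial_scaled)
    for _ in range(max_rounds):
        # Step 1: polyrank = average of each student's top five scaled results
        polyranks = [
            mean(sorted((scaled[key] for key in s), reverse=True)[:5])
            for s in students
        ]
        # Step 2: new scaled score = average polyrank of everyone holding
        # that specific result in that specific subject
        new_scaled = {
            key: mean(pr for s, pr in zip(students, polyranks) if key in s)
            for key in scaled
        }
        # Stop once no score moves by more than tol ("flattens out")
        if all(abs(new_scaled[k] - scaled[k]) < tol for k in scaled):
            return new_scaled
        scaled = new_scaled
    return scaled

# Tiny invented state: two students, two subjects
students = [
    [("English", 91), ("Darts", 70)],
    [("English", 80), ("Darts", 70)],
]
initial = {("English", 91): 92.0, ("English", 80): 75.0, ("Darts", 70): 60.0}
final = scale(students, initial)
print({k: round(v, 2) for k, v in final.items()})
```

In this degenerate two-student example the scores converge towards each other; with a realistic state-sized cohort, the scores settle into a stable spread instead.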

When each student has an aggregate, we can rank these from highest to lowest, and therefore find each student's percentile rank. That is, their ATAR.
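As a final toy sketch, here's one way that last step might look: rank the aggregates, convert each to a percentile, and report it in ATAR-style increments of 0.05 (capped at 99.95). Again, this is my own illustrative interpretation, not the real QTAC procedure:

```python
# Hypothetical sketch of the last step: aggregate -> percentile rank ->
# ATAR-style number in steps of 0.05, capped at 99.95. Numbers invented.
import math

def atar_like(aggregates):
    n = len(aggregates)
    out = {}
    for agg in aggregates:
        below = sum(1 for a in aggregates if a < agg)
        pct = 100 * below / n
        # Round down to the nearest 0.05, capped at 99.95
        out[agg] = min(math.floor(pct * 20) / 20, 99.95)
    return out

# Ten invented aggregates
aggs = [401.2, 395.0, 388.8, 371.5, 360.0, 355.1, 342.9, 330.0, 321.4, 310.7]
ranks = atar_like(aggs)
print(ranks[401.2])  # 90.0 - the top aggregate of this tiny ten-person cohort
```

In reality the cohort is tens of thousands of students, so the top aggregates land at 99.95 rather than 90.0 - the toy cohort here is just too small.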

So, how will each subject actually scale? It's impossible to tell! As is hopefully now clear, inter-subject scaling depends solely on student performance, and will change from year to year. We have no past data to go off, so we won't know how subjects will scale until the end of 2020 at the latest.
« Last Edit: February 21, 2020, 02:51:40 pm by Joseph41 »