Hi!
Your job is to build a class scheduler, which organizes kids into different classes. Each class consists of about 6 kids (although it can be more or less), and each class runs at a particular time (e.g. Tuesday at 17:00).
One of the key things that makes Mindjoy work is that we bring kids together into a small group of well-matched peers. Your task here is to group kids into classes which are well matched.
There are two goals:
- The schedule must be correct
- The schedule should try to be as optimal as possible. We'll present the "optimal" conditions later. Bear in mind that the optimum solution is probably an NP-hard problem, so don't worry about trying to find the single best solution. So long as your program attempts to find a reasonably good solution, in a reasonable amount of time, that's fine.
{
"students": [
{
"age": 9.7,
"skill": 3
},
{
"age": 10.9,
"skill": 2
}
],
"timeslots": [
{
"day": "Monday",
"time": "17:00",
"studentsAvailable": [0, 1]
},
{
"day": "Monday",
"time": "18:15",
"studentsAvailable": [1]
}
]
}Class 0 (Monday 17:00)
Index Age Skill
Student-0 9.7 3
Student-1 10.9 2
The example input is a trivial setup, but it will help to illustrate the problem.
There are only two kids, and two timeslots.
If we look at the 17:00 timeslot, in the studentsAvailable field we see [0,1]. The
0 and 1 are indices into the students list. This list of [0,1] tells us that both
students are available on Mondays at 17:00.
If we look at the other timeslot (at 18:15), we see that only student 1 is available
then.
There are two potential solutions to this input. The first possible solution is to
create two classes. One at 17:00, and another at 18:15, and place kid 0 into
the first class, and kid 1 into the second class. This is a silly solution, because
we want kids to be together.
The better solution is to create a single class at 17:00, and place both kids in
that class. This is what the Example output shows.
Note that when your scheduler runs, it will definitely be creating more than one class. On average, you should expect to see around N/6 classes, where N is the number of kids.
A correct solution is one in which no kid is placed in a timeslot for which they are unavailable. This is the only criteria for correctness.
An optimal solution is one which minimizes what we call a loss function. The term loss function comes from the Machine Learning world, but if you've never heard of it before, don't be frightened. It's just a number which we are trying to minimize. You may also have heard of the term objective function, and it's the same thing.
The loss function for our schedule is defined as:
total_loss = average of individual loss values of each class
In other words, we are trying to minimize the loss function for each class that we create. This doesn't mean that a solution which produces 5 classes is better than a solution which produces 6 classes. We are trying to minimize the average.
The loss function for each class is (in broad terms, ignoring some details):
class_loss = [variance of kid ages] + [variance of kid coding ability] + [class size]
If we add the details in, we get
class_loss = 0.6 * [variance of kid ages] + 0.4 * [variance of kid coding ability] + 0.5 * abs(6 - class_size)
Let's explain the above class loss equation:
- 0.6 * Variance of kid ages: The variance of the kid's ages is simply the
standard deviationof the set of kid's ages, measured in years. We use fractional years (eg 9.5 means 9 years and 6 months old). The 0.6 is simply a weighting constant. - 0.4 * Variance of kid coding ability: The variance of the coding ability is simply
the
standard deviationof the set of kid's coding ability scores. The 0.4 is simply a weighting constant. - 0.5 * abs(6 - classSize): What we're doing here is trying to make sure that each class is as close as possible to 6 kids. The 0.5 is simply a weighting constant.
- There can be multiple classes at one timeslot. For example, if there are 10 kids who are all available
at
Monday 17:00, then it might be best to create two classes for that timeslot, and place 5 kids into each of them. - You may use any language you want.
- You may not use an optimization library. Just use the standard library for your language of choice.
- Your program may not take more than 5 seconds.
- Your program only needs to work on the provided
input.jsonfile. - If there is anything confusing about the task, do not hesitate to reach out and ask for clarity. Any questions you might have about ambiguities are probably a good sign!
- A monte carlo solution is fine. You don't need to be fancy to find reasonably good solutions.
- In other words, it's fine if your program tries a lot of random solutions, or at least if it starts out by doing that.