Skip to content

Conversation

@jcdyer
Copy link
Contributor

@jcdyer jcdyer commented Aug 15, 2017

I have a suspicion that the problem was with the way the queries depend on the previous queries getting committed. I know edx.org uses a funky transaction mode for their MySQL instance which might cause problems with this behavior, and might also cause differences with a development box that uses a newer version of mysql (that might use a different transaction mode). This implementation makes the individual queries independent of one another, and also only performs one update query per batch, rather than one per object.

I still need to do manual tests, but wanted to get feedback.

Compare the implementation to the source I borrowed from here: https://github.com/open-craft/problem-builder/pull/143/files#diff-cd075b20c253e110da4aef02f64019d0R38

@bradenmacdonald @feanil

Copy link

@feanil feanil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good. We should make batch_start configurable since we may need to stop and start this job as we tweak batch sizes based on end-user impact.

batch_size = options['batch_size']
sleep_time = options['sleep']
max_pk = Answer.objects.order_by('-pk')[0].pk
batch_start = 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have to stop this command and re-start it, we'll want to start at a different ID. We should make this configurable. Optionally it could look at the lowest PK where the new field is empty? That may be more clever than necessary.


from django.db import migrations

def copy_course_id_to_course_key(apps, schema_editor):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just replace this forward migration with a noop in this 2.6.5.patch1 build, so devops doesn't have to bother manually faking it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we try to merge this branch into master, it will cause some pain to have a different version here. If we don't try to merge this branch into master, this version of the management command will disappear and will never have lived on the master branch, which may make it hard to dig up later on, if it's needed for historical reasons.

My suspicion is that faking the migration will be the lowest amount of pain, but I'll defer to @feanil's judgment on which approach they'd prefer.

Copy link
Contributor Author

@jcdyer jcdyer Aug 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, if we're just going to cut a release directly off this branch it shouldn't matter. We can still resolve conflicts and merge the branch afterward. I like your plan @bradenmacdonald. I've set the migration to noop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, exactly. Let's release with this exact branch, and when we merge to master later (which we should, to keep the code, as you point out) we can resolve the migration conflict in favor of the existing migration that is not a noop.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, that will make things a lot simpler, thanks @jcdyer @bradenmacdonald .

@jcdyer jcdyer force-pushed the cliff/migrate-course-key branch from f7d177f to 82db864 Compare August 17, 2017 12:03
@jcdyer
Copy link
Contributor Author

jcdyer commented Aug 17, 2017

@bradenmacdonald @feanil The current version is working for a manual test, but there still seems to be an error in the circle-ci test suite (unrelated to my PR). Sorry I squashed commits without thinking. I fixed an off-by-one error and did some other clean-up to get the script working (I had the queryset filtered using __ge instead of __gte, the circle.yml file had some fiddly bits that needed fiddling, etc.).

With 1,000,000 Answer entries, batches of 5000 (the default) and no sleep between batches, All 200 batches run in about 90 seconds. I reduced the default sleep time from 5 seconds between batches to 1 second between batches (Which means it will still spend more time sleeping than working, but not waste quite as much wall clock time).

Before this gets run, we will still need to

  1. tag a release.
  2. update edx-platform to use that release.
  3. Do a test run on stage.

I have permission to do #1 (just need thumbs on this PR), and create a PR for #2 (after the tag exists), but will need devops to do #3.

I will out of town the rest of this week. I'll be able to do minor tasks (like #s 1 and 2 above), but won't be available for larger work. We can either pick this up next week, or I can hand it off to @bradenmacdonald to shepherd through.

Copy link
Member

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, but I think I spotted a couple bugs.

I'm not worried about the CircleCI issues - they're definitely unrelated (probably flaky tests or dependency issues that have already been fixed on master) and I think we can be quite sure this PR isn't introducing any regressions.

One other general comment: Could we please rename copy_deprecated_course_ids to something that includes problem_builder in the name? Because the management command namespace is shared among all django apps in the edxapp virtualenv, and that name is very generic and could conflict with something else in the future.

offset = options['offset']
max_pk = Answer.objects.order_by('-pk')[0].pk
batch_start = offset
batch_stop = batch_size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be batch_stop = batch_start + batch_size ?

Otherwise if I pass --batch-size 200 --offset 600 then we'd have batch_stop < batch_start

I tested this and found that --offset didn't seem to be working.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thanks

queryset = Answer.objects.filter(
pk__gte=batch_start,
pk__lt=batch_stop,
course_key='',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at https://github.com/open-craft/problem-builder/blob/master/problem_builder/migrations/0003_auto_20161124_0755.py it seems that when we created the course_key column, we made it nullable with a null default, so I suspect that the live server actually has null values in this column, not empty strings. (Note that in migration 0005 we later fixed that, but that migration isn't on edX.org yet).

Using this should work:

            queryset = Answer.objects.filter(
                pk__gte=batch_start,
                pk__lt=batch_stop,
            ).filter(Q(course_key__isnull=True) | Q(course_key=''))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Updated.

Copy link
Member

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for those fixes! Conditional 👍 from me, if you also rename the copy_deprecated_course_ids command to something problem-builder specific as mentioned.

  • I tested this: Tested some data migrations on my devstack - small numbers of rows though
  • I read through the code
  • I checked for accessibility issues n/a
  • Includes documentation: has docstrings

@jcdyer jcdyer force-pushed the cliff/migrate-course-key branch 2 times, most recently from 15ccaa4 to 3463336 Compare August 21, 2017 11:15
@jcdyer
Copy link
Contributor Author

jcdyer commented Aug 21, 2017

Renamed command to problem_builder_migrate_keys. I'm going to tag a release.

@jcdyer
Copy link
Contributor Author

jcdyer commented Aug 21, 2017

Looks like someone created v2.6.5patch1 a couple days ago. Should I rebase on top of that, or release independently?

@bradenmacdonald
Copy link
Member

bradenmacdonald commented Aug 21, 2017

@jcdyer Hmm, not sure how that happened. I have deleted that tag as it was obviously incorrect. Please release independently. We'll have to use a slightly different name for the new tag, though.

@jcdyer
Copy link
Contributor Author

jcdyer commented Aug 21, 2017

Release tagged as v2.6.5patch2

@feanil
Copy link

feanil commented Aug 21, 2017

@jcdyer when you have the edx-platform PR, tag me and @fredsmith on it. He's on call this week but I'll be shepherding this through.

setup.py Outdated
version='2.7.2',
description='XBlock - Problem Builder',
packages=['problem_builder', 'problem_builder.v1'],
packages=['problem_builder', 'problem_builder.v1', 'problem_builder.management', 'problem_builder.management.commands'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip: It's even better to just use find_packages()

* Fix version number
* Make circle script more flexible for version bumps.
* Remove vestigial tests.
* Use find_packages in setup.py.
@jcdyer jcdyer force-pushed the cliff/migrate-course-key branch from af25c49 to 3b88683 Compare August 24, 2017 19:39
@jcdyer
Copy link
Contributor Author

jcdyer commented Aug 24, 2017

@bradenmacdonald This is ready for a final review before merging to master. I had to remove the tests because the master branch expects course_key to be non-null, which makes them 1. fail, and 2. useless if they're revised not to fail. The reason I'm keeping the management command around is to give it a presence in the history of the master branch.

@jcdyer jcdyer force-pushed the cliff/migrate-course-key branch from 3b88683 to ddcf831 Compare August 24, 2017 19:41
@bradenmacdonald
Copy link
Member

👍 Thanks @jcdyer !

@bradenmacdonald bradenmacdonald merged commit 7665063 into master Aug 27, 2017
@bradenmacdonald bradenmacdonald deleted the cliff/migrate-course-key branch August 27, 2017 01:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants