Archiving + CRON
02-03-2015, 08:02 PM
Post: #1
Archiving + CRON
Dumb question, but on the server AWS AMI, is there a CRON job set up for regular archiving? My crawls of the user crontabs and cron.php + dependencies suggest not... just want to check (having no confidence in my Linux abilities!).

Thanks
02-04-2015, 12:23 AM
Post: #2
RE: Archiving + CRON
No. There is a crontab that kicks off the auto-scaling cron scheduling but nothing for the archiving. I've been planning on tying it into the existing cron processing so it will just start working automatically.

Are you just using the normal cron.php with the archive settings defined in settings.ini? If so I can go ahead and push an update today that kicks it off automatically.
02-04-2015, 12:27 AM
Post: #3
RE: Archiving + CRON
Right now, I've been kicking it off manually via cli/archive.php.
Before an update to include an archive cron job in the AMI, any chance you could look at these two issues:
https://github.com/WPO-Foundation/webpag...issues/395
https://github.com/WPO-Foundation/webpag...issues/394

Without those being fixed, an automated archive job may result in results being lost.
02-04-2015, 12:57 AM
Post: #4
RE: Archiving + CRON
Good timing - I just fixed both of those. You mind giving it a try and seeing if it looks ok now before I tie it in to cron?
02-04-2015, 03:53 PM
Post: #5
RE: Archiving + CRON
Something funky going on - tests are being archived to S3 automatically - as soon as the test completes, the ZIP file appears in S3, but the test files remain on the WPT server. No entries appear in cli/archive.log.
When I run cli/archive.php manually, these tests are ignored (i.e. they don't appear in the log at all).

Not yet sure if this is down to my local set-up or the archiving changes.
archive_days is set to 30.
02-04-2015, 07:42 PM
Post: #6
RE: Archiving + CRON
I have the same behaviour as Kevin.

There's the cron job that calls getwork, and I'm pretty sure that somewhere in this the archive process is being triggered - though I haven't worked out quite which one of the possibilities it is.

Andy

Using WebPageTest - http://usingwpt.com/
02-04-2015, 11:59 PM
Post: #7
RE: Archiving + CRON
I wonder if it's in the validating of the archived test. It won't delete the test if it can't validate that the archived version is valid and something funky may be going on there.
02-05-2015, 02:33 PM (This post was last modified: 02-05-2015 02:34 PM by kevinrdixon.)
Post: #8
RE: Archiving + CRON
(02-04-2015 07:42 PM)andydavies Wrote:  There's the cron job that calls getwork, and I'm pretty sure that somewhere in this the archive process is being triggered - though I haven't worked out quite which one of the possibilities it is.

work/workdone.php calls work/postprocess.php, which in turn calls ArchiveTest(id, false) in archive.inc. That second parameter is crucial - it means the test info isn't saved into testinfo.json. This, I presume, is why these tests are ignored by cli/archive.php - it must use the testinfo.json objects for identifying tests.

So, there are two outcomes of this:
1. If you have archiving set up, every test run is archived straight away, but the test data remains on the WPT server.
2. These tests will never be 'properly' archived - that is, with the files removed from the WPT server.

The consequence is that eventually the WPT server will run out of disk space, even with archiving switched on.

I'm wondering whether postprocess.php should call ArchiveTest(id, true), or indeed not call ArchiveTest() at all and leave it up to an archive cron job?
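
To make the difference between the two options concrete, here is a minimal Python model of the behaviour described above. It is not WebPageTest code: the Test class, the "archived" key and cleanup_pass() are hypothetical stand-ins, and the model assumes (as presumed in this post) that the cleanup pass only considers tests whose saved info records the archive.

```python
from dataclasses import dataclass, field


@dataclass
class Test:
    test_id: str
    on_disk: bool = True       # result files still on the server
    in_remote: bool = False    # ZIP present in the archive store
    info: dict = field(default_factory=dict)  # stands in for testinfo.json


def archive_test(test, save_info):
    """Upload the test; optionally record that fact in its saved info."""
    test.in_remote = True
    if save_info:
        test.info["archived"] = True  # hypothetical key, for illustration only


def cleanup_pass(tests):
    """Free local files only for tests whose saved info says they were archived."""
    removed = []
    for test in tests:
        if test.info.get("archived") and test.in_remote:
            test.on_disk = False
            removed.append(test.test_id)
    return removed


tests = [Test("a"), Test("b")]
archive_test(tests[0], save_info=False)  # the post-processing path described above
archive_test(tests[1], save_info=True)   # the alternative being asked about
print(cleanup_pass(tests))               # only "b" is eligible for removal
```

In this model test "a" is uploaded but never becomes eligible for removal, which is the disk-space problem described above.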
02-06-2015, 04:17 AM
Post: #9
RE: Archiving + CRON
My plan is to have the regular internal cron processing (triggered every 5, 10 and 60 minutes by agents polling getwork.php and by a cron job on the AMI) run cli/archive.php, or the equivalent code factored out into a common location, once we're confident that it actually works correctly.
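
As a rough illustration of that idea (periodic work piggybacking on incoming polls rather than on its own timer), here is a small Python sketch. It is an assumption-laden model, not how WebPageTest implements it: PollDrivenCron, every(), tick() and run_archive() are all made-up names.

```python
import time


class PollDrivenCron:
    """Run periodic jobs opportunistically whenever something calls tick()."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._jobs = []  # each entry: [interval_seconds, callable, last_run]

    def every(self, minutes, job):
        """Register a job to run at most once per `minutes`."""
        self._jobs.append([minutes * 60, job, None])

    def tick(self):
        """Call on every poll; runs each job whose interval has elapsed."""
        now = self._clock()
        ran = []
        for entry in self._jobs:
            interval, job, last_run = entry
            if last_run is None or now - last_run >= interval:
                entry[2] = now  # record the run before starting the job
                job()
                ran.append(job.__name__)
        return ran


def run_archive():
    print("archive pass would start here")


fake_now = [0.0]  # injected clock so the example does not need to sleep
cron = PollDrivenCron(clock=lambda: fake_now[0])
cron.every(60, run_archive)
cron.tick()              # first poll: the job runs
fake_now[0] = 30 * 60
cron.tick()              # 30 minutes later: nothing is due
fake_now[0] = 60 * 60
cron.tick()              # 60 minutes later: the job runs again
```

A real server handling concurrent polls would also need some form of locking so two requests do not start the same job at once; the sketch leaves that out.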
02-06-2015, 04:20 AM
Post: #10
RE: Archiving + CRON
btw, archive.php not deleting the tests is the thing I'm concerned about and need to look into. It will only delete the test if it verifies that the archive is valid: https://github.com/WPO-Foundation/webpag...e.php#L231

I'm concerned that the validation check isn't working correctly for S3 private buckets. The code looks reasonable but I want to do some testing and maybe move the archive code to the new S3 libraries before I'm sure: https://github.com/WPO-Foundation/webpag...e.inc#L254
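
For anyone following along, the "only delete once the archive is verified" rule can be sketched in Python as below. This is a simplified stand-in, not the linked PHP: it validates a local ZIP file rather than an object in S3, and archive_dir(), archive_is_valid() and delete_if_archived() are hypothetical names.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_dir(test_dir, zip_path):
    """Write every file under test_dir into a ZIP at zip_path."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(test_dir.rglob("*")):
            if f.is_file():
                zf.write(f, f.relative_to(test_dir).as_posix())


def archive_is_valid(test_dir, zip_path):
    """True only if the ZIP opens, passes its CRC check and holds every local file."""
    if not zip_path.is_file():
        return False
    expected = {f.relative_to(test_dir).as_posix()
                for f in test_dir.rglob("*") if f.is_file()}
    try:
        with zipfile.ZipFile(zip_path) as zf:
            if zf.testzip() is not None:  # name of the first corrupt member, if any
                return False
            return expected <= set(zf.namelist())
    except zipfile.BadZipFile:
        return False


def delete_if_archived(test_dir, zip_path):
    """Remove the local test directory only when its archive validates."""
    if archive_is_valid(test_dir, zip_path):
        shutil.rmtree(test_dir)
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    test_dir = Path(tmp) / "test"
    test_dir.mkdir()
    (test_dir / "testinfo.json").write_text("{}")
    good = Path(tmp) / "good.zip"
    archive_dir(test_dir, good)
    bad = Path(tmp) / "bad.zip"
    bad.write_bytes(b"not a zip")
    print(delete_if_archived(test_dir, bad))   # validation fails, test is kept
    print(delete_if_archived(test_dir, good))  # validation passes, test is removed
```

The point of the design is that a failed or inconclusive check always leaves the local copy in place, so a faulty validator costs disk space rather than test results.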