WebPagetest Forums - Unrealistic performance at beginning of agent start (EC2)

Hey Patrick -

I'm using the EC2 autoscale feature to keep costs down (since I only run tests a few hours a day), and I've noticed an unwelcome trend - it seems that the first few tests that run on any agent after it boots are extremely inaccurate. Load times can be as much as 4x slower, and if you look at the waterfall it appears the browser simply halts loading the page for several seconds.

Here's an example of the same URL, run first at the beginning of the queue, and again at the end of the queue (only a few minutes apart, all settings and content the same). I have confirmed this is consistent behavior, happening every day for weeks since I set it up, and on different URLs too. If you want more examples I have plenty.

I'm guessing that the agent is doing something in the background after it starts up the browser. Could it be that the EC2 images are so old that they're churning trying to update themselves to the latest code and browser?

Any suggestions other than keeping the agents always running, or throwing away the first X minutes of testing?

Thanks so much,
What size instances?

The OS should not auto-update and the browsers are installed when the instance starts up (and testing doesn't start until they have finished installing).

That said, it's clear that SOMETHING is eating the CPU time. Any chance you can enable tcpdump capture? That might help identify whether Chrome is downloading something it shouldn't be.
Thanks Patrick -

I'm using c4.large instances (I'm the nerd that ran a bunch of tests on instance sizes, so I'm sure we have enough power here) ;)

'Enabling tcpdump capture' as in logging into the machine and monitoring Wireshark while it runs? Or is there an easier setting I'm not aware of...
There's a per-test setting... Advanced tab of advanced settings, in the middle: "Capture network packet trace (tcpdump)"
The tcpdump will show up as a download to the left of the waterfall.
Awesome, glad I checked. Updated my API calls to enable that and will post back in a few after we have some data. Thanks, and have a great weekend!
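
For anyone else enabling the same option through the API instead of the Advanced tab, here is a minimal sketch of a test-submission URL with packet capture turned on. It assumes the standard `runtest.php` endpoint and its `tcpdump=1` parameter; the server address, page URL, and helper name are placeholders, not values from this thread.

```python
from urllib.parse import urlencode


def build_runtest_url(server, page_url, api_key=None, runs=1):
    """Build a runtest.php URL that requests a tcpdump capture.

    `server` is a hypothetical private WebPagetest instance;
    `api_key` is only needed if the server requires one.
    """
    params = {
        "url": page_url,   # page to test
        "f": "json",       # ask for a JSON response
        "runs": runs,      # number of test runs
        "tcpdump": 1,      # capture a network packet trace
    }
    if api_key:
        params["k"] = api_key
    return f"{server.rstrip('/')}/runtest.php?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder hosts for illustration only.
    print(build_runtest_url("http://wpt.example.com", "https://www.example.org/"))
```

The capture then shows up as a download next to the waterfall, the same as when the checkbox is ticked in the UI.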
Hey Patrick -

I have a few test runs that have the network packet trace data. Here's one for reference.

I opened it up in Wireshark and saw a huge gap in time between 1.9 and 9 seconds, where it seems to be doing nothing. Here's a screenshot; am I reading this correctly?

[Image: SISXjOI.png]

If this verifies that nothing is being downloaded in the background, what else could the agent be doing?
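
A gap like the one described above can also be located without eyeballing Wireshark. The sketch below is only an illustration: it assumes the downloaded capture is a classic libpcap file with microsecond timestamps (not pcapng), and the function names are made up for this example. It reads the per-packet timestamps and reports the longest stretch with no packets at all.

```python
import struct


def read_pcap_timestamps(data):
    """Return packet timestamps in seconds from classic libpcap bytes."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the 24-byte global header
    stamps = []
    while offset + 16 <= len(data):
        # Record header: ts_sec, ts_usec, captured length, original length
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        stamps.append(ts_sec + ts_usec / 1_000_000)
        offset += 16 + incl_len  # jump over the packet bytes
    return stamps


def largest_gap(stamps):
    """Return (start, end, duration) of the longest silence.

    start and end are seconds relative to the first packet.
    Returns None if there are fewer than two packets.
    """
    if len(stamps) < 2:
        return None
    ordered = sorted(stamps)
    first = ordered[0]
    start, end = max(zip(ordered, ordered[1:]), key=lambda p: p[1] - p[0])
    return (start - first, end - first, end - start)
```

Usage would be along the lines of `largest_gap(read_pcap_timestamps(open("trace.cap", "rb").read()))`, where `trace.cap` stands for whatever file the test result offers for download. Note that a silent stretch in the capture only shows that nothing crossed the network during that window; it does not say what the machine was doing instead.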

I wonder if there is some screwy IPv6 stuff going on:

If you're feeling adventurous, you can connect to a launched instance (administrator pw is 2dialit), go into the interface settings and make sure IPv6 is disabled, then create an image from the instance and see if that does anything (and/or set the reg keys from the above article).

I finally have a dummynet replacement that works on Server 2012 R2 or later and should work in EC2 so right after Thanksgiving I should be able to build new AMIs with Server 2016 and IPv6 disabled and see if that helps.
It's really bizarre that I don't see the issue and neither does SpeedCurve, and we both launch and destroy hundreds of instances weekly :(

Is it always at the same point in the waterfall and always for the same page? I wonder if there is something that is only IPv6 reachable that is causing it to try to bring up a tunnel.
Tried to log in to an agent while it was running and wasn't able to. I was using Remote Desktop Connection to the public IP address shown in my running instance list (of my EC2 account). I verified that the agent was actually up through the getTesters.php screen, yet still couldn't connect. It was launched as part of a 'default' security group which had all ports open (side note: I realize this is a huge security risk; how do I specify the security group of the autoscale instances?). What am I missing? I know I was able to connect years ago (but those instances were manually started).

It is not always at the same point within a page, and it is not always the same page, but it does seem to happen only on the first 2-3 tests that are run after an instance starts up. Here's a test of, which was 3rd in line in the queue, showing a page load of 5s, when we know that isn't the case. The waterfalls of all affected tests seem to show periods of simple inactivity, not waiting in a particular phase.

I am equally puzzled why no one else has noticed this. I'm using an EC2 image for the wpt server, so there shouldn't be anything out of the ordinary. Are you and SpeedCurve using autoscale to launch your new weekly instances?