Looking at the Light Curve II: Hints of an eclipse?


This is a follow-up to my last post about looking at the light curve to see if the eclipse has started. If you haven't read it yet, please check it out first.

At right is the latest (as of this posting) data from our Quick Look file. Remember that the most recent data is at the top. Again, at a quick glance the reported magnitudes do seem to be getting fainter. So let's check out a recent light curve:

Now the average line is starting to look like something real. The error bars of the 2-day bins are no longer always overlapping. Specifically, compare the last 2-day bin (at roughly magnitude 3.2) with the third-to-last 2-day bin (at roughly magnitude 3). So the raw data is starting to show a dip that is approaching statistical significance. Having photometric data here would be nice to confirm the visual data, but we don't have any yet. This illustrates a nice feature of visual observations: since we have so many observers and visual estimates are easier to report, we get data sooner and don't have to wait for photometric data reduction. If we didn't have any visual data, right now the light curve would look like this:

We would have no hint of the eclipse at all! So it's nice to have both types of data to support each other. It's been postulated before that the eclipse will begin to appear in infrared light before it appears in optical light. It just so happens that we have an intrepid observer, Thomas Rutherford, who is observing in the J and H bands. These are officially considered "near infrared" and are among the bands that large, high-altitude observatories like Keck use. Here is a 45-day light curve of epsilon Aurigae in J and H:
It's a little hard to see the faint error bars, but look closely and they are there. Sure enough, there seems to be a slow but steady drop in both the J and H bands. It seems to have started sooner in the J band, but that could be noise in the data. Additional tests (such as repeated-measures ANOVAs; more on that in a few months when we begin writing more serious data analysis tutorials) will be needed to see if that is significant. But hold on! There is something else to consider. Look at the same light curve, but expanded out to 300 days:
 
Now we have light curve data by Brian McCandless as well as Thomas. This longer-term J and H band light curve of epsilon Aurigae shows a lot of variation in the system. So even though the short-term variation may be statistically valid, it may not reflect the true start of the eclipse. It could be part of the general variation inherent to the system: background activity, if you will.

If we were going by the light curves alone we could not say the eclipse has started. That is, if we submitted this light curve to a journal by itself it would not pass the reviews of the referees. But we do have the added consideration that models based on past activity of this star predict the beginning of an eclipse right now. When you add that to the equation, the idea that this is the beginning of the eclipse becomes more plausible.

This is an important lesson about data analysis. The human eye and common sense can be your best tools. Remember to always think back to your research question and not get blinded by numbers alone. Our research question is: "has the eclipse started?" So we want to consider more than just the numbers, and also see how they fit into the big picture before making a decision.

So what do you think? Is this evidence of the start of the eclipse?
 

 
Aaron
*Note: Notice the blue crosshairs on the first light curve at the top of the post. Those indicate observations made by me. I appear to be consistently on the bright end of the submitted observations. This may be a bias in my eyes, my observing strategy, my local observing conditions, etc. I am tempted to consciously correct for it in future observations but that would be bad! One should never try to correct for bias in their personal visual observations unless you are absolutely sure what the cause is and how to fix it. In other words, if I knew that I was mixing up my comparison stars that would be one thing. But I don't know what is causing this bias, so I'm going to continue to report what I see. As long as I'm consistent and honest, that will be okay. As discussed before, many statistical routines count on spreads in the data.
 


Aaron wrote: "I appear to be consistently on the bright end of the submitted observations. This may be a bias in my eyes, my observing strategy, my local observing conditions, etc. I am tempted to consciously correct for it in future observations but that would be bad! One should never try to correct for bias in their personal visual observations unless you are absolutely sure what the cause is and how to fix it. In other words, if I knew that I was mixing up my comparison stars that would be one thing. But I don't know what is causing this bias, so I'm going to continue to report what I see. As long as I'm consistent and honest, that will be okay."

Exactly - report what you see, not what you think you should be seeing!

Check out my post on rules for variable star observing for more information. It's in the visual forum here: http://www.citizensky.org/forum/avoiding-bias-simonsens-rules-variable-star-observing

Mike Simonsen


Hi Aaron,

You need to be careful when creating means that include both visual and V-band data. As Wolfgang has mentioned, there is an apparent offset between visual estimates and Johnson V-band measures for epsilon Aurigae. This offset is ok; it is most likely just the bandpass difference between the human eye and the standard V system for this set of stars. However, by combining both V and visual data in your means, you end up biasing the means somewhat bright when V data is available, and faint when V data is missing. Since the last few days of data on your graph only include visual data, the means will make the star appear to be fainter. What you need to do is remove the V data when calculating the means, display the visual data with the means and error bars, and then overplot the V-band data separately.

In other words, mixing apples and oranges is never wise for a scientific experiment!

Arne
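To make Arne's point concrete, here is a minimal Python sketch of 2-day binned means computed once with visual data only and once with V mixed in. Everything in it is an assumption made for illustration: the `binned_means` helper, the band labels "Vis." and "V", and the handful of made-up magnitudes. It is not the code behind the Citizen Sky light curve tool, and the numbers are not real observations.

```python
import math
import statistics
from collections import defaultdict

def binned_means(observations, bin_days=2.0, bands=("Vis.",)):
    """Return {bin_start_jd: (mean, standard_error, n)} for the chosen bands.

    observations is a list of (julian_date, magnitude, band) tuples.
    """
    bins = defaultdict(list)
    for jd, mag, band in observations:
        if band in bands:
            # Group by the start of the bin that contains this Julian Date.
            bins[math.floor(jd / bin_days) * bin_days].append(mag)
    result = {}
    for start, mags in sorted(bins.items()):
        n = len(mags)
        # Standard error of the mean; undefined for a single point.
        se = statistics.stdev(mags) / math.sqrt(n) if n > 1 else float("nan")
        result[start] = (statistics.fmean(mags), se, n)
    return result

# Made-up data: the first bin has both visual and V points, the second only visual.
obs = [
    (2455060.1, 3.1, "Vis."),
    (2455060.6, 3.2, "Vis."),
    (2455061.2, 3.0, "V"),
    (2455061.4, 3.0, "V"),
    (2455062.3, 3.2, "Vis."),
    (2455063.0, 3.2, "Vis."),
]

visual_only = binned_means(obs, bands=("Vis.",))
mixed = binned_means(obs, bands=("Vis.", "V"))

for start in visual_only:
    print(start, "visual only:", round(visual_only[start][0], 3),
          "mixed:", round(mixed[start][0], 3))
```

In this toy example the mixed mean of the first bin comes out brighter (a smaller magnitude) than the visual-only mean, while the second bin is unchanged because it has no V points. That alone produces an apparent fade between the two bins that is larger than the one in the visual data.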


I think it is less of a problem in the last day's bin because the difference in V and visual isn't great (in this light curve) and we usually only have 1 or 2 V band point(s) per bin vs. tens of visual points. So the visual data has a much stronger influence on the mean than the V data. However, I do think it's a big problem in the third-to-last day's bin. There is a stack of V band data there so its influence on the mean is greater, which is clearly causing that bin to appear brighter. One can remove the V band data from the mean by simply deselecting it in the light curve interface. I did so and here is the visual data alone, with the same 2 day mean curve:

One could argue the difference between the brightness at the end of the curve and the brightness at the start of the curve is significant. It's borderline, and as I said, would never make it into a journal. But when taken in context with the predictions, etc., I think it is more compelling than it was in last week's report.


Anyone here familiar with Statistical Process Control? We used to use it on our paper machines. (I say used to; they still do AFAIK. It is just that I am retired from that business now.) One of the tests to make sure that process adjustments were only made when there were statistically significant changes in a parameter was based on the number of consecutive measurements during a run which were consistently above or below the mean.

I forget the exact calculations, but if we take the mean up to day ..5058 as representing the mean value of the system, then I think I would be pretty convinced that something had happened to the system after getting a run of 10 consecutive results below the mean. (The probability of that happening by chance is 1: 2 to the power 9 or 1 in 512.) Anyone fancy doing a t-test on the two populations before and after day 5058 to prove me wrong?

Of course the reason for the shift is another matter. Are the visual observers cracking under the pressure of expectations? Perhaps not....

Robin
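For anyone who wants to check the arithmetic behind the run test, here is a small Python sketch. It assumes each measurement is independent and equally likely to land above or below the long-term mean, and it ignores ties. The function name `same_side_run_probability` is made up for this example and is not taken from any SPC package.

```python
from fractions import Fraction

def same_side_run_probability(run_length, either_side=True):
    """Chance that run_length consecutive points all fall on the same side of the mean.

    Assumes independent points, each with a 50/50 chance of being above or below.
    """
    p_one_side = Fraction(1, 2) ** run_length
    # A run on either side is twice as likely as a run on one named side.
    return 2 * p_one_side if either_side else p_one_side

print(same_side_run_probability(10))         # 1/512
print(same_side_run_probability(10, False))  # 1/1024
```

So 1 in 512 is the chance of ten in a row on the same side, whichever side that is; ten in a row specifically below the mean would be 1 in 1024 under the same assumptions. Both figures apply to one particular set of ten points. The chance of finding such a run somewhere in a long series is higher.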


I ran the t-test suggested by Robin on the visual observations in the light curve. Before I get to the results, I want to take a step back for those who are brand new to all this.

A t-test is a comparison of the signal to noise in two groups of data. It compares the difference between the means of the two groups with the noise in the two groups. Noise is defined as standard error (closely related to the error bars in the plots we've been looking at). So what we're trying to find out is whether the difference in the data before 2455058 and after 2455058 is greater than the relative variability in the two data sets. This will give us an idea of the likelihood that the difference is caused by random chance or by some other phenomenon. It's basically a statistical way to do what we've been doing by eye simply by looking at the light curve and error bars.

My t-test results are: t(261)=-3.702, p<.001. This means that the groups are statistically different and the likelihood that the difference is due to chance is less than 0.1 percent. Very rarely do you get a significance level so low (p<.001). That is a direct result of the number of data points we have (263). This is a great illustration of how you can get precise answers with lots of visual observations. So it's important to observe this star even if it looks like the light curve is well covered by other people. With only 50 observations, I'm not sure this test would have ended up being significant.

The average magnitude of the pre-2455058 group of observations is 3.08 and the average magnitude of the post-2455058 group is 3.14. Both have a standard error of .01. So, yes, we can say that statistically the light curve has dropped. But Robin has a good point. Is this due to the eclipse starting or one of any number of other factors? We don't know yet.

Robin, why did you choose 2455058 as the break point? Was it just by eyeballing the light curve or because it was the most recent 10 bins of data?
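For readers who would like to see the mechanics, here is a minimal Python sketch of a two-sample t-test using only the standard library. It assumes the pooled-variance (Student's) form, which is what degrees of freedom equal to the total number of points minus two suggests, though the post does not say which software or variant was used. The magnitudes are made up for the example and are not the Citizen Sky observations. The p-value uses a normal approximation that is only reasonable for large samples like the 263 points above, not for the tiny lists shown here.

```python
import math
import statistics

def pooled_t_test(before, after):
    """Student's two-sample t-test with pooled variance.

    Returns (t, degrees_of_freedom, approximate two-sided p-value).
    """
    n1, n2 = len(before), len(after)
    mean1, mean2 = statistics.fmean(before), statistics.fmean(after)
    var1, var2 = statistics.variance(before), statistics.variance(after)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means (the "noise").
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / std_err
    # Normal approximation to the t distribution; only reasonable for large df.
    p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return t, df, p

# Made-up magnitudes, NOT real observations.
before = [3.0, 3.1, 3.1, 3.0, 3.1, 3.2, 3.0, 3.1]
after = [3.1, 3.2, 3.2, 3.1, 3.2, 3.1, 3.3, 3.2]

t, df, p = pooled_t_test(before, after)
print(f"t({df}) = {t:.3f}, approximate p = {p:.4f}")
```

The sign of t follows the order of subtraction: a fainter (numerically larger) mean magnitude in the second group gives a negative t, as in the result quoted above.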


Hi Aaron,

Well, I must confess my choice of date was not entirely random. I have just returned from a BAA meeting where I presented a poster paper covering my pre-1st-contact spectroscopic data, so I can now reveal all :-)

http://www.threehillsobservatory.co.uk/astro/spectra_40a.htm

In fact it corresponds to a step change in KI 7699 absorption and also in Richard Miles' I band data. (Actually it is the 3rd such step in the KI data; see the graph at bottom right in the poster.)

Robin
