First, note that the smallest L2-norm vector that can fit the training data for the core model is \(\theta^{\text{-s}}=[2,0,0]\)

Posted by admin on December 1, 2022

First, note that the smallest L2-norm vector that can fit the training data for the core model is \(\theta^{\text{-s}}=[2,0,0]\).

On the other hand, in the presence of the spurious feature, the full model can fit the training data perfectly with a smaller norm by assigning weight \(1\) to the feature \(s\) (\(\|\theta^{\text{-s}}\|_2^2 = 4\) while \(\|\theta^{\text{+s}}\|_2^2 + w^2 = 2 < 4\)).
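The norm comparison can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the training set is a single example with \(z=[1,0,0]\), \(y=2\), and spurious feature value \(s=1\) (none of these are stated in this excerpt; they are chosen to be consistent with the norms quoted above). It relies on the fact that the pseudoinverse returns the minimum L2-norm solution of an underdetermined linear system.

```python
import numpy as np

# ASSUMPTION (not given in the text): one training example z = [1, 0, 0] with y = 2.
Z = np.array([[1.0, 0.0, 0.0]])
y = np.array([2.0])
# ASSUMPTION: the spurious feature takes the value s = 1 on that example.
s = np.array([[1.0]])

# Core model: minimum-norm weights using only z.
theta_core = np.linalg.pinv(Z) @ y

# Full model: minimum-norm weights using z and s together.
full = np.linalg.pinv(np.hstack([Z, s])) @ y
theta_full, w = full[:3], full[3]

core_norm_sq = float(theta_core @ theta_core)
full_norm_sq = float(theta_full @ theta_full + w**2)

print("core:", theta_core, "squared norm:", core_norm_sq)
print("full:", theta_full, "w:", w, "squared norm:", full_norm_sq)
```

Under these assumptions the core model recovers \([2,0,0]\) with squared norm 4, and the full model splits the weight evenly between the first feature and \(s\), giving squared norm 2.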

Generally, in the overparameterized regime, since the number of training examples is less than the number of features, there are some directions of data variation that are not observed in the training data. In this example, we do not observe any information about the second and third features. However, the non-zero weight for the spurious feature leads to a different assumption for the unseen directions. In particular, the full model does not assign weight \(0\) to the unseen directions. Indeed, by substituting \(s\) with \({\beta^\star}^\top z\), we can view the full model as not using \(s\) but implicitly assigning weight \(\beta^\star_2=2\) to the second feature and \(\beta^\star_3=-2\) to the third feature (unseen directions at training).
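The substitution can be made concrete with a short sketch. It assumes the full model's weights are \(\theta^{\text{+s}}=[1,0,0]\) and \(w=1\), and that \(\beta^\star_1=1\); these values are not stated in this excerpt and are only one choice consistent with the norms and weights quoted here. Only \(\beta^\star_2=2\) and \(\beta^\star_3=-2\) come from the text.

```python
import numpy as np

# ASSUMPTION: full-model weights on z and on s (consistent with the text, not quoted from it).
theta_full = np.array([1.0, 0.0, 0.0])
w = 1.0

# beta*_2 = 2 and beta*_3 = -2 are from the text; beta*_1 = 1 is an ASSUMPTION.
beta_star = np.array([1.0, 2.0, -2.0])

# Prediction: theta_full @ z + w * s, with s = beta_star @ z,
# which equals (theta_full + w * beta_star) @ z.
implied_weights = theta_full + w * beta_star
print(implied_weights)

# Sanity check on an arbitrary input: both forms give the same prediction.
z = np.array([0.5, -1.0, 3.0])
pred_with_s = theta_full @ z + w * (beta_star @ z)
pred_without_s = implied_weights @ z
print(pred_with_s, pred_without_s)
```

The implied weight vector has non-zero entries on the second and third features even though the training data carries no information about them.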

In this example, removing \(s\) reduces the error for a test distribution with high deviations from zero on the second feature, whereas removing \(s\) increases the error for a test distribution with high deviations from zero on the third feature.

The drop in accuracy at test time depends on the relationship between the true target parameter (\(\theta^\star\)) and the true spurious feature parameters (\(\beta^\star\)) in the seen and unseen directions

As we saw in the previous example, by using the spurious feature, the full model incorporates \(\beta^\star\) into its estimate. The true target parameter (\(\theta^\star\)) and the true spurious feature parameters (\(\beta^\star\)) agree on some of the unseen directions and do not agree on the others. Thus, depending on which unseen directions are weighted heavily at test time, removing \(s\) can increase or decrease the error.
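To illustrate how the test distribution decides which model wins, here is a hedged sketch. For zero-mean test inputs with covariance \(\Sigma\), a linear model with effective weights \(\hat\theta\) has expected squared error \((\hat\theta-\theta^\star)^\top \Sigma\, (\hat\theta-\theta^\star)\). The value of \(\theta^\star\) below is hypothetical (it is not given in this excerpt) and was picked only so that the outcome matches the qualitative claim in the text; the effective weights \([2,0,0]\) and \([2,2,-2]\) likewise assume \(\beta^\star_1=1\).

```python
import numpy as np

# HYPOTHETICAL true parameter, chosen for illustration only (not from the text).
theta_star = np.array([2.0, -1.0, -3.0])

# Effective weights on z: core model, and full model after substituting s = beta*^T z.
theta_core = np.array([2.0, 0.0, 0.0])
theta_full_implied = np.array([2.0, 2.0, -2.0])  # ASSUMES beta*_1 = 1

def expected_sq_error(theta_hat, cov):
    """Expected squared error for zero-mean test inputs with covariance `cov`."""
    d = theta_hat - theta_star
    return float(d @ cov @ d)

# Two test distributions: variation only on the second or only on the third feature.
cov_second = np.diag([0.0, 1.0, 0.0])
cov_third = np.diag([0.0, 0.0, 1.0])

err_core_second = expected_sq_error(theta_core, cov_second)
err_full_second = expected_sq_error(theta_full_implied, cov_second)
err_core_third = expected_sq_error(theta_core, cov_third)
err_full_third = expected_sq_error(theta_full_implied, cov_third)

print("second feature: core", err_core_second, "full", err_full_second)
print("third feature:  core", err_core_third, "full", err_full_third)
```

With this hypothetical \(\theta^\star\), the core model has lower error when the test data varies along the second feature, and the full model has lower error when it varies along the third. A different \(\theta^\star\) could reverse either comparison, which is the point of the paragraph above.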
