Neural networks exhibit "grokking"—a sudden shift from memorization to generalization after prolonged training plateaus. While theoretical frameworks like Singular Learning Theory (SLT) characterize ...