Båth's law, the observation that the largest aftershock is, on average, 1.2 magnitudes smaller than its main shock, independent of main shock size, suggests some degree of self-similarity in earthquake triggering. This behavior can largely be explained with triggering models in which the increased triggering caused by larger magnitude events is exactly compensated for by their decreased numbers, and these models can account for many features of real seismicity catalogs. The Båth's law magnitude difference of 1.2 places a useful constraint on aftershock productivity in these models. A more general test of triggering self-similarity is to plot foreshock and aftershock rates as a function of magnitude m relative to the magnitude, m(max), of the largest event in the sequence (the main shock). Both computer simulations and theory show that these dN/dm curves should be nearly coincident, regardless of main shock magnitude. The aftershock dN/dm curves have the same Gutenberg-Richter b-value as the underlying distribution, but the foreshock dN/dm curves have the same b-value only for foreshock magnitudes less than about m(max) - 3. For larger foreshock magnitudes, the dN/dm curve flattens and converges with the aftershock dN/dm curve at m = m(max). This effect can explain observations of anomalously low b-values in some foreshock sequences and the decrease in apparent aftershock-to-foreshock ratios for small magnitude main shocks. Observed apparent foreshock and aftershock dN/dm curves for events close in space and time to M 2.5 to 5.5 main shocks in southern California appear roughly self-similar, but differ from triggering simulations in several key respects: (1) the aftershock b-values are significantly lower than that of the complete catalog, (2) the number of aftershocks is too large to be consistent with Båth's law, and (3) the foreshock-to-aftershock ratio is too large to be consistent with Båth's law.
For southern California, these observations indicate either that triggering self-similarity is not obeyed for these small main shocks or that the space/time clustering is not primarily caused by earthquake-to-earthquake triggering.
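
The self-similarity argument can be illustrated with a minimal sketch. It assumes, for illustration only, that aftershock productivity grows as 10^(b m), that b = 1, and that the Båth's law difference of 1.2 marks the magnitude at which the expected aftershock count equals one. The function names and parameter values are hypothetical and are not taken from the simulations described here. Under these assumptions the aftershock dN/dm curve depends only on m - m(max), so curves for different main shock magnitudes coincide.

```python
import math

B_VALUE = 1.0   # assumed Gutenberg-Richter b-value (illustrative)
DELTA_M = 1.2   # Bath's law magnitude difference quoted in the text


def expected_aftershocks(m_main, m_min, b=B_VALUE, delta_m=DELTA_M):
    """Expected number of aftershocks with magnitude >= m_min.

    Assumes productivity scales as 10**(b * m_main) and that the expected
    count equals 1 at magnitude m_main - delta_m.
    """
    return 10.0 ** (b * (m_main - delta_m - m_min))


def aftershock_dn_dm(m_rel, b=B_VALUE, delta_m=DELTA_M):
    """Aftershock rate density dN/dm at relative magnitude m_rel = m - m(max).

    Obtained by differentiating the cumulative count above with respect to
    magnitude; the result has no separate dependence on m(max).
    """
    return b * math.log(10.0) * 10.0 ** (-b * (m_rel + delta_m))


if __name__ == "__main__":
    # Counts above a fixed relative magnitude for two main shock sizes.
    for m_main in (3.0, 5.0):
        n = expected_aftershocks(m_main, m_main - 2.0)
        print(f"m(max) = {m_main}: N(m >= m(max) - 2) = {n:.3f}")
```

Printing the count above m(max) - 2 for main shocks of magnitude 3 and 5 gives the same value, which is the sense in which the larger productivity of big events is compensated for by their smaller numbers.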