Vision is fast and efficient. A novel natural scene can be categorized (e.g. does it contain an animal, a vehicle?) by human observers in less than 150 ms, and with minimal attentional resources. This ability still holds under strong backward masking conditions. In fact, with a stimulus onset asynchrony of about 30 ms (the time between the scene and mask onset), the first 30 ms of selective behavioral responses are essentially unaffected by the presence of the mask, suggesting that this type of "ultra-rapid" processing can rely on a sequence of swift feed-forward stages, in which the mask information never "catches up" with the scene information. Simulations show that the feed-forward propagation of the first wave of spikes generated at stimulus onset may indeed suffice for crude recognition or categorization. Scene awareness, however, may take significantly more time to develop, and probably requires feed-back processes. The main implication of these results for theories of masking is that pattern or metacontrast (backward) masking do not appear to bar the progression of visual information at a low level. These ideas bear interesting similarities to existing conceptualizations of priming and masking, such as Direct Parameter Specification or the Rapid Chase theory.