Initially, most OA self-archiving growth will be prospective rather than retrospective, because current and forward-going research articles are the ones most urgently needed for research progress, and they are what institutions and funders are mandating; the legacy corpus can and will follow thereafter.
Hence, insofar as the current and forward-going articles are concerned, the default option should be to deposit the author’s final, peer-reviewed, revised, accepted draft (the postprint) in the author’s Open Access Institutional Repository, not necessarily or even preferentially the publisher’s PDF.
The author’s postprint is the draft with the fewest publisher constraints (and any publisher endorsement of making the PDF OA automatically covers the postprint too).
And, as Alma Swan and Cliff Lynch have pointed out, the PDF is the least useful format for data-mining.
And, as can never be pointed out often enough, the purpose of OA self-archiving is the enhancement of access, usage and impact of the research, not the digital preservation of the publisher’s PDF! The postprint is a copy, not the original.
(For legacy deposits by authors who no longer have a digital draft of older articles, formatting the PDF, or scanning/OCR and reformatting, are obvious options.)