RESOURCES  UNSW EDUCATIONAL DEVELOPMENT & TECHNOLOGY WEBSITE
UNSW Educational Development & Technology Centre

OCR - optical character recognition

Optical character recognition (OCR) software converts a scanned image of a document into digital text, which can then be edited and reformatted. Most flatbed scanners come bundled with OCR software, which may or may not be adequate for your needs. If you expect to do a lot of OCR, it is worth investing in good-quality software (such as Nuance OmniPage): it will save many hours of correcting scanned documents.

There are advantages and disadvantages to using OCR for document conversion:

Advantages:

  • A document that was inaccessible to screen-readers can be made accessible.
  • The text in the document can now be searched, copied and pasted.
  • The file will be considerably more compact for download than an image file.
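
To give a rough sense of the size difference, the sketch below compares an uncompressed greyscale scan of one A4 page with the same page stored as plain text. The 300 dpi resolution, 8-bit greyscale depth, 500 words per page and 6 bytes per word are illustrative assumptions, not figures from this guide. Compressed image formats would narrow the gap considerably, but the text file would still be much smaller.

```python
# Rough size comparison: scanned page image vs. OCR text.
# Every figure used here is an illustrative assumption.

MM_PER_INCH = 25.4


def scan_size_bytes(width_mm, height_mm, dpi, bytes_per_pixel):
    """Uncompressed size in bytes of a page scanned at the given resolution."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


def text_size_bytes(words, bytes_per_word):
    """Approximate size in bytes of plain text (word length includes a space)."""
    return words * bytes_per_word


# A4 page (210 mm x 297 mm), 300 dpi, 8-bit greyscale = 1 byte per pixel.
image_bytes = scan_size_bytes(210, 297, dpi=300, bytes_per_pixel=1)
# Assumed 500 words per page at about 6 bytes per word.
text_bytes = text_size_bytes(500, 6)

print(f"Scanned image: {image_bytes / 1_000_000:.1f} MB")
print(f"OCR text:      {text_bytes / 1000:.1f} KB")
```

Under these assumptions the uncompressed scan is about 8.7 MB and the text about 3 KB.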

Disadvantages:

  • Page formatting is lost.
  • Graphics or images will need to be re-inserted.
  • The document may need considerable reformatting to be suitable for web use.
  • You will need to review the document for typographical errors generated by the OCR software.
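
Part of the review for typographical errors can be supported by a simple script. The sketch below is a minimal illustration, not a feature of any particular OCR product: it flags tokens that often indicate recognition errors so that a person can check them. The three patterns and the function name `flag_tokens` are assumptions chosen for this example.

```python
import re

# Heuristic patterns (assumptions): each marks a token worth a human look.
SUSPICIOUS = [
    (re.compile(r"[A-Za-z]\d|\d[A-Za-z]"), "letters and digits mixed"),
    (re.compile(r"[|\\~^`]"), "unusual symbol"),
    (re.compile(r"(.)\1\1"), "character repeated three or more times"),
]


def flag_tokens(text):
    """Return (line_number, token, reason) for each suspicious token."""
    flagged = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for token in line.split():
            for pattern, reason in SUSPICIOUS:
                if pattern.search(token):
                    flagged.append((line_no, token, reason))
                    break  # one reason per token is enough
    return flagged


sample = "The c0st of scann|ng is low.\nAll pages were checked."
for line_no, token, reason in flag_tokens(sample):
    print(f"line {line_no}: {token!r} ({reason})")
```

Heuristics like these produce false positives (for example "A4", "3rd" or an ellipsis) and miss errors that form real words, so they narrow down the proofreading rather than replace it.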

Keep in mind that a document scanned and converted with OCR will probably need considerable editing and restructuring before it is suitable for web delivery. It may be appropriate to provide several versions of the document:

  • Edited and simplified for reading from screen
  • Full-text OCR for accessibility and research
  • PDF with original formatting for printing and annotating

It is possible to ‘batch OCR’ multi-page documents using a scanner with an automatic document feeder. UNSW Publishing & Printing Services has this facility: see http://www.publications.unsw.edu.au/ and go to Services > Scanning/OCR.
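
For smaller jobs where the pages have already been scanned to image files, batch OCR can also be scripted. The sketch below is an illustration only. It assumes the open-source Tesseract command-line tool is installed (this guide does not mention it), that the scanned pages are PNG files in a single folder, and the function names `build_commands` and `run_batch` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(scan_dir, out_dir, language="eng"):
    """Build one tesseract command per PNG page, in filename order."""
    scan_dir, out_dir = Path(scan_dir), Path(out_dir)
    commands = []
    for image in sorted(scan_dir.glob("*.png")):
        # tesseract adds the ".txt" extension to the output base name itself.
        output_base = out_dir / image.stem
        commands.append(
            ["tesseract", str(image), str(output_base), "-l", language]
        )
    return commands


def run_batch(scan_dir, out_dir):
    """Run OCR on every page; requires tesseract to be on the PATH."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(scan_dir, out_dir):
        subprocess.run(command, check=True)


# Example (assumed folder names): run_batch("scans", "ocr_text")
```

Keeping command construction separate from execution makes it easy to inspect what will be run before processing a large batch.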

Note that copyright legislation applies to supplying digital material on the web. See ‘Copyright issues’.
