RESOURCES  UNSW EDUCATIONAL DEVELOPMENT & TECHNOLOGY WEBSITE

Converting MS Word documents

Word documents saved as HTML files are usable on the web, but have some inherent problems that are best removed where possible:

Removing MS Word-specific code

Word HTML files contain information that allows you to open them and edit them in MS Word. This produces very bloated HTML code that is slower to download and more difficult for browsers to display. These elements can be filtered out; the method depends on which version of MS Word you are using.

Use one of the following:

  • When saving from newer versions of MS Word, go to ‘Save as...’ and select ‘Web page, filtered’.
  • For older versions of MS Word, a filter may be downloaded from the Microsoft web site (PC only): http://office.microsoft.com/Assistance/2000/htmlfilter.aspx
  • Dreamweaver has an MS Word filter, which is run by going to ‘File>Import>Word HTML’.

If you are using MS Word as your only web editor, save a full version of the file for editing purposes, and a filtered version for uploading to the web.
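
To make the idea of ‘MS Word-specific code’ more concrete, the following is a minimal Python sketch of the kind of clean-up a filter performs. It is not the Microsoft filter or the Dreamweaver command, and it only handles a few common patterns (conditional comments, Office namespace tags such as `<o:p>`, and `mso-` style declarations). The function name `strip_word_markup` is an assumption made for illustration.

```python
import re


def strip_word_markup(html: str) -> str:
    """Remove a few common MS Word-specific constructs from an HTML string.

    Illustrative only: a real filter handles many more cases.
    """
    # Conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(
        r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->",
        "",
        html,
        flags=re.DOTALL | re.IGNORECASE,
    )
    # Office namespace tags, e.g. <o:p> and </o:p>
    html = re.sub(r"</?[ovw]:[^>]*>", "", html, flags=re.IGNORECASE)
    # mso-* declarations inside style attributes
    html = re.sub(r"""\s*mso-[^:;"']+:[^;"']*;?""", "", html, flags=re.IGNORECASE)
    # style attributes that are now empty
    html = re.sub(r"""\s+style=("\s*"|'\s*')""", "", html, flags=re.IGNORECASE)
    return html


if __name__ == "__main__":
    sample = (
        '<p class=MsoNormal style="mso-margin-top-alt:auto;color:red">'
        "Hello<o:p></o:p></p>"
        "<!--[if gte mso 9]><xml>junk</xml><![endif]-->"
    )
    print(strip_word_markup(sample))
```

As with the filters above, it is safest to run something like this on a copy and keep the full version of the file for editing.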

Retaining formatting when saving MS Word as HTML

Depending on the version of MS Word you are using, and how you have formatted the document, one of several things may happen to the text formatting:

Older versions of MS Word:

  • Formatting such as headings may be lost, with MS Word instead generating font tags to set text sizes. In this case, it is preferable to strip out the font tags and re-format headings and other elements using a web editor such as Dreamweaver.
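
If a web editor is not to hand, stripping font tags can also be sketched in a few lines of Python. This is an illustrative assumption rather than part of the recommended workflow: the function name `strip_font_tags` is hypothetical, and the sketch only removes the tags themselves, so headings still need to be re-applied afterwards.

```python
import re


def strip_font_tags(html: str) -> str:
    """Remove <font ...> and </font> tags, keeping the text between them."""
    return re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)


if __name__ == "__main__":
    sample = '<p><font size="5" face="Arial"><b>Title</b></font></p>'
    print(strip_font_tags(sample))
```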

Newer versions of MS Word:

  • If you are using the text styles set up in the standard MS Word template, the file may be saved with text formatting intact, and the correct formats applied to headings and similar elements (this varies between MS Word versions and platforms).
  • If you have created customised styles, MS Word will instead generate style sheets that approximate the MS Word styles you have created. This is an effective way to format HTML, but may be difficult for you to edit if you are unhappy with the results.

The most foolproof approach is either to use the standard MS Word template and styles, or to set up your own template with a very basic set of styles, and then check how the HTML file looks once you have saved and filtered it. When you have style settings that you are confident will convert satisfactorily without re-formatting, save the MS Word document to use as a template for other documents.
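
One way to ‘see how the HTML file looks’ beyond opening it in a browser is to list which style classes the saved file actually uses. The Python sketch below counts `class` attributes in an HTML string; a short list dominated by standard names such as `MsoNormal` suggests the template is converting cleanly, while many unfamiliar names suggest styles were imported or customised. The function name `count_classes` and the class name `MyCustomStyle` in the example are assumptions made for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Collect how often each CSS class name appears on start tags."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names
                for cls in value.split():
                    self.counts[cls] += 1


def count_classes(html: str) -> Counter:
    """Return a Counter mapping class name to number of uses."""
    parser = ClassCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


if __name__ == "__main__":
    sample = (
        "<h1>Title</h1>"
        "<p class=MsoNormal>a</p>"
        '<p class="MsoNormal">b</p>'
        "<p class=MyCustomStyle>c</p>"
    )
    for cls, n in count_classes(sample).most_common():
        print(cls, n)
```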

Using MS Word/Dreamweaver templates

We have created basic templates that you can use for producing your MS Word documents, then importing them into Dreamweaver, for use in WebCT.

If using older versions of MS Word (pre-2000), you should:

  • Use the MS Word document as a basis for your own, using the styles set up in it.
    (Note: pasting text in from other MS Word documents may import styles from the original document that you will then need to remove. To avoid this, save the original document as text only before copying and pasting from it.)
  • Save your document as HTML, and import into Dreamweaver to remove MS Word-specific code. It can then be formatted as you wish, or pasted into the supplied Dreamweaver document.

If using newer versions of MS Word (post-2000), you should:

  • Use the MS Word document as a basis for your own, using the styles set up in it.
    (Note: pasting text in from other MS Word documents may import styles from the original document that you will then need to remove. To avoid this, save the original document as text only before copying and pasting from it.)
  • Save your document as ‘Web page, filtered’, and open it in Dreamweaver. It can then be formatted as you wish (to retain style sheets generated by MS Word), or pasted into the supplied Dreamweaver document (to remove MS Word style sheet formatting).

The template documents, plus a ‘readme’ file on using them, can be downloaded here:
http://www.edtec.unsw.edu.au/inter/dload/webmedia/dw_templates/dw_templates.html
