Preparing your content for import into a new web site system always, always takes longer than you think, and should be started early. Not having content ready frequently delays site launches and increases developer costs.
While it is helpful to have example content for the developers to work with, they need most of the actual content before their deadline in order to do the best job.
Many organisations have a body of static content that can be prepared well in advance because it changes very little over time. Starting this process early forces you to ask questions about the shape and structure of your content and can allow you to make the content more useful than if it were done in a panic.
Content
It is best to describe the content types and the relationships between them before preparing any actual data. Let's say you have a content type 'Mission report', and that it involves an evaluation, an implementing partner, and metadata like the author, the country, the region and the date. Finally, you will probably want some keywords from a pre-defined vocabulary to describe the subject matter and any issues arising.
Prepare a spreadsheet like the one below and put in some example data. Remember that the CMS will need to understand some of these fields.
| id | author | country | region | date | partner | budget($) | evaluation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | John Doe | Swaziland | Africa | 9th Jan 2008 | CARE | | blah blah |
| 2 | Jane Doe | Congo | Africa | 07-04-07 | Care | | blah blah |
There is a principle of information architecture that you should not repeat data in the system. Instead you should put the data somewhere else, and point to it. So let's look at this data and see how it might be improved.
The id is important. Of all the fields in the table, the id is unique, and that is how the database refers to that item. Every row you prepare should have its own number.
It is likely that the author's name will come up again if John Doe works for your organisation. Maybe there is more information you want to store or display about him, like his contact details or job title. You should consider making a new table called people, or staff, keeping John there, and referring to his id instead of his name in this table.
By storing the country and the region here you are making extra work for the data inputter and allowing mistakes. Swaziland will always be in Africa, so you don't need to store that here. The system can be told that elsewhere, and can display that information with some clever templating. Also, countries are re-usable, so you would normally refer to countries by number-reference as well.
Note that the date formats are inconsistent. Try to use one consistent numeric format, so that it's easier for the developer to convert to an appropriate date format later.
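If the dates have already been typed in several styles, a short script can bring them into one numeric format. The following is only a sketch: the function name `to_iso` and the list of accepted formats are illustrative, and reading "07-04-07" as day-month-year is an assumption that should be checked with whoever entered the data.

```python
import re
from datetime import datetime

# Formats we are willing to accept, tried in order.
# ASSUMPTION: two-digit dates such as "07-04-07" are day-month-year.
ACCEPTED_FORMATS = ["%d %b %Y", "%d-%m-%y", "%Y-%m-%d"]


def to_iso(raw: str) -> str:
    """Return the date as YYYY-MM-DD, or raise ValueError if unrecognised."""
    # Strip ordinal suffixes: "9th Jan 2008" becomes "9 Jan 2008".
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", raw.strip())
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    raise ValueError(f"Unrecognised date: {raw!r}")
```

Anything the script cannot recognise raises an error rather than being guessed at, so the odd cases can be corrected by hand in the spreadsheet.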
Note how CARE is not capitalised in the second row. Maybe it won't matter, but if you want to make a list of all the projects CARE was implementing, you don't know whether the system will be case sensitive. This is another reason for putting implementing partners in another table.
So we've turned one project table into four content types, potentially: projects, countries, people and partners. Now here's the great bit. If Jane gets married, you don't change her name on every project she worked on; you change it once in the people table. This structure makes it much easier to view projects by partner, countries by region, Jane's projects by date, and any other combination.
You don't have to worry about turning everything into ID numbers in the first place; that can, and perhaps should, be done automatically. The important thing is that you decide what content types you need and which fields refer to which content types.
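To give an idea of what "done automatically" might look like, here is a minimal sketch that splits the two example rows into separate lookup tables and replaces names with ids. The helper names `assign_ids` and `partner_key` are hypothetical, and folding partner names to upper case is just one possible way of treating "CARE" and "Care" as the same partner.

```python
def assign_ids(values, normalise=str.strip):
    """Give each distinct value an id, in order of first appearance."""
    ids = {}
    for value in values:
        key = normalise(value)
        if key not in ids:
            ids[key] = len(ids) + 1
    return ids


def partner_key(name):
    # ASSUMPTION: partner names differing only in case are the same partner.
    return name.strip().upper()


# The flat spreadsheet rows from the example (budget and evaluation omitted).
rows = [
    {"id": 1, "author": "John Doe", "country": "Swaziland", "region": "Africa", "partner": "CARE"},
    {"id": 2, "author": "Jane Doe", "country": "Congo", "region": "Africa", "partner": "Care"},
]

people = assign_ids(r["author"] for r in rows)
countries = assign_ids(r["country"] for r in rows)
partners = assign_ids((r["partner"] for r in rows), normalise=partner_key)

# The region is stored once per country, not once per report.
country_region = {countries[r["country"].strip()]: r["region"] for r in rows}

# Each report now points at the other tables by id.
reports = [
    {
        "id": r["id"],
        "author_id": people[r["author"].strip()],
        "country_id": countries[r["country"].strip()],
        "partner_id": partners[partner_key(r["partner"])],
    }
    for r in rows
]
```

With this structure, both reports refer to the same partner id, and renaming a person means changing one entry in the people table.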
Vocabularies
If you want to use a tagging system to categorise your data, I advise you prepare limited vocabularies beforehand, and then modify and add to them as you apply them to the data. You need to decide for each vocabulary which of the above objects it describes, and whether more than one term can be applied. For example here are two vocabularies which could describe a publication in a library.
| Document Types | Subject | Subject (2) |
| --- | --- | --- |
| book | development | agriculture |
| periodical | disasters | disease |
| tract | disability | - HIV/AIDS |
| url | water | - Tuberculosis |
| unpublished | poverty | - Whooping cough |
| | urbanisation | education |
| | | - teachers |
| | | - - training |
| | | - schools |
| | | trade |
The first column is a mutually exclusive category. The second can have multiple terms applied. The third vocabulary is hierarchical. This can provide some great functionality if your CMS supports it.
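A hierarchical vocabulary written with leading hyphens, as in the third list, can be turned into term and parent pairs before import. This is a sketch under the assumption that each "-" at the start of a line means one level of nesting; the function name `parse_vocabulary` is made up for illustration, and a real CMS will have its own import format.

```python
def parse_vocabulary(lines):
    """Turn a hyphen-indented term list into (term, parent) pairs."""
    pairs = []
    stack = []  # stack[d] is the most recent term seen at depth d
    for line in lines:
        tokens = line.split()
        # Count the leading "-" tokens to find the nesting depth.
        depth = 0
        while depth < len(tokens) and tokens[depth] == "-":
            depth += 1
        term = " ".join(tokens[depth:])
        if not term:
            continue  # skip blank lines
        if depth > len(stack):
            raise ValueError(f"Term {term!r} skips a level")
        parent = stack[depth - 1] if depth > 0 else None
        del stack[depth:]  # forget terms at this depth or deeper
        stack.append(term)
        pairs.append((term, parent))
    return pairs


subject_terms = [
    "agriculture",
    "disease",
    "- HIV/AIDS",
    "- Tuberculosis",
    "- Whooping cough",
    "education",
    "- teachers",
    "- - training",
    "- schools",
    "trade",
]
pairs = parse_vocabulary(subject_terms)
```

Each term ends up with at most one parent, so "training" sits under "teachers", which in turn sits under "education".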
Menus / Site structure
Please see the usability page for an idea of how to determine your site structure. Matters like this can lead to non-productive internal debates, so it's good to throw it open to the users to decide.
Written content
In contrast to the structured list of projects mentioned above, you will need to provide copy for all the main pages on your site. I suggest this be done last, perhaps while the developers are at work on the CMS. Then the copy can be written with a better idea of how the final system will look and work. Also these pages 'fit around' the rest of the content so it's less important that they be written first.
Information hierarchy
Normally you might fire up Word and start writing a new document, working out how big the headings are as you go along. However, if you are writing into an existing information hierarchy, you'll need to be aware of it. Some larger organisations use Word's horrendous 'styles' feature, but most ignore it. The point is that it's not for the author to decide the sizes and colours of headings. The author should be writing within a style framework, stating only the importance of each heading. When the content is displayed by the web site, a level 3 heading is given a colour and size according to the site's style sheet.
At some point all your content will have to be converted into HTML and every heading will need to be tagged as h1, h2, h3 etc. This problem needs to be addressed anew by each organisation, depending on the import tools available and the format of the source documents, but generally small organisations should be more aware of the issues of consistent style markup.
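As a small illustration of the idea, suppose the author has recorded only a level and a title for each heading. The conversion to HTML is then mechanical, and the style sheet decides how each level looks. The function name `heading_html` is hypothetical; how the levels are extracted from your source documents depends entirely on their format.

```python
from html import escape


def heading_html(level, text):
    """Wrap heading text in the HTML tag for its level (h1 to h6)."""
    if not 1 <= level <= 6:
        raise ValueError(f"HTML has no heading level {level}")
    # escape() protects characters such as & and < in the title.
    return f"<h{level}>{escape(text)}</h{level}>"


# ASSUMPTION: the outline is supplied as (level, title) pairs.
outline = [(1, "Mission report"), (2, "Evaluation"), (3, "Budget & costs")]
html_headings = [heading_html(level, title) for level, title in outline]
```

Note that the author never states a size or colour here, only the importance of the heading.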