Greystone Import
Author: Andy Gardner
1. Introduction
The Greystone import script imports a folder of XML files as structured content in Cascade Server. It is not limited to Greystone XML files: with slight modifications, it will work on any folder of XML files.
This guide will show you how to use and adjust the Greystone content importer to fit your needs.
*You must have PHP 5 or later, because only those versions support SimpleXML, which the script requires. You must also have Cascade Server 4.5.1 or later, because support for dynamic metadata is required.
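The PHP side of these requirements can be checked before running the importer. The following is a minimal sketch, not part of the shipped script; the function name greystone_check_requirements is made up for illustration. The Cascade Server version cannot be checked this way and still has to be confirmed in the CMS itself.

```php
<?php
// Hypothetical helper (not part of greystone.php): returns a list of
// human-readable problems. An empty list means the PHP requirements are met.
function greystone_check_requirements($phpVersion, $hasSimpleXml)
{
    $problems = array();
    if (version_compare($phpVersion, '5.0.0', '<')) {
        $problems[] = 'PHP 5 or later is required (found ' . $phpVersion . ').';
    }
    if (!$hasSimpleXml) {
        $problems[] = 'The SimpleXML extension is not loaded.';
    }
    return $problems;
}

// Check the running interpreter and print anything that is missing.
$problems = greystone_check_requirements(PHP_VERSION, extension_loaded('simplexml'));
foreach ($problems as $problem) {
    echo $problem, "\n";
}
```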
2. Components
The importer has very few components, making it fairly easy to use. A breakdown of the components follows.
3. Greystone Configuration file
This file tells the script where your CMS is, where in the CMS to put the content, which metadata set and configuration set to use, where the XML to import is located, which CMS login to use, and which log file should record import events. The old images folder specifies where the image links in your content currently point, and the new images folder specifies the CMS directory in which the images are located.
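As a rough illustration of how such a file could be read with SimpleXML, here is a sketch under stated assumptions. Every element name, path, and value in the example XML is invented for this sketch; the actual layout is whatever greystone-config.xml in the download uses, and the function name greystone_load_config is likewise hypothetical.

```php
<?php
// Hypothetical config layout: all element names and values are assumptions
// and will differ from the real greystone-config.xml.
$exampleConfig = <<<XML
<greystone-config>
    <cms-url>http://cms.example.edu:8080</cms-url>
    <cms-username>importer</cms-username>
    <cms-password>secret</cms-password>
    <parent-folder>/MySite/imported</parent-folder>
    <metadata-set>/Greystone Metadata</metadata-set>
    <configuration-set>/Standard Page</configuration-set>
    <xml-folder>/var/greystone/xml</xml-folder>
    <log-file>/var/greystone/greystone.log</log-file>
    <old-images-folder>/images</old-images-folder>
    <new-images-folder>/MySite/images</new-images-folder>
</greystone-config>
XML;

// Flattens the top-level config elements into a name => value array.
function greystone_load_config($xmlString)
{
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        throw new Exception('Config is not well-formed XML.');
    }
    $config = array();
    foreach ($xml->children() as $element) {
        $config[$element->getName()] = trim((string) $element);
    }
    return $config;
}

$config = greystone_load_config($exampleConfig);
echo 'Importing from ', $config['xml-folder'], ' into ', $config['parent-folder'], "\n";
```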
4. Greystone Configuration Data Definition
This is the data definition used to generate the configuration page.
5. The script
The script starts from a folder containing your XML and recurses through that folder and its subfolders, importing every XML file it comes across. A large amount of dynamic metadata is imported as well, so make sure that you create a metadata set in your CMS that has these fields.
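The folder traversal described above can be sketched with PHP's SPL directory iterators (the FilesystemIterator::SKIP_DOTS flag needs PHP 5.3 or later). This is an illustration only; greystone.php may walk the tree differently, and the function name greystone_find_xml_files is an assumption.

```php
<?php
// Hypothetical helper: collects the path of every .xml file under
// $startFolder, including files in subfolders at any depth.
function greystone_find_xml_files($startFolder)
{
    $files = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($startFolder, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        $extension = strtolower(pathinfo($file->getFilename(), PATHINFO_EXTENSION));
        if ($file->isFile() && $extension === 'xml') {
            $files[] = $file->getPathname();
        }
    }
    sort($files); // stable order makes the import log easier to follow
    return $files;
}

// Usage (the path is a placeholder):
// foreach (greystone_find_xml_files('/var/greystone/xml') as $path) {
//     $page = simplexml_load_file($path);
//     // ... build and send the page to the CMS ...
// }
```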
These fields map to the Greystone content as follows:
Greystone tag = Dynamic Metadata Field
pageid = Page ID
libraryid = Library ID
groupid = Group ID
group_short_title = Group Title
page_title = Display Name*
short_page_title = Title*
filename = name of page in CMS
content = main content of page
page_class = Page Class
page_type_id = Page Type ID
meta_keywords = Keywords*
meta_description = Description*
version_number = Version Number
last_modified = embedded in system metadata
language_code = Language Code
publication_date = Publication Date
audience = Audience
serviceline_keyword = Serviceline Keyword
page_keyword = Page Keyword
alternate_language_pageid = Alternate Language Page ID
*Already exists as a wired field in metadata sets
An extra field, called 'Allow Automatic Greystone Update', determines whether or not a page should be updated by the importer. The default value is 'Yes'.
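The mapping above can be expressed as lookup arrays and applied to a parsed file. The sketch below assumes each Greystone tag is a direct child of the document's root element, which may not match your files; the root element name, the sample values, and the function name greystone_extract_fields are all made up for illustration.

```php
<?php
// Greystone tag => dynamic metadata field, taken from the list above.
$dynamicFieldMap = array(
    'pageid' => 'Page ID',
    'libraryid' => 'Library ID',
    'groupid' => 'Group ID',
    'group_short_title' => 'Group Title',
    'page_class' => 'Page Class',
    'page_type_id' => 'Page Type ID',
    'version_number' => 'Version Number',
    'language_code' => 'Language Code',
    'publication_date' => 'Publication Date',
    'audience' => 'Audience',
    'serviceline_keyword' => 'Serviceline Keyword',
    'page_keyword' => 'Page Keyword',
    'alternate_language_pageid' => 'Alternate Language Page ID',
);

// Greystone tag => wired field (the entries marked * in the list above).
$wiredFieldMap = array(
    'page_title' => 'Display Name',
    'short_page_title' => 'Title',
    'meta_keywords' => 'Keywords',
    'meta_description' => 'Description',
);

// Hypothetical helper: copies every mapped tag that is present in $page.
// Assumes the tags are direct children of the root element.
function greystone_extract_fields(SimpleXMLElement $page, array $map)
{
    $fields = array();
    foreach ($map as $tag => $fieldName) {
        if (isset($page->$tag)) {
            $fields[$fieldName] = trim((string) $page->$tag);
        }
    }
    return $fields;
}

// Invented sample document, for illustration only.
$page = simplexml_load_string(
    '<page><pageid>1042</pageid><page_title>Heart Health</page_title>'
    . '<language_code>en</language_code></page>'
);
$dynamic = greystone_extract_fields($page, $dynamicFieldMap);
$dynamic['Allow Automatic Greystone Update'] = 'Yes'; // documented default
$wired = greystone_extract_fields($page, $wiredFieldMap);
print_r($dynamic);
print_r($wired);
```

The tags filename, content, and last_modified are left out of both arrays because they map to the page name, the page body, and system metadata rather than to metadata fields.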
The code is set up to create the files from scratch, not to edit existing ones. To edit existing ones, uncomment the lines marked with the ##EDIT tag. To use the 'Allow Automatic Greystone Update' field, you will need to perform a read to see whether the page exists and, if it does, check its contents to see whether it allows updates.
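The decision that follows the read can be kept separate from the web-service call itself. The sketch below deliberately leaves the read out, because its exact form depends on your Cascade Server version; it only assumes that you have turned the read result into either null (no page found) or an array of dynamic metadata values. The function name and the 'create', 'edit', and 'skip' labels are hypothetical.

```php
<?php
// Hypothetical helper. $existingFields is null when the read found no page,
// otherwise an array of dynamic metadata field name => value.
function greystone_decide_action($existingFields)
{
    if ($existingFields === null) {
        return 'create';
    }
    $field = 'Allow Automatic Greystone Update';
    // A page without the field falls back to the documented default of 'Yes'.
    $allow = isset($existingFields[$field]) ? $existingFields[$field] : 'Yes';
    return strcasecmp(trim($allow), 'Yes') === 0 ? 'edit' : 'skip';
}

echo greystone_decide_action(null), "\n";
echo greystone_decide_action(array('Allow Automatic Greystone Update' => 'No')), "\n";
```

Pages that return 'skip' are left untouched, which lets editors protect hand-modified pages from later imports.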
NOTE: You must adjust any links inside your content so that the CMS can resolve them. For instance, instead of leaving a link as href="apage", you would change it to href="/MySite/MyDir/apage".
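One way to make this adjustment in bulk is a regular-expression pass over the content before it is sent to the CMS. This is a sketch under assumptions, not the script's own behavior: the function name is invented, the attribute to rewrite is a parameter, only double-quoted attribute values are handled, and the closure syntax needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper: prefixes relative link targets with a CMS folder path.
// Absolute paths, in-page anchors, and URLs with a scheme (http:, mailto:,
// and so on) are left untouched.
function greystone_rewrite_links($content, $baseFolder, $attribute = 'href')
{
    $base = rtrim($baseFolder, '/');
    $pattern = '/\b(' . preg_quote($attribute, '/') . ')="([^"]*)"/i';
    return preg_replace_callback($pattern, function ($match) use ($base) {
        $target = $match[2];
        if ($target === '' || $target[0] === '/' || $target[0] === '#'
            || preg_match('/^[a-z][a-z0-9+.-]*:/i', $target)) {
            return $match[0]; // already resolvable, keep as-is
        }
        return $match[1] . '="' . $base . '/' . $target . '"';
    }, $content);
}

echo greystone_rewrite_links('<a href="apage">A page</a>', '/MySite/MyDir'), "\n";
// prints: <a href="/MySite/MyDir/apage">A page</a>
```

A regular expression is enough for well-behaved content; for content with single-quoted or unquoted attributes, parsing the markup with a DOM library would be the safer choice.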
Files
- greystone-config-data-def.xml
This data definition allows for visually controlling the config file.
- greystone-config.xml
This is an example XML config file.
- greystone.php
This is the main PHP script that processes the Greystone files.
- greystone.log
This is an example log file to keep track of which files have been processed.
- greystone-import-PHP.zip
The XML and PHP scripts for the Greystone import.